1. Move the memory zeroization of mmap to munmap to increase mmap
performance
2. Do memory zeroizaiton during the drop of VMManager to guarentee all
allocated memory is zeroized before the next allocation
It turns out taking a lock in every system call is a significant
performance bottleneck. In light of this finding, we replace a mutex in
a critical path of system call with an atomic boolean.
This rewrite serves three purposes:
1. Fix some subtle bugs in the old implementation;
2. Implement mremap using mmap and munmap so that mremap can automatically
enjoy new features (e.g., mprotect and memory permissions) once mmap and
munmap support the feature.
3. Write down the invariants hold by VMManager explictly so that the correctness
of the new implementation can be reason more easily.
Update the occlum.json to align with the gen_enclave_conf design.
Below is the two updated structures:
"metadata": {
"product_id": 0,
"version_number": 0,
"debuggable": true
},
"resource_limits": {
"max_num_of_threads": 32,
"kernel_space_heap_size": "32MB",
"kernel_space_stack_size": "1MB",
"user_space_size": "256MB"
}
Add "untrusted" sections for environment variables defined in Occlum.json. Environment
variable defined in "default" will be shown in libos directly. Environment variable
defined in "untrusted" can be passed from occlum run or PAL layer and can override
the value in "default" and thus is considered "untrusted".
Fix std::alloc::Alloc not found
The lastest Rust changes the trait to std::alloc::AllocRef.
Update the docker files to support sgx 2.9.1
Remove the compilerRT dependency for rust sdk update
Before this commit, the three ECalls of the LibOS enclave do not give
the exact reason on error. In this commit, we modify the enclave entry code
to return the errno and list all possible values of errno in Enclave.edl.
In this commit, we add eight signal-related syscalls
* kill
* tkill
* tgkill
* rt_sigaction
* rt_sigreturn
* rt_sigprocmask
* rt_sigpending
* exit_group
We implement the following major features for signals:
* Generate, mask, and deliver signals
* Support user-defined signal handlers
* Support nested invocation of signal handlers
* Support passing arguments: signum, sigaction, and ucontext
* Support both process-directed and thread-directed signals
* Capture hardware exceptions and convert them to signals
* Deliver fatal signals (like SIGKILL) to kill processes gracefully
But we still have gaps, including but not limited to the points below:
* Convert #PF (page fault) and #GP (general protection) exceptions to signals
* Force delivery of signals via interrupt
* Support simulation mode
When a unix socket only calls function listen, its object is not created
but its status becomes listening. At this time closing the socket would
cause a panic before this commit.
The next generation of Intel CPUs does not support Intel MPX. Enabling MPX
by default crashes the LibOS on startup. So we disable MPX by default. The
long term plan is to turn on/off MPX via compiling options.
This commits improves both readability and correctness of the scheduling-related
system calls. In terms of readability, it extracts all scheduling-related code
ouf of the process/ directory and put it in a sched/ directory. In terms
of correctness, the new scheduling subsystem introduces CpuSet and SchedAgent
types to maintain and manipulate CPU scheduler settings in a secure and robust way.
As a major rewrite to the process/thread subsystem, this commits:
1. Implements threads as a first-class object, which represents a group of OS resources
and a thread of execution;
2. Implements processes as a first-class object that manages threads and maintains
the parent-child relationship between processes;
3. Refactors the code in process subsystem to follow the improved coding style and
conventions emerged in recent commits;
4. Refactors the code in other subsystems to use the new process/thread subsystem.
SEFS depends on version 0.9 of bitvec crate, which has been yanked on crates.io
by the crate author for some reasons. To fix this, we upgrade to the latest
version of bitvec crate.
This commit introduces a unified logging strategy, summarized as below:
1. Use `error!` to mark errors or unexpected conditions, e.g., a
`Result::Err` returned from a system call.
2. Use `warn!` to warn about potentially problematic issues, e.g.,
executing a workaround or fake implementation.
3. Use `info!` to show important events (from users' perspective) in
normal execution, e.g., creating/exiting a process/thread.
4. Use `debug!` to track major events in normal execution, e.g., the
high-level arguments of a system call.
5. Use `trace!` to record the most detailed info, e.g., when a system
call enters and exits the LibOS.
Now one can specify the log level of the LibOS by setting `OCCLUM_LOG_LEVEL`
environment variable. The possible values are "off", "error", "warn",
"info", and "trace".
However, for the sake of security, the log level of a release enclave
(DisableDebug = 1 in Enclave.xml) is always "off" (i.e., no log) regardless of
the log level specified by the untrusted environment.
This commit introduces a system call table, which brings several benefits:
1. The table is a centralized info hub that one can find an answer for every
question about system calls, e.g., what is the number and arguments of a
system call, is it implemented or supported, and if so, what is the
function that actual implements it.
2. System call-related code can be automatically derived from the system call
table through a clever use of macros. In this way, the code avoids repeating
itself.
Before this commit, there are two strange bugs:
1. No backtraces are displayed on panic by Rust; and,
2. Thread local storage in Rust sometimes causes panics.
It turns out that the the root cause of the two bugs are the same: Occlum's
patch to Intel SGX SDK that informs SDK about the stack range of the currnet
LibOS user-level thread. The problem about this patch is that it modifies some
fundamental data structures and Rust SGX SDK does not know the modification.
This causes Rust SGX SDK to panic in certain conditions.
To resolve the conflict for good, this commit gets rid of the patch to Intel
SGX SDK by updating SDK's stack ranges upon user/kernel switch.
1. Use arch_prctl to replace RDFSBASE/WRFSBASE
Ptrace can't get right value if WRFSBASE is called which
will make debugger fail in simulation mode. Use arch_prctl
to replace these instructions in simulation mode.
2. Disable the busy thread in exit_group test
exit_group doesn't have a real implementation yet but test
under SGX simulation mode give core dump for exit_group test.
Disable the busy loop thread and the core dump disappear.
3. Add SDK lib path to LD_LIBRARY_PATH
Linker sometims can't find urts_sim and uae_service_sim when
running. Explicitly add path to LD_LIBRARY_PATH when running
occlum command.
Signed-off-by: sanqian.hcy <sanqian.hcy@antfin.com>
This commits is a dummy implementation of file advisory locks.
Specifically, for regular files, fcntl `F_SETLK` (i.e., acquiring
or releasing locks) always succeeds and fcntl `F_GETLK` (i.e., testing locks)
always returns no locks.
1. Move the system call handling functions into the "syscalls.rs"
2. Split syscall memory safe implementations into small sub-modules
3. Move the unix_socket and io_multiplexing into "net"
4. Remove some unnecessary code
It is slow to allocate big buffers using SGX SDK's malloc. Even worse, it
consumes a large amount of precious trusted memory inside enclaves. This
commit avoids using trusted buffers and allocates untrusted buffers for
sendmsg/recvmsg directly via OCall, thus improving the performance of
sendmsg/recvmsg. Note that this optimization does not affect the security of
network data as it has to be sent/received via OCalls.