The first bug is a race condition when acquiring the lock of a process's
parent. An example code with race condition looks like below:
```rust
let process : ProessRef = current!().process();
let parent : ProcessRef = process.parent();
let parent_guard : SgxMutexGuard<ProesssInner> = parent.inner();
// This assertion may fail because the process's parent may change to another
// process before the lock is acquired
assert!(parent.pid() == process.parent().pid());
```
The second bug is that when a process exits, its children processes are
not transfered to the idle process correctly.
1. Move the memory zeroization of mmap to munmap to increase mmap
performance
2. Do memory zeroizaiton during the drop of VMManager to guarentee all
allocated memory is zeroized before the next allocation
It turns out taking a lock in every system call is a significant
performance bottleneck. In light of this finding, we replace a mutex in
a critical path of system call with an atomic boolean.
This rewrite serves three purposes:
1. Fix some subtle bugs in the old implementation;
2. Implement mremap using mmap and munmap so that mremap can automatically
enjoy new features (e.g., mprotect and memory permissions) once mmap and
munmap support the feature.
3. Write down the invariants hold by VMManager explictly so that the correctness
of the new implementation can be reason more easily.
Update the occlum.json to align with the gen_enclave_conf design.
Below is the two updated structures:
"metadata": {
"product_id": 0,
"version_number": 0,
"debuggable": true
},
"resource_limits": {
"max_num_of_threads": 32,
"kernel_space_heap_size": "32MB",
"kernel_space_stack_size": "1MB",
"user_space_size": "256MB"
}
On lightweight Linux distribution, like alpine, getpwuid()
returns NULL, and errno is ENOENT, this patch fix crash
caused by this situation.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Add "untrusted" sections for environment variables defined in Occlum.json. Environment
variable defined in "default" will be shown in libos directly. Environment variable
defined in "untrusted" can be passed from occlum run or PAL layer and can override
the value in "default" and thus is considered "untrusted".
Fix std::alloc::Alloc not found
The lastest Rust changes the trait to std::alloc::AllocRef.
Update the docker files to support sgx 2.9.1
Remove the compilerRT dependency for rust sdk update
Before this commit, the three ECalls of the LibOS enclave do not give
the exact reason on error. In this commit, we modify the enclave entry code
to return the errno and list all possible values of errno in Enclave.edl.
In this commit, we add eight signal-related syscalls
* kill
* tkill
* tgkill
* rt_sigaction
* rt_sigreturn
* rt_sigprocmask
* rt_sigpending
* exit_group
We implement the following major features for signals:
* Generate, mask, and deliver signals
* Support user-defined signal handlers
* Support nested invocation of signal handlers
* Support passing arguments: signum, sigaction, and ucontext
* Support both process-directed and thread-directed signals
* Capture hardware exceptions and convert them to signals
* Deliver fatal signals (like SIGKILL) to kill processes gracefully
But we still have gaps, including but not limited to the points below:
* Convert #PF (page fault) and #GP (general protection) exceptions to signals
* Force delivery of signals via interrupt
* Support simulation mode
When a unix socket only calls function listen, its object is not created
but its status becomes listening. At this time closing the socket would
cause a panic before this commit.
The next generation of Intel CPUs does not support Intel MPX. Enabling MPX
by default crashes the LibOS on startup. So we disable MPX by default. The
long term plan is to turn on/off MPX via compiling options.
This commits improves both readability and correctness of the scheduling-related
system calls. In terms of readability, it extracts all scheduling-related code
ouf of the process/ directory and put it in a sched/ directory. In terms
of correctness, the new scheduling subsystem introduces CpuSet and SchedAgent
types to maintain and manipulate CPU scheduler settings in a secure and robust way.
As a major rewrite to the process/thread subsystem, this commits:
1. Implements threads as a first-class object, which represents a group of OS resources
and a thread of execution;
2. Implements processes as a first-class object that manages threads and maintains
the parent-child relationship between processes;
3. Refactors the code in process subsystem to follow the improved coding style and
conventions emerged in recent commits;
4. Refactors the code in other subsystems to use the new process/thread subsystem.
SEFS depends on version 0.9 of bitvec crate, which has been yanked on crates.io
by the crate author for some reasons. To fix this, we upgrade to the latest
version of bitvec crate.
This commit introduces a unified logging strategy, summarized as below:
1. Use `error!` to mark errors or unexpected conditions, e.g., a
`Result::Err` returned from a system call.
2. Use `warn!` to warn about potentially problematic issues, e.g.,
executing a workaround or fake implementation.
3. Use `info!` to show important events (from users' perspective) in
normal execution, e.g., creating/exiting a process/thread.
4. Use `debug!` to track major events in normal execution, e.g., the
high-level arguments of a system call.
5. Use `trace!` to record the most detailed info, e.g., when a system
call enters and exits the LibOS.
Now one can specify the log level of the LibOS by setting `OCCLUM_LOG_LEVEL`
environment variable. The possible values are "off", "error", "warn",
"info", and "trace".
However, for the sake of security, the log level of a release enclave
(DisableDebug = 1 in Enclave.xml) is always "off" (i.e., no log) regardless of
the log level specified by the untrusted environment.
This commit introduces a system call table, which brings several benefits:
1. The table is a centralized info hub that one can find an answer for every
question about system calls, e.g., what is the number and arguments of a
system call, is it implemented or supported, and if so, what is the
function that actual implements it.
2. System call-related code can be automatically derived from the system call
table through a clever use of macros. In this way, the code avoids repeating
itself.