Add support for per-process memory size configuration with rlimit
Rlimits are now consistent with the memory space limits defined in Occlum.json. A specific memory size configuration can be set for a child process with the `prlimit` syscall or with the `ulimit` command in a shell script.
This commit is contained in:
parent
c43fbfea7f
commit
9b1d694830
@ -152,7 +152,7 @@ Occlum can be configured easily via a config file named `Occlum.json`, which is
|
|||||||
|
|
||||||
### Try Experimental Features
|
### Try Experimental Features
|
||||||
|
|
||||||
Occlum has added several new experimental commands, which provide a more container-like experience to users, as shown below:
|
1. Occlum has added several new experimental commands, which provide a more container-like experience to users, as shown below:
|
||||||
```
|
```
|
||||||
occlum init
|
occlum init
|
||||||
occlum build
|
occlum build
|
||||||
@ -163,6 +163,8 @@ occlum exec <cmd3> <args3>
|
|||||||
occlum stop
|
occlum stop
|
||||||
```
|
```
|
||||||
|
|
||||||
|
2. Occlum supports per-process resource configuration via the `prlimit` syscall (https://man7.org/linux/man-pages//man2/prlimit.2.html) and the shell built-in command `ulimit` (https://fishshell.com/docs/current/cmds/ulimit.html). For more information, see the [README.md](demos/fish/README.md) of `demos/fish`.
|
||||||
|
|
||||||
## How to Use?
|
## How to Use?
|
||||||
|
|
||||||
We have built and tested Occlum on Ubuntu 18.04 with or without hardware SGX support (if the CPU does not support SGX, Occlum can be run in the SGX simulation mode). To give Occlum a quick try, one can use the Occlum Docker image by following the steps below:
|
We have built and tested Occlum on Ubuntu 18.04 with or without hardware SGX support (if the CPU does not support SGX, Occlum can be run in the SGX simulation mode). To give Occlum a quick try, one can use the Occlum Docker image by following the steps below:
|
||||||
|
1
demos/fish/.gitignore
vendored
1
demos/fish/.gitignore
vendored
@ -2,3 +2,4 @@ ncurses/
|
|||||||
fish-shell/
|
fish-shell/
|
||||||
busybox/
|
busybox/
|
||||||
occlum-context/
|
occlum-context/
|
||||||
|
occlum-test/
|
||||||
|
@ -5,6 +5,8 @@ This demo will show Occlum's support in shell script.
|
|||||||
Occlum currently supports only FISH (the friendly interactive shell, https://github.com/fish-shell/fish-shell)
because FISH already uses `posix_spawn()` to create processes.
|
||||||
|
|
||||||
|
## 1. Run a simple FISH script with BusyBox
|
||||||
|
|
||||||
This shell script works with BusyBox (the Swiss army knife of embedded Linux, https://busybox.net/).
|
This shell script works with BusyBox (the Swiss army knife of embedded Linux, https://busybox.net/).
|
||||||
BusyBox combines tiny versions of many common UNIX utilities into a single small executable. It provides replacements
|
BusyBox combines tiny versions of many common UNIX utilities into a single small executable. It provides replacements
|
||||||
for most of the utilities you usually find in GNU fileutils, shellutils, etc.
|
for most of the utilities you usually find in GNU fileutils, shellutils, etc.
|
||||||
@ -28,13 +30,13 @@ occlum run /bin/fish_script.sh
|
|||||||
As demonstrated here, Occlum supports executing any script file that begins with a [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix))
|
As demonstrated here, Occlum supports executing any script file that begins with a [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix))
|
||||||
at its first line by invoking the interpreter program specified with the shebang.
|
at its first line by invoking the interpreter program specified with the shebang.
|
||||||
|
|
||||||
## Step 1:
|
### Step 1:
|
||||||
Download FISH and BusyBox and build them with the Occlum toolchain:
|
||||||
```
|
```
|
||||||
./download_and_build.sh
|
./download_and_build.sh
|
||||||
```
|
```
|
||||||
|
|
||||||
## Step 2:
|
### Step 2:
|
||||||
Run the following command to prepare the context and execute the script:
|
||||||
```
|
```
|
||||||
./run_fish_test.sh
|
./run_fish_test.sh
|
||||||
@ -45,3 +47,47 @@ SGX_MODE=SIM ./run_fish_test.sh
|
|||||||
```
|
```
|
||||||
|
|
||||||
And you should see `Hello world from fish`.
|
And you should see `Hello world from fish`.
|
||||||
|
|
||||||
|
|
||||||
|
## Per-Process Resource Configuration with the Help of FISH
|
||||||
|
|
||||||
|
Resource configuration for applications running in Occlum is done only in `Occlum.json`, and only the default sizes (mmap, heap, stack) can be
configured. Since Occlum claims all the memory space at initialization, if an application doesn't need as much memory as defined
in `Occlum.json`, the excess memory space is wasted. If two applications are running, one of which needs only a small amount of space while
the other needs a lot more, it is better to use per-process resource configuration.
|
||||||
|
|
||||||
|
We achieve this with the help of the `prlimit` syscall (https://man7.org/linux/man-pages//man2/prlimit.2.html) and the FISH shell built-in command
`ulimit` (https://fishshell.com/docs/current/cmds/ulimit.html). The application must therefore be run from a shell script. For example:
|
||||||
|
|
||||||
|
```shell
|
||||||
|
#! /usr/bin/fish
|
||||||
|
ulimit -a
|
||||||
|
|
||||||
|
# ulimit defined below will override configuration in Occlum.json
|
||||||
|
ulimit -Ss 10240 # stack size 10M
|
||||||
|
ulimit -Sd 40960 # heap size 40M
|
||||||
|
ulimit -Sv 102400 # virtual memory size 100M (including heap, stack, mmap size)
|
||||||
|
|
||||||
|
echo "ulimit result:"
|
||||||
|
ulimit -a
|
||||||
|
|
||||||
|
# Run applications with the new resource limits
|
||||||
|
...
|
||||||
|
```
|
||||||
|
|
||||||
|
The steps below illustrate this usage:
|
||||||
|
|
||||||
|
### Step 1:
|
||||||
|
Run command:
|
||||||
|
```shell
|
||||||
|
./run_per_process_config_test.sh --without-ulimit
|
||||||
|
```
|
||||||
|
|
||||||
|
This test will fail because the `ulimit` commands are commented out and the default memory sizes defined in `Occlum.json` are too small for the application to run.
|
||||||
|
|
||||||
|
### Step 2:
|
||||||
|
Run command:
|
||||||
|
```shell
|
||||||
|
./run_per_process_config_test.sh
|
||||||
|
```
|
||||||
|
With the resource limits updated by the `ulimit` commands, the test now passes.
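To make the numbers in the example script concrete, here is a small Rust sketch of the arithmetic. It assumes the `ulimit` values are in KiB, as the comments in the script imply (10240 for 10M), and that the mmap space is whatever remains of the virtual memory limit after heap and stack are subtracted. The function names are hypothetical and are not part of Occlum.

```rust
/// Number of bytes in one KiB.
const KIB: u64 = 1024;

/// Convert a `ulimit` value (assumed to be in KiB) to bytes.
fn kib_to_bytes(kib: u64) -> u64 {
    kib * KIB
}

/// Space left for mmap once heap and stack are taken out of the
/// address-space limit. Returns `None` if heap plus stack overflows
/// or exceeds the address-space limit.
fn implied_mmap_bytes(as_kib: u64, heap_kib: u64, stack_kib: u64) -> Option<u64> {
    let reserved = heap_kib.checked_add(stack_kib)?;
    as_kib.checked_sub(reserved).map(kib_to_bytes)
}
```

With the values from the script (`-Sv 102400`, `-Sd 40960`, `-Ss 10240`), this leaves 51200 KiB, that is 50 MiB, for mmap.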
|
||||||
|
34
demos/fish/run_per_process_config_test.sh
Executable file
34
demos/fish/run_per_process_config_test.sh
Executable file
@ -0,0 +1,34 @@
|
|||||||
|
#!/bin/bash
|
||||||
|
set -e
|
||||||
|
|
||||||
|
option=$1
|
||||||
|
|
||||||
|
rm -rf occlum-test
|
||||||
|
mkdir occlum-test && cd occlum-test
|
||||||
|
occlum init
|
||||||
|
mkdir -p image/usr/bin
|
||||||
|
cp ../Occlum.json .
|
||||||
|
cp ../fish-shell/build/fish image/usr/bin
|
||||||
|
cp ../busybox/busybox image/usr/bin
|
||||||
|
cp ../test_per_process_config.sh image/bin
|
||||||
|
|
||||||
|
# Set the process memory sizes to very small values so that the target script fails when run with the default configuration
|
||||||
|
new_json="$(jq '.process.default_stack_size = "1MB" |
|
||||||
|
.process.default_heap_size = "1MB" |
|
||||||
|
.process.default_mmap_size = "6MB"' Occlum.json)" && \
|
||||||
|
echo "${new_json}" > Occlum.json
|
||||||
|
|
||||||
|
pushd image/bin
|
||||||
|
ln -s /usr/bin/busybox cat
|
||||||
|
ln -s /usr/bin/busybox echo
|
||||||
|
ln -s /usr/bin/busybox awk
|
||||||
|
popd
|
||||||
|
|
||||||
|
# If `--without-ulimit` is specified, comment out the ulimit commands so that the run fails
|
||||||
|
if [[ $1 == "--without-ulimit" ]]; then
|
||||||
|
sed -i '/^ulimit -S/ s/^/# &/g' image/bin/test_per_process_config.sh
|
||||||
|
fi
|
||||||
|
|
||||||
|
occlum build
|
||||||
|
echo -e "\nBuild done. Running fish script ..."
|
||||||
|
occlum run /bin/test_per_process_config.sh
|
13
demos/fish/test_per_process_config.sh
Normal file
13
demos/fish/test_per_process_config.sh
Normal file
@ -0,0 +1,13 @@
|
|||||||
|
#! /usr/bin/fish
|
||||||
|
ulimit -a
|
||||||
|
|
||||||
|
# ulimit defined below will override configuration in Occlum.json
|
||||||
|
ulimit -Sv 102400 # virtual memory size 100M (including heap, stack, mmap size)
|
||||||
|
ulimit -Ss 10240 # stack size 10M
|
||||||
|
ulimit -Sd 40960 # heap size 40M
|
||||||
|
|
||||||
|
echo "ulimit result:"
|
||||||
|
ulimit -a
|
||||||
|
|
||||||
|
# A high-memory-consumption process
|
||||||
|
/usr/bin/busybox dd if=/dev/zero of=/root/test bs=40M count=2
|
@ -18,10 +18,27 @@ impl ResourceLimits {
|
|||||||
|
|
||||||
impl Default for ResourceLimits {
|
impl Default for ResourceLimits {
|
||||||
fn default() -> ResourceLimits {
|
fn default() -> ResourceLimits {
|
||||||
// TODO: set appropriate limits for resources
|
// Get memory space limit from Occlum.json
|
||||||
|
let cfg_heap_size: u64 = config::LIBOS_CONFIG.process.default_heap_size as u64;
|
||||||
|
let cfg_stack_size: u64 = config::LIBOS_CONFIG.process.default_stack_size as u64;
|
||||||
|
let cfg_mmap_size: u64 = config::LIBOS_CONFIG.process.default_mmap_size as u64;
|
||||||
|
|
||||||
|
let stack_size = rlimit_t::new(cfg_stack_size);
|
||||||
|
|
||||||
|
// Data segment consists of three parts: initialized data, uninitialized data, and heap.
|
||||||
|
// Here we simply approximate it by the heap size.
|
||||||
|
let data_size = rlimit_t::new(cfg_heap_size);
|
||||||
|
// The address space is approximated as the sum of the application's
|
||||||
|
// heap, stack and mmap size.
|
||||||
|
let address_space = rlimit_t::new(cfg_heap_size + cfg_stack_size + cfg_mmap_size);
|
||||||
|
|
||||||
let mut rlimits = ResourceLimits {
|
let mut rlimits = ResourceLimits {
|
||||||
rlimits: [Default::default(); RLIMIT_COUNT],
|
rlimits: [Default::default(); RLIMIT_COUNT],
|
||||||
};
|
};
|
||||||
|
*rlimits.get_mut(resource_t::RLIMIT_DATA) = data_size;
|
||||||
|
*rlimits.get_mut(resource_t::RLIMIT_STACK) = stack_size;
|
||||||
|
*rlimits.get_mut(resource_t::RLIMIT_AS) = address_space;
|
||||||
|
|
||||||
rlimits
|
rlimits
|
||||||
}
|
}
|
||||||
}
|
}
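The default derivation in this hunk can be modelled outside the LibOS as follows. This is a minimal, standalone sketch: `Limit` and `default_limits` are hypothetical stand-ins for `rlimit_t` and the `Default` implementation, and the three sizes are passed in as plain byte counts instead of being read from `config::LIBOS_CONFIG`.

```rust
/// Simplified stand-in for a soft/hard limit pair (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Limit {
    cur: u64, // soft limit
    max: u64, // hard limit
}

impl Limit {
    /// Soft limit as given; hard limit left unlimited.
    fn new(cur: u64) -> Self {
        Limit { cur, max: u64::MAX }
    }
}

/// Default (data, stack, address-space) limits derived from the
/// configured heap, stack and mmap sizes, all in bytes.
fn default_limits(heap: u64, stack: u64, mmap: u64) -> (Limit, Limit, Limit) {
    // The data segment is approximated by the heap size.
    let data = Limit::new(heap);
    let stack_limit = Limit::new(stack);
    // The address space is approximated by the sum of all three regions.
    let address_space = Limit::new(heap + stack + mmap);
    (data, stack_limit, address_space)
}
```

For the sizes the test script writes into `Occlum.json` (1MB heap, 1MB stack, 6MB mmap), the address-space soft limit comes out as 8MB.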
|
||||||
@ -34,6 +51,13 @@ pub struct rlimit_t {
|
|||||||
}
|
}
|
||||||
|
|
||||||
impl rlimit_t {
|
impl rlimit_t {
|
||||||
|
fn new(cur: u64) -> rlimit_t {
|
||||||
|
rlimit_t {
|
||||||
|
cur: cur,
|
||||||
|
max: u64::max_value(),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
pub fn get_cur(&self) -> u64 {
|
pub fn get_cur(&self) -> u64 {
|
||||||
self.cur
|
self.cur
|
||||||
}
|
}
|
||||||
@ -103,6 +127,8 @@ impl resource_t {
|
|||||||
/// (unnecessary) restriction is lifted by our implementation. Nevertheless,
|
/// (unnecessary) restriction is lifted by our implementation. Nevertheless,
|
||||||
/// since the rlimits object is shared between threads in a process, the
|
/// since the rlimits object is shared between threads in a process, the
|
||||||
/// semantic of limiting resource usage on a per-process basis is preserved.
|
||||||
|
///
|
||||||
|
/// Limitation: Current implementation only takes effect on child processes.
|
||||||
pub fn do_prlimit(
|
pub fn do_prlimit(
|
||||||
pid: pid_t,
|
pid: pid_t,
|
||||||
resource: resource_t,
|
resource: resource_t,
|
||||||
@ -119,6 +145,39 @@ pub fn do_prlimit(
|
|||||||
*old_limit = *rlimits.get(resource)
|
*old_limit = *rlimits.get(resource)
|
||||||
}
|
}
|
||||||
if let Some(new_limit) = new_limit {
|
if let Some(new_limit) = new_limit {
|
||||||
|
// Privilege is not granted for setting hard limit
|
||||||
|
if new_limit.get_max() != u64::max_value() {
|
||||||
|
return_errno!(EPERM, "setting hard limit is not permitted")
|
||||||
|
}
|
||||||
|
if new_limit.get_cur() > new_limit.get_max() {
|
||||||
|
return_errno!(EINVAL, "soft limit is greater than hard limit");
|
||||||
|
}
|
||||||
|
|
||||||
|
let mut soft_rlimit_stack_size = rlimits.get(resource_t::RLIMIT_STACK).get_cur();
|
||||||
|
let mut soft_rlimit_data_size = rlimits.get(resource_t::RLIMIT_DATA).get_cur();
|
||||||
|
let mut soft_rlimit_address_space_size = rlimits.get(resource_t::RLIMIT_AS).get_cur();
|
||||||
|
match resource {
|
||||||
|
resource_t::RLIMIT_DATA => {
|
||||||
|
soft_rlimit_data_size = new_limit.get_cur();
|
||||||
|
}
|
||||||
|
resource_t::RLIMIT_STACK => {
|
||||||
|
soft_rlimit_stack_size = new_limit.get_cur();
|
||||||
|
}
|
||||||
|
resource_t::RLIMIT_AS => {
|
||||||
|
soft_rlimit_address_space_size = new_limit.get_cur();
|
||||||
|
}
|
||||||
|
_ => warn!("resource type not supported"),
|
||||||
|
}
|
||||||
|
|
||||||
|
let soft_data_and_stack_size = soft_rlimit_data_size
|
||||||
|
.checked_add(soft_rlimit_stack_size)
|
||||||
|
.ok_or_else(|| errno!(EOVERFLOW, "memory size overflow"))?;
|
||||||
|
|
||||||
|
// The mmap space size must be greater than zero.
|
||||||
|
if soft_rlimit_address_space_size <= soft_data_and_stack_size {
|
||||||
|
return_errno!(EINVAL, "RLIMIT_AS size is too small");
|
||||||
|
}
|
||||||
|
|
||||||
*rlimits.get_mut(resource) = *new_limit;
|
*rlimits.get_mut(resource) = *new_limit;
|
||||||
}
|
}
|
||||||
Ok(())
|
Ok(())
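The checks added to `do_prlimit` can be exercised in isolation with a sketch like the one below. It is a simplified model, not the Occlum code: `LimitError` and `validate` are hypothetical names, the errno values are only noted in comments, and the soft-greater-than-hard check is omitted because it cannot trigger once the hard limit is required to be `u64::MAX`.

```rust
/// Errors this sketch can report (hypothetical, loosely mirroring errno names).
#[derive(Debug, PartialEq)]
enum LimitError {
    HardLimitNotPermitted, // cf. EPERM
    Overflow,              // cf. EOVERFLOW
    AddressSpaceTooSmall,  // cf. EINVAL
}

/// Validate a requested hard limit together with the soft limits that
/// would be in effect for data, stack and address space after the update.
fn validate(
    new_max: u64,
    data: u64,
    stack: u64,
    address_space: u64,
) -> Result<(), LimitError> {
    // Changing the hard limit is not allowed; it must stay unlimited.
    if new_max != u64::MAX {
        return Err(LimitError::HardLimitNotPermitted);
    }
    // Guard the sum against wrapping around.
    let data_and_stack = data.checked_add(stack).ok_or(LimitError::Overflow)?;
    // Something must remain for mmap, so the comparison is strict.
    if address_space <= data_and_stack {
        return Err(LimitError::AddressSpaceTooSmall);
    }
    Ok(())
}
```

One consequence of the strict comparison is the ordering used in `test_per_process_config.sh`: raising the virtual memory limit first keeps the later stack and heap increases below the address-space limit.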
|
||||||
|
@ -1,6 +1,7 @@
|
|||||||
use std::ptr;
|
use std::ptr;
|
||||||
|
|
||||||
use super::super::elf_file::ElfFile;
|
use super::super::elf_file::ElfFile;
|
||||||
|
use crate::misc::{resource_t, rlimit_t};
|
||||||
use crate::prelude::*;
|
use crate::prelude::*;
|
||||||
use crate::vm::{ProcessVM, ProcessVMBuilder};
|
use crate::vm::{ProcessVM, ProcessVMBuilder};
|
||||||
|
|
||||||
@ -8,9 +9,33 @@ pub fn do_init<'a, 'b>(
|
|||||||
elf_file: &'b ElfFile<'a>,
|
elf_file: &'b ElfFile<'a>,
|
||||||
ldso_elf_file: &'b ElfFile<'a>,
|
ldso_elf_file: &'b ElfFile<'a>,
|
||||||
) -> Result<ProcessVM> {
|
) -> Result<ProcessVM> {
|
||||||
let mut process_vm = ProcessVMBuilder::new(vec![elf_file, ldso_elf_file])
|
let mut process_vm = if current!().process().pid() == 0 {
|
||||||
.build()
|
// The parent is the idle process, so we can skip checking rlimits because the main
|
||||||
.cause_err(|e| errno!(e.errno(), "failed to create process VM"))?;
|
// process directly uses the memory configuration in Occlum.json
|
||||||
|
ProcessVMBuilder::new(vec![elf_file, ldso_elf_file])
|
||||||
|
.build()
|
||||||
|
.cause_err(|e| errno!(e.errno(), "failed to create process VM"))?
|
||||||
|
} else {
|
||||||
|
// The parent is not the idle process. Inherit the parent process's resource limits.
|
||||||
|
let rlimit = current!().rlimits().lock().unwrap().clone();
|
||||||
|
let child_heap_size = rlimit.get(resource_t::RLIMIT_DATA).get_cur();
|
||||||
|
let child_stack_size = rlimit.get(resource_t::RLIMIT_STACK).get_cur();
|
||||||
|
let child_mmap_size =
|
||||||
|
rlimit.get(resource_t::RLIMIT_AS).get_cur() - child_heap_size - child_stack_size;
|
||||||
|
|
||||||
|
debug!(
|
||||||
|
"new process: heap_size = {:?}, stack_size = {:?}, mmap_size = {:?}",
|
||||||
|
child_heap_size, child_stack_size, child_mmap_size
|
||||||
|
);
|
||||||
|
|
||||||
|
ProcessVMBuilder::new(vec![elf_file, ldso_elf_file])
|
||||||
|
.set_heap_size(child_heap_size as usize)
|
||||||
|
.set_stack_size(child_stack_size as usize)
|
||||||
|
.set_mmap_size(child_mmap_size as usize)
|
||||||
|
.clone()
|
||||||
|
.build()
|
||||||
|
.cause_err(|e| errno!(e.errno(), "failed to create process VM"))?
|
||||||
|
};
|
||||||
|
|
||||||
// Relocate symbols
|
// Relocate symbols
|
||||||
//reloc_symbols(process_base_addr, elf_file)?;
|
//reloc_symbols(process_base_addr, elf_file)?;
|
||||||
|
@ -180,6 +180,7 @@ fn new_process(
|
|||||||
};
|
};
|
||||||
let fs_ref = Arc::new(SgxMutex::new(current_ref.fs().lock().unwrap().clone()));
|
let fs_ref = Arc::new(SgxMutex::new(current_ref.fs().lock().unwrap().clone()));
|
||||||
let sched_ref = Arc::new(SgxMutex::new(current_ref.sched().lock().unwrap().clone()));
|
let sched_ref = Arc::new(SgxMutex::new(current_ref.sched().lock().unwrap().clone()));
|
||||||
|
let rlimit_ref = Arc::new(SgxMutex::new(current_ref.rlimits().lock().unwrap().clone()));
|
||||||
|
|
||||||
// Make the default thread name to be the process's corresponding elf file name
|
// Make the default thread name to be the process's corresponding elf file name
|
||||||
let elf_name = elf_path.rsplit('/').collect::<Vec<&str>>()[0];
|
let elf_name = elf_path.rsplit('/').collect::<Vec<&str>>()[0];
|
||||||
@ -191,6 +192,7 @@ fn new_process(
|
|||||||
.parent(process_ref)
|
.parent(process_ref)
|
||||||
.task(task)
|
.task(task)
|
||||||
.sched(sched_ref)
|
.sched(sched_ref)
|
||||||
|
.rlimits(rlimit_ref)
|
||||||
.fs(fs_ref)
|
.fs(fs_ref)
|
||||||
.files(files_ref)
|
.files(files_ref)
|
||||||
.name(thread_name)
|
.name(thread_name)
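The inheritance pattern in `new_process` (clone the parent's limits, then wrap the copy in a new reference-counted mutex) can be illustrated with the standard library alone. This sketch uses `std::sync::Mutex` in place of `SgxMutex`, and `Limits` and `inherit` are hypothetical names chosen for the example.

```rust
use std::sync::{Arc, Mutex};

/// Simplified stand-in for a per-process limit table (hypothetical type).
#[derive(Clone, Debug, PartialEq)]
struct Limits {
    stack: u64,
    heap: u64,
}

/// Give a child its own copy of the parent's limits, wrapped in a fresh
/// `Arc<Mutex<_>>` so that later changes on either side stay independent.
fn inherit(parent: &Arc<Mutex<Limits>>) -> Arc<Mutex<Limits>> {
    // The lock guard is dropped at the end of this statement.
    let snapshot = parent.lock().unwrap().clone();
    Arc::new(Mutex::new(snapshot))
}
```

Because the child holds a snapshot and not a shared handle, a limit changed in the parent after the child is spawned does not affect the child.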
|
||||||
|
@ -1,6 +1,7 @@
|
|||||||
use super::super::task::Task;
|
use super::super::task::Task;
|
||||||
use super::super::thread::ThreadId;
|
use super::super::thread::ThreadId;
|
||||||
use super::{ProcessBuilder, ThreadRef};
|
use super::{ProcessBuilder, ThreadRef};
|
||||||
|
use crate::misc::ResourceLimits;
|
||||||
/// Process 0, a.k.a, the idle process.
|
/// Process 0, a.k.a, the idle process.
|
||||||
///
|
///
|
||||||
/// The idle process has no practical use except making process 1 (a.k.a, the init process)
|
||||||
@ -19,11 +20,15 @@ fn create_idle_thread() -> Result<ThreadRef> {
|
|||||||
let dummy_vm = Arc::new(ProcessVM::default());
|
let dummy_vm = Arc::new(ProcessVM::default());
|
||||||
let dummy_task = Task::default();
|
let dummy_task = Task::default();
|
||||||
|
|
||||||
|
// Get rlimits from Occlum.json
|
||||||
|
let rlimits = Arc::new(SgxMutex::new(ResourceLimits::default()));
|
||||||
|
|
||||||
// Assemble the idle process
|
// Assemble the idle process
|
||||||
let idle_process = ProcessBuilder::new()
|
let idle_process = ProcessBuilder::new()
|
||||||
.tid(dummy_tid)
|
.tid(dummy_tid)
|
||||||
.vm(dummy_vm)
|
.vm(dummy_vm)
|
||||||
.task(dummy_task)
|
.task(dummy_task)
|
||||||
|
.rlimits(rlimits)
|
||||||
.no_parent(true)
|
.no_parent(true)
|
||||||
.build()?;
|
.build()?;
|
||||||
debug_assert!(idle_process.pid() == 0);
|
debug_assert!(idle_process.pid() == 0);
|
||||||
|
@ -9,7 +9,7 @@ use super::vm_manager::{
|
|||||||
use super::vm_perms::VMPerms;
|
use super::vm_perms::VMPerms;
|
||||||
use std::sync::atomic::{AtomicUsize, Ordering};
|
use std::sync::atomic::{AtomicUsize, Ordering};
|
||||||
|
|
||||||
#[derive(Debug)]
|
#[derive(Debug, Clone)]
|
||||||
pub struct ProcessVMBuilder<'a, 'b> {
|
pub struct ProcessVMBuilder<'a, 'b> {
|
||||||
elfs: Vec<&'b ElfFile<'a>>,
|
elfs: Vec<&'b ElfFile<'a>>,
|
||||||
heap_size: Option<usize>,
|
heap_size: Option<usize>,
|
||||||
|