[readthedocs] Add doc for Occlum LLM demo
parent d2f2c3ca04
commit e129c3c791
BIN docs/readthedocs/docs/source/images/occlum-llm.png (new binary file, 138 KiB, not shown)
@@ -30,6 +30,7 @@ Table of Contents
    tutorials/gen_occlum_instance.md
    tutorials/distributed_pytorch.md
    tutorials/occlum_ppml.md
+   tutorials/LLM_inference.md
 
 .. toctree::
    :maxdepth: 2
18 docs/readthedocs/docs/source/tutorials/LLM_inference.md (new file)
@@ -0,0 +1,18 @@
# LLM Inference in TEE

LLM (Large Language Model) inference in a TEE can protect the model, the input prompts, and the outputs. The key challenges are:

1. The performance of LLM inference in a TEE (CPU-only)
2. Whether LLM inference can run in a TEE at all

With the significant LLM inference speed-up brought by [BigDL-LLM](https://github.com/intel-analytics/BigDL/tree/main/python/llm) and the Occlum LibOS, high-performance and efficient LLM inference in TEE can now be realized.
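
As a concrete illustration, below is a minimal sketch of CPU inference with BigDL-LLM's low-bit optimization, the speed-up referred to above. The checkpoint path, prompt, and generation parameters are placeholders, not the demo's exact settings:

```python
# Minimal sketch of BigDL-LLM low-bit inference on CPU; the model path is a
# hypothetical example -- substitute any Hugging Face checkpoint you have.
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "meta-llama/Llama-2-7b-chat-hf"  # illustrative checkpoint

# load_in_4bit=True applies BigDL-LLM's INT4 optimization, which is what
# makes CPU (and hence TEE) inference practical.
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_path)

inputs = tokenizer("What is a TEE?", return_tensors="pt")
output = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Because Occlum is a LibOS that runs unmodified applications, a script like this can be packaged into an Occlum instance and executed inside the enclave as-is.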

## Overview

![occlum-llm](../images/occlum-llm.png)

The chart above shows the overall architecture and the inference flow.

For step 3, users can use the Occlum [init-ra AECS](https://occlum.readthedocs.io/en/latest/remote_attestation.html#init-ra-solution) solution, which requires no changes to the application.
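
To make "no changes to the application" concrete, here is a hedged sketch of the app-side view under init-ra: the Occlum init process performs remote attestation and provisions the secret before the app starts, so the app simply reads it. The file path below is hypothetical, for illustration only; see the linked init-ra AECS docs for the actual convention:

```python
# Hypothetical app-side view under init-ra: by the time this code runs, the
# Occlum init process has already attested to the AECS server and written the
# fetched secret to an agreed-upon location inside the enclave's filesystem.
# "/etc/saved_secret" is an illustrative path, not a fixed Occlum convention.
SECRET_PATH = "/etc/saved_secret"

with open(SECRET_PATH, "rb") as f:
    model_key = f.read()

# Note: the application itself contains no attestation code.
print(f"received a {len(model_key)}-byte key; proceeding to load the model")
```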

For more details, please refer to the [LLM demo](https://github.com/occlum/occlum/tree/master/demos/bigdl-llm).
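
For orientation, the sketch below drives a typical Occlum packaging flow (`occlum new` / `copy_bom` / `occlum build` / `occlum run`) from Python. The bom file name (`llm.yaml`) and script name (`chat.py`) are illustrative assumptions; treat the linked demo's own scripts as authoritative:

```python
# A hedged sketch of the usual Occlum packaging flow, driven from Python.
# "llm.yaml" and "chat.py" are hypothetical names used for illustration.
import subprocess

def sh(*cmd, cwd=None):
    """Echo and run a command, failing loudly on a non-zero exit."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, cwd=cwd, check=True)

sh("occlum", "new", "occlum_instance")                  # create an instance dir
sh("copy_bom", "-f", "../llm.yaml", "--root", "image",  # stage Python + code
   "--include-dir", "/opt/occlum/etc/template", cwd="occlum_instance")
sh("occlum", "build", cwd="occlum_instance")            # build the enclave image
sh("occlum", "run", "/bin/python3", "chat.py",          # run inference in TEE
   cwd="occlum_instance")
```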