diff --git a/docs/readthedocs/docs/source/images/occlum-llm.png b/docs/readthedocs/docs/source/images/occlum-llm.png
new file mode 100644
index 00000000..f7a7c29d
Binary files /dev/null and b/docs/readthedocs/docs/source/images/occlum-llm.png differ
diff --git a/docs/readthedocs/docs/source/index.rst b/docs/readthedocs/docs/source/index.rst
index 8531865f..e1d430de 100644
--- a/docs/readthedocs/docs/source/index.rst
+++ b/docs/readthedocs/docs/source/index.rst
@@ -30,6 +30,7 @@ Table of Contents
    tutorials/gen_occlum_instance.md
    tutorials/distributed_pytorch.md
    tutorials/occlum_ppml.md
+   tutorials/LLM_inference.md
 
 .. toctree::
    :maxdepth: 2
diff --git a/docs/readthedocs/docs/source/tutorials/LLM_inference.md b/docs/readthedocs/docs/source/tutorials/LLM_inference.md
new file mode 100644
index 00000000..a769882d
--- /dev/null
+++ b/docs/readthedocs/docs/source/tutorials/LLM_inference.md
@@ -0,0 +1,43 @@
+# LLM Inference in TEE
+
+LLM (Large Language Model) inference in a TEE can protect the model, the input prompts, and the outputs. The key challenges are:
+
+1. The performance of LLM inference in a TEE (CPU).
+2. Whether LLM inference can run in a TEE at all.
+
+With the significant LLM inference speed-up brought by [BigDL-LLM](https://github.com/intel-analytics/BigDL/tree/main/python/llm) and the Occlum LibOS, high-performance and efficient LLM inference in a TEE can now be realized.
+
+## Overview
+
+![LLM inference](../images/occlum-llm.png)
+
+The chart above shows the overall architecture and workflow.
+
+For step 3, users can use the Occlum [init-ra AECS](https://occlum.readthedocs.io/en/latest/remote_attestation.html#init-ra-solution) solution, which requires no changes to the application.
+
+For more details, please refer to the [LLM demo](https://github.com/occlum/occlum/tree/master/demos/bigdl-llm).
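+
+## Example: loading a model with BigDL-LLM (sketch)
+
+As a minimal sketch of what the application inside the Occlum instance might run, the snippet below loads a model with BigDL-LLM low-bit (INT4) optimizations for CPU inference. The model path and prompt are placeholders; refer to the [LLM demo](https://github.com/occlum/occlum/tree/master/demos/bigdl-llm) for the complete, tested workflow.
+
+```python
+# Minimal BigDL-LLM CPU inference sketch; model path and prompt are placeholders.
+from bigdl.llm.transformers import AutoModelForCausalLM
+from transformers import AutoTokenizer
+
+model_path = "/models/llama-2-7b-chat-hf"  # hypothetical path inside the Occlum image
+
+# Load the model with INT4 optimizations to speed up CPU inference.
+model = AutoModelForCausalLM.from_pretrained(model_path,
+                                             load_in_4bit=True,
+                                             trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+
+prompt = "What is a Trusted Execution Environment?"
+inputs = tokenizer(prompt, return_tensors="pt")
+
+# Generate a short completion; adjust max_new_tokens for your workload.
+output = model.generate(**inputs, max_new_tokens=32)
+print(tokenizer.decode(output[0], skip_special_tokens=True))
+```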