[demos] Update tf serving example with resnet model

Zheng, Qi 2023-03-31 10:01:59 +08:00 committed by volcano
parent 6cb368fbbe
commit 91dd93d9a4
11 changed files with 102 additions and 78 deletions

@ -55,9 +55,9 @@ Now users could send inference request with server certificates (`server.crt`).
There are prebuilt docker images that can be used for the examples, either in the following docker way or the [`kubernetes`](./kubernetes/) way. Users can pull them directly and try the example.
```
docker pull occlum/init_ra_server:0.29.2-ubuntu20.04
docker pull occlum/tf_demo:0.29.2-ubuntu20.04
docker pull occlum/tf_demo_client:0.29.2-ubuntu20.04
docker pull occlum/init_ra_server:0.29.5-ubuntu20.04
docker pull occlum/tf_demo:0.29.5-ubuntu20.04
docker pull occlum/tf_demo_client:0.29.5-ubuntu20.04
```
If users want to build or customize the images, please check the following parts.
@ -66,11 +66,11 @@ If users want to build or customize the images, please check below part.
Our target is to deploy the demo in separate container images, so docker image builds are necessary steps. Thanks to the `docker run in docker` method, this example can be built in the Occlum development container image.
First, please make sure `docker` is installed successfully on your host. Then start the Occlum container (use version `0.29.2-ubuntu20.04` for example) as below.
First, please make sure `docker` is installed successfully on your host. Then start the Occlum container (use version `latest-ubuntu20.04` for example) as below.
```
$ sudo docker run --rm -itd --network host \
-v $(which docker):/usr/bin/docker -v /var/run/docker.sock:/var/run/docker.sock \
occlum/occlum:0.29.2-ubuntu20.04
occlum/occlum:latest-ubuntu20.04
```
All the following steps run in the above container.
@ -101,7 +101,7 @@ For the tensorflow-serving, there is no need rebuild from source, just use the o
Once all the content is ready, the runtime container image builds are good to go.
This step builds two container images, `init_ra_server` and `tf_demo`.
```
# ./build_container_images.sh <registry>
# ./build_container_images.sh <registry> <tag>
```
`<registry>` means the docker registry prefix and `<tag>` the image tag for the generated container images.
@ -128,12 +128,13 @@ usage: run_container.sh [OPTION]...
-p <GRPC Server port> default 50051.
-u <PCCS URL> default https://localhost:8081/sgx/certification/v3/.
-r <registry prefix> the registry for this demo container images.
-g <image tag> the container image tag, default "latest".
-h <usage> usage help
```
For example, using the PCCS service from aliyun.
```
$ sudo ./run_container.sh -s localhost -p 50051 -u https://sgx-dcap-server.cn-shanghai.aliyuncs.com/sgx/certification/v3/ -r demo
$ sudo ./run_container.sh -s localhost -p 50051 -u https://sgx-dcap-server.cn-shanghai.aliyuncs.com/sgx/certification/v3/ -r demo -g <tag>
```
If everything goes well, the tensorflow serving service will be available through the GRPC secure channel `localhost:9000`.
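Before sending real inference requests, the secure channel can be probed with a few lines of python. This is a minimal sketch, assuming the `server.crt` generated earlier and the default `localhost:9000` address; adjust the paths for your setup.
```
import grpc

# Load the server certificate generated in the previous steps
with open('ssl_configure/server.crt', 'rb') as f:
    creds = grpc.ssl_channel_credentials(f.read())

channel = grpc.secure_channel('localhost:9000', creds)
try:
    # Block until the TLS connection is established, or time out after 10s
    grpc.channel_ready_future(channel).result(timeout=10)
    print('tensorflow serving GRPC channel is ready')
except grpc.FutureTimeoutError:
    print('channel not ready, check the service and the certificate')
```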
@ -142,12 +143,20 @@ If everything goes well, the tensorflow serving service would be available by GR
There is an example python based [`inference client`](./client/resnet_client_grpc.py) which sends a picture to the tensorflow serving service to do inference with the previously generated server certificate.
Install the dependent python packages.
```
# pip3 install -r client/requirements.txt
```
Start the inference request.
```
# cd client
# python3 inception_client.py --server=localhost:9000 --crt ../ssl_configure/server.crt --image cat.jpg
# python3 resnet_client_grpc.py --server=localhost:9000 --crt ../ssl_configure/server.crt --image cat.jpg
```
If everything goes well, you will get the most likely prediction class (an int value; the mapping can be found at https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt) and its probability.
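To turn the class index into a human readable label, a small sketch like the one below can be used. Note the off-by-one handling is an assumption: `ImageNetLabels.txt` starts with a `background` entry, so depending on whether the served model outputs 1000 or 1001 classes the index may need shifting by one.
```
import urllib.request

LABELS_URL = 'https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt'

# One label per line; the first entry is 'background'
labels = urllib.request.urlopen(LABELS_URL).read().decode('utf-8').splitlines()

class_idx = 282  # hypothetical index printed by resnet_client_grpc.py

# Assumption: the model outputs 1000 classes without the 'background'
# entry, so shift by one to line up with the 1001-entry label file.
print(labels[class_idx + 1])
```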
Or you can use the demo client container image to do the inference test.
```
$ docker run --rm --network host <registry>/tf_demo_client:<tag> python3 inception_client.py --server=localhost:9000 --crt server.crt --image cat.jpg
$ docker run --rm --network host <registry>/tf_demo_client:<tag> python3 resnet_client_grpc.py --server=localhost:9000 --crt server.crt --image cat.jpg
```

@ -27,10 +27,11 @@ function build_tf_serving()
    # Dump tensorflow/serving container rootfs content
    ./dump_rootfs.sh -i tensorflow/serving -d ${TF_DIR} -g 2.5.1
    pushd ${TF_DIR}
    # Download pretrained inception model
    rm -rf INCEPTION*
    curl -O https://s3-us-west-2.amazonaws.com/tf-test-models/INCEPTION.zip
    unzip INCEPTION.zip
    # Download pretrained resnet model
    rm -rf resnet*
    wget "https://tfhub.dev/tensorflow/resnet_50/classification/1?tf-hub-format=compressed" -O resnet.tar.gz
    mkdir -p resnet/123
    tar zxf resnet.tar.gz -C resnet/123
    popd
}
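TensorFlow Serving expects a numeric version subdirectory under the model base path, which is why the archive is extracted into `resnet/123`. The signature and tensor names the clients rely on can be double-checked with a short sketch like the following; the expected `input_1` and `activation_49` names are properties of this particular TF Hub export, so treat them as assumptions to verify rather than guarantees.
```
import tensorflow as tf

# Load the extracted SavedModel (the path created by build_tf_serving)
model = tf.saved_model.load('resnet/123')
sig = model.signatures['serving_default']

# Input is expected to be a float32 [None, 224, 224, 3] tensor named 'input_1'
print(sig.structured_input_signature)
# The output key should match what the clients read, e.g. 'activation_49'
print(sig.structured_outputs)
```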

@ -1,6 +1,8 @@
import grpc
import tensorflow as tf
import argparse, time, grpc, asyncio
import numpy as np
from PIL import Image
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
@ -22,17 +24,22 @@ class benchmark_engine(object):
    def __prepare__(self):
        for idx in range(self.concurrent_num):
            # get image array
            with open(self.image, 'rb') as f:
                input_name = 'images'
                input_shape = [1]
                input_data = f.read()
            # Load the image and convert to RGB
            img = Image.open(self.image).convert('RGB')
            img = img.resize((224, 224), Image.BICUBIC)
            img_array = np.array(img)
            img_array = img_array.astype(np.float32) / 255.0
            # create request
            request = predict_pb2.PredictRequest()
            request.model_spec.name = 'INCEPTION'
            request.model_spec.signature_name = 'predict_images'
            request.inputs[input_name].CopyFrom(
                tf.make_tensor_proto(input_data, shape=input_shape))
            request.model_spec.name = 'resnet'
            request.model_spec.signature_name = 'serving_default'
            request.inputs['input_1'].CopyFrom(
                tf.make_tensor_proto(img_array, shape=[1, 224, 224, 3]))
            self.request_signatures.append(request)
        return None
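Given the `asyncio` import above, the prepared requests can be fired concurrently with the `grpc.aio` API. Below is a minimal sketch of such a loop; the `address`, `creds`, `request_signatures` and `loop_num` names are illustrative inputs, not the demo's exact benchmark code.
```
import asyncio
import time

import grpc
from tensorflow_serving.apis import prediction_service_pb2_grpc


async def fire_requests(address, creds, request_signatures, loop_num):
    async with grpc.aio.secure_channel(address, creds) as channel:
        stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
        start = time.time()
        # One in-flight call per prepared request, repeated loop_num times
        tasks = [stub.Predict(request, timeout=10.0)
                 for _ in range(loop_num)
                 for request in request_signatures]
        await asyncio.gather(*tasks)
        elapsed = time.time() - start
        print('requests: %d, tps: %.2f' % (len(tasks), len(tasks) / elapsed))
```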

@ -1,42 +0,0 @@
import grpc
import tensorflow as tf
import argparse

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc


def main():
    with open(args.crt, 'rb') as f:
        creds = grpc.ssl_channel_credentials(f.read())
    channel = grpc.secure_channel(args.server, creds)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    # Send request
    with open(args.image, 'rb') as f:
        # See prediction_service.proto for gRPC request/response details.
        request = predict_pb2.PredictRequest()
        request.model_spec.name = 'INCEPTION'
        request.model_spec.signature_name = 'predict_images'
        input_name = 'images'
        input_shape = [1]
        input_data = f.read()
        request.inputs[input_name].CopyFrom(
            tf.make_tensor_proto(input_data, shape=input_shape))
        result = stub.Predict(request, 10.0)  # 10 secs timeout
        print(result)
        print("Inception Client Passed")


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--server', default='localhost:9000',
                        help='Tensorflow Model Server Address')
    parser.add_argument('--crt', default=None, type=str,
                        help='TLS certificate file path')
    parser.add_argument('--image', default='Siberian_Husky_bi-eyed_Flickr.jpg',
                        help='Path to the image')
    args = parser.parse_args()
    main()

@ -1,4 +1,5 @@
grpcio>=1.34.0
aiohttp>=3.7.0
tensorflow>=2.3.0
tensorflow-serving-api>=2.3.0
tensorflow==2.11
tensorflow-serving-api==2.11
Pillow==9.4

@ -0,0 +1,48 @@
import grpc
import tensorflow as tf
import argparse
import numpy as np

from PIL import Image
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc


def main():
    with open(args.crt, 'rb') as f:
        creds = grpc.ssl_channel_credentials(f.read())
    channel = grpc.secure_channel(args.server, creds)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    # Load the image and convert to RGB
    img = Image.open(args.image).convert('RGB')
    img = img.resize((224, 224), Image.BICUBIC)
    img_array = np.array(img)
    img_array = img_array.astype(np.float32) / 255.0

    # Create a request message for TensorFlow Serving
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'resnet'
    request.model_spec.signature_name = 'serving_default'
    request.inputs['input_1'].CopyFrom(
        tf.make_tensor_proto(img_array, shape=[1, 224, 224, 3]))

    # Send the request to TensorFlow Serving
    result = stub.Predict(request, 10.0)  # 10 secs timeout

    # Print the predicted class and probability
    result = result.outputs['activation_49'].float_val
    class_idx = np.argmax(result)
    print('Prediction class: ', class_idx)
    print('Probability: ', result[int(class_idx)])


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--server', default='localhost:9000',
                        help='Tensorflow Model Server Address')
    parser.add_argument('--crt', default=None, type=str,
                        help='TLS certificate file path')
    parser.add_argument('--image', default='Siberian_Husky_bi-eyed_Flickr.jpg',
                        help='Path to the image')
    args = parser.parse_args()
    main()

@ -56,12 +56,12 @@ usage: build.sh [OPTION]...
For example, the command below generates three container images.
```
# ./build.sh -r demo -g 0.29.2
# ./build.sh -r demo -g 0.29.5
```
* **`demo/init_ra_server:0.29.2`** acts as key broker pod.
* **`demo/tf_demo:0.29.2`** acts as tensorflow serving pod.
* **`demo/tf_demo_client:0.29.2`** acts as client.
* **`demo/init_ra_server:0.29.5`** acts as key broker pod.
* **`demo/tf_demo:0.29.5`** acts as tensorflow serving pod.
* **`demo/tf_demo_client:0.29.5`** acts as client.
## How to test
@ -110,7 +110,7 @@ In default, only one replica for the tensorflow serving pod.
### Try the inference request
```
$ docker run --rm --network host demo/tf_demo_client:0.29.2 python3 inception_client.py --server=localhost:31001 --crt server.crt --image cat.jpg
$ docker run --rm --network host demo/tf_demo_client:0.29.5 python3 resnet_client_grpc.py --server=localhost:31001 --crt server.crt --image cat.jpg
```
If successful, it prints the classification results.
@ -120,7 +120,7 @@ If successful, it prints the classification results.
The command below runs a benchmark test against the tensorflow serving service running in Occlum.
```
$ docker run --rm --network host demo/tf_demo_client:0.29.2 python3 benchmark.py --server localhost:31001 --crt server.crt --cnum 4 --loop 10 --image cat.jpg
$ docker run --rm --network host demo/tf_demo_client:0.29.5 python3 benchmark.py --server localhost:31001 --crt server.crt --cnum 4 --loop 10 --image cat.jpg
```
Scaling up the number of tensorflow serving pods can achieve better `tps`.

@ -29,8 +29,8 @@ spec:
          - occlum
          - run
          - /bin/tensorflow_model_server
          - --model_name=INCEPTION
          - --model_base_path=/model/INCEPTION/INCEPTION
          - --model_name=resnet
          - --model_base_path=/models/resnet
          - --port=9001
          - --ssl_config_file=/etc/tf_ssl.cfg
        ports:

@ -11,12 +11,12 @@ pushd occlum_server
occlum run /bin/server ${GRPC_SERVER} &
popd
sleep 3
sleep 10
echo "Start Tensorflow-Serving on backgound ..."
pushd occlum_tf
taskset -c 0,1 occlum run /bin/tensorflow_model_server \
--model_name=INCEPTION --model_base_path=/model/INCEPTION/INCEPTION \
--model_name=resnet --model_base_path=/models/resnet \
--port=9000 --ssl_config_file="/etc/tf_ssl.cfg"
popd

@ -57,5 +57,5 @@ docker run --network host \
--env GRPC_SERVER="${GRPC_SERVER}" \
${registry}/tf_demo:${tag} \
taskset -c 0,1 occlum run /bin/tensorflow_model_server \
--model_name=INCEPTION --model_base_path=/model/INCEPTION/INCEPTION \
--model_name=resnet --model_base_path=/models/resnet \
--port=9000 --ssl_config_file="/etc/tf_ssl.cfg" &

@ -2,10 +2,10 @@ includes:
  - base.yaml
targets:
  # copy model
  - target: /model
  - target: /models
    copy:
      - dirs:
          - ${TF_DIR}/INCEPTION
          - ${TF_DIR}/resnet
  - target: /bin
    copy:
      - files: