[demos] Update tf serving example with resnet model
parent 6cb368fbbe
commit 91dd93d9a4
@@ -55,9 +55,9 @@ Now users could send inference request with server certificates (`server.crt`).

There are prebuilt docker images that can be used for the examples, either in the following docker way or the [`kubernetes`](./kubernetes/) way. Users can pull them directly and try the example.
```
-docker pull occlum/init_ra_server:0.29.2-ubuntu20.04
-docker pull occlum/tf_demo:0.29.2-ubuntu20.04
-docker pull occlum/tf_demo_client:0.29.2-ubuntu20.04
+docker pull occlum/init_ra_server:0.29.5-ubuntu20.04
+docker pull occlum/tf_demo:0.29.5-ubuntu20.04
+docker pull occlum/tf_demo_client:0.29.5-ubuntu20.04
```

If users want to build or customize the images, please check the part below.
@@ -66,11 +66,11 @@ If users want to build or customize the images, please check below part.

Our target is to deploy the demo in separate container images, so docker builds are necessary steps. Thanks to the `docker run in docker` method, this example can be built in the Occlum development container image.

-First, please make sure `docker` is installed successfully in your host. Then start the Occlum container (use version `0.29.2-ubuntu20.04` for example) as below.
+First, please make sure `docker` is installed successfully in your host. Then start the Occlum container (use version `latest-ubuntu20.04` for example) as below.
```
$ sudo docker run --rm -itd --network host \
        -v $(which docker):/usr/bin/docker -v /var/run/docker.sock:/var/run/docker.sock \
-        occlum/occlum:0.29.2-ubuntu20.04
+        occlum/occlum:latest-ubuntu20.04
```

All the following are running in the above container.
@@ -101,7 +101,7 @@ For the tensorflow-serving, there is no need rebuild from source, just use the o
Once all content is ready, the runtime container image builds are good to go.
This step builds two container images, `init_ra_server` and `tf_demo`.
```
-# ./build_container_images.sh <registry>
+# ./build_container_images.sh <registry> <tag>
```

`<registry>` means the docker registry prefix for the generated container images.
@@ -128,12 +128,13 @@ usage: run_container.sh [OPTION]...
    -p <GRPC Server port> default 50051.
    -u <PCCS URL> default https://localhost:8081/sgx/certification/v3/.
    -r <registry prefix> the registry for this demo container images.
+    -g <image tag> the container images tag, default it is "latest".
    -h <usage> usage help
```

For example, using the PCCS service from aliyun.
```
-$ sudo ./run_container.sh -s  localhost -p 50051 -u https://sgx-dcap-server.cn-shanghai.aliyuncs.com/sgx/certification/v3/ -r demo
+$ sudo ./run_container.sh -s  localhost -p 50051 -u https://sgx-dcap-server.cn-shanghai.aliyuncs.com/sgx/certification/v3/ -r demo -g <tag>
```

If everything goes well, the tensorflow serving service will be available via the gRPC secure channel `localhost:9000`.
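As a quick sanity check of that endpoint, a minimal sketch like the following can confirm the secure channel comes up. It assumes the `server.crt` generated earlier in the demo and the default `localhost:9000` address; the certificate path is illustrative.

```python
import grpc

# Open a TLS channel to the TF Serving endpoint using the demo's server
# certificate, then block until the channel is ready (or time out).
with open('ssl_configure/server.crt', 'rb') as f:       # assumed cert path
    creds = grpc.ssl_channel_credentials(f.read())
channel = grpc.secure_channel('localhost:9000', creds)
grpc.channel_ready_future(channel).result(timeout=10)   # raises on failure
print('TensorFlow Serving gRPC endpoint is reachable')
```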
@@ -142,12 +143,20 @@ If everything goes well, the tensorflow serving service would be available by GR

There is an example Python-based [`inference client`](./client/inception_client.py) which sends a picture to the tensorflow serving service to do inference with the previously generated server certificate.

Install the dependent python packages.
```
# pip3 install -r client/requirements.txt
```

Start the inference request.
```
# cd client
-# python3 inception_client.py --server=localhost:9000 --crt ../ssl_configure/server.crt --image cat.jpg
+# python3 resnet_client_grpc.py --server=localhost:9000 --crt ../ssl_configure/server.crt --image cat.jpg
```

If everything goes well, you will get the most likely prediction class (an int value; the mapping can be found at https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt) and its probability.

Or you can use the demo client container image to do the inference test.
```
-$ docker run --rm --network host <registry>/tf_demo_client:<tag> python3 inception_client.py --server=localhost:9000 --crt server.crt --image cat.jpg
+$ docker run --rm --network host <registry>/tf_demo_client:<tag> python3 resnet_client_grpc.py --server=localhost:9000 --crt server.crt --image cat.jpg
```
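For mapping the returned class index to a human-readable label, a small sketch along these lines can be used. It assumes network access to the ImageNetLabels.txt file referenced above; note that some ResNet exports include a leading "background" class, so the index may need an offset of one.

```python
import urllib.request

LABELS_URL = 'https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt'

def class_idx_to_label(class_idx: int) -> str:
    # Download the label list (one label per line) and look up the index
    # returned by the resnet client.  Adjust by one if the model output
    # does not include the leading "background" entry.
    labels = urllib.request.urlopen(LABELS_URL).read().decode('utf-8').splitlines()
    return labels[class_idx]

# Hypothetical usage with an index printed by resnet_client_grpc.py:
print(class_idx_to_label(285))
```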
@@ -27,10 +27,11 @@ function build_tf_serving()
    # Dump tensorflow/serving container rootfs content
    ./dump_rootfs.sh -i tensorflow/serving -d ${TF_DIR} -g 2.5.1
    pushd ${TF_DIR}
-    # Download pretrained inception model
-    rm -rf INCEPTION*
-    curl -O https://s3-us-west-2.amazonaws.com/tf-test-models/INCEPTION.zip
-    unzip INCEPTION.zip
+    # Download pretrained resnet model
+    rm -rf resnet*
+    wget https://tfhub.dev/tensorflow/resnet_50/classification/1?tf-hub-format=compressed -O resnet.tar.gz
+    mkdir -p resnet/123
+    tar zxf resnet.tar.gz -C resnet/123
    popd
}
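The archive is extracted into a numeric version directory (`resnet/123`), which is the layout `tensorflow_model_server` expects under a model base path. If the signature name or the input/output tensor keys used by the clients are in doubt, the exported SavedModel can be inspected with a short sketch like this (assuming TensorFlow is installed and the model was extracted as above):

```python
import tensorflow as tf

# Load the extracted SavedModel and print its serving signature.  This is
# where the 'serving_default' signature name and the 'input_1' input key
# used by the gRPC clients come from.
model = tf.saved_model.load('resnet/123')
sig = model.signatures['serving_default']
print(sig.structured_input_signature)  # input tensor names and shapes
print(sig.structured_outputs)          # output tensor names, e.g. 'activation_49'
```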
@@ -1,6 +1,8 @@
import grpc
import tensorflow as tf
import argparse, time, grpc, asyncio
+import numpy as np
+from PIL import Image

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
@@ -22,17 +24,22 @@ class benchmark_engine(object):
    def __prepare__(self):
        for idx in range(self.concurrent_num):
            # get image array
-            with open(self.image, 'rb') as f:
-                input_name = 'images'
-                input_shape = [1]
-                input_data = f.read()
+            # with open(self.image, 'rb') as f:
+            #     input_name = 'images'
+            #     input_shape = [1]
+            #     input_data = f.read()

+            # Load the image and convert to RGB
+            img = Image.open(self.image).convert('RGB')
+            img = img.resize((224,224), Image.BICUBIC)
+            img_array = np.array(img)
+            img_array = img_array.astype(np.float32) /255.0
            # create request
            request = predict_pb2.PredictRequest()
-            request.model_spec.name = 'INCEPTION'
-            request.model_spec.signature_name = 'predict_images'
-            request.inputs[input_name].CopyFrom(
-                tf.make_tensor_proto(input_data, shape=input_shape))
+            request.model_spec.name = 'resnet'
+            request.model_spec.signature_name = 'serving_default'
+            request.inputs['input_1'].CopyFrom(
+                tf.make_tensor_proto(img_array, shape=[1,224,224,3]))

            self.request_signatures.append(request)
        return None
@@ -1,42 +0,0 @@
import grpc
import tensorflow as tf
import argparse

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc


def main():
  with open(args.crt, 'rb') as f:
    creds = grpc.ssl_channel_credentials(f.read())
  channel = grpc.secure_channel(args.server, creds)
  stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
  # Send request
  with open(args.image, 'rb') as f:
    # See prediction_service.proto for gRPC request/response details.
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'INCEPTION'
    request.model_spec.signature_name = 'predict_images'

    input_name = 'images'
    input_shape = [1]
    input_data = f.read()
    request.inputs[input_name].CopyFrom(
      tf.make_tensor_proto(input_data, shape=input_shape))

    result = stub.Predict(request, 10.0)  # 10 secs timeout
    print(result)

  print("Inception Client Passed")


if __name__ == '__main__':
  parser = argparse.ArgumentParser()
  parser.add_argument('--server', default='localhost:9000',
                      help='Tenforflow Model Server Address')
  parser.add_argument('--crt', default=None, type=str, help='TLS certificate file path')
  parser.add_argument('--image', default='Siberian_Husky_bi-eyed_Flickr.jpg',
                      help='Path to the image')
  args = parser.parse_args()

  main()
@@ -1,4 +1,5 @@
grpcio>=1.34.0
aiohttp>=3.7.0
-tensorflow>=2.3.0
-tensorflow-serving-api>=2.3.0
+tensorflow==2.11
+tensorflow-serving-api==2.11
+Pillow==9.4
example/client/resnet_client_grpc.py (new file, 48 lines)
@@ -0,0 +1,48 @@
import grpc
import tensorflow as tf
import argparse
import numpy as np
from PIL import Image

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc


def main():
  with open(args.crt, 'rb') as f:
    creds = grpc.ssl_channel_credentials(f.read())
  channel = grpc.secure_channel(args.server, creds)
  stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

  # Load the image and convert to RGB
  img = Image.open(args.image).convert('RGB')
  img = img.resize((224,224), Image.BICUBIC)
  img_array = np.array(img)
  img_array = img_array.astype(np.float32) /255.0

  # Create a request message for TensorFlow Serving
  request = predict_pb2.PredictRequest()
  request.model_spec.name = 'resnet'
  request.model_spec.signature_name = 'serving_default'
  request.inputs['input_1'].CopyFrom(
    tf.make_tensor_proto(img_array, shape=[1,224,224,3]))

  # Send the request to TensorFlow Serving
  result = stub.Predict(request, 10.0)

  # Print the predicted class and probability
  result = result.outputs['activation_49'].float_val
  class_idx = np.argmax(result)
  print('Prediction class: ', class_idx)
  print('Probability: ', result[int(class_idx)])

if __name__ == '__main__':
  parser = argparse.ArgumentParser()
  parser.add_argument('--server', default='localhost:9000',
                      help='Tenforflow Model Server Address')
  parser.add_argument('--crt', default=None, type=str, help='TLS certificate file path')
  parser.add_argument('--image', default='Siberian_Husky_bi-eyed_Flickr.jpg',
                      help='Path to the image')
  args = parser.parse_args()

  main()
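The `activation_49` output key is specific to this particular ResNet export. If a different SavedModel is served, the available output keys can be listed from the response itself; a hypothetical helper, reusing a `stub` and `request` built exactly as in the client above:

```python
def list_output_keys(stub, request, timeout=10.0):
    # Issue the PredictRequest and print every output tensor key in the
    # response, so the right key can be used instead of 'activation_49'.
    result = stub.Predict(request, timeout)
    for key, tensor in result.outputs.items():
        print(key, tensor.tensor_shape)
    return list(result.outputs.keys())
```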
@@ -56,12 +56,12 @@ usage: build.sh [OPTION]...

For example, the below command generates three container images.
```
-# ./build.sh -r demo -g 0.29.2
+# ./build.sh -r demo -g 0.29.5
```

-* **`demo/init_ra_server:0.29.2`** acts as key broker pod.
-* **`demo/tf_demo:0.29.2`** acts as tensorflow serving pod.
-* **`demo/tf_demo_client:0.29.2`** acts as client.
+* **`demo/init_ra_server:0.29.5`** acts as key broker pod.
+* **`demo/tf_demo:0.29.5`** acts as tensorflow serving pod.
+* **`demo/tf_demo_client:0.29.5`** acts as client.

## How to test
@@ -110,7 +110,7 @@ In default, only one replica for the tensorflow serving pod.
### Try the inference request

```
-$ docker run --rm --network host demo/tf_demo_client:0.29.2 python3 inception_client.py --server=localhost:31001 --crt server.crt --image cat.jpg
+$ docker run --rm --network host demo/tf_demo_client:0.29.5 python3 resnet_client_grpc.py --server=localhost:31001 --crt server.crt --image cat.jpg
```

If successful, it prints the classification results.
@@ -120,7 +120,7 @@ If successful, it prints the classification results.
The below command can run a benchmark test against the tensorflow serving service running in Occlum.

```
-$ docker run --rm --network host demo/tf_demo_client:0.29.2 python3 benchmark.py --server localhost:31001 --crt server.crt --cnum 4 --loop 10 --image cat.jpg
+$ docker run --rm --network host demo/tf_demo_client:0.29.5 python3 benchmark.py --server localhost:31001 --crt server.crt --cnum 4 --loop 10 --image cat.jpg
```

Try scaling up the number of tensorflow serving pods; better `tps` can be achieved.
@@ -29,8 +29,8 @@ spec:
        - occlum
        - run
        - /bin/tensorflow_model_server
-        - --model_name=INCEPTION
-        - --model_base_path=/model/INCEPTION/INCEPTION
+        - --model_name=resnet
+        - --model_base_path=/models/resnet
        - --port=9001
        - --ssl_config_file=/etc/tf_ssl.cfg
        ports:
@@ -11,12 +11,12 @@ pushd occlum_server
occlum run /bin/server ${GRPC_SERVER} &
popd

-sleep 3
+sleep 10

echo "Start Tensorflow-Serving on backgound ..."

pushd occlum_tf
taskset -c 0,1 occlum run /bin/tensorflow_model_server \
-        --model_name=INCEPTION --model_base_path=/model/INCEPTION/INCEPTION \
+        --model_name=resnet --model_base_path=/models/resnet \
        --port=9000 --ssl_config_file="/etc/tf_ssl.cfg"
popd
@@ -57,5 +57,5 @@ docker run --network host \
        --env GRPC_SERVER="${GRPC_SERVER}" \
        ${registry}/tf_demo:${tag} \
        taskset -c 0,1 occlum run /bin/tensorflow_model_server \
-        --model_name=INCEPTION --model_base_path=/model/INCEPTION/INCEPTION \
+        --model_name=resnet --model_base_path=/models/resnet \
        --port=9000 --ssl_config_file="/etc/tf_ssl.cfg" &
@@ -2,10 +2,10 @@ includes:
  - base.yaml
targets:
  # copy model
-  - target: /model
+  - target: /models
    copy:
      - dirs:
-        - ${TF_DIR}/INCEPTION
+        - ${TF_DIR}/resnet
  - target: /bin
    copy:
      - files:
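Note that `tensorflow_model_server` only loads the model if it finds a numeric version subdirectory under the base path (the build script creates `resnet/123`). A small pre-build sanity check, sketched here under the assumption that `TF_DIR` is exported as in the build script:

```python
import os
import sys

# Verify the resnet model has at least one numeric version directory
# (e.g. resnet/123/saved_model.pb) before it is copied into /models/resnet.
tf_dir = os.environ.get('TF_DIR', '.')                  # assumed env var
resnet_dir = os.path.join(tf_dir, 'resnet')
versions = [d for d in os.listdir(resnet_dir) if d.isdigit()]
if not versions:
    sys.exit(f'no version directory found under {resnet_dir}')
print('found model version(s):', versions)
```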