This document is based on openvino_2022.2.
I. Introduction
The OpenVINO™ toolkit is a comprehensive toolkit for quickly developing applications and solutions that address a variety of tasks, including emulation of human vision, automatic speech recognition, natural language processing, recommendation systems, and more.
Released in 2018; open source and free for commercial use.
1. The OpenVINO™ Toolkit
- Enables CNN-based deep learning inference at the edge
- Supports heterogeneous execution across Intel® CPUs, Intel® integrated graphics, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ VPUs
- Speeds time-to-market through an easy-to-use library of computer vision functions and pre-optimized kernels
- Includes optimized calls for computer vision standards, including OpenCV* and OpenCL™
2. OpenVINO™ Toolkit Components
- Deep Learning Model Optimizer
  - A cross-platform command-line tool that imports models from mainstream deep learning frameworks; model files may come from TensorFlow, PyTorch, Caffe, MXNet, ONNX, and other frameworks and tools. The Model Optimizer converts and optimizes imported models and exports them to an intermediate representation file.
- Deep Learning Inference Engine
  - A unified set of C++/Python APIs that allow high-performance inference on many hardware types, including Intel® CPUs, Intel® integrated graphics, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ Vision Processing Units (VPUs).
  - Inference Engine samples: a set of simple console applications demonstrating how to use the Inference Engine in your applications.
- Deep Learning Workbench (DL Workbench)
  - A web-based graphical environment and the official OpenVINO™ GUI, designed to make working with pre-trained deep learning computer vision and natural language processing models easier.
  - Post-Training Optimization Tool: a tool for calibrating a model and then executing it in INT8 precision.
  - Additional tools: a set of tools for working with models, including the Benchmark App, Cross Check Tool, and Compile Tool.
- Open Model Zoo
  - Includes deep learning solutions for a variety of vision problems, including object recognition, face recognition, pose estimation, text detection, and action recognition.
  - Additional tools: a set of tools for working with models, including the Accuracy Checker Utility and the Model Downloader.
  - Pre-trained model documentation: documentation for the pre-trained models provided in the Open Model Zoo GitHub repository.
  - TensorFlow pre-trained model library
- Deep Learning Streamer (DL Streamer): a streaming analytics framework based on GStreamer for building graphs of media analytics components. DL Streamer can be installed through the Intel® Distribution of OpenVINO™ toolkit installer.
- OpenCV: a community version of OpenCV compiled for Intel® hardware.
- Intel® Media SDK: (only in the Intel® Distribution of OpenVINO™ toolkit for Linux)
3. OpenVINO™ Toolkit Workflow
- Supported deployment devices
- Intel® CPU (e.g. Intel® Core™ i7-1165G7)
- dGPU (e.g. Intel® Iris® Xe MAX): discrete graphics
- iGPU (e.g. Intel® UHD Graphics 620 (iGPU)): integrated graphics
- Intel® Movidius™ Myriad™ X VPU (e.g. Intel® Neural Compute Stick 2 (Intel® NCS2))
- GNA (Gaussian & Neural Accelerator integrated in the processor): designed for AI speech and audio applications, such as neural noise cancellation.
II. Installing OpenVINO Components
1. Environment Requirements
- Operating systems
- Ubuntu 18.04 long-term support (LTS), 64-bit
- Ubuntu 20.04 long-term support (LTS), 64-bit
- Hardware
- 6th to 12th generation Intel® Core™ processors and Intel® Xeon® processors
- 3rd generation Intel® Xeon® Scalable processor (formerly code named Cooper Lake)
- Intel® Xeon® Scalable processor (formerly Skylake and Cascade Lake)
- Intel Atom® processor with support for Intel® Streaming SIMD Extensions 4.1 (Intel® SSE4.1)
- Intel Pentium® processor N4200/5, N3350/5, or N3450/5 with Intel® HD Graphics
- Intel® Iris® Xe MAX Graphics
- Intel® Neural Compute Stick 2
- Intel® Vision Accelerator Design with Intel® Movidius™ VPUs
2. Download and Installation
Go to the Intel® Distribution of OpenVINO™ Toolkit download page and choose either OpenVINO Development Tools or OpenVINO Runtime.
1) Installing OpenVINO Development Tools with pip
Installing OpenVINO Development Tools also installs OpenVINO Runtime.
# Step 1: Create and activate virtual environment
python3 -m venv openvino_env
source openvino_env/bin/activate
# Step 2: Upgrade pip to latest version
python -m pip install --upgrade pip
# Step 3: Download and install the package
pip install openvino-dev[ONNX,tensorflow2,mxnet,kaldi,caffe,pytorch]==2022.2.0
# An openvino_env folder appears in the current directory
$ tree openvino_env/ -L 2
openvino_env/
├── bin
│ ├── accuracy_check
│ ├── activate
│ ├── activate.csh
│ ├── activate.fish
│ ├── Activate.ps1
│ ├── backend-test-tools
│ ├── benchmark_app # benchmark a model
│ ├── check-model
│ ├── check-node
│ ├── convert_annotation
│ ├── convert-caffe2-to-onnx
│ ├── convert-onnx-to-caffe2
│ ├── cpuinfo
│ ├── easy_install
│ ├── easy_install-3.8
│ ├── estimator_ckpt_converter
│ ├── f2py
│ ├── f2py3
│ ├── f2py3.8
│ ├── google-oauthlib-tool
│ ├── huggingface-cli
│ ├── imagecodecs
│ ├── imageio_download_bin
│ ├── imageio_remove_bin
│ ├── import_pb_to_tensorboard
│ ├── lsm2bin
│ ├── markdown_py
│ ├── mo # Model Optimizer
│ ├── nib-conform
│ ├── nib-convert
│ ├── nib-dicomfs
│ ├── nib-diff
│ ├── nib-ls
│ ├── nib-nifti-dx
│ ├── nib-roi
│ ├── nib-stats
│ ├── nib-tck2trk
│ ├── nib-trk2tck
│ ├── nltk
│ ├── normalizer
│ ├── omz_converter # Open Model Zoo tool: convert pre-trained models to IR files
│ ├── omz_data_downloader # Open Model Zoo tool: download datasets
│ ├── omz_downloader # Open Model Zoo tool: download pre-trained models
│ ├── omz_info_dumper
│ ├── omz_quantizer
│ ├── opt_in_out
│ ├── parrec2nii
│ ├── pip
│ ├── pip3
│ ├── pip3.10
│ ├── pip3.8
│ ├── pot # Post-training Optimization Tool
│ ├── pydicom
│ ├── pyrsa-decrypt
│ ├── pyrsa-encrypt
│ ├── pyrsa-keygen
│ ├── pyrsa-priv2pub
│ ├── pyrsa-sign
│ ├── pyrsa-verify
│ ├── python -> python3
│ ├── python3 -> /usr/bin/python3
│ ├── saved_model_cli
│ ├── skivi
│ ├── tensorboard
│ ├── tflite_convert
│ ├── tf_upgrade_v2
│ ├── tiff2fsspec
│ ├── tiffcomment
│ ├── tifffile
│ ├── toco
│ ├── toco_from_protos
│ ├── tqdm
│ ├── transformers-cli
│ └── wheel
├── include
├── lib
│ └── python3.8
├── lib64 -> lib
├── pyvenv.cfg
└── share
├── doc
└── python-wheels
├── appdirs-1.4.3-py2.py3-none-any.whl
├── CacheControl-0.12.6-py2.py3-none-any.whl
├── certifi-2019.11.28-py2.py3-none-any.whl
├── chardet-3.0.4-py2.py3-none-any.whl
├── colorama-0.4.3-py2.py3-none-any.whl
├── contextlib2-0.6.0-py2.py3-none-any.whl
├── distlib-0.3.0-py2.py3-none-any.whl
├── distro-1.4.0-py2.py3-none-any.whl
├── html5lib-1.0.1-py2.py3-none-any.whl
├── idna-2.8-py2.py3-none-any.whl
├── ipaddr-2.2.0-py2.py3-none-any.whl
├── lockfile-0.12.2-py2.py3-none-any.whl
├── msgpack-0.6.2-py2.py3-none-any.whl
├── packaging-20.3-py2.py3-none-any.whl
├── pep517-0.8.2-py2.py3-none-any.whl
├── pip-20.0.2-py2.py3-none-any.whl
├── pkg_resources-0.0.0-py2.py3-none-any.whl
├── progress-1.5-py2.py3-none-any.whl
├── pyparsing-2.4.6-py2.py3-none-any.whl
├── requests-2.22.0-py2.py3-none-any.whl
├── retrying-1.3.3-py2.py3-none-any.whl
├── setuptools-44.0.0-py2.py3-none-any.whl
├── six-1.14.0-py2.py3-none-any.whl
├── toml-0.10.0-py2.py3-none-any.whl
├── urllib3-1.25.8-py2.py3-none-any.whl
├── webencodings-0.5.1-py2.py3-none-any.whl
└── wheel-0.34.2-py2.py3-none-any.whl
$ pip list
Package Version
---------------------------- -----------
...
google-auth 2.13.0
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
...
keras 2.9.0
...
mxnet 1.7.0.post2
...
numpy 1.23.1
onnx 1.11.0
opencv-python 4.6.0.66
openvino 2022.2.0
openvino-dev 2022.2.0
openvino-telemetry 2022.1.1
...
scikit-image 0.19.3
scikit-learn 0.24.2
scipy 1.5.4
...
tensorboard 2.9.1
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorflow 2.9.1
tensorflow-estimator 2.9.0
tensorflow-io-gcs-filesystem 0.27.0
...
torch 1.8.1
torchvision 0.9.1
tqdm 4.64.1
transformers 4.23.1
...
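A quick check of the pip installation, as a small sketch; the version string and device list depend on the machine:
from openvino.runtime import Core, get_version

print(get_version())             # expect a 2022.2.0 build string
print(Core().available_devices)  # e.g. ['CPU', 'GPU']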
2) Installing OpenVINO Development Tools with Docker
Pull the image from Docker Hub.
# Intel CPU
docker run -it --rm openvino/ubuntu18_dev
# Intel GPU
docker run -it --rm --device /dev/dri openvino/ubuntu18_dev
# NCS2 (single VPU)
docker run -it --rm --device-cgroup-rule='c 189:* rmw' -v /dev/bus/usb:/dev/bus/usb openvino/ubuntu18_dev
# HDDL (multiple VPUs)
docker run -it --rm --device=/dev/ion:/dev/ion -v /var/tmp:/var/tmp openvino/ubuntu18_dev
Container layout
# By default you land in the working directory, e.g. /opt/intel/openvino_2022.2.0.7713
$ tree -L 2
.
|-- docs
| |-- OpenVINO-GetStarted-online.html
| |-- OpenVINO-Install-Linux-online.html
| |-- OpenVINO-OpenVX-documentation.html
| |-- OpenVINO-documentation-online.html
| |-- licensing
|-- extras
| |-- opencv
|-- install_dependencies
| |-- 97-myriad-usbboot.rules
| |-- install_NCS_udev_rules.sh
| |-- install_NEO_OCL_driver.sh
| |-- install_openvino_dependencies.sh
|-- licensing
| |-- DockerImage_readme.txt
| |-- third-party-programs-docker-dev.txt
| |-- third-party-programs-docker-runtime.txt
|-- python
| |-- python3.6
| |-- python3.7
| |-- python3.8
| |-- python3.9
|-- runtime
| |-- 3rdparty
| |-- cmake
| |-- include
| |-- lib
| |-- version.txt
|-- samples
| |-- c
| | |-- CMakeLists.txt
| | |-- build_samples.sh
| | |-- common
| | |-- hello_classification
| | |-- hello_nv12_input_classification
| |-- cpp
| | |-- CMakeLists.txt
| | |-- benchmark_app
| | |-- build
| | |-- build_samples.sh
| | |-- classification_sample_async
| | |-- common
| | |-- hello_classification
| | |-- hello_nv12_input_classification
| | |-- hello_query_device
| | |-- hello_reshape_ssd
| | |-- model_creation_sample
| | |-- samples_bin
| | |-- speech_sample
| | |-- thirdparty
| |-- python
| |-- classification_sample_async
| |-- hello_classification
| |-- hello_query_device
| |-- hello_reshape_ssd
| |-- model_creation_sample
| |-- requirements.txt
| |-- setup.cfg
| |-- speech_sample
|-- setupvars.sh
|-- tools
|-- cl_compiler
|-- compile_tool
|-- deployment_manager
|-- requirements.txt
|-- requirements_caffe.txt
|-- requirements_kaldi.txt
|-- requirements_mxnet.txt
|-- requirements_onnx.txt
|-- requirements_pytorch.txt
|-- requirements_tensorflow.txt
|-- requirements_tensorflow2.txt
$ ls /usr/local/bin/omz*
/usr/local/bin/omz_converter /usr/local/bin/omz_downloader /usr/local/bin/omz_quantizer
/usr/local/bin/omz_data_downloader /usr/local/bin/omz_info_dumper
Comparison of OpenVINO™ toolkit components
| 2021 | 2022 |
|---|---|
| Inference Engine Runtime | Evolved into OpenVINO™ Runtime |
| Samples | Kept but trimmed: samples duplicating OMZ demos were removed, and only samples that illustrate API usage remain |
| Dev tools, including MO, POT, DL Workbench, and the OMZ download/conversion tools [Note 2] | No longer included by default; installed separately via pip |
| Non-dev tools, including deployment manager, compile_tool, etc. | Kept |
| OpenCV | No longer included by default; downloaded and installed via a separately provided script |
| DL Workbench download/installation script | Removed from the installer package; installed separately via pip |
| DL Streamer | Removed from the installer package; installed separately via APT |
| Media SDK | Evolved into oneVPL [Note 3] and removed from the installer package |
| Demo applications (from OMZ) | Removed from the installer package |
3) Installing DL Workbench with Docker
https://docs.openvino.ai/latest/workbench_docs_Workbench_DG_Run_Locally.html#windows
# Manage Docker as a non-root user
$ sudo groupadd docker
$ sudo usermod -aG docker $USER
$ newgrp docker # activate the changes to groups
$ docker ps
$ docker pull openvino/workbench:latest
$ docker run -p 0.0.0.0:5665:5665 --name workbench -it --rm openvino/workbench:latest
waiting for server to start..... done
server started
waiting for server to shut down..... done
server stopped
[Workbench] PostgreSQL init process complete.
[Workbench] PostgreSQL applying migrations...
waiting for server to start..... done
server started
Open a browser and go to http://127.0.0.1:5665.
III. Using OpenVINO Components
1. Using the openvino-dev Container
Based on OpenVINO Development Tools.
1) Basic container usage
# Initialize the OpenVINO environment variables
$ source /opt/intel/openvino/setupvars.sh
# Initialize the OpenVINO OpenCV environment variables; without this, video streams cannot be pulled
$ source /opt/intel/openvino/extras/opencv/setupvars.sh
# Query device information
$ cd /opt/intel/openvino_2022.2.0.7713/samples/python/hello_query_device
$ python3 hello_query_device.py
[ INFO ] Available devices:
[ INFO ] CPU :
[ INFO ] SUPPORTED_PROPERTIES:
[ INFO ] AVAILABLE_DEVICES:
[ INFO ] RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 1, 1
[ INFO ] RANGE_FOR_STREAMS: 1, 8
[ INFO ] FULL_DEVICE_NAME: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
[ INFO ] OPTIMIZATION_CAPABILITIES: WINOGRAD, FP32, FP16, INT8, BIN, EXPORT_IMPORT
[ INFO ] CACHE_DIR:
[ INFO ] NUM_STREAMS: 1
[ INFO ] AFFINITY: Affinity.CORE
[ INFO ] INFERENCE_NUM_THREADS: 0
[ INFO ] PERF_COUNT: False
[ INFO ] INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ] PERFORMANCE_HINT: PerformanceMode.UNDEFINED
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] GPU :
[ INFO ] SUPPORTED_PROPERTIES:
[ INFO ] AVAILABLE_DEVICES: 0
[ INFO ] RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 2, 1
[ INFO ] RANGE_FOR_STREAMS: 1, 2
[ INFO ] OPTIMAL_BATCH_SIZE: 1
[ INFO ] MAX_BATCH_SIZE: 1
[ INFO ] FULL_DEVICE_NAME: Intel(R) Iris(R) Xe Graphics [0x9a49] (iGPU)
[ INFO ] DEVICE_UUID: UNSUPPORTED TYPE
[ INFO ] DEVICE_TYPE: Type.INTEGRATED
[ INFO ] DEVICE_GOPS: UNSUPPORTED TYPE
[ INFO ] OPTIMIZATION_CAPABILITIES: FP32, BIN, FP16, INT8
[ INFO ] GPU_DEVICE_TOTAL_MEM_SIZE: UNSUPPORTED TYPE
[ INFO ] GPU_UARCH_VERSION: 12.0.0
[ INFO ] GPU_EXECUTION_UNITS_COUNT: 96
[ INFO ] GPU_MEMORY_STATISTICS: UNSUPPORTED TYPE
[ INFO ] PERF_COUNT: False
[ INFO ] MODEL_PRIORITY: Priority.MEDIUM
[ INFO ] GPU_HOST_TASK_PRIORITY: Priority.MEDIUM
[ INFO ] GPU_QUEUE_PRIORITY: Priority.MEDIUM
[ INFO ] GPU_QUEUE_THROTTLE: Priority.MEDIUM
[ INFO ] GPU_ENABLE_LOOP_UNROLLING: True
[ INFO ] CACHE_DIR:
[ INFO ] PERFORMANCE_HINT: PerformanceMode.UNDEFINED
[ INFO ] COMPILATION_NUM_THREADS: 8
[ INFO ] NUM_STREAMS: 1
[ INFO ] PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ] DEVICE_ID: 0
2) Running the OpenVINO samples
Run the Python samples directly:
# CPU
docker run -it --rm <image_name>
/bin/bash -c "cd ~ && omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && python3 /opt/intel/openvino/samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp CPU"
# GPU
docker run -itu root:root --rm --device /dev/dri:/dev/dri <image_name>
/bin/bash -c "omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && python3 samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp GPU"
# MYRIAD
docker run -itu root:root --rm --device-cgroup-rule='c 189:\* rmw' -v /dev/bus/usb:/dev/bus/usb <image_name>
/bin/bash -c "omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && python3 samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp MYRIAD"
# HDDL
docker run -itu root:root --rm --device=/dev/ion:/dev/ion -v /var/tmp:/var/tmp -v /dev/shm:/dev/shm <image_name>
/bin/bash -c "omz_downloader --name googlenet-v1 --precisions FP16 && omz_converter --name googlenet-v1 --precision FP16 && curl -O https://storage.openvinotoolkit.org/data/test_data/images/car_1.bmp && umask 000 && python3 samples/python/hello_classification/hello_classification.py public/googlenet-v1/FP16/googlenet-v1.xml car_1.bmp HDDL"
Build and run the C++ samples:
# Inside the container
$ cd /opt/intel/openvino_2022.2.0.7713/samples/cpp
# Build the samples
$ ./build_samples.sh
$ tree samples_bin/
samples_bin/
|-- benchmark_app
|-- classification_sample_async
|-- hello_classification
|-- hello_nv12_input_classification
|-- hello_query_device
|-- hello_reshape_ssd
|-- model_creation_sample
|-- speech_sample
3) Command-line usage
# List the available pre-trained models
$ omz_downloader --print_all
Sphereface
aclnet
aclnet-int8
action-recognition-0001
age-gender-recognition-retail-0013
alexnet
......
yolo-v3-onnx
yolo-v3-tf
yolo-v3-tiny-onnx
yolo-v3-tiny-tf
yolo-v4-tf
yolo-v4-tiny-tf
yolof
yolox-tiny
# Test running a model with OpenVINO
$ cd /opt/intel/openvino_2022.2.0.7713/samples/python/hello_classification/
# 1. Download a pre-trained model
$ omz_downloader --name alexnet
$ tree public/alexnet/
|-- alexnet.caffemodel
|-- alexnet.prototxt
|-- alexnet.prototxt.orig
# 2. Convert the model
$ omz_converter --name alexnet
$ tree public/alexnet/
|-- FP16
| |-- alexnet.bin
| |-- alexnet.mapping
| -- alexnet.xml
|-- FP32
| |-- alexnet.bin
| |-- alexnet.mapping
| -- alexnet.xml
|-- alexnet.caffemodel
|-- alexnet.prototxt
|-- alexnet.prototxt.orig
# 3. Run the model
$ curl -O https://storage.openvinotoolkit.org/data/test_data/images/banana.jpg
$ python3 hello_classification.py public/alexnet/FP16/alexnet.xml banana.jpg CPU/GPU/AUTO
# 4. Benchmark the model
$ benchmark_app -m public/alexnet/FP16/alexnet.xml -i banana.jpg -d CPU/GPU -niter 128 -api sync/async
Latency:
Median: 34.38 ms
AVG: 34.60 ms
MIN: 19.57 ms
MAX: 69.10 ms
Throughput: 115.30 FPS
# Container resource usage
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
f8e127db77d8 openvino-ubuntu18_dev 786.64% 1.324GiB / 7.383GiB 17.93% 0B / 0B 483MB / 74.4MB 19
# Check Intel GPU utilization
sudo apt-get install -y intel-gpu-tools
sudo intel_gpu_top
intel-gpu-top - 0/ 0 MHz; 100% RC6; ----- (null); 0 irqs/s
IMC reads: ------ (null)/s
IMC writes: ------ (null)/s
ENGINE BUSY MI_SEMA MI_WAIT
Render/3D/0 99.65% | | 0% 0%
Blitter/0 0.00% | | 0% 0%
Video/0 0.00% | | 0% 0%
Video/1 0.00% | | 0% 0%
VideoEnhance/0 0.00% | | 0% 0%
2. Using the openvino_notebooks Samples
https://github.com/openvinotoolkit/openvino_notebooks/blob/main/README_cn.md
1) Installing Jupyter in the container
Reference: remote server (Ubuntu 20.04) + using Jupyter remotely inside a Docker container.
Install Jupyter and start Jupyter Notebook in the openvino-dev container environment described above.
apt-get update
apt-get install vim
pip install jupyter
# Generate the Jupyter Notebook configuration file
jupyter-notebook --generate-config
# Edit the configuration file
vim ~/.jupyter/jupyter_notebook_config.py
# Allow access from any IP bound to the server
c.NotebookApp.ip = '*'
# Port used for access
c.NotebookApp.port = 8888  # must match the container port exposed earlier
# Do not open a browser automatically
c.NotebookApp.open_browser = False
# Allow remote access
c.NotebookApp.allow_remote_access = True
# Start Jupyter
$ jupyter notebook --ip 0.0.0.0 --allow-root --port 8888 --no-browser
......
http://127.0.0.1:8888/?token=xxx
Open the notebook in a browser and log in with the token, e.g. http://192.168.1.10:8888/
2) Downloading and using the openvino_notebooks project samples
apt-get install git
cd ~; git clone https://github.com/openvinotoolkit/openvino_notebooks
Open a sample .ipynb file in the notebook in the browser, e.g. openvino_notebooks/notebooks/001-hello-world/001-hello-world.ipynb.
IV. Model Processing
OpenVINO™ supports multiple model formats and allows converting them into its own OpenVINO IR.
1. OpenVINO Model Processing Tools
https://docs.openvino.ai/latest/omz_tools_downloader.html
- mo: the Model Optimizer converts pre-trained deep learning models (TensorFlow, PyTorch, PaddlePaddle, MXNet, Caffe, Kaldi, or ONNX) into the OpenVINO Intermediate Representation (IR) format.
  - .xml - describes the whole model topology: every layer, its connections, and parameter values.
  - .bin - contains the trained weights and biases of every layer.
  - Provided functions:
    - Convert
    - Optimize
    - Conversion of weights and offsets
- pot: the Post-Training Optimization Tool quantizes weights and activations from floating-point precision to integer precision (for example, 8-bit) for inference.
  - Different hardware platforms support different integer precisions and quantization parameters; POT abstracts this complexity through the concept of a "target device".
  - An unannotated dataset is required for quantization.
- Open Model Zoo tools: one-command processing of Open Model Zoo models.
  - omz_downloader: downloads model files from online sources.
  - omz_converter: converts models in other formats into the IR format.
  - omz_quantizer: quantizes full-precision IR models into low-precision versions.
  - omz_info_dumper: prints information about models in a stable, machine-readable format.
  - omz_data_downloader: copies dataset data from the installation location.
- benchmark_app: the Benchmark C++ Tool estimates deep learning inference performance on supported devices.
2. Converting Models of Various Formats
OpenVINO IR (Intermediate Representation): OpenVINO™'s own proprietary format.
ONNX, PaddlePaddle: directly supported formats; OpenVINO provides C++ and Python APIs for importing them into OpenVINO Runtime directly, without any prior conversion.
TensorFlow, PyTorch, MXNet, Caffe, Kaldi: indirectly supported formats that must first be converted into one of the formats listed above. Use the Model Optimizer to convert from these formats to OpenVINO IR; in some cases another converter is needed as an intermediary.
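For the directly supported formats, a minimal sketch (the model file name is a placeholder) of loading a model into OpenVINO Runtime without prior conversion:
from openvino.runtime import Core

core = Core()
# ONNX (and PaddlePaddle) models can be read directly; no mo conversion to IR is required
model = core.read_model("model.onnx")
compiled_model = core.compile_model(model, "CPU")
print(model.input(0).shape)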
1) mo parameter reference
# Optional arguments:
--framework {onnx,mxnet,tf,kaldi,caffe,paddle}
# Framework-agnostic parameters:
--input_model INPUT_MODEL, -w INPUT_MODEL, -m INPUT_MODEL
--model_name MODEL_NAME, -n MODEL_NAME
Name of the output IR files
--output_dir OUTPUT_DIR, -o OUTPUT_DIR
--input_shape INPUT_SHAPE
Shape of the model input node; the input shape can also be set via the --input option
--scale SCALE, -s SCALE
All inputs of the original network are divided by this value.
--reverse_input_channels
Reverse the input channel order, e.g. RGB → BGR
--log_level {CRITICAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
Logger level
--input INPUT
A quoted, comma-separated list of input node information, including names, shapes, and data types.
--output OUTPUT
Specify the output node(s) of the model
--mean_values MEAN_VALUES, -ms MEAN_VALUES
Mean values to subtract for each channel of the input image
--scale_values SCALE_VALUES
Scale values for each channel of the input image
--source_layout SOURCE_LAYOUT
Layout of the input or output of the model in the framework. Layout can be specified in
the short form, e.g. nhwc, or in complex form, e.g. "[n,h,w,c]". Example for many names:
"in_name1([n,h,w,c]),in_name2(nc),out_name1(n),out_name2(nc)". Layout can be partially
defined, "?" can be used to specify undefined layout for one dimension, "..." can be used
to specify undefined layout for multiple dimensions, for example "?c??", "nc...", "n...c",
etc.
--target_layout TARGET_LAYOUT
Same as --source_layout, but specifies target layout that will be in the model after
processing by ModelOptimizer.
--layout LAYOUT Combination of --source_layout and --target_layout. Can't be used with either of them. If
model has one input it is sufficient to specify layout of this input, for example --layout
nhwc. To specify layouts of many tensors, names must be provided, for example: --layout
"name1(nchw),name2(nc)". It is possible to instruct ModelOptimizer to change layout, for
example: --layout "name1(nhwc->nchw),name2(cn->nc)". Also "*" in long layout form can be
used to fuse dimensions, for example "[n,c,...]->[n*c,...]".
--data_type {FP16,FP32,half,float}
Data type; this option determines the precision of the model.
--transform TRANSFORM
Apply additional transformations. Usage: "--transform
transformation_name1[args],transformation_name2..." where [args] is key=value pairs
separated by semicolon. Examples: "--transform LowLatency2" or "--transform
LowLatency2[use_const_initializer=False]" or "--transform "MakeStateful[param_res_names={'
input_name_1':'output_name_1','input_name_2':'output_name_2'}]"" Available
transformations: "LowLatency2", "MakeStateful"
--disable_fusing [DEPRECATED] Turn off fusing of linear operations to Convolution.
--disable_resnet_optimization
[DEPRECATED] Turn off ResNet optimization.
--finegrain_fusing FINEGRAIN_FUSING
[DEPRECATED] Regex for layers/operations that won't be fused. Example: --finegrain_fusing
Convolution1,.*Scale.*
--enable_concat_optimization
[DEPRECATED] Turn on Concat optimization.
--extensions EXTENSIONS
Paths or a comma-separated list of paths to libraries (.so or .dll) with extensions. For
the legacy MO path (if `--use_legacy_frontend` is used), a directory or a comma-separated
list of directories with extensions are supported. To disable all extensions including
those that are placed at the default location, pass an empty string.
--batch BATCH, -b BATCH
Input batch size
--version Version of Model Optimizer
--silent Prevent any output messages except those that correspond to log level equals ERROR, that
can be set with the following option: --log_level. By default, log level is already ERROR.
--freeze_placeholder_with_value FREEZE_PLACEHOLDER_WITH_VALUE
Replaces input layer with constant node with provided value, for example:
"node_name->True". It will be DEPRECATED in future releases. Use --input option to specify
a value for freezing.
--static_shape Enables IR generation for fixed input shape (folding `ShapeOf` operations and shape-
calculating sub-graphs to `Constant`). Changing model input shape using the OpenVINO
Runtime API in runtime may fail for such an IR.
--disable_weights_compression
[DEPRECATED] Disable compression and store weights with original precision.
--progress Enable model conversion progress display.
--stream_output Switch model conversion progress display to a multiline mode.
--transformations_config TRANSFORMATIONS_CONFIG
Use the configuration file with transformations description. File can be specified as
relative path from the current directory, as absolute path or as arelative path from the
mo root directory
--use_new_frontend Force the usage of new Frontend of Model Optimizer for model conversion into IR. The new
Frontend is C++ based and is available for ONNX* and PaddlePaddle* models. Model optimizer
uses new Frontend for ONNX* and PaddlePaddle* by default that means `--use_new_frontend`
and `--use_legacy_frontend` options are not specified.
--use_legacy_frontend
Force the usage of legacy Frontend of Model Optimizer for model conversion into IR. The
legacy Frontend is Python based and is available for TensorFlow*, ONNX*, MXNet*, Caffe*,
and Kaldi* models.
2) Converting ONNX models
mo --input_model <INPUT_MODEL>.onnx
3) Converting PaddlePaddle models
mo --input_model <INPUT_MODEL>.pdmodel
# Example
mo --input_model=yolov3.pdmodel --input=image,im_shape,scale_factor --input_shape=[1,3,608,608],[1,2],[1,2] --reverse_input_channels --output=save_infer_model/scale_0.tmp_1,save_infer_model/scale_1.tmp_1
4) Converting PyTorch models
Export the PyTorch model to ONNX first, then convert the ONNX model to OpenVINO IR.
import torch
# Instantiate your model. This is just a regular PyTorch model that will be exported in the following steps.
model = SomeModel()
# Evaluate the model to switch some operations from training mode to inference.
model.eval()
# Create dummy input for the model. It will be used to run the model inside export function.
dummy_input = torch.randn(1, 3, 224, 224)
# Call the export function
torch.onnx.export(model, (dummy_input, ), 'model.onnx')
- Starting with PyTorch 1.8.1, not all PyTorch operations can be exported to the default ONNX opset 9. When exporting to the default opset 9 does not work, it is recommended to export the model to opset 11 or higher.
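A minimal sketch of that workaround, reusing model and dummy_input from the snippet above:
# Export with ONNX opset 11 when the default opset fails
torch.onnx.export(model, (dummy_input, ), 'model.onnx', opset_version=11)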
5) Converting Caffe models
mo --input_model <INPUT_MODEL>.caffemodel
# Caffe-specific parameters:
--input_proto INPUT_PROTO, -d INPUT_PROTO
    Deployment-ready prototxt file that contains the topology structure and layer attributes
--caffe_parser_path CAFFE_PARSER_PATH
    Path to the Python Caffe parser generated from caffe.proto
-k K
    Path to the custom layer mapping file CustomLayersMapping.xml
--disable_omitting_optional
    Disable omitting optional attributes (used for custom layers). Use this option if you want to transfer all attributes of a custom layer to the IR. By default, attributes with default values and user-defined attributes are passed to the IR.
--enable_flattening_nested_params
    Enable flattening of optional parameters (used for custom layers). Use this option if you want to transfer the attributes of a custom layer to the IR with flattened nested parameters. By default, attributes are transferred without flattening nested parameters.
# Examples
mo --input_model bvlc_alexnet.caffemodel --input_proto bvlc_alexnet.prototxt
# If the .caffemodel and .prototxt files are in the same directory, specifying input_model is enough.
mo --input_model bvlc_alexnet.caffemodel -k CustomLayersMapping.xml --disable_omitting_optional --enable_flattening_nested_params
6) Converting TensorFlow models
- TensorFlow*-specific parameters
--input_model_is_text
    TensorFlow*: treat the input model file as a text protobuf format. If not specified, the Model Optimizer treats it as a binary file by default.
--input_checkpoint INPUT_CHECKPOINT
    TensorFlow*: variables file to load.
--input_meta_graph INPUT_META_GRAPH
    Tensorflow*: a file with a meta-graph of the model before freezing
--saved_model_dir SAVED_MODEL_DIR
    TensorFlow*: directory with a model in SavedModel format of TensorFlow 1.x or 2.x version.
--saved_model_tags SAVED_MODEL_TAGS
    Group of tag(s) of the MetaGraphDef to load, in string format, separated by ','. For tag-set contains multiple tags, all tags must be passed in.
--tensorflow_custom_operations_config_update TENSORFLOW_CUSTOM_OPERATIONS_CONFIG_UPDATE
    TensorFlow*: update the configuration file with node name patterns with input/output nodes information.
--tensorflow_use_custom_operations_config TENSORFLOW_USE_CUSTOM_OPERATIONS_CONFIG
    Use the configuration file with custom operation description.
--tensorflow_object_detection_api_pipeline_config TENSORFLOW_OBJECT_DETECTION_API_PIPELINE_CONFIG
    TensorFlow*: path to the pipeline configuration file used to generate model created with help of Object Detection API.
--tensorboard_logdir TENSORBOARD_LOGDIR
    TensorFlow*: dump the input graph to a given directory that should be used with TensorBoard.
--tensorflow_custom_layer_libraries TENSORFLOW_CUSTOM_LAYER_LIBRARIES
    TensorFlow*: comma separated list of shared libraries with TensorFlow* custom operations implementation.
--disable_nhwc_to_nchw
    [DEPRECATED] Disables the default translation from NHWC to NCHW. Since 2022.1 this option is deprecated and used only to maintain backward compatibility with previous releases.
- For TensorFlow 1 models
# Converting Frozen Model Format
mo --input_model <INPUT_MODEL>.pb
# Converting Non-Frozen Model Formats
# 1. Checkpoint format: contains inference_graph.pb and checkpoint_file.ckpt
mo --input_model <INFERENCE_GRAPH>.pb --input_checkpoint <INPUT_CHECKPOINT>
# 2. MetaGraph format: contains model_name.meta, model_name.index, model_name.data-00000-of-00001, and optionally checkpoint_file.ckpt
mo --input_meta_graph <INPUT_META_GRAPH>.meta
# 3. SavedModel format: a directory containing a .pb file and the variables, assets, and assets.extra subdirectories
mo --saved_model_dir <SAVED_MODEL_DIRECTORY>
- Exporting the Frozen Model Format
import tensorflow as tf
from tensorflow.python.framework import graph_io
frozen = tf.compat.v1.graph_util.convert_variables_to_constants(sess, sess.graph_def, ["name_of_the_output_node"])
graph_io.write_graph(frozen, './', 'inference_graph.pb', as_text=False)
- For TensorFlow 2 models
  - SavedModel format: a directory containing a .pb file and the variables and assets subdirectories
mo --saved_model_dir <SAVED_MODEL_DIRECTORY>
  - Keras H5 format: serialize it to the SavedModel format first.
import tensorflow as tf
model = tf.keras.models.load_model('model.h5')
tf.saved_model.save(model, 'model')
7) Converting MXNet models
- MXNet-specific parameters
--input_symbol INPUT_SYMBOL
    Symbol file (for example, model-symbol.json) that contains a topology structure and layer attributes
--nd_prefix_name ND_PREFIX_NAME
    Prefix name for args.nd and argx.nd files.
--pretrained_model_name PRETRAINED_MODEL_NAME
    Name of a pretrained MXNet model without extension and epoch number. This model will be merged with args.nd and argx.nd files
--save_params_from_nd
    Enable saving built parameters file from .nd files
--legacy_mxnet_model
    Enable MXNet loader to make a model compatible with the latest MXNet version. Use only if your model was trained with MXNet version lower than 1.0.0
--enable_ssd_gluoncv
    Enable pattern matchers replacers for converting gluoncv ssd topologies.
3. Post-Training Optimization
https://docs.openvino.ai/latest/pot_compression_cli_README.html
# Basic usage for DefaultQuantization
pot -q default -m <path_to_xml> -w <path_to_bin> --ac-config <path_to_AC_config_yml>
# Basic usage for AccuracyAwareQuantization
pot -q accuracy_aware -m <path_to_xml> -w <path_to_bin> --ac-config <path_to_AC_config_yml> --max-drop 0.01
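Besides the CLI, POT provides a Python API. The sketch below follows the DefaultQuantization flow of the openvino.tools.pot package as I understand it; the image list, input size, and preprocessing are placeholders to adapt, and the exact DataLoader return format should be checked against the POT documentation for your version:
import cv2
import numpy as np
from openvino.tools.pot import DataLoader, IEEngine, load_model, save_model, create_pipeline

class ImageLoader(DataLoader):
    # Feeds images to POT; annotations are not needed for DefaultQuantization
    def __init__(self, image_paths, shape=(224, 224)):
        self._paths, self._shape = image_paths, shape
    def __len__(self):
        return len(self._paths)
    def __getitem__(self, index):
        image = cv2.resize(cv2.imread(self._paths[index]), self._shape)
        image = image.transpose(2, 0, 1)[None].astype(np.float32)  # HWC -> NCHW, add batch dim
        return image, None  # (data, annotation); annotation unused here

model_config = {"model_name": "model", "model": "model.xml", "weights": "model.bin"}
algorithms = [{"name": "DefaultQuantization",
               "params": {"target_device": "ANY", "preset": "performance", "stat_subset_size": 300}}]

model = load_model(model_config)
engine = IEEngine(config={"device": "CPU"}, data_loader=ImageLoader(["images/0001.jpg"]))
pipeline = create_pipeline(algorithms, engine)
quantized_model = pipeline.run(model)
save_model(quantized_model, "./quantized_model")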
V. OpenVINO Inference
Integrate OpenVINO™ with Your Application
1. Developing with the openvino.runtime API
1) Synchronous inference flow
- Create a Core object:
from openvino.runtime import Core, Type, Layout

core = Core()
# Optionally list the available devices
devices = core.available_devices
for device in devices:
    device_name = core.get_property(device, "FULL_DEVICE_NAME")
    print(f"{device}: {device_name}")
CPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
GNA: GNA_SW
GPU: Intel(R) Iris(R) Xe Graphics [0x9a49] (iGPU)
- Load and compile the model:
# Read the model file; model_path is an .xml (IR) file or an .onnx file
model = core.read_model(model_path)
# Optionally inspect the model inputs and outputs
input_layer = model.input(0)
output_layer = model.output(0)
print(f"input precision: {input_layer.element_type}")
print(f"input shape: {input_layer.shape}")
print(f"output precision: {output_layer.element_type}")
print(f"output shape: {output_layer.shape}")
# Optionally integrate preprocessing steps into the model
# See: Developing with the openvino.preprocess API
# Compile the model for the target device: device_name = 'CPU'/'GPU'; config is optional
compiled_model = core.compile_model(model, device_name, config)
input precision: <Type: 'float32'>
input shape: {1, 3, 224, 224}
output precision: <Type: 'float32'>
output shape: {1, 1001}
- Run synchronous inference and get the result:
# Option 1: preprocess in the application code; input_data is the model input
import cv2
import numpy as np

image = cv2.imread(image_path)
N, C, H, W = input_layer.shape
resized_image = cv2.resize(src=image, dsize=(W, H))
input_data = np.expand_dims(np.transpose(resized_image, (2, 0, 1)), 0).astype(np.float32)
result = compiled_model([input_data])[output_layer]  # blocking inference
# Option 2: preprocessing integrated into the model
image = cv2.imread(image_path)
# Add N dimension
input_tensor = np.expand_dims(image, 0)
results = compiled_model.infer_new_request({0: input_tensor})  # blocking inference
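A small follow-up sketch (assuming the classification output above with shape [1, 1001]) for turning the raw scores into a top-1 prediction:
top1 = int(np.argmax(result))
print(f"class id: {top1}, score: {float(result[0, top1]):.4f}")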
2) Asynchronous inference flow
The model loading steps are the same as above.
- Run asynchronous inference and get the results:
from openvino.runtime import AsyncInferQueue, Core, InferRequest, Layout, Type

# Read the input images (resize them to the model input shape beforehand if preprocessing is not embedded in the model)
images = [cv2.imread(image_path) for image_path in args.input]
# Add N dimension
input_tensors = [np.expand_dims(image, 0) for image in images]
# Create an async queue with the optimal number of infer requests
infer_queue = AsyncInferQueue(compiled_model)
infer_queue.set_callback(completion_callback)
for i, input_tensor in enumerate(input_tensors):
    # Start asynchronous inference (non-blocking)
    infer_queue.start_async({0: input_tensor}, args.input[i])
# Wait until all requests have finished
infer_queue.wait_all()
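The completion_callback set above is user-defined; a possible sketch, assuming the AsyncInferQueue callback signature of (finished InferRequest, userdata passed to start_async):
def completion_callback(infer_request: InferRequest, image_path: str) -> None:
    # Take the first output and report the top-1 class
    predictions = next(iter(infer_request.results.values()))
    top1 = int(np.argmax(predictions))
    print(f"{image_path}: class id {top1}")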
Or:
...
# Create one infer request to process the current frame
infer_request_curr = net.create_infer_request()
# Create another infer request to process the next frame
infer_request_next = net.create_infer_request()
# Get the current frame
frame_curr = cv2.imread("./data/images/bus.jpg")
# Preprocess the current frame
letterbox_img_curr, _, _ = letterbox(frame_curr, auto=False)
# Normalization + swap RB + layout from HWC to NCHW
blob = Tensor(cv2.dnn.blobFromImage(letterbox_img_curr, 1/255.0, swapRB=True))
# Pass the data to the model's input node
infer_request_curr.set_tensor(input_node, blob)
# Call start_async() to start inference on the current frame in a non-blocking way
infer_request_curr.start_async()
while True:
    # Prepare the blob for the next frame
    # Pass the data to the next frame's infer request
    infer_request_next.set_tensor(input_node, blob)
    # Call start_async() to start inference on the next frame in a non-blocking way
    infer_request_next.start_async()
    # Wait for the current frame's infer request to finish
    infer_request_curr.wait()
    # Get the current frame's inference result from output_node
    infer_result = infer_request_curr.get_tensor(output_node)
    # Postprocess the inference result
    data = torch.tensor(infer_result.data)
    # Swap the current-frame and next-frame infer requests
    infer_request_curr, infer_request_next = infer_request_next, infer_request_curr
2. Developing with the openvino.preprocess API
The preprocessing API available since OpenVINO™ 2022.1 can integrate all preprocessing steps into the execution graph, so that a dGPU, VPU, or iGPU can perform data preprocessing without relying on the CPU.
1) Typical data preprocessing operations
- Change the shape of the input data: [720, 1280, 3] → [1, 3, 640, 640]
- Change the precision of the input data: U8 → f32
- Change the color channel order of the input data: BGR → RGB
- Change the layout of the input data: HWC → NCHW
- Normalize the data: subtract the mean and divide by the standard deviation (std)
2) Main flow of the OpenVINO preprocessing API
- Instantiate a PrePostProcessor object:
from openvino.runtime import Core, Type, Layout
from openvino.preprocess import PrePostProcessor, ColorFormat, ResizeAlgorithm

core = Core()
model = core.read_model(model_path)
ppp = PrePostProcessor(model)
- Declare the input tensor information:
image = cv2.imread(image_path)
# Add N dimension
input_tensor = np.expand_dims(image, 0)
# e.g. input_tensor.shape = [1, 640, 640, 3]
ppp.input().tensor() \
    .set_shape([1, 640, 640, 3]) \
    .set_color_format(ColorFormat.BGR) \
    .set_element_type(Type.u8) \
    .set_layout(Layout('NHWC'))
# set_shape: the image size, written in 'NHWC' order
- Specify the data layout of the model:
# The layout of the model input is NCHW
ppp.input().model().set_layout(Layout('NCHW'))
# The layout of the model output is NHWC (optional)
ppp.output().model().set_layout(Layout('NHWC'))
- Declare the output tensor information:
ppp.output().tensor() \
    .set_element_type(Type.f32) \
    .set_layout(Layout('NHWC'))
# The output tensor precision is f32; setting the layout is optional
- Define the concrete preprocessing steps:
# Or define custom preprocessing steps
ppp.input().preprocess() \
    .convert_element_type(Type.f32) \
    .convert_color(ColorFormat.RGB) \
    .resize(ResizeAlgorithm.RESIZE_LINEAR, 224, 224) \
    .mean([0.0, 0.0, 0.0]) \
    .scale([255.0, 255.0, 255.0]) \
    .convert_layout([0, 3, 1, 2])
# convert_color: convert the input image from BGR to RGB
# resize: e.g. the model input size is [1, 3, 224, 224]
# convert_layout: convert 'NHWC' to 'NCHW'
- Preprocessing operations supported by OpenVINO:
  - convert_color, convert_element_type, convert_layout, crop, mean, resize, reverse_channels, scale, custom
- Define the concrete postprocessing steps (optional):
ppp.output().postprocess() \
    .convert_element_type(Type.f32) \
    .convert_layout([0, 3, 1, 2])  # convert 'NHWC' to 'NCHW'
- Postprocessing operations supported by OpenVINO:
  - convert_element_type, convert_layout, custom
- Integrate the preprocessing steps into the model:
model = ppp.build()
![](https://img-blog.csdnimg.cn/42d7745f14e1434cbaee46725eac5420.png)
- Export the model with the integrated preprocessing steps (optional):
from openvino.offline_transformations import serialize
serialize(model, 'xxx.xml', 'xxx.bin')
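To close the loop, a short sketch of using the model returned by ppp.build(): the raw u8 BGR tensor prepared earlier can be fed directly, and preprocessing now runs on the chosen device (the device name here is an example):
compiled_model = core.compile_model(model, "GPU")
result = compiled_model([input_tensor])[compiled_model.output(0)]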
3. The Auto-Device and Automatic Batching Plugins
Best practices for the AUTO plugin and Automatic Batching in OpenVINO™ 2022.1
1) Auto-Device
AUTO device (short for Automatic Device Selection) is a virtual plugin built on top of the CPU/GPU plugins. It is not bound to a specific device type; it can use any supported CPU, GPU, VPU (Vision Processing Unit), or GNA (Gaussian & Neural Accelerator coprocessor), or a combination of these devices.
Advantages:
- Uses the deep learning model and the characteristics of the selected devices in their optimal configuration.
- Gives the GPU a faster first-inference latency: the GPU plugin must compile the model online at runtime before inference can start. When a discrete or integrated GPU is selected, the AUTO plugin first runs inference on the CPU to hide this GPU model-compilation time.
- Simple to use: developers only need to set the device_name parameter of compile_model() to "AUTO" (see the sketch after the precision table below).
Device switching logic:
- The AUTO plugin picks the best compute device according to the device priority dGPU > iGPU > VPU > CPU. When the AUTO plugin selects a GPU as the best device, an inference device switch takes place to hide the first-inference latency.
Model precisions supported by different devices
| Supported Device | Supported model precision |
|---|---|
| dGPU (e.g. Intel® Iris® Xe MAX) | FP32, FP16, INT8, BIN |
| iGPU (e.g. Intel® UHD Graphics 620 (iGPU)) | FP32, FP16, BIN |
| Intel® Movidius™ Myriad™ X VPU (e.g. Intel® Neural Compute Stick 2 (Intel® NCS2)) | FP16 |
| Intel® CPU (e.g. Intel® Core™ i7-1165G7) | FP32, FP16, INT8, BIN |
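A minimal sketch of the AUTO usage described above (the IR path is a placeholder):
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")
# Let OpenVINO choose the device; inference can start on the CPU while the GPU compiles the model
compiled_model = core.compile_model(model, "AUTO")
# Optionally limit the candidate devices and set their priority
compiled_model = core.compile_model(model, "AUTO:GPU,CPU")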
2) Automatic Batching
Automatic Batching combines multiple asynchronous inference requests issued by the user program, treats them as one batched inference request, then splits the batched inference result and returns the pieces to the individual requests.
When the config parameter of compile_model() is set to {"PERFORMANCE_HINT": "THROUGHPUT"}, OpenVINO™ Runtime enables Automatic Batching automatically.
| PERFORMANCE_HINT | Application scenario | Auto Batching enabled? |
|---|---|---|
| THROUGHPUT | Non-real-time, large-batch inference tasks | Yes |
| LATENCY | Real-time or near-real-time tasks | No |
compiled_model = core.compile_model(model="xxx.onnx", device_name="AUTO", \
config={"PERFORMANCE_HINT": "THROUGHPUT", 'ALLOW_AUTO_BATCHING': 'YES'})
4. C++ Inference Example
#include <iterator>
#include <memory>
#include <sstream>
#include <string>
#include <vector>
// clang-format off
#include "openvino/openvino.hpp"
#include "samples/args_helper.hpp"
#include "samples/common.hpp"
#include "samples/classification_results.h"
#include "samples/slog.hpp"
#include "format_reader_ptr.h"
// clang-format on
/**
* @brief Main with support Unicode paths, wide strings
*/
int tmain(int argc, tchar* argv[]) {
try {
        // -------- Parse and validate input arguments --------
        if (argc != 4) {
            slog::info << "Usage : " << argv[0] << " <path_to_model> <path_to_image> <device_name>" << slog::endl;
            return EXIT_FAILURE;
        }
        const std::string model_path = TSTRING2STRING(argv[1]);
        const std::string image_path = TSTRING2STRING(argv[2]);
        const std::string device_name = TSTRING2STRING(argv[3]);
// -------- Step 1. Initialize OpenVINO Runtime Core --------
ov::Core core;
// -------- Step 2. Read a model --------
std::shared_ptr<ov::Model> model = core.read_model(model_path);
printInputAndOutputsInfo(*model);
// -------- Step 3. Set up input
// Read input image to a tensor and set it to an infer request
// without resize and layout conversions
FormatReader::ReaderPtr reader(image_path.c_str());
if (reader.get() == nullptr) {
std::stringstream ss;
ss << "Image " + image_path + " cannot be read!";
throw std::logic_error(ss.str());
}
ov::element::Type input_type = ov::element::u8;
ov::Shape input_shape = {1, reader->height(), reader->width(), 3};
std::shared_ptr<unsigned char> input_data = reader->getData();
// just wrap image data by ov::Tensor without allocating of new memory
ov::Tensor input_tensor = ov::Tensor(input_type, input_shape, input_data.get());
const ov::Layout tensor_layout{"NHWC"};
// -------- Step 4. Configure preprocessing --------
ov::preprocess::PrePostProcessor ppp(model);
// 1) Set input tensor information:
// - input() provides information about a single model input
// - reuse precision and shape from already available `input_tensor`
// - layout of data is 'NHWC'
ppp.input().tensor().set_shape(input_shape).set_element_type(input_type).set_layout(tensor_layout);
// 2) Adding explicit preprocessing steps:
// - convert layout to 'NCHW' (from 'NHWC' specified above at tensor layout)
// - apply linear resize from tensor spatial dims to model spatial dims
ppp.input().preprocess().resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR);
// 4) Here we suppose model has 'NCHW' layout for input
ppp.input().model().set_layout("NCHW");
// 5) Set output tensor information:
// - precision of tensor is supposed to be 'f32'
ppp.output().tensor().set_element_type(ov::element::f32);
// 6) Apply preprocessing modifying the original 'model'
model = ppp.build();
// -------- Step 5. Loading a model to the device --------
ov::CompiledModel compiled_model = core.compile_model(model, device_name);
// -------- Step 6. Create an infer request --------
ov::InferRequest infer_request = compiled_model.create_infer_request();
// -----------------------------------------------------------------------------------------------------
// -------- Step 7. Prepare input --------
infer_request.set_input_tensor(input_tensor);
// -------- Step 8. Do inference synchronously --------
infer_request.infer();
// -------- Step 9. Process output
const ov::Tensor& output_tensor = infer_request.get_output_tensor();
// Print classification results
ClassificationResult classification_result(output_tensor, {image_path});
classification_result.show();
// -----------------------------------------------------------------------------------------------------
} catch (const std::exception& ex) {
std::cerr << ex.what() << std::endl;
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
Inference modes
- Automatic Device Selection (AUTO): detects the available devices, selects the device best suited to the task, and configures its optimization settings. This allows you to write an application once and deploy it anywhere.
  - Inference starts on the CPU, the model continues to be loaded onto the device best suited for the purpose, and the task is handed over to it once it is ready.
  - Using the CPU first reduces first-inference latency.
- Multi-device execution (MULTI)
- Heterogeneous execution (HETERO): allows running inference of a single model across multiple devices.
- Automatic Batching execution (Auto-batching): improves device utilization by grouping inference requests together, with no programming effort required from the user.
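Reusing the core and model objects from the earlier Python examples, a small sketch of how these modes are selected through the device string passed to compile_model() (the device lists are examples):
# Multi-device execution: run inference requests on GPU and CPU in parallel
compiled_multi = core.compile_model(model, "MULTI:GPU,CPU")
# Heterogeneous execution: split one model between GPU and CPU (CPU as fallback)
compiled_hetero = core.compile_model(model, "HETERO:GPU,CPU")
# Automatic Batching requested explicitly on the GPU
compiled_batch = core.compile_model(model, "BATCH:GPU")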