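# ESP-DL [[English]](./README.md)

ESP-DL is a high-performance deep learning library provided by Espressif for its [ESP32](https://www.espressif.com/en/products/socs/esp32), [ESP32-S2](https://www.espressif.com/en/products/socs/esp32-s2), [ESP32-S3](https://www.espressif.com/en/products/socs/esp32-s3), and [ESP32-C3](https://www.espressif.com/en/products/socs/esp32-c3) chips. For more documentation, see the [ESP-DL User Guide](https://docs.espressif.com/projects/esp-dl/zh_CN/latest/esp32/index.html).

## Overview

ESP-DL provides APIs for **neural network inference**, **image processing**, **math operations**, and several **deep learning models**. With ESP-DL, Espressif chips can be used for AI applications quickly and easily.

ESP-DL requires no peripherals, so it can be used as a component in other projects. For example, it can serve as a component of **[ESP-WHO](https://github.com/espressif/esp-who)**, which contains several project-level image application examples. The figure below shows the composition of ESP-DL and where it sits in a project when used as a component.

<p align="center">
    <img width="%" src="./docs/_static/architecture_cn.drawio.svg">
</p>

## Getting Started

To install and get started with ESP-DL, see [Getting Started](./docs/en/get_started.md).

> Use the [latest version](https://github.com/espressif/esp-idf/tree/release/v5.0) of ESP-IDF on the release/v5.0 branch.

## Try Models in the Model Zoo

ESP-DL provides APIs for some models in the [model zoo](./include/model_zoo), such as human face detection, human face recognition, and cat face detection. The models in the table below work out of the box.

| Name                   | API Example                                                  |
| ---------------------- | ------------------------------------------------------------ |
| Human face detection   | [ESP-DL/examples/human_face_detect](examples/human_face_detect) |
| Human face recognition | [ESP-DL/examples/face_recognition](examples/face_recognition) |
| Cat face detection     | [ESP-DL/examples/cat_face_detect](examples/cat_face_detect)  |

## Deploy Your Own Model

We recommend using TVM to deploy your model. For details, see [ESP-DL/tutorial/tvm_example](tutorial/tvm_example).

## Feedback

If you find a bug or need a new feature, please submit an [issue](https://github.com/espressif/esp-dl/issues). The most anticipated features will be implemented first.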
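# Cat Face Detection [[English]](./README.md)

This project is an example of the cat face detection interface. The interface takes a static image as input. The confidence score and coordinates of each detection are printed in the terminal, and the resulting image can be displayed on a PC with a tool.

The project folder is structured as follows:

```shell
cat_face_detect/
├── CMakeLists.txt
├── image.jpg
├── main
│   ├── app_main.cpp
│   ├── CMakeLists.txt
│   └── image.hpp
├── README.md
├── README_cn.md
└── result.png
```

## Run the Example

1. Open a terminal and go to the cat face detection example folder esp-dl/examples/cat_face_detect:

    ```shell
    cd ~/esp-dl/examples/cat_face_detect
    ```

2. Set the target chip:

    ```shell
    idf.py set-target [SoC]
    ```

    Replace [SoC] with your target chip, such as esp32, esp32s2, or esp32s3.

3. Flash the program and launch IDF Monitor to obtain the score and coordinates of the detection result:

    ```shell
    idf.py flash monitor

    ... ...

    [0] score: 1.709961, box: [122, 2, 256, 117]
    ```

4. The display tool `display_image.py` in the [examples/tool/](../tool/) directory makes it easier to view the detection result as an image. Follow the [tool](../tool/README_cn.md) instructions and run the following command:

    ```shell
    python display_image.py -i ../cat_face_detect/image.jpg -b "(122, 2, 256, 117)"
    ```

    The PC screen shows the detection result of this example, as in the figure below:

    <p align="center">
        <img width="%" src="./result.png">
    </p>

## Customize the Input Image

In this example, [./main/image.hpp](./main/image.hpp) is the default input image. Following the [tool](../tool/README_cn.md) instructions, you can use the conversion tool `convert_to_u8.py` in the [examples/tool/](../tool/) directory to convert your own image to C/C++ form and replace the default image.

1. Save your image in the ./examples/cat_face_detect directory, then use [examples/tool/convert_to_u8.py](../tool/convert_to_u8.py) to convert it to an hpp file:

    ```shell
    # Assuming the current directory is still cat_face_detect
    python ../tool/convert_to_u8.py -i ./image.jpg -o ./main/image.hpp
    ```

2. Follow the steps in [Run the Example](#run-the-example) to flash the firmware, print the confidence score and coordinates of the detection result, and display the result image.

## Latency

|   SoC    |    Latency |
| :------: | ---------: |
|  ESP32   | 149,765 us |
| ESP32-S2 | 416,590 us |
| ESP32-S3 |  18,909 us |

> The figures above are based on the default configuration of the example.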
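The step that converts a custom image with `convert_to_u8.py` amounts to writing the image's 8-bit pixel values into a C/C++ array that is compiled into the firmware. The sketch below illustrates that idea only. It is not the implementation of `convert_to_u8.py`, and the function name `pixels_to_hpp` as well as the emitted identifiers (`IMAGE_HEIGHT`, `IMAGE_WIDTH`, `IMAGE_CHANNEL`, `IMAGE_ELEMENT`) are assumptions chosen for illustration.

```python
def pixels_to_hpp(pixels: bytes, height: int, width: int, channels: int = 3) -> str:
    """Render raw 8-bit pixel data as C/C++ header text (illustrative layout)."""
    expected = height * width * channels
    if len(pixels) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(pixels)}")

    # Emit 12 values per row to keep the generated header readable.
    rows = []
    for start in range(0, len(pixels), 12):
        chunk = pixels[start:start + 12]
        rows.append("    " + ", ".join(str(value) for value in chunk) + ",")

    lines = [
        "#pragma once",
        "#include <stdint.h>",
        "",
        f"#define IMAGE_HEIGHT {height}",    # hypothetical macro names
        f"#define IMAGE_WIDTH {width}",
        f"#define IMAGE_CHANNEL {channels}",
        "",
        "const static uint8_t IMAGE_ELEMENT[] = {",  # hypothetical array name
        *rows,
        "};",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A 1x2 RGB image: one red pixel followed by one blue pixel.
    print(pixels_to_hpp(bytes([255, 0, 0, 0, 0, 255]), height=1, width=2))
```

In practice the pixel bytes would come from a decoded image file. Decoding is left out here to keep the sketch free of third-party dependencies.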
# Color Detection [[English]](./README.md)

This project is an example of the color detection interface. The input is a static image containing blocks of different colors. The output, printed in the terminal, shows the results of the interface functions: registering colors, detecting colors, segmenting colors, and deleting colors.

The project folder is structured as follows:

```shell
color_detect/
├── CMakeLists.txt
├── rgby.jpg
├── main
│   ├── app_main.cpp
│   ├── CMakeLists.txt
│   └── image.hpp
├── partitions.csv
├── README.md
└── README_cn.md
```

## Run the Example

1. Open a terminal and go to the color detection example folder esp-dl/examples/color_detect:

    ```shell
    cd ~/esp-dl/examples/color_detect
    ```

2. Set the target chip:

    ```shell
    idf.py set-target [SoC]
    ```

    Replace [SoC] with your target chip, such as esp32, esp32s2, or esp32s3.

    Because ESP32-S3 runs AI applications much faster than the other chips, we recommend using ESP32-S3.

3. Flash the program and launch IDF Monitor to obtain the results of each function:

    ```shell
    idf.py flash monitor

    ... ...

    the information of registered colors:
    name: red, thresh: 0, 10, 203, 255, 197, 255
    name: green, thresh: 54, 62, 221, 255, 197, 255
    name: blue, thresh: 96, 114, 179, 255, 230, 255
    name: yellow, thresh: 19, 32, 214, 255, 247, 255

    RGB888 | color detection result:
    color 0: detected box :2
    center: (46, 14) box: (0, 0), (94, 30) area: 768
    center: (14, 110) box: (0, 96), (30, 126) area: 256
    color 1: detected box :2
    center: (110, 30) box: (96, 0), (126, 62) area: 512
    center: (30, 46) box: (0, 32), (62, 62) area: 512
    color 2: detected box :2
    center: (88, 68) box: (64, 32), (126, 94) area: 768
    center: (14, 78) box: (0, 64), (30, 94) area: 256
    color 3: detected box :1
    center: (70, 102) box: (32, 64), (126, 126) area: 1024

    RGB565 | color detection result:
    color 0: detected box :2
    center: (46, 14) box: (0, 0), (94, 30) area: 768
    center: (14, 110) box: (0, 96), (30, 126) area: 256
    color 1: detected box :2
    center: (110, 30) box: (96, 0), (126, 62) area: 512
    center: (30, 46) box: (0, 32), (62, 62) area: 512
    color 2: detected box :2
    center: (88, 68) box: (64, 32), (126, 94) area: 768
    center: (14, 78) box: (0, 64), (30, 94) area: 256
    color 3: detected box :1
    center: (70, 102) box: (32, 64), (126, 126) area: 1024

    remained colors num: 3

    Blue, Yellow | color detection result:
    color 0: detected box :2
    center: (88, 68) box: (64, 32), (126, 94) area: 768
    center: (14, 78) box: (0, 64), (30, 94) area: 256
    color 1: detected box :1
    center: (70, 102) box: (32, 64), (126, 126) area: 1024

    Blue, Yellow | color segmentation result:
    color 0: detected box :2
    box_index: 0, start col: 32, end col: 47, row: 16, area: 768
    box_index: 0, start col: 32, end col: 47, row: 17, area: 768
    box_index: 0, start col: 32, end col: 47, row: 18, area: 768
    box_index: 0, start col: 32, end col: 47, row: 19, area: 768
    box_index: 0, start col: 32, end col: 47, row: 20, area: 768
    box_index: 0, start col: 32, end col: 47, row: 21, area: 768
    box_index: 0, start col: 32, end col: 47, row: 22, area: 768
    box_index: 0, start col: 32, end col: 47, row: 23, area: 768
    box_index: 0, start col: 32, end col: 47, row: 24, area: 768
    box_index: 0, start col: 32, end col: 47, row: 25, area: 768
    box_index: 0, start col: 32, end col: 47, row: 26, area: 768
    box_index: 0, start col: 32, end col: 47, row: 27, area: 768
    box_index: 0, start col: 32, end col: 47, row: 28, area: 768
    box_index: 0, start col: 32, end col: 47, row: 29, area: 768
    box_index: 0, start col: 32, end col: 47, row: 30, area: 768
    box_index: 0, start col: 32, end col: 47, row: 31, area: 768
    box_index: 1, start col: 0, end col: 15, row: 32, area: 256
    box_index: 0, start col: 32, end col: 63, row: 32, area: 768
    box_index: 1, start col: 0, end col: 15, row: 33, area: 256
    box_index: 0, start col: 32, end col: 63, row: 33, area: 768
    box_index: 1, start col: 0, end col: 15, row: 34, area: 256
    box_index: 0, start col: 32, end col: 63, row: 34, area: 768
    box_index: 1, start col: 0, end col: 15, row: 35, area: 256
    box_index: 0, start col: 32, end col: 63, row: 35, area: 768
    box_index: 1, start col: 0, end col: 15, row: 36, area: 256
    box_index: 0, start col: 32, end col: 63, row: 36, area: 768
    box_index: 1, start col: 0, end col: 15, row: 37, area: 256
    box_index: 0, start col: 32, end col: 63, row: 37, area: 768
    box_index: 1, start col: 0, end col: 15, row: 38, area: 256
    box_index: 0, start col: 32, end col: 63, row: 38, area: 768
    box_index: 1, start col: 0, end col: 15, row: 39, area: 256
    box_index: 0, start col: 32, end col: 63, row: 39, area: 768
    box_index: 1, start col: 0, end col: 15, row: 40, area: 256
    box_index: 0, start col: 32, end col: 63, row: 40, area: 768
    box_index: 1, start col: 0, end col: 15, row: 41, area: 256
    box_index: 0, start col: 32, end col: 63, row: 41, area: 768
    box_index: 1, start col: 0, end col: 15, row: 42, area: 256
    box_index: 0, start col: 32, end col: 63, row: 42, area: 768
    box_index: 1, start col: 0, end col: 15, row: 43, area: 256
    box_index: 0, start col: 32, end col: 63, row: 43, area: 768
    box_index: 1, start col: 0, end col: 15, row: 44, area: 256
    box_index: 0, start col: 32, end col: 63, row: 44, area: 768
    box_index: 1, start col: 0, end col: 15, row: 45, area: 256
    box_index: 0, start col: 32, end col: 63, row: 45, area: 768
    box_index: 1, start col: 0, end col: 15, row: 46, area: 256
    box_index: 0, start col: 32, end col: 63, row: 46, area: 768
    box_index: 1, start col: 0, end col: 15, row: 47, area: 256
    box_index: 0, start col: 32, end col: 63, row: 47, area: 768
    color 1: detected box :1
    box_index: 0, start col: 16, end col: 31, row: 32, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 33, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 34, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 35, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 36, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 37, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 38, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 39, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 40, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 41, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 42, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 43, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 44, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 45, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 46, area: 1024
    box_index: 0, start col: 16, end col: 31, row: 47, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 48, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 49, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 50, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 51, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 52, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 53, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 54, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 55, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 56, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 57, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 58, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 59, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 60, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 61, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 62, area: 1024
    box_index: 0, start col: 16, end col: 63, row: 63, area: 1024
    ```
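The log above reports detection results for both RGB888 and RGB565 input. As background, the sketch below shows the conventional way an RGB888 pixel is packed into a 16-bit RGB565 value (5 bits red, 6 bits green, 5 bits blue). The function names are illustrative and not part of the ESP-DL API. The byte order the example firmware expects is not stated in this README, so that detail is left out and should be checked against the source.

```python
def rgb888_to_rgb565(r: int, g: int, b: int) -> int:
    """Pack 8-bit R, G, B into one 16-bit RGB565 value (RRRRRGGG GGGBBBBB)."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("channel values must be in 0..255")
    # Keep the top 5 bits of red and blue and the top 6 bits of green.
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def rgb565_to_rgb888(value: int) -> tuple:
    """Expand RGB565 back to 8-bit channels; the discarded low bits are lost."""
    r = (value >> 11) & 0x1F
    g = (value >> 5) & 0x3F
    b = value & 0x1F
    # Shift back up; the result only approximates the original colour.
    return (r << 3, g << 2, b << 3)


if __name__ == "__main__":
    packed = rgb888_to_rgb565(255, 0, 0)
    print(hex(packed))               # 0xf800
    print(rgb565_to_rgb888(packed))  # (248, 0, 0)
```

Because the low bits are discarded, the two formats can represent slightly different colour values. In the log above, the detection results for RGB888 and RGB565 are identical.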
# Face Recognition [[English]](./README.md)

This project is an example of the face recognition interface. The input is a static image containing a human face. The output, printed in the terminal, shows the results of the interface functions: enrolling faces, recognizing faces, and deleting faces.

The interface provides two versions of the model: 16-bit quantized and 8-bit quantized. Compared with the 8-bit model, the 16-bit model is more accurate but uses more memory and runs more slowly. Choose the model that suits your scenario.

The project folder is structured as follows:

```shell
face_recognition/
├── CMakeLists.txt
├── image.jpg
├── main
│   ├── app_main.cpp
│   ├── CMakeLists.txt
│   └── image.hpp
├── partitions.csv
├── README.md
└── README_cn.md
```

## Run the Example

1. Open a terminal and go to the face recognition example folder esp-dl/examples/face_recognition:

    ```shell
    cd ~/esp-dl/examples/face_recognition
    ```

2. Set the target chip:

    ```shell
    idf.py set-target [SoC]
    ```

    Replace [SoC] with your target chip, such as esp32, esp32s2, or esp32s3.

    Because ESP32-S3 runs AI applications much faster than the other chips, we recommend using ESP32-S3.

3. Flash the program and launch IDF Monitor to obtain the results of each function:

    ```shell
    idf.py flash monitor

    ... ...

    E (1907) MFN: Flash is empty

    enroll id ...
    name: Sandra, id: 1
    name: Jiong, id: 2

    recognize face ...
    [recognition result] id: 1, name: Sandra, similarity: 0.728666
    [recognition result] id: 2, name: Jiong, similarity: 0.827225

    recognizer information ...
    recognizer threshold: 0.55
    input shape: 112, 112, 3

    face id information ...
    number of enrolled ids: 2
    id: 1, name: Sandra
    id: 2, name: Jiong

    delete id ...
    number of remaining ids: 1
    [recognition result] id: -1, name: unknown, similarity: 0.124767

    enroll id ...
    name: Jiong, id: 2
    write 2 ids to flash.

    recognize face ...
    [recognition result] id: 1, name: Sandra, similarity: 0.758815
    [recognition result] id: 2, name: Jiong, similarity: 0.722041
    ```

## Other Settings

1. The macro `QUANT_TYPE` at the beginning of [./main/app_main.cpp](./main/app_main.cpp) selects the quantization type of the model.

    - `QUANT_TYPE` = 0: use the 8-bit quantized model. Recognition accuracy is lower than that of the 16-bit model, but the model is faster and uses less memory.
    - `QUANT_TYPE` = 1: use the 16-bit quantized model. Recognition accuracy matches that of the floating-point model.

    Choose the model that suits your scenario.

2. The macro `USE_FACE_DETECTOR` at the beginning of [./main/app_main.cpp](./main/app_main.cpp) selects how the face key point (landmark) coordinates are obtained.

    - `USE_FACE_DETECTOR` = 0: use the key point coordinates stored in ./image.hpp.
    - `USE_FACE_DETECTOR` = 1: use the face detection model to obtain the key point coordinates.

    Note that the key point coordinates must be in the following order:

    ```
    left_eye_x, left_eye_y,
    mouth_left_x, mouth_left_y,
    nose_x, nose_y,
    right_eye_x, right_eye_y,
    mouth_right_x, mouth_right_y
    ```

## Latency

|   SoC    |     8-bit |   16-bit |
| :------: | --------: | -------: |
|  ESP32   | 13,301 ms | 5,041 ms |
| ESP32-S3 |    287 ms |   554 ms |
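The key point order listed above (left eye, left mouth corner, nose, right eye, right mouth corner) is easy to get wrong when coordinates are supplied by hand. The sketch below arranges named key points into that flat order. The helper name `flatten_landmarks` and the dictionary keys are assumptions for illustration and are not part of the ESP-DL API. The sample coordinates are the ones printed by the human face detection example.

```python
# Order required by the face recognition example, as listed in this README.
LANDMARK_ORDER = ("left_eye", "mouth_left", "nose", "right_eye", "mouth_right")


def flatten_landmarks(points: dict) -> list:
    """Return [x0, y0, x1, y1, ...] in the order listed in this README."""
    missing = [name for name in LANDMARK_ORDER if name not in points]
    if missing:
        raise KeyError(f"missing key points: {missing}")
    flat = []
    for name in LANDMARK_ORDER:
        x, y = points[name]
        flat.extend((x, y))
    return flat


if __name__ == "__main__":
    # Coordinates printed by the human face detection example.
    detected = {
        "left_eye": (157, 131),
        "right_eye": (199, 133),
        "nose": (170, 163),
        "mouth_left": (158, 177),
        "mouth_right": (193, 180),
    }
    print(flatten_landmarks(detected))
    # [157, 131, 158, 177, 170, 163, 199, 133, 193, 180]
```

Keying the points by name rather than by position means the caller cannot silently pass them in a different order.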
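# Human Face Detection [[English]](./README.md)

This project is an example of the human face detection interface. The interface takes a static image as input. The confidence score and coordinates of each detection are printed in the terminal, and the resulting image can be displayed on a PC with a tool.

The project folder is structured as follows:

```shell
human_face_detect/
├── CMakeLists.txt
├── image.jpg
├── main
│   ├── app_main.cpp
│   ├── CMakeLists.txt
│   └── image.hpp
├── partitions.csv
├── README.md
├── README_cn.md
└── result.png
```

## Run the Example

1. Open a terminal and go to the human face detection example folder esp-dl/examples/human_face_detect:

    ```shell
    cd ~/esp-dl/examples/human_face_detect
    ```

2. Set the target chip:

    ```shell
    idf.py set-target [SoC]
    ```

    Replace [SoC] with your target chip, such as esp32, esp32s2, or esp32s3.

3. Flash the firmware and print the score and coordinates of the detection result:

    ```shell
    idf.py flash monitor

    ... ...

    [0] score: 0.987580, box: [137, 75, 246, 215]
        left eye: (157, 131), right eye: (199, 133)
        nose: (170, 163)
        mouth left: (158, 177), mouth right: (193, 180)
    ```

4. The display tool `display_image.py` in the [examples/tool/](../tool/) directory makes it easier to view the detection result as an image. Follow the [tool](../tool/README_cn.md) instructions and run the following command:

    ```shell
    python display_image.py -i ../human_face_detect/image.jpg -b "(137, 75, 246, 215)" -k "(157, 131, 199, 133, 170, 163, 158, 177, 193, 180)"
    ```

    The PC screen shows the detection result of this example, as in the figure below:

    <p align="center">
        <img width="%" src="./result.png">
    </p>

## Other Settings

The macro `TWO_STAGE` at the beginning of [./main/app_main.cpp](./main/app_main.cpp) selects the detection algorithm. As described in the code comments:

- `TWO_STAGE` = 1: the detector is two-stage. Results are more accurate and include face key points, but detection is slower.
- `TWO_STAGE` = 0: the detector is one-stage. Results are slightly less accurate and do not include face key points, but detection is faster.

You can try both settings to compare them.

## Customize the Input Image

In this example, [./main/image.hpp](./main/image.hpp) is the default input image. Following the [tool](../tool/README_cn.md) instructions, you can use the conversion tool `convert_to_u8.py` in the [examples/tool/](../tool/) directory to convert your own image to C/C++ form and replace the default image.

1. Save your image in the ./examples/human_face_detect directory, then use [examples/tool/convert_to_u8.py](../tool/convert_to_u8.py) to convert it to an hpp file:

    ```shell
    # Assuming the current directory is still human_face_detect
    python ../tool/convert_to_u8.py -i ./image.jpg -o ./main/image.hpp
    ```

2. Follow the steps in [Run the Example](#run-the-example) to flash the firmware, print the confidence score and coordinates of the detection result, and display the result image.

## Latency

|   SoC    | `TWO_STAGE` = 1 | `TWO_STAGE` = 0 |
| :------: | --------------: | --------------: |
|  ESP32   |      415,246 us |      154,687 us |
| ESP32-S2 |    1,052,363 us |      309,159 us |
| ESP32-S3 |       56,303 us |       16,614 us |

> The figures above are based on the default configuration of the example.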
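Viewing a detection result means copying the box and key point coordinates from the monitor output into the `display_image.py` command by hand. The sketch below automates that copy with regular expressions. It assumes the log format shown in this README. The helper names `parse_detection` and `display_command` are hypothetical, and only the `-i`, `-b`, and `-k` options that appear in the README's command are used.

```python
import re

SAMPLE_LOG = """\
[0] score: 0.987580, box: [137, 75, 246, 215]
    left eye: (157, 131), right eye: (199, 133)
    nose: (170, 163)
    mouth left: (158, 177), mouth right: (193, 180)
"""

BOX_RE = re.compile(
    r"score:\s*([\d.]+),\s*box:\s*\[(-?\d+),\s*(-?\d+),\s*(-?\d+),\s*(-?\d+)\]"
)
# Key points in the order used by the -k argument in this README.
KEYPOINT_NAMES = ("left eye", "right eye", "nose", "mouth left", "mouth right")


def parse_detection(log: str):
    """Extract (score, box, keypoints) for the first detection in the log."""
    match = BOX_RE.search(log)
    if match is None:
        raise ValueError("no detection line found")
    score = float(match.group(1))
    box = tuple(int(v) for v in match.groups()[1:])
    keypoints = []
    for name in KEYPOINT_NAMES:
        point = re.search(rf"{name}:\s*\((-?\d+),\s*(-?\d+)\)", log)
        if point is None:
            return score, box, ()  # key points are absent from this log
        keypoints.extend(int(v) for v in point.groups())
    return score, box, tuple(keypoints)


def display_command(image: str, box, keypoints=()) -> str:
    """Build the display_image.py command line in the form this README uses."""
    box_arg = ", ".join(str(v) for v in box)
    command = f'python display_image.py -i {image} -b "({box_arg})"'
    if keypoints:
        kp_arg = ", ".join(str(v) for v in keypoints)
        command += f' -k "({kp_arg})"'
    return command


if __name__ == "__main__":
    score, box, keypoints = parse_detection(SAMPLE_LOG)
    print(display_command("../human_face_detect/image.jpg", box, keypoints))
```

When the log contains no key points, the `-k` argument is omitted, which matches the form of the command used in the cat face detection example.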
To add ESP-DL as a dependency of your project, run `idf.py add-dependency "espressif/esp-dl^2.0.0"`.

To create a project from one of the examples, run the corresponding command:

- Cat face detection: `idf.py create-project-from-example "espressif/esp-dl^2.0.0:cat_face_detect"`
- Color detection: `idf.py create-project-from-example "espressif/esp-dl^2.0.0:color_detect"`
- Face recognition: `idf.py create-project-from-example "espressif/esp-dl^2.0.0:face_recognition"`
- Human face detection: `idf.py create-project-from-example "espressif/esp-dl^2.0.0:human_face_detect"`