# ESP-NN

The library contains optimised NN (Neural Network) functions for various Espressif chips.

* Supported platforms:
   * TensorFlow Lite Micro (TFLite Micro); see the [tflite-micro-esp-examples](https://github.com/espressif/tflite-micro-esp-examples) repository

* Supported ESP chips include:
   * ESP32-S3 (Assembly versions optimised to benefit from vector instructions of ESP32-S3)
   * ESP32-P4 (Optimised using PIE/QACC SIMD instructions)
   * ESP32 (Generic optimisations)
   * ESP32-C3 (Generic optimisations)

## Performance

### Kernelwise performance for s8 versions:

  * Kernelwise performance on ESP32-P4 chip
    * Numbers are ticks taken for kernel to execute
    * Chip config: 360MHz, SPI-RAM: HEX 200MHz, L2-Cache: 128KB

    | Function        | ANSI C  | Optimized | Opt Ratio | Data info   | Memory    |
    | ----------------| --------|-----------|-----------|-------------|-----------|
    | elementwise_add | 187971  | 173104    | --        | size = 1615 | External  |
    | elementwise_mul | 79898   | 71245     | --        | size = 1615 | External  |
    | convolution     | 4005512 | 572459    | 7.00      | input(10,10), filter(64x1x1x64), pad(0,0), stride(1,1) | External |
    | convolution     | 249389  | 98319     | 2.54      | input(8,8), filter(16x1x1x16), pad(0,0), stride(1,1) | External |
    | convolution     | 816975  | 533318    | 1.53      | input(10,10), filter(64x3x3x3), pad(0,0), stride(1,1) | External |
    | depthwise conv  | 962834  | 482389    | 2.00      | input(16,16), pad(0,0), stride(1,1), filter: 1x3x3x16 | External |
    | depthwise conv  | 1365066 | 703989    | 1.94      | input(12,12), pad(1,1), stride(1,1), filter: 8x5x5x4 | External |
    | max pool        | 601843  | 592189    | --        | input(16,16), filter(1x3x3x16) | Internal |
    | avg pool        | 392947  | 380527    | --        | input(16,16), filter(1x3x3x16) | Internal |
    | fully connected | 7692    | 7616      | --        | len: 271, ch = 3 | Internal |
    | prelu (relu6)   | 22487   | 18963     | --        | size: 1615  | Internal  |


  * Kernelwise performance on ESP32-S3 chip
    * Numbers are ticks taken for kernel to execute
    * Chip config: 240MHz, SPI: QPI 80MHz, Data cache: 64KB

    | Function        | ANSI C   | Optimized | Opt Ratio | Data info   | Memory    |
    | ----------------| ---------|-----------|-----------|-------------|-----------|
    | elementwise_add | 281337   | 74440     | 3.78      | size = 1615 | External  |
    | elementwise_mul | 122703   | 35002     | 3.51      | size = 1615 | External  |
    | convolution     | 4712500  | 331008    | 14.24     | input(10,10), filter(64x1x1x64), pad(0,0), stride(1,1) | External |
    | convolution     | 312754   | 39022     | 8.01      | input(8,8), filter(16x1x1x16), pad(0,0), stride(1,1) | External |
    | convolution     | 2193289  | 394842    | 5.55      | input(8,8), filter(64x3x3x3), pad(0,0), stride(1,1) | External |
    | depthwise conv  | 1159831  | 184176    | 6.30      | input(18,18), pad(0,0), stride(1,1), filter: 1x3x3x16 | External |
    | depthwise conv  | 1671363  | 372435    | 4.49      | input(12,12), pad(1,1), stride(1,1), filter: 8x5x5x4 | External |
    | max pool        | 376294   | 48069     | 7.83      | input(16,16), filter(1x3x3x16) | Internal |
    | avg pool        | 427293   | 118052    | 3.62      | input(16,16), filter(1x3x3x16) | Internal |
    | fully connected | 8443     | 1078      | 7.83      | len: 271, ch = 3 | Internal |
    | softmax         | 15209    | 11107     | 1.37      | h: 8, w: 32 | Internal  |
    | prelu (relu6)   | 1125     | 98        | 11.48     | size: 1615  | Internal  |
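
The `Opt Ratio` column in the tables above is consistent with dividing the ANSI C tick count by the optimized tick count. The following is a minimal sketch of that arithmetic, not code from the library: the function names `opt_ratio` and `ticks_to_us` are hypothetical, and the conversion to microseconds assumes that one tick equals one CPU cycle, which the tables do not state.

```c
#include <stdint.h>

/* Hypothetical helper: speed-up factor as shown in the "Opt Ratio" column,
 * i.e. reference (ANSI C) ticks divided by optimized ticks.
 * Returns 0.0 when optimized_ticks is zero to avoid a division by zero. */
double opt_ratio(uint32_t ansi_ticks, uint32_t optimized_ticks)
{
    if (optimized_ticks == 0) {
        return 0.0;
    }
    return (double)ansi_ticks / (double)optimized_ticks;
}

/* Hypothetical helper: converts ticks to microseconds.
 * ASSUMPTION: one tick is one CPU cycle at cpu_mhz MHz. */
double ticks_to_us(uint32_t ticks, uint32_t cpu_mhz)
{
    if (cpu_mhz == 0) {
        return 0.0;
    }
    return (double)ticks / (double)cpu_mhz;
}
```

For the first ESP32-S3 convolution row, `opt_ratio(4712500, 331008)` gives roughly 14.24, which matches the table. Under the cycle assumption, 331008 ticks at 240 MHz would correspond to about 1379 µs.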


### Model-level performance:

  * **Person Detection** (Visual Wake Words, INT8 quantized, from [esp-tflite-micro](https://github.com/espressif/esp-tflite-micro))
    * Numbers are the time (ms) taken by the `invoke()` call, using internal memory

    | Chip     | CPU Freq | without ESP-NN | with ESP-NN |
    | -------- | -------- | -------------- | ----------- |
    | ESP32-P4 | 360MHz   | 1395ms         | 73ms        |
    | ESP32-S3 | 240MHz   | 2300ms         | 54ms        |
    | ESP32    | 240MHz   | 4084ms         | 380ms       |
    | ESP32-C3 | 160MHz   | 3355ms         | 426ms       |

  * **MobileNetV3 Small** (INT8 quantized, 224x224x3, 1000 classes)

    | Chip     | CPU Freq | without ESP-NN | with ESP-NN |
    | -------- | -------- | -------------- | ----------- |
    | ESP32-S3 | 240MHz   | 26000ms        | 1434ms      |
    | ESP32-P4 | 360MHz   | 11600ms        | 1305ms      |

> **Note**:
  - The numbers above are the time taken to execute the `invoke()` call
  - SPIRAM is used for the TensorArena.
  - Person detection on ESP32-S3 with internal RAM: 47ms
  - ESP32-P4 optimisation is a work in progress
  - The `without ESP-NN` case is measured with `esp-nn` completely disabled, by removing the flag below from [CMakeLists.txt](CMakeLists.txt):
    ```cmake
      # enable ESP-NN optimizations by Espressif
      target_compile_options(${COMPONENT_LIB} PRIVATE -DESP_NN)
    ```
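
To illustrate how a compile definition such as `-DESP_NN` can switch a kernel between the library and a plain C fallback, here is a minimal sketch. The wrapper `demo_relu6_s8` is hypothetical and not part of ESP-NN or TFLite Micro. The header name `esp_nn.h` and the call `esp_nn_relu6_s8(data, size)` are assumptions about the library's interface and should be checked against the sources. The fallback is a plain clamp of raw values to [0, 6], written only for illustration.

```c
#include <stdint.h>

#ifdef ESP_NN
#include "esp_nn.h"   /* ASSUMPTION: umbrella header of the esp-nn component */
#endif

/* Hypothetical wrapper: applies relu6 in place to an int8 buffer. */
void demo_relu6_s8(int8_t *data, uint16_t size)
{
#ifdef ESP_NN
    /* Path taken when the build passes -DESP_NN.
     * ASSUMPTION: this function exists with this signature. */
    esp_nn_relu6_s8(data, size);
#else
    /* Path taken when the flag is removed: portable C fallback. */
    for (uint16_t i = 0; i < size; i++) {
        int32_t v = data[i];
        if (v < 0) {
            v = 0;
        }
        if (v > 6) {
            v = 6;
        }
        data[i] = (int8_t)v;
    }
#endif
}
```

Built without the flag, only the fallback loop is compiled, so the file has no dependency on the component at all. That is one way the "without ESP-NN" measurement could be isolated from the library.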


## Configuration

  * To configure, please use `idf.py menuconfig` and under `ESP-NN` select `NN_OPTIMIZATIONS`
  * There are two options presented:
     * Optimized versions
     * ANSI C

  * The default selection is `Optimized versions`. For ESP32-S3 and ESP32-P4, assembly versions are selected automatically, whereas for other chips (viz., ESP32, ESP32-C3), generic optimisations are selected.
  * For debugging purposes, you may want to select `ANSI C` reference versions.
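
When debugging with the `ANSI C` reference versions, one possible approach is to run the same input through a reference build and an optimized build and compare the output buffers. The sketch below shows such a comparison. The helper `first_mismatch_s8` is hypothetical and not part of ESP-NN, and it assumes bit-exact agreement is expected. If the kernels being compared tolerate small rounding differences, a tolerance check would be needed instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the index of the first element where the
 * reference and optimized outputs differ, or -1 if the buffers match. */
long first_mismatch_s8(const int8_t *ref, const int8_t *opt, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (ref[i] != opt[i]) {
            return (long)i;
        }
    }
    return -1;
}
```

Reporting the first differing index, rather than a plain pass/fail, makes it easier to relate a mismatch to a specific output channel or position.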


## Contributing

If you encounter an issue with ESP-NN, or wish to submit a feature request, please use the Issues section on GitHub.

For general questions related to this library, please use the esp32.com forum.

Please check [CONTRIBUTING.md](CONTRIBUTING.md) for further information if you'd like to contribute to ESP-NN.

## Copyrights and License

All original source code in this repository is Copyright (C) 2020-2021 Espressif Systems. This source code is licensed under the Apache License 2.0 as described in the file LICENSE.


To add this component to your project, run `idf.py add-dependency "espressif/esp-nn^1.2.1"`.
