# PicoTTS

This component provides an ESP-IDF port of the popular PicoTTS Text-to-Speech engine.

While Espressif provides a Chinese-language TTS, to date there has been no support for other languages. PicoTTS fills this gap, and provides Text-to-Speech for the following languages:

- English (UK)
- English (US)
- German
- French
- Italian
- Spanish

## Requirements

The Text-to-Speech engine is quite resource intensive. While the code size is only around 175 KB, language resources occupy another 750-1400 KB of flash depending on the language, and the engine uses just over 1.1 MB of RAM while initialised. As such, an ESP32-S3 with a sufficient amount of PSRAM and flash is the recommended target.

This component does not provide any board-specific audio support. The TTS engine generates 16-bit samples at 16 kHz, and leaves it to the user to direct those to the correct audio device.

## Getting started

Using the PicoTTS component is straightforward. Effectively the steps are:

- Initialise the engine, registering a callback function to receive the speech samples
- Send text to the engine
- Eventually, shut down the engine

In code, this can look like:

```
#include "picotts.h"
#include "esp_codec_dev.h"

#define TTS_TASK_PRIORITY 5
#define TTS_CORE 1

// Board-specific audio output, assumed to be opened elsewhere in the project
extern esp_codec_dev_handle_t speaker_codec_dev;

// Receives each chunk of generated samples; the write length is in bytes
void my_sample_cb(int16_t *buf, unsigned count)
{
  esp_codec_dev_write(speaker_codec_dev, buf, count * sizeof(*buf));
}

void app_main(void)
{
  if (picotts_init(TTS_TASK_PRIORITY, my_sample_cb, TTS_CORE))
  {
    static const char msg[] = "Hello, world";
    picotts_add(msg, sizeof(msg)); // Include the \0 to tell TTS to go

    // Do other stuff, or at least wait until the msg has been spoken

    picotts_shutdown();
  }
}
```

API documentation can be found in the [picotts.h](include/picotts.h) header file.

## Resource handling

The PicoTTS engine relies on two resource blobs, a Text Analysis (TA) resource and a Signal Generator (SG) resource. In upstream PicoTTS, these are loaded into RAM from files on disk.
As RAM is a very precious resource on a microcontroller, this component has replaced the resource loading routines such that the resources can be accessed directly from memory-mapped flash instead. This reduces the RAM footprint from 2.5 MB down to 1.1 MB.

There are two options for bundling the resource files onto flash. The default, and arguably the easiest, is to embed the resource files directly into the application binary. The one downside to this approach is that the application size grows significantly, which may present an issue with firmware upgrades. You will need a much larger application partition than usual.

Alternatively, the resource files can be placed in dedicated flash partitions and accessed from there instead. These partitions have to be flashed independently from the application partition, but the advantage is that the language resources are no longer directly coupled to the application binary.

Which approach is best will depend on the specific project circumstances.

### Custom partitions for language resources

When this component is configured to load its language resources from partitions rather than having them embedded directly into the application binary, you will need to add partition entries to hold the Text Analysis (TA) and Signal Generator (SG) resources. Example entries for `partitions.csv`:

```
picotts_ta, 0x40, 0x0, , 640K,
picotts_sg, 0x40, 0x1, , 820K,
```

You are free to use any valid partition type and subtype. This component locates the partitions purely by name. The partition names may be changed via Kconfig if so desired.

The partition sizes may be shrunk to better match the language you're using. What is shown here are the maximum partition sizes, which fit any language bundle.

### Flashing language resources to partitions

When configured for loading language resources from partitions, the following commands can be used to flash them to the correct locations.
These commands assume the partition names are `picotts_ta` and `picotts_sg`, respectively.

```
parttool.py write_partition --partition-name picotts_ta --input build/esp-idf/picotts/picotts_ta.bin
parttool.py write_partition --partition-name picotts_sg --input build/esp-idf/picotts/picotts_sg.bin
```

## Examples

The [boot_greeting](examples/boot_greeting/README.md) example is written for ESP-BOX and uses this component to issue a greeting upon boot.

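
As a smaller illustration of the sample callback from Getting started, the sketch below collects generated samples in memory instead of sending them to an audio device, and converts sample counts to bytes and milliseconds for the 16-bit, 16 kHz output format. It is only a sketch: `capture_cb`, `capture_buf`, `samples_to_bytes`, `samples_to_ms` and the two-second buffer size are hypothetical names and choices made for this example, not part of the component. The only thing taken from this README is the callback shape (`int16_t *buf, unsigned count`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SAMPLE_RATE_HZ      16000u                 /* output rate stated in Requirements */
#define CAPTURE_MAX_SAMPLES (SAMPLE_RATE_HZ * 2u)  /* hypothetical: room for 2 s of speech */

/* Hypothetical capture state. The callback takes no user-data argument,
 * so the state lives at file scope. */
static int16_t  capture_buf[CAPTURE_MAX_SAMPLES];
static unsigned capture_len;      /* samples stored so far */
static unsigned capture_dropped;  /* samples that did not fit */

/* Same shape as my_sample_cb: a pointer to 16-bit samples plus a sample count. */
void capture_cb(int16_t *buf, unsigned count)
{
  assert(buf != NULL || count == 0);

  unsigned space = CAPTURE_MAX_SAMPLES - capture_len;
  unsigned n = count < space ? count : space;

  if (n > 0)
    memcpy(&capture_buf[capture_len], buf, n * sizeof(int16_t));

  capture_len += n;
  capture_dropped += count - n;  /* anything beyond the buffer is counted, not stored */
}

/* Bytes occupied by `count` samples (two bytes per sample). */
size_t samples_to_bytes(unsigned count)
{
  return (size_t)count * sizeof(int16_t);
}

/* Playback time of `count` samples in whole milliseconds at 16 kHz. */
unsigned samples_to_ms(unsigned count)
{
  return (unsigned)((uint64_t)count * 1000u / SAMPLE_RATE_HZ);
}
```

Passing `capture_cb` to `picotts_init()` in place of `my_sample_cb` would be one way to inspect the generated speech without audio hardware. The helpers also make the byte arithmetic explicit: a chunk of 160 samples is 320 bytes and 10 ms of audio.
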
To add this component to a project via the ESP-IDF component manager, run `idf.py add-dependency "jmattsson/picotts^1.1.0"`.