# esp-idf-zstd
[Zstandard](https://github.com/facebook/zstd) (zstd) compression library packaged as an ESP-IDF component. Wraps upstream zstd v1.5.7 as a git submodule.
## Memory usage
zstd was designed for desktop and server workloads. On ESP32 it works but uses substantially more RAM than zlib or heatshrink.
Measured on an ESP32-S3 compressing a 73-byte JSON payload with a 10.7 KB trained dictionary:
- Default level 3 + CDict: ~220 KB working heap, ~2.5–3.0x ratio
- Level 1 + CDict: ~180 KB working heap, ~2.1x ratio
- Aggressively tuned (level -3, windowLog=14, hashLog=12, pledged srcSize): ~65 KB working heap, ~1.8x ratio
- Below ~65 KB the match finder runs out of slots to use the dictionary and the compressor effectively stops working
For comparison, zlib at minimum tuning uses ~10–15 KB of working heap and achieves comparable ratios on small dictionary-aware payloads; heatshrink runs in 1–4 KB.
The `ZSTD_DCtx` struct alone is 50+ KB at default settings, so even decompression-only deployments are memory-heavy. Plan for a 150–250 KB heap window at default settings, or ~65–100 KB if you tune aggressively.
### When to pick which codec
| Situation | Recommended |
|---|---|
| Payloads >1 KB, ≥150 KB free heap | zstd — best ratio + speed |
| Speed-critical, frequent compression, RAM available | zstd at level 1 — ~3–5× faster than zlib |
| Small payloads (<200 B), tuning effort acceptable | zstd with aggressive tuning (usable but not clearly better than zlib) |
| Small payloads, <150 KB free heap | zlib + primed dictionary (miniz is already in ESP-IDF) |
| Very tight RAM (<32 KB free) | heatshrink |
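The table can be approximated in code when a device has to choose a codec at runtime. The sketch below is illustrative only: `codec_t` and `pick_codec` are hypothetical names, not part of this component. It encodes only the rows with hard numeric thresholds. The judgment-based rows (speed-critical use, acceptable tuning effort) are left out, and falling back to zlib for everything else is an assumption.

```c
#include <stddef.h>

/* Hypothetical codec identifiers for illustration. */
typedef enum { CODEC_HEATSHRINK, CODEC_ZLIB_DICT, CODEC_ZSTD } codec_t;

/* Pick a codec from payload size and free heap.
 * Thresholds are copied from the table above; the zlib fallback is an assumption. */
static codec_t pick_codec(size_t payload_bytes, size_t free_heap_bytes)
{
    const size_t KB = 1024;

    if (free_heap_bytes < 32 * KB) {
        return CODEC_HEATSHRINK;   /* very tight RAM */
    }
    if (payload_bytes > 1 * KB && free_heap_bytes >= 150 * KB) {
        return CODEC_ZSTD;         /* enough heap for near-default zstd settings */
    }
    return CODEC_ZLIB_DICT;        /* small payloads or limited heap */
}
```

On ESP-IDF the free-heap argument could come from `esp_get_free_heap_size()`, measured at the point where compression would run.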
## Features
- Full zstd v1.5.7 compression and decompression
- Kconfig toggles to disable either direction (saves code size)
- Minify mode for smallest possible code-size footprint
- Trained-dictionary support for small structured payloads (within the RAM budget above)
- Host-side Python dictionary trainer included
## Installation
### ESP-IDF Component Manager (recommended)
Add to your project's `idf_component.yml`:
```yaml
dependencies:
  rderr/esp-idf-zstd:
    version: "*"
```
Then run:
```bash
idf.py reconfigure
```
### Manual
Clone into your project's `components/` directory:
```bash
cd your_project/components
git clone --recursive https://github.com/rderr/esp-idf-zstd.git
```
## Quick Start
**Do not call zstd from the main task.** `ZSTD_compress` and `ZSTD_decompress` use 6–12 KB of stack — the default ESP-IDF main task stack (3,584 bytes) will overflow silently and crash later at a context switch. Always run zstd on a dedicated worker task with a 16 KB stack (trim after measuring with `uxTaskGetStackHighWaterMark`).
```c
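#include <stdlib.h>
#include "zstd.h"
#include "esp_log.h"
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

static const char *TAG = "zstd";

static void zstd_task(void *arg)
{
    // Placeholder input; replace with your own buffer and length.
    static const char src[] = "hello hello hello hello hello hello";
    const size_t src_size = sizeof(src);

    size_t bound = ZSTD_compressBound(src_size);  // worst-case compressed size
    void *dst = malloc(bound);
    if (dst != NULL) {
        size_t compressed_size = ZSTD_compress(dst, bound, src, src_size, 1);
        if (ZSTD_isError(compressed_size)) {
            ESP_LOGE(TAG, "compress failed: %s", ZSTD_getErrorName(compressed_size));
        }
        // ... decompress similarly ...
        free(dst);
    }
    vTaskDelete(NULL);  // a FreeRTOS task must never return
}

void app_main(void)
{
    // 16 KB stack; ESP-IDF takes the stack size in bytes.
    xTaskCreate(zstd_task, "zstd", 16384, NULL, 5, NULL);
}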
```
See [`examples/basic`](examples/basic/) for a complete working example with stack measurement, and [`examples/dictionary`](examples/dictionary/) for the dictionary-based pattern.
## Configuration
Run `idf.py menuconfig` and navigate to **Component config > Zstandard (zstd)**:
| Option | Default | Description |
|---|---|---|
| `ZSTD_COMPRESSION` | y | Include compression support |
| `ZSTD_DECOMPRESSION` | y | Include decompression support |
| `ZSTD_MINIFY` | n | Enable all size optimizations (smaller code, slower) |
| `ZSTD_STRIP_ERROR_STRINGS` | n | Remove error message strings to save flash |
| `ZSTD_NO_INLINE` | n | Disable inlining to reduce code size |
### Memory Considerations
See the [Memory usage](#memory-usage) section at the top for the high-level numbers. This section covers the runtime tuning knobs in detail.
#### Tuning zstd memory usage
The component exposes the full zstd advanced API. Key knobs:
```c
ZSTD_CCtx *cctx = ZSTD_createCCtx();
ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 1); // 1, or negative
ZSTD_CCtx_setParameter(cctx, ZSTD_c_windowLog, 14); // 16 KB window
ZSTD_CCtx_setParameter(cctx, ZSTD_c_hashLog, 12); // 16 KB hash table
ZSTD_CCtx_setPledgedSrcSize(cctx, expected_payload_size); // single-shot only
```
| Parameter | Recommendation | Effect |
|---|---|---|
| `ZSTD_c_compressionLevel` | 1 to 3, or negative for tighter RAM | Lower = less working memory |
| `ZSTD_c_windowLog` | `ceil(log2(dictSize))` minimum, default ~17 | Sets the history window; must accommodate dict |
| `ZSTD_c_hashLog` | 12+ when using a dict, 6 minimum | Match-finder table size — biggest single RAM knob |
| `setPledgedSrcSize` | Set to actual payload size | Lets zstd size internal buffers to expected input |
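The size comments in the snippet above follow from simple arithmetic, sketched here as plain C helpers. The function names are hypothetical. The hash-table figure assumes 4-byte table entries, and the lower clamp of 10 is zstd's minimum `windowLog`.

```c
#include <stddef.h>

/* History window size in bytes for a given ZSTD_c_windowLog. */
static size_t window_bytes(unsigned window_log)
{
    return (size_t)1 << window_log;
}

/* Hash table size in bytes for a given ZSTD_c_hashLog (assumes 4-byte entries). */
static size_t hash_table_bytes(unsigned hash_log)
{
    return ((size_t)1 << hash_log) * 4;
}

/* Smallest windowLog whose window covers the dictionary:
 * ceil(log2(dict_size)), clamped to zstd's minimum of 10. */
static unsigned min_window_log_for_dict(size_t dict_size)
{
    unsigned log = 10;
    while (((size_t)1 << log) < dict_size) {
        log++;
    }
    return log;
}
```

For the 10.7 KB dictionary used in the measurements (about 10,957 bytes), `min_window_log_for_dict` gives 14, which matches the `windowLog=14` setting in the tuned configuration.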
#### Measured tuning sweep
Compressing a 73-byte JSON payload with a 10.7 KB trained dictionary on an ESP32-S3:
| Configuration | Working heap | Compression ratio |
|---|---|---|
| Default level 3 + CDict | ~220 KB | ~2.5-3.0x |
| Level 1 + CDict + windowLog=14 | ~180 KB | ~2.1x |
| Level 1 + CCtx-loaded dict + windowLog=14 | ~180 KB | ~2.1x |
| Level -3 + windowLog=14 + hashLog=12 + pledgedSrcSize | **~65 KB** | **~1.8x** |
| Level -3 + hashLog=6 (too aggressive) | ~48 KB | ~1.1x — defeats the dict |
The level -3 / hashLog=12 row is the embedded "knee" — below that, the match finder is too small to use the dictionary effectively. **Above that, the ratio gains are modest while RAM grows quickly.** Pick the row that fits your application's RAM budget and ratio needs.
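Picking a row can be expressed as a small lookup: take the best ratio among the configurations that fit the heap budget. The sketch below hard-codes the measurements from the table above (using the lower bound of the ratio range for level 3) and leaves out the `hashLog=6` row, since it defeats the dictionary, and the CCtx-loaded row, which measured the same as its CDict counterpart. `sweep_row_t`, `SWEEP`, and `pick_row` are hypothetical names, and the numbers apply only to the measured payload and dictionary.

```c
#include <stddef.h>

typedef struct {
    const char *name;
    unsigned heap_kb;    /* working heap in KB, from the table above */
    unsigned ratio_x10;  /* compression ratio times 10 */
} sweep_row_t;

static const sweep_row_t SWEEP[] = {
    { "level 3 + CDict",                                      220, 25 },
    { "level 1 + CDict + windowLog=14",                       180, 21 },
    { "level -3 + windowLog=14 + hashLog=12 + pledgedSrcSize", 65, 18 },
};

/* Return the best-ratio row that fits within budget_kb, or NULL if none fits. */
static const sweep_row_t *pick_row(unsigned budget_kb)
{
    const sweep_row_t *best = NULL;
    for (size_t i = 0; i < sizeof(SWEEP) / sizeof(SWEEP[0]); i++) {
        if (SWEEP[i].heap_kb <= budget_kb &&
            (best == NULL || SWEEP[i].ratio_x10 > best->ratio_x10)) {
            best = &SWEEP[i];
        }
    }
    return best;
}
```

A `NULL` result means the budget is below the ~65 KB floor, where the codec table near the top of this README points to zlib or heatshrink instead.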
### Stack Requirements
**`ZSTD_compress` and `ZSTD_decompress` use several KB of stack** — typically 8–12 KB at compression level 1, more at higher levels. The default ESP-IDF main task stack (3584 bytes) is too small and will overflow silently. The crash is reported later, often as a "stack overflow in task main" at the next context switch.
The right fix is to call zstd from a dedicated worker task with a measured stack budget — not to bloat the main task. The included examples follow this pattern:
```c
#include "esp_log.h"
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

#define ZSTD_TASK_STACK_BYTES 16384

static const char *TAG = "zstd";

static void zstd_task(void *arg)
{
    // ... ZSTD_compress / ZSTD_decompress here ...

    // Minimum free stack observed since the task started.
    UBaseType_t hw = uxTaskGetStackHighWaterMark(NULL);
    ESP_LOGI(TAG, "Stack free: %u bytes", (unsigned)(hw * sizeof(StackType_t)));
    vTaskDelete(NULL);
}

void app_main(void)
{
    xTaskCreate(zstd_task, "zstd", ZSTD_TASK_STACK_BYTES, NULL, 5, NULL);
}
```
After running once, check the logged high-water mark and trim `ZSTD_TASK_STACK_BYTES` to the peak you actually observed plus a safety margin (1–2 KB).
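The trimming arithmetic is easy to get backwards, because the high-water mark reports the minimum *free* stack, not the peak usage. A minimal sketch, assuming a hypothetical helper name and an arbitrary choice to round up to a multiple of 256 bytes:

```c
#include <stddef.h>

/* Suggest a trimmed stack size from one measurement.
 * configured_bytes: stack size passed to xTaskCreate
 * free_min_bytes:   logged high-water mark (minimum free stack observed)
 * margin_bytes:     safety margin to keep on top of the observed peak */
static size_t suggest_stack_bytes(size_t configured_bytes,
                                  size_t free_min_bytes,
                                  size_t margin_bytes)
{
    if (free_min_bytes > configured_bytes) {
        return configured_bytes;  /* inconsistent input; keep the current size */
    }
    size_t peak_used = configured_bytes - free_min_bytes;
    size_t wanted = peak_used + margin_bytes;
    return (wanted + 255) / 256 * 256;  /* round up to a multiple of 256 */
}
```

With a 16,384-byte stack, a logged 6,000 bytes free, and a 2 KB margin, this suggests 12,544 bytes. Measure with the largest payload and highest compression level the task will see, since stack use depends on both.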
To reduce stack usage further:
- Use a **negative compression level** (`-1` to `-7`) — uses less stack, less heap, less CPU, at the cost of compression ratio
- Enable `CONFIG_ZSTD_NO_INLINE=y` — less aggressive inlining keeps frame sizes smaller
- Enable `CONFIG_ZSTD_MINIFY=y` — also reduces code size at a moderate speed cost
## Dictionary Compression
Standard zstd struggles with small payloads (< 1 KB) because there isn't enough data to build effective compression tables. A **trained dictionary** solves this by front-loading the compressor with patterns learned from representative sample data.
This is especially useful for:
- **Small JSON messages**: Sensor readings, API responses, device config — a 200-byte JSON payload that barely compresses normally can achieve 3-5x compression with a dictionary
- **MQTT / telemetry**: IoT devices send structurally identical messages where only values change
- **Log entries**: Repetitive formats with fixed prefixes and templates
- **Any small, structured data** where messages share common patterns
### Building a Dictionary
A Python tool is included in `tools/dictbuilder/`:
```bash
# Install dependency
pip install zstandard
# Collect 100+ representative samples into a directory, then:
python tools/dictbuilder/build_dictionary.py samples/ -o main/zstd_dictionary.h
# Or as binary for EMBED_FILES:
python tools/dictbuilder/build_dictionary.py samples/ -o main/dict.bin --format binary
```
The tool prints evaluation stats showing compression with and without the dictionary so you can verify the benefit.
See [tools/dictbuilder/README.md](tools/dictbuilder/README.md) for full usage and tips.
### Using a Dictionary in Firmware
**Option 1: C header (simple)**
```c
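#include "zstd.h"
#include "zstd_dictionary.h"  // Generated by build_dictionary.py
// Create the CDict and CCtx once and reuse them for every message.
ZSTD_CDict *cdict = ZSTD_createCDict(zstd_dictionary, zstd_dictionary_size, 3);
ZSTD_CCtx *cctx = ZSTD_createCCtx();
size_t result = ZSTD_compress_usingCDict(cctx, dst, dst_cap, src, src_size, cdict);
if (ZSTD_isError(result)) { /* handle error, e.g. log ZSTD_getErrorName(result) */ }
// When done: ZSTD_freeCCtx(cctx); ZSTD_freeCDict(cdict);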
```
**Option 2: Binary embedded in flash**
In `CMakeLists.txt`:
```cmake
target_add_binary_data(${COMPONENT_LIB} "dict.bin" BINARY)
```
In code:
```c
extern const uint8_t dict_start[] asm("_binary_dict_bin_start");
extern const uint8_t dict_end[] asm("_binary_dict_bin_end");
ZSTD_CDict *cdict = ZSTD_createCDict(dict_start, dict_end - dict_start, 3);
```
**Important**: The decompressor needs the same dictionary. If a server decompresses the data, deploy the same dictionary file to your backend.
## Examples
- **[basic](examples/basic/)** — Simple compress/decompress round-trip
- **[dictionary](examples/dictionary/)** — Dictionary-based compression of small JSON payloads
Build an example:
```bash
cd examples/basic
idf.py set-target esp32
idf.py build flash monitor
```
## License
BSD-3-Clause (same as upstream zstd). See [LICENSE](LICENSE).