
# IoT SDK

At Video SDK, we're building tools to help developers bring **real-time collaboration** to IoT and embedded devices. With the IoT SDK, you can integrate **live audio and video communication, meeting management, device-to-cloud connectivity, and session handling** directly into ESP32-S3 boards.

## Features

- **Real-time audio**: publish the on-board microphone and subscribe to remote audio, using **PCMA (G.711 A-law)**.
- **Real-time video**: publish camera frames and render remote frames as hardware **JPEG** over the WebRTC data channel (Korvo-2).
- **Connection-state callbacks**: get notified when signaling connects or drops so your app can rejoin.
- **Meeting management**: create a room, join, and leave.
- **Runtime speaker volume** control (Korvo-2).

## Supported boards

| Board | Audio publish | Audio subscribe | Video publish | Video subscribe | Speaker volume |
|-------|:---:|:---:|:---:|:---:|:---:|
| **ESP32-S3-Korvo-2 v3.0** | ✅ | ✅ | ✅ | ✅ | ✅ |
| **XIAO ESP32-S3 (Sense)** | ✅ | ❌ | ✅ | ❌ | ❌ |

> The XIAO is **send-only**: it has no speaker or display, so `startSubscribeAudio()` and `startSubscribeVideo()` return `DEVICE_NOT_SUPPORTED`. Board selection is compile-time (see [Configure](#4-configure-menuconfig)).

## Prerequisites

- **ESP-IDF 5.4+** (required for the camera / JPEG stack).
- A valid [Video SDK account](https://app.videosdk.live/) and an auth token.

## Use the IoT SDK component

### 1. Set up ESP-IDF

Follow **Step 1** of the VideoSDK [ESP-IDF setup guide](https://docs.videosdk.live/iot/guide/video-and-audio-calling-api-sdk/quickstart/quick-start#step-1-setup-for-esp-idf) to install the toolchain. You do **not** need to run the project-creation commands. Once the environment is ready, continue from Step 2 below.

```bash
# In every new shell, activate the ESP-IDF environment
source ~/esp/esp-idf/export.sh
```

### 2. Add the IoT SDK component

Declare the component in your project's `main/idf_component.yml`.
Reference the published component from the registry:

```yaml
dependencies:
  videosdk/iot-sdk: "*" # or pin a specific version, e.g. "0.2.2"
```

### 3. Add the required dependencies

The component pulls in its own dependencies automatically, but your **application** also needs the shared IDF example/networking components. Add these to your `main/idf_component.yml`:

```yaml
dependencies:
  idf:
    version: ">=5.4.0"
  mdns: "*"
  espressif/esp_audio_codec: "~2.3.0"
  espressif/esp_codec_dev: "~1.3.4"
  espressif/esp_audio_effects: "~1.1.0"
  espressif/esp_capture: "^0.7.6"
  espressif/esp_video_codec: "~0.5.2"
  espressif/esp_jpeg: "^1.3.1"
  espressif/esp32-camera: "^2.0.15"
  espressif/esp_websocket_client: "^1.2.0"
  sepfy/srtp: "^2.3.0"
  sepfy/usrsctp: "^0.9.5"
```

### 4. Configure (menuconfig)

Set your board target, Wi-Fi, and VideoSDK credentials:

```bash
idf.py set-target esp32s3
idf.py menuconfig
```

- **VideoSDK IoT SDK → Audio hardware board**: select **ESP32-S3-Korvo-2** or **ESP32-S3-XIAO**.
- **VideoSDK Configuration**
  - **Auth token (JWT)** → `CONFIG_VIDEOSDK_TOKEN`
  - **Meeting / room ID** → `CONFIG_VIDEOSDK_MEETING_ID`
  - **Speaker output volume (0–100)** → `CONFIG_SPEAKER_VOLUME` (Korvo-2 only)
- **Example Connection Configuration** → Wi-Fi SSID and password.
- **Component config → mbedTLS** → enable **Support DTLS** and **Support TLS**.
- **Partition Table** → enable **Custom partition table CSV**.
- **Serial flasher config → Flash size** → set to match your board (e.g. 8 MB).

> The token and meeting ID are read from menuconfig (stored in `sdkconfig`). **Never hardcode a real token in source or commit it.**

### 5. Build & flash

```bash
idf.py build
idf.py -p <PORT> flash monitor
```

## Usage

Include the single public header and drive the API from **one task** (it is not thread-safe):

```c
#include "freertos/FreeRTOS.h"  // pdMS_TO_TICKS
#include "freertos/task.h"      // vTaskDelay
#include "sdkconfig.h"
#include "videosdk.h"

void app_main(void)
{
    // ... init NVS, netif, event loop, and connect Wi-Fi first ...

    // 1. Initialize the session (call exactly once, first).
    init_config_t cfg = {
        .meetingID   = CONFIG_VIDEOSDK_MEETING_ID,
        .token       = CONFIG_VIDEOSDK_TOKEN,
        .displayName = "ESP32-Device",
        .audioCodec  = AUDIO_CODEC_PCMA,
        .videoCodec  = VIDEO_CODEC_JPEG,
    };
    if (init(&cfg) != RESULT_OK) {
        return;
    }

    // 2. Join the call. startPublishAudio is the foundation: call it first;
    //    video/subscribe calls reuse its transport. (Korvo-2 for video + subscribe.)
    startPublishAudio("");  // empty publisherId => a random one is generated
    startPublishVideo();
    startSubscribeAudio();
    startSubscribeVideo();

    // 3. Let the SDK's internal tasks run.
    while (1) {
        vTaskDelay(pdMS_TO_TICKS(10));
    }

    // Later: leave() stops all publish/subscribe streams.
}
```

## API reference

All declarations live in [`include/videosdk.h`](include/videosdk.h).

### `init_config_t`

| Field | Type | Notes |
|-------|------|-------|
| `meetingID` | `char *` | Room / meeting ID. **Not copied; keep alive for the whole session.** |
| `token` | `char *` | VideoSDK JWT auth token (same lifetime note). |
| `displayName` | `char *` | Name shown in the meeting. |
| `audioCodec` | `audio_codec_t` | `AUDIO_CODEC_PCMA` (G.711 A-law), the only supported audio codec. |
| `videoCodec` | `video_codec_t` | `VIDEO_CODEC_NONE` (audio only) or `VIDEO_CODEC_JPEG`. |

### Functions

| Function | Description |
|----------|-------------|
| `create_meeting_result_t create_meeting(char *token)` | Create a room. `room_id` is malloc'd; **caller must `free()` it**. |
| `result_t init(init_config_t *cfg)` | Initialize the session and board. Call **once, first**. |
| `result_t startPublishAudio(char *publisherId)` | Mic → data channel. Foundation call; empty `publisherId` = random. |
| `result_t startPublishVideo(void)` | Camera JPEG → data channel. Call **after** `startPublishAudio`. *Korvo-2 only.* |
| `result_t startSubscribeAudio(void)` | Remote audio → speaker. Call **after** `startPublishAudio`. *Korvo-2 only.* |
| `result_t startSubscribeVideo(void)` | Remote JPEG → LCD. Call **after** `startSubscribeAudio`. *Korvo-2 only.* |
| `result_t stopPublishAudio()` | Stop publishing audio. |
| `result_t stopSubscribeAudio()` | Stop subscribing to audio. |
| `void setSpeakerVolume(int volume)` | Set playback volume 0–100 (clamped). *Korvo-2 only.* |
| `void setConnectionStateHandler(connection_state_cb_t cb, void *user)` | Register/clear the signaling connection-state handler. |
| `result_t leave()` | Leave the meeting; stops all publish/subscribe streams. |

### Result codes

`RESULT_OK` (0) means success. Errors are in the `3001`–`3024` range (e.g. `DEVICE_NOT_SUPPORTED`, `AUDIO_CODEC_INIT_FAILED`, `DTLS_HANDSHAKE_FAILED`, `INIT_NOT_CALLED`, `DUPLICATE_ID`). See the header for the full list and meanings.

### Contract & threading

- Call all functions from a **single task** (e.g. `app_main`); the API is **not thread-safe**.
- Call order: `init()` → (optional) register callbacks → `start{Publish,Subscribe}{Audio,Video}()` → `leave()`.
- The first publish or subscribe call brings up its transport; later calls reuse it.
- Callbacks run on internal SDK tasks. **Do not block** in them, and copy any buffer you need to keep.

## Documentation

- For more details, see the [VideoSDK Documentation](https://docs.videosdk.live/iot/guide/video-and-audio-calling-api-sdk/concept-and-architecture).
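
The call order and result codes described under *Contract & threading* and *Result codes* could be wrapped in a small step runner, sketched below. This is only an illustration and not part of the SDK: every `sketch_*` and `stub_*` name is hypothetical, result codes are assumed to behave like plain integers, and the stubs stand in for real SDK calls so the block compiles on a host machine. The placeholder error value is simply the first code in the documented range, not the value of any named error.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: result codes behave like plain integers. The README documents
 * 0 as success and 3001-3024 as the error range; the value of each named
 * error is defined in videosdk.h and is not reproduced here. */
enum {
    SKETCH_RESULT_OK = 0,
    SKETCH_ERR_FIRST = 3001,
    SKETCH_ERR_LAST  = 3024,
};

/* One start-up step: a wrapper around a single SDK call. */
typedef int (*sketch_step_fn)(void);

typedef struct {
    const char    *name;
    sketch_step_fn run;
    bool           optional; /* true for calls a send-only board may reject */
} sketch_step_t;

/* True when the code lies inside the documented error range. */
static bool sketch_is_documented_error(int code)
{
    return code >= SKETCH_ERR_FIRST && code <= SKETCH_ERR_LAST;
}

/* Runs the steps in order on the calling task. Returns -1 when every
 * required step succeeded, otherwise the index of the first required step
 * that failed. Failures of optional steps are logged and skipped. */
static int sketch_run_steps(const sketch_step_t *steps, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        int rc = steps[i].run();
        if (rc == SKETCH_RESULT_OK) {
            continue;
        }
        printf("%s returned %d (%s)\n", steps[i].name, rc,
               sketch_is_documented_error(rc) ? "documented error range"
                                              : "unexpected value");
        if (!steps[i].optional) {
            return (int)i;
        }
    }
    return -1;
}

/* Host-side stand-ins so the sketch compiles without the SDK. On a device,
 * each step would instead wrap one real call from videosdk.h. */
static int stub_ok(void)   { return SKETCH_RESULT_OK; }
static int stub_fail(void) { return SKETCH_ERR_FIRST; /* placeholder code */ }
```

On a device, the step table would list wrappers for `startPublishAudio`, `startPublishVideo`, `startSubscribeAudio`, and `startSubscribeVideo` in that order, with the subscribe steps marked optional so a send-only board can continue after `DEVICE_NOT_SUPPORTED`. Running the table from `app_main` keeps all calls on a single task.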
