videosdk/iot-sdk

0.2.2

IoTSdk is a lightweight ESP-IDF component that enables ESP32-based IoT devices to participate in real-time meetings through the VideoSDK platform. It provides seamless integration for joining meetings and publishing/subscribing to audio and video (JPEG over the data channel).

# IoT SDK

At VideoSDK, we're building tools to help developers bring **real-time collaboration** to IoT and embedded devices. With the IoT SDK, you can integrate **live audio and video communication, meeting management, device-to-cloud connectivity, and session handling** directly into ESP32-S3 boards.

## Features

- **Real-time audio**: publish the on-board microphone and subscribe to remote audio, using **PCMA (G.711 A-law)**.
- **Real-time video**: publish camera frames as hardware **JPEG** over the WebRTC data channel, and render remote frames on the Korvo-2.
- **Connection-state callbacks**: get notified when signaling connects or drops so your app can rejoin.
- **Meeting management**: create a room, join, and leave.
- **Runtime speaker volume** control (Korvo-2).

## Supported boards

| Board | Audio publish | Audio subscribe | Video publish | Video subscribe | Speaker volume |
|-------|:---:|:---:|:---:|:---:|:---:|
| **ESP32-S3-Korvo-2 v3.0** | ✅ | ✅ | ✅ | ✅ | ✅ |
| **XIAO ESP32-S3 (Sense)** | ✅ | ❌ | ✅ | ❌ | ❌ |

> The XIAO is **send-only** — it has no speaker or display, so `startSubscribeAudio()` and `startSubscribeVideo()` return `DEVICE_NOT_SUPPORTED`. Board selection is compile-time (see [Configure](#4-configure-menuconfig)).

## Prerequisites

- **ESP-IDF 5.4+** (required for the camera / JPEG stack).
- A valid [VideoSDK account](https://app.videosdk.live/) and an auth token.

## Use the IoT SDK component

### 1. Set up ESP-IDF

Follow **Step 1** of the VideoSDK [ESP-IDF setup guide](https://docs.videosdk.live/iot/guide/video-and-audio-calling-api-sdk/quickstart/quick-start#step-1-setup-for-esp-idf) to install the toolchain. You do **not** need to run the project-creation commands — once the environment is ready, continue from Step 2 below.

```bash
# In every new shell, activate the ESP-IDF environment
source ~/esp/esp-idf/export.sh
```

### 2. Add the IoT SDK component

Declare the component in your project's `main/idf_component.yml` by referencing the published component from the registry:

```yaml
dependencies:
  videosdk/iot-sdk: "*"   # or pin a specific version, e.g. "0.2.2"
```

### 3. Add the required dependencies

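The component pulls in its own dependencies automatically, but your **application** also needs the shared IDF example/networking components. Add these to your `main/idf_component.yml`, under the same `dependencies:` key as the entry from step 2 (the key may appear only once in the file):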

```yaml
dependencies:
  idf:
    version: ">=5.4.0"
  mdns: "*"
  espressif/esp_audio_codec: "~2.3.0"
  espressif/esp_codec_dev: "~1.3.4"
  espressif/esp_audio_effects: "~1.1.0"
  espressif/esp_capture: "^0.7.6"
  espressif/esp_video_codec: "~0.5.2"
  espressif/esp_jpeg: "^1.3.1"
  espressif/esp32-camera: "^2.0.15"
  espressif/esp_websocket_client: "^1.2.0"
  sepfy/srtp: "^2.3.0"
  sepfy/usrsctp: "^0.9.5"
```

### 4. Configure (menuconfig)

Set your board target, Wi-Fi, and VideoSDK credentials:

```bash
idf.py set-target esp32s3
idf.py menuconfig
```

- **VideoSDK IoT SDK → Audio hardware board** — select **ESP32-S3-Korvo-2** or **ESP32-S3-XIAO**.
- **VideoSDK Configuration**
  - **Auth token (JWT)** → `CONFIG_VIDEOSDK_TOKEN`
  - **Meeting / room ID** → `CONFIG_VIDEOSDK_MEETING_ID`
  - **Speaker output volume (0–100)** → `CONFIG_SPEAKER_VOLUME` (Korvo-2 only)
- **Example Connection Configuration** → Wi-Fi SSID and password.
- **Component config → mbedTLS** → enable **Support DTLS** and **Support TLS**.
- **Partition Table** → enable **Custom partition table CSV**.
- **Serial flasher config → Flash size** → set to match your board (e.g. 8 MB).

> The token and meeting ID are read from menuconfig (stored in `sdkconfig`) — **never hardcode a real token in source or commit it**.

### 5. Build & flash

```bash
idf.py build
idf.py -p <PORT> flash monitor
```

## Usage

Include the single public header and drive the API from **one task** (it is not thread-safe):

```c
#include "freertos/FreeRTOS.h"   // pdMS_TO_TICKS
#include "freertos/task.h"       // vTaskDelay
#include "sdkconfig.h"
#include "videosdk.h"

void app_main(void)
{
    // ... init NVS, netif, event loop, and connect Wi-Fi first ...

    // 1. Initialize the session (call exactly once, first).
    init_config_t cfg = {
        .meetingID   = CONFIG_VIDEOSDK_MEETING_ID,
        .token       = CONFIG_VIDEOSDK_TOKEN,
        .displayName = "ESP32-Device",
        .audioCodec  = AUDIO_CODEC_PCMA,
        .videoCodec  = VIDEO_CODEC_JPEG,
    };
    if (init(&cfg) != RESULT_OK) {
        return;
    }

    // 2. Join the call. startPublishAudio is the foundation, so call it first;
    //    the video and subscribe calls reuse its transport.
    //    The subscribe calls need a Korvo-2; on the XIAO they return
    //    DEVICE_NOT_SUPPORTED.
    startPublishAudio("");   // empty publisherId => a random one is generated
    startPublishVideo();
    startSubscribeAudio();
    startSubscribeVideo();

    // 3. Let the SDK's internal tasks run. To end the session, call leave()
    //    from this task; it stops all publish/subscribe streams.
    while (1) {
        vTaskDelay(pdMS_TO_TICKS(10));
    }
}
```
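
The example above ignores the return values of the `start*()` calls. On a send-only board such as the XIAO, the subscribe calls return `DEVICE_NOT_SUPPORTED`, which an application may want to treat as non-fatal. The sketch below shows one way to make that distinction. `classify_start_result` and `stream_outcome_t` are hypothetical names that are not part of the SDK, and the codes are passed as plain `int` values so the snippet compiles without `videosdk.h`.

```c
/* Outcome of one start*() call, as seen by the application. */
typedef enum {
    STREAM_STARTED,      /* the call succeeded */
    STREAM_UNAVAILABLE,  /* the board lacks the hardware; safe to continue */
    STREAM_FAILED        /* any other error; treat as fatal */
} stream_outcome_t;

/*
 * Hypothetical helper, not part of the SDK.
 * In a real application, pass RESULT_OK and DEVICE_NOT_SUPPORTED from
 * videosdk.h as ok_code and not_supported_code.
 */
stream_outcome_t classify_start_result(int code, int ok_code,
                                       int not_supported_code)
{
    if (code == ok_code) {
        return STREAM_STARTED;
    }
    if (code == not_supported_code) {
        return STREAM_UNAVAILABLE;
    }
    return STREAM_FAILED;
}
```

In an application this might be called as `classify_start_result((int)startSubscribeAudio(), RESULT_OK, DEVICE_NOT_SUPPORTED)`, continuing on `STREAM_UNAVAILABLE` and calling `leave()` on `STREAM_FAILED`.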

## API reference

All declarations live in [`include/videosdk.h`](include/videosdk.h).

### `init_config_t`

| Field | Type | Notes |
|-------|------|-------|
| `meetingID` | `char *` | Room / meeting ID. **Not copied — keep alive for the whole session.** |
| `token` | `char *` | VideoSDK JWT auth token (same lifetime note). |
| `displayName` | `char *` | Name shown in the meeting. |
| `audioCodec` | `audio_codec_t` | `AUDIO_CODEC_PCMA` (G.711 A-law) — the only supported audio codec. |
| `videoCodec` | `video_codec_t` | `VIDEO_CODEC_NONE` (audio only) or `VIDEO_CODEC_JPEG`. |
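
Because `meetingID` and `token` are not copied, a pointer to a stack buffer that goes out of scope would dangle. String literals and `CONFIG_*` macros are safe. For values obtained at runtime, one option is to copy them into static storage first. The sketch below assumes a 64-byte upper bound, and `store_meeting_id` is a hypothetical helper, not an SDK function.

```c
#include <stddef.h>
#include <string.h>

#define MEETING_ID_MAX 64  /* assumed upper bound, not taken from the SDK */

/* Static storage lives for the whole program, so a pointer to it
 * stays valid for the entire session. */
static char s_meeting_id[MEETING_ID_MAX];

/* Copies id into static storage and returns a pointer to the copy,
 * or NULL if id is NULL, empty, or too long. */
char *store_meeting_id(const char *id)
{
    if (id == NULL) {
        return NULL;
    }
    size_t len = strlen(id);
    if (len == 0 || len >= sizeof(s_meeting_id)) {
        return NULL;
    }
    memcpy(s_meeting_id, id, len + 1);  /* include the terminating NUL */
    return s_meeting_id;
}
```

The returned pointer can then be assigned to `cfg.meetingID` before calling `init()`. The same pattern applies to `token`.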

### Functions

| Function | Description |
|----------|-------------|
| `create_meeting_result_t create_meeting(char *token)` | Create a room. `room_id` is malloc'd — **caller must `free()` it**. |
| `result_t init(init_config_t *cfg)` | Initialize the session and board. Call **once, first**. |
| `result_t startPublishAudio(char *publisherId)` | Mic → data channel. Foundation call; empty `publisherId` = random. |
| `result_t startPublishVideo(void)` | Camera JPEG → data channel. Call **after** `startPublishAudio`. |
| `result_t startSubscribeAudio(void)` | Remote audio → speaker. Call **after** `startPublishAudio`. *Korvo-2 only.* |
| `result_t startSubscribeVideo(void)` | Remote JPEG → LCD. Call **after** `startSubscribeAudio`. *Korvo-2 only.* |
| `result_t stopPublishAudio()` | Stop publishing audio. |
| `result_t stopSubscribeAudio()` | Stop subscribing to audio. |
| `void setSpeakerVolume(int volume)` | Set playback volume 0–100 (clamped). *Korvo-2 only.* |
| `void setConnectionStateHandler(connection_state_cb_t cb, void *user)` | Register/clear the signaling connection-state handler. |
| `result_t leave()` | Leave the meeting; stops all publish/subscribe streams. |
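
Since `room_id` from `create_meeting()` is heap-allocated and owned by the caller, it is easy to leak. One possible pattern is to copy it into a caller-owned buffer and free the original in the same place. `take_room_id` below is a hypothetical helper written for illustration; it uses only the C standard library.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the SDK.
 * Copies a heap-allocated room ID into dst, then frees the original.
 * Returns true if the ID fit into dst. The heap string is freed either way.
 */
bool take_room_id(char *heap_room_id, char *dst, size_t dst_size)
{
    bool ok = false;
    if (heap_room_id != NULL && dst != NULL && dst_size > 0) {
        size_t len = strlen(heap_room_id);
        if (len < dst_size) {
            memcpy(dst, heap_room_id, len + 1);  /* include the NUL */
            ok = true;
        } else {
            dst[0] = '\0';  /* too long: leave dst as an empty string */
        }
    }
    free(heap_room_id);  /* free(NULL) is a no-op */
    return ok;
}
```

Assuming `room_id` is a field of the returned `create_meeting_result_t`, usage might look like `take_room_id(res.room_id, buf, sizeof buf)`, after which `buf` can be used as `meetingID` as long as it outlives the session.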

### Result codes

`RESULT_OK` (0) means success. Errors are in the `3001`–`3024` range (e.g. `DEVICE_NOT_SUPPORTED`, `AUDIO_CODEC_INIT_FAILED`, `DTLS_HANDSHAKE_FAILED`, `INIT_NOT_CALLED`, `DUPLICATE_ID`) — see the header for the full list and meanings.
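
For logging, it can help to separate documented SDK errors from unexpected values. The sketch below uses only the range stated above; the names are hypothetical and the header remains the authoritative list of codes.

```c
/* Bounds taken from the range stated in this README; the header is
 * authoritative. These macro names are not defined by the SDK. */
#define APP_SDK_ERR_FIRST 3001
#define APP_SDK_ERR_LAST  3024

typedef enum {
    APP_RESULT_OK,         /* code == 0 (RESULT_OK) */
    APP_RESULT_SDK_ERROR,  /* inside the documented error range */
    APP_RESULT_UNKNOWN     /* anything else */
} app_result_kind_t;

/* Classifies a numeric result code by range only. */
app_result_kind_t app_result_kind(int code)
{
    if (code == 0) {
        return APP_RESULT_OK;
    }
    if (code >= APP_SDK_ERR_FIRST && code <= APP_SDK_ERR_LAST) {
        return APP_RESULT_SDK_ERROR;
    }
    return APP_RESULT_UNKNOWN;
}
```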

### Contract & threading

- Call all functions from a **single task** (e.g. `app_main`) — the API is **not thread-safe**.
- Call order: `init()` → (optional) register callbacks → `startPublishAudio()` → other `start*()` calls → `leave()`.
- `startPublishAudio()` brings up the transport; the later publish and subscribe calls reuse it.
- Callbacks run on internal SDK tasks — **do not block** in them, and copy any buffer you need to keep.
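
One way to keep callbacks short is to record the event and let the application task act on it. The sketch below is a single-slot handoff built on C11 atomics. The function names are hypothetical, the exact `connection_state_cb_t` signature is not shown in this README, and the code assumes state values are non-negative so that `-1` can mean "nothing pending". Only the most recent state is kept.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Latest state reported by a callback; -1 means "nothing pending".
 * Assumes real state values are non-negative. */
static atomic_int s_pending_state = -1;

/* Call from inside the SDK callback: record the value and return
 * immediately, without blocking. */
void post_connection_state(int state)
{
    atomic_store(&s_pending_state, state);
}

/* Call from the application task: fetch and clear the pending state.
 * Returns false if nothing new has been posted. */
bool take_connection_state(int *out_state)
{
    int state = atomic_exchange(&s_pending_state, -1);
    if (state < 0) {
        return false;
    }
    *out_state = state;
    return true;
}
```

The application task can poll `take_connection_state()` in its main loop and call `leave()` followed by a rejoin from there, which keeps every SDK call on a single task.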

## Documentation

- For more details, see the [VideoSDK Documentation](https://docs.videosdk.live/iot/guide/video-and-audio-calling-api-sdk/concept-and-architecture).

To add this component to your project, run:

`idf.py add-dependency "videosdk/iot-sdk^0.2.2"`
