Deployment

Deploy a quantized TFLite model to the STM32N6570-DK development board using ST's X-CUBE-AI toolchain.

Prerequisites

| Tool | Version | Download |
| --- | --- | --- |
| X-CUBE-AI | 10.2.0+ | ST website |
| STM32CubeProgrammer | 2.20+ | ST website |
| STM32CubeIDE | 1.19+ | ST website |
| ARM GNU Toolchain | 14.3+ | ARM Developer |

Overview

The deployment pipeline has three stages:

flowchart LR
    A[".tflite\nquantized model"] --> B["stedgeai generate\nN6-optimized binary"]
    B --> C["n6_loader.py\nserial flash to board"]
    C --> D["stedgeai validate\non-device inference"]
    D --> E["Validation report\ncosine sim + latency"]

[Figure: stm32_model_validation (image source: STM32ai)]

Step 1: Install X-CUBE-AI

unzip x-cube-ai-linux-v10.2.0.zip -d X-CUBE-AI.10.2.0
cd X-CUBE-AI.10.2.0
unzip stedgeai-linux-10.2.0.zip

Directory structure after extraction:

X-CUBE-AI.10.2.0/
├── Utilities/
│   └── linux/
│       └── stedgeai          # CLI tool
├── Middlewares/
└── Projects/

Step 2: Install ARM GNU Toolchain

wget https://developer.arm.com/-/media/Files/downloads/gnu/14.3.rel1/binrel/arm-gnu-toolchain-14.3.rel1-x86_64-arm-none-eabi.tar.xz
tar xf arm-gnu-toolchain-14.3.rel1-x86_64-arm-none-eabi.tar.xz
export PATH="$PWD/arm-gnu-toolchain-14.3.rel1-x86_64-arm-none-eabi/bin:$PATH"

Verify:

arm-none-eabi-gcc --version

Step 3: Install STM32CubeProgrammer

Download and run the installer:

./SetupSTM32CubeProgrammer-2.20.0.linux

Add to PATH and configure permissions:

export PATH=$PATH:/path/to/STM32Cube/STM32CubeProgrammer/bin

sudo usermod -aG plugdev $USER
sudo usermod -aG dialout $USER

# Install udev rules
sudo cp /path/to/STM32Cube/STM32CubeProgrammer/Drivers/rules/*.* /etc/udev/rules.d/
sudo udevadm control --reload-rules && sudo udevadm trigger

Log out and back in so the new group memberships take effect, then unplug and replug the board (or reboot) to apply the udev rules.

Verify:

STM32_Programmer_CLI --list
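
The two "Verify" commands above can also be folded into one preflight check. The following is a minimal sketch, not part of the project: it only confirms that the executables named in this guide resolve on PATH, and the helper name `find_missing_tools` is hypothetical.

```python
import shutil

# Executables that Steps 2 and 3 put on PATH (names taken from the commands in this guide).
REQUIRED_TOOLS = ["arm-none-eabi-gcc", "STM32_Programmer_CLI"]


def find_missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be resolved on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools found.")
```

A missing entry usually means the corresponding `export PATH=...` line was not run in the current shell.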

Step 4: Generate model files

Navigate to the X-CUBE-AI utilities directory and run:

./stedgeai generate \
  --model /path/to/checkpoints/my_model_quantized.tflite \
  --target stm32n6 \
  --st-neural-art \
  --output /path/to/birdnet-stm32/validation/st_ai_output \
  --workspace /path/to/birdnet-stm32/validation/st_ai_ws \
  --verbose

Analyze first

Run stedgeai analyze instead of generate to get detailed model metrics (size, memory, per-layer info) without generating output files. Always analyze new model architectures to verify N6 NPU operator compatibility.

The output includes network_generate_report.txt with model size and compute requirements.
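
If you want to script this step, the command above can be assembled as an argument list and run with `subprocess`. This is a sketch under assumptions: the function names are hypothetical, the flags are copied from the command shown in this step, and the report is assumed to land directly in the `--output` directory.

```python
import subprocess
from pathlib import Path


def build_generate_command(stedgeai, model, output_dir, workspace_dir):
    """Assemble the argument list for `stedgeai generate` as shown in Step 4."""
    return [
        str(stedgeai), "generate",
        "--model", str(model),
        "--target", "stm32n6",
        "--st-neural-art",
        "--output", str(output_dir),
        "--workspace", str(workspace_dir),
        "--verbose",
    ]


def run_generate(stedgeai, model, output_dir, workspace_dir):
    """Run the generator; raises CalledProcessError on a non-zero exit code."""
    cmd = build_generate_command(stedgeai, model, output_dir, workspace_dir)
    subprocess.run(cmd, check=True)
    # Assumption: the report is written directly into the output directory.
    return Path(output_dir) / "network_generate_report.txt"
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.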

Step 5: Configure and flash the board

Set board to DEV mode

  1. Disconnect the board from USB.
  2. Set the BOOT0 switch to the right.
  3. Set the BOOT1 switch to the left.
  4. Set jumper JP2 to position 1-2.
  5. Reconnect the board.

[Figure: Set STM32N6570-DK to dev mode (image source: ST Community)]

Create configuration files

Copy the example config and fill in your local paths:

cp config.example.json config.json

Edit config.json with your machine-local paths:

{
  "compiler_type": "gcc",
  "cubeide_path": "/path/to/stm32cubeide",
  "x_cube_ai_path": "/path/to/X-CUBE-AI.10.2.0",
  "model_path": "checkpoints/best_model_quantized.tflite",
  "output_dir": "validation/st_ai_output",
  "workspace_dir": "validation/st_ai_ws",
  "n6_loader_config": "config_n6l.json"
}

Create config_n6l.json in the project root (required by ST's n6_loader):

{
  "network.c": "/path/to/birdnet-stm32/validation/st_ai_output/network.c",
  "project_path": "/path/to/X-CUBE-AI.10.2.0/Projects/STM32N6570-DK/Applications/NPU_Validation",
  "project_build_conf": "N6-DK",
  "skip_external_flash_programming": false,
  "skip_ram_data_programming": false,
  "objcopy_binary_path": "/usr/bin/arm-none-eabi-objcopy"
}

Warning

Both config files contain machine-local paths. They are listed in .gitignore — do not commit them. Use config.example.json as a reference template.
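
Because these files are edited by hand, a quick sanity check before deploying can catch a forgotten placeholder. The sketch below is illustrative only and is not the project's own validation: the key list is copied from the config.json example above, and `check_config` and `load_config` are hypothetical names.

```python
import json
from pathlib import Path

# Keys taken from the config.json example above.
REQUIRED_KEYS = (
    "compiler_type", "cubeide_path", "x_cube_ai_path", "model_path",
    "output_dir", "workspace_dir", "n6_loader_config",
)


def check_config(config):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    for key in ("cubeide_path", "x_cube_ai_path"):
        value = config.get(key, "")
        # The example template uses /path/to/... as a stand-in for local paths.
        if isinstance(value, str) and value.startswith("/path/to/"):
            problems.append(f"{key} still holds the placeholder {value!r}")
    return problems


def load_config(path="config.json"):
    """Read the JSON file and return it as a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

For example, `check_config(load_config())` returns an empty list once every key is present and the two tool paths no longer start with `/path/to/`.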

Run the full deploy pipeline

The CLI reads all paths from config.json and runs generate → flash → validate:

python -m birdnet_stm32 deploy

You can override any path via CLI arguments:

python -m birdnet_stm32 deploy --x_cube_ai_path /path/to/X-CUBE-AI.10.2.0

Or via environment variables:

export X_CUBE_AI_PATH=/path/to/X-CUBE-AI.10.2.0
python -m birdnet_stm32 deploy

Priority order: CLI arguments > environment variables > config.json values.
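
The priority order can be pictured with a small resolver. This is a sketch of the stated precedence, not the project's actual code: `resolve_setting` is a hypothetical name, and the upper-case mapping from option name to environment variable is an assumption based on the single `x_cube_ai_path` / `X_CUBE_AI_PATH` pair shown above.

```python
import os


def resolve_setting(name, cli_args, config, environ=None):
    """Pick a value using CLI arguments > environment variables > config.json."""
    environ = os.environ if environ is None else environ
    cli_value = cli_args.get(name)
    if cli_value is not None:
        return cli_value
    # Assumed mapping: x_cube_ai_path -> X_CUBE_AI_PATH.
    env_value = environ.get(name.upper())
    if env_value:
        return env_value
    return config.get(name)
```

With a config value of `/from/config`, an exported `X_CUBE_AI_PATH=/from/env`, and `--x_cube_ai_path /from/cli`, the resolver returns `/from/cli`; dropping the CLI argument yields `/from/env`.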

Verify the board is connected:

ls /dev/ttyACM*

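If the port is still not accessible despite the dialout group membership from Step 3, you can grant temporary access (the permission resets when the board is reconnected):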

sudo chmod a+rw /dev/ttyACM0

Step 6: Validate on-device

The deploy command runs validation automatically. To run validation separately with additional options (e.g., --valinput for specific test data):

/path/to/X-CUBE-AI.10.2.0/Utilities/linux/stedgeai validate \
  --model checkpoints/my_model_quantized.tflite \
  --target stm32n6 \
  --mode target \
  --desc serial:921600 \
  --output /path/to/birdnet-stm32/validation/st_ai_output \
  --workspace /path/to/birdnet-stm32/validation/st_ai_ws \
  --valinput /path/to/checkpoints/my_model_quantized_validation_data.npz \
  --classifier \
  --verbose

The validation runs inference on the physical board and compares results to the reference model. Results are saved to network_validate_report.txt in the output directory.
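
The pipeline overview lists cosine similarity as one of the reported metrics. For orientation, the standard definition is sketched below in plain Python; this is the textbook formula, not a claim about how stedgeai computes or aggregates it internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

A value close to 1.0 between the reference output and the on-device output indicates that the two score vectors point in nearly the same direction, even if their magnitudes differ slightly.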

Demo application

The demo application is under development. The planned pipeline:

  1. Record audio using the on-board microphone.
  2. Run FFT on 512-sample frames, accumulating into a ring buffer.
  3. Run inference every second on the last 3 seconds of audio.
  4. Map prediction scores to labels using labels.txt.
  5. Log top-5 predictions to the serial console.
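
The windowing in steps 2 and 3 (a 3-second window advanced once per second) can be prototyped on the host. The sketch below is a simplification under stated assumptions: it buffers raw samples rather than FFT frames, the sample rate is a hypothetical placeholder, and `SlidingWindow` is not a name from the firmware.

```python
from collections import deque

SAMPLE_RATE = 16000   # hypothetical; use the rate from the model config
WINDOW_SECONDS = 3    # step 3: inference runs on the last 3 seconds
HOP_SECONDS = 1       # step 3: inference runs every second


class SlidingWindow:
    """Keep the most recent window of samples and signal when a hop has elapsed."""

    def __init__(self, sample_rate=SAMPLE_RATE, window_s=WINDOW_SECONDS, hop_s=HOP_SECONDS):
        self.window_len = sample_rate * window_s
        self.hop_len = sample_rate * hop_s
        # A bounded deque acts as the ring buffer: the oldest samples drop off automatically.
        self.buffer = deque(maxlen=self.window_len)
        self.since_last = 0

    def push(self, samples):
        """Add samples; return a full window once a hop of new audio has arrived, else None."""
        self.buffer.extend(samples)
        self.since_last += len(samples)
        if len(self.buffer) == self.window_len and self.since_last >= self.hop_len:
            self.since_last = 0
            return list(self.buffer)
        return None
```

Each non-`None` return value is one model input; consecutive windows overlap by two seconds with the defaults above.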

Board test

The board-test command runs a standalone inference test on the STM32N6570-DK. The firmware reads WAV files from the SD card, computes the STFT on the Cortex-M55, runs the model on the NPU, and streams results over UART. This verifies the entire on-device pipeline end-to-end.

python -m birdnet_stm32 board-test --config config.json

SD card preparation

  1. Format a micro-SD card as FAT32.
  2. Create an audio/ directory at the root.
  3. Copy mono or stereo 16-bit PCM .wav files into audio/. Each file should be at least as long as the model's chunk duration (default 3 s), and its sample rate must match the model's sample rate (recorded in _model_config.json).
  4. Insert the card into the STM32N6570-DK slot.
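
Before copying files to the card, the requirements in step 3 can be checked on the host with Python's standard `wave` module. This is a minimal sketch: `check_wav` is a hypothetical helper, and the expected sample rate is passed in by the caller because the key name inside _model_config.json is not given here.

```python
import wave


def check_wav(path, expected_rate, min_seconds=3.0):
    """Return a list of problems with a WAV file for the board test; empty means OK."""
    problems = []
    # wave.open raises wave.Error for files that are not uncompressed PCM WAV.
    with wave.open(str(path), "rb") as wav:
        if wav.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, got {8 * wav.getsampwidth()}-bit")
        if wav.getnchannels() not in (1, 2):
            problems.append(f"expected mono or stereo, got {wav.getnchannels()} channels")
        if wav.getframerate() != expected_rate:
            problems.append(f"sample rate {wav.getframerate()} Hz != model rate {expected_rate} Hz")
        duration = wav.getnframes() / wav.getframerate()
        if duration < min_seconds:
            problems.append(f"duration {duration:.2f} s is shorter than {min_seconds} s")
    return problems
```

Running it over every file in the local audio/ folder before copying avoids a round trip to the board for a file the firmware would reject or truncate.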

Board-test arguments

| Argument | Default | Description |
| --- | --- | --- |
| --model_path | (from config) | Path to quantized .tflite model |
| --model_config | (inferred) | Path to _model_config.json |
| --labels | (inferred) | Path to _labels.txt |
| --serial_port | /dev/ttyACM0 | Serial port for UART capture |
| --top_k | 5 | Top-K predictions per file |
| --score_threshold | 0.01 | Minimum score to display |
| --config | config.json | Deploy configuration JSON |
| --timeout | 300 | Max seconds to wait for firmware response |
| --save_results | None | Save results summary to a CSV file |
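
To illustrate how --top_k and --score_threshold interact, here is a small host-side sketch. It assumes the threshold is applied to the top-K candidates after ranking; the actual order of operations in board-test is not documented here, and `top_k_predictions` is a hypothetical name.

```python
def top_k_predictions(scores, labels, top_k=5, score_threshold=0.01):
    """Return up to top_k (label, score) pairs with score >= threshold, best first."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    # Keep the best top_k entries, then hide those below the display threshold.
    return [(label, score) for label, score in ranked[:top_k] if score >= score_threshold]
```

With the defaults, a file whose model output has only three scores above 0.01 yields three rows rather than five.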

Board test is standalone

The board-test command deploys real firmware that does all processing on the board: read WAV from SD card → compute STFT on Cortex-M55 → run NPU inference → write results to SD card + serial. Do NOT precompute spectrograms on the host — that defeats the purpose of an integration test.

Firmware documentation

For detailed documentation on the board firmware — hardware specs, build system, configuration, source module reference, UART protocol, and troubleshooting — see the Firmware section.

Further reading