Deployment¶

Deploy a quantized TFLite model to the STM32N6570-DK development board using ST's X-CUBE-AI toolchain.

Prerequisites¶

Tool	Version	Download
X-CUBE-AI	10.2.0+	ST website
STM32CubeProgrammer	2.20+	ST website
STM32CubeIDE	1.19+	ST website
ARM GNU Toolchain	14.3+	ARM Developer

Overview¶

The deployment pipeline has three stages:

flowchart LR
    A[".tflite\nquantized model"] --> B["stedgeai generate\nN6-optimized binary"]
    B --> C["n6_loader.py\nserial flash to board"]
    C --> D["stedgeai validate\non-device inference"]
    D --> E["Validation report\ncosine sim + latency"]

Figure: stm32_model_validation (image source: STM32ai)

Step 1: Install X-CUBE-AI¶

unzip x-cube-ai-linux-v10.2.0.zip -d X-CUBE-AI.10.2.0
cd X-CUBE-AI.10.2.0
unzip stedgeai-linux-10.2.0.zip

Directory structure after extraction:

X-CUBE-AI.10.2.0/
├── Utilities/
│   └── linux/
│       └── stedgeai          # CLI tool
├── Middlewares/
└── Projects/

Step 2: Install ARM GNU Toolchain¶

wget https://developer.arm.com/-/media/Files/downloads/gnu/14.3.rel1/binrel/arm-gnu-toolchain-14.3.rel1-x86_64-arm-none-eabi.tar.xz
tar xf arm-gnu-toolchain-14.3.rel1-x86_64-arm-none-eabi.tar.xz
export PATH=$PWD/arm-gnu-toolchain-14.3.rel1-x86_64-arm-none-eabi/bin:$PATH

Verify:

arm-none-eabi-gcc --version

Step 3: Install STM32CubeProgrammer¶

Download and run the installer:

./SetupSTM32CubeProgrammer-2.20.0.linux

Add to PATH and configure permissions:

export PATH=$PATH:/path/to/STM32Cube/STM32CubeProgrammer/bin

sudo usermod -aG plugdev $USER
sudo usermod -aG dialout $USER

# Install udev rules
sudo cp /path/to/STM32Cube/STM32CubeProgrammer/Drivers/rules/*.* /etc/udev/rules.d/
sudo udevadm control --reload-rules && sudo udevadm trigger

Unplug and replug the board (or reboot) to apply the new rules. Log out and back in for the new group memberships to take effect.

Verify:

STM32_Programmer_CLI --list
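
The individual checks in Steps 2 and 3 can be combined into one script. The following is a minimal sketch that only tests whether the two executables named above are on PATH; it does not check their versions, and the function name `find_missing_tools` is illustrative rather than part of the project.

```python
import shutil

# Executable names taken from the verification commands in Steps 2 and 3.
REQUIRED_TOOLS = ["arm-none-eabi-gcc", "STM32_Programmer_CLI"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = find_missing_tools()
if missing:
    print("Not on PATH:", ", ".join(missing))
else:
    print("All required tools found on PATH.")
```

If a tool is reported as missing, re-check the `export PATH=...` lines from the corresponding step in the current shell.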

Step 4: Generate model files¶

Navigate to the X-CUBE-AI utilities directory and run:

./stedgeai generate \
  --model /path/to/checkpoints/my_model_quantized.tflite \
  --target stm32n6 \
  --st-neural-art \
  --output /path/to/birdnet-stm32/validation/st_ai_output \
  --workspace /path/to/birdnet-stm32/validation/st_ai_ws \
  --verbose
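
When the generate step is scripted, it helps to build the argument list in one place. The sketch below assembles exactly the flags shown in the command above; the function name `build_generate_command` and the example paths are hypothetical, and the actual `deploy` command may construct its call differently.

```python
import shlex


def build_generate_command(stedgeai, model, output_dir, workspace_dir):
    """Assemble the `stedgeai generate` argument list used in this step."""
    return [
        str(stedgeai), "generate",
        "--model", str(model),
        "--target", "stm32n6",
        "--st-neural-art",
        "--output", str(output_dir),
        "--workspace", str(workspace_dir),
        "--verbose",
    ]


# Placeholder paths for illustration only.
cmd = build_generate_command(
    "./stedgeai",
    "/path/to/checkpoints/my_model_quantized.tflite",
    "/path/to/birdnet-stm32/validation/st_ai_output",
    "/path/to/birdnet-stm32/validation/st_ai_ws",
)
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it; using a list rather than a single string avoids shell quoting problems with paths that contain spaces.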

Analyze first

Run stedgeai analyze instead of generate to get detailed model metrics (size, memory, per-layer info) without generating output files. Always analyze new model architectures to verify N6 NPU operator compatibility.

The output directory of stedgeai generate includes network_generate_report.txt, which summarizes model size and compute requirements.

Step 5: Configure and flash the board¶

Set board to DEV mode¶

Disconnect the board from USB.
Set the BOOT0 switch to the right position.
Set the BOOT1 switch to the left position.
Set jumper JP2 to position 1-2.
Reconnect the board.

Figure: Set STM32N6570-DK to dev mode (image source: ST Community)

Create configuration files¶

Copy the example config and fill in your local paths:

cp config.example.json config.json

Edit config.json with your machine-local paths:

{
  "compiler_type": "gcc",
  "cubeide_path": "/path/to/stm32cubeide",
  "x_cube_ai_path": "/path/to/X-CUBE-AI.10.2.0",
  "model_path": "checkpoints/best_model_quantized.tflite",
  "output_dir": "validation/st_ai_output",
  "workspace_dir": "validation/st_ai_ws",
  "n6_loader_config": "config_n6l.json"
}
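
A quick sanity check can catch placeholder paths that were copied from the example without being edited. This is a minimal sketch: it assumes that all seven keys shown above are required, which the deploy command itself may not enforce, and `check_config` is a hypothetical helper.

```python
import json

# Keys taken from the config.json example above.
# Treating all of them as required is an assumption.
REQUIRED_KEYS = (
    "compiler_type",
    "cubeide_path",
    "x_cube_ai_path",
    "model_path",
    "output_dir",
    "workspace_dir",
    "n6_loader_config",
)


def check_config(config):
    """Return a list of human-readable problems found in a config dict."""
    problems = []
    for key in REQUIRED_KEYS:
        value = config.get(key)
        if not value:
            problems.append(f"missing or empty key: {key}")
        elif "/path/to/" in str(value):
            problems.append(f"placeholder path left in {key}: {value}")
    return problems


def load_config(path="config.json"):
    """Read a JSON config file into a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Calling `check_config(load_config())` from the project root returns an empty list when every key is set and no `/path/to/` placeholder remains.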

Create config_n6l.json in the project root (required by ST's n6_loader):

{
  "network.c": "/path/to/birdnet-stm32/validation/st_ai_output/network.c",
  "project_path": "/path/to/X-CUBE-AI.10.2.0/Projects/STM32N6570-DK/Applications/NPU_Validation",
  "project_build_conf": "N6-DK",
  "skip_external_flash_programming": false,
  "skip_ram_data_programming": false,
  "objcopy_binary_path": "/usr/bin/arm-none-eabi-objcopy"
}
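
Most fields in config_n6l.json follow from two base paths, so the file can be derived rather than typed by hand. The sketch below reproduces the keys and fixed values from the example above; the helper name `make_n6_loader_config` is hypothetical, and looking up objcopy on PATH is an assumption about your toolchain setup.

```python
import json
import shutil
from pathlib import Path


def make_n6_loader_config(output_dir, x_cube_ai_path, objcopy=None):
    """Build the config_n6l.json contents from two base paths."""
    # Fall back to whatever arm-none-eabi-objcopy is first on PATH.
    objcopy = objcopy or shutil.which("arm-none-eabi-objcopy")
    if objcopy is None:
        raise FileNotFoundError("arm-none-eabi-objcopy not found on PATH")
    project = (
        Path(x_cube_ai_path)
        / "Projects" / "STM32N6570-DK" / "Applications" / "NPU_Validation"
    )
    return {
        "network.c": str(Path(output_dir) / "network.c"),
        "project_path": str(project),
        "project_build_conf": "N6-DK",
        "skip_external_flash_programming": False,
        "skip_ram_data_programming": False,
        "objcopy_binary_path": str(objcopy),
    }


# Placeholder paths for illustration only.
n6_config = make_n6_loader_config(
    "/path/to/birdnet-stm32/validation/st_ai_output",
    "/path/to/X-CUBE-AI.10.2.0",
    objcopy="/usr/bin/arm-none-eabi-objcopy",
)
print(json.dumps(n6_config, indent=2))
```

If the toolchain from Step 2 was unpacked into a local directory, the objcopy found on PATH will be that copy rather than /usr/bin/arm-none-eabi-objcopy.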

Warning

Both config files contain machine-local paths. They are listed in .gitignore — do not commit them. Use config.example.json as a reference template.

Run the full deploy pipeline¶

The CLI reads all paths from config.json and runs generate → flash → validate:

python -m birdnet_stm32 deploy

You can override any path via CLI arguments:

python -m birdnet_stm32 deploy --x_cube_ai_path /path/to/X-CUBE-AI.10.2.0

Or via environment variables:

export X_CUBE_AI_PATH=/path/to/X-CUBE-AI.10.2.0
python -m birdnet_stm32 deploy

Priority order: CLI arguments > environment variables > config.json values.
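
The priority order can be expressed as a small lookup function. This is a sketch of the rule only, not the project's implementation: `resolve_setting` is a hypothetical name, and mapping a config key to its upper-case environment variable generalizes from the single `X_CUBE_AI_PATH` example above.

```python
import os


def resolve_setting(name, cli_args, config, environ=None):
    """Resolve one setting: CLI argument, then environment, then config.json."""
    environ = os.environ if environ is None else environ

    # 1. An explicitly passed CLI argument wins.
    cli_value = cli_args.get(name)
    if cli_value is not None:
        return cli_value

    # 2. Otherwise use the environment variable, e.g. X_CUBE_AI_PATH.
    env_value = environ.get(name.upper())
    if env_value:
        return env_value

    # 3. Fall back to the value from config.json (None if absent).
    return config.get(name)
```

For example, with `X_CUBE_AI_PATH` exported and no `--x_cube_ai_path` argument, the environment value is returned even if config.json contains a different path.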

Before deploying, verify that the board's serial port is visible:

ls /dev/ttyACM*

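If the port is not accessible (for example, because the dialout group membership from Step 3 has not taken effect yet), you can grant temporary access: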

sudo chmod a+rw /dev/ttyACM0
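
The same check can be done from Python, which is convenient before launching a long deploy run. A minimal sketch, assuming the board enumerates as a /dev/ttyACM* device as in the commands above; `list_serial_ports` is a hypothetical helper.

```python
import glob
import os


def list_serial_ports(pattern="/dev/ttyACM*"):
    """Return (device, accessible) pairs for matching serial devices."""
    return [
        (dev, os.access(dev, os.R_OK | os.W_OK))
        for dev in sorted(glob.glob(pattern))
    ]


ports = list_serial_ports()
if not ports:
    print("No /dev/ttyACM* device found; check the USB cable and boot switches.")
for dev, accessible in ports:
    status = "read/write OK" if accessible else "permission denied"
    print(f"{dev}: {status}")
```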

Step 6: Validate on-device¶

The deploy command runs validation automatically. To run validation separately with additional options (e.g., --valinput to supply specific test data), call stedgeai validate directly:

/path/to/X-CUBE-AI.10.2.0/Utilities/linux/stedgeai validate \
  --model /path/to/checkpoints/my_model_quantized.tflite \
  --target stm32n6 \
  --mode target \
  --desc serial:921600 \
  --output /path/to/birdnet-stm32/validation/st_ai_output \
  --workspace /path/to/birdnet-stm32/validation/st_ai_ws \
  --valinput /path/to/checkpoints/my_model_quantized_validation_data.npz \
  --classifier \
  --verbose

Validation runs inference on the physical board and compares the outputs to those of the reference model. Results are saved to network_validate_report.txt in the output directory.
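
The pipeline overview names cosine similarity as one of the reported metrics. For intuition, the sketch below computes it for two output vectors in plain Python; it illustrates the metric itself and is not how stedgeai computes or reports it.

```python
import math


def cosine_similarity(reference, device):
    """Cosine of the angle between two equal-length output vectors."""
    if len(reference) != len(device):
        raise ValueError("vectors must have the same length")
    dot = sum(r * d for r, d in zip(reference, device))
    norm = math.sqrt(sum(r * r for r in reference)) * math.sqrt(
        sum(d * d for d in device)
    )
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm
```

A value close to 1 means the on-device outputs point in nearly the same direction as the reference outputs, regardless of overall scale.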

Demo application¶

The demo application is under development. The planned pipeline:

Record audio using the on-board microphone.
Run FFT on 512-sample frames, accumulating into a ring buffer.
Run inference every second on the last 3 seconds of audio.
Map prediction scores to labels using labels.txt.
Log top-5 predictions to the serial console.
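
Because the demo is still under development, the following host-side Python sketch only illustrates two of the planned steps: keeping the most recent frames in a ring buffer and mapping scores to the top-5 labels. The class and function names are hypothetical, and the number of frames that make up 3 seconds depends on the sample rate and hop size, which are not specified here.

```python
from collections import deque


class FrameRingBuffer:
    """Keep only the most recent `frames_per_window` frames."""

    def __init__(self, frames_per_window):
        self._frames = deque(maxlen=frames_per_window)

    def push(self, frame):
        # Once the buffer is full, the oldest frame is dropped automatically.
        self._frames.append(frame)

    def is_full(self):
        return len(self._frames) == self._frames.maxlen

    def window(self):
        """Return the buffered frames, oldest first."""
        return list(self._frames)


def top_k(scores, labels, k=5):
    """Return the k highest-scoring (label, score) pairs, best first."""
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

In the planned pipeline, each FFT result would be pushed into the buffer, and once per second the current window would be passed to the model and its scores to `top_k`.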

Board test¶

The board-test command runs a standalone inference test on the STM32N6570-DK. The firmware reads WAV files from the SD card, computes the STFT on the Cortex-M55, runs the model on the NPU, and streams results over UART. This verifies the entire on-device pipeline end-to-end.

python -m birdnet_stm32 board-test --config config.json

SD card preparation¶

Format a micro-SD card as FAT32.
Create an audio/ directory at the root.
Copy mono or stereo 16-bit PCM .wav files into audio/. Each file should be at least as long as the model's chunk duration (default 3 s), and its sample rate must match the model's sample rate, which is recorded in _model_config.json.
Insert the card into the STM32N6570-DK slot.
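
Files can be checked against these requirements on the host before they are copied to the card. This sketch uses Python's standard `wave` module; `check_wav` is a hypothetical helper, and the expected sample rate must be supplied by you from _model_config.json.

```python
import wave


def check_wav(path, expected_rate, min_seconds=3.0):
    """Return a list of problems that would make a WAV file unsuitable."""
    problems = []
    # wave.open raises wave.Error for non-PCM files (e.g. float WAV).
    with wave.open(str(path), "rb") as wav:
        if wav.getsampwidth() != 2:
            problems.append(f"sample width is {8 * wav.getsampwidth()}-bit, expected 16-bit")
        if wav.getnchannels() not in (1, 2):
            problems.append(f"{wav.getnchannels()} channels, expected mono or stereo")
        if wav.getframerate() != expected_rate:
            problems.append(f"sample rate {wav.getframerate()} Hz, expected {expected_rate} Hz")
        duration = wav.getnframes() / wav.getframerate()
        if duration < min_seconds:
            problems.append(f"duration {duration:.2f} s is shorter than {min_seconds} s")
    return problems
```

An empty return value means the file meets all four requirements listed above.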

Board-test arguments¶

Argument	Default	Description
`--model_path`	(from config)	Path to quantized `.tflite` model
`--model_config`	(inferred)	Path to `_model_config.json`
`--labels`	(inferred)	Path to `_labels.txt`
`--serial_port`	`/dev/ttyACM0`	Serial port for UART capture
`--top_k`	5	Top-K predictions per file
`--score_threshold`	0.01	Minimum score to display
`--config`	`config.json`	Deploy configuration JSON
`--timeout`	300	Max seconds to wait for firmware response
`--save_results`	None	Save results summary to a CSV file
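
For readers who want to see the table as code, the sketch below mirrors its arguments and defaults with `argparse`. It is an illustration built from the table, not the project's actual parser; the function name and the choice of types are assumptions.

```python
import argparse


def build_board_test_parser():
    """Build a parser whose options and defaults mirror the table above."""
    parser = argparse.ArgumentParser(prog="board-test")
    parser.add_argument("--model_path", default=None,
                        help="Path to quantized .tflite model (default: from config)")
    parser.add_argument("--model_config", default=None,
                        help="Path to _model_config.json (default: inferred)")
    parser.add_argument("--labels", default=None,
                        help="Path to _labels.txt (default: inferred)")
    parser.add_argument("--serial_port", default="/dev/ttyACM0",
                        help="Serial port for UART capture")
    parser.add_argument("--top_k", type=int, default=5,
                        help="Top-K predictions per file")
    parser.add_argument("--score_threshold", type=float, default=0.01,
                        help="Minimum score to display")
    parser.add_argument("--config", default="config.json",
                        help="Deploy configuration JSON")
    parser.add_argument("--timeout", type=int, default=300,
                        help="Max seconds to wait for firmware response")
    parser.add_argument("--save_results", default=None,
                        help="Save results summary to a CSV file")
    return parser


args = build_board_test_parser().parse_args(["--top_k", "3"])
print(args.serial_port, args.top_k, args.timeout)
```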

Board test is standalone

The board-test command deploys real firmware that does all processing on the board: read WAV from SD card → compute STFT on Cortex-M55 → run NPU inference → write results to SD card + serial. Do NOT precompute spectrograms on the host — that defeats the purpose of an integration test.

Firmware documentation¶

For detailed documentation on the board firmware — hardware specs, build system, configuration, source module reference, UART protocol, and troubleshooting — see the Firmware section.