Firmware Overview
The BirdNET-STM32 firmware is a standalone bare-metal application for the STM32N6570-DK development board. It reads WAV files from an SD card, computes audio features according to the selected frontend (for example, STFT + Mel on the Cortex-M55 CPU, or no CPU features at all when the raw waveform is passed through), runs neural-network inference on the dedicated NPU, and reports bird species detections over UART and, optionally, in a results file on the SD card.
Design principle
The firmware is a self-contained integration test and demo. Everything runs on the board — no host preprocessing, no streaming, no RTOS. This makes it easy to validate the full pipeline (audio → spectrogram → NPU → classification) in isolation.
At a Glance
| Property | Value |
|---|---|
| Language | C11 (ARM GCC 13+) |
| RTOS | None (bare-metal, single-threaded while(1) loop) |
| Board | STM32N6570-DK |
| CPU | Arm Cortex-M55 @ 800 MHz |
| NPU | ST Neural-ART, 1.2 TOPS INT8 |
| Build system | Overlay on ST's NPU_Validation Makefile |
| Flash method | GDB via n6_loader.py (part of X-CUBE-AI) |
Processing Pipeline
```mermaid
flowchart LR
    SD["SD card<br/>WAV files"] --> WAV["wav_reader.c<br/>PCM16 → float32"]
    WAV --> |Hybrid / Precomputed| STFT["audio_stft.c<br/>Hann + 512-pt FFT"]
    WAV --> |Raw Mode| NPU
    STFT --> |Precomputed| Mel["audio_mel.c<br/>Mel Filterbank"]
    STFT --> |Hybrid| NPU["NPU (LL_ATON)<br/>DS-CNN inference"]
    Mel --> NPU
    NPU --> UART["UART output<br/>top-K predictions"]
    NPU --> RES["SD card<br/>results.txt"]
```
For each .wav file on the SD card:

- Read: parse the RIFF/WAVE header and load the first chunk (2–3 seconds) as float32.
- Audio frontend: depends on APP_AUDIO_FRONTEND:
    - Hybrid: 512-point Hann-windowed STFT → [257, frames] magnitude spectrogram.
    - Precomputed: STFT followed by an explicit Mel filterbank on the CPU → [64, frames].
    - Raw: no CPU feature extraction; the float32 PCM array goes straight to the NPU.
- NPU inference: copy the features to the NPU input, run the full DS-CNN (which handles the mel/PWL mappings inside the model where the frontend requires it), and read the class scores.
- Output: print the top-K species over UART; optionally write TSV results to the SD card.
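The read and output steps above are simple enough to sketch in portable C. This is a minimal illustration, not the firmware's actual code: the function names `is_riff_wave`, `pcm16_to_float32`, and `top_k` are hypothetical, and dividing by 32768 is an assumed normalization (a common convention for PCM16 data).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: check the 12-byte RIFF/WAVE container signature.
 * Bytes 0..3 must be "RIFF" and bytes 8..11 must be "WAVE". */
static bool is_riff_wave(const uint8_t *hdr, size_t len)
{
    return len >= 12 &&
           memcmp(hdr, "RIFF", 4) == 0 &&
           memcmp(hdr + 8, "WAVE", 4) == 0;
}

/* Hypothetical helper: convert signed 16-bit PCM to float32 in [-1.0, 1.0).
 * The 1/32768 scale is an assumption, not taken from the firmware. */
static void pcm16_to_float32(const int16_t *pcm, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = (float)pcm[i] / 32768.0f;
    }
}

/* Hypothetical helper: write the indices of the k highest scores into
 * idx[], best first, using a small insertion sort. Returns the number of
 * indices written, which is min(k, n). */
static size_t top_k(const float *scores, size_t n, size_t *idx, size_t k)
{
    assert(scores != NULL && idx != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Find where score i ranks among the entries kept so far. */
        size_t pos = count;
        while (pos > 0 && scores[i] > scores[idx[pos - 1]]) {
            --pos;
        }
        if (pos >= k) {
            continue; /* not good enough to enter the top k */
        }
        /* Shift lower-ranked entries down, dropping the last one if full. */
        size_t last = (count < k) ? count : k - 1;
        for (size_t j = last; j > pos; --j) {
            idx[j] = idx[j - 1];
        }
        idx[pos] = i;
        if (count < k) {
            ++count;
        }
    }
    return count;
}
```

For example, with the scores {0.1, 0.7, 0.05, 0.9, 0.3} and k = 3, `top_k` fills `idx` with 3, 1, 4. Insertion into a small sorted array avoids sorting the full score vector, which suits a small K on a microcontroller.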
Typical Performance
| Stage | Default Hybrid (24 kHz × 3.0 s) | Raw Waveform (24 kHz × 2.0 s) | Notes |
|---|---|---|---|
| SD read | 15–20 ms | ~56 ms | Reads PCM16 blocks; read time varies with chunk length |
| STFT | ~45 ms | 0 ms | 512-pt FFT on Cortex-M55 @ 800 MHz (Raw skips this) |
| NPU inference | ~12 ms | ~10–11 ms | INT8 DS-CNN. Raw frontend runs 400x filters directly. |
| Total | ~75 ms | ~67 ms | Roughly 40× (Hybrid) and 30× (Raw) faster than real time |
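The real-time factor follows directly from the table: divide the clip duration by the total processing time. The helper below is a hypothetical illustration of that arithmetic, not part of the firmware.

```c
#include <assert.h>

/* Hypothetical helper: real-time factor, i.e. how many milliseconds of
 * audio are processed per millisecond of wall-clock time. */
static double realtime_factor(double clip_ms, double processing_ms)
{
    assert(processing_ms > 0.0);
    return clip_ms / processing_ms;
}
```

With the totals above, `realtime_factor(3000.0, 75.0)` gives 40 for the Hybrid frontend, and `realtime_factor(2000.0, 67.0)` gives just under 30 for the Raw frontend.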
Source Layout
```text
firmware/
├── Src/
│   ├── main.c                 # Board init + processing loop
│   ├── wav_reader.c           # RIFF/WAVE parser, PCM16→float32
│   ├── audio_stft.c           # Hann-windowed STFT
│   ├── fft.c                  # 512-pt real FFT (radix-2 DIT)
│   └── sd_handler.c           # BSP SD + FatFs mount/scan/write
├── Inc/
│   ├── app_config.h           # Audio params (patched at deploy time)
│   ├── app_labels.h           # Class names (auto-generated)
│   ├── wav_reader.h
│   ├── audio_stft.h
│   ├── fft.h
│   └── sd_handler.h
├── Drivers/
│   ├── HAL_SD/                # HAL SD card driver sources
│   ├── FatFs/                 # FatFs R0.15 filesystem
│   └── stm32n6570_discovery_sd.*  # BSP SD driver
└── README.md                  # Standalone firmware reference
```
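The layout lists fft.c as a radix-2 decimation-in-time (DIT) FFT. For readers unfamiliar with the technique, here is a minimal in-place complex radix-2 DIT sketch. It is an illustration under stated assumptions, not the contents of fft.c: the type `cpx_t`, the function name `fft_radix2_dit`, and the precomputed twiddle-table parameter are all hypothetical, and the extra packing step that a real-input FFT typically adds is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical complex sample type. */
typedef struct { float re, im; } cpx_t;

/* In-place radix-2 decimation-in-time FFT (forward transform).
 * n must be a power of two. tw is a precomputed twiddle table with
 * tw[k] = exp(-2*pi*i*k/n) for k in [0, n/2). */
static void fft_radix2_dit(cpx_t *x, size_t n, const cpx_t *tw)
{
    assert(n >= 2 && (n & (n - 1)) == 0);

    /* Bit-reversal permutation so butterflies can run in place. */
    for (size_t i = 1, j = 0; i < n; ++i) {
        size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) {
            j ^= bit;
        }
        j ^= bit;
        if (i < j) {
            cpx_t t = x[i];
            x[i] = x[j];
            x[j] = t;
        }
    }

    /* Butterfly stages for sub-transform lengths 2, 4, ..., n. */
    for (size_t len = 2; len <= n; len <<= 1) {
        size_t half = len / 2;
        size_t stride = n / len; /* step through the shared twiddle table */
        for (size_t start = 0; start < n; start += len) {
            for (size_t k = 0; k < half; ++k) {
                cpx_t w = tw[k * stride];
                cpx_t a = x[start + k];
                cpx_t b = x[start + k + half];
                /* t = b * w (complex multiply) */
                cpx_t t = { b.re * w.re - b.im * w.im,
                            b.re * w.im + b.im * w.re };
                x[start + k].re        = a.re + t.re;
                x[start + k].im        = a.im + t.im;
                x[start + k + half].re = a.re - t.re;
                x[start + k + half].im = a.im - t.im;
            }
        }
    }
}
```

For n = 4 the twiddle table is just {1, -i}, and the input {1, 2, 3, 4} transforms to {10, -2+2i, -2, -2-2i}. Passing the table in as a parameter reflects a common embedded pattern of computing twiddles once at start-up rather than calling trigonometric functions inside the transform.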
Next Steps
- :material-chip:{ .lg .middle } Hardware

    Learn about the STM32N6570-DK board, Cortex-M55, NPU, and memory map.

- :material-wrench:{ .lg .middle } Building & Flashing

    How to build the firmware and flash it to the board.

- :material-cog:{ .lg .middle } Configuration

    Adapt the firmware to your model and audio parameters.

- :material-code-braces:{ .lg .middle } Source Modules

    Detailed reference for every C source file.

- :material-serial-port:{ .lg .middle } UART Protocol

    Serial output format and host-side parsing.

- :material-bug:{ .lg .middle } Troubleshooting

    Common pitfalls, debugging hints, and known issues.