Firmware Overview

The BirdNET-STM32 firmware is a standalone bare-metal application for the STM32N6570-DK development board. It reads WAV files from an SD card, computes audio features according to the selected frontend (for example, STFT and Mel on the Cortex-M55 CPU, or no CPU processing at all when the raw waveform is passed straight through), runs neural-network inference on the dedicated NPU, and reports bird species detections over UART and, optionally, to a results file on the SD card.

Design principle

The firmware is a self-contained integration test and demo. Everything runs on the board — no host preprocessing, no streaming, no RTOS. This makes it easy to validate the full pipeline (audio → spectrogram → NPU → classification) in isolation.

At a Glance

| Property | Value |
|---|---|
| Language | C11 (ARM GCC 13+) |
| RTOS | None (bare-metal, single-threaded while(1) loop) |
| Board | STM32N6570-DK |
| CPU | Arm Cortex-M55 @ 800 MHz |
| NPU | ST Neural-ART, 1.2 TOPS INT8 |
| Build system | Overlay on ST's NPU_Validation Makefile |
| Flash method | GDB via n6_loader.py (part of X-CUBE-AI) |

Processing Pipeline

flowchart LR
    SD["SD card<br/>WAV files"] --> WAV["wav_reader.c<br/>PCM16 → float32"]
    WAV --> |Hybrid / Precomputed| STFT["audio_stft.c<br/>Hann + 512-pt FFT"]
    WAV --> |Raw Mode| NPU
    STFT --> |Precomputed| Mel["audio_mel.c<br/>Mel Filterbank"]
    STFT --> |Hybrid| NPU["NPU (LL_ATON)<br/>DS-CNN inference"]
    Mel --> NPU
    NPU --> UART["UART output<br/>top-K predictions"]
    NPU --> RES["SD card<br/>results.txt"]
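
The Hybrid branch of the flowchart feeds a magnitude spectrogram to the NPU. A real 512-point FFT has 512/2 + 1 = 257 unique bins (DC through Nyquist), which fixes the first dimension of that tensor. The sketch below shows the shape arithmetic only. The function names and the hop size are hypothetical, and the frame count assumes no padding at the chunk edges, which may differ from what audio_stft.c actually does.

```c
#include <stddef.h>

/* Unique magnitude bins of a real N-point FFT: DC through Nyquist. */
size_t stft_num_bins(size_t fft_size)
{
    return fft_size / 2u + 1u;
}

/* Number of full frames that fit into num_samples when the window
 * advances by hop samples and no padding is applied (assumption).
 * Returns 0 if the input is shorter than one window or hop is 0. */
size_t stft_num_frames(size_t num_samples, size_t fft_size, size_t hop)
{
    if (hop == 0u || num_samples < fft_size) {
        return 0u;
    }
    return 1u + (num_samples - fft_size) / hop;
}
```

For a 3.0 s chunk at 24 kHz (72000 samples) and an assumed hop of 256 samples, this gives 257 bins by 280 frames. The real hop length comes from app_config.h.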

For each .wav file on the SD card:

  1. Read: parse the RIFF/WAVE header and load the first chunk (2–3 seconds) as float32.
  2. Audio frontend: depends on APP_AUDIO_FRONTEND:
     • Hybrid: 512-point Hann-windowed STFT → [257, frames] magnitude spectrogram.
     • Precomputed: STFT followed by an explicit Mel filterbank → [64, frames].
     • Raw: no CPU feature extraction; the float32 PCM array goes straight to the NPU.
  3. NPU inference: copy the features to the NPU input, run the full DS-CNN (the model itself applies any Mel/PWL mapping the frontend requires), and read the class scores.
  4. Output: print the top-K species over UART; optionally write TSV results to the SD card.
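
The output step reduces the class scores to a short ranked list. A minimal sketch of that selection is shown below, assuming the scores are available as a float array. The function name and signature are hypothetical and are not taken from main.c. It keeps a small sorted index list, which avoids sorting the whole score vector and needs no heap allocation.

```c
#include <stddef.h>

/* Hypothetical helper: writes the indices of the k highest scores into
 * out_idx, best first, and returns how many were written (min(k, n)).
 * out_idx must have room for k entries. */
size_t top_k_indices(const float *scores, size_t n, size_t k, size_t *out_idx)
{
    size_t count = 0;

    for (size_t i = 0; i < n; ++i) {
        /* Walk up the current ranking while this score beats the entry. */
        size_t pos = count;
        while (pos > 0 && scores[i] > scores[out_idx[pos - 1]]) {
            --pos;
        }
        if (pos >= k) {
            continue; /* not good enough for the top k */
        }

        /* Shift lower-ranked entries down; the last one drops out if full. */
        size_t last = (count < k) ? count : k - 1;
        for (size_t j = last; j > pos; --j) {
            out_idx[j] = out_idx[j - 1];
        }
        out_idx[pos] = i;
        if (count < k) {
            ++count;
        }
    }
    return count;
}
```

The returned indices can then be used to look up class names in the generated label table before printing over UART.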

Typical Performance

| Stage | Default Hybrid (24 kHz × 3.0 s) | Raw Waveform (24 kHz × 2.0 s) | Notes |
|---|---|---|---|
| SD read | 15–20 ms | ~56 ms | Reads PCM16 blocks; time varies with chunk length |
| STFT | ~45 ms | 0 ms | 512-pt FFT on Cortex-M55 @ 800 MHz (Raw skips this) |
| NPU inference | ~12 ms | ~10–11 ms | INT8 DS-CNN. Raw frontend runs 400x filters directly. |
| Total | ~75 ms | ~67 ms | ~30–40× faster than real time (3.0 s / 75 ms ≈ 40×; 2.0 s / 67 ms ≈ 30×) |
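
The real-time factor is the chunk duration divided by the total processing time. A small sketch of that calculation is shown below; the function name is hypothetical, and the inputs are meant to be the chunk length and the Total row of the table above.

```c
/* Real-time factor: seconds of audio handled per second of processing.
 * Values above 1.0 mean faster than real time. Returns 0.0 for a
 * non-positive processing time. */
double realtime_factor(double audio_seconds, double processing_ms)
{
    if (processing_ms <= 0.0) {
        return 0.0;
    }
    return audio_seconds * 1000.0 / processing_ms;
}
```

On the board, the processing time could be measured with a cycle counter or a HAL tick around the per-file loop body, then passed to this function together with the configured chunk length.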

Source Layout

firmware/
├── Src/
│   ├── main.c           # Board init + processing loop
│   ├── wav_reader.c     # RIFF/WAVE parser, PCM16→float32
│   ├── audio_stft.c     # Hann-windowed STFT
│   ├── fft.c            # 512-pt real FFT (radix-2 DIT)
│   └── sd_handler.c     # BSP SD + FatFs mount/scan/write
├── Inc/
│   ├── app_config.h     # Audio params (patched at deploy time)
│   ├── app_labels.h     # Class names (auto-generated)
│   ├── wav_reader.h
│   ├── audio_stft.h
│   ├── fft.h
│   └── sd_handler.h
├── Drivers/
│   ├── HAL_SD/          # HAL SD card driver sources
│   ├── FatFs/           # FatFs R0.15 filesystem
│   └── stm32n6570_discovery_sd.*  # BSP SD driver
└── README.md            # Standalone firmware reference
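
The layout lists wav_reader.c as the place where PCM16 samples become float32. A minimal sketch of such a conversion follows. The function name is hypothetical, and the scaling by 1/32768 (which maps samples into [-1.0, 1.0)) is a common convention assumed here, not something confirmed by the source; the model must be trained with the same normalization as the firmware uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert n signed 16-bit PCM samples to float32.
 * Assumed scaling: divide by 32768 so the result lies in [-1.0, 1.0). */
void pcm16_to_float32(const int16_t *pcm, float *out, size_t n)
{
    const float scale = 1.0f / 32768.0f;

    for (size_t i = 0; i < n; ++i) {
        out[i] = (float)pcm[i] * scale;
    }
}
```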

Next Steps

  • Hardware

    Learn about the STM32N6570-DK board, Cortex-M55, NPU, memory map.

  • Building & Flashing

    How to build the firmware and flash it to the board.

  • Configuration

    Adapt the firmware to your model and audio parameters.

  • Source Modules

    Detailed reference for every C source file.

  • UART Protocol

    Serial output format and host-side parsing.

  • Troubleshooting

    Common pitfalls, debugging hints, and known issues.