Getting Started

Prerequisites

Requirement   Version
Python        3.12+
TensorFlow    2.16+ (with CUDA for GPU training)
OS            Ubuntu 22.04+ (other Linux distros should work)
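
A quick way to confirm the Python and TensorFlow minimums listed above is a small check script. This is a minimal sketch, not part of the repository: the helper names are hypothetical, and it assumes TensorFlow is installed under the distribution name "tensorflow" (variant builds may be registered under a different name).

```python
import sys
from importlib import metadata


def parse_version(text):
    """Turn '2.16.1' or '2.17.0rc1' into a tuple of leading integers, e.g. (2, 16, 1)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version string is at least `minimum`."""
    return parse_version(installed) >= minimum


def check_environment():
    """Print whether Python and TensorFlow satisfy the documented minimums."""
    py = "{}.{}.{}".format(*sys.version_info[:3])
    print(f"Python {py}: {'ok' if meets_minimum(py, (3, 12)) else 'too old'}")
    try:
        # Assumption: the distribution is registered as "tensorflow".
        tf = metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        print("TensorFlow: not installed")
        return
    print(f"TensorFlow {tf}: {'ok' if meets_minimum(tf, (2, 16)) else 'too old'}")


if __name__ == "__main__":
    check_environment()
```

Reading the version from package metadata avoids importing TensorFlow itself, so the check stays fast and works even when the GPU drivers are not set up yet.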

For STM32 deployment you also need:

Installation

git clone https://github.com/birdnet-team/birdnet-stm32.git
cd birdnet-stm32

Create a virtual environment and install the package:

python3.12 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

This installs the birdnet_stm32 package in editable mode with development dependencies (pytest, ruff, pre-commit).

Optional extras:

pip install -e ".[dev,docs]"   # + documentation tools (mkdocs)
pip install -e ".[tune]"       # + Optuna for hyperparameter search
pip install -e ".[all]"        # everything (dev + docs + deploy + tune)

Quick workflow

1. Prepare data

Organize audio files into the expected folder structure:

data/
├── train/<species_name>/*.wav
└── test/<species_name>/*.wav

See Dataset Preparation for details on downloading and structuring the iNatSounds subset.
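
Before training, it can help to verify that the folder layout above is in place. The following is a minimal sketch using only the standard library. The function names are hypothetical, and it assumes lowercase ".wav" extensions and a data root named "data", as in the tree shown above.

```python
from pathlib import Path


def count_wavs(split_dir):
    """Map each species folder under `split_dir` to its number of .wav files."""
    split = Path(split_dir)
    if not split.is_dir():
        return {}
    return {
        species.name: sum(1 for _ in species.glob("*.wav"))
        for species in sorted(split.iterdir())
        if species.is_dir()
    }


def report(data_root="data"):
    """Print per-split totals and flag species missing from one split."""
    train = count_wavs(Path(data_root) / "train")
    test = count_wavs(Path(data_root) / "test")
    print(f"train: {len(train)} species, {sum(train.values())} files")
    print(f"test:  {len(test)} species, {sum(test.values())} files")
    # Species present in only one split usually indicate a layout mistake.
    for name in sorted(set(train) ^ set(test)):
        where = "train" if name in train else "test"
        print(f"  only in {where}: {name}")
    # Empty training folders would yield classes with no examples.
    for name, n in sorted(train.items()):
        if n == 0:
            print(f"  no .wav files in train/{name}")


if __name__ == "__main__":
    report()
```

Run it from the repository root. A species that appears in only one split, or a training folder with no audio, is worth fixing before starting a long training run.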

2. Train

python -m birdnet_stm32 train \
  --data_path_train data/train \
  --audio_frontend hybrid \
  --mag_scale pwl \
  --checkpoint_path checkpoints/my_model.keras

3. Convert

python -m birdnet_stm32 convert \
  --checkpoint_path checkpoints/my_model.keras \
  --model_config checkpoints/my_model_model_config.json \
  --data_path_train data/train

4. Evaluate

python -m birdnet_stm32 evaluate \
  --model_path checkpoints/my_model_quantized.tflite \
  --model_config checkpoints/my_model_model_config.json \
  --data_path_test data/test \
  --pooling lme
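
This guide does not define the "lme" value passed to --pooling. Assuming it stands for log-mean-exponential pooling, a common way to combine per-chunk scores into one file-level score, the idea can be sketched as below. The function name and the default alpha are hypothetical and may not match what the package actually does.

```python
import math


def lme_pool(scores, alpha=5.0):
    """Log-mean-exponential pooling of per-chunk scores for one class.

    Small alpha approaches the mean; large alpha approaches the max.
    The default alpha is an illustrative assumption, not a documented value.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scores)
    mean_exp = sum(math.exp(alpha * (s - peak)) for s in scores) / len(scores)
    return peak + math.log(mean_exp) / alpha


# One confident chunk among quiet ones pulls the pooled score above the mean.
print(lme_pool([0.1, 0.2, 0.9]))
```

Under this assumption, pooling sits between averaging (which dilutes a short call in a long recording) and taking the maximum (which is sensitive to a single spurious chunk).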

5. Deploy

See the Deployment guide for flashing the quantized model to the STM32N6570-DK.
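
Steps 2 through 4 above can be chained from a small driver script. This is a sketch that reuses only the flags shown in this guide. The function names are hypothetical, and the derived file names (the "_model_config.json" and "_quantized.tflite" suffixes) are an assumption inferred from the example paths above, not a documented naming rule.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(checkpoint="checkpoints/my_model.keras",
                   train_dir="data/train", test_dir="data/test"):
    """Return the train, convert, and evaluate commands as argument lists."""
    ckpt = Path(checkpoint)
    # Assumed naming, inferred from the example paths in this guide.
    config = ckpt.with_name(f"{ckpt.stem}_model_config.json").as_posix()
    tflite = ckpt.with_name(f"{ckpt.stem}_quantized.tflite").as_posix()
    base = [sys.executable, "-m", "birdnet_stm32"]
    return [
        base + ["train", "--data_path_train", train_dir,
                "--audio_frontend", "hybrid", "--mag_scale", "pwl",
                "--checkpoint_path", ckpt.as_posix()],
        base + ["convert", "--checkpoint_path", ckpt.as_posix(),
                "--model_config", config,
                "--data_path_train", train_dir],
        base + ["evaluate", "--model_path", tflite,
                "--model_config", config,
                "--data_path_test", test_dir, "--pooling", "lme"],
    ]


def run_pipeline(**kwargs):
    """Run each step in order, stopping at the first failure."""
    for cmd in build_commands(**kwargs):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling run_pipeline() from inside the activated virtual environment runs the three steps in sequence. Because check=True raises on a non-zero exit code, a failed training run does not proceed to conversion.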

Pre-trained model

This repository includes a pre-trained checkpoint (checkpoints/birdnet_stm32n6_100.tflite) trained on the 100 most common species of the northeastern US, central Europe, and Brazil. It achieves a ROC-AUC of 0.84 on iNatSounds test data and runs inference in ~3.3 ms per 3-second chunk on the STM32N6570-DK.
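
To put the latency figure in context, the short calculation below derives the real-time factor and duty cycle from the two numbers quoted above. It is a back-of-the-envelope sketch: it assumes non-overlapping 3-second chunks and counts only model inference, not audio capture or any other on-device processing.

```python
def real_time_factor(chunk_seconds, inference_seconds):
    """How many times faster than real time inference runs."""
    return chunk_seconds / inference_seconds


def duty_cycle(chunk_seconds, inference_seconds):
    """Fraction of wall-clock time spent on inference for back-to-back chunks."""
    return inference_seconds / chunk_seconds


chunk = 3.0        # seconds of audio per chunk (from the text above)
latency = 3.3e-3   # approximate inference time per chunk (from the text above)

print(f"real-time factor: ~{real_time_factor(chunk, latency):.0f}x")
print(f"duty cycle: ~{duty_cycle(chunk, latency):.2%}")
```

Under these assumptions inference runs roughly 900 times faster than real time, so the model is busy for only about 0.1% of each chunk's duration.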

See birdnet_stm32n6_100_model_config.json for full model parameters.