Models¶
Acoustic model V2.4 - June 2023¶
more than 6,000 species worldwide
covers frequencies from 0 Hz to 15 kHz with a two-channel spectrogram (one channel for low and one for high frequencies)
0.826 GFLOPs, 50.5 MB as FP32
enhanced and optimized metadata model
global selection of species (birds and non-birds) with 6,522 classes (incl. 11 non-event classes)
Technical details¶
48 kHz sampling rate (we up- and downsample automatically and can deal with artifacts from lower sampling rates)
we compute 2 mel spectrograms as input for the convolutional neural network:
first one has fmin = 0 Hz and fmax = 3 kHz; nfft = 2048; hop size = 278; 96 mel bins
second one has fmin = 500 Hz and fmax = 15 kHz; nfft = 1024; hop size = 280; 96 mel bins
both spectrograms have a final resolution of 96x511 pixels
raw audio is normalized to the range -1 to 1 before spectrogram conversion
we use non-linear magnitude scaling as described in Schlüter 2018
V2.4 uses an EfficientNetB0-like backbone with a final embedding size of 1024
See this comment for more details
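The spectrogram settings listed above can be checked with a small sketch. It is not taken from the BirdNET code: the names `SPECTROGRAMS`, `peak_normalize` and `frame_count` are hypothetical, the 3-second chunk length is an assumption not stated in this section, frames are counted without padding, and peak scaling is only one possible reading of "normalized to the range -1 to 1".

```python
SAMPLE_RATE = 48_000   # Hz, as listed above
CHUNK_SECONDS = 3.0    # assumption: chunk length is not stated in this section

# Parameters copied from the two spectrogram bullets above.
SPECTROGRAMS = {
    "low":  {"fmin": 0,   "fmax": 3_000,  "nfft": 2048, "hop": 278, "n_mels": 96},
    "high": {"fmin": 500, "fmax": 15_000, "nfft": 1024, "hop": 280, "n_mels": 96},
}


def peak_normalize(samples):
    """Scale samples so the largest absolute value is 1.0.

    Assumption: peak scaling is one way to get audio into [-1, 1];
    the model's actual normalization may differ. Silence is returned unchanged.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s / peak for s in samples]


def frame_count(n_samples, nfft, hop):
    """Number of full STFT frames when the signal is not padded (assumption)."""
    if n_samples < nfft:
        return 0
    return 1 + (n_samples - nfft) // hop


n_samples = int(SAMPLE_RATE * CHUNK_SECONDS)  # 144000 samples per chunk

# Map each spectrogram name to its (mel bins, frames) shape.
shapes = {
    name: (cfg["n_mels"], frame_count(n_samples, cfg["nfft"], cfg["hop"]))
    for name, cfg in SPECTROGRAMS.items()
}
print(shapes)  # {'low': (96, 511), 'high': (96, 511)}
```

Under these assumptions both settings yield 511 frames, which is consistent with the stated 96x511 resolution and suggests why the two hop sizes differ slightly (278 versus 280).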
Geo model (species range model) V2.4 - V2, Jan 2024¶
updated species range model based on eBird data
more accurate (spatial) species range prediction
slightly increased long-tail distribution in the temporal resolution
see this discussion post for more details
Using older models¶
Older models are not supported in the current version of the package. If you need to use an older model, please refer to the BirdNET-Analyzer repository.