Use a BirdNET model to predict species within an audio file. The model can be a TFLite model, a custom model, or a Protobuf model.

Usage

predict_species_from_audio_file(
  model,
  audio_file,
  min_confidence = 0.1,
  batch_size = 1L,
  chunk_overlap_s = 0,
  use_bandpass = TRUE,
  bandpass_fmin = 0L,
  bandpass_fmax = 15000L,
  apply_sigmoid = TRUE,
  sigmoid_sensitivity = 1,
  filter_species = NULL,
  keep_empty = TRUE
)

# S3 method for class 'birdnet_model'
predict_species_from_audio_file(
  model,
  audio_file,
  min_confidence = 0.1,
  batch_size = 1L,
  chunk_overlap_s = 0,
  use_bandpass = TRUE,
  bandpass_fmin = 0L,
  bandpass_fmax = 15000L,
  apply_sigmoid = TRUE,
  sigmoid_sensitivity = 1,
  filter_species = NULL,
  keep_empty = TRUE
)

Arguments

model

A BirdNET model object, i.e. an instance of a BirdNET model class (e.g., birdnet_model_tflite, birdnet_model_protobuf).

audio_file

character. The path to the audio file.

min_confidence

numeric. Minimum confidence threshold for predictions (default is 0.1).

batch_size

integer. Number of audio samples to process in a batch (default is 1L).

chunk_overlap_s

numeric. The overlap between audio chunks in seconds (default is 0). Must be in the interval [0.0, 3.0].

use_bandpass

logical. Whether to apply a bandpass filter (default is TRUE).

bandpass_fmin, bandpass_fmax

integer. Minimum and maximum frequencies for the bandpass filter, in Hz (defaults are 0L and 15000L). Ignored if use_bandpass is FALSE.

apply_sigmoid

logical. Whether to apply a sigmoid function to the model output (default is TRUE).

sigmoid_sensitivity

numeric. Sensitivity parameter for the sigmoid function (default is 1). Must be in the interval [0.5, 1.5]. Ignored if apply_sigmoid is FALSE.

filter_species

NULL, a character vector of length greater than 0, or a list where each element is a single non-empty character string. Used to filter the predictions. If NULL (default), no filtering is applied.

keep_empty

logical. Whether to include empty intervals in the output (default is TRUE).
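
To illustrate how chunk_overlap_s affects the number of analysed intervals, the following is a minimal sketch of the underlying arithmetic. It assumes a fixed chunk length of 3 seconds, which this page does not state, and chunk_starts() is a hypothetical helper, not a birdnetR function. The helper rejects an overlap equal to the chunk length because the step between chunks would be zero; how the package treats that edge case, or a trailing partial chunk, is not covered here.

```r
# Hypothetical helper, not part of birdnetR: start times (in seconds) of
# fixed-length chunks when consecutive chunks overlap by `overlap_s`.
# The 3 s default for `chunk_s` is an assumption made for illustration.
chunk_starts <- function(duration_s, overlap_s = 0, chunk_s = 3) {
  stopifnot(overlap_s >= 0, overlap_s < chunk_s, duration_s >= chunk_s)
  step_s <- chunk_s - overlap_s  # distance between consecutive chunk starts
  seq(0, duration_s - chunk_s, by = step_s)
}

chunk_starts(9)                   # 0 3 6
chunk_starts(9, overlap_s = 1.5)  # 0.0 1.5 3.0 4.5 6.0
```

In this sketch, a larger overlap yields more, partially redundant intervals for the same recording, so the number of rows in the result tends to grow with chunk_overlap_s.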

Value

A data frame with columns: start, end, scientific_name, common_name, and confidence. Each row represents a single prediction.
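
Because the result is an ordinary data frame, it can be post-processed with base R. The sketch below uses a mock data frame with the documented columns; every value in it is invented for illustration and does not come from a real model run.

```r
# Mock result with the documented columns. All values are invented.
predictions <- data.frame(
  start = c(0, 0, 3, 6),
  end = c(3, 3, 6, 9),
  scientific_name = c("Cyanocitta cristata", "Turdus migratorius",
                      "Turdus migratorius", "Poecile atricapillus"),
  common_name = c("Blue Jay", "American Robin",
                  "American Robin", "Black-capped Chickadee"),
  confidence = c(0.82, 0.15, 0.47, 0.66)
)

# Keep only predictions at or above a stricter threshold.
confident <- predictions[predictions$confidence >= 0.4, ]

# Keep the highest-scoring prediction for each interval.
ordered <- predictions[order(predictions$start, -predictions$confidence), ]
top_per_interval <- ordered[!duplicated(ordered$start), ]
```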

Details

Applying a sigmoid activation function (apply_sigmoid=TRUE) scales the unbounded class output of the linear classifier ("logit score") to the range 0 to 1. This confidence score is a unitless, numeric expression of BirdNET’s “confidence” in its prediction (but not the probability of species presence). A sigmoid sensitivity < 1 leads to more high- and low-scoring predictions, whereas a value > 1 leads to more intermediate-scoring predictions.
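
As a rough illustration of the scaling step, the sketch below applies a plain logistic function to a few made-up logit scores. The function logistic() and its slope parameter k are hypothetical and are not birdnetR's internal implementation; in particular, how sigmoid_sensitivity maps onto the slope is not specified on this page, so k is purely illustrative.

```r
# Hypothetical helper, not birdnetR's internal code: a logistic function
# with slope `k` that maps unbounded logits to the open interval (0, 1).
logistic <- function(logit, k = 1) {
  1 / (1 + exp(-k * logit))
}

logits <- c(-6, -2, 0, 2, 6)  # made-up logit scores

round(logistic(logits), 3)           # 0.002 0.119 0.500 0.881 0.998
round(logistic(logits, k = 0.5), 3)  # flatter curve: scores move toward 0.5
round(logistic(logits, k = 1.5), 3)  # steeper curve: scores move toward 0 and 1
```

The outputs are bounded scores, not calibrated probabilities: changing the slope reshapes the score distribution without changing which logits are larger.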

For more information on BirdNET confidence scores, the sigmoid activation function, and a suggested workflow on how to convert confidence scores to probabilities, see Wood & Kahl, 2024.

References

Wood, C. M., & Kahl, S. (2024). Guidelines for appropriate use of BirdNET scores and other detector outputs. Journal of Ornithology. https://doi.org/10.1007/s10336-024-02144-5

See also

read_labels() for more details on species filtering.

predict_species_from_audio_file.birdnet_model

Examples

if (FALSE) { # interactive()
library(birdnetR)

model <- birdnet_model_tflite(version = "v2.4", language = "en_us")
predictions <- predict_species_from_audio_file(model, "path/to/audio.wav", min_confidence = 0.2)
}