Subsamples a specified number of observations per species from a BirdNET output dataset. Supports three subsampling methods: stratified by confidence score bins, random, or top confidence scores. Optionally saves the subsampled data to a CSV file.

Usage

birdnet_subsample(
  data,
  n,
  method = c("stratified", "random", "top"),
  save_to_file = FALSE,
  file = NULL
)

Arguments

data

A data frame containing BirdNET output. The relevant columns (e.g., common name, confidence, datetime) are detected automatically by birdnet_detect_columns().

n

Integer. Number of observations to subsample per species.

method

Character. Subsampling method to use. One of:

"stratified"

Samples evenly across confidence score strata, where the strata are bins of width 0.05 spanning 0.1 to 1.

"random"

Randomly samples n observations per species.

"top"

Selects the top n observations with the highest confidence per species.

Defaults to "stratified".
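
The three methods can be illustrated with a minimal base R sketch. This is not the package's implementation: `sketch_subsample` is a hypothetical helper, the column names `common_name` and `confidence` are assumed rather than detected, and the way the "stratified" branch splits n across bins (an equal share per non-empty bin, then a random trim back to n) is only one plausible reading of "samples evenly".

```r
# Hypothetical helper for illustration only; not part of the package.
# Assumes columns named `common_name` and `confidence`.
sketch_subsample <- function(data, n, method = c("stratified", "random", "top")) {
  method <- match.arg(method)  # defaults to the first choice, "stratified"

  pick_rows <- function(d) {
    if (method == "top") {
      # Sort by confidence, highest first, and keep at most n rows.
      return(head(d[order(d$confidence, decreasing = TRUE), , drop = FALSE], n))
    }
    if (method == "random") {
      # Sample row indices without replacement, capped at the group size.
      return(d[sample.int(nrow(d), min(n, nrow(d))), , drop = FALSE])
    }
    # Stratified: 0.05-wide bins from 0.1 to 1 (integer grid avoids rounding drift).
    breaks <- seq(10, 100, by = 5) / 100
    bins <- cut(d$confidence, breaks = breaks, include.lowest = TRUE)
    groups <- split(seq_len(nrow(d)), bins, drop = TRUE)  # rows below 0.1 are dropped
    per_bin <- ceiling(n / length(groups))  # assumed allocation rule
    idx <- unlist(
      lapply(groups, function(i) i[sample.int(length(i), min(per_bin, length(i)))]),
      use.names = FALSE
    )
    # Rounding up can overshoot n, so trim at random rather than by bin order.
    keep <- idx[sample.int(length(idx), min(n, length(idx)))]
    d[keep, , drop = FALSE]
  }

  # Apply the chosen method within each species and stack the results.
  out <- do.call(rbind, lapply(split(data, data$common_name), pick_rows))
  rownames(out) <- NULL
  out
}
```

In this sketch a species with fewer than n qualifying rows simply contributes all the rows it has; whether birdnet_subsample behaves the same way is not stated here.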

save_to_file

Logical. If TRUE, saves the output to a CSV file. Defaults to FALSE. Automatically set to TRUE if file is provided.

file

Character or NULL. File path to save the output CSV. If NULL and save_to_file = TRUE, saves as "subsampled_data.csv" in the working directory.
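
The interaction between save_to_file and file described above can be summarised in a short sketch. `resolve_output_path` is a hypothetical helper written only to mirror the documented rules; it is not a function exported by the package and may not match the internal logic.

```r
# Hypothetical helper mirroring the documented rules; not the package's code.
# Returns the path that would be written to, or NULL when nothing is saved.
resolve_output_path <- function(save_to_file = FALSE, file = NULL) {
  # Supplying a file path implies saving.
  if (!is.null(file)) save_to_file <- TRUE
  if (!save_to_file) return(NULL)
  # Fall back to the documented default name in the working directory.
  if (is.null(file)) file <- "subsampled_data.csv"
  file
}

resolve_output_path()                      # NULL: nothing is written
resolve_output_path(save_to_file = TRUE)   # "subsampled_data.csv"
resolve_output_path(file = "top.csv")      # "top.csv"
```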

Value

A data frame containing the subsampled observations.

Examples

if (FALSE) { # \dontrun{
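# Stratified subsample of 300 observations per species
birdnet_subsample(data = my_data, n = 300, method = "stratified")
# Top 100 observations per species by confidence, written to a CSV file
birdnet_subsample(data = my_data, n = 100, method = "top",
                  save_to_file = TRUE, file = "top_samples.csv")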
} # }