
This convenience function returns the row(s) with the highest confidence value within each time interval. If 'filter' is supplied, only rows from that time interval are considered.

Usage

get_top_prediction(data, filter = NULL)

Arguments

data

A data frame with the columns 'start', 'end', 'scientific_name', 'common_name', and 'confidence'. This is typically the output of predictions_to_df().

filter

A list with the elements 'start' and 'end' that selects a single time interval; the data is filtered to that interval before the top prediction is determined. If NULL (the default), all time intervals are processed.

Value

A data frame containing the row(s) with the highest confidence for each time interval (each 'start'/'end' pair), or only for the specified interval when 'filter' is supplied.

Examples

if (FALSE) { # interactive()
# Example data
data <- data.frame(
  start = c(0, 0, 1, 1, 2, 2),
  end = c(1, 1, 2, 2, 3, 3),
  scientific_name = c(
    "Species A",
    "Species B",
    "Species A",
    "Species B",
    "Species A",
    "Species B"
  ),
  common_name = c(
    "Common A",
    "Common B",
    "Common A",
    "Common B",
    "Common A",
    "Common B"
  ),
  confidence = c(0.1, 0.2, 0.5, 0.3, 0.7, 0.8)
)
data

# Get top prediction for each time interval
get_top_prediction(data)

# Get top prediction for a specific time interval
get_top_prediction(data, filter = list(start = 1, end = 2))

# The per-interval result (the call without a filter) can also be obtained with dplyr
# data |>
#    dplyr::group_by(start, end) |>
#    dplyr::slice_max(order_by = confidence)
}
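
For readers without dplyr, the per-interval selection described above can be approximated in base R. The following is only a sketch, not the package's implementation: the helper name top_prediction_sketch is hypothetical, and it assumes that 'filter' means an exact match on both 'start' and 'end' and that tied rows are all kept.

```r
# Hypothetical helper (not part of the package): base-R sketch of
# "keep the highest-confidence row(s) per time interval".
top_prediction_sketch <- function(data, filter = NULL) {
  if (!is.null(filter)) {
    # Assumption: the filter matches 'start' and 'end' exactly.
    keep <- data$start == filter$start & data$end == filter$end
    data <- data[keep, , drop = FALSE]
  }
  # Nothing to rank if no rows remain.
  if (nrow(data) == 0) {
    return(data)
  }
  # One key per time interval (each start/end pair).
  key <- paste(data$start, data$end, sep = "_")
  # Maximum confidence within each interval, named by key.
  group_max <- vapply(split(data$confidence, key), max, numeric(1))
  # Keep every row that reaches its interval's maximum (ties are kept).
  data[data$confidence == group_max[key], , drop = FALSE]
}

# Same shape as the example data above.
example_data <- data.frame(
  start = c(0, 0, 1, 1, 2, 2),
  end = c(1, 1, 2, 2, 3, 3),
  scientific_name = rep(c("Species A", "Species B"), times = 3),
  common_name = rep(c("Common A", "Common B"), times = 3),
  confidence = c(0.1, 0.2, 0.5, 0.3, 0.7, 0.8)
)

top_all <- top_prediction_sketch(example_data)
top_one <- top_prediction_sketch(example_data, filter = list(start = 1, end = 2))
```

Here top_all holds one row per interval and top_one holds only the best row for the interval from 1 to 2. Comparing confidence values with == is safe in this sketch because each maximum is taken from the same column it is compared against.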