quantize
birdnet_stm32.conversion.quantize
Post-training quantization (PTQ) conversion from Keras to TFLite.
Provides representative dataset generation and TFLite conversion with float32 I/O and INT8 internal ops for STM32N6 NPU deployment.
representative_data_gen(file_paths, cfg, num_samples=100, snr_threshold=0.01)
Build a representative dataset generator for TFLite PTQ calibration.
Yields one input tensor per iteration in the exact shape expected by the model. Filters out near-silent chunks (RMS energy below snr_threshold) to avoid widening the INT8 quantization ranges with uninformative data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file_paths | list[str] | Audio file paths to sample from. | required |
| cfg | dict | Training config dict (sample_rate, num_mels, spec_width, chunk_duration, fft_length, audio_frontend, mag_scale). | required |
| num_samples | int | Maximum number of samples to draw. | 100 |
| snr_threshold | float | Minimum RMS energy for a chunk to be included (0 to disable). | 0.01 |
Yields:
| Type | Description |
|---|---|
| | Single-element list containing the input tensor with batch dimension. |
Source code in birdnet_stm32/conversion/quantize.py
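The silence filter described above can be sketched in plain Python. This is only an illustration of the gating idea, not the module's implementation: `rms` and `gated_chunks` are hypothetical helpers, and they operate on in-memory sample lists rather than on audio files and a config dict.

```python
import math


def rms(chunk):
    """Root-mean-square energy of a sequence of samples."""
    return math.sqrt(sum(x * x for x in chunk) / len(chunk))


def gated_chunks(chunks, num_samples=100, snr_threshold=0.01):
    """Yield up to num_samples chunks whose RMS is at least snr_threshold.

    Hypothetical helper: each item is a single-element list holding the
    chunk with a leading batch dimension of 1, mirroring the documented
    yield shape. A threshold of 0 disables the filter.
    """
    yielded = 0
    for chunk in chunks:
        if yielded >= num_samples:
            break
        if snr_threshold > 0 and rms(chunk) < snr_threshold:
            continue  # near-silent chunk: skip it
        yield [[list(chunk)]]  # [tensor], tensor has shape (1, len(chunk))
        yielded += 1


# Toy signals: one clearly audible, one close to silence.
loud = [0.5 * math.sin(0.1 * n) for n in range(1000)]
quiet = [0.001 * math.sin(0.1 * n) for n in range(1000)]

kept = list(gated_chunks([loud, quiet, loud], num_samples=2))
unfiltered = list(gated_chunks([loud, quiet, loud], snr_threshold=0))
```

With the default threshold the quiet chunk is skipped, so `kept` holds only the two loud chunks. With `snr_threshold=0` all three chunks pass through.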
convert_to_tflite(model, rep_data_gen, output_path, quantization='ptq', per_tensor=False)
Convert a Keras model to quantized TFLite with float32 I/O and INT8 internals.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Model | Loaded Keras model. | required |
| rep_data_gen | | Callable returning an iterable of [input_tensor] for calibration. Not used when quantization='dynamic'. | required |
| output_path | str | Path to save the .tflite model. | required |
| quantization | str | 'ptq' (full INT8 with calibration) or 'dynamic' (dynamic range). | 'ptq' |
| per_tensor | bool | If True, use per-tensor instead of per-channel quantization. | False |
Returns:
| Type | Description |
|---|---|
| bytes | Raw TFLite model bytes. |
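To illustrate what the per_tensor flag trades off, the following sketch compares a single shared scale against one scale per output channel. It assumes a common symmetric INT8 scheme (largest magnitude mapped to 127) and is not taken from the module or from the converter. `symmetric_scale`, `quantize`, and the toy weights are all hypothetical.

```python
def symmetric_scale(values, qmax=127):
    """Scale that maps the largest magnitude in `values` onto qmax."""
    peak = max(abs(v) for v in values)
    return peak / qmax if peak > 0 else 1.0


def quantize(values, scale, qmax=127):
    """Round each value to the nearest integer level and clamp to [-qmax, qmax]."""
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


# Toy weights: two output channels with very different magnitudes.
weights = [
    [0.02, -0.011, 0.013],  # channel 0: small values
    [1.27, -0.64, 0.9],     # channel 1: large values
]

# Per-tensor: one scale shared by every channel.
per_tensor_scale = symmetric_scale([v for ch in weights for v in ch])
q_tensor = [quantize(ch, per_tensor_scale) for ch in weights]

# Per-channel: each channel gets its own scale.
per_channel_scales = [symmetric_scale(ch) for ch in weights]
q_channel = [quantize(ch, s) for ch, s in zip(weights, per_channel_scales)]
```

Under the shared scale, channel 0 is squeezed into a few integer levels near zero because channel 1 dictates the range. With per-channel scales, channel 0 spans the full INT8 range. Per-tensor quantization gives up that resolution in exchange for a single scale per tensor.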