Fast Realtime Pitch Detection

This demo runs a neural network specialized for fast pitch detection of human voices. It is based on the FCPE model, with the audio extractor rewritten in browser-compatible WASM and the ONNX model quantised to FP16 to minimise latency.

Paper: https://arxiv.org/abs/2509.15140

Model: web/static/fcpe.single.fp16.onnx

Voicing threshold thr=0.0060 conf=0.0000 (below)
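The readout above compares a per-frame voicing confidence against a fixed threshold and reports which side it falls on. The sketch below shows one way such a gate could work. The function names, the "above" label, and the inclusive comparison are assumptions made for illustration, not taken from the demo's source; only the threshold value 0.0060 and the "(below)" label come from the readout.

```python
from typing import Optional

# Threshold value taken from the readout; everything else here is hypothetical.
DEFAULT_THR = 0.0060


def voicing_label(conf: float, thr: float = DEFAULT_THR) -> str:
    """Return 'above' when the confidence reaches the threshold, else 'below'.

    Treating conf == thr as voiced is an assumption.
    """
    return "above" if conf >= thr else "below"


def gate_pitch(f0_hz: float, conf: float, thr: float = DEFAULT_THR) -> Optional[float]:
    """Keep the pitch estimate only for frames judged voiced; otherwise return None."""
    return f0_hz if conf >= thr else None


def format_readout(conf: float, thr: float = DEFAULT_THR) -> str:
    """Format a status line in the same shape as the demo's readout."""
    return f"Voicing threshold thr={thr:.4f} conf={conf:.4f} ({voicing_label(conf, thr)})"


if __name__ == "__main__":
    print(format_readout(0.0))
    print(gate_pitch(220.0, 0.0))
```

Returning None for unvoiced frames, rather than 0 Hz, keeps silence and noise from being plotted or smoothed as if they were real pitch values.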

Embedded sample