---
pipeline_tag: voice-activity-detection
license: bsd-2-clause
tags:
- speech-processing
- semantic-vad
- multilingual
datasets:
- pipecat-ai/smart-turn-data-v3.1-train
- pipecat-ai/smart-turn-data-v3.1-test
---

# Smart Turn v3.x

**Smart Turn** is an open-source semantic Voice Activity Detection (VAD) model that tells you whether a speaker has finished their turn by analysing the raw waveform, not the transcript.

## Links

* [Blog post: Smart Turn v3](https://www.daily.co/blog/announcing-smart-turn-v3-with-cpu-inference-in-just-12ms/)
* [GitHub repo](https://github.com/pipecat-ai/smart-turn) with training and inference code, and more information
* [Datasets](https://huggingface.co/pipecat-ai/datasets)

## Model architecture

* Backbone: Whisper Tiny encoder
* Head: shallow linear classifier
* Params: 8M
* Checkpoint: 8 MB ONNX (int8 quantized), 32 MB ONNX (unquantized)

## How to use

Please see the blog post and GitHub repo for more information on using the model, either standalone or with Pipecat.

## Thanks

Thank you to the following organisations for contributing audio datasets:

- [Liva AI](https://www.theliva.ai/)
- [Midcentury](https://www.midcentury.xyz/)
- [MundoAI](https://mundoai.world/)
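
## Note: checkpoint size estimate

The two checkpoint sizes listed under Model architecture follow roughly from the parameter count. The sketch below is a back-of-the-envelope check, not part of the Smart Turn codebase: the helper `checkpoint_size_mb` is hypothetical, the 8M figure is the rounded value from this card, and it assumes the unquantized export stores weights as 32-bit floats (the card does not state the precision). Graph metadata and any tensors left unquantized are ignored.

```python
def checkpoint_size_mb(num_params: int, bytes_per_param: int) -> float:
    """Estimate weight storage in MB (1 MB = 10**6 bytes), ignoring graph metadata."""
    return num_params * bytes_per_param / 1_000_000


# "Params: 8M" from the model card (rounded value).
NUM_PARAMS = 8_000_000

# int8 quantization stores roughly one byte per weight.
int8_mb = checkpoint_size_mb(NUM_PARAMS, 1)

# Assumption: the unquantized export uses float32, i.e. four bytes per weight.
fp32_mb = checkpoint_size_mb(NUM_PARAMS, 4)

print(f"int8: ~{int8_mb:.0f} MB, float32: ~{fp32_mb:.0f} MB")
```

Under these assumptions the estimates come out at about 8 MB and 32 MB, which matches the checkpoint sizes quoted above.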