Working example
#3, opened by vrajshroff
The code in the model card no longer works. Here is a working example for people who, like me, are new to this:
Notes:
- The following does not work as-is in Node.js, since AudioContext is a Web API that isn't available there. Follow the steps here instead: https://huggingface.co/docs/transformers.js/guides/node-audio-processing.
- Better documentation: https://github.com/huggingface/transformers.js.
- There is no need to install via `npm install xenova/transformers.js#v3` as suggested in the other discussion; the CDN import below is enough.
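For the Node.js case mentioned above, the linked guide decodes audio with the `wavefile` package instead of AudioContext. As a dependency-free illustration of the same idea, here is a minimal sketch that decodes a canonical 16-bit PCM WAV buffer into a Float32Array; the function name `decodeWav16` and the 44-byte-header assumption are mine, not from the guide:

```javascript
// Node.js has no AudioContext, so the WAV bytes must be decoded manually
// (or with a package such as `wavefile`, as the linked guide does).
// This sketch assumes a canonical 44-byte header and 16-bit PCM samples.
function decodeWav16(buffer) {
  const numChannels = buffer.readUInt16LE(22); // channel count field
  const sampleRate = buffer.readUInt32LE(24);  // samples per second
  const dataSize = buffer.readUInt32LE(40);    // size of the data chunk in bytes
  // One float per frame; take the first channel only.
  const samples = new Float32Array(dataSize / 2 / numChannels);
  for (let i = 0; i < samples.length; ++i) {
    // Normalize signed 16-bit PCM to the [-1, 1) range.
    samples[i] = buffer.readInt16LE(44 + i * 2 * numChannels) / 32768;
  }
  return { sampleRate, samples };
}

// Sketch of usage (network calls omitted):
//   const resp = await fetch(url);
//   const { samples } = decodeWav16(Buffer.from(await resp.arrayBuffer()));
//   const inputs = await processor(samples);
```

For real projects, prefer the `wavefile` route from the guide: it handles resampling, bit-depth conversion, and multi-channel audio, none of which this sketch covers.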
I hope this helps!
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Speaker Diarization with Transformers.js</title>
  </head>
  <body>
    <h1>Speaker Diarization Example</h1>
    <script type="module">
      import { AutoProcessor, AutoModelForAudioFrameClassification, read_audio } from 'https://cdn.jsdelivr.net/npm/@huggingface/[email protected]';

      (async () => {
        try {
          // Load the model and its processor
          const model_id = 'onnx-community/pyannote-segmentation-3.0';
          const model = await AutoModelForAudioFrameClassification.from_pretrained(model_id);
          const processor = await AutoProcessor.from_pretrained(model_id);

          // Read and preprocess the audio at the model's sampling rate
          const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/mlk.wav';
          const audio = await read_audio(url, processor.feature_extractor.config.sampling_rate);
          const inputs = await processor(audio);

          // Run the model and post-process frame logits into speaker segments
          const { logits } = await model(inputs);
          const result = processor.post_process_speaker_diarization(logits, audio.length);
          console.table(result[0], ['start', 'end', 'id', 'confidence']);
        } catch (err) {
          console.error('Error running the diarization:', err);
        }
      })();
    </script>
  </body>
</html>
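If you want something more readable than the console.table dump, the rows it logs have the shape `{ start, end, id, confidence }` (the columns passed to console.table above). A small, purely illustrative helper — `formatSegments` is my name, and the sample rows below are synthetic, not real model output:

```javascript
// Illustrative helper: turn diarization rows of shape
// { id, start, end, confidence } into human-readable lines.
function formatSegments(rows) {
  return rows.map(r =>
    `SPEAKER_${r.id}: ${r.start.toFixed(2)}s - ${r.end.toFixed(2)}s ` +
    `(confidence ${(r.confidence * 100).toFixed(1)}%)`
  );
}

// Synthetic example rows, for illustration only:
const lines = formatSegments([
  { id: 0, start: 0.0, end: 2.5, confidence: 0.91 },
  { id: 1, start: 2.5, end: 4.1, confidence: 0.86 },
]);
console.log(lines.join('\n'));
```

In the page above you would call it as `formatSegments(result[0])` right after the post-processing step.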