DeepFilterNet3 - Core ML (FP16)

Real-time speech enhancement model for Apple Silicon. Removes background noise from speech audio.

  • 2.1M params, FP16, ~4.2 MB
  • Runs on Neural Engine via Core ML
  • 48 kHz native sample rate, 10 ms frames
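
As a quick sanity check on the figures above, the following sketch works out how many samples a 10 ms frame covers at 48 kHz and roughly how much storage 2.1M FP16 parameters need. The helper `frameCount(forSampleCount:)` is a hypothetical name used only for illustration; it is not part of the package API.

```swift
import Foundation

// Values taken from the spec list above.
let sampleRate = 48_000       // Hz
let frameDurationMs = 10      // milliseconds per frame

// Samples covered by one frame: 48_000 * 10 / 1000.
let samplesPerFrame = sampleRate * frameDurationMs / 1000

/// Hypothetical helper: number of frames needed to cover `sampleCount` samples.
/// The last frame may be partial and would need zero-padding.
func frameCount(forSampleCount sampleCount: Int) -> Int {
    return (sampleCount + samplesPerFrame - 1) / samplesPerFrame
}

// Approximate weight storage in MB: 2.1M parameters at 2 bytes each (FP16).
let approxWeightMegabytes = 2.1e6 * 2.0 / 1e6

print("samples per frame:", samplesPerFrame)
print("frames in 5 s of audio:", frameCount(forSampleCount: 5 * sampleRate))
print("approx. weight size (MB):", approxWeightMegabytes)
```

The weight estimate ignores any metadata stored in the model package, so the on-disk size can differ slightly.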

Latency (M2 Max)

Duration   Time     RTF
5 s        0.65 s   0.13
10 s       1.2 s    0.12
20 s       4.8 s    0.24
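
RTF here is the real-time factor: processing time divided by audio duration, so values below 1.0 mean faster than real time. A minimal sketch that recomputes the RTF column from the other two columns (the function name `realTimeFactor` is illustrative, not part of the package):

```swift
import Foundation

/// Real-time factor: processing time divided by audio duration.
/// Values below 1.0 mean the model runs faster than real time.
func realTimeFactor(processingSeconds: Double, audioSeconds: Double) -> Double {
    precondition(audioSeconds > 0, "audio duration must be positive")
    return processingSeconds / audioSeconds
}

// Rows from the latency table: (audio duration, processing time), in seconds.
let measurements: [(audio: Double, processing: Double)] = [
    (5, 0.65), (10, 1.2), (20, 4.8),
]

for m in measurements {
    let rtf = realTimeFactor(processingSeconds: m.processing, audioSeconds: m.audio)
    print(String(format: "%.0f s -> RTF %.2f", m.audio, rtf))
}
```

Note that the 20 s row has roughly twice the RTF of the shorter clips, so throughput in these measurements does not scale linearly with clip length.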

Usage

import SpeechEnhancement

// Load the pretrained model (async, may throw).
let enhancer = try await SpeechEnhancer.fromPretrained()
// Denoise audio sampled at 48 kHz.
let clean = try enhancer.enhance(audio: noisyAudio, sampleRate: 48000)

Command line:

swift run audio denoise noisy.wav --output clean.wav

Files

  • DeepFilterNet3.mlpackage - Core ML FP16 model (Neural Engine)
  • auxiliary.npz - ERB filterbank, Vorbis window, normalization states
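
For readers unfamiliar with the Vorbis window mentioned above, here is a minimal sketch of the standard formula, w[n] = sin(π/2 · sin²(π(n + 0.5)/N)). It does not read auxiliary.npz. The 960-sample window length (20 ms at 48 kHz) is an assumption; only the 480-sample hop follows from the 10 ms frame size stated earlier. The stored window may be scaled or laid out differently.

```swift
import Foundation

/// Vorbis power-complementary window of length `size`:
/// w[n] = sin(pi/2 * sin^2(pi * (n + 0.5) / size))
func vorbisWindow(size: Int) -> [Double] {
    return (0..<size).map { (n: Int) -> Double in
        let s = sin(Double.pi * (Double(n) + 0.5) / Double(size))
        return sin(Double.pi / 2 * s * s)
    }
}

// Assumption: 960-sample window with a 480-sample hop (50% overlap).
let hop = 480
let window = vorbisWindow(size: 2 * hop)

// With 50% overlap, the squared window values of overlapping frames sum to 1,
// which is what allows analysis and synthesis with the same window.
var maxDeviation = 0.0
for n in 0..<hop {
    let sum = window[n] * window[n] + window[n + hop] * window[n + hop]
    maxDeviation = max(maxDeviation, abs(sum - 1.0))
}
print("max deviation from 1:", maxDeviation)
```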

Reference
