Multi-agent collab to make Gemma go brrr
Validate pronunciation with audio input
Process audio for analysis and transcription