consome2 (Takuya Umeki)

liked a dataset 9 days ago

anyreach-ai/dualturn-otospeech-turn-taking

Viewer • Updated Apr 1 • 1.12k • 269 • 2

replied to their post 14 days ago

Good to know!

reacted to their post with 🔥 15 days ago

Built a small site for tracking speech-to-speech, full-duplex, and audio foundation model work.
It covers models, benchmarks, datasets, and some blog posts to organize the landscape in one place.

Still early, but sharing in case it is useful:
https://www.fullduplex.ai/

If you spot missing entries or mistakes, I would really appreciate corrections.

upvoted an article 16 days ago

Speech-to-Speech AI: A Primer

published an article 16 days ago

Speech-to-Speech AI: A Primer

posted an update 16 days ago

liked a dataset 2 months ago

sarulab-speech/J-CHAT

Viewer • Updated Feb 9, 2025 • 2.02k • 71 • 41

reacted to their post with ❤️ 3 months ago

We’ve released two conversational speech datasets from oto on Hugging Face 🤗
Both are based on real, casual, full-duplex conversations, but with slightly different focuses.

Dataset 1: Processed / curated subset
otoearth/otoSpeech-full-duplex-processed-141h
* Full-duplex, spontaneous multi-speaker conversations
* Participants filtered for high audio quality
* PII removal and audio enhancement applied
* Designed for training and benchmarking S2S or dialogue models

Dataset 2: Larger raw(er) release
otoearth/otoSpeech-full-duplex-280h
* Same collection pipeline, with broader coverage
* More diversity in speakers, accents, and conversation styles
* Useful for analysis, filtering, or custom preprocessing experiments

We intentionally split the release to support different research workflows:
clean and ready-to-use vs. more exploratory and research-oriented use.

The datasets are currently private, but we're happy to approve access requests, so feel free to request access if you're interested.

If you're working on speech-to-speech (S2S) models or are curious about full-duplex conversational data, we'd love to exchange ideas.

Feedback and ideas are very welcome!

replied to their post 3 months ago

Thank you! Check it out :)

posted an update 3 months ago
