Kyle PRO

iky1e

https://ikyle.me

kylehowells

AI & ML interests

None yet

Recent Activity

liked a model about 16 hours ago

aufklarer/Qwen3.5-0.8B-Chat-MLX

upvoted a paper about 15 hours ago

Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition

Paper • 2312.17279 • Published Dec 27, 2023 • 4

upvoted an article about 16 hours ago

How to Fine-Tune Nemotron 3.5 ASR for Your Language, Domain, or Accent

Article • nvidia • 1 day ago • 35

upvoted 2 collections about 16 hours ago

CoreML Speech Models

Speech AI models for Apple Neural Engine via CoreML. iOS/macOS ready. ASR, TTS, VAD, diarization. • 24 items • Updated about 23 hours ago • 4

MLX Speech Models

Speech AI models for Apple Silicon via MLX. ASR, TTS, VAD, diarization, speaker embedding. • 56 items • Updated about 18 hours ago • 5

upvoted a paper 3 days ago

Unified Panoramic Geometry Estimation via Multi-View Foundation Models

Paper • 2605.26368 • Published 12 days ago • 4

upvoted a collection 3 days ago

📟 PaGeR

Panorama Geometry Reconstruction • 7 items • Updated 9 days ago • 1

upvoted 5 papers 8 days ago

CubePart: An Open-Vocabulary Part-Controllable 3D Generator

Paper • 2605.28763 • Published 10 days ago • 14

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published 10 days ago • 72

Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players

Paper • 2605.28816 • Published 10 days ago • 419

EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM

Paper • 2312.06660 • Published Dec 11, 2023 • 2

InfLLM-V2: Dense-Sparse Switchable Attention for Seamless Short-to-Long Adaptation

Paper • 2509.24663 • Published Sep 29, 2025 • 18

upvoted a paper 13 days ago

Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling

Paper • 2604.23586 • Published Apr 26 • 6

upvoted 2 collections 13 days ago

Talker-T2AV

Talker-T2AV • 3 items • Updated 13 days ago • 3

Lance MLX

Feature-complete MLX port of ByteDance Lance: t2i, image_edit, x2t_image, t2v, video_edit, x2t_video. • 4 items • Updated 3 days ago • 4

upvoted a collection 14 days ago

Hy-MT2

Hunyuan translation model, version 2.0 • 11 items • Updated 10 days ago • 43

upvoted 3 papers 14 days ago

Pixal3D: Pixel-Aligned 3D Generation from Images

Paper • 2605.10922 • Published 26 days ago • 33

Hy-MT2: A Family of Fast, Efficient and Powerful Multilingual Translation Models in the Wild

Paper • 2605.22064 • Published 16 days ago • 5

Lance: Unified Multimodal Modeling by Multi-Task Synergy

Paper • 2605.18678 • Published 19 days ago • 78

upvoted a paper about 1 month ago

DeepVQE: Real Time Deep Voice Quality Enhancement for Joint Acoustic Echo Cancellation, Noise Suppression and Dereverberation

Paper • 2306.03177 • Published Jun 5, 2023 • 1

upvoted an article about 1 month ago

Granite 4.1 LLMs: How They’re Built

Article • ibm-granite • Apr 29 • 77