Qwen3-VL-4B is incredibly easy to fine-tune! We've trained the first DSE (Document Screenshot Embedding) model built on it, and it already performs at the level of Jina Embeddings v4!
While Jina Embeddings v4 is built on Qwen2.5-VL-3B (which has a non-commercial license), our model is based on Qwen3-VL-4B and released under Apache 2.0—making it fully commercially permissive.
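The post doesn't include code, but here is a minimal sketch of DSE-style retrieval: embed a text query and a page screenshot with the same VLM, then score them by similarity. The checkpoint name is hypothetical, and the last-token pooling follows the original DSE recipe, so the released model's actual usage may differ.

import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForImageTextToText

MODEL = "your-org/qwen3-vl-4b-dse"  # hypothetical checkpoint name

processor = AutoProcessor.from_pretrained(MODEL)
model = AutoModelForImageTextToText.from_pretrained(MODEL, torch_dtype=torch.bfloat16)

def embed(messages):
    # Tokenize a chat-formatted input (text and/or image) for the VLM.
    inputs = processor.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    )
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    # DSE-style pooling: last layer's hidden state at the final token.
    vec = out.hidden_states[-1][0, -1]
    return torch.nn.functional.normalize(vec, dim=-1)

query = embed([{"role": "user", "content": [{"type": "text", "text": "quarterly revenue table"}]}])
page = embed([{"role": "user", "content": [{"type": "image", "image": Image.open("page.png")}]}])
print(float(query @ page))  # cosine similarity: higher means a better match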
Reproducing research code shouldn't take longer than reading the paper. For papers that include code, setting up the right environment often means hours of dependency hell and configuration debugging.
At Remyx AI, we built an agent that automatically creates and tests Docker images for research papers, then shares them publicly so anyone can reproduce results with a single command.
We just submitted PR #908 to integrate this directly into arXiv Labs.
As promised, and by popular request, we have scheduled the first live session about Ark for the 28th of July.
pip install ark-robotics
If you are already in the messaging channel, you're all set, no need to do anything :-D If you'd like to register, please write to me at [email protected] and I will add you and send you the invite.
We settled on 5 pm UK time after consulting many of the interested people. Hope it works for you too!
5 years ago, we launched Gradio as a simple Python library to let researchers at Stanford easily demo computer vision models with a web interface.
Today, Gradio is used by >1 million developers each month to build and share AI web apps. This includes some of the most popular open-source projects of all time, like Automatic1111, Fooocus, Oobabooga’s Text WebUI, Dall-E Mini, and LLaMA-Factory.
How did we get here? How did Gradio keep growing in the very crowded field of open-source Python libraries? I get this question a lot from folks who are building their own open-source libraries. This post distills some of the lessons that I have learned over the past few years:
1. Invest in good primitives, not high-level abstractions
2. Embed virality directly into your library
3. Focus on a (growing) niche
4. Your only roadmap should be rapid iteration
5. Maximize ways users can consume your library's outputs
1. Invest in good primitives, not high-level abstractions
When we first launched Gradio, we offered only one high-level class (gr.Interface), which created a complete web app from a single Python function. We quickly realized that developers wanted to create other kinds of apps (e.g. multi-step workflows, chatbots, streaming applications), but as we started listing out the apps users wanted to build, we realized what we needed to do: rather than bolt more options onto gr.Interface, invest in lower-level primitives (what became gr.Blocks) that developers could compose into any of these apps.
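To make the contrast concrete, here is a minimal sketch using Gradio's public gr.Interface and gr.Blocks APIs; the greet function is just a placeholder demo.

import gradio as gr

def greet(name):
    return f"Hello, {name}!"

# High-level abstraction: one function in, one complete web app out.
interface_app = gr.Interface(fn=greet, inputs="text", outputs="text")

# Primitives: the same app composed from individual components and events,
# a pattern that also scales to multi-step workflows, chatbots, and streaming.
with gr.Blocks() as blocks_app:
    name = gr.Textbox(label="Name")
    greeting = gr.Textbox(label="Greeting")
    gr.Button("Greet").click(fn=greet, inputs=name, outputs=greeting)

blocks_app.launch()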
See that purple banner on the Llama 4 models? It's Xet storage, and this is actually huge for anyone building with AI models. Let's geek out a little bit 🤓
Current problem: AI models are massive files, stored today with Git LFS. With models getting bigger and downloads exploding, we needed something better. Xet lets you version large files the way you version code, with compression and deduplication, all Git-compatible. That means less bandwidth, faster sharing, and smoother collaboration.
Real numbers: ~25% deduplication on Llama 4 models, hitting ~40% for finetunes.
Scale matters here - the Hub served 2B model downloads in 30 days, with Llama models alone accounting for 60M. The upcoming Llama 4 Behemoth has 2T parameters! Xet's chunk-based system was built exactly for this.
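To make the dedup intuition concrete, here is a toy sketch of chunk-level deduplication. It is illustrative only, not Xet's actual algorithm: real Xet uses content-defined chunking, while this toy uses fixed-size chunks.

import hashlib
import os

CHUNK_SIZE = 64 * 1024  # toy fixed-size chunks

def chunk_hashes(data: bytes):
    # Split a blob into chunks and hash each one.
    return [
        hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]

def upload(blob: bytes, store: dict) -> int:
    # Only transfer chunks the store has never seen (deduplication).
    new = 0
    for i, h in enumerate(chunk_hashes(blob)):
        if h not in store:
            store[h] = blob[i * CHUNK_SIZE:(i + 1) * CHUNK_SIZE]
            new += 1
    return new

store = {}
base = os.urandom(1024 * 1024)                          # pretend base-model weights
finetune = base[:512 * 1024] + os.urandom(512 * 1024)   # half the bytes changed
print(upload(base, store))      # 16 new chunks transferred
print(upload(finetune, store))  # only 8: the unchanged half deduplicates

A finetune that shares half its bytes with the base model only transfers the changed chunks, which is the effect behind the ~40% dedup figure above.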
This is the kind of engineering that makes the next wave of large models actually usable. Kudos to the team! 🧨