

Akash kathole

akashkathole



Recent Activity

posted an update 27 days ago

🚀 Just shipped reconcile_gst2b_env at OpenEnv Hackathon 2026 (Meta x Scaler India): an RL environment for the monthly GST tax reconciliation that 14M Indian businesses do by hand.

Trained Qwen3-4B with SFT + GRPO and a custom Tier 2c length-shaping reward modification. Headline: mean composite reward of 0.305 (n=5), +69% over the prompted baseline.

5 documented failure modes, including a novel research finding: the SAME composite reward design that defends against 6 red-team attacks ALSO makes a 3-step shortcut score higher than 50 steps of honest training. Confirmed empirically on-site (step-350 mean > step-375 mean).

Live demo, repo, and writeup linked below.
🔗 huggingface.co/spaces/akashkathole/reconcile_gst2b_env
🎥 youtube.com/watch?v=K-sZ8c1TMjw
📝 BLOG.md in the Space

updated a Space 28 days ago

akashkathole/reconcile_gst2b_env

published a Space about 1 month ago

akashkathole/reconcile_gst2b_env



akashkathole's models (1)

akashkathole/lora_model

Updated Jun 21, 2024