
T AKHIL KUMAR REDDY PRO

akhiilll

AI & ML interests

None yet

Recent Activity

reacted to their post with 🔥 about 4 hours ago

Just shipped ClaimSense Adjudication Gym at OpenEnv Hackathon 2026 (Scaler India).

An OpenEnv RL environment for enterprise insurance claims adjudication: the monthly “tool-heavy” workflow real adjusters do. Pull policy + claim history, run fraud checks, verify purchase/transactions, then approve / deny / escalate under partial observability with long-horizon credit assignment.

Trained Qwen/Qwen2.5-1.5B-Instruct with:
Rollout evaluation on HF Jobs (A10G) and a random baseline for comparison
Real GRPO weight updates (TRL GRPOTrainer) with LoRA adapters and two independent reward functions (format + env replay)

Headline training evidence:
GRPO run: 80 steps, 640 rollouts, KL rises ~0 → ~0.06 (real weight updates), completion length shrinks (~25 → ~10).
Plots + logs are committed in the Space under runs/.

Live demo + repo + writeup linked below.

🔗 Env (Space URL): https://huggingface.co/spaces/akhiilll/claims-env
🧪 Notebook: https://huggingface.co/spaces/akhiilll/claims-env/blob/main/training/InsureClaim_Training_Colab.ipynb
📝 Blog: docs/HF_MINI_BLOG.md in the Space
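
The post mentions a "format" reward alongside an env-replay reward but does not show either. As a rough illustration only, a format reward could look something like the sketch below. The JSON output shape with an "action" key and the names format_reward and TERMINAL_ACTIONS are assumptions made for this example, not taken from the Space; only the three decisions (approve / deny / escalate) come from the post. The signature follows the general shape of a custom reward function that takes a batch of completions and returns one float per completion.

```python
import json

# Assumed action vocabulary: the three terminal decisions named in the post.
TERMINAL_ACTIONS = {"approve", "deny", "escalate"}


def format_reward(completions, **kwargs):
    """Score each completion 1.0 if it is a JSON object whose "action"
    field is one of the terminal decisions, otherwise 0.0.

    Hypothetical output format; the real environment may expect something else.
    """
    rewards = []
    for text in completions:
        try:
            parsed = json.loads(text)
        except (json.JSONDecodeError, TypeError):
            # Not valid JSON (or not a string at all): no format credit.
            rewards.append(0.0)
            continue
        action = parsed.get("action") if isinstance(parsed, dict) else None
        ok = isinstance(action, str) and action in TERMINAL_ACTIONS
        rewards.append(1.0 if ok else 0.0)
    return rewards
```

A reward like this only checks that the output is well formed; whether the decision is correct would be left to the second, env-replay reward.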

posted an update about 16 hours ago


updated a model about 16 hours ago

akhiilll/claims-env-pro-grpo


Organizations

None yet

akhiilll's models (3)

akhiilll/claims-env-pro-grpo

Text Generation • Updated about 16 hours ago

akhiilll/forgeenv-repair-agent

Updated about 19 hours ago

akhiilll/forgeenv-source

Updated 1 day ago