T AKHIL KUMAR REDDY
PRO
akhiilll
AI & ML interests
None yet
Recent Activity
reacted to their post with 🔥 about 4 hours ago
Just shipped ClaimSense Adjudication Gym at OpenEnv Hackathon 2026 (Scaler India).

An OpenEnv RL environment for enterprise insurance claims adjudication, the monthly “tool-heavy” workflow real adjusters do: pull policy + claim history, run fraud checks, verify purchase/transactions, then approve / deny / escalate under partial observability with long-horizon credit assignment.

Trained Qwen/Qwen2.5-1.5B-Instruct with:
- Rollout evaluation on HF Jobs (A10G) and a random baseline for comparison
- Real GRPO weight updates (TRL GRPOTrainer) with LoRA adapters and two independent reward functions (format + env replay)

Headline training evidence:
- GRPO run: 80 steps, 640 rollouts, KL rises ~0 → ~0.06 (real weight updates), completion length shrinks (~25 → ~10).
- Plots + logs are committed in the Space under runs/.

Live demo + repo + writeup linked below.

🔗 Env (Space URL): https://huggingface.co/spaces/akhiilll/claims-env
🧪 Notebook: https://huggingface.co/spaces/akhiilll/claims-env/blob/main/training/InsureClaim_Training_Colab.ipynb
📝 Blog: docs/HF_MINI_BLOG.md in the Space
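
The post mentions two independent reward functions (format + env replay) but does not show them. As a rough illustration only, here is a minimal sketch of what such a pair could look like in the callable style that TRL's GRPOTrainer accepts for custom rewards (a function taking `completions` plus keyword arguments and returning one float per completion). The JSON action format, the `expected_action` column, the reward values, and the helper names are all assumptions made for this sketch; they are not taken from the claims-env Space.

```python
import json

# The three decisions named in the post.
VALID_ACTIONS = {"approve", "deny", "escalate"}


def _completion_text(completion):
    """Return plain text for either a string or a chat-style completion."""
    if isinstance(completion, str):
        return completion
    # Conversational format: a list of message dicts (hypothetical handling).
    return completion[0]["content"]


def _parse_action(text):
    """Parse a hypothetical JSON action such as {"action": "deny"}."""
    try:
        payload = json.loads(text.strip())
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    action = payload.get("action")
    if isinstance(action, str) and action in VALID_ACTIONS:
        return action
    return None


def format_reward(completions, **kwargs):
    """1.0 when the completion is a well-formed action, else 0.0."""
    return [
        1.0 if _parse_action(_completion_text(c)) is not None else 0.0
        for c in completions
    ]


def env_replay_reward(completions, expected_action, **kwargs):
    """Toy stand-in for env replay: score the parsed action against a label.

    `expected_action` is a hypothetical dataset column; a real environment
    would replay the action and return its own reward instead.
    """
    rewards = []
    for completion, expected in zip(completions, expected_action):
        action = _parse_action(_completion_text(completion))
        if action is None:
            rewards.append(-1.0)  # unparseable output cannot be replayed
        elif action == expected:
            rewards.append(1.0)
        else:
            rewards.append(0.0)
    return rewards
```

Functions of this shape could be passed to the trainer as `reward_funcs=[format_reward, env_replay_reward]`. Keeping the format check separate from the environment score means a malformed output can be penalized even when no environment reward is available for it.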
posted an update about 16 hours ago
updated a model about 16 hours ago: akhiilll/claims-env-pro-grpo
Organizations
None yet
akhiilll's models (3)
akhiilll/claims-env-pro-grpo • Text Generation • Updated about 16 hours ago
akhiilll/forgeenv-repair-agent • Updated about 19 hours ago
akhiilll/forgeenv-source • Updated 1 day ago