Zhijiang

Zeee

https://cartus.github.io/

AI & ML interests

Large Language Models

Recent Activity

upvoted a paper 21 days ago

Efficient RLVR Training via Weighted Mutual Information Data Selection

Paper • 2603.01907 • Published 22 days ago • 14

upvoted a paper 26 days ago

ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall

Paper • 2510.07896 • Published Oct 9, 2025 • 8

upvoted a paper 28 days ago

On Data Engineering for Scaling LLM Terminal Capabilities

Paper • 2602.21193 • Published 28 days ago • 99

upvoted a collection 29 days ago

CodeScaler

5 items • Updated 22 days ago • 6

upvoted a paper 29 days ago

CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models

Paper • 2602.17684 • Published Feb 4 • 22

upvoted 4 papers about 1 month ago

Probability-Entropy Calibration: An Elastic Indicator for Adaptive Fine-tuning

Paper • 2602.01745 • Published Feb 2 • 7

Improving Data and Reward Design for Scientific Reasoning in Large Language Models

Paper • 2602.08321 • Published Feb 9 • 42

LOCA-bench: Benchmarking Language Agents Under Controllable and Extreme Context Growth

Paper • 2602.07962 • Published Feb 8 • 24

MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

Paper • 2602.01734 • Published Feb 2 • 32

upvoted a paper 2 months ago

MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents

Paper • 2601.12346 • Published Jan 18 • 50

upvoted 4 papers 5 months ago

OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows

Paper • 2510.24411 • Published Oct 28, 2025 • 72

JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence

Paper • 2510.23538 • Published Oct 27, 2025 • 98

QueST: Incentivizing LLMs to Generate Difficult Problems

Paper • 2510.17715 • Published Oct 20, 2025 • 35

Scaling Language-Centric Omnimodal Representation Learning

Paper • 2510.11693 • Published Oct 13, 2025 • 106

upvoted 2 papers 6 months ago

TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenarios

Paper • 2505.12891 • Published May 19, 2025 • 10

MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources

Paper • 2509.21268 • Published Sep 25, 2025 • 104

upvoted 4 papers 7 months ago

Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration

Paper • 2508.13755 • Published Aug 19, 2025 • 14

CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

Paper • 2508.20096 • Published Aug 27, 2025 • 37

Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR

Paper • 2508.14029 • Published Aug 19, 2025 • 119

DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published Aug 20, 2025 • 85