Theme #3.2 World Modeling · Llama 3.2 3B · GRPO
🤖 AI Agent Demo
Watch the trained AdaptAssist model autonomously complete tasks, detect schema drift, and adapt its tool calls in real time.
Manual Setup
Tasks
⚠️ Schema drift — call detect_schema_change
Stats
Tasks Done
Total Reward
Drifts Found
0
Steps
Step 0 / 30 0%
Agent Strategy
Demo Sequence
1. Get user preferences
2. Read calendar
3. Check inbox
4. ⚠️ Detect schema drift
5. Reply to emails
6. Search restaurants
7. Book reservation
8. Create calendar event
9. ✅ Finish episode
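The nine steps above can be sketched as a scripted sequence of tool calls. This is a minimal illustration, not the demo's actual code: the mapping from each step to a tool name is inferred from the Tools list, `run_demo` and `call_tool` are hypothetical names, and the `max_steps` default of 30 mirrors the step counter shown in the demo.

```python
# Hypothetical mapping of the nine demo steps to tool names (inferred, not confirmed).
DEMO_SEQUENCE = [
    "get_user_prefs",        # 1. Get user preferences
    "read_calendar",         # 2. Read calendar
    "get_inbox",             # 3. Check inbox
    "detect_schema_change",  # 4. Detect schema drift
    "send_reply",            # 5. Reply to emails
    "search_restaurants",    # 6. Search restaurants
    "book_restaurant",       # 7. Book reservation
    "create_event",          # 8. Create calendar event
    "finish",                # 9. Finish episode
]


def run_demo(call_tool, max_steps=30):
    """Call each tool in order and return a log of (step, tool, result) tuples.

    `call_tool` is a caller-supplied function taking a tool name; it stands in
    for whatever environment interface the real demo uses.
    """
    log = []
    for step, tool in enumerate(DEMO_SEQUENCE, start=1):
        if step > max_steps:
            break  # respect the per-episode step budget
        log.append((step, tool, call_tool(tool)))
        if tool == "finish":
            break  # the episode ends once finish is called
    return log
```

With a stub such as `run_demo(lambda name: "ok")`, the log contains nine entries ending in `finish`.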
Tools
Calendar
📅 read_calendar
create_event
🗑️ delete_event
Email
📬 get_inbox
✉️ send_reply
Restaurant
🍽️ search_restaurants
📋 book_restaurant
System
👤 get_user_prefs
⚠️ detect_schema_change
🏁 finish
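The tool list above can be written down as a small registry, which makes it easy to check whether a proposed call names a real tool. This is a sketch under assumptions: `TOOLS`, `TOOL_TO_CATEGORY`, and `is_known_tool` are illustrative names, and the page does not say how the environment itself validates calls.

```python
# Tool names grouped by category, copied from the Tools panel.
TOOLS = {
    "Calendar": ["read_calendar", "create_event", "delete_event"],
    "Email": ["get_inbox", "send_reply"],
    "Restaurant": ["search_restaurants", "book_restaurant"],
    "System": ["get_user_prefs", "detect_schema_change", "finish"],
}

# Reverse lookup: tool name -> category.
TOOL_TO_CATEGORY = {
    name: category for category, names in TOOLS.items() for name in names
}


def is_known_tool(name):
    """Return True if `name` is one of the ten tools listed on the page."""
    return name in TOOL_TO_CATEGORY
```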
Reward Guide
Task completed: +0.40
Drift detected: +0.30
Pref aligned: +0.20
Efficient action: +0.10
Wrong tool: −0.10
Repeat call: −0.05
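A short sketch of how these values might combine into a per-step reward. The page lists only the individual amounts, so the assumption that they simply add up is mine, as are the event keys and the `step_reward` helper.

```python
# Reward amounts copied from the Reward Guide; the keys are illustrative names.
REWARDS = {
    "task_completed": 0.40,
    "drift_detected": 0.30,
    "pref_aligned": 0.20,
    "efficient_action": 0.10,
    "wrong_tool": -0.10,
    "repeat_call": -0.05,
}


def step_reward(events):
    """Sum the rewards for the events triggered by one action.

    Assumes the components are additive; rounding to two decimals only
    hides floating-point noise in the sum.
    """
    return round(sum(REWARDS[event] for event in events), 2)
```

For example, an action that completes a task in line with the user's preferences would score `step_reward(["task_completed", "pref_aligned"])`, while a repeated call to the wrong tool would be penalized by both negative terms.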
Model Info
Base: Llama 3.2 3B
Training: SFT → GRPO
SFT samples: 3,550
GRPO steps: 69
Format reward: 1.000
Env reward: 0.050
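For readers unfamiliar with GRPO, the core idea is that each sampled completion is scored relative to the other completions drawn for the same prompt. The sketch below shows that group-relative normalization in its common form. It is not taken from this project's training code: the function name, the `eps` value, and the use of the population standard deviation are assumptions, and the page does not say how the format and environment rewards were combined.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of sampled completions.

    Each advantage is (reward - group mean) / (group std + eps), so
    completions that beat their group average get a positive advantage.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use the sample std
    return [(r - mu) / (sigma + eps) for r in rewards]
```

When every completion in a group receives the same reward, all advantages are zero and that group contributes no learning signal.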