v0.2.1  ·  MIT LICENSE  ·  OPEN SOURCE
SELF-EVOLVING AI AGENT WRAPPER

Your AI agent, continuously learning.

EvoClaw wraps your model, scores every conversation with a reward model, injects skills in real time, and trains via cloud LoRA — automatically. No GPU. No data team. No downtime.

0 GPUs Needed
100% Async
5 min Setup
2 Learn Modes
MIT License

HOW IT WORKS

From conversation to trained model.

Five automated steps run in the background — no manual intervention, no restarts, no service interruption.

01
Intercept
Transparent OpenAI-compatible proxy captures every conversation turn with negligible added latency.
02
Score
A Process Reward Model scores each turn via a judge LLM. High-quality turns contribute more to training.
03
Inject Skills
Relevant skills are injected into the system prompt before each response. Immediate improvement, no retraining wait.
04
Cloud Training
Batch submits to Tinker cloud for LoRA fine-tuning. Runs remotely — zero GPU on your end.
05
Hot-Swap & Repeat
Updated weights swap into the live server. Zero downtime. The cycle continues automatically.
EVOCLAW RUNTIME LIVE
[PROXY] Turn intercepted from OpenClaw
[PRM] Scoring... reward = 0.84
[SKILLS] Injected: [coding, security]
[BUFFER] Batch 32/32 → sending to Tinker
[TRAIN] LoRA step 48 — loss: 0.0308
[SWAP] ✓ Weights updated. Agent upgraded.
▶ Open Full Interactive Demo →
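
Conceptually, the five steps compose into one non-blocking loop. The sketch below is a minimal illustration only — every function in it is a stub standing in for EvoClaw's internals, not its actual API.

SKETCH — EVOLVE LOOP (ILLUSTRATIVE)
import asyncio
import random

BATCH_SIZE = 4   # EvoClaw's default is 32; kept small for this demo
buffer = []      # scored turns waiting for the next training step

async def score_turn(turn):
  # 02 Score — stand-in for the PRM judge LLM call
  return round(random.random(), 2)

def retrieve_skills(turn):
  # 03 Inject — stand-in for retrieval from the skill bank
  return ["coding", "security"]

async def train_and_swap(batch):
  # 04 + 05 — stand-in for the Tinker LoRA job and the weight hot-swap
  await asyncio.sleep(0.1)  # pretend the cloud job takes time
  print(f"[TRAIN] LoRA step on {len(batch)} turns [SWAP] weights updated")

async def handle_turn(turn):
  # 01 Intercept — the proxy hands us each conversation turn
  turn["reward"] = await score_turn(turn)
  turn["skills"] = retrieve_skills(turn)
  buffer.append(turn)
  if len(buffer) >= BATCH_SIZE:
    batch, buffer[:] = list(buffer), []         # drain the buffer atomically
    asyncio.create_task(train_and_swap(batch))  # never blocks serving

async def main():
  for i in range(8):
    await handle_turn({"user": f"message {i}"})
  await asyncio.sleep(0.5)  # let background training finish

asyncio.run(main())

Because training runs as a fire-and-forget task, serving never waits on the cloud job — the "fully async" property the features below describe.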

FEATURES

Everything your agent needs to grow.

No data team. No fine-tuning pipeline. EvoClaw handles the entire learning loop in the background.

🎯
Real Usage Training

Learns from live conversations — no synthetic datasets, no offline retraining. Continuously improving from actual deployment.

NO SYNTHETIC DATA
💉
Skill Injection

Retrieves relevant skill instructions and injects them into the system prompt each turn. Instant improvement without waiting for retraining — see the sketch after this feature list.

INSTANT BOOST
🧬
Skill Evolution

When the agent fails, EvoClaw auto-generates a new skill from the failure trajectory using an LLM. Learns from its own mistakes.

SELF-IMPROVEMENT
☁️
No GPU Cluster

Training offloads to Tinker cloud. Any machine with network access runs the complete system — zero infrastructure overhead.

CLOUD-NATIVE
Fully Async

Serving, scoring, and training run as decoupled coroutines. Your agent responds in real time while learning happens in the background.

NON-BLOCKING
🔀
Dual Learning Modes

RL (GRPO) for implicit environment signals. On-Policy Distillation for richer language supervision. One config field to switch.

GRPO + OPD
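
To make skill injection and skill evolution concrete, here is a minimal sketch. The skill-bank format, the naive keyword-overlap retrieval, and the placeholder evolution step are all illustrative assumptions, not EvoClaw's actual implementation.

SKETCH — SKILL INJECTION & EVOLUTION (ILLUSTRATIVE)
SKILL_BANK = {
  "coding":   "Prefer small, tested changes. Cite the file paths you touch.",
  "security": "Never echo secrets. Flag injection risks in user input.",
}

def retrieve_skills(user_message, top_k=2):
  # Rank skills by naive keyword overlap with the incoming message
  words = set(user_message.lower().split())
  return sorted(
    SKILL_BANK,
    key=lambda s: len(words & set(SKILL_BANK[s].lower().split())),
    reverse=True,
  )[:top_k]

def build_system_prompt(base, user_message):
  # Inject retrieved skill instructions ahead of the base system prompt
  lines = [f"[skill:{s}] {SKILL_BANK[s]}" for s in retrieve_skills(user_message)]
  return "\n".join(lines) + "\n\n" + base

def evolve_skill(name, failure_trajectory):
  # Skill evolution — this placeholder string stands in for the LLM call
  # EvoClaw would make to distill a failure into a reusable instruction
  SKILL_BANK[name] = f"Avoid repeating this failure: {failure_trajectory}"

evolve_skill("date-math", "agent mis-parsed ISO week numbers")
print(build_system_prompt("You are a helpful agent.",
                          "check this input for secrets"))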


LEARNING MODES

Two ways your agent gets smarter.

EvoClaw supports both lightweight signal learning and rich natural-language supervision — choose what fits your setup.

MODE 01

Reinforcement Learning (GRPO)

Uses Group Relative Policy Optimization. The agent learns from implicit feedback — every scored conversation turn updates the policy automatically.

# Lightweight — works with any signal
from evoclaw import EvoClawConfig

config = EvoClawConfig(
  loss_fn="importance_sampling",
  use_prm=True,
)
GRPO · PPO · CISPO
MODE 02

On-Policy Distillation (OPD)

Leverages richer natural-language supervision from a teacher model. Best when you have access to a strong judge LLM for high-quality textual feedback.

# High quality — needs judge model
from evoclaw import EvoClawConfig

config = EvoClawConfig(
  use_prm=True,
  prm_model="gpt-5.2",
)
TEACHER MODEL · RICH FEEDBACK

SUPPORTED MODELS

Works with the models you already use.

EvoClaw is model-agnostic. Use Kimi-2.5 for maximum quality, Qwen3-4B for lightweight deployment, or any Groq/OpenAI-compatible endpoint.

🌙
RECOMMENDED
Kimi-2.5
~200B MoE

Best quality, long context, strong reasoning. Recommended for production.

moonshotai/Kimi-2.5
LIGHTWEIGHT
Qwen3-4B
4B params

Fast iteration, lower API costs. Great for development and constrained budgets.

Qwen/Qwen3-4B
🔌
COMPATIBLE
Any API
OpenAI-compatible

Groq, OpenAI, Anthropic, or any Tinker-supported endpoint. Plug and play.

llama · gpt · claude

CONFIGURATION

One config object. Full control.

All settings are passed as a single EvoClawConfig instance — no YAML files, no env sprawl.

FIELD                    DEFAULT                 DESCRIPTION
loss_fn                  "importance_sampling"   RL loss: importance_sampling / ppo / cispo
use_prm                  True                    Enable PRM reward scoring per turn
use_skills               False                   Inject skills into the system prompt
batch_size               32                      Turns before each training step
lora_rank                32                      LoRA rank; higher = more capacity
enable_skill_evolution   False                   Auto-generate skills from failures
proxy_port               8080                    Proxy listen port
VIEW FULL CONFIG REFERENCE →
FULL EXAMPLE
from evoclaw import EvoClawConfig

config = EvoClawConfig(
  model_name="moonshotai/Kimi-2.5",
  loss_fn="importance_sampling",
  use_prm=True,
  use_skills=True,
  enable_skill_evolution=True,
  batch_size=32,
  lora_rank=32,
)

# That's it — now, from your shell:
#   evoclaw start --config config
GET STARTED

Up and running in 5 minutes.

Tell us about your project and model stack — we'll send a personalized setup guide, an EvoClawConfig pre-filled for your scenario, and a starter skill bank curated for your use case.

📦
Personalized Setup Guide
Step-by-step walkthrough based on your model and use case, sent right after you submit.
⚙️
Ready-to-Use Config
EvoClawConfig pre-filled for your scenario — just add your API key.
🧬
Starter Skill Bank (25+ skills)
Curated skills for your domain — coding, security, research, or agentic workflows.
📊
Early Access Updates
First to know about new features and integrations. Unsubscribe anytime.

Get Your Free Setup Guide

Receive a personalized config + skill bank · No credit card · MIT licensed




LORA TRAINING

Cloud LoRA training — fully automatic.

Every conversation trains your model. EvoClaw batches turns, submits to Tinker cloud, and hot-swaps updated weights — all in the background. No GPU, no downtime, no manual steps.

☁️
No GPU Required

Training runs entirely on Tinker cloud. Any machine with network access can run the full pipeline.

CLOUD-NATIVE
🔄
Hot-Swap Weights

New LoRA weights replace old ones automatically after each step. Zero downtime, zero restarts.

ZERO DOWNTIME
🌙
Kimi-2.5

Same model as MetaClaw. ~200B MoE, best reasoning and long context. $4.40/M tokens on Tinker.

SAME AS METACLAW
Qwen3-4B Free

Lightweight alternative on Tinker free tier. Great for development and constrained budgets.

FREE TIER
📊
GRPO + OPD

Two learning modes — Reinforcement Learning (GRPO) or On-Policy Distillation. One config field.

DUAL MODE
💰
Pay Per Use

No subscription. Top up Tinker balance and pay only for actual training compute. Start from $5.

FROM $5
HOW TO USE LORA

Set up in 3 steps.

01
Get your Tinker API Key
Sign up at tinker-console.thinkingmachines.ai → API Keys → Create API Key. Copy the key starting with tm1-.... Top up at least $5 under Billing → Add to balance.
02
Run evoclaw init
Run evoclaw init in your terminal. Paste your Groq key (free at console.groq.com), then your Tinker key. Choose model 3 — Kimi-2.5 for best results, or model 1 — Qwen3-4B for the free tier.
03
Start proxy — LoRA trains automatically
Run evoclaw start. You will see Tinker: ✅ connected. Point your OpenAI client to http://localhost:8080/v1 and chat normally. Every 32 turns, EvoClaw auto-submits a LoRA job and hot-swaps the new weights.
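
For step 03, the only client-side change is the base URL. A minimal example with the openai Python SDK — the placeholder api_key is an assumption here, since the proxy is configured with your provider keys during evoclaw init:

EXAMPLE — CLIENT VIA THE PROXY
from openai import OpenAI

client = OpenAI(
  base_url="http://localhost:8080/v1",  # the EvoClaw proxy, not the provider
  api_key="not-needed",                 # assumption: the proxy holds your keys
)

response = client.chat.completions.create(
  model="moonshotai/Kimi-2.5",  # or Qwen/Qwen3-4B on the free tier
  messages=[{"role": "user", "content": "Summarize my last deploy log."}],
)
print(response.choices[0].message.content)
# This turn is intercepted and scored; every 32 turns EvoClaw auto-submits
# a LoRA job and hot-swaps the new weights.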
QUICK START

Running in 3 commands.

No GPU, no cluster, no data team. Install, configure, and start — EvoClaw handles the rest.

TERMINAL
# 1. Install EvoClaw
pip install evoclaw
 
# 2. Setup API keys (Groq = free)
evoclaw init
 
# 3. Start the proxy
evoclaw start
 
EvoClaw Proxy v0.2.1 — localhost:8080 — evolving!

BUILT ON OPEN SOURCE
🦀
OpenClaw
Core agent framework
⚙️
Tinker
Cloud LoRA training
🧠
MetaClaw
Original inspiration
📚
Awesome Skills
Skill bank foundation

READY TO START?

Your agent starts evolving today.

No GPU. No data team. No setup headache. Just plug EvoClaw in and watch your agent improve with every conversation.