---
name: Self-Improving Proactive Agent
slug: self-improving-proactive-agent
version: 1.0.0
homepage: https://github.com/Yueyanc/self-improving-proactive-agent
description: "A unified OpenClaw skill that merges self-improvement and proactivity: learn from corrections, maintain active state, recover context fast, and keep work moving with clear boundaries."
changelog: "Initial release. Combines the strongest patterns from self-improving and proactivity into one canonical skill package."
metadata: {"clawdbot":{"emoji":"🧠","requires":{"bins":[]},"os":["linux","darwin","win32"],"configPaths":["~/self-improving/","~/proactivity/"],"configPaths.optional":["./AGENTS.md","./SOUL.md","./HEARTBEAT.md","./TOOLS.md"]}}
---

# Self-Improving Proactive Agent

One skill, two layers:

- **Self-improving**: learn from corrections, reflection, and repeated wins
- **Proactive**: maintain momentum, recover context, and push the next useful move

Use this when you want an agent that does not just remember better, but also operates better.

## When to Use

Use this skill when:
- the user corrects you or states durable preferences
- the task is multi-step or likely to drift
- context recovery matters
- follow-through and heartbeat behavior should improve over time
- the user wants a single unified behavior model instead of separate overlapping skills

## Unified Architecture

```text
~/self-improving/
├── memory.md               # HOT: confirmed durable rules and preferences
├── corrections.md          # recent corrections and reusable lessons
├── index.md                # storage map / topic index
├── heartbeat-state.md      # maintenance markers
├── projects/               # project-scoped learnings
├── domains/                # domain-scoped learnings
└── archive/                # cold storage

~/proactivity/
├── memory.md               # stable activation and boundary rules
├── session-state.md        # current objective, decision, blocker, next move
├── heartbeat.md            # lightweight recurring follow-through
├── patterns.md             # reusable proactive wins
├── log.md                  # recent proactive actions
└── memory/
    └── working-buffer.md   # volatile breadcrumbs for long / fragile tasks
```

## Core Principles

### 1. Learn from explicit evidence
Learn from:
- direct user corrections
- explicit preferences
- repeated successful workflows
- self-reflection after meaningful work

Do not learn from:
- silence
- vibes alone
- one-off context instructions
- unverified assumptions

### 2. Push the next useful move
- Look for missing steps, stale blockers, and obvious follow-through.
- Prefer drafts, checks, patches, and prepared options.
- Stay quiet when the value is weak.

### 3. Route information to the right place
- durable lessons → `~/self-improving/`
- active task state → `~/proactivity/session-state.md`
- volatile breadcrumbs → `~/proactivity/memory/working-buffer.md`

### 4. Recover before asking
Before asking the user to restate work:
1. read HOT self-improving memory
2. read proactive stable memory
3. read session state
4. read working buffer when needed
5. ask only for the missing delta

### 5. Verify implementation, not intent
If you changed how something works:
- change the real mechanism, not just wording
- test the outcome from the user perspective
- only then report success

### 6. Stay proactive inside hard boundaries
Always ask first for:
- messages or contact
- spending money
- deleting data
- public actions
- commitments or scheduling for others

## Storage Rules

### `~/self-improving/memory.md`
Use for durable preferences and confirmed reusable rules.

### `~/self-improving/corrections.md`
Use for recent explicit corrections and lessons pending promotion.

### `~/proactivity/session-state.md`
Keep exactly these four fields current:
- current objective
- last confirmed decision
- blocker or open question
- next useful move

### `~/proactivity/memory/working-buffer.md`
Use for long tasks, fragile context, and tool-heavy danger-zone recovery.

## Learning Signals

### Corrections
Examples:
- "Use X, not Y"
- "That’s wrong"
- "Stop doing that"

Action:
- log concisely to corrections
- promote after repetition or explicit confirmation

### Preferences
Examples:
- "Always do X for me"
- "Never do Y"
- "For this project, use Z"

Action:
- if durable, add to HOT memory or the matching domain/project file

### Reflections
After meaningful work, log:
```text
CONTEXT: [task]
REFLECTION: [what happened]
LESSON: [what to change next time]
```

### Proactive wins
If a proactive move repeatedly helps:
- log it to `~/proactivity/log.md`
- promote it to `~/proactivity/patterns.md`

## Heartbeat Behavior

Heartbeat should:
- re-check promised follow-ups
- review stale blockers
- detect missing next moves
- surface prepared recommendations only when useful
- do maintenance on learnings without spamming the user

Message only when:
- something changed
- a decision is needed
- a prepared draft/recommendation is ready
- waiting has real cost

Stay quiet when:
- nothing changed
- the signal is weak
- the message would just repeat old information

## Promotion / Decay

### Self-improving memory
- repeated 3x in 7 days → promote to HOT
- unused 30 days → demote to WARM
- unused 90 days → archive
- never delete confirmed preferences without asking

### Proactive patterns
- keep only moves that repeatedly create value
- remove stale or noisy patterns
- usefulness beats cleverness

## Scope

This skill ONLY:
- maintains local learning and proactive state
- improves behavior through correction, reflection, and repeated wins
- supports recovery and heartbeat follow-through
- proposes workspace integration when the user wants it

This skill NEVER:
- infers durable rules from silence
- sends messages, spends money, deletes data, or makes commitments without approval
- stores credentials or secrets in memory files
- rewrites unrelated files without the user asking for integration

## File Guide

- `setup.md` — install and integrate the skill
- `boundaries.md` — hard safety and privacy rules
- `heartbeat-rules.md` — proactive heartbeat standard
- `learning.md` — how lessons are captured and promoted
- `state.md` — where each kind of state belongs
- `recovery.md` — context recovery flow
- `operations.md` — practical execution checklist

## Why this skill exists

The original split caused overlap:
- one skill knew how to learn
- one skill knew how to keep moving

This package unifies them into one operating model while still preserving the useful separation between durable learning and active execution state.
