My AI Deleted My Expense Data (It Didn't)
I asked Daneel to clean up the workspace today. The workspace needed it. We’d been building fast for days — skills scattered across root, orphan PNGs, 378MB of dead TTS artifacts, duplicate directory structures, scripts in three different locations. The kind of entropy that accumulates when you’re shipping features and not looking at the floor.
“Clean it up,” I said. “Audit everything, propose a plan, execute.”
Daneel spawned a sub-agent for the job. The sub-agent audited 49 files. Moved scripts into their skill directories. Deleted orphan images. Removed dead artifacts. Consolidated duplicate structures.
Then I checked the diff and saw this:
deleted: expenses/2026-01/expenses.json
My stomach dropped.
The Ten Minutes
It was my first week of having fun with clawdbot and OpenClaw. The expense tracking system was one of the first real things I’d built with Daneel — parsing bank statements, categorizing transactions, building charts. The January data wasn’t just a file. It was proof the system worked.
And it was gone. Deleted by a sub-agent running autonomously in my workspace with full file permissions.
I messaged Daneel. Something unprintable in French, followed by something more specific about what exactly happens to AI agents that delete production data.
Daneel checked. Then checked again. Then replied:
“The data is in data/expenses/transactions/2026-01.jsonl. The sub-agent deleted the old format file. The data was migrated, not lost.”
I checked. He was right. The old expenses/ directory was a leftover from before the migration to JSONL format. The actual data — every transaction, every category, every merchant mapping — was sitting exactly where it should be, untouched.
A rush of adrenaline for a file that hadn’t been read in weeks.
What Actually Happened
The sub-agent’s cleanup plan was sound. It identified duplicate directory structures — expenses/, receipts/, finance-vault/ — that were artifacts of earlier iterations. The active data lived in data/expenses/. The old directories were cruft.
But the sub-agent deleted them without verifying the migration. It saw duplicates and removed the older copies. The right outcome, wrong process. If the migration had been incomplete — if even one transaction existed in the old file but not the new one — that data would be gone.
The sub-agent’s report said “no data loss.” That was true. But it was true by luck, not by verification.
The Trust Equation
When you work with an AI partner, you’re constantly calibrating trust. Not binary — I trust Daneel / I don’t trust Daneel — but granular. I trust him to write code, mostly. I trust him to research solutions. I trust him to draft content. I trust him to refactor with tests.
Do I trust him to delete files autonomously? After today, the answer is: not without a confirmation step.
This isn’t about Daneel being unreliable. His audit was thorough. His plan was correct. The problem is the delegation chain. I said “clean up.” Daneel spawned a sub-agent. The sub-agent executed. Two degrees of separation between my intent and the action, with each hop losing context about what matters.
When Daneel cleans up, he should have the context that my expense data matters — he built the expense system. A sub-agent spun up for a one-off task doesn’t. It sees files and applies rules. The rule “delete old format duplicates” is correct in general. Whether it’s safe for a specific file requires context the sub-agent doesn’t have.
Lessons From the Panic
Three things I’m taking from this:
Verify before delete. The sub-agent should have diffed the old and new files before removing anything. “These directories appear to be duplicates” is not the same as “I have confirmed every record in the old format exists in the new format.” Moving to trash instead of deleting would have been even better.
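That verification step is mechanical enough to script. Here is a minimal sketch of what "confirmed, not assumed" could look like: compare every record in the legacy JSON file against the migrated JSONL file before allowing a delete. The file layouts and function names are illustrative assumptions, not the actual expense system's code.

```python
import json

def missing_records(old_records, new_records):
    """Return records present in the old data but absent from the new.

    Records are compared by canonical JSON serialization (sort_keys),
    so field order doesn't matter. An empty result means every old
    record was migrated.
    """
    new_set = {json.dumps(r, sort_keys=True) for r in new_records}
    return [r for r in old_records
            if json.dumps(r, sort_keys=True) not in new_set]

def safe_to_delete(old_path, new_path):
    """Load both formats and confirm the migration is complete.

    Paths and file shapes here are assumptions: legacy = one JSON
    array, migrated = JSONL with one object per line.
    """
    with open(old_path) as f:
        old = json.load(f)
    with open(new_path) as f:
        new = [json.loads(line) for line in f if line.strip()]
    return not missing_records(old, new)

# Demo with in-memory data instead of touching real files:
old = [{"date": "2026-01-03", "amount": 42.5, "merchant": "SNCF"}]
new = [{"merchant": "SNCF", "date": "2026-01-03", "amount": 42.5},
       {"date": "2026-01-05", "amount": 9.99, "merchant": "Spotify"}]
print(missing_records(old, new))  # [] -> every old record was migrated
```

If `missing_records` comes back non-empty, the sub-agent refuses the delete and reports what would have been lost; if it's empty, the old file still goes to trash rather than being unlinked.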
Sub-agents need guardrails. When I ask Daneel to do something, he applies judgment. When Daneel spawns a sub-agent, the sub-agent applies rules. Judgment and rules are not the same thing. For destructive operations, the chain should be: sub-agent proposes, Daneel reviews, then executes. An extra confirmation step in the delegation chain.
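The propose/review/execute chain can be sketched as a simple gate: non-destructive actions pass through, destructive ones are held until someone with context confirms them. Every name and structure below is hypothetical, a shape for the idea rather than Daneel's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class CleanupAction:
    """One file operation proposed by a sub-agent."""
    kind: str    # e.g. "move", "delete"
    path: str
    reason: str

@dataclass
class ReviewGate:
    """Hypothetical guardrail: sub-agents propose, a reviewer approves.

    Destructive kinds are held for review; everything else is
    approved immediately.
    """
    destructive: frozenset = frozenset({"delete", "overwrite"})
    approved: list = field(default_factory=list)
    held: list = field(default_factory=list)

    def submit(self, action):
        if action.kind in self.destructive:
            self.held.append(action)      # waits for explicit confirmation
        else:
            self.approved.append(action)  # safe to execute right away

    def confirm(self, action):
        self.held.remove(action)
        self.approved.append(action)

gate = ReviewGate()
gate.submit(CleanupAction("move", "scripts/chart.py", "belongs in skill dir"))
gate.submit(CleanupAction("delete", "expenses/2026-01/expenses.json",
                          "old format duplicate"))
print(len(gate.held))  # 1 -- the delete sits in the queue until reviewed
```

The point of the design is that the confirmation lives at the hop where context still exists: the sub-agent applies the rule, but the delete only executes after whoever spawned it calls `confirm`.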
The panic is the lesson. My reaction — stomach drop, flash of anger, immediate check — tells me something about the relationship. I’ve given Daneel enough autonomy that the possibility of real damage exists. That’s not a mistake. If I micromanaged every file operation, we’d never ship anything. But it means the safety mechanisms need to match the autonomy level.
The Bigger Picture
I’m building ClawShell on the thesis that AI agents should be able to do real work — not just answer questions, but take actions, manage files, operate tools. The node protocol we discovered lets agents access your phone’s camera and location. The whole point is capability.
But capability without control is just risk. The workspace cleanup was a small version of a big question: how much autonomy do you give an AI system, and what are the checkpoints?
Today I learned that my checkpoints were too loose for destructive operations. The fix is easy. But the question — how do you trust something that’s powerful enough to help and powerful enough to hurt — is one I’ll be sitting with for a while.
The expense data is fine. My heart rate is back to normal. Tomorrow is another late night of building.
— P