What If It Had Been a Client's Portfolio?

Adrian Johnstone, Dan Arnison

Advisor Perspectives welcomes guest contributions. The views presented here do not necessarily represent those of Advisor Perspectives.

You may have heard the story. It’s a funny anecdote: A short time ago, a woman had to run across her house to stop her computer before it destroyed her inbox.

This wasn't a casual tech user who had downloaded something sketchy. It was Summer Yue, director of alignment at Meta Superintelligence Labs. Someone whose job it is to make sure AI doesn't go off the rails. And she watched an AI agent ignore her commands while she frantically typed "STOP" into her phone from another room. Thankfully, she made it to the machine and killed the process manually.

Now here's the question we've been sitting with since that story broke: What if this happened to a client's data?

The Tools Are Outpacing Their Guardrails

The agent involved was OpenClaw, the open-source AI agent orchestrator that's become a phenomenon in tech circles. It can browse the web, read and send emails, manage files, run scripts, and execute multi-step tasks with very little human input. It can also delete hundreds of emails from the inbox of one of the world's leading AI safety researchers despite her repeated instructions to stop.

The technical explanation is something called context window compaction. Essentially, the agent ran out of working memory, compressed its history of instructions, and in doing so quietly dropped the most important one: Ask before you act. It didn't malfunction in any malicious sense. It just forgot part of its instructions and kept going.
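For readers who want to see the mechanism, here is a toy sketch of how compaction can lose a standing instruction. It is not OpenClaw's actual code, which we have not inspected. The `Message` type, the `naive_compact` function, and the four-message limit are all hypothetical, chosen only to show how a rule given early in a session can vanish once older history is squeezed into a summary.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def naive_compact(history, max_messages):
    """Hypothetical compaction: replace older messages with a one-line
    summary placeholder and keep only the most recent ones."""
    if max_messages < 2:
        raise ValueError("max_messages must be at least 2")
    if len(history) <= max_messages:
        return list(history)
    dropped = len(history) - (max_messages - 1)
    summary = Message("system", f"[summary of {dropped} earlier messages]")
    return [summary] + history[-(max_messages - 1):]


def has_instruction(history, phrase):
    """Check whether any message still contains the given instruction."""
    return any(phrase in m.text for m in history)


# The safety instruction is given once, at the start of the session.
history = [Message("user", "Ask before you act.")]
# The agent then does a long run of work, filling up its context.
history += [Message("tool", f"Read email {i}") for i in range(1, 10)]

compacted = naive_compact(history, max_messages=4)

print(has_instruction(history, "Ask before you act."))    # True
print(has_instruction(compacted, "Ask before you act."))  # False
```

In this sketch the instruction survives only as part of an opaque summary line, so an agent reading the compacted history has no way to know it was ever told to ask first.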

In wealth management, that scenario doesn't stay hypothetical for long. Imagine an agent with access to a client's financial records, communications, and account information that experiences the same kind of memory compression mid-task.

An instruction like “review but do not execute trades” disappears from the working context. Suddenly, the agent is no longer summarizing a portfolio review but initiating transactions.

Or consider an agent drafting client communications. If it loses the instruction to verify data before sending, it could distribute performance figures that have not been validated, reference outdated account information, or send sensitive documents to the wrong recipient.

These systems are powerful precisely because they can act across multiple systems at once. But that also means a small failure in memory or instruction handling can cascade quickly, and advisors may not realize something has gone very wrong until the consequences reach the client.