Building an OpenClaw alternative with Mastra
Over the last few weeks, I built tia-gateway as an OpenClaw-style runtime on top of Mastra. The goal was simple: create a practical assistant runtime that can think, call tools, keep memory, and stay reliable in real messaging channels.
This is the first write-up of what worked, what hurt, and what I would keep if I built it again.
1) Memory + the tool loop is the real core
The most important decision was treating memory as part of execution, not as an afterthought.
- Conversation runs are bound to memory thread/resource IDs so context stays stable per user/channel.
- Working memory gives the agent lightweight continuity for active goals and constraints.
- The tool loop becomes predictable when the memory scope is clear and each run has a strict step limit.
In practice, this made delegated tools and follow-up tasks feel much more coherent. Without this, every run feels like starting from zero.
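The binding described above can be sketched in TypeScript. All names here (deriveMemoryScope, runWithStepLimit, ToolStep) are illustrative, not the actual Mastra or tia-gateway API:

```typescript
// Memory scope bound per user/channel, plus a hard step cap on the tool loop.

interface MemoryScope {
  resourceId: string; // stable per user
  threadId: string;   // stable per user + channel conversation
}

// Stable IDs mean the same user in the same channel always resumes
// the same memory thread instead of "starting from zero".
function deriveMemoryScope(userId: string, channelId: string): MemoryScope {
  return {
    resourceId: `user:${userId}`,
    threadId: `thread:${channelId}:${userId}`,
  };
}

type ToolStep = () => { done: boolean };

// The run ends either when a step reports completion or when the
// step budget is exhausted — never by running away.
function runWithStepLimit(steps: ToolStep[], maxSteps: number): number {
  let used = 0;
  for (const step of steps) {
    if (used >= maxSteps) break;
    used++;
    if (step().done) break;
  }
  return used;
}
```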
2) Channel handling needs one strong abstraction
Instead of hard-coding behavior for each chat app, I pushed everything through a TiaChannel interface and a ChannelManager.
- Lark became a concrete adapter (receive/send/react).
- Message normalization (ChannelMessage) gave us one internal format.
- Channel-specific metadata still survives for policy checks like mention-only group handling.
That design also made Telegram planning easier. Even before a full Telegram adapter is done, the architecture is already prepared for it.
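A minimal reconstruction of that abstraction looks like the following. The exact fields of TiaChannel and ChannelMessage in tia-gateway may differ; this sketch only shows the shape of the single normalization point:

```typescript
// One shared message format, one registry, one ingest path.

interface ChannelMessage {
  channel: string;            // e.g. "lark", "telegram"
  conversationId: string;
  senderId: string;
  text: string;
  metadata: Record<string, unknown>; // channel-specific, kept for policy checks
}

interface TiaChannel {
  name: string;
  send(conversationId: string, text: string): Promise<void>;
  // Each adapter translates raw platform payloads into the shared format.
  normalize(raw: unknown): ChannelMessage;
}

class ChannelManager {
  private channels = new Map<string, TiaChannel>();

  register(channel: TiaChannel): void {
    this.channels.set(channel.name, channel);
  }

  // All inbound traffic funnels through one normalization point.
  ingest(channelName: string, raw: unknown): ChannelMessage {
    const channel = this.channels.get(channelName);
    if (!channel) throw new Error(`unknown channel: ${channelName}`);
    return channel.normalize(raw);
  }
}
```

Adding Telegram later then means writing one adapter, not touching the core.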
3) Onboarding UX matters more than we think
I used commander for CLI command structure and an interactive onboarding flow for setup. The result was much better than expecting users to hand-edit config files.
The key lesson: people should be able to run tia onboard, pick a provider and credentials, and move on. If setup is painful, runtime quality does not matter because users drop off before their first success.
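The CLI wiring goes through commander, but the part worth sketching is the fail-fast validation behind the onboarding flow. The provider list and OnboardConfig shape below are assumptions, not tia-gateway's real config schema:

```typescript
// Fail fast with a readable message instead of letting a bad config
// surface later as a confusing runtime error.

const SUPPORTED_PROVIDERS = ["openai", "anthropic"] as const; // assumed list
type Provider = (typeof SUPPORTED_PROVIDERS)[number];

interface OnboardConfig {
  provider: Provider;
  apiKey: string;
}

function validateOnboarding(input: { provider?: string; apiKey?: string }): OnboardConfig {
  const { provider, apiKey } = input;
  if (!provider || !(SUPPORTED_PROVIDERS as readonly string[]).includes(provider)) {
    throw new Error(`pick a provider: ${SUPPORTED_PROVIDERS.join(", ")}`);
  }
  if (!apiKey || apiKey.trim() === "") {
    throw new Error(`missing API key for ${provider}`);
  }
  return { provider: provider as Provider, apiKey: apiKey.trim() };
}
```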
4) Human-linked conversations need steering + interruption
Real users do not wait politely for long tool chains. They change direction, add urgent tasks, and expect the assistant to adapt.
So I added an interruption classifier with queue vs interrupt decisions, plus explicit resume commands. This solved two real problems:
- The assistant can switch immediately when urgency is clear.
- The previous task is not lost; it can be resumed intentionally.
This is one of the most important pieces for human-in-the-loop trust.
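The contract behind the queue-vs-interrupt decision can be illustrated with a toy keyword classifier; the real classifier is presumably smarter, and the marker list, TaskQueue, and method names here are all hypothetical:

```typescript
// Queue vs interrupt: urgent messages preempt, everything else waits,
// and a preempted task is parked so it can be resumed intentionally.

type Decision = "interrupt" | "queue";

interface PendingTask {
  id: string;
  prompt: string;
}

const URGENT_MARKERS = ["urgent", "now", "stop", "asap"]; // toy heuristic

function classifyInterruption(message: string): Decision {
  const lower = message.toLowerCase();
  return URGENT_MARKERS.some((m) => lower.includes(m)) ? "interrupt" : "queue";
}

class TaskQueue {
  private interrupted: PendingTask[] = [];
  private queued: PendingTask[] = [];

  handle(current: PendingTask | null, incoming: PendingTask): Decision {
    const decision = classifyInterruption(incoming.prompt);
    if (decision === "interrupt" && current) {
      // Park the current task instead of dropping it.
      this.interrupted.push(current);
    } else if (decision === "queue") {
      this.queued.push(incoming);
    }
    return decision;
  }

  // An explicit resume command brings back the most recently parked task.
  resume(): PendingTask | undefined {
    return this.interrupted.pop();
  }

  nextQueued(): PendingTask | undefined {
    return this.queued.shift();
  }
}
```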
5) Long-running tasks need guardrails (tmux by default)
Some tasks are naturally long-running: package installs, repo clones, Docker builds, external dependencies. Letting these block the main execution loop is risky.
I added guarded sandbox rules that require tmux for long-running/external commands. When tmux is missing, the runtime attempts installation first, then retries with the same policy.
That gave us safer async execution and fewer stuck sessions.
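The core of the guardrail is deciding which commands get wrapped and building the detached tmux invocation. The long-running heuristic and function names below are illustrative; `tmux new-session -d -s <name> <command>` is the real tmux syntax for starting a detached session:

```typescript
// Long-running commands run detached under tmux; short ones run inline.

const LONG_RUNNING = [/\bnpm install\b/, /\bgit clone\b/, /\bdocker build\b/]; // assumed heuristic

function isLongRunning(cmd: string): boolean {
  return LONG_RUNNING.some((re) => re.test(cmd));
}

// Returns the argv to execute. A detached tmux session lets the runtime
// poll or attach later instead of blocking the main loop.
function guardCommand(cmd: string, sessionName: string): string[] {
  if (!isLongRunning(cmd)) return ["sh", "-c", cmd];
  return ["tmux", "new-session", "-d", "-s", sessionName, cmd];
}
```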
6) Other practical insights from building our own alternative
- Streaming responses should be chunk-aware ([[BR]]) so channel delivery stays readable.
- You need a silent mode (CHANNEL_SILENT) for internal runs like heartbeat or maintenance tasks.
- Heartbeats are useful only when tightly scoped and easy to suppress.
- Tool-result narration in plain language dramatically improves user trust during long runs.
- Clear workspace boundaries prevent accidental unsafe file access.
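The workspace-boundary check from the last bullet is one of the simplest and highest-value guards. One way to implement it, using Node's path module (the helper name is illustrative):

```typescript
import * as path from "node:path";

// Resolve every requested path against the workspace root and reject
// anything that escapes it — including absolute paths and ".." tricks.
function isInsideWorkspace(workspaceRoot: string, requested: string): boolean {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, requested);
  // path.relative yields a ".." prefix (or an absolute path) when the
  // target lies outside the root.
  const rel = path.relative(root, target);
  return rel === "" || (!rel.startsWith("..") && !path.isAbsolute(rel));
}
```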
Building this from scratch made one thing very clear: the hard part is not "calling an LLM". The hard part is making a runtime that stays reliable when real people, real channels, and real long-running work collide.