Mulu Code
Private beta
Mulu Code · v1

The most powerful AI app builder, and the cheapest per task.

Every top frontier model, real-browser verification on every change, and the lowest cost per task we have ever shipped. Built for builders who want power without the babysitting.

From $20 / month · All top frontier models included

$0.04 avg cost / task
100% changes verified
4× cheaper / task vs Claude Code
1M context window

01 / Verification

Records the work. Proves it works.

Every change Mulu makes is tested in a real browser before you see it. It clicks through the change, watches the result, and saves a video of the run.

When the agent says it's done, there's footage. When something breaks, you have the replay.

02 / Cost

Cheapest per task.

Mulu reads your project once and remembers it. The agent picks the right files without being told, and ships the same change for a fraction of the tokens other tools spend.

Less re-reading. Less re-explaining. Lower bill every time you ship.

03 / Models

Every top frontier model. On US servers.

GPT, Claude, Gemini, Kimi, GLM, Grok, Qwen, DeepSeek, MiniMax, Nemotron, and Mulu's own. Switch mid-task. No separate keys. No separate bills. Nothing routed offshore.

Mulu Agent 1 · Fullstack · 256K
Claude Sonnet 4.6 · Anthropic · 1M
Claude Opus 4.6 · Anthropic · 1M
Claude Haiku 4.5 · Anthropic · 200K
GPT-5.4 · OpenAI · 1M
GPT-5.3 Codex · OpenAI · 400K
Gemini 3.1 Pro · Google · Deep Think
Gemini 3 Flash · Google · 1M
Grok 4.2 · xAI · 2M
Kimi K2.6 · Moonshot · 256K
MiniMax M2.7 · MiniMax · reasoning
Qwen 3.6 Plus · Alibaba · 1M
GLM-5.1 · Zhipu · 205K
DeepSeek V4 Pro · US-hosted
Nemotron 3 Super · NVIDIA · 1M

04 / Orchestrator

One model plans. Three execute.

Send a hard task to Opus. It writes the plan, splits the work, and dispatches it to three Kimi K2.6 workers running in parallel. You stop paying frontier prices for execution that a cheaper model can handle.

You write one prompt. The right model plans, the right models build, and you pay only for what each step actually needs.

05 / Team mode

A team of models, not just one.

Put Claude Opus, GPT-5.4, and Gemini 3.1 Pro on the same task. Each one takes the part it's strongest at. They share a working memory and call each other in when stuck.

You get the upside of every model, without picking just one.

06 / Modes

A mode for every job.

Mulu picks the workflow for you, or you pick it yourself.

Swarm

Hundreds of changes at once.

Spin up workers that each take a slice. Each runs in its own branch. Mulu merges the branches and reruns verification across the whole set.

Debug

Find it. Replay it. Fix it.

Mulu auto-traces every run. When something breaks, it replays the failure so the agent sees what you saw, then ships a verified fix.

Research

Send a team, get a brief.

For complex unknowns, dispatch a team of agents that reads docs, scans the repo, and returns a structured brief. You pick a direction.

Plan

Think first. Build second.

Generate a plan in plain English. Edit it. Approve it. Then ship the whole plan in one run, with verification on every step.

Ask

Just a question, no edits.

When you want an answer instead of a change. The agent reads your code, answers, and changes nothing.

Agent

Long-running tasks.

Hand off a big job and walk away. Mulu keeps working, sends a video when it's done, and flags anything ambiguous before it ships.

07 / Founder note

I built Mulu because every tool I tried lied to me about what it actually did.

The AI would tell me a feature was done. I'd check, and it wasn't. It would say tests passed. They hadn't. It would forget what we did yesterday by the time we picked it up today. The tools were powerful, but you could never trust the output, and trust is the whole point.

So Mulu does two things differently. It tests every change in a real browser and saves the video. And it remembers your project across sessions so it never starts from scratch. Power, plus proof, plus memory.

If that sounds like the tool you wish you had, get on the list. We open access in waves and you'll hear from us soon.

Josh, Founder of Mulu Code

Questions, answered.

Q.01

When does it launch?

Soon. We're inviting people in waves through the private beta and opening it up as the queue clears.

Q.02

How much does it cost?

From $20 a month. Every feature is included. No usage surprises. Per task, we are the cheapest builder we know of.

Q.03

Which models do you support?

Every top frontier model. Anthropic, OpenAI, Google, Moonshot, xAI, Alibaba, Zhipu, DeepSeek, NVIDIA, MiniMax, and Mulu's own. Switch mid-task, no separate keys.

Q.04

Where does my data live?

Local first. Cloud sync is opt-in. Every model call routes through US infrastructure.

Q.05

Do I need to know how to code?

No. Plain English works. Voice input works too. Code is hidden by default and only shown if you ask for it.

Ready when you are.