v1.5.9 latest stable

It sees. It acts. It verifies.
Desktop control for any AI agent

Your agent reads the screen as stable element ids, not pixels, verifies every action, and routes it through one safety gate. Local, any model.


Compound tools (recommended): 7

Granular tools (compat / debug): 98

Operating systems: Windows · macOS · Linux

Quick Start

Install in seconds.

Then run `clawdcursor consent --accept`. On macOS, also run `clawdcursor grant`. Done.

Pick a mode

How will your AI talk to it?

Same tools, two entry shapes. Pick once at install.

`clawdcursor mcp` (recommended)

AI lives in your editor. It spawns clawdcursor over stdio. No daemon, no port.

{
  "mcpServers": {
    "clawdcursor": {
      "command": "clawdcursor",
      "args": ["mcp", "--compact"]
    }
  }
}

Claude Code · Cursor · Windsurf · OpenClaw · Zed
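If your host already has an MCP config with other servers in it, the entry can be merged in rather than pasted over. A minimal sketch, assuming the host keeps its config in a JSON file with a top-level `mcpServers` key as in the snippet above. The helper name `add_clawdcursor` and the file path are hypothetical; check your host's documentation for the real location.

```python
import json
from pathlib import Path

# The server entry from the config snippet above.
CLAWDCURSOR_ENTRY = {"command": "clawdcursor", "args": ["mcp", "--compact"]}


def add_clawdcursor(config_path: Path) -> dict:
    """Merge the clawdcursor entry into an MCP host config, keeping other servers."""
    # Start from the existing config if there is one, else from an empty object.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["clawdcursor"] = CLAWDCURSOR_ENTRY
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Usage (hypothetical path):
#   add_clawdcursor(Path("mcp.json"))
```

Existing entries under `mcpServers` are left untouched; only the `clawdcursor` key is added or overwritten.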

Compact / Granular tools: 7 / 98

Transport: stdio

`clawdcursor agent` (HTTP daemon)

HTTP MCP on 127.0.0.1:3847/mcp. Run `clawdcursor doctor`, then `clawdcursor agent`, to start the built-in autonomous loop. `clawdcursor agent --no-llm` serves the tools only, for when your agent has its own brain.
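For a client that talks to the daemon directly, MCP messages are JSON-RPC 2.0 objects POSTed to the endpoint. A rough sketch that only builds the request and does not send it, assuming the daemon follows the standard MCP HTTP transport. A real client would normally complete the MCP `initialize` handshake before calling `tools/list`. The helper name `build_request` is made up for illustration.

```python
import json
import urllib.request

MCP_URL = "http://127.0.0.1:3847/mcp"  # endpoint quoted above


def build_request(method: str, params: dict, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 POST for the HTTP MCP endpoint."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # MCP servers may answer with plain JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# Ask the server which tools it exposes.
req = build_request("tools/list", {})
```

Passing `req` to `urllib.request.urlopen` would send it, provided the daemon is running.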

HTTP MCP: :3847

Providers: 13+

How it works

Cheap paths first.

A11y tree before pixels. Vision only when needed.

1 Compile the screen

No vision model

Fuse the a11y tree + OCR into one confidence-scored el_NN map. Act on an element by stable id. No image bytes to the model. Vision is last resort.

2 Escalate as needed

Cheapest rung that works

OCR when the tree is sparse. A screenshot only when you truly need pixels: canvas-only apps or spatial reasoning.
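The ladder can be pictured as a list of probes tried in cost order. This is an illustrative sketch of the idea, not clawdcursor's implementation; the probe functions, the `resolve` helper and the sample ids are stand-ins.

```python
from typing import Callable, Optional, Sequence, Tuple

# A rung is (name, probe); a probe returns a result, or None if it cannot answer.
Rung = Tuple[str, Callable[[str], Optional[str]]]


def resolve(query: str, rungs: Sequence[Rung]) -> Tuple[str, str]:
    """Try rungs in cost order and return (rung_name, result) from the first hit."""
    for name, probe in rungs:
        result = probe(query)
        if result is not None:
            return name, result
    raise LookupError(f"no rung could resolve {query!r}")


# Stand-in probes; real ones would query the a11y tree, OCR, or a screenshot.
def a11y_probe(query: str) -> Optional[str]:
    return "el_12" if query == "Save" else None


def ocr_probe(query: str) -> Optional[str]:
    return "el_40" if query == "Total" else None


def screenshot_probe(query: str) -> Optional[str]:
    return "pixels"  # most expensive rung, always able to answer


LADDER = [("a11y", a11y_probe), ("ocr", ocr_probe), ("screenshot", screenshot_probe)]
```

`resolve("Save", LADDER)` stops at the a11y rung, while a query that neither the tree nor OCR can answer falls through to the screenshot.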

3 Verify & gate

Reactive + one chokepoint

Pass expect and the action confirms its outcome, reporting a DEVIATION if the UI didn't obey. Every call routes through one safety layer.
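The act-then-verify pattern looks roughly like this. A sketch under assumptions: the `Outcome` type, the `act_and_verify` helper and the substring match on `expect` are invented for illustration, and the real matching rules may differ.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Outcome:
    ok: bool
    deviation: Optional[str] = None  # filled in when the UI did not obey


def act_and_verify(action: Callable[[], None], observe: Callable[[], str], expect: str) -> Outcome:
    """Run the action, re-read the UI, and compare it with the expectation."""
    action()
    seen = observe()
    if expect in seen:
        return Outcome(ok=True)
    return Outcome(ok=False, deviation=f"DEVIATION: expected {expect!r}, saw {seen!r}")


# A fake UI standing in for the desktop.
ui = {"title": "Untitled"}


def press_save() -> None:
    ui["title"] = "notes.txt saved"


good = act_and_verify(press_save, lambda: ui["title"], expect="saved")
bad = act_and_verify(lambda: None, lambda: ui["title"], expect="Print dialog")
```

Here `good.ok` is true, and `bad` carries a deviation message because the expected dialog never appeared.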

Features

Any OS. Any model.

💻

Cross-platform

Windows x64/ARM64 · macOS 12+ · Linux X11/Wayland

⌨️

Shortcuts engine

Platform-aware key combos. Cmd on macOS, Ctrl elsewhere. No LLM cost.
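A platform-aware combo can be resolved without any model call. A small sketch, assuming a `mod` placeholder that stands for the primary modifier; the function name `resolve_combo` is hypothetical.

```python
import sys


def resolve_combo(combo: str, platform: str = sys.platform) -> str:
    """Replace the 'mod' placeholder with Cmd on macOS and Ctrl elsewhere."""
    primary = "cmd" if platform == "darwin" else "ctrl"
    keys = combo.lower().split("+")
    return "+".join(primary if key == "mod" else key for key in keys)
```

With this, `resolve_combo("mod+s", "darwin")` gives `cmd+s`, and the same combo on `win32` or `linux` gives `ctrl+s`.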

📦

batch: one round-trip

Collapse N deterministic tool calls into a single guarded, safety-gated batch. N calls → 1.
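One way to picture a gated batch is to check every call before running any of them. This is a sketch only: the all-or-nothing policy, the `run_batch` helper and the example action names are assumptions, not the project's documented behavior.

```python
from typing import Callable, Dict, List

Call = Dict[str, object]


def run_batch(
    calls: List[Call],
    is_allowed: Callable[[Call], bool],
    execute: Callable[[Call], object],
) -> List[object]:
    """Check every call first, then run them in order as a single unit."""
    blocked = [call for call in calls if not is_allowed(call)]
    if blocked:
        # Nothing runs if any call fails the gate.
        raise PermissionError(f"batch rejected: {len(blocked)} call(s) blocked")
    return [execute(call) for call in calls]


# Stand-ins for the safety gate and the tool executor.
log: List[object] = []


def is_allowed(call: Call) -> bool:
    return call["action"] != "delete_everything"


def execute(call: Call) -> str:
    log.append(call["action"])
    return "ok"


results = run_batch(
    [{"action": "click"}, {"action": "type"}, {"action": "key"}],
    is_allowed,
    execute,
)
```

The caller makes one round-trip and gets one list of results back, in call order.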

Tools

7 compact tools + 98 granular

The 7 compounds are the recommended surface. The 98 granular tools cover compatibility and debugging.

Compound	Purpose	Actions
`computer`	Mouse, keyboard, screenshots. Raw I/O.	`screenshot` · `click` · `double_click` · `right_click` · `triple_click` · `hover` · `scroll` · `scroll_horizontal` · `drag` · `drag_path` · `type` · `key` · `wait`
`accessibility`	Drive UI by element name, not pixel. Survives DPI, resize, layout shifts.	`read_tree` · `find` · `get_element` · `focused` · `invoke` · `focus` · `set_value` · `get_value` · `expand` · `collapse` · `toggle` · `select` · `state` · `list_children` · `wait_for`
`window`	Launch, focus, resize. App-level state.	`list` · `active` · `focus` · `maximize` · `minimize` · `restore` · `close` · `resize` · `list_displays` · `screen_size` · `open_app` · `open_file` · `open_url` · `switch_tab` · `navigate`
`system`	Clipboard, OCR, shortcuts, undo, webview detection, CDP relaunch, task delegation. The meta surface for an external brain.	`clipboard_read` · `clipboard_write` · `system_time` · `ocr` · `undo` · `shortcuts_list` · `shortcuts_run` · `delegate` · `detect_webview` · `relaunch_with_cdp` · `system_prompt`
`browser`	Chrome DevTools Protocol: real DOM access for Electron / WebView2 apps whose a11y tree is sparse.	`connect` · `page_context` · `read_text` · `click` · `type` · `select_option` · `evaluate` · `wait_for` · `list_tabs` · `switch_tab` · `scroll`
`task`	Hand the whole task to the autonomous loop. Daemon mode only: needs `clawdcursor agent` with an LLM configured.	single arg: `{ instruction: string }`, no action enum

Compact (~1,500 tokens): `computer({ "action": "key", "combo": "mod+s" })`. Granular: `key_press({ "key": "mod+s" })`. Both hit the same `safety.evaluate()` chokepoint. Pass `--granular` for the granular surface. See `schema.snapshot.json` for every parameter.
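On the wire, either call would travel as an MCP `tools/call` request. A sketch of the two payloads, using the tool names and arguments quoted above; the `tool_call` helper is hypothetical and the request ids are arbitrary.

```python
import json


def tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Compact surface: one compound tool, the action chosen by an argument.
compact = tool_call("computer", {"action": "key", "combo": "mod+s"})

# Granular surface: one tool per action.
granular = tool_call("key_press", {"key": "mod+s"}, request_id=2)
```

The compact form keeps the tool list short and moves the choice of action into the arguments.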

CLI

Every command

For humans diagnosing an install. Agents connect via MCP.

# Install & setup
clawdcursor consent          # one-time desktop-control authorization (always required)
clawdcursor grant            # macOS only: Accessibility + Screen Recording prompts. MCP setup ends here.
clawdcursor doctor           # ONLY for `agent` mode: configures the daemon's built-in LLM (+ diagnostics)
clawdcursor status           # readiness check (consent, permissions, AI config)

# Run
clawdcursor mcp              # stdio MCP server for editor hosts
clawdcursor mcp --compact    # same, with 7 compound tools (recommended)
clawdcursor agent            # HTTP MCP daemon at :3847/mcp, optional built-in LLM
clawdcursor agent --no-llm   # tool surface only: your agent brings its own brain
clawdcursor stop             # stop every running mode
clawdcursor uninstall        # remove all clawdcursor config and data
