Aslan Browser
browser for AI agents

A browser for AI agents. macOS native. No Chrome. Returns an accessibility tree instead of the raw DOM.

~0.5ms JS eval · ~15ms screenshot · 0 Python deps · ~2.6k lines total

Why not just use Safari?

Safari has native automation. It doesn't work for agents.

User interaction
Safari: Glass pane. A transparent shield blocks all clicks while automation runs. You can't touch the window.
Aslan: No glass pane. You and the agent share the same browser. Click something it missed, type a password, hand it back.

Login state
Safari: Forced private mode. No cookies, no history, no local storage. Your agent logs in from scratch every time. 2FA every time.
Aslan: Persistent profile. Stays logged in between sessions. Same as a real user.

Speed
Safari / WebDriver: HTTP over TCP. Every command is a network request to a local server. 10–50ms per action.
Aslan: Unix domain socket, a file on disk acting as a pipe. OS-level IPC. ~0.5ms per action.

Page data
Safari / AppleScript: Raw HTML. Your LLM gets <div><span class="wrapper">...</span></div> soup. Expensive and hard to reason about.
Aslan: Accessibility tree baked in. A flat list of what's actually on the page. 10–100x fewer tokens.

Safari automation was built for testing websites. Aslan was built for operating them.

Architecture (top to bottom)

Python SDK / CLI: AslanBrowser(), AsyncAslanBrowser()
    ↓ Unix socket, NDJSON JSON-RPC
aslan-browser.app (macOS native · Swift 6.2)
    SocketServer: SwiftNIO · /tmp/aslan-browser.sock
    JSONRPCHandler: parse · validate · respond
    MethodRouter: maps method strings → actions
    TabManager: tabs · sessions · lifecycle
    Browser Tab (WKWebView): navigate, screenshot, a11y tree, JS eval, interact
    ScriptBridge: injected JS ↔ Swift
    ↓
The Web: any URL, full JS support
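The link between the SDK and the app is newline-delimited JSON-RPC over a Unix socket. The sketch below shows only the framing, with a connected socket pair standing in for the real app. The JSON-RPC 2.0 envelope, the method name "evaluate", and its params are assumptions made for illustration; none of them are taken from Aslan's actual protocol.

```python
import json
import socket


def encode_request(request_id: int, method: str, params: dict) -> bytes:
    """Serialize one JSON-RPC request as a single NDJSON line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def read_message(stream) -> dict:
    """Read one newline-terminated JSON message from a binary file-like stream."""
    line = stream.readline()
    if not line:
        raise ConnectionError("peer closed the connection")
    return json.loads(line)


# Stand-in for the real app: a connected socket pair instead of
# /tmp/aslan-browser.sock. "evaluate" is a hypothetical method name.
client, server = socket.socketpair()
client.sendall(encode_request(1, "evaluate", {"script": "return 1 + 1"}))

# Fake server side: read the request and answer with a canned result.
with server.makefile("rb") as server_in:
    request = read_message(server_in)
reply = {"jsonrpc": "2.0", "id": request["id"], "result": 2}
server.sendall((json.dumps(reply) + "\n").encode("utf-8"))

# Client side: read the matching response line.
with client.makefile("rb") as client_in:
    response = read_message(client_in)

client.close()
server.close()
print(response["result"])
```

One message per line means either side can parse with a plain readline, with no length prefix or HTTP framing.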

Accessibility Tree → browser.get_accessibility_tree()

Use @eN refs in click(), fill(), etc. — this is what you send to the LLM.
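One way to turn the tree into prompt text is sketched below. It assumes each node is a dict with the `ref`, `role`, and `name` keys used in the Quick Start example; `tree_to_prompt` is a hypothetical helper, not part of the SDK, and the sample nodes are hand-written rather than captured output.

```python
def tree_to_prompt(tree: list[dict]) -> str:
    """Render accessibility-tree nodes as one compact line per node."""
    lines = []
    for node in tree:
        # Assumed node shape: {"ref": "@e0", "role": "textbox", "name": "..."}
        name = node.get("name", "")
        lines.append(f'{node["ref"]} {node["role"]} "{name}"')
    return "\n".join(lines)


# Hand-written sample mirroring a login form; not real captured output.
sample_tree = [
    {"ref": "@e0", "role": "textbox", "name": "Username or email address"},
    {"ref": "@e1", "role": "textbox", "name": "Password"},
    {"ref": "@e2", "role": "button", "name": "Sign in"},
]

prompt = tree_to_prompt(sample_tree)
print(prompt)
```

With a live browser, the list returned by `browser.get_accessibility_tree()` would take the place of `sample_tree`.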

Aslan vs. Alternatives

| Capability | Aslan Browser | Playwright | Puppeteer | Selenium |
| --- | --- | --- | --- | --- |
| Rendering engine | WKWebView (macOS native) | Chromium / Firefox / WebKit | Chrome / Chromium | Any via WebDriver |
| Browser download needed | None (system WKWebView) | ~500MB Chromium | ~500MB Chrome | Full browser required |
| Page representation for AI | Accessibility tree (compact) | Full DOM | Full DOM | Full DOM |
| Token usage for LLM | 10–100× fewer tokens | Very high (raw DOM) | Very high (raw DOM) | Very high (raw DOM) |
| JS eval round-trip | ~0.5ms | 2–5ms | 2–5ms | 5–20ms |
| Screenshot latency | ~15ms | 50–150ms | 50–150ms | 200ms+ |
| Memory per tab | ~40MB | 80–150MB | 80–150MB | 150MB+ |
| Cold start | <500ms | 2–5s | 2–5s | 5–10s |
| Python dependencies | Zero (stdlib only) | playwright package | pyppeteer package | selenium + webdriver |
| Multi-tab & sessions | Built-in (tab_create, session_create) | Yes | Yes | Partial |
| Parallel batch ops | parallel_navigate / parallel_get_trees | Yes (async) | Yes (async) | Limited |
| AI agent skill | Built-in (Prompts-are-Code) | None | None | None |
| Platform | macOS 14+ only | Cross-platform | Cross-platform | Cross-platform |

Quick Start

1. Build or download the app

Build with Xcode, or grab a pre-built universal binary from GitHub Releases (arm64 + x86_64).

git clone https://github.com/onorbumbum/aslan-browser.git
cd aslan-browser
xcodebuild build -scheme aslan-browser -configuration Debug -derivedDataPath .build

2. Install the Python SDK

Zero external dependencies — only stdlib. Works with Python 3.10+.

pip install -e sdk/python   # from source (PyPI coming soon)

3. Start the browser

The app listens on /tmp/aslan-browser.sock. Use --hidden for headless production use.

.build/Build/Products/Debug/aslan-browser.app/Contents/MacOS/aslan-browser --hidden
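A script that launches the app may want to wait until the socket is actually accepting connections before sending commands. The sketch below polls the socket path with the standard library only; `wait_for_socket` and its timeout values are hypothetical and not part of the SDK.

```python
import os
import socket
import time

SOCKET_PATH = "/tmp/aslan-browser.sock"  # path the app listens on


def wait_for_socket(path: str = SOCKET_PATH, timeout: float = 5.0,
                    interval: float = 0.05) -> bool:
    """Return True once a Unix socket at `path` accepts a connection."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            try:
                probe.connect(path)
                return True
            except OSError:
                pass  # file exists but nothing is listening yet
            finally:
                probe.close()
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_socket()` right after starting the app, and only then opening `AslanBrowser()`, avoids racing the app's startup.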

4. Navigate → Read → Act

from aslan_browser import AslanBrowser

with AslanBrowser() as browser:
    # 1. Navigate and wait for the page to be ready
    browser.navigate("https://github.com/login", wait_until="idle")

    # 2. Read the page — compact accessibility tree, send to LLM
    tree = browser.get_accessibility_tree()
    for node in tree:
        print(node['ref'], node['role'], node['name'])
    # @e0  textbox  "Username or email address"
    # @e1  textbox  "Password"
    # @e2  button   "Sign in"

    # 3. Act using @eN refs the LLM selected
    browser.fill("@e0", "myusername")
    browser.fill("@e1", "mypassword")
    browser.click("@e2")

    # 4. Screenshot for vision models (JPEG bytes, ~15ms)
    browser.save_screenshot("result.jpg")
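Because the refs come back from an LLM, it can help to check that the chosen ref exists in the tree you just read before acting on it. A minimal sketch, assuming the node shape shown above; `require_ref` is a hypothetical helper, not an SDK function.

```python
def require_ref(tree: list[dict], ref: str) -> str:
    """Return `ref` unchanged if it appears in `tree`, else raise ValueError."""
    known = {node["ref"] for node in tree}
    if ref not in known:
        raise ValueError(f"{ref} is not in the current tree; re-read the tree first")
    return ref


# Hand-written sample tree; with a live browser this would be the list
# returned by browser.get_accessibility_tree().
current_tree = [
    {"ref": "@e0", "role": "textbox", "name": "Username or email address"},
    {"ref": "@e1", "role": "textbox", "name": "Password"},
    {"ref": "@e2", "role": "button", "name": "Sign in"},
]

checked = require_ref(current_tree, "@e2")
print(checked)
```

In the flow above, that becomes `browser.click(require_ref(tree, llm_choice))`, where `llm_choice` is whatever ref the model returned.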

5. Install the AI agent skill (optional)

Symlink the bundled skill into your agent's skill directory. One source of truth — git pull updates it everywhere.

# For pi / ASLAN
ln -s /path/to/aslan-browser/skills/aslan-browser ~/.pi/agent/skills/aslan-browser

# For Claude Code
ln -s /path/to/aslan-browser/skills/aslan-browser ~/.claude/skills/aslan-browser

The skill loop

SETUP: read SDK_REFERENCE.md, core.md, user.md; check sites/ & playbooks/. aslan status must print "Connected".
ORIENT: aslan tabs, aslan url, aslan title. Where am I? What tab is active?
ACT: aslan nav, aslan click, aslan fill, aslan key. ONE action only.
READ: aslan tree, aslan text, aslan title. What happened? Did it work?
DECIDE: Next action? → loop back to Act (repeat until done). Task done? → compile.
COMPILE: route every discovery:
    core.md: universal CLI/browser rules
    sites/{domain}.md: site-specific selectors
    playbooks/: repeatable task recipe
    user.md: user preferences
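The Act → Read → Decide cycle can be sketched as a small driver around the CLI. This is an illustration only: `agent_loop`, `run_cli`, and the `decide` callback are hypothetical names, and the only CLI subcommands used are `status`, `tree`, and whatever single action `decide` returns.

```python
import subprocess
from typing import Callable, Optional


def run_cli(args: list[str]) -> str:
    """Run one `aslan` CLI command and return its stdout."""
    result = subprocess.run(["aslan", *args], capture_output=True, text=True, check=True)
    return result.stdout


def agent_loop(decide: Callable[[str], Optional[list[str]]],
               run: Callable[[list[str]], str] = run_cli,
               max_steps: int = 20) -> int:
    """Act one step at a time, re-reading the tree after every action.

    `decide` receives the latest tree text and returns the next CLI
    arguments (e.g. ["click", "@e2"]), or None when the task is done.
    Returns the number of actions performed.
    """
    # Setup: refuse to continue unless the CLI reports a connection.
    if "Connected" not in run(["status"]):
        raise RuntimeError("aslan status did not report Connected")

    tree = run(["tree"])  # initial read
    for step in range(max_steps):
        action = decide(tree)   # Decide
        if action is None:
            return step
        run(action)             # Act: exactly one action
        tree = run(["tree"])    # Read: fresh refs for the next decision
    return max_steps
```

Passing `run` as a parameter keeps the loop testable without a running browser; in practice `decide` would wrap the LLM call.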
skills/aslan-browser/
├── SKILL.md              protocol + rules, loaded every invocation
├── SDK_REFERENCE.md      full CLI reference, loaded every session
├── knowledge/
│   ├── core.md           universal rules, loaded every session
│   ├── user.md           user prefs, gitignored
│   ├── sites/            loaded when visiting that domain
│   │   ├── linkedin.com.md
│   │   ├── instagram.com.md
│   │   ├── facebook.com.md
│   │   ├── business.google.com.md
│   │   ├── hubspot.com.md
│   │   ├── app.hubspot.com.md
│   │   ├── google.com.md
│   │   └── openrouter.ai.md
│   └── playbooks/        step-by-step task recipes
│       ├── linkedin-create-post.md
│       ├── instagram-create-post.md
│       └── gmb-create-post.md
└── learnings/            compiled after sessions

| Group | Command | Does | Note |
| --- | --- | --- | --- |
| Navigate | aslan nav <url> --wait idle | Navigate. Wait for network idle. | Use idle for SPAs, load for static |
| Navigate | aslan back / forward / reload | History navigation | |
| Read | aslan tree | Accessibility tree, one line per node | Use @eN refs to act on elements |
| Read | aslan text [--chars 3000] | Page innerText | |
| Read | aslan title / aslan url | Page title / current URL | |
| Read | aslan eval "return ..." | Run JS, return result | Must include return |
| Interact | aslan click @eN | Click by ref or CSS selector | Prefer @eN refs |
| Interact | aslan fill @eN <value> | Fill input or textarea | Fails on contenteditable |
| Interact | aslan type @eN <value> | Type text | Works on contenteditable too |
| Interact | aslan key Enter [--meta] | Keypress with optional modifiers | |
| Interact | aslan scroll --down 500 | Scroll page | --to @eN scrolls element into view |
| Wait / Upload / Screenshot | aslan wait --idle | Wait for network idle + DOM stable | After click that triggers navigation |
| Wait / Upload / Screenshot | aslan upload <file> | Inject file via DataTransfer API | Click upload button first |
| Wait / Upload / Screenshot | aslan shot [path] | Screenshot to file | Default: /tmp/aslan-screenshot.jpg |
| Tabs | aslan tabs | List tabs (* = current) | State in /tmp/aslan-cli.json |
| Tabs | aslan tab:new [url] | Open new tab, switch to it | |
| Tabs | aslan tab:use <id> | Switch active tab | |
| Tabs | aslan status | Connection info | Must print "Connected" |

Don't: write Python SDK boilerplate
    AslanBrowser()
    with browser: browser.navigate(...)
Do: use the CLI
    aslan nav https://example.com
    aslan tree
    aslan click @e2

Don't: pre-plan multi-step scripts
    aslan nav ...; aslan click @e3; aslan fill @e5 "x"; aslan key Enter
Do: act one step, read, decide next
    aslan nav ...
    aslan tree # read what loaded
    aslan click @e3 # act on what you see
    aslan tree # read the result

Don't: omit return in eval
    aslan eval "document.title"
Do: include return
    aslan eval "return document.title"

Don't: use fill on contenteditable (LinkedIn, Facebook, Notion)
    aslan fill @e5 "hello world"
Do: use type instead
    aslan type @e5 "hello world"

Don't: navigate to http:// URLs; App Transport Security (ATS) blocks http:// at the OS level
    aslan nav http://example.com
Do: use https://
    aslan nav https://example.com

Don't: reuse @eN refs from a previous tree call
    aslan tree
    # ... do stuff ...
    aslan tree
    aslan click @e2 # ← @e2 remembered from the first tree; it may now be a different element
Do: always read tree immediately before acting
    aslan tree
    aslan click @e2 # act on refs from THIS tree

Don't: skip knowledge compilation after a task
Do: ask "Did I learn anything?" and route it to core.md, sites/, playbooks/, or user.md
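The routing step can be pictured as a small lookup from the kind of discovery to a file under knowledge/. The sketch below is illustrative only: `route_discovery` and the kind labels ("universal", "site", "playbook", "user") are hypothetical, while the file layout follows the skill directory shown earlier.

```python
from pathlib import PurePosixPath

KNOWLEDGE_ROOT = PurePosixPath("skills/aslan-browser/knowledge")


def route_discovery(kind: str, key: str = "") -> PurePosixPath:
    """Map a discovery to the knowledge file it should be written to.

    kind: "universal", "site", "playbook", or "user" (hypothetical labels).
    key:  domain for "site" (e.g. "linkedin.com"), recipe name for "playbook".
    """
    if kind == "universal":
        return KNOWLEDGE_ROOT / "core.md"
    if kind == "user":
        return KNOWLEDGE_ROOT / "user.md"
    if kind in ("site", "playbook"):
        if not key:
            raise ValueError(f"{kind} discoveries need a key")
        folder = "sites" if kind == "site" else "playbooks"
        return KNOWLEDGE_ROOT / folder / f"{key}.md"
    raise ValueError(f"unknown discovery kind: {kind}")


print(route_discovery("site", "linkedin.com"))
```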