Aslan Browser

A browser for AI agents. macOS native. No Chrome. Returns an accessibility tree instead of the raw DOM.

~0.5ms JS eval
~15ms screenshot
0 Python deps
~2.6k lines total

Why not just use Safari?

Safari has native automation. It doesn't work for agents.

User interaction
Safari: Glass pane. A transparent shield blocks all clicks while automation runs. You can't touch the window.
Aslan: No glass pane. You and the agent share the same browser. Click something it missed, type a password, hand it back.

Login state
Safari: Forced private mode. No cookies, no history, no local storage. Your agent logs in from scratch every time. 2FA every time.
Aslan: Persistent profile. Stays logged in between sessions. Same as a real user.

Speed
Safari / WebDriver: HTTP over TCP. Every command is a network request to a local server. 10–50ms per action.
Aslan: Unix domain socket, a file on disk acting as a pipe. OS-level IPC. ~0.5ms per action.

Page data
Safari / AppleScript: Raw HTML. Your LLM gets <div><span class="wrapper">...</span></div> soup. Expensive and hard to reason about.
Aslan: Accessibility tree baked in. A flat list of what's actually on the page. 10–100x fewer tokens.

Safari automation was built for testing websites. Aslan was built for operating them.
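To make the speed comparison concrete, here is a minimal sketch of request/response IPC over a Unix domain socket, using only the Python stdlib. It does not talk to Aslan: the echo server, the newline-delimited JSON framing, and the names serve_once, measure_round_trip, and demo are all assumptions made for illustration, since this page does not describe Aslan's wire protocol.

```python
import json
import os
import socket
import tempfile
import threading
import time


def serve_once(server: socket.socket) -> None:
    """Accept one client and echo each newline-delimited JSON request back."""
    conn, _ = server.accept()
    with conn, conn.makefile("rb") as reader:
        for line in reader:
            request = json.loads(line)
            # Hypothetical reply shape; not Aslan's actual protocol.
            reply = {"id": request["id"], "result": request["params"]}
            conn.sendall(json.dumps(reply).encode() + b"\n")


def measure_round_trip(sock_path: str, calls: int = 200) -> float:
    """Return the mean round-trip time in milliseconds over `calls` requests."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        with client.makefile("rb") as reader:
            start = time.perf_counter()
            for i in range(calls):
                message = json.dumps({"id": i, "params": "ping"}).encode() + b"\n"
                client.sendall(message)
                reply = json.loads(reader.readline())
                if reply["id"] != i:
                    raise RuntimeError("reply arrived out of order")
            elapsed = time.perf_counter() - start
    return elapsed / calls * 1000.0


def demo() -> float:
    """Start a throwaway echo server on a temp socket and time it."""
    with tempfile.TemporaryDirectory() as tmp:
        sock_path = os.path.join(tmp, "demo.sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(sock_path)
            server.listen(1)
            worker = threading.Thread(target=serve_once, args=(server,), daemon=True)
            worker.start()
            mean_ms = measure_round_trip(sock_path)
            worker.join(timeout=5)
    return mean_ms


if __name__ == "__main__":
    print(f"mean round trip: {demo():.3f} ms")
```

The number this prints depends on the machine and measures only the socket hop plus JSON encoding, so treat it as a rough feel for IPC overhead rather than a benchmark of Aslan itself.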


Aslan vs. Alternatives

Capability	Aslan Browser	Playwright	Puppeteer	Selenium
Rendering engine	WKWebView (macOS native)	Chromium / Firefox / WebKit	Chrome / Chromium	Any via WebDriver
Browser download needed	None — system WKWebView	~500MB Chromium	~500MB Chrome	Full browser required
Page representation for AI	Accessibility tree (compact)	Full DOM	Full DOM	Full DOM
Token usage for LLM	10–100× fewer tokens	Very high (raw DOM)	Very high (raw DOM)	Very high (raw DOM)
JS eval round-trip	~0.5ms	2–5ms	2–5ms	5–20ms
Screenshot latency	~15ms	50–150ms	50–150ms	200ms+
Memory per tab	~40MB	80–150MB	80–150MB	150MB+
Cold start	<500ms	2–5s	2–5s	5–10s
Python dependencies	Zero (stdlib only)	playwright package	pyppeteer package	selenium + webdriver
Multi-tab & sessions	Built-in (tab_create, session_create)	Yes	Yes	Partial
Parallel batch ops	parallel_navigate / parallel_get_trees	Yes (async)	Yes (async)	Limited
AI agent skill	Built-in (Prompts-are-Code)	None	None	None
Platform	macOS 14+ only	Cross-platform	Cross-platform	Cross-platform

Quick Start

1. Build or download the app

Build with Xcode, or grab a pre-built universal binary from GitHub Releases (arm64 + x86_64).

git clone https://github.com/onorbumbum/aslan-browser.git
cd aslan-browser
xcodebuild build -scheme aslan-browser -configuration Debug -derivedDataPath .build

2. Install the Python SDK

Zero external dependencies — only stdlib. Works with Python 3.10+.

pip install -e sdk/python   # from source (PyPI coming soon)

3. Start the browser

The app listens on /tmp/aslan-browser.sock. Use --hidden for headless production use.

.build/Build/Products/Debug/aslan-browser.app/Contents/MacOS/aslan-browser --hidden
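A script that launches the app and connects right away may race the socket's creation. The following is a small sketch of a readiness check that polls for the socket file at the path given above. The helper name wait_for_socket and its timeout defaults are assumptions for illustration, not part of the Aslan SDK.

```python
import os
import stat
import time


def wait_for_socket(
    path: str = "/tmp/aslan-browser.sock",
    timeout: float = 10.0,
    poll_interval: float = 0.05,
) -> bool:
    """Poll until `path` exists and is a Unix socket, or until `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if stat.S_ISSOCK(os.stat(path).st_mode):
                return True
        except FileNotFoundError:
            pass  # not created yet; keep polling
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


if __name__ == "__main__":
    print("socket ready" if wait_for_socket() else "socket not found")
```

A socket file can outlive a crashed process, so a True result only shows that the file is there. The first real command is still the definitive check.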

4. Navigate → Read → Act

from aslan_browser import AslanBrowser

with AslanBrowser() as browser:
    # 1. Navigate and wait for the page to be ready
    browser.navigate("https://github.com/login", wait_until="idle")

    # 2. Read the page — compact accessibility tree, send to LLM
    tree = browser.get_accessibility_tree()
    for node in tree:
        print(node['ref'], node['role'], node['name'])
    # @e0  textbox  "Username or email address"
    # @e1  textbox  "Password"
    # @e2  button   "Sign in"

    # 3. Act using @eN refs the LLM selected
    browser.fill("@e0", "myusername")
    browser.fill("@e1", "mypassword")
    browser.click("@e2")

    # 4. Screenshot for vision models (JPEG bytes, ~15ms)
    browser.save_screenshot("result.jpg")
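Before the tree goes to an LLM it usually needs to become plain text. Here is a minimal sketch of that step, assuming each node is a dict with the 'ref', 'role', and 'name' keys used in the example above. The helper names tree_to_prompt and find_ref are hypothetical, and real nodes may carry other keys that this sketch ignores.

```python
from typing import Iterable, Mapping, Optional


def tree_to_prompt(nodes: Iterable[Mapping[str, str]]) -> str:
    """Render accessibility-tree nodes as one compact line per node."""
    lines = []
    for node in nodes:
        name = node.get("name") or ""
        lines.append(f'{node["ref"]} {node["role"]} "{name}"')
    return "\n".join(lines)


def find_ref(
    nodes: Iterable[Mapping[str, str]], role: str, name_contains: str
) -> Optional[str]:
    """Return the ref of the first node with this role whose name contains the text."""
    needle = name_contains.lower()
    for node in nodes:
        if node["role"] == role and needle in (node.get("name") or "").lower():
            return node["ref"]
    return None


if __name__ == "__main__":
    # Sample data mirroring the login-page output shown above.
    sample = [
        {"ref": "@e0", "role": "textbox", "name": "Username or email address"},
        {"ref": "@e1", "role": "textbox", "name": "Password"},
        {"ref": "@e2", "role": "button", "name": "Sign in"},
    ]
    print(tree_to_prompt(sample))
    print(find_ref(sample, "button", "sign in"))  # prints @e2
```

find_ref is a deterministic fallback for cases where no model is involved, such as smoke tests. In an agent loop the LLM would pick the ref from the rendered text instead.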

5. Install the AI agent skill (optional)

Symlink the bundled skill into your agent's skill directory. One source of truth — git pull updates it everywhere.

# For pi / ASLAN
ln -s /path/to/aslan-browser/skills/aslan-browser ~/.pi/agent/skills/aslan-browser

# For Claude Code
ln -s /path/to/aslan-browser/skills/aslan-browser ~/.claude/skills/aslan-browser

skills/aslan-browser/
├── SKILL.md              protocol + rules (loaded every invocation)
├── SDK_REFERENCE.md      full CLI reference (loaded every session)
├── knowledge/
│   ├── core.md           universal rules (loaded every session)
│   ├── user.md           user prefs (gitignored)
│   ├── sites/            loaded when visiting that domain
│   │   ├── linkedin.com.md
│   │   ├── instagram.com.md
│   │   ├── facebook.com.md
│   │   ├── business.google.com.md
│   │   ├── hubspot.com.md
│   │   ├── app.hubspot.com.md
│   │   ├── google.com.md
│   │   └── openrouter.ai.md
│   └── playbooks/        step-by-step task recipes
│       ├── linkedin-create-post.md
│       ├── instagram-create-post.md
│       └── gmb-create-post.md
└── learnings/            compiled after sessions
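The sites/ directory is keyed by hostname, so one way to picture "loaded when visiting that domain" is a lookup from URL to candidate files. The sketch below assumes a most-specific-first suffix match (app.hubspot.com.md before hubspot.com.md). That matching rule and the helper names are guesses for illustration; this page does not specify how the skill actually selects files.

```python
from pathlib import Path
from urllib.parse import urlsplit


def candidate_site_files(url: str, skill_root: Path) -> list[Path]:
    """List knowledge/sites/*.md paths that could apply to `url`, most specific first."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".") if host else []
    sites_dir = skill_root / "knowledge" / "sites"
    # Walk from the full hostname toward the registrable domain,
    # stopping before the bare top-level label (e.g. "com").
    return [sites_dir / (".".join(labels[i:]) + ".md") for i in range(len(labels) - 1)]


def existing_site_files(url: str, skill_root: Path) -> list[Path]:
    """Keep only the candidates that are actually present on disk."""
    return [p for p in candidate_site_files(url, skill_root) if p.is_file()]


if __name__ == "__main__":
    root = Path("skills/aslan-browser")
    for path in candidate_site_files("https://app.hubspot.com/contacts", root):
        print(path)
```

For https://app.hubspot.com/contacts this yields app.hubspot.com.md followed by hubspot.com.md, both of which appear in the tree above.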

Command	Does	Note
Navigate
aslan nav <url> --wait idle	Navigate. Wait for network idle.	Use idle for SPAs, load for static
aslan back / forward / reload	History navigation
Read
aslan tree	Accessibility tree — one line per node	Use @eN refs to act on elements
aslan text [--chars 3000]	Page innerText
aslan title / aslan url	Page title / current URL
aslan eval "return ..."	Run JS, return result	Must include return
Interact
aslan click @eN	Click by ref or CSS selector	Prefer @eN refs
aslan fill @eN <value>	Fill input or textarea	Fails on contenteditable
aslan type @eN <value>	Type text	Works on contenteditable too
aslan key Enter [--meta]	Keypress with optional modifiers
aslan scroll --down 500	Scroll page	--to @eN scrolls element into view
Wait / Upload / Screenshot
aslan wait --idle	Wait for network idle + DOM stable	After click that triggers navigation
aslan upload <file>	Inject file via DataTransfer API	Click upload button first
aslan shot [path]	Screenshot to file	Default: /tmp/aslan-screenshot.jpg
Tabs
aslan tabs	List tabs (* = current)	State in /tmp/aslan-cli.json
aslan tab:new [url]	Open new tab, switch to it
aslan tab:use <id>	Switch active tab
aslan status	Connection info	Must print "Connected"

Don't

Write Python SDK boilerplate

AslanBrowser()
with browser: browser.navigate(...)

Do

Use the CLI

aslan nav https://example.com
aslan tree
aslan click @e2

Don't

Pre-plan multi-step scripts

aslan nav ...; aslan click @e3; aslan fill @e5 "x"; aslan key Enter

Do

Act one step, read, decide next

aslan nav ...
aslan tree # read what loaded
aslan click @e3 # act on what you see
aslan tree # read the result

Don't

aslan eval "document.title"

Do

aslan eval "return document.title"

Don't

Use fill on contenteditable (LinkedIn, Facebook, Notion)

aslan fill @e5 "hello world"

Do

Use type instead

aslan type @e5 "hello world"

Don't

aslan nav http://example.com

App Transport Security (ATS) blocks http:// at the OS level

Do

aslan nav https://example.com
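When URLs come from an LLM or from scraped links, a guard in front of navigation can catch http:// before it reaches the browser. This is a minimal sketch using the stdlib. The helper name require_https is hypothetical, and upgrading the scheme assumes the site also serves HTTPS at the same host and path, which is not guaranteed.

```python
from urllib.parse import urlsplit, urlunsplit


def require_https(url: str) -> str:
    """Return `url` with an https scheme, upgrading http and rejecting anything else."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return url
    if parts.scheme == "http":
        # Assumption: the same host and path are reachable over HTTPS.
        return urlunsplit(parts._replace(scheme="https"))
    raise ValueError(f"unsupported or missing URL scheme: {url!r}")


if __name__ == "__main__":
    print(require_https("http://example.com"))  # prints https://example.com
```

The result can then be passed to aslan nav or browser.navigate(...). Raising on other schemes keeps bare hostnames and typos from being navigated silently.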

Don't

Reuse @eN refs from a previous tree call

aslan tree
# ... do stuff ...
aslan tree
aslan click @e2 # ← @e2 may now be a different element

Do

Always read tree immediately before acting

aslan tree
aslan click @e2 # act on refs from THIS tree

Don't

Skip knowledge compilation after a task

Do

Ask: "Did I learn anything?" — route it to core.md, sites/, playbooks/, or user.md
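Two of the rules above, act one step at a time and never reuse refs from an older tree, can be sketched as a single loop for a harness that drives the CLI from a script. Only the aslan tree and aslan click commands come from this page. The step_loop and run_aslan names and the decide callback (standing in for the LLM) are hypothetical.

```python
import subprocess
from typing import Callable, Optional, Sequence

Runner = Callable[[Sequence[str]], str]
Decider = Callable[[str], Optional[Sequence[str]]]


def run_aslan(args: Sequence[str]) -> str:
    """Run one `aslan` CLI command and return its stdout."""
    done = subprocess.run(["aslan", *args], capture_output=True, text=True, check=True)
    return done.stdout


def step_loop(decide: Decider, run: Runner = run_aslan, max_steps: int = 10) -> int:
    """Read the tree, let `decide` pick one action, run it, repeat.

    Returns the number of actions executed. `decide` receives the fresh
    tree text and returns CLI args such as ["click", "@e2"], or None to stop.
    """
    for step in range(max_steps):
        tree = run(["tree"])  # re-read every iteration so @eN refs are current
        action = decide(tree)
        if action is None:
            return step
        run(action)
    return max_steps


if __name__ == "__main__":
    # Trivial decider that stops immediately; a real one would call an LLM.
    print(step_loop(lambda tree: None))
```

Because the tree is re-read before every decision, an action never uses a ref from an older snapshot. A decider that triggers navigation may also want to return ["wait", "--idle"] as its next action, per the command table.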


