Aslan Browser
A browser for AI agents. macOS native. No Chrome. Returns an accessibility tree instead of the raw DOM.
~0.5ms JS eval · ~15ms screenshot · 0 Python deps · ~2.6k lines total
Why not just use Safari?
Safari has native automation. It doesn't work for agents.
User interaction
Safari
Glass pane. A transparent shield blocks all clicks while automation runs. You can't touch the window.
Aslan
No glass pane. You and the agent share the same browser. Click something it missed, type a password, hand it back.
Login state
Safari
Forced private mode. No cookies, no history, no local storage. Your agent logs in from scratch every time. 2FA every time.
Aslan
Persistent profile. Stays logged in between sessions. Same as a real user.
Speed
Safari / WebDriver
HTTP over TCP. Every command is a network request to a local server. 10–50ms per action.
Aslan
Unix domain socket — a file on disk acting as a pipe. OS-level IPC. ~0.5ms per action.
Page data
Safari / AppleScript
Raw HTML. Your LLM gets <div><span class="wrapper">...</span></div> soup. Expensive and hard to reason about.
Aslan
Accessibility tree baked in. A flat list of what's actually on the page. 10–100× fewer tokens.
Safari automation was built for testing websites. Aslan was built for operating them.
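To make the speed comparison above concrete, here is a minimal sketch of one request/response round-trip over a Unix domain socket, using only the Python standard library. The newline-delimited JSON framing, the `serve_once` and `round_trip` helpers, and the temporary socket path are all assumptions for illustration; this is not Aslan's actual wire protocol.

```python
import json
import os
import socket
import tempfile
import threading


def serve_once(server_sock: socket.socket) -> None:
    """Accept one connection and answer one newline-delimited JSON request."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("r") as reader:
        request = json.loads(reader.readline())
        # Hypothetical reply shape: echo the params back under "result".
        reply = {"id": request["id"], "result": request["params"]}
        conn.sendall((json.dumps(reply) + "\n").encode())


def round_trip(path: str, payload: dict) -> dict:
    """Send one JSON line to the socket at `path` and read one JSON line back."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        client.sendall((json.dumps(payload) + "\n").encode())
        with client.makefile("r") as reader:
            return json.loads(reader.readline())


def demo() -> dict:
    """Run a throwaway server and one client call inside a temp directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # the socket is a file on disk
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            worker = threading.Thread(target=serve_once, args=(server,))
            worker.start()
            reply = round_trip(path, {"id": 1, "params": {"script": "return 1 + 1"}})
            worker.join()
        return reply
```

No TCP port and no HTTP parsing are involved: the client connects to a filesystem path and the kernel moves the bytes between the two processes.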
Accessibility Tree → browser.get_accessibility_tree()
Use @eN refs in click(), fill(), etc. — this is what you send to the LLM.
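As a rough sketch of what "send this to the LLM" could look like, the snippet below renders tree nodes as one compact line each. It assumes nodes are dicts with `ref`, `role`, and `name` keys, as in the Quick Start example; `format_tree` and the sample data are hypothetical helpers, not part of the SDK.

```python
def format_tree(nodes: list[dict]) -> str:
    """Render one line per node: ref, role, and the quoted accessible name."""
    return "\n".join(f'{n["ref"]} {n["role"]} "{n["name"]}"' for n in nodes)


# Sample nodes shaped like the GitHub login example in the Quick Start.
sample_tree = [
    {"ref": "@e0", "role": "textbox", "name": "Username or email address"},
    {"ref": "@e1", "role": "textbox", "name": "Password"},
    {"ref": "@e2", "role": "button", "name": "Sign in"},
]

prompt_block = format_tree(sample_tree)
```

Each interactive element costs one short line in the prompt, and the model can answer with just a ref such as `@e2`.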
Aslan vs. Alternatives
| Capability | Aslan Browser | Playwright | Puppeteer | Selenium |
|---|---|---|---|---|
| Rendering engine | WKWebView (macOS native) | Chromium / Firefox / WebKit | Chrome / Chromium | Any via WebDriver |
| Browser download needed | None — system WKWebView | ~500MB Chromium | ~500MB Chrome | Full browser required |
| Page representation for AI | Accessibility tree (compact) | Full DOM | Full DOM | Full DOM |
| Token usage for LLM | 10–100× fewer tokens | Very high (raw DOM) | Very high (raw DOM) | Very high (raw DOM) |
| JS eval round-trip | ~0.5ms | 2–5ms | 2–5ms | 5–20ms |
| Screenshot latency | ~15ms | 50–150ms | 50–150ms | 200ms+ |
| Memory per tab | ~40MB | 80–150MB | 80–150MB | 150MB+ |
| Cold start | <500ms | 2–5s | 2–5s | 5–10s |
| Python dependencies | Zero (stdlib only) | playwright package | pyppeteer package | selenium + webdriver |
| Multi-tab & sessions | Built-in (tab_create, session_create) | Yes | Yes | Partial |
| Parallel batch ops | parallel_navigate / parallel_get_trees | Yes (async) | Yes (async) | Limited |
| AI agent skill | Built-in (Prompts-are-Code) | None | None | None |
| Platform | macOS 14+ only | Cross-platform | Cross-platform | Cross-platform |
Quick Start
1
Build or download the app
Build with Xcode, or grab a pre-built universal binary from GitHub Releases (arm64 + x86_64).
git clone https://github.com/onorbumbum/aslan-browser.git
cd aslan-browser
xcodebuild build -scheme aslan-browser -configuration Debug -derivedDataPath .build
2
Install the Python SDK
Zero external dependencies — only stdlib. Works with Python 3.10+.
pip install -e sdk/python # from source (PyPI coming soon)
3
Start the browser
The app listens on /tmp/aslan-browser.sock. Use --hidden for headless production use.
.build/Build/Products/Debug/aslan-browser.app/Contents/MacOS/aslan-browser --hidden
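If a script launches the app and connects right away, it may race the socket coming up. One way to guard against that is to poll the socket path first. In this sketch, `wait_for_socket` is a hypothetical helper, and whether a bare connect-and-close is harmless to the app is an assumption.

```python
import socket
import time


def wait_for_socket(path: str = "/tmp/aslan-browser.sock", timeout: float = 5.0) -> bool:
    """Return True once something accepts connections on `path`, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as probe:
                probe.connect(path)
            return True
        except OSError:
            # Covers a missing socket file and a refused connection alike.
            time.sleep(0.05)
    return False
```

A caller would check `wait_for_socket()` before constructing `AslanBrowser()` and fail fast with a clear message if it returns `False`.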
4
Navigate → Read → Act
from aslan_browser import AslanBrowser

with AslanBrowser() as browser:
    # 1. Navigate and wait for the page to be ready
    browser.navigate("https://github.com/login", wait_until="idle")

    # 2. Read the page: compact accessibility tree, send to LLM
    tree = browser.get_accessibility_tree()
    for node in tree:
        print(node['ref'], node['role'], node['name'])
        # @e0 textbox "Username or email address"
        # @e1 textbox "Password"
        # @e2 button "Sign in"

    # 3. Act using @eN refs the LLM selected
    browser.fill("@e0", "myusername")
    browser.fill("@e1", "mypassword")
    browser.click("@e2")

    # 4. Screenshot for vision models (JPEG bytes, ~15ms)
    browser.save_screenshot("result.jpg")
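For scripted flows where no LLM picks the ref, a small lookup over the tree avoids hard-coding `@eN` values. This sketch assumes the node shape from the example above (`ref`, `role`, `name`); `find_ref` is a hypothetical helper, not an SDK function.

```python
from typing import Optional


def find_ref(tree: list[dict], role: str, name_contains: str) -> Optional[str]:
    """Return the ref of the first node matching role and a name substring."""
    needle = name_contains.lower()
    for node in tree:
        if node["role"] == role and needle in node["name"].lower():
            return node["ref"]
    return None  # nothing matched; the caller decides how to recover


# Sample tree shaped like the login page output above.
login_tree = [
    {"ref": "@e0", "role": "textbox", "name": "Username or email address"},
    {"ref": "@e1", "role": "textbox", "name": "Password"},
    {"ref": "@e2", "role": "button", "name": "Sign in"},
]

password_ref = find_ref(login_tree, "textbox", "password")
```

The returned ref can then be passed to `browser.fill(...)` or `browser.click(...)`, provided it comes from the most recent tree read.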
5
Install the AI agent skill (optional)
Symlink the bundled skill into your agent's skill directory. One source of truth — git pull updates it everywhere.
# For pi / ASLAN
ln -s /path/to/aslan-browser/skills/aslan-browser ~/.pi/agent/skills/aslan-browser
# For Claude Code
ln -s /path/to/aslan-browser/skills/aslan-browser ~/.claude/skills/aslan-browser
skills/aslan-browser/
├── SKILL.md              protocol + rules, loaded every invocation
├── SDK_REFERENCE.md      full CLI reference, loaded every session
├── knowledge/
│   ├── core.md           universal rules, loaded every session
│   ├── user.md           user prefs, gitignored
│   ├── sites/            loaded when visiting that domain
│   │   ├── linkedin.com.md
│   │   ├── instagram.com.md
│   │   ├── facebook.com.md
│   │   ├── business.google.com.md
│   │   ├── hubspot.com.md
│   │   ├── app.hubspot.com.md
│   │   ├── google.com.md
│   │   └── openrouter.ai.md
│   └── playbooks/        step-by-step task recipes
│       ├── linkedin-create-post.md
│       ├── instagram-create-post.md
│       └── gmb-create-post.md
└── learnings/            compiled after sessions
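The `sites/` files are named after hostnames, so mapping a URL to a knowledge file can be sketched as below. How the skill actually matches domains is not specified here; trying the full host first and then falling back to parent domains is an assumption, and `site_knowledge_candidates` is a hypothetical helper.

```python
from urllib.parse import urlparse


def site_knowledge_candidates(url: str) -> list[str]:
    """List candidate knowledge/sites/*.md paths, most specific host first."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".") if host else []
    # Stop before the bare top-level label ("com", "ai") is reached.
    return [
        f"knowledge/sites/{'.'.join(labels[i:])}.md"
        for i in range(len(labels) - 1)
    ]


candidates = site_knowledge_candidates("https://app.hubspot.com/contacts")
```

With this fallback order, a visit to `app.hubspot.com` would consider `app.hubspot.com.md` before `hubspot.com.md`, both of which appear in the listing above.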
| Command | Does | Note |
|---|---|---|
| Navigate | ||
| aslan nav <url> --wait idle | Navigate. Wait for network idle. | Use idle for SPAs, load for static |
| aslan back / forward / reload | History navigation | |
| Read | ||
| aslan tree | Accessibility tree — one line per node | Use @eN refs to act on elements |
| aslan text [--chars 3000] | Page innerText | |
| aslan title / aslan url | Page title / current URL | |
| aslan eval "return ..." | Run JS, return result | Must include return |
| Interact | ||
| aslan click @eN | Click by ref or CSS selector | Prefer @eN refs |
| aslan fill @eN <value> | Fill input or textarea | Fails on contenteditable |
| aslan type @eN <value> | Type text | Works on contenteditable too |
| aslan key Enter [--meta] | Keypress with optional modifiers | |
| aslan scroll --down 500 | Scroll page | --to @eN scrolls element into view |
| Wait / Upload / Screenshot | ||
| aslan wait --idle | Wait for network idle + DOM stable | After click that triggers navigation |
| aslan upload <file> | Inject file via DataTransfer API | Click upload button first |
| aslan shot [path] | Screenshot to file | Default: /tmp/aslan-screenshot.jpg |
| Tabs | ||
| aslan tabs | List tabs (* = current) | State in /tmp/aslan-cli.json |
| aslan tab:new [url] | Open new tab, switch to it | |
| aslan tab:use <id> | Switch active tab | |
| aslan status | Connection info | Must print "Connected" |
Don't
Write Python SDK boilerplate
AslanBrowser()
with browser: browser.navigate(...)
Do
Use the CLI
aslan nav https://example.com
aslan tree
aslan click @e2
Don't
Pre-plan multi-step scripts
aslan nav ...; aslan click @e3; aslan fill @e5 "x"; aslan key Enter
Do
Act one step, read, decide next
aslan nav ...
aslan tree # read what loaded
aslan click @e3 # act on what you see
aslan tree # read the result
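The same read, decide, act rhythm can be sketched in Python. `FakeBrowser` is a stand-in so the loop runs without Aslan; its method names mirror `get_accessibility_tree()` and `click()` from the Quick Start, while `run_steps` and the `decide` callback (where an LLM call would go) are hypothetical.

```python
from typing import Callable, Optional


class FakeBrowser:
    """Stand-in browser: the page changes after the first click."""

    def __init__(self) -> None:
        self.clicked: list[str] = []

    def get_accessibility_tree(self) -> list[dict]:
        if not self.clicked:
            return [{"ref": "@e0", "role": "button", "name": "Accept cookies"}]
        return [{"ref": "@e0", "role": "link", "name": "Dashboard"}]

    def click(self, ref: str) -> None:
        self.clicked.append(ref)


def run_steps(
    browser,
    decide: Callable[[list[dict]], Optional[str]],
    max_steps: int = 5,
) -> int:
    """Read the tree, ask `decide` for one ref, act, and repeat.

    Stops when `decide` returns None; returns the number of actions taken.
    """
    for step in range(max_steps):
        tree = browser.get_accessibility_tree()  # fresh read before every action
        ref = decide(tree)
        if ref is None:
            return step
        browser.click(ref)
    return max_steps


def dismiss_banner(tree: list[dict]) -> Optional[str]:
    """Toy policy: click the cookie banner if it is present, otherwise stop."""
    for node in tree:
        if node["name"] == "Accept cookies":
            return node["ref"]
    return None
```

Because the tree is re-read on every iteration, the policy never acts on a page state it has not seen, which is the point of not pre-planning the whole script.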
Don't
aslan eval "document.title"
Do
aslan eval "return document.title"
Don't
Use fill on contenteditable (LinkedIn, Facebook, Notion)
aslan fill @e5 "hello world"
Do
Use type instead
aslan type @e5 "hello world"
Don't
aslan nav http://example.com
App Transport Security (ATS) blocks plain http:// at the OS level
Do
aslan nav https://example.com
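When URLs come from scraped links or user input, a small normalizer can upgrade the scheme before navigating. `force_https` is a hypothetical helper built on the standard library. It only rewrites the scheme, so it assumes the site serves HTTPS on the same host and port.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url: str) -> str:
    """Rewrite an http:// URL to https://; leave every other scheme untouched."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


upgraded = force_https("http://example.com/a?b=1")
```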
Don't
Reuse @eN refs from a previous tree call
aslan tree
# ... do stuff ...
aslan tree
aslan click @e2 # ← @e2 may now be a different element
Do
Always read tree immediately before acting
aslan tree
aslan click @e2 # act on refs from THIS tree
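A toy model shows why a stale ref is dangerous. It assumes refs are numbered by position within each tree read. That numbering scheme is an assumption made for illustration, not a documented detail of how Aslan assigns refs, and `assign_refs` is a hypothetical helper.

```python
def assign_refs(elements: list[str]) -> dict[str, str]:
    """Map positional refs (@e0, @e1, ...) to element names for one tree read."""
    return {f"@e{i}": name for i, name in enumerate(elements)}


# First read of the page.
before = assign_refs(["Search", "Sign in", "Submit"])

# A cookie banner appears at the top, shifting every later element by one.
after = assign_refs(["Accept cookies", "Search", "Sign in", "Submit"])

intended_target = before["@e2"]  # what the agent meant to click
actual_target = after["@e2"]     # what @e2 points at after the page changed
```

Under this model the agent intends to click "Submit" but hits "Sign in", which is why refs should come from the tree read immediately before the action.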
Don't
Skip knowledge compilation after a task
Do
Ask: "Did I learn anything?" — route it to core.md, sites/, playbooks/, or user.md