Open Source Browser Automation

You speak.
It acts.

Type what you want done on any website. Golemn understands, navigates, and delivers results in seconds. Powered by crowd intelligence that gets smarter with every run.

golemn
$ golemn run "https://news.ycombinator.com" "get the top 5 posts with scores"
S1 cache check...miss
S2.5 HTTP extract...match
1. Show HN: I built a CPU from scratch (842 pts)
2. The unreasonable effectiveness of Rust (731 pts)
3. Why we moved from Kubernetes to bare metal (698 pts)
4. A visual guide to quantization (654 pts)
5. SQLite is not a toy database (612 pts)
Done in 1.2s // step 2.5, zero-browser extraction
9 Waterfall Steps
<2s Cached Responses
70+ Modules
1452 Tests Passing

Nine steps from intent to result. Most finish in the first three.

Every request cascades through a waterfall of increasingly powerful strategies. Fast paths resolve most tasks before the LLM is ever called. Every success feeds back to make the next request faster.
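
The cascade described above can be sketched as an ordered list of strategies where the first one to return a result wins. This is a minimal illustration, not Golemn's actual engine API: `Step`, `run_waterfall`, `demo_steps`, and the three stand-in step functions are hypothetical names invented for this example.

```rust
/// One waterfall stage: a label plus a function that may or may not
/// produce a result for the given URL and goal. (Hypothetical shape.)
pub struct Step {
    pub name: &'static str,
    pub run: fn(&str, &str) -> Option<String>,
}

/// Try each step in order and stop at the first one that yields a result.
/// Returns the name of the step that succeeded together with its output.
pub fn run_waterfall(steps: &[Step], url: &str, goal: &str) -> Option<(&'static str, String)> {
    for step in steps {
        if let Some(output) = (step.run)(url, goal) {
            return Some((step.name, output));
        }
    }
    None
}

// Stand-in steps for illustration only; real steps would hit a cache,
// fetch over HTTP, or call an LLM.
fn cache_miss(_url: &str, _goal: &str) -> Option<String> {
    None
}

fn http_extract(url: &str, _goal: &str) -> Option<String> {
    if url.starts_with("https://") {
        Some(format!("extracted from {}", url))
    } else {
        None
    }
}

fn llm_fallback(_url: &str, goal: &str) -> Option<String> {
    Some(format!("LLM plan for: {}", goal))
}

/// A three-step demo waterfall, cheapest strategy first.
pub fn demo_steps() -> Vec<Step> {
    vec![
        Step { name: "S1", run: cache_miss },
        Step { name: "S2.5", run: http_extract },
        Step { name: "S8", run: llm_fallback },
    ]
}
```

Calling `run_waterfall(&demo_steps(), "https://example.com", "get title")` resolves at the stand-in S2.5 step and never reaches the LLM fallback, which is the point of ordering cheap strategies first.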

S1 Result Cache (1-50ms): FNV-hashed memory + disk lookup. Same URL + goal = instant replay.
S2 API Cache Replay (10-100ms): Cached API routes skip the browser entirely. Direct HTTP call.
S2.5 HTTP Extract (50-200ms): Zero-browser CSS extraction via raw HTTP. Scraper-grade speed.
S2.7 Structured Data (~0ms, free): JSON-LD, OpenGraph, Twitter Card metadata. Schema.org auto-match.
S3 Sequence Replay (50-200ms): Known CDP action sequences from past runs + crowd patches.
S4 Harvester (100-500ms): Structured CSS template extraction. No LLM needed.
S5 Patch Extract (~200ms): Community CSS selectors. Crowd-sourced extraction paths.
S6 API Discovery (200-800ms): Probe domains for REST/GraphQL endpoints. Cache routes for S2.
S7 DOM Heal (+100ms): AXTree text-similarity + resilient selector multi-strategy heal.
S8 Streaming LLM (1-3s): AXTree to plan, stream-execute actions as tokens arrive. Parallel I/O.
S9 Human Fix (You decide): Returns a fix request with AXTree context. Your fix trains the swarm.
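
The S1 idea of "same URL + goal = instant replay" can be sketched with a 64-bit FNV-1a hash as the cache key. The page only says "FNV-hashed", so the FNV-1a variant, the 64-bit width, the zero-byte separator, and the names `fnv1a_64`, `cache_key`, and `ResultCache` are all assumptions made for illustration. Only the in-memory half of the lookup is shown.

```rust
use std::collections::HashMap;

// Standard 64-bit FNV parameters.
const FNV_OFFSET_BASIS: u64 = 0xcbf29ce484222325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// 64-bit FNV-1a: XOR each byte into the state, then multiply by the prime.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Derive one key from URL and goal. The zero byte between them keeps the
/// hashed inputs for ("ab", "c") and ("a", "bc") distinct.
pub fn cache_key(url: &str, goal: &str) -> u64 {
    let mut input = Vec::with_capacity(url.len() + 1 + goal.len());
    input.extend_from_slice(url.as_bytes());
    input.push(0);
    input.extend_from_slice(goal.as_bytes());
    fnv1a_64(&input)
}

/// In-memory result cache keyed by the hash above (hypothetical type).
#[derive(Default)]
pub struct ResultCache {
    entries: HashMap<u64, String>,
}

impl ResultCache {
    pub fn get(&self, url: &str, goal: &str) -> Option<&String> {
        self.entries.get(&cache_key(url, goal))
    }

    pub fn put(&mut self, url: &str, goal: &str, output: String) {
        self.entries.insert(cache_key(url, goal), output);
    }
}
```

A production cache would also have to handle hash collisions and expiry, which this sketch leaves out.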

Every run makes every future run faster.

Like Waze for the web. Selector patches, action sequences, and API routes are shared across the network. One user's fix helps everyone.

🌐
Selector Patches
When a CSS selector breaks, the fix propagates to every instance. Sites change their DOM. Golemn adapts in real time.
Action Sequences
Verified CDP action chains are cached and replayed. A second request to the same site skips the LLM entirely.
🔍
API Route Discovery
Found a REST endpoint? The whole network knows. Direct HTTP calls instead of browser automation.
🧠
Step Intelligence
Per-domain success rates auto-skip waterfall steps with <10% hit rate. No wasted cycles.
🛡
Circuit Breaker
AIMD rate control + circuit breaker per domain. Adapts to rate limits. Backs off intelligently.
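
As a rough sketch of how AIMD (additive increase, multiplicative decrease) pacing and a circuit breaker can combine per domain: the `DomainLimiter` type and every constant below are illustrative assumptions, not Golemn's real configuration. The breaker is also simplified to "open after N consecutive failures, close on the next success"; a real one would normally add a cooldown and a half-open probe state.

```rust
/// Per-domain pacing state. All constants are illustrative assumptions.
pub struct DomainLimiter {
    /// Currently allowed requests per second.
    pub rate: f64,
    /// Failures seen since the last success.
    pub consecutive_failures: u32,
    min_rate: f64,
    max_rate: f64,
    increase: f64,
    decrease_factor: f64,
    trip_after: u32,
}

impl DomainLimiter {
    pub fn new() -> Self {
        Self {
            rate: 4.0,
            consecutive_failures: 0,
            min_rate: 0.5,
            max_rate: 16.0,
            increase: 1.0,
            decrease_factor: 0.5,
            trip_after: 3,
        }
    }

    /// Additive increase: each success raises the rate by a fixed step,
    /// capped at `max_rate`, and resets the failure streak.
    pub fn on_success(&mut self) {
        self.consecutive_failures = 0;
        self.rate = (self.rate + self.increase).min(self.max_rate);
    }

    /// Multiplicative decrease: each failure (for example an HTTP 429)
    /// scales the rate down, floored at `min_rate`.
    pub fn on_failure(&mut self) {
        self.consecutive_failures += 1;
        self.rate = (self.rate * self.decrease_factor).max(self.min_rate);
    }

    /// The breaker is open (requests should be skipped) once the failure
    /// streak reaches the trip threshold.
    pub fn is_open(&self) -> bool {
        self.consecutive_failures >= self.trip_after
    }
}
```

The asymmetry is deliberate: the rate climbs slowly while a site is healthy and drops quickly as soon as it pushes back.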
🎯
Content Dedup
SimHash near-duplicate detection across jobs. No redundant work. Every byte of bandwidth counts.
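
For readers unfamiliar with SimHash, here is a minimal sketch of the technique named in the Content Dedup card. It assumes whitespace tokenization, equal token weights, and FNV-1a as the per-token hash; none of these choices are confirmed by the project, and the function names are hypothetical.

```rust
/// 64-bit FNV-1a, used here only as a convenient per-token hash.
pub fn token_hash(token: &str) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in token.as_bytes() {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// SimHash fingerprint: every token votes +1 or -1 on each of the 64 bit
/// positions, and the sign of each total decides the fingerprint bit.
pub fn simhash(text: &str) -> u64 {
    let mut weights = [0i32; 64];
    for token in text.split_whitespace() {
        let h = token_hash(token);
        for bit in 0..64usize {
            if (h >> bit) & 1 == 1 {
                weights[bit] += 1;
            } else {
                weights[bit] -= 1;
            }
        }
    }
    let mut fingerprint = 0u64;
    for bit in 0..64usize {
        if weights[bit] > 0 {
            fingerprint |= 1u64 << bit;
        }
    }
    fingerprint
}

/// Number of bit positions in which two fingerprints differ.
pub fn hamming_distance(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}

/// Treat two documents as near-duplicates when their fingerprints differ
/// in at most `max_distance` bits. The threshold is the caller's choice.
pub fn is_near_duplicate(a: u64, b: u64, max_distance: u32) -> bool {
    hamming_distance(a, b) <= max_distance
}
```

Because similar documents share most of their tokens, most bit votes come out the same and the fingerprints end up a small Hamming distance apart, so a job can be skipped when `is_near_duplicate` matches an earlier fingerprint.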

One POST. Results in seconds.

REST API with real-time WebSocket progress. Run single jobs or batch thousands. Deploy as a daemon or desktop app, or embed the engine.

curl
# Submit a job
curl -X POST http://localhost:7001/api/run \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://example.com", "goal": "extract the main heading" }'

# Response
{
  "output": "Example Domain",
  "step_used": 2,
  "elapsed_ms": 84,
  "cache_hit": false
}

Deploy anywhere

Daemon
Production REST API with auth, rate limiting, WebSocket progress, and OpenAPI docs.
Desktop App
Native egui app. Type a URL and goal, watch it work. Built-in settings and history.
Engine Crate
Embed golemn-engine in your Rust app. Full waterfall, pool, cache, and stealth.
Docker
One-line deploy. Chrome headless baked in. Production-ready with Prometheus metrics.

Rust. Chrome DevTools Protocol. Zero compromises.

Rust
Tokio
Chromiumoxide
Axum
redb
egui
WASM

The Golem obeys.

In Jewish mythology, a Golem is animated by a word inscribed on its forehead. You type the word. Golemn does the rest.

View on GitHub