billy-enrizky/openbrowser-ai

Python · 185141 issues · 6 contributors · MIT

Summary

OpenBrowser is a Python framework for AI-driven browser automation that uses Chrome DevTools Protocol (CDP) directly instead of Playwright/Selenium, combined with a LangGraph-based CodeAgent where the LLM writes and executes Python code in a persistent namespace to navigate and interact with web pages. It supports 12+ LLM providers, an MCP server for Claude Desktop integration, and a CLI daemon mode. The architecture centers on an event bus (bubus) coordinating watchdogs that handle specific browser behaviors.
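The event-bus-plus-watchdog arrangement described above can be pictured with a minimal sketch. It deliberately does not use the bubus API: `MiniEventBus`, `NavigateEvent`, and `NavigationWatchdog` are hypothetical names invented for illustration, and only the `on_<EventName>` handler naming mirrors what the review reports from the repository.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class NavigateEvent:
    """Hypothetical event type; not taken from the repository."""
    url: str


class MiniEventBus:
    """Toy async event bus: handlers are registered per event class."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_type, handler):
        self._handlers[event_type].append(handler)

    async def dispatch(self, event):
        # Await every handler registered for this event's exact type, in order.
        return [await handler(event) for handler in self._handlers[type(event)]]


class NavigationWatchdog:
    """Hypothetical watchdog: owns one browser concern and listens on the bus."""

    def __init__(self, bus):
        self.visited = []
        bus.on(NavigateEvent, self.on_NavigateEvent)

    async def on_NavigateEvent(self, event):
        # A real watchdog would issue CDP commands here; this one only records.
        self.visited.append(event.url)
        return f"navigated:{event.url}"


async def demo():
    bus = MiniEventBus()
    watchdog = NavigationWatchdog(bus)
    results = await bus.dispatch(NavigateEvent(url="https://example.com"))
    return results, watchdog.visited


results, visited = asyncio.run(demo())
```

The point of the pattern is that each watchdog subscribes itself and keeps its own state, so adding a new browser behavior means adding a class rather than editing a central dispatcher.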

Great for

building LLM-powered web automation pipelines that need fine-grained browser control without Playwright overhead — scraping, form filling, flight booking bots, or any task requiring an agent that can reason about and interact with live web pages

Easy wins

  • Add a CONTRIBUTING.md with setup instructions and an architecture overview; the watchdog pattern and event bus hierarchy are not documented anywhere in the repo.
  • The README model table lists models such as 'gpt-5.2', 'claude-sonnet-4-6', and 'gemini-3-flash' that don't exist yet; update it to real current model names (gpt-4o, claude-3-5-sonnet, gemini-1.5-flash).
  • Add 'good first issue' labels. The test files mock everything and there are no integration smoke tests, so a simple real-browser 'navigate and screenshot' test would be a valuable starter task.
  • The docker-compose.yml references a frontend/ directory and Dockerfile that don't appear in the file tree; either add the frontend or remove those references to avoid confusing new contributors.
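As a rough starting point for the 'navigate and screenshot' smoke test suggested above, the sketch below builds the two CDP commands such a test would send (`Page.navigate` and `Page.captureScreenshot`) and checks the decoded screenshot for the PNG signature. The WebSocket transport to a running Chrome is omitted, and the helper names (`cdp_command`, `decode_screenshot`, `looks_like_png`) and the canned reply are assumptions for illustration, not code from the repository.

```python
import base64
import itertools
import json

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file
_ids = itertools.count(1)


def cdp_command(method, params=None):
    """Serialize one CDP command as the JSON text sent over the WebSocket."""
    return json.dumps({"id": next(_ids), "method": method, "params": params or {}})


def decode_screenshot(response_text):
    """Extract raw image bytes from a Page.captureScreenshot response."""
    result = json.loads(response_text)["result"]
    return base64.b64decode(result["data"])


def looks_like_png(data):
    return data.startswith(PNG_MAGIC)


navigate = cdp_command("Page.navigate", {"url": "https://example.com"})
screenshot = cdp_command("Page.captureScreenshot", {"format": "png"})

# Canned reply standing in for what a real browser would send back.
fake_reply = json.dumps(
    {"id": 2, "result": {"data": base64.b64encode(PNG_MAGIC + b"rest").decode()}}
)
png = decode_screenshot(fake_reply)
```

In a real integration test, `navigate` and `screenshot` would be sent to a live browser and `fake_reply` replaced by the actual response; the `looks_like_png` assertion would stay the same.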

Red flags

  • litellm is pinned to the exact version `==1.80.0` in pyproject.toml, which will cause dependency resolution conflicts for any project that requires a different litellm version.
  • The README model table lists non-existent models (gpt-5.2, claude-sonnet-4-6, gemini-3-flash, claude-opus-4-6). This is either aspirational documentation or fabricated, and either way it erodes trust.
  • docker-compose.yml references a frontend service with its own Dockerfile, but no frontend/ directory exists in the file tree, so the compose file will fail to build.
  • Only 1 commit is recorded despite 185 stars and apparent maturity. This suggests the history was squashed or the repo was migrated, which makes the project's evolution impossible to trace.
  • posthog telemetry is a hard (not optional) dependency of the base install, meaning every user sends analytics by default; no opt-out documentation is visible in the README.
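The exact-pin problem in the first red flag can be illustrated with a deliberately simplified check. The sketch assumes plain dotted numeric versions and single-operator specifiers; real resolvers follow PEP 440 and are far more involved. Apart from `1.80.0`, the candidate versions and the second project's `>=1.81.0` requirement are made up for the example.

```python
def parse_version(text):
    """Turn '1.80.0' into (1, 80, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, specifier):
    """Check one version against one simple specifier such as '==1.80.0'."""
    # Two-character operators come first so '>' does not shadow '>='.
    for op in ("==", ">=", "<=", "<", ">"):
        if specifier.startswith(op):
            v = parse_version(version)
            target = parse_version(specifier[len(op):])
            return {
                "==": v == target,
                ">=": v >= target,
                "<=": v <= target,
                "<": v < target,
                ">": v > target,
            }[op]
    raise ValueError(f"unsupported specifier: {specifier}")


def resolvable(candidates, *specifiers):
    """Return the candidate versions that satisfy every specifier at once."""
    return [v for v in candidates if all(satisfies(v, s) for s in specifiers)]


# Hypothetical release list, for illustration only.
candidates = ["1.79.0", "1.80.0", "1.81.0", "1.82.0"]

# Exact pin combined with another project's (assumed) lower bound: no overlap.
pinned = resolvable(candidates, "==1.80.0", ">=1.81.0")

# The same library declared as a lower bound instead: overlap exists.
ranged = resolvable(candidates, ">=1.80.0", ">=1.81.0")
```

With the exact pin, no version satisfies both requirements; with a lower bound in its place, the installer still has versions to choose from.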

Code quality

decent

The core session.py and watchdog code is architecturally thoughtful: the CDPSession/BrowserSession split is clean, the event bus pattern is applied consistently, and error handling uses specific exception types (BrowserError, URLNotAllowedError). There are rough spots, however. default_action_watchdog.py repeats a catch-and-re-raise `except Exception as e: raise` handler at the end of on_ClickElementEvent, on_TypeTextEvent, and on_ScrollEvent; it adds no value and clutters the code. The test suite is extensive but consists almost entirely of unit tests with heavy mocking: test_agent_service_coverage.py and test_browser_session_coverage.py mock out every external dependency, so real integration behavior is untested. The litellm dependency is hard-pinned to `==1.80.0` in pyproject.toml, which will cause dependency conflicts for users.
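To show why a handler that only re-raises adds nothing, the sketch below compares three variants of a hypothetical click helper. `BrowserError` here is a local stand-in for the project's exception of the same name, and the function names are invented for the example rather than taken from default_action_watchdog.py.

```python
class BrowserError(Exception):
    """Local stand-in for the project's exception type of the same name."""


def click_with_noop_handler(action):
    try:
        return action()
    except Exception as e:  # catches, then immediately re-raises unchanged
        raise


def click_plain(action):
    # No try/except at all: the exception propagates exactly the same way.
    return action()


def click_with_context(action, element_index):
    try:
        return action()
    except BrowserError as e:
        # A handler earns its place when it adds information or changes behavior.
        raise BrowserError(f"click failed on element {element_index}: {e}") from e


def failing_action():
    raise BrowserError("node detached")


def outcome(fn, *args):
    """Run fn and report the exception type and message it raised, if any."""
    try:
        fn(*args)
    except Exception as e:
        return type(e).__name__, str(e)
    return None
```

The first two variants raise the same exception with the same message, so the handler in the first can simply be deleted; the third shows one way a handler could justify itself by attaching context.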

What makes it unique

The direct CDP approach (via the cdp-use library) instead of Playwright is the genuine differentiator: most browser automation frameworks for LLM agents (browser-use, playwright-mcp, etc.) sit on top of Playwright. The CodeAgent architecture, where the LLM writes executable Python rather than calling predefined tool functions, is also distinct from action-space approaches. However, the project occupies the same space as browser-use (which it explicitly benchmarks against) and Skyvern, and the benchmark comparisons in the repo appear to favor OpenBrowser without independent verification.
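The difference between a code-writing agent and a tool-calling one comes down to state that survives between steps. The sketch below is an assumption-laden illustration, not the repository's CodeAgent: `PersistentNamespace`, `fetch`, and the canned page are hypothetical, and the two "LLM-written" snippets are hard-coded strings.

```python
import contextlib
import io


class PersistentNamespace:
    """Executes code snippets against one shared dict so variables persist."""

    def __init__(self, **tools):
        # Tools are exposed to the generated code as ordinary global names.
        self.ns = dict(tools)

    def run(self, source):
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(source, self.ns)  # unsandboxed; acceptable only in a toy example
        return buffer.getvalue()


# Canned page content standing in for a live browser.
pages = {"https://example.com": "<title>Example Domain</title>"}


def fetch(url):
    return pages[url]


agent_ns = PersistentNamespace(fetch=fetch)

# Two snippets as an LLM might emit them on consecutive steps.
step1 = 'html = fetch("https://example.com")'
step2 = 'title = html.split("<title>")[1].split("</title>")[0]\nprint(title)'

agent_ns.run(step1)
output = agent_ns.run(step2)  # step2 reuses `html` defined in step1
```

Because `html` from the first step is still available in the second, the model can build on earlier results without passing them back through tool-call arguments. Executing model-written code this way requires sandboxing in any real deployment.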

Scores

Collab: 5
Activity: 5

Barrier to entry

high

The architecture is non-trivial: a multi-layer event bus and watchdog system with CDP sessions, Pydantic models with private attributes and complex init merging, and LangGraph agent loops. There is no contributing guide, only 1 commit is recorded (suggesting a migrated or squashed history), and there are 0 good-first-issue labels even though the codebase runs to several thousand lines.

Skills needed

  • Python async/await (asyncio throughout)
  • Chrome DevTools Protocol (CDP) internals
  • Pydantic v2 models (used heavily for all data structures)
  • LangGraph / LangChain patterns
  • Event-driven architecture (bubus EventBus, watchdog pattern)
  • Browser automation concepts (DOM, XPath, selectors)
  • LLM API integration (OpenAI, Anthropic, Google, etc.)