mediar-ai/mcp-server-macos-use
Summary
An MCP (Model Context Protocol) server written in Swift that lets AI assistants like Claude control macOS applications via Apple's Accessibility APIs. The README documents 5 tools (open app, click, type, key press, refresh) that traverse the accessibility tree and return structured UI element data. Results are saved as flat text files, alongside screenshots, under /tmp/macos-use/. The server wraps a private 'MacosUseSDK' dependency to do the actual OS-level automation.
Great for
building AI-driven desktop automation on macOS — specifically integrating LLM agents with native app UIs through accessibility APIs rather than screen scraping or AppleScript
Easy wins
- Add a CONTRIBUTING.md explaining how to set up the dev environment, what MacosUseSDK is and where it comes from, and how to run tests. The CLAUDE.md has good internal notes that could be adapted.
- Set up GitHub Actions CI (even a basic 'swift build' check on macOS runners). The repo has no CI despite having a test script at scripts/test_mcp.py.
- Add the 'scroll_and_traverse' tool to the README. The test script (around line 100) already expects 'macos-use_scroll_and_traverse' in the tools list, but the README does not document it.
- Fix the license situation: package.json declares 'MIT' but the repo license field is 'NOASSERTION', creating ambiguity for contributors.
Red flags
- MacosUseSDK is a black-box dependency: the README lists it as 'assumed local or external', with no source, no public repo link, and no documentation. Core automation behavior is opaque to contributors.
- The README documents 5 tools but the test script asserts 6 (including 'macos-use_scroll_and_traverse'), so the documentation is already out of sync with the implementation.
- Only 1 commit is recorded in the metadata despite version 0.1.14 in package.json, which suggests history was squashed or the repo was force-pushed, losing contribution lineage.
- Response files are written to /tmp/macos-use/ with no apparent cleanup mechanism. Long-running sessions could accumulate significant data, including screenshots of user activity.
- License conflict: package.json says MIT, GitHub reports NOASSERTION. It is unclear what terms contributors are agreeing to.
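For the missing cleanup of /tmp/macos-use/, one possible mitigation is an age-based prune. This is a sketch under assumptions rather than anything the repo provides: the function name `prune_old_files` and the age threshold are hypothetical, and it assumes response files and screenshots sit directly in one directory.

```python
import time
from pathlib import Path


def prune_old_files(directory, max_age_seconds, now=None):
    """Delete regular files in `directory` older than `max_age_seconds`.

    Returns the names of the removed files. A missing directory is treated
    as having nothing to clean up.
    """
    now = time.time() if now is None else now
    root = Path(directory)
    removed = []
    if not root.is_dir():
        return removed
    for path in sorted(root.iterdir()):
        # Only top-level regular files are considered; subdirectories are left alone.
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return removed
```

A call such as `prune_old_files("/tmp/macos-use", 24 * 3600)` at server start would drop files older than a day. Whether that retention window suits the project is a decision for the maintainers.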
Code quality
The test script (scripts/test_mcp.py) is well-structured: it has a proper MCPClient class, uses threading for stderr draining, includes a ToolResult parser with flat-file response handling, and covers meaningful test cases. The CLAUDE.md is unusually detailed and reveals sophisticated internal decisions (multi-screen coordinate system, response file caching to reduce context, diff format). However, the main Swift source (Sources/MCPServer/main.swift) isn't visible, so the quality of the server code itself can't be assessed directly. That is a significant blind spot.
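The stderr-draining technique mentioned above matters because a child process can block once its stderr pipe fills up. The following is a generic sketch of that pattern using only the Python standard library. It is not the repo's MCPClient, and the function name `start_with_drained_stderr` is a hypothetical stand-in.

```python
import subprocess
import sys
import threading


def start_with_drained_stderr(command):
    """Start a subprocess and collect its stderr lines on a background thread."""
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    stderr_lines = []

    def drain():
        # Reading until EOF keeps the stderr pipe from filling and blocking the child.
        for line in proc.stderr:
            stderr_lines.append(line.rstrip("\n"))

    thread = threading.Thread(target=drain, daemon=True)
    thread.start()
    return proc, stderr_lines, thread


if __name__ == "__main__":
    # Demo child: writes one line to stderr and one to stdout.
    child = [sys.executable, "-c", "import sys; print('log', file=sys.stderr); print('out')"]
    proc, lines, thread = start_with_drained_stderr(child)
    stdout_text = proc.stdout.read()
    proc.wait()
    thread.join(timeout=5)
    print(stdout_text.strip(), lines)
```

With stderr handled on its own thread, the main thread is free to exchange protocol messages over stdin and stdout without risking a deadlock on log output.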
What makes it unique
This is genuinely novel in the MCP ecosystem: most MCP automation tools target web browsers (Playwright-based) or work via screenshot+vision. Using macOS Accessibility APIs directly gives structured UI element access without vision models, which is faster and more reliable for native app automation. The closest comparison is the Python 'computer-use' projects, but this is the only known Swift/native MCP server doing AX tree traversal. The flat-file response pattern (saving to /tmp instead of returning large JSON inline) is a smart context-window optimization that I haven't seen elsewhere.
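To make the flat-file response pattern concrete, here is a minimal sketch of the idea: write the full element list to disk and return only a small summary to the model. The function name `save_response`, the per-line format, the file naming, and the element fields are all assumptions for illustration. The server's actual file format is not shown in the material reviewed.

```python
import tempfile
import time
from pathlib import Path


def save_response(elements, out_dir):
    """Write one line per UI element to a text file and return a compact summary."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / f"response_{time.time_ns()}.txt"
    # Hypothetical line format; the real server's format may differ.
    lines = [f'[{e["role"]}] "{e["text"]}" x:{e["x"]} y:{e["y"]}' for e in elements]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    # Only the path and a count go back inline, keeping the context window small.
    return {"file": str(path), "element_count": len(elements)}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_dir:
        summary = save_response(
            [{"role": "AXButton", "text": "OK", "x": 10, "y": 20}], demo_dir
        )
        print(summary["element_count"])
```

The trade-off is that the client must be able to read the file when it needs details, which works for a local desktop agent but would not for a remote one.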
Scores
Barrier to entry
high
The core logic depends on a private/undocumented 'MacosUseSDK' package that is not in the public file tree and has no source visible in the repo, so contributors cannot understand or modify core behavior without access to that dependency. In addition, there is no contributing guide, no CI, only 1 commit in recorded history, and 0 issues labeled good-first-issue.