dandaka/traul
Summary
Traul is a local-first CLI tool that syncs messages from Slack, Discord, Telegram, Gmail, Linear, WhatsApp, and Claude Code sessions into a SQLite database, then lets you search them with FTS5 keyword search or Ollama-backed vector embeddings. It's designed to be exposed as a tool to AI agents (e.g., Claude) so they can autonomously search your communication history without you manually copy-pasting context.
Great for
People building local-first personal data pipelines, or anyone interested in giving AI agents searchable memory over real communication data (Slack, Telegram, Gmail) using SQLite FTS5 + vector embeddings without sending data to a cloud service.
Easy wins
- +Add Linux support: the README and setup currently assume Homebrew SQLite (macOS). The sqlite-vec dependency may already work cross-platform, so verifying and documenting Linux setup would unblock non-Mac contributors.
- +Add CI: there's a full test suite (test/ covers db, commands, connectors, daemon, lib) but no GitHub Actions config exists. Wiring up 'bun test' on push would be straightforward and high-value.
- +WhatsApp connector is present in the file tree (src/connectors/whatsapp.ts, docker-compose.waha.yml) but the README lists it without setup docs — writing a getting-started section for WAHA would be a concrete doc win.
- +The Telegram connector bridges to a Python subprocess (tg_sync.py) via JSONL streaming — this is an interesting but fragile boundary. Adding integration tests or at least a mock-mode for CI would reduce breakage risk.
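One way to make the JSONL boundary mentioned in the last item testable without Python is to isolate the line-splitting logic so a test can feed it canned chunks. The sketch below is a minimal illustration under that assumption; `JsonlBuffer` and its methods are hypothetical names, not code from the repository.

```typescript
// Hypothetical helper (not from the repository): buffers raw stdout chunks
// and returns one parsed value per complete JSONL line.
class JsonlBuffer {
  private pending = "";

  // Feed one chunk of text; returns the records completed by this chunk.
  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is "" if the chunk ended on a newline, else a partial line.
    this.pending = lines.pop() ?? "";
    const records: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue; // tolerate blank lines
      records.push(JSON.parse(trimmed));
    }
    return records;
  }

  // Call when the stream ends. A leftover that is not valid JSON means the
  // writer was cut off mid-record, and JSON.parse throws a SyntaxError.
  finish(): unknown[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest === "" ? [] : [JSON.parse(rest)];
  }
}
```

In a mock mode, a CI test could replay recorded chunks through `push` instead of spawning the Python process, which exercises the parsing path without a second runtime.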
Red flags
- !Discord connector uses a user-token pattern: the Authorization header carries the raw token with no 'Bot ' prefix. The README mentions 'xoxb/xoxc' for Slack and a 'Discord bot token', but the Discord fetch code sends the token as-is, which is the form user tokens take. Using a user token (selfbot) violates Discord ToS and can get accounts banned.
- !No database migration system is visible in the file tree, so a schema change could silently break existing databases for users who git pull.
- !The Telegram connector calls a Python subprocess (tg_sync.py) without any timeout — a hung Python process would block the sync indefinitely with no recovery path.
- !AGPL-3.0 license means any network-deployed fork must open-source modifications — worth flagging for contributors who might want to build a hosted service on top of this.
- !No CI configured despite having a test suite, meaning PRs have no automated validation gate.
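For the missing-migrations flag above, one common lightweight approach is to track the schema version in SQLite's built-in `PRAGMA user_version` and apply only the newer steps. The sketch below illustrates that idea; the `MigrationDb` adapter, the `Migration` shape, and `migrate` are hypothetical and would need to be wired to whatever SQLite binding the project uses.

```typescript
// Hypothetical shapes, not taken from the repository.
interface Migration {
  version: number; // positive integer, strictly increasing across migrations
  sql: string;     // schema change for this step
}

// Minimal adapter over a SQLite handle; the real binding is an assumption.
interface MigrationDb {
  getUserVersion(): number; // result of "PRAGMA user_version"
  exec(sql: string): void;
}

// Applies every migration newer than the stored version, in order.
// Returns the versions that were applied.
function migrate(db: MigrationDb, migrations: Migration[]): number[] {
  for (const m of migrations) {
    if (!Number.isInteger(m.version) || m.version <= 0) {
      throw new Error(`invalid migration version: ${m.version}`);
    }
  }
  const current = db.getUserVersion();
  const pending = migrations
    .filter((m) => m.version > current)
    .sort((a, b) => a.version - b.version);

  for (const m of pending) {
    db.exec("BEGIN");
    try {
      db.exec(m.sql);
      // PRAGMA values cannot be bound as parameters; the integer check
      // above keeps this interpolation safe.
      db.exec(`PRAGMA user_version = ${m.version}`);
      db.exec("COMMIT");
    } catch (err) {
      db.exec("ROLLBACK");
      throw err;
    }
  }
  return pending.map((m) => m.version);
}
```

Running something like this at startup would let an existing database catch up after a pull instead of failing on a missing column or table.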
Code quality
The connector code (discord.ts, telegram.ts) has real production-quality details: proper rate-limit backoff using Retry-After headers, snowflake-based cursor pagination, a contact deduplication cache, and JSONL streaming for the Telegram bulk sync. The database.ts hybrid search (RRF merge of vector + FTS results with FTS backfill for unembedded messages) is thoughtfully designed and not just a naive concat. SQL is parameterized throughout (no string interpolation), and the FTS sanitizer in sanitizeFtsQuery handles special characters correctly. The main weakness is that SQL query strings live in a separate queries.ts (referenced as Q.*) which isn't shown, so it's hard to audit fully, and there's no obvious migration system — schema changes would require manual intervention.
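To make the RRF merge concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of message IDs. It is an illustration of the general technique, not the code in database.ts; the function name `rrfMerge`, the string IDs, and the constant `k = 60` (the value conventionally used for RRF) are assumptions.

```typescript
// Reciprocal Rank Fusion: each list contributes 1 / (k + rank) to an item's
// score, with rank starting at 1. Items found by only one list still get a
// score, so keyword-only hits are not dropped from the merged result.
function rrfMerge(
  rankings: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    // Highest score first; ties broken by id for a stable order.
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Example: "m1" appears in both lists, so it outranks "m2",
// which is first in the vector list only.
const vectorHits = ["m2", "m1", "m3"];
const ftsHits = ["m1", "m4"];
const merged = rrfMerge([vectorHits, ftsHits]);
```

Because the score depends only on rank positions, the merge does not need vector distances and FTS5 relevance scores to be on a comparable scale, which is the usual reason to prefer it over adding raw scores.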
What makes it unique
This occupies a real niche: most 'personal data aggregators' are either cloud-based (Rewind, Mem) or academic prototypes. The local-first + AI agent tool exposure angle is genuinely differentiated. The hybrid FTS5+vector search with RRF merging and graceful FTS fallback for unembedded content is more sophisticated than most hobby projects in this space. It's not a clone of anything obvious, though it conceptually overlaps with projects like Recall or personal-search tools — the agent-first design and multi-source connector architecture are the distinguishing bets.
Scores
Barrier to entry
Medium: the codebase is well-organized and readable, but setup requires configuring multiple API tokens, a Homebrew SQLite build (macOS-only currently), and optionally Ollama; the Telegram connector also has a Python script dependency (tg_sync.py) that adds a second runtime requirement.