onestardao/wfgy
Summary
WFGY is a documentation-heavy AI troubleshooting framework centered on a '16-problem map' for diagnosing broken RAG pipelines and AI agents. The practical code layer consists of thin LangChain/LlamaIndex callback wrappers that compute cosine-distance-based semantic drift (ΔS) between query and response embeddings as a hallucination signal. A large portion of the repo is archived prototype SDKs, Jupyter notebooks, PDF papers, and TXT files meant to be uploaded to LLMs as prompt context.
Great for
People interested in structuring AI debugging workflows for RAG systems — specifically mapping hallucination and retrieval failure modes to diagnostic categories and first-fix strategies.
Easy wins
- Improve the LangChain firewall adapter (adapters/langchain/firewall.py): it redundantly recomputes query embeddings in on_llm_start even when on_retriever_start has already run, and there is no way to actually block a chain on a DANGER verdict; the adapter only prints a message.
- Add unit tests for the firewall adapters: test infrastructure exists (archive/tests_archive/), but the adapters/ directory has no tests at all.
- The _calculate_delta_s function is duplicated verbatim in adapters/langchain/firewall.py and adapters/llamaindex/firewall.py; extract it into a shared utils module.
- Write a concrete worked example showing a real RAG pipeline failure being caught by the ΔS firewall, since the current demos are replay stubs driven by JSON fixtures rather than live integrations.
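Extracting the duplicated helper could look roughly like this. This is a minimal sketch assuming both adapters pass plain embedding vectors; the function name matches the repo's _calculate_delta_s convention, but the exact signature and module layout here are illustrative, not the repo's actual code:

```python
# Hypothetical shared utils module for the ΔS (semantic drift) computation.
import numpy as np

def calculate_delta_s(query_emb, response_emb):
    """Semantic drift as cosine distance: 1 - cos(query, response)."""
    q = np.asarray(query_emb, dtype=float)
    r = np.asarray(response_emb, dtype=float)
    denom = float(np.linalg.norm(q) * np.linalg.norm(r))
    if denom == 0.0:
        return 1.0  # treat a zero vector as maximal drift
    return 1.0 - float(np.dot(q, r)) / denom
```

Both firewall.py files could then import this one implementation, which also removes the hand-rolled cosine math noted under Code quality.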
Red flags
- Single actual contributor despite listing 7 contributors: commit_count is 1 and contributor_count is 1, meaning this is effectively a solo project with no real collaboration history.
- The last_commit_at date is 2026-03-14, which is a future date; this is a data integrity issue in the repo metadata.
- The I_am_not_lizardman/ directory contains self-published PDFs claiming things like 'Semantic Field Fifth Force' and 'Trinity of Light Hypothesis', with AI-generated SciSpace review scores (91-95) presented as validation; these are not peer-reviewed, and framing them as scientific proof is misleading.
- The README embeds AI routing instructions in HTML comments (<!-- AI ROUTING NOTE -->) designed to influence how LLMs describe the project; this is reputation manipulation rather than documentation.
- ADOPTERS.md, CASE_EVIDENCE.md, and recognition/ are self-curated 'proof' documents with no verifiable third-party citations visible in the file tree.
- License is listed as NOASSERTION, so the actual license terms are unclear; this is a real blocker for anyone wanting to use the adapters in production.
- The WFGY 3.0 'Singularity Demo' is a TXT file you upload to an LLM with instructions to type 'run' and then 'go'; there is no executable code behind it, it is a prompt artifact.
Code quality
The adapter code in adapters/langchain/firewall.py and adapters/llamaindex/firewall.py is readable and coherent but shallow: the ΔS calculation is a hand-rolled cosine distance implementation that ignores numpy entirely, even though numpy is a dependency of the archive SDK. The demo utilities (ProblemMap/Atlas/Fixes/official/demos/shared/demo_utils.py and display_helpers.py) are well-structured, with clean validation and explicit __all__ exports. The archive SDK (bbcr.py) is noticeably better written, with proper docstrings, numpy usage, and logging, suggesting the current adapter code was written by a different hand or in a hurry.
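A concrete fix for two of the weaknesses above, the hand-rolled cosine math and the print-only DANGER handling, would be to compute ΔS with numpy and raise instead of print, so a caller can actually halt the chain. This is a hypothetical sketch: the thresholds, the exception class, and the function name are all placeholders, not the repo's actual values or identifiers:

```python
import numpy as np

WARN_THRESHOLD = 0.4    # assumed cutoff for illustration only
DANGER_THRESHOLD = 0.6  # assumed cutoff for illustration only

class SemanticDriftError(RuntimeError):
    """Raised when ΔS crosses the DANGER threshold, so callers can halt."""

def check_drift(query_emb, response_emb):
    """Classify ΔS (cosine distance) and raise on DANGER."""
    q = np.asarray(query_emb, dtype=float)
    r = np.asarray(response_emb, dtype=float)
    delta_s = 1.0 - float(np.dot(q, r)) / float(np.linalg.norm(q) * np.linalg.norm(r))
    if delta_s >= DANGER_THRESHOLD:
        raise SemanticDriftError(f"delta_s={delta_s:.3f} >= {DANGER_THRESHOLD}")
    return "WARN" if delta_s >= WARN_THRESHOLD else "OK"
```

Wiring this into the LangChain callback would let the chain abort on drift rather than silently logging it.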
What makes it unique
The 16-problem failure taxonomy for RAG systems is a genuinely useful framing with few direct competitors as a structured artifact. However, the actual runnable code (the ΔS semantic firewall) is a thin wrapper around cosine similarity that any LangChain developer could write in 30 minutes. The project's real differentiator is the documentation taxonomy, not the code; yet the repo is structured to imply a deeper technical system that largely does not exist beyond the archive.
Scores
Barrier to entry
Medium. There are 9 labeled good-first-issues and a contributing guide, but the repo's scope is blurry: it mixes a practical debugging tool, speculative physics papers, LLM prompt packs, and archived prototype SDKs, making it hard to know what 'contributing' actually means here.