chrisgve/codesize

Rust · 1 contributor · MIT

Summary

codesize is a CLI tool that scans source trees and reports files/functions exceeding configurable line-count limits. It uses tree-sitter grammars to accurately detect function boundaries (not just regex line counting) across 10 built-in languages, outputs CSV results, and integrates with CI via a --fail flag. Think of it as a lightweight complexity gate you drop into GitHub Actions or pre-commit.

Great for

people interested in code quality enforcement tooling, specifically static analysis pipelines that need accurate AST-based function boundary detection without spinning up a full language server

Easy wins

  • +Add missing language grammars: Ruby (tree-sitter-ruby), Elixir, Kotlin, PHP are all available as crates — the pattern in parser.rs is a trivial match arm + func_types entry
  • +The `walk()` function in parser.rs uses recursive tree traversal with index-based child access — convert it to tree-sitter's TreeCursor API for better performance on large files
  • +Arrow-function detection in JS/TS only matches a `variable_declarator` wrapping an `arrow_function`. Missed cases: exported consts, destructured assignments, object method shorthand. Write failing tests first
  • +Add a `--format` flag (table/JSON/CSV) since the CSV-only output is limiting for interactive use; the write_csv abstraction in scanner.rs already isolates the output logic cleanly
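
The `--format` idea above could be sketched roughly as follows. This is a minimal, std-only illustration rather than the project's code: `OutputFormat`, `Finding`, `render`, and the column names are all hypothetical, and the real `write_csv` in scanner.rs may use different fields. In a clap-based CLI the enum would more likely be wired up through clap's own value parsing than through a hand-written `FromStr`.

```rust
use std::str::FromStr;

/// Hypothetical output formats for a `--format` flag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputFormat {
    Csv,
    Json,
    Table,
}

impl FromStr for OutputFormat {
    type Err = String;

    /// Parse the flag value case-insensitively.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "csv" => Ok(Self::Csv),
            "json" => Ok(Self::Json),
            "table" => Ok(Self::Table),
            other => Err(format!("unknown format: {other}")),
        }
    }
}

/// Hypothetical row type; the real tool's columns are not known here.
pub struct Finding {
    pub path: String,
    pub function: String,
    pub lines: usize,
    pub limit: usize,
}

/// Quote a CSV field only when it contains a delimiter, quote, or newline.
fn csv_field(s: &str) -> String {
    if s.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", s.replace('"', "\"\""))
    } else {
        s.to_string()
    }
}

/// Encode a string as a JSON string literal, escaping quotes, backslashes,
/// and control characters.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Render all findings in the requested format.
pub fn render(findings: &[Finding], format: OutputFormat) -> String {
    match format {
        OutputFormat::Csv => {
            let mut out = String::from("path,function,lines,limit\n");
            for f in findings {
                out.push_str(&format!(
                    "{},{},{},{}\n",
                    csv_field(&f.path),
                    csv_field(&f.function),
                    f.lines,
                    f.limit
                ));
            }
            out
        }
        OutputFormat::Json => {
            let items: Vec<String> = findings
                .iter()
                .map(|f| {
                    format!(
                        "{{\"path\":{},\"function\":{},\"lines\":{},\"limit\":{}}}",
                        json_string(&f.path),
                        json_string(&f.function),
                        f.lines,
                        f.limit
                    )
                })
                .collect();
            format!("[{}]", items.join(","))
        }
        OutputFormat::Table => {
            let mut out = String::new();
            for f in findings {
                out.push_str(&format!(
                    "{:<30} {:<24} {:>6} {:>6}\n",
                    f.path, f.function, f.lines, f.limit
                ));
            }
            out
        }
    }
}
```

Keeping every renderer behind one `render` function means the scanner only ever produces `Finding` values, so adding a format later would touch a single match arm.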

Red flags

  • !config.rs limits override: `limits.insert(lang, overrides)` replaces the entire LangLimits struct, so a user who sets only `[limits.Rust] function = 100` in their config.toml will inadvertently lose the default file limit (it won't fall back to 500). The README implies partial overrides work ('leave file limit at 300') but the code doesn't implement this — it's a user-facing bug.
  • !Silent error swallowing in analyze_file: unreadable files are silently skipped with (0, vec![]) return, giving no indication to the user that files were missed.
  • !Only 1 commit total — this is brand new and has never been used in production. The Homebrew tap and codesize-action referenced in the README may not exist yet.
  • !has_tests field in repo metadata says false, but tests ARE present inline in the source files — this is a metadata scraping artifact, not an actual code issue.
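
One way the partial-override bug in the first red flag could be addressed is a field-by-field merge. The sketch below is std-only and makes several assumptions: `LangLimitsOverride` and `merge_limits` are hypothetical names, `LangLimits` is assumed to have just `file` and `function` fields, and the function default of 80 in the usage note is invented for illustration (only the 500 file default comes from the text above). With a serde-based config, the override struct would presumably be what gets deserialized from `[limits.<Lang>]`.

```rust
use std::collections::HashMap;

/// Fully resolved limits for one language (assumed shape of `LangLimits`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LangLimits {
    pub file: usize,
    pub function: usize,
}

/// Hypothetical override type: every field is optional, so a config entry
/// can set `function` without also having to restate `file`.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct LangLimitsOverride {
    pub file: Option<usize>,
    pub function: Option<usize>,
}

/// Merge user overrides into the defaults one field at a time.
/// Languages with no built-in default start from `fallback`.
pub fn merge_limits(
    defaults: &HashMap<String, LangLimits>,
    overrides: &HashMap<String, LangLimitsOverride>,
    fallback: LangLimits,
) -> HashMap<String, LangLimits> {
    let mut merged = defaults.clone();
    for (lang, ov) in overrides {
        let base = merged.get(lang).copied().unwrap_or(fallback);
        merged.insert(
            lang.clone(),
            LangLimits {
                // An unset field keeps the existing value instead of being lost.
                file: ov.file.unwrap_or(base.file),
                function: ov.function.unwrap_or(base.function),
            },
        );
    }
    merged
}
```

Under these assumptions, a default of `LangLimits { file: 500, function: 80 }` for Rust combined with an override that sets only `function: Some(100)` resolves to a file limit of 500 and a function limit of 100, which matches the behavior the README is said to imply.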

Code quality

good

The code is clean and idiomatic Rust. Each module has a clear responsibility and the public API surface in lib.rs is minimal. The test suite in both parser.rs and scanner.rs is thorough, covering edge cases like empty source, trailing newlines, nested functions, and gitignore interactions. Two real gaps remain. First, `analyze_file` in parser.rs silently returns an empty result on read errors (line: `Err(_) => (0, Vec::new())`), swallowing IO errors like permission denied without any logging. Second, the config merging in config.rs handles overrides per language but not per field: the `limits` override replaces entire language entries rather than merging file/function independently, so a user setting only `function` will lose the default `file` limit.
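
The silent-skip behavior of `analyze_file` could be made visible by carrying the error alongside the path instead of collapsing it into an empty result. The following is a rough std-only sketch, not the project's code: `FileOutcome`, `analyze_path`, and `report_skipped` are hypothetical names, and plain line counting stands in for the real tree-sitter analysis.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical result of trying to analyze one file.
#[derive(Debug)]
pub enum FileOutcome {
    /// The file was read; `lines` is its line count.
    Analyzed { path: PathBuf, lines: usize },
    /// The file could not be read; the IO error is kept for reporting.
    Skipped { path: PathBuf, error: io::Error },
}

/// Read a file and count its lines, preserving any read error
/// rather than returning an empty result.
pub fn analyze_path(path: &Path) -> FileOutcome {
    match fs::read_to_string(path) {
        Ok(source) => FileOutcome::Analyzed {
            path: path.to_path_buf(),
            lines: source.lines().count(),
        },
        Err(error) => FileOutcome::Skipped {
            path: path.to_path_buf(),
            error,
        },
    }
}

/// Print one warning per skipped file to stderr and return how many were skipped.
pub fn report_skipped(outcomes: &[FileOutcome]) -> usize {
    let mut skipped = 0;
    for outcome in outcomes {
        if let FileOutcome::Skipped { path, error } = outcome {
            eprintln!("warning: skipped {}: {}", path.display(), error);
            skipped += 1;
        }
    }
    skipped
}
```

Writing warnings to stderr keeps the CSV on stdout clean, and the returned count gives a CI run something to act on if skipped files should be treated as a failure.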

What makes it unique

There are other line-count linters (e.g., lizard, scc with some metrics) but most use regex or heuristic parsing. Using tree-sitter for function boundary detection is the genuine differentiator here — it correctly handles nested functions, multiline signatures, and language-specific constructs that fool line-based tools. The CSV output targeting task-management workflows (rather than just CI pass/fail) is a practical angle most similar tools ignore.

Scores

Collab: 3
Activity: 3

Barrier to entry

low

The entire codebase is ~600 lines across 4 files with clear single-responsibility modules (config.rs, parser.rs, scanner.rs, main.rs), comprehensive inline tests, and CI already set up — a new contributor can clone, run `cargo test`, and understand the full system in under an hour.

Skills needed

  • Rust (intermediate: iterators, trait bounds, error handling with anyhow)
  • tree-sitter API (how to walk ASTs, grammar node types per language)
  • CLI design with clap derive macros
  • Basic understanding of gitignore-style file filtering (the ignore crate)