dirgrab
dirgrab walks a directory (or Git repository), selects the files that matter, and concatenates their contents for easy copy/paste into language models. It can write to stdout, a file, or your clipboard, and it ships with a library crate so the same logic can be embedded elsewhere.
Highlights
- Configurable defaults: merge built-in defaults with the global config.toml, project-local .dirgrab.toml, .dirgrabignore, and CLI flags.
- Git-aware out of the box: untracked files are included by default, scoped to the selected subdirectory, with --tracked-only and --all-repo to opt out.
- Structured context: optional directory tree, per-file headers, PDF text extraction, and deterministic file ordering for stable diffs.
- Better stats: -s/--stats now prints summary totals plus a per-file token leaderboard, and you can pick which reports to show each run.
- Safety nets: automatically ignores the active output file, respects .gitignore, and gracefully skips binary/non-UTF8 files.
Installation
# or from a local checkout
# cargo install --path .
Check it worked:
Usage
TARGET_PATH defaults to the current directory. When invoked inside a Git repo, dirgrab scopes the listing to that subtree unless you pass --all-repo.
Common Options
- -o, --output [FILE]: write to a file (defaults to dirgrab.txt if no name is given). Conflicts with --clipboard.
- -c, --clipboard: copy to the system clipboard instead of stdout or a file.
- --no-headers / --no-tree / --no-pdf: disable headers, the directory tree, or PDF extraction.
- -e, --exclude <PATTERN>: add glob-style excludes (applied after config files).
- --tracked-only: (Git mode) limit to tracked files. Compatibility note: -u/--include-untracked still forces inclusion if you need it.
- --all-repo: (Git mode) operate on the entire repository even if the target is a subdirectory.
- --include-default-output: allow dirgrab.txt back into the run.
- --no-git: ignore Git context entirely and walk the filesystem.
- --no-config: ignore global/local config files and .dirgrabignore.
- --config <FILE>: load an additional TOML config file (applied after global/local unless --no-config).
- --token-ratio <FLOAT>: override the characters-to-tokens ratio used by --stats (defaults to 3.6).
- --tokens-exclude-tree / --tokens-exclude-headers: subtract tree or header sections when estimating tokens.
- -s, --stats [REPORT...]: print stats reports to stderr. Defaults to overview + top-files=5; provide explicit reports like --stats overview top-files=10.
- -v, -vv, -vvv: increase log verbosity (Warn, Info, Debug, Trace).
- -h, --help / -V, --version: CLI boilerplate.
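To make the token options concrete, here is a minimal sketch of how a character-based estimate could work. It assumes the ratio means "characters per token" and that the estimate rounds up; neither detail is confirmed by this README, and the names Sections, estimate_tokens, and estimate_output_tokens are hypothetical rather than part of dirgrab.

```rust
/// Hypothetical view of the output, split into the parts the token flags mention.
pub struct Sections<'a> {
    pub tree: &'a str,
    pub headers: &'a str,
    pub contents: &'a str,
}

/// Rough estimate: character count divided by a characters-per-token ratio.
/// Rounding up is an assumption made for this sketch.
pub fn estimate_tokens(text: &str, chars_per_token: f64) -> usize {
    if chars_per_token <= 0.0 {
        return 0; // guard against a nonsensical ratio
    }
    let chars = text.chars().count() as f64;
    (chars / chars_per_token).ceil() as usize
}

/// Estimate tokens for the whole output, optionally leaving out the tree
/// and/or header sections (mirroring the idea behind --tokens-exclude-*).
pub fn estimate_output_tokens(
    sections: &Sections,
    chars_per_token: f64,
    exclude_tree: bool,
    exclude_headers: bool,
) -> usize {
    if chars_per_token <= 0.0 {
        return 0;
    }
    // File contents always count; tree and headers count unless excluded.
    let mut chars = sections.contents.chars().count();
    if !exclude_tree {
        chars += sections.tree.chars().count();
    }
    if !exclude_headers {
        chars += sections.headers.chars().count();
    }
    (chars as f64 / chars_per_token).ceil() as usize
}
```

Under these assumptions, a larger ratio yields a smaller estimate, which is why tuning --token-ratio per model family can be useful.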
Configuration Files
dirgrab layers configuration in the following order (later wins):
- Built-in defaults
- Global config + ignore
  - Linux: ~/.config/dirgrab/config.toml & ~/.config/dirgrab/ignore
  - macOS: ~/Library/Application Support/dirgrab/config.toml & …/ignore
  - Windows: %APPDATA%\dirgrab\config.toml & ignore
- Project-local config: <target>/.dirgrab.toml
- Project-local ignore patterns: <target>/.dirgrabignore
- CLI flags (--tracked-only, --no-tree, etc.)
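The "later wins" rule can be pictured as folding a stack of partial settings, where each layer only overrides what it explicitly sets. The sketch below illustrates that idea in isolation; the Layer struct and its field names are hypothetical and are not dirgrab's actual config keys or internal types.

```rust
/// One configuration layer. `None` means "this layer does not set the option".
/// Field names are illustrative only.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Layer {
    pub include_tree: Option<bool>,
    pub tracked_only: Option<bool>,
    pub token_ratio: Option<f64>,
}

impl Layer {
    /// Overlay `later` on top of `self`: any value set in `later` wins,
    /// otherwise the earlier value is kept.
    pub fn overlay(self, later: Layer) -> Layer {
        Layer {
            include_tree: later.include_tree.or(self.include_tree),
            tracked_only: later.tracked_only.or(self.tracked_only),
            token_ratio: later.token_ratio.or(self.token_ratio),
        }
    }
}

/// Fold layers in precedence order: earliest (built-in defaults) first,
/// latest (CLI flags) last.
pub fn resolve(layers: Vec<Layer>) -> Layer {
    layers.into_iter().fold(Layer::default(), Layer::overlay)
}
```

With this shape, a CLI flag overrides a project-local value, which in turn overrides a global one, and anything left unset falls through to the built-in defaults.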
Sample config.toml:
[]
= ["Cargo.lock", "*.csv", "node_modules/", "target/"]
= true
= true
= true
= false
= false
[]
= true
= 3.6
= ["tree"]
= ["overview", "top-files=8"]
ignore files use the same syntax as .gitignore. CLI -e patterns and the active output file name are appended last, so the freshly written file is never re-ingested accidentally.
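Order matters here because, in .gitignore syntax, the last matching pattern decides the outcome. A minimal sketch of assembling the pattern list in that order follows; effective_excludes is a hypothetical helper written for illustration, not a function exposed by dirgrab.

```rust
/// Build the final exclude list: patterns from config/ignore files first,
/// then CLI -e patterns, then the active output file (if any) last.
/// Hypothetical helper; dirgrab's internals may differ.
pub fn effective_excludes(
    config_patterns: &[&str],
    cli_patterns: &[&str],
    output_file: Option<&str>,
) -> Vec<String> {
    let mut patterns: Vec<String> =
        config_patterns.iter().map(|p| p.to_string()).collect();
    // CLI excludes come after everything read from config files.
    patterns.extend(cli_patterns.iter().map(|p| p.to_string()));
    // The output file goes last so no earlier pattern can re-include it.
    if let Some(name) = output_file {
        patterns.push(name.to_string());
    }
    patterns
}
```

Placing the output file at the very end means that even a negated pattern from a config file cannot pull it back into the run.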
Examples
# Grab the current repo subtree (includes untracked files) and show stats
# Limit to tracked files only and exclude build artifacts
# Force a whole-repo snapshot from within a subdirectory
# Plain directory mode with custom excludes, writing to the default file
# Use project defaults but ignore configs for a "clean" run
Behaviour Notes
- Git scope & ordering: Paths are gathered via git ls-files, scoped to the target subtree unless --all-repo is set, and the final list is sorted for deterministic output. Non-Git mode uses walkdir with the same ordering.
- File headers & tree: Headers and tree sections remain enabled by default; toggle them per run or through config files.
- PDF handling: Text is extracted from PDFs unless disabled. Failures and binary files are skipped with informative (but less noisy) logs.
- Stats: When --stats is active (or enabled in config), stderr shows the requested reports (default: totals + top files). Exclude tree/headers, adjust the ratio, or pick different reports via config or CLI.
- Safety: dirgrab.txt stays excluded unless explicitly re-enabled, and any active -o FILE target is auto-excluded for that run.
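The scoping and ordering rule in the first note can be sketched with the standard library alone. This is an illustration under the assumption that paths are repository-relative; scope_and_sort is a hypothetical name, and the real tool obtains its paths from git ls-files or walkdir rather than from a prepared vector.

```rust
use std::path::{Path, PathBuf};

/// Keep only paths under `subtree` (unless `all_repo` is set), then sort
/// so the output order is stable from run to run.
pub fn scope_and_sort(paths: Vec<PathBuf>, subtree: &Path, all_repo: bool) -> Vec<PathBuf> {
    let mut selected: Vec<PathBuf> = paths
        .into_iter()
        // Path::starts_with compares whole components, so "src2/x.rs"
        // is not treated as being inside "src".
        .filter(|p| all_repo || p.starts_with(subtree))
        .collect();
    selected.sort();
    selected.dedup(); // drop duplicates, which are adjacent after sorting
    selected
}
```

Sorting after filtering keeps the result independent of the order in which the underlying listing happened to return files, which is what makes diffs between runs stable.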
Library (dirgrab-lib)
The same engine powers dirgrab-lib; import it to drive custom tooling:
use dirgrab_lib::{grab_contents, GrabConfig};
// build a GrabConfig and call grab_contents(&config)
See docs.rs for API details.
Changelog
See CHANGELOG.md for the full release history.
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT license (LICENSE-MIT)
Contributing
Issues and PRs are welcome! Please run cargo fmt, cargo clippy, and cargo test before submitting.