Architecture
This page replaces the former root-level ARCHITECTURE.md file.
System Overview
The CLI is split into five subsystems:
- command parsing and validation
- auth and host resolution
- native search execution
- enhanced discovery and reranking
- rendering and error reporting
Request Pipeline
Native Search
- Resolve effective host.
- Resolve host-scoped PAT.
- Parse the query and structured flags.
- Validate conflicts against raw query qualifiers.
- Build one GitHub repository-search REST request.
- Fetch one result set.
- Apply post-filters that are not natively expressible, such as updated-*.
- Render in the selected output format.
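The request-building and post-filter steps above can be sketched as follows. This is a minimal illustration, not the CLI's actual code: the function names, the per_page default, and the choice of an "updated at or after" filter are assumptions. The /search/repositories path and the updated_at field come from the GitHub REST API.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def build_search_url(api_base: str, query: str, per_page: int = 30) -> str:
    """Build one repository-search request URL (GET /search/repositories)."""
    params = urlencode({"q": query, "per_page": per_page})
    return f"{api_base.rstrip('/')}/search/repositories?{params}"


def filter_updated_after(repos: list, cutoff: datetime) -> list:
    """Client-side post-filter: keep repos updated at or after the cutoff."""
    kept = []
    for repo in repos:
        # GitHub timestamps look like 2024-05-01T12:00:00Z (UTC).
        updated = datetime.strptime(
            repo["updated_at"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if updated >= cutoff:
            kept.append(repo)
    return kept
```

The filter runs after the single fetch, so it can only shrink the returned result set; it never triggers extra requests.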
Discovery Search
- Resolve effective host and PAT.
- Parse and validate flags.
- Build the seed search.
- Collect candidates using the staged fan-out plan for the selected depth.
- Dedupe on full_name.
- Enrich candidates with repo metadata needed for activity and quality scoring.
- Optionally enrich the top README window if --readme is enabled.
- Compute the selected rank mode.
- Apply limit.
- Render output, optionally with --explain.
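The dedupe step in the pipeline above might look like the sketch below. It assumes candidates are plain dicts carrying the full_name field from the GitHub API, and that the first occurrence wins so the fan-out order is preserved. Case-insensitive matching is also an assumption.

```python
def dedupe_by_full_name(candidates: list) -> list:
    """Drop repeated repositories, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for repo in candidates:
        # Assumption: compare case-insensitively, since owner/repo names
        # on GitHub do not differ by case alone.
        key = repo["full_name"].lower()
        if key not in seen:
            seen.add(key)
            unique.append(repo)
    return unique
```

Deduplicating before enrichment keeps the number of metadata requests proportional to the number of distinct repositories rather than the number of fan-out hits.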
Inspect
- Resolve effective host and PAT.
- Fetch repo metadata by explicit owner/repo.
- Optionally fetch README if --readme is enabled.
- Render details.
Storage Model
Secrets
- PATs are stored per host in the OS credential store
- env vars override stored credentials
- insecure fallback storage is opt-in only
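The precedence between environment variables and stored credentials could be sketched as below. The environment variable name EXAMPLE_CLI_TOKEN is hypothetical, and a plain dict stands in for the OS credential store; neither is taken from the CLI itself.

```python
import os
from typing import Mapping, Optional


def resolve_pat(
    host: str,
    stored: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
    env_var: str = "EXAMPLE_CLI_TOKEN",  # hypothetical variable name
) -> Optional[str]:
    """Return the PAT for a host, letting the environment override the store."""
    environment = os.environ if env is None else env
    token = environment.get(env_var)
    if token:
        return token
    # Fall back to the per-host entry in the (stand-in) credential store.
    return stored.get(host)
```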
Config
- store non-secret settings in a per-user config file
- config never stores behavioral search defaults that would silently enable enhanced behavior
Cache / Index
- v1 uses per-command ephemeral in-memory state only
- no persistent repo metadata DB survives across commands
- no persistent discovery index in default behavior
Scoring Model
Normalization
Each score component is normalized to a stable 0..1 range before ranking.
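One way to get a stable 0..1 value is clamped min-max scaling, sketched below. The source does not say which normalization the CLI uses, so the technique and the low/high bounds are assumptions.

```python
def normalize(value: float, low: float, high: float) -> float:
    """Scale value into 0..1 using fixed bounds, clamping out-of-range input."""
    if high <= low:
        # Degenerate range: nothing to scale against.
        return 0.0
    scaled = (value - low) / (high - low)
    return max(0.0, min(1.0, scaled))
```

Using fixed bounds rather than the minimum and maximum of the current result set keeps a repository's score the same from one query to the next.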
Component Use
- native rank: preserve retrieval order
- query rank: query score only
- activity rank: activity score only
- quality rank: quality score only
- blended rank: weighted average of query, activity, and quality
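The rank modes above can be sketched as a single scoring function plus a stable sort. The dict keys (query, activity, quality), the default equal weights, and the function names are assumptions made for illustration.

```python
def score(repo: dict, mode: str, weights=(1.0, 1.0, 1.0)) -> float:
    """Return the ranking score for one repo under a non-native rank mode."""
    if mode in ("query", "activity", "quality"):
        return repo[mode]
    if mode == "blended":
        wq, wa, wl = weights
        total = wq + wa + wl
        if total <= 0:
            raise ValueError("weights must sum to a positive value")
        # Weighted average of the three normalized components.
        return (wq * repo["query"] + wa * repo["activity"] + wl * repo["quality"]) / total
    raise ValueError(f"unknown rank mode: {mode}")


def rank(repos: list, mode: str, weights=(1.0, 1.0, 1.0)) -> list:
    """Order repos by the selected mode; native keeps retrieval order."""
    if mode == "native":
        return list(repos)
    # sorted() is stable, so ties keep their retrieval order.
    return sorted(repos, key=lambda r: score(r, mode, weights), reverse=True)
```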
Explainability
Every enhanced score must be explainable from explicit inputs and repo facts.
- no learned personalization
- no history-based bias
Progress And Concurrency
Progress
- human progress output uses stderr
- structured stdout remains clean
- progress is phase-based:
- searching
- collecting
- enriching metadata
- enriching README
- ranking
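The stream split described above can be sketched as two small helpers. The function names and the message format are assumptions; the point is only that phase messages and structured results never share a stream.

```python
import json
import sys


def report_phase(phase: str, stream=None) -> None:
    """Write a human-readable phase message to stderr."""
    out = sys.stderr if stream is None else stream
    print(f"{phase}...", file=out)


def emit_result(payload, stream=None) -> None:
    """Write the structured result to stdout, one JSON document per line."""
    out = sys.stdout if stream is None else stream
    json.dump(payload, out)
    out.write("\n")
```

With this split, piping stdout into another tool yields only JSON, while progress stays visible in the terminal.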
Concurrency
- official GitHub guidance favors serial requests to avoid secondary rate limits
- default concurrency is 1
- --concurrency is advanced and explicit
- on secondary-rate-limit responses, the client must:
- respect retry-after
- respect x-ratelimit-reset
- otherwise back off exponentially with jitter
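The retry rules above could be turned into a delay calculation like the sketch below. The function name, the base and cap values, and the use of full jitter are assumptions. The sketch also assumes x-ratelimit-reset is only honored when x-ratelimit-remaining is 0, since the reset header is present on ordinary responses too.

```python
import random
import time


def retry_delay(headers: dict, attempt: int, now=None, base: float = 1.0, cap: float = 60.0) -> float:
    """Return how many seconds to wait before retrying a rate-limited request."""
    current = time.time() if now is None else now
    lowered = {k.lower(): v for k, v in headers.items()}

    # 1. retry-after carries a wait in seconds.
    if "retry-after" in lowered:
        return max(0.0, float(lowered["retry-after"]))

    # 2. x-ratelimit-reset is a UTC epoch timestamp for the next window.
    if lowered.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in lowered:
        return max(0.0, float(lowered["x-ratelimit-reset"]) - current)

    # 3. Otherwise: exponential backoff with full jitter, capped.
    return random.uniform(0.0, min(cap, base * 2 ** attempt))
```

A caller would sleep for the returned duration and increment attempt on each consecutive failure.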
Host Handling
Normalization
Accept:
- github.com
- github.example.com
- https://github.example.com
- https://github.example.com/api/v3
Normalize to:
- canonical web host
- canonical REST API base URL
Canonical values:
- GitHub.com web host -> github.com
- GitHub.com REST base -> https://api.github.com
- GHES REST base -> https://<host>/api/v3
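A minimal sketch of this normalization, assuming the function name normalize_host and a (web host, REST base) tuple as the return shape. It ignores ports and any path other than the ones listed, which a real implementation may need to handle.

```python
from urllib.parse import urlparse


def normalize_host(raw: str) -> tuple:
    """Map an accepted host form to (canonical web host, REST API base URL)."""
    text = raw.strip()
    if "://" not in text:
        # Bare hostnames get a scheme so urlparse can find the host part.
        text = "https://" + text
    host = (urlparse(text).hostname or "").lower()
    if not host:
        raise ValueError(f"invalid host: {raw!r}")
    if host == "github.com":
        return "github.com", "https://api.github.com"
    # Any other host is treated as GitHub Enterprise Server.
    return host, f"https://{host}/api/v3"
```

All four accepted forms for an enterprise host collapse to the same pair, so PAT lookup can key on the canonical web host.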
Support Promise
- GitHub.com is the primary target
- custom hosts are best-effort in v1
- the CLI should always send a stable API version header where supported
Validation And Conflict Policy
First-Error Policy
- validate flags before any network work where possible
- return the first clear error only
Examples Of Enforced Conflicts
- raw qualifier plus overlapping structured flag
- --created-after plus --created-within
- --weight-query without --rank blended
- --concurrency without explicit discovery or enrichment mode
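The first-error policy and the conflicts listed above can be sketched as one function that returns as soon as it finds a problem. The flags dict, the discover and enrich keys, the --language flag, and the flag-to-qualifier table are all hypothetical stand-ins for whatever the parser produces.

```python
from typing import Optional

# Hypothetical mapping from structured flag to the raw qualifier it overlaps.
OVERLAPPING_QUALIFIERS = {
    "language": "language",
    "created_after": "created",
    "created_within": "created",
}


def first_conflict(flags: dict, raw_qualifiers: set) -> Optional[str]:
    """Return the first conflict message, or None if the flags are consistent."""
    for flag, qualifier in OVERLAPPING_QUALIFIERS.items():
        if flags.get(flag) is not None and qualifier in raw_qualifiers:
            name = flag.replace("_", "-")
            return f"--{name} conflicts with raw qualifier {qualifier}:"
    if flags.get("created_after") is not None and flags.get("created_within") is not None:
        return "--created-after cannot be combined with --created-within"
    if flags.get("weight_query") is not None and flags.get("rank") != "blended":
        return "--weight-query requires --rank blended"
    if flags.get("concurrency") is not None and not (flags.get("discover") or flags.get("enrich")):
        return "--concurrency requires discovery or enrichment mode"
    return None
```

Because the function touches no network state, it can run before host or PAT resolution.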
Output Contracts
Success
- pretty is human-first and concise
- json is stable and structured
- compact is minified JSON
- csv is flat export-friendly output
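The four formats could be produced from the same result list, as in the sketch below. The column set (full_name, stars), the pretty layout, and the sorted JSON keys are assumptions chosen to keep the example small.

```python
import csv
import io
import json

FIELDS = ["full_name", "stars"]  # hypothetical column set


def render(repos: list, fmt: str) -> str:
    """Render a list of repo dicts in one of the supported output formats."""
    if fmt == "json":
        return json.dumps(repos, indent=2, sort_keys=True)
    if fmt == "compact":
        # Minified JSON: no whitespace between tokens.
        return json.dumps(repos, separators=(",", ":"), sort_keys=True)
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n")
        writer.writeheader()
        writer.writerows(repos)
        return buf.getvalue()
    if fmt == "pretty":
        return "\n".join(f"{r['full_name']}  ({r['stars']} stars)" for r in repos)
    raise ValueError(f"unknown output format: {fmt}")
```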
Errors
- stderr only
- symbolic error code prefix
- plain text message
- shell exit code 1
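The error contract above might be implemented with a helper like the one sketched here. The "CODE: message" layout and the example code name are assumptions; the source only specifies a symbolic prefix, plain text, stderr, and exit code 1.

```python
import sys


def report_error(code: str, message: str, stream=None) -> int:
    """Print a symbolic-code-prefixed error to stderr and return exit code 1."""
    out = sys.stderr if stream is None else stream
    print(f"{code}: {message}", file=out)
    return 1
```

A command entry point could then end with sys.exit(report_error("FLAG_CONFLICT", "...")), where FLAG_CONFLICT is a made-up code used only for illustration.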