49 stars for a production RAG list on GitHub
Yigtwxx’s GitHub repo maps production RAG stacks, from LangGraph and Qdrant to Milvus, with benchmarks, pitfalls, and case studies.

GitHub repo Yigtwxx/awesome-rag-production catalogs production RAG tools, stacks, and benchmarks.
Yigtwxx/awesome-rag-production, a curated list on GitHub for building and operating retrieval-augmented generation (RAG) systems in production, has 49 stars and 17 forks. The repo was last reviewed on 2026-05-30 and says its freshness is audited weekly.
| Item | Value |
|---|---|
| GitHub stars | 49 |
| Forks | 17 |
| Last reviewed | 2026-05-30 |
| Commits | 180 |
What changed
The repository is not a single framework. It is a long-form reference map that groups tools by job: orchestration, ingestion, embeddings, vector databases, reranking, evaluation, observability, deployment, caching, security, and cost control.

Its decision guide steers readers toward different frameworks and stacks depending on use case and maturity. For complex agents, it points to LangGraph; for indexing-heavy apps, LlamaIndex; for auditable pipelines, Haystack; and for fast prototypes, LangChain. It also outlines reference stacks by scale:
- Local stack: Ollama + Chroma + Ragas for laptop-only testing.
- Mid-scale stack: Qdrant or Weaviate, plus Cohere Rerank and Langfuse or Arize Phoenix.
- Enterprise stack: Milvus, vLLM, DeepEval, and OpenLIT.

Evaluation and tracing are treated as first-class parts of the stack, not add-ons.
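
The guide's recommendations can be read as a simple lookup, sketched below in Python. This is only an illustration built from the tool names mentioned above; the dictionary keys, the `suggest` function, and the overall structure are hypothetical and do not come from the repository.

```python
# Illustrative lookup tables built only from the tool names listed above.
# The keys ("complex_agents", "local", ...) are hypothetical labels,
# not identifiers taken from the repository.
FRAMEWORK_BY_NEED = {
    "complex_agents": "LangGraph",
    "indexing_heavy": "LlamaIndex",
    "auditable_pipelines": "Haystack",
    "fast_prototypes": "LangChain",
}

STACK_BY_SCALE = {
    "local": ["Ollama", "Chroma", "Ragas"],
    "mid_scale": ["Qdrant or Weaviate", "Cohere Rerank", "Langfuse or Arize Phoenix"],
    "enterprise": ["Milvus", "vLLM", "DeepEval", "OpenLIT"],
}


def suggest(need: str, scale: str) -> dict:
    """Return the framework and stack associated with a need and a scale."""
    if need not in FRAMEWORK_BY_NEED:
        raise ValueError(f"unknown need: {need!r}")
    if scale not in STACK_BY_SCALE:
        raise ValueError(f"unknown scale: {scale!r}")
    return {
        "framework": FRAMEWORK_BY_NEED[need],
        "stack": STACK_BY_SCALE[scale],
    }


if __name__ == "__main__":
    print(suggest("fast_prototypes", "local"))
```

A real selection would weigh latency, lock-in, and operating cost rather than a two-key lookup, so treat this as a reading aid, not a tool.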
The repo also includes production case studies from LinkedIn, DoorDash, and Discord. Those examples stress hybrid search, domain-specific embeddings, reranking, and A/B testing before LLM rollout.
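
To make "hybrid search" concrete: one common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below is an assumption for illustration only. The article does not say which fusion method the case studies use, and the function name, the document IDs, and the constant `k=60` (a commonly used default) are not taken from the repository.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1. Higher total score ranks first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    keyword_hits = ["a", "b", "c"]  # hypothetical keyword-search results
    vector_hits = ["b", "d", "a"]   # hypothetical vector-search results
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

With these hypothetical inputs, documents found by both retrievers ("b" and "a") rise above those found by only one. In a production pipeline a reranker would typically rescore the fused list before it reaches the LLM.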
Why it matters
For developers, the value is practical: it compresses a messy tool search into a production checklist. Instead of starting from tutorial code, teams can compare tradeoffs in latency, vendor lock-in, observability, and deployment complexity before they commit.

For the market, the repo reflects where RAG work is moving now. The focus is less on demo quality and more on operating cost, traceability, and scale. That is useful for teams choosing between managed services and self-hosted infrastructure, especially when retrieval quality and latency have to hold up under real traffic.
The main question the list raises is simple: which RAG stack fits your scale without creating a maintenance burden you cannot support?