RAG Initiative

Knowledge-Oriented Retrieval-Augmented Generation

We move from competition-winning systems to reusable frameworks, datasets, and surveys that improve the robustness, efficiency, and safety of Retrieval-Augmented Generation pipelines.

6+ Active RAG Tracks
Meta KDD '24 🥈 Competition Finish
ACL, CIKM, WWW & FCS 2025-2026 Acceptances
1.2k★ Agent-R1 Community

Flagship Projects

Systems, Benchmarks, and Knowledge Resources

Each project pushes Retrieval-Augmented Generation forward, from competition-grade systems to public benchmarks and surveys.

CRAG Framework

KDD Cup CRAG 2024 · Silver Medal & FCS

Second-place solution for Meta's CRAG challenge, delivering reliable multi-hop retrieval and controlled generation; technical report accepted to FCS.

Competition Repo →
PruningRAG Framework

PruningRAG · CIKM 2025

Plug-and-play framework with multi-source pruning strategies that filter noisy knowledge before generation; accepted to CIKM 2025.

Paper & Framework →
HoH Benchmark

HoH Benchmark · ACL 2025

Measures how outdated information hurts RAG reliability; now part of the ACL 2025 Main Conference program.

ArXiv →
Knowledge-Oriented Survey

Comprehensive taxonomy and evaluation review for RAG systems, covering mechanisms, applications, and open problems.

Survey Repo →
Agent-R1 Framework

Agent-R1 · 1.2k★

End-to-end reinforcement learning agent that couples retrieval, planning, and execution for complex tasks; community support now exceeds 1.2k GitHub stars.

Framework →
MemWeaver Framework

MemWeaver · WWW 2026

Hierarchical memory framework for personalized generation from user textual history.

Paper →

Timeline

Milestones in Retrieval-Augmented Generation

2024

Meta KDD Cup · CRAG Track & FCS

Multi-stage retrieval, reranking, and controllable decoding earned a silver medal among 1,400+ teams; technical report accepted to FCS.

Details →

2024

PruningRAG · CIKM 2025

Multi-granularity pruning policies for source selection that reduce hallucinations in deployed RAG agents; accepted as a paper at CIKM 2025.

Framework →

2025

HoH Benchmark · ACL 2025

First dataset to stress-test RAG with outdated knowledge, quantifying how freshness affects downstream reasoning accuracy; accepted to the ACL 2025 Main Conference.

Paper →

2025

Knowledge-Oriented RAG Survey

Comprehensive taxonomy that catalogs retrieval strategies, evaluation lenses, and safety considerations for production RAG systems.

Survey Repo →

2025

Agent-R1 · 1.2k★ Community

Open-source Agent-R1 crosses 1.2k stars as researchers adopt its retrieval-aware RL training loop for trustworthy autonomous agents.

Framework →

2025

MemWeaver · WWW 2026

Hierarchical memory framework for personalized generation from user textual history; accepted to WWW 2026.

Paper →

Resources

Toolchains, Data, and Community

Competition Toolkit

Modular retrievers, rerankers, and decoders extracted from the KDD Cup pipeline for rapid experimentation.

Pruning Policies

Multi-source heuristics that cull redundant passages so that only the remaining context is fed to the generator.
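
As a rough illustration of culling redundant passages before generation, the sketch below drops any passage whose token overlap with an already kept passage is too high. It is not the PruningRAG implementation: the function names (`tokenize`, `jaccard`, `prune_redundant`), the whitespace tokenizer, and the 0.8 threshold are all assumptions chosen for the example.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase whitespace tokenization; a stand-in for a real tokenizer."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def prune_redundant(passages: list[str], threshold: float = 0.8) -> list[str]:
    """Keep passages in order, dropping any that overlaps a kept one too much.

    The threshold is a hypothetical default, not a tuned value.
    """
    kept: list[str] = []
    kept_tokens: list[set[str]] = []
    for passage in passages:
        tokens = tokenize(passage)
        # Keep the passage only if it is sufficiently different from every kept one.
        if all(jaccard(tokens, seen) < threshold for seen in kept_tokens):
            kept.append(passage)
            kept_tokens.append(tokens)
    return kept


# Toy passages: the second is a case-variant duplicate of the first.
passages = [
    "Paris is the capital of France.",
    "paris is the capital of france.",
    "The Eiffel Tower opened in 1889.",
]
pruned = prune_redundant(passages)
```

Here `pruned` keeps the first and third passages. A production policy would more likely compare embeddings and weigh source reliability; the greedy order-preserving loop is only the simplest form of the idea.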

Temporal Benchmarks

HoH-style datasets that capture changing facts and measure model freshness.
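
One way to picture a freshness measurement is to label each model answer as current, outdated, or neither, then report the share of each. The sketch below assumes a hypothetical record layout (`TemporalQuestion`) and exact-match scoring; it does not reflect the actual HoH data format or metrics, and the sample questions are placeholders.

```python
from dataclasses import dataclass


@dataclass
class TemporalQuestion:
    """Hypothetical record: one question whose answer has changed over time."""
    question: str
    current_answer: str
    outdated_answers: tuple[str, ...]


def classify(answer: str, item: TemporalQuestion) -> str:
    """Label an answer 'current', 'outdated', or 'other' by exact match."""
    normalized = answer.strip().lower()
    if normalized == item.current_answer.lower():
        return "current"
    if normalized in {a.lower() for a in item.outdated_answers}:
        return "outdated"
    return "other"


def freshness_report(items: list[TemporalQuestion], answers: list[str]) -> dict[str, float]:
    """Return the fraction of answers falling under each label."""
    if len(items) != len(answers):
        raise ValueError("items and answers must have the same length")
    counts = {"current": 0, "outdated": 0, "other": 0}
    for item, answer in zip(items, answers):
        counts[classify(answer, item)] += 1
    total = len(items)
    return {label: (n / total if total else 0.0) for label, n in counts.items()}


# Placeholder data, not drawn from any real benchmark.
items = [
    TemporalQuestion("Who leads ExampleCorp?", "Alice", ("Bob",)),
    TemporalQuestion("What is the latest ExampleOS version?", "3.0", ("2.0", "1.0")),
]
report = freshness_report(items, ["Alice", "2.0"])
```

With these placeholder answers, half are current and half are outdated. Tracking the outdated share separately from other errors is the point of the design: it distinguishes stale knowledge from ordinary mistakes.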

Awesome RAG Papers

Curated reading list and taxonomy for anyone building RAG systems in production.

Why Retrieval Still Matters

We prioritize grounded reasoning by constraining large language models with curated, verifiable context. Our work spans pruning, benchmark design, and agent training to keep answers safe and up to date.

Responsible Systems

Multi-source pruning and audits reduce hallucinations and encourage evidence-backed responses.

Continuous Evaluation

Automated freshness tests and safety audits keep RAG deployments aligned with evolving knowledge and risk requirements.

Open Collaboration

Repositories, dataset releases, and surveys lower the barrier for the broader RAG community.