Quickstart

Installation

pip install rag-audit

Or with uv:

uv add rag-audit

CLI usage

1. Create a config file

pipeline.json

{
  "pipeline_id": "my-pipeline",
  "question": "What is the capital of France?",
  "answer": "Paris is the capital of France.",
  "contexts": [
    "Paris is the capital and largest city of France.",
    "France is a country in Western Europe."
  ],
  "relevant": [
    "Paris is the capital and largest city of France."
  ],
  "k": 2,
  "llm": {
    "provider": "openai",
    "model": "gpt-4o-mini"
  }
}

| Field | Type | Description |
| --- | --- | --- |
| `pipeline_id` | `string` | Identifier for the pipeline |
| `question` | `string` | The question posed to the RAG pipeline |
| `answer` | `string` | The answer generated by the pipeline |
| `contexts` | `string[]` | Retrieved chunks, in rank order |
| `relevant` | `string[]` | Ground-truth relevant chunks |
| `k` | `int` | Number of top chunks to evaluate (default: `5`) |
| `llm.provider` | `"openai"` \| `"anthropic"` | LLM provider for the faithfulness judge |
| `llm.model` | `string` | Model name (e.g. `"gpt-4o-mini"`) |
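
Before running the audit, it can help to sanity-check the config file. The following is a minimal sketch using only the standard library. `check_config` is a hypothetical helper, not part of rag-audit, and treating every field except `k` as required is an assumption based on the field table, not something the tool is confirmed to enforce.

```python
import json

# Field names and types follow the table above.
# Assumption: all of these are required; rag-audit may be more lenient.
EXPECTED_TYPES = {
    "pipeline_id": str,
    "question": str,
    "answer": str,
    "contexts": list,
    "relevant": list,
}
PROVIDERS = ("openai", "anthropic")


def check_config(config):
    """Return a list of problems found in a parsed config (empty if none)."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in config:
            problems.append(f"missing field: {name}")
        elif not isinstance(config[name], expected):
            problems.append(f"{name} should be a {expected.__name__}")

    k = config.get("k", 5)  # 5 is the documented default
    if isinstance(k, bool) or not isinstance(k, int) or k < 1:
        problems.append("k should be a positive integer")

    llm = config.get("llm")
    if not isinstance(llm, dict):
        problems.append("missing field: llm")
    else:
        if llm.get("provider") not in PROVIDERS:
            problems.append("llm.provider should be 'openai' or 'anthropic'")
        if not isinstance(llm.get("model"), str):
            problems.append("llm.model should be a string")
    return problems


# Inline copy of the example config so the sketch runs without any files.
sample = json.loads("""
{
  "pipeline_id": "my-pipeline",
  "question": "What is the capital of France?",
  "answer": "Paris is the capital of France.",
  "contexts": [
    "Paris is the capital and largest city of France.",
    "France is a country in Western Europe."
  ],
  "relevant": ["Paris is the capital and largest city of France."],
  "k": 2,
  "llm": {"provider": "openai", "model": "gpt-4o-mini"}
}
""")
print(check_config(sample))
```

For a real file, load it with `json.load` and pass the resulting dict to `check_config`; an empty list means nothing was flagged.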

2. Run the audit

export OPENAI_API_KEY=sk-...
rag-audit run pipeline.json -o result.json
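
The `contexts`, `relevant`, and `k` fields are the inputs that rank-based retrieval metrics typically need. This quickstart does not list which metrics rag-audit computes, so the sketch below is only an illustration of two common ones, precision@k and recall@k. It assumes a retrieved chunk counts as relevant when it matches a ground-truth chunk exactly; both function names are hypothetical.

```python
def precision_at_k(contexts, relevant, k):
    """Fraction of the top-k retrieved chunks that are relevant."""
    if k < 1:
        raise ValueError("k must be at least 1")
    relevant_set = set(relevant)
    hits = sum(1 for chunk in contexts[:k] if chunk in relevant_set)
    return hits / k


def recall_at_k(contexts, relevant, k):
    """Fraction of the relevant chunks that appear in the top-k retrieved chunks."""
    if k < 1:
        raise ValueError("k must be at least 1")
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    found = relevant_set & set(contexts[:k])
    return len(found) / len(relevant_set)


# Values taken from the example pipeline.json.
contexts = [
    "Paris is the capital and largest city of France.",
    "France is a country in Western Europe.",
]
relevant = ["Paris is the capital and largest city of France."]

print(precision_at_k(contexts, relevant, k=2))  # 0.5
print(recall_at_k(contexts, relevant, k=2))     # 1.0
```

With the example config, one of the two retrieved chunks is relevant (precision 0.5) and the only relevant chunk is retrieved (recall 1.0).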

3. Generate a report

# Markdown (default)
rag-audit report result.json

# JSON
rag-audit report result.json --format json

Python API

Audit a pipeline

from rag_audit.core.config import LLMConfig, PipelineConfig
from rag_audit.core.runner import AuditRunner
from rag_audit.report.renderer import ReportRenderer

config = PipelineConfig(
    pipeline_id="my-pipeline",
    question="What is the capital of France?",
    answer="Paris is the capital of France.",
    contexts=["Paris is the capital and largest city of France."],
    relevant=["Paris is the capital and largest city of France."],
    k=1,
    llm=LLMConfig(provider="openai", model="gpt-4o-mini"),
)

# Run the audit, then render the resulting report as Markdown.
report = AuditRunner(config).run()
print(ReportRenderer().to_markdown(report))

Compare chunking strategies

from langchain_openai import OpenAIEmbeddings

from rag_audit.chunker import ChunkingEvaluator, FixedSizeChunker, RecursiveChunker, SemanticChunker

# The same embedding model is passed to the evaluator and to the semantic chunker below.
embeddings = OpenAIEmbeddings()
evaluator = ChunkingEvaluator(embeddings)

document = "Your long document text here..."

report = evaluator.evaluate(
    document,
    {
        "fixed": FixedSizeChunker(chunk_size=500, overlap=50),
        "recursive": RecursiveChunker(chunk_size=500),
        "semantic": SemanticChunker(embeddings, similarity_threshold=0.8),
    },
)

print(f"Best strategy: {report.best_strategy}")
for s in report.strategies:
    print(f"  {s.strategy}: avg_cohesion={s.avg_cohesion:.3f}, chunks={s.chunk_count}")
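
`FixedSizeChunker(chunk_size=500, overlap=50)` names a common strategy: slide a fixed-width window over the text, stepping by `chunk_size - overlap` so that neighbouring chunks share some content. Here is a minimal sketch of that idea, assuming sizes are counted in characters (rag-audit may count tokens or something else). `fixed_size_chunks` is a hypothetical helper, not the library's implementation.

```python
def fixed_size_chunks(text, chunk_size=500, overlap=50):
    """Split text into windows of chunk_size characters that overlap by `overlap`."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window reaches the end, to avoid a trailing chunk
        # that only repeats the overlap region.
        if start + chunk_size >= len(text):
            break
    return chunks


# 1000 characters of repeating alphabet as a stand-in document.
text = "".join(chr(97 + i % 26) for i in range(1000))
chunks = fixed_size_chunks(text, chunk_size=500, overlap=50)
print([len(c) for c in chunks])  # [500, 500, 100]
```

The last 50 characters of each chunk reappear at the start of the next one, which is what keeps sentences that straddle a boundary visible in both chunks.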

Use a vectorstore adapter

from rag_audit.adapters import ChromaDBAdapter

adapter = ChromaDBAdapter("my-collection")

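# `...` stands for the remaining vector dimensions; pass complete embeddings in real code.
adapter.add(
    ids=["doc1", "doc2"],
    texts=["Paris is in France.", "Berlin is in Germany."],
    embeddings=[[0.1, 0.2, ...], [0.3, 0.4, ...]],
)

# Ask for the single closest match (k=1) to the query embedding.
results = adapter.query(embedding=[0.1, 0.2, ...], k=1)
print(results)  # ["Paris is in France."]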
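
To make the query step concrete, here is a small in-memory sketch of a nearest-neighbour lookup: score every stored embedding against the query and return the texts of the `k` best matches. It is an illustration only. `cosine_similarity` and `top_k` are hypothetical helpers, the two-dimensional vectors are toy values, and ranking by cosine similarity is an assumption, since the metric ChromaDB actually uses depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, texts, embeddings, k=1):
    """Return the k texts whose embeddings are most similar to the query."""
    scored = sorted(
        zip(texts, embeddings),
        key=lambda pair: cosine_similarity(query, pair[1]),
        reverse=True,
    )
    return [text for text, _ in scored[:k]]


texts = ["Paris is in France.", "Berlin is in Germany."]
toy_embeddings = [[1.0, 0.0], [0.0, 1.0]]  # toy vectors, not real model output

print(top_k([0.9, 0.1], texts, toy_embeddings, k=1))  # ['Paris is in France.']
```

A real vector store does not scan every vector this way; it uses an index to approximate the same ranking at scale.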