Design Philosophy¶

LLM Expect was built with a specific worldview on how LLM evaluation should work. We believe that evaluation should be local, deterministic, and code-centric.

Core Principles¶

1. Local-First & Privacy-Centric¶

We do not send your data to our servers. * No Login Required: You don't need an account to use LLM Expect. * No Cloud Dashboard: Your results live on your machine (or your CI runner). * Your Keys, Your Control: You manage your own API keys. We never see them.

2. Zero-Config by Default¶

You shouldn't need a 50-line YAML file to run a test. * Sensible Defaults: We assume you want to test accuracy and safety unless you say otherwise. * Convention over Configuration: If you name your dataset tests.jsonl, we'll find it.

3. Code-Based Testing (Not UI-Based)¶

Evaluation belongs in your codebase, version-controlled alongside your application logic. * Git-Friendly: JSONL datasets and Python test files are easy to diff and review. * CI/CD Native: Since it's just a Python script, it runs anywhere Python runs.

4. Minimal Surface Area¶

We focus on doing one thing well: running a function against a dataset and checking the output. * We are not an agent framework. * We are not a prompt management tool. * We are not a vector database.

Comparison to Other Tools¶

Feature	LLM Expect	DeepEval / Ragas	LangSmith / Arize
Primary Interface	Python Decorator	Python SDK	Web Dashboard
Data Storage	Local JSONL	Local / Cloud	Cloud
Focus	Integration Testing	RAG Metrics	Observability
Complexity	Low	High	High
Cost	Free (Open Source)	Free / Paid	Paid

Why "Expect"?¶

The name comes from the testing assertion pattern (e.g., expect(result).toBe(value)). We want LLM testing to feel as rigorous and standard as unit testing.