Roadmap
This roadmap outlines the future direction of LLM Expect. We prioritize features that align with our philosophy of simplicity and developer experience.
🚀 Upcoming Features
1. Dataset Builder UI
A simple, local web interface to visually create and edit tests.jsonl files.
* Status: In Progress
* Goal: Reduce friction for creating complex JSONL structures.
2. Agent Step Evaluation
Better support for testing intermediate steps in agent chains.
* Status: Planned
* Goal: Allow decorating individual tools or reasoning steps.
3. Embedding-Based Scoring
New metric using cosine similarity of embeddings for semantic matching.
* Status: Planned
* Goal: Cheaper and faster than LLM-as-a-judge for semantic similarity.
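As a rough illustration of the idea, the sketch below scores two embedding vectors by cosine similarity and compares the result to a threshold. It is not the planned LLM Expect API: the function names `cosine_similarity` and `embedding_score`, the `threshold` default of 0.8, and the toy vectors are all assumptions made for this example. In practice the vectors would come from whichever embedding model you use.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as "no similarity".
        return 0.0
    return dot / (norm_a * norm_b)


def embedding_score(expected_vec, actual_vec, threshold=0.8):
    """Hypothetical scorer: returns (passed, similarity).

    The threshold value is an assumption for illustration only.
    """
    similarity = cosine_similarity(expected_vec, actual_vec)
    return similarity >= threshold, similarity


# Toy vectors standing in for real embeddings.
expected = [0.2, 0.7, 0.1]
close_match = [0.25, 0.65, 0.05]
unrelated = [0.9, -0.1, 0.0]

print(embedding_score(expected, close_match))
print(embedding_score(expected, unrelated))
```

Because the comparison is plain vector arithmetic, the only model call needed per test case is the embedding itself, which is where the expected cost and latency advantage over an LLM judge would come from.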
4. Optional CLI Runner
An llm-expect run command to execute tests without writing a Python script.
* Status: Under Consideration
* Goal: Simplify CI/CD integration.
🔮 Long-Term Vision
- IDE Integration: VS Code extension for running tests directly from the editor.
- Custom Reporters: Plug-and-play reporters for Slack, Discord, or custom webhooks.
- Parallel Cloud Execution: Optional ability to offload execution to a cloud runner for massive datasets.
❌ Out of Scope
- Prompt Management: We will not build a prompt registry.
- Observability Dashboard: We will not build a hosted dashboard.