Roadmap

This roadmap outlines the future direction of LLM Expect. We prioritize features that align with our philosophy of simplicity and developer experience.

🚀 Upcoming Features

1. Dataset Builder UI

A simple, local web interface to visually create and edit tests.jsonl files.

  • Status: In Progress
  • Goal: Reduce friction for creating complex JSONL structures.
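For context on the file format such a UI would edit, here is a minimal sketch of writing and reading a JSONL test file by hand: one JSON object per line. The field names `input` and `expected` and the helper names are assumptions made for illustration; they are not a confirmed LLM Expect schema or API.

```python
import json


def write_cases(path, cases):
    """Write each test case as one JSON object per line (JSONL)."""
    with open(path, "w", encoding="utf-8") as f:
        for case in cases:
            f.write(json.dumps(case, ensure_ascii=False) + "\n")


def read_cases(path):
    """Read a JSONL file, skipping blank lines and reporting bad lines by number."""
    cases = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                cases.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc.msg})") from exc
    return cases


# Hypothetical test cases; the keys are illustrative only.
example_cases = [
    {"input": "What is 2 + 2?", "expected": "4"},
    {"input": "Capital of France?", "expected": "Paris"},
]
```

Hand-editing files like this is where the friction comes from: a single misplaced quote or a nested object split across lines invalidates that line, which is the kind of mistake a visual editor could prevent.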

2. Agent Step Evaluation

Better support for testing intermediate steps in agent chains.

  • Status: Planned
  • Goal: Allow decorating individual tools or reasoning steps.

3. Embedding-Based Scoring

New metric using cosine similarity of embeddings for semantic matching.

  • Status: Planned
  • Goal: Cheaper and faster than LLM-as-a-judge for semantic similarity.
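To illustrate the idea behind this metric, the sketch below computes cosine similarity between two embedding vectors and compares it to a threshold. It is not the planned implementation: the function names and the 0.8 default threshold are assumptions, and the vectors would in practice come from whichever embedding model you use.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as no similarity.
        return 0.0
    return dot / (norm_a * norm_b)


def embedding_score(expected_vec, actual_vec, threshold=0.8):
    """Hypothetical scorer: returns (similarity, passed) for two embeddings."""
    similarity = cosine_similarity(expected_vec, actual_vec)
    return similarity, similarity >= threshold
```

Because the score is a few arithmetic operations on precomputed vectors, it needs no extra generation call per test case, which is where the cost and speed advantage over LLM-as-a-judge would come from. A sensible threshold depends on the embedding model and should be tuned on real examples.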

4. Optional CLI Runner

An llm-expect run command to execute tests without writing a Python script.

  • Status: Under Consideration
  • Goal: Simplify CI/CD integration.

🔮 Long-Term Vision

  • IDE Integration: VS Code extension for running tests directly from the editor.
  • Custom Reporters: Plug-and-play reporters for Slack, Discord, or custom webhooks.
  • Parallel Cloud Execution: Optional ability to offload execution to a cloud runner for massive datasets.

❌ Out of Scope

  • Prompt Management: We will not build a prompt registry.
  • Observability Dashboard: We will not build a hosted dashboard.