
Advent of Haystack

Explore Haystack with Weaviate, AssemblyAI, NVIDIA, Arize AI, and MongoDB through 10 challenges! 🎉

💙 Thank you for your interest in the Advent of Haystack 2024.

While submissions are now closed, solutions are available on each challenge page throughout January.

Sign up for the Haystack newsletter to stay updated on upcoming events. See you next year! 👋

Day 10: Jingle Metrics All the Way 🔔

Haystack Elves

The Haystack Elves worked tirelessly this year to make the holiday season stress-free and joyful. Determined to innovate, they tackled each challenge with cutting-edge AI solutions.

They enhanced pipelines with speech-to-text models, explored various LLM providers, and customized Haystack pipelines for unique needs. They built AI Agents with tool-calling and self-reflection, added tracing mechanisms, and developed faster with deepset Studio. To ensure a top-notch tech stack, they integrated tools like Weaviate, AssemblyAI, NVIDIA NIMs, Arize Phoenix, and MongoDB.

However, there’s one crucial step remaining before taking anything into production: 📊 Evaluation 📊

Haystack equips the elves with the tools they need, including integrations with evaluation frameworks and built-in evaluators. On top of these, the Haystack ecosystem now features a powerful new tool: EvaluationHarness. It streamlines the evaluation of Haystack pipelines by removing the need to build a separate evaluation pipeline, and it makes comparing configurations easier through overrides.

For this challenge, you need to help the Haystack elves evaluate a simple RAG pipeline using RAGEvaluationHarness, a specialized extension of EvaluationHarness designed to simplify and optimize evaluation specifically for RAG pipelines.

🎯 Requirements:

๐Ÿ’ Some Hints:

โญ Bonus Task: Take it a step further by incorporating hybrid retrieval into your pipeline. Use EvaluationHarness with customizations to test whether hybrid retrieval improves Recall and MRR ๐Ÿ‘€

🩵 Here is the Starter Colab