Maintained by deepset

Integration: Optimum

High-performance inference using Hugging Face Optimum

Authors
deepset

Overview

Hugging Face Optimum is an extension of the Transformers library that provides performance optimization tools for training and running models efficiently on targeted hardware. With Optimum, you can use ONNX Runtime to automatically export models from the Hugging Face Model Hub and run them in Haystack pipelines for significantly better inference performance.

Installation

pip install optimum-haystack

Usage

Components

This integration introduces two components: OptimumTextEmbedder and OptimumDocumentEmbedder.

To create semantic embeddings for documents, use OptimumDocumentEmbedder in your indexing pipeline. To generate embeddings for queries, use OptimumTextEmbedder.

Below is an example indexing pipeline that uses InMemoryDocumentStore, OptimumDocumentEmbedder, and DocumentWriter:

from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.writers import DocumentWriter
from haystack_integrations.components.embedders.optimum import (
    OptimumDocumentEmbedder,
    OptimumEmbedderPooling,
)


# Store documents in memory and compare embeddings with cosine similarity.
document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")

documents = [
    Document(content="I enjoy programming in Python"),
    Document(content="My city does not get snow in winter"),
    Document(content="Japanese diet is well known for being good for your health"),
    Document(content="Thomas is injured and can't play sports"),
]

indexing_pipeline = Pipeline()
# Embed each document with mean pooling and scale the vectors to unit length.
indexing_pipeline.add_component("embedder", OptimumDocumentEmbedder(
    model="intfloat/e5-base-v2",
    normalize_embeddings=True,
    pooling_mode=OptimumEmbedderPooling.MEAN,
))
# Write the embedded documents to the document store.
indexing_pipeline.add_component("writer", DocumentWriter(document_store=document_store))
indexing_pipeline.connect("embedder", "writer")

# Embed the documents and store them.
indexing_pipeline.run({"embedder": {"documents": documents}})
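
The pipeline above sets pooling_mode=OptimumEmbedderPooling.MEAN and normalize_embeddings=True, and the document store compares embeddings with cosine similarity. The following is a minimal, standard-library sketch of what those settings mean conceptually. It is not the integration's implementation (which operates on model output tensors); the helper names mean_pool, l2_normalize, and dot, as well as the toy vectors, are assumptions made for illustration only.

```python
import math
from typing import List, Sequence


def mean_pool(
    token_vectors: Sequence[Sequence[float]],
    attention_mask: Sequence[int],
) -> List[float]:
    """Average the token vectors whose mask value is 1, skipping padding."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; for unit-length vectors this equals cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical toy data: three token vectors, the last one is padding.
token_vectors = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
attention_mask = [1, 1, 0]

pooled = mean_pool(token_vectors, attention_mask)  # average of the first two
embedding = l2_normalize(pooled)                   # unit-length embedding

other = l2_normalize([3.0, 2.0])
similarity = dot(embedding, other)                 # cosine similarity
```

Because both vectors are scaled to unit length, their dot product and their cosine similarity are the same number, which is why normalized embeddings pair naturally with a cosine similarity function in the document store.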

License

optimum-haystack is distributed under the terms of the Apache-2.0 license.