Using the Reranker API

This document explains how to use the ClovisReranker model.

POST /v1/rerank

Request format

Request body
{
  "model": "ClovisReranker",
  "query": "What is the capital of France?",
  "documents": [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Madrid is the capital of Spain."
  ],
  "top_n": 3
}
Response
{
  "id": "rerank-XXXXXXXXXXXXX",
  "results": [
    {
      "index": 0,
      "score": 0.98,
      "document": {
        "text": "Paris is the capital of France."
      }
    },
    {
      "index": 2,
      "score": 0.21,
      "document": {
        "text": "Madrid is the capital of Spain."
      }
    },
    {
      "index": 1,
      "score": 0.04,
      "document": {
        "text": "Berlin is the capital of Germany."
      }
    }
  ],
  "meta": {
    "billed_units": {
      "total_tokens": 54
    },
    "tokens": {
      "input_tokens": 54
    }
  }
}

Interpreting the results

Index  Document                          Score
0      Paris is the capital of France    0.98
2      Madrid is the capital of Spain    0.21
1      Berlin is the capital of Germany  0.04

Each index refers to the document's position in the documents array of the request. The documents must then be sorted by descending score to obtain the final ranking.
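
As a minimal sketch of that sorting step, the snippet below orders a list of result objects by descending score. It assumes each result carries the "index" and "score" fields shown in the sample response; sort_results is a hypothetical helper name, not part of the API.

```python
def sort_results(results):
    """Return a new list ordered from most to least relevant."""
    # Assumption: each result exposes its relevance under the "score" key.
    return sorted(results, key=lambda r: r["score"], reverse=True)


# Results in arbitrary order, using the scores from the sample response.
results = [
    {"index": 1, "score": 0.04},
    {"index": 0, "score": 0.98},
    {"index": 2, "score": 0.21},
]

ranked = sort_results(results)
print([r["index"] for r in ranked])  # [0, 2, 1]
```

Because sorted returns a new list, the original response data stays untouched and each "index" can still be used to look up the document in the request.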

Example

#!/usr/bin/env python3
"""
Minimal script demonstrating how to use the reranking model.
"""

import httpx

# Configuration
LITELLM_BASE_URL = "https://llm-gateway.clovis-ai.fr"
RERANK_MODEL = "ClovisReranker"
API_KEY = "YOUR-API-KEY"

# Example query
query = "What is machine learning?"

# Example documents to rerank
documents = [
    "Machine learning is a subset of artificial intelligence that enables systems to learn from data.",
    "Python is a popular programming language for data science and web development.",
    "Deep learning uses neural networks with multiple layers to process complex patterns.",
    "The weather today is sunny with a temperature of 25 degrees Celsius.",
    "Supervised learning algorithms require labeled training data to make predictions.",
]

# Number of top results to return
top_n = 3

print(f"Query: {query}")
print(f"\nReranking {len(documents)} documents...\n")

# Set up the HTTP client
headers = {"Authorization": f"Bearer {API_KEY}"}
client = httpx.Client(base_url=LITELLM_BASE_URL, timeout=30.0)

# Make the reranking request
response = client.post(
    "/v1/rerank",
    json={
        "model": RERANK_MODEL,
        "query": query,
        "documents": documents,
        "top_n": top_n,
    },
    headers=headers,
)
response.raise_for_status()
data = response.json()
results = data.get("results", [])

# Display results
if results:
    print("Ranked results:")
    for rank, result in enumerate(results, 1):
        # "index" is the document's position in the list sent in the request
        index = result["index"]
        # The sample response names this field "score"; Cohere-style rerank
        # responses use "relevance_score", so accept either.
        score = result["score"] if "score" in result else result["relevance_score"]
        document = documents[index]
        print(f"\n{rank}. Score: {score:.4f}")
        print(f"   Document: {document}")
else:
    print("No results returned.")

Best practices

  • ideally send 10 to 30 documents per request
  • do not exceed 50 documents
  • run a vector retrieval step before reranking
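
The limits above can be enforced on the client side before calling the endpoint. The following is a minimal sketch, assuming the candidates come from a prior vector retrieval step and are already ordered best first; prepare_candidates and MAX_DOCUMENTS are hypothetical names introduced for illustration, not part of the API.

```python
MAX_DOCUMENTS = 50  # upper bound from the recommendations above


def prepare_candidates(retrieved, limit=30):
    """Keep the best-ranked retrieval hits, never more than MAX_DOCUMENTS."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # Assumption: `retrieved` is ordered best first, so slicing keeps the top hits.
    return retrieved[: min(limit, MAX_DOCUMENTS)]


# Stand-in for the output of a vector search, ordered best first.
retrieved = [f"document {i}" for i in range(120)]

candidates = prepare_candidates(retrieved)
print(len(candidates))  # 30
```

The resulting list can be passed as the "documents" field of the rerank request. A default of 30 keeps requests inside the recommended range, while the hard cap prevents a larger limit from exceeding 50.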