In this tutorial, we explore how to build an intelligent, self-correcting question-answering system using the DSPy framework, integrated with Google's Gemini 1.5 Flash model. We begin by defining structured Signatures that clearly specify input-output behavior, which DSPy uses as its foundation for building reliable pipelines. With DSPy's declarative programming approach, we assemble composable modules, such as AdvancedQA and SimpleRAG, to answer questions using both context and retrieval-augmented generation. By combining DSPy's modularity with Gemini's powerful reasoning, we craft an AI system capable of delivering accurate, step-by-step answers. As we progress, we also leverage DSPy's optimization tools, such as BootstrapFewShot, to automatically improve performance based on training examples.
!pip install dspy-ai google-generativeai
import dspy
import google.generativeai as genai
import random
from typing import List, Optional
GOOGLE_API_KEY = "Use Your Own API Key"
genai.configure(api_key=GOOGLE_API_KEY)
dspy.configure(lm=dspy.LM(model="gemini/gemini-1.5-flash", api_key=GOOGLE_API_KEY))
We start by installing the required libraries: DSPy for declarative AI pipelines, and google-generativeai to access Google's Gemini models. After importing the necessary modules, we configure Gemini using our API key. Finally, we set up DSPy to use the Gemini 1.5 Flash model as our language-model backend.
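As an optional sanity check, not part of the original tutorial, we can call the configured language model directly; this sketch assumes a valid API key and that the installed DSPy version exposes the callable dspy.LM interface.

# Optional smoke test (an added sketch, not from the original tutorial):
# dspy.LM instances are callable and return a list of completion strings.
lm = dspy.settings.lm
print(lm("Reply with one word: ready")[0])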
class QuestionAnswering(dspy.Signature):
    """Answer questions based on given context with reasoning."""
    context: str = dspy.InputField(desc="Relevant context information")
    question: str = dspy.InputField(desc="Question to answer")
    reasoning: str = dspy.OutputField(desc="Step-by-step reasoning")
    answer: str = dspy.OutputField(desc="Final answer")
class FactualityCheck(dspy.Signature):
    """Verify if an answer is factually correct given context."""
    context: str = dspy.InputField()
    question: str = dspy.InputField()
    answer: str = dspy.InputField()
    is_correct: bool = dspy.OutputField(desc="True if answer is factually correct")
We define two DSPy Signatures to structure our system's inputs and outputs. First, QuestionAnswering expects a context and a question, and it returns both reasoning and a final answer, allowing the model to explain its thought process. Next, FactualityCheck is designed to verify the truthfulness of an answer by returning a simple boolean, helping us build a self-correcting QA system.
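To see how a Signature becomes executable on its own, the short sketch below wraps QuestionAnswering in dspy.Predict; the sample context and question are invented purely for illustration.

# A minimal usage sketch; the context/question values are illustrative only.
qa = dspy.Predict(QuestionAnswering)
result = qa(
    context="The Nile is generally cited as about 6,650 km long.",
    question="How long is the Nile?",
)
print(result.reasoning)
print(result.answer)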
class AdvancedQA(dspy.Module):
    def __init__(self, max_retries: int = 2):
        super().__init__()
        self.max_retries = max_retries
        self.qa_predictor = dspy.ChainOfThought(QuestionAnswering)
        self.fact_checker = dspy.Predict(FactualityCheck)

    def forward(self, context: str, question: str) -> dspy.Prediction:
        # Initial answer attempt
        prediction = self.qa_predictor(context=context, question=question)

        for attempt in range(self.max_retries):
            # Verify the answer against the context
            fact_check = self.fact_checker(
                context=context,
                question=question,
                answer=prediction.answer
            )
            if fact_check.is_correct:
                break

            # Refine the context with feedback and retry
            refined_context = f"{context}\n\nPrevious incorrect answer: {prediction.answer}\nPlease provide a more accurate answer."
            prediction = self.qa_predictor(context=refined_context, question=question)

        return prediction
We create an AdvancedQA module to add self-correction capability to our QA system. It first uses a Chain-of-Thought predictor to generate an answer with reasoning. Then, it checks the factual accuracy using a fact-checking predictor. If the answer is incorrect, we refine the context and retry, up to a specified number of times, to ensure more reliable outputs.
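A brief, hypothetical usage example of the module (the context and question here are invented for illustration):

# Illustrative only: exercise the self-correction loop on a toy input.
advanced_qa = AdvancedQA(max_retries=2)
pred = advanced_qa(
    context="Mount Everest stands 8,849 meters above sea level.",
    question="How tall is Mount Everest?",
)
print(pred.answer)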
class SimpleRAG(dspy.Module):
    def __init__(self, knowledge_base: List[str]):
        super().__init__()
        self.knowledge_base = knowledge_base
        self.qa_system = AdvancedQA()

    def retrieve(self, question: str, top_k: int = 2) -> str:
        # Simple keyword-based retrieval (in practice, use vector embeddings)
        scored_docs = []
        question_words = set(question.lower().split())

        for doc in self.knowledge_base:
            doc_words = set(doc.lower().split())
            score = len(question_words.intersection(doc_words))
            scored_docs.append((score, doc))

        # Return the top-k most relevant documents
        scored_docs.sort(reverse=True)
        return "\n\n".join([doc for _, doc in scored_docs[:top_k]])

    def forward(self, question: str) -> dspy.Prediction:
        context = self.retrieve(question)
        return self.qa_system(context=context, question=question)
We build a SimpleRAG module to simulate Retrieval-Augmented Generation using DSPy. We provide a knowledge base and implement a basic keyword-based retriever to fetch the most relevant documents for a given question. These documents serve as context for the AdvancedQA module, which then performs reasoning and self-correction to produce an accurate answer.
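To make the retrieval scoring concrete, here is the keyword-overlap idea in isolation; the toy documents and query below are assumptions for illustration, not part of the tutorial's data.

# Toy demonstration of the overlap score used by retrieve() (documents invented).
docs = [
    "The Eiffel Tower is 330 meters tall.",
    "Python was created by Guido van Rossum.",
]
query_words = set("how tall is the eiffel tower".split())
scored = [(len(query_words & set(d.lower().split())), d) for d in docs]
print(max(scored)[1])  # prints the Eiffel Tower document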
knowledge_base = [
    "Use Your Context and Knowledge Base Here"
]
training_examples = [
    dspy.Example(
        question="What is the height of the Eiffel Tower?",
        context="The Eiffel Tower is located in Paris, France. It was constructed from 1887 to 1889 and stands 330 meters tall including antennas.",
        answer="330 meters"
    ).with_inputs("question", "context"),
    dspy.Example(
        question="Who created Python programming language?",
        context="Python is a high-level programming language created by Guido van Rossum. It was first released in 1991 and emphasizes code readability.",
        answer="Guido van Rossum"
    ).with_inputs("question", "context"),
    dspy.Example(
        question="What is machine learning?",
        context="ML focuses on algorithms that can learn from data without being explicitly programmed.",
        answer="Machine learning focuses on algorithms that learn from data without explicit programming."
    ).with_inputs("question", "context")
]
We define a small knowledge base containing diverse facts across various topics, including history, programming, and science. This serves as our context source for retrieval. Alongside, we prepare a set of training examples to guide DSPy's optimization process. Each example includes a question, its relevant context, and the correct answer, helping our system learn how to respond more accurately.
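It can help to see how dspy.Example separates inputs from labels; this small check is an addition for illustration, assuming the standard Example API.

# .with_inputs() marked question and context as inputs; answer stays as the label.
ex = training_examples[0]
print(ex.inputs())  # the fields the model will see
print(ex.answer)    # the gold answer consumed by the metric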
def accuracy_metric(example, prediction, trace=None):
    """Simple accuracy metric for evaluation"""
    return example.answer.lower() in prediction.answer.lower()
print("🚀 Initializing DSPy QA System with Gemini...")
print("📝 Observe: Utilizing Google's Gemini 1.5 Flash (free tier)")
rag_system = SimpleRAG(knowledge_base)
basic_qa = dspy.ChainOfThought(QuestionAnswering)
print("n📊 Earlier than Optimization:")
test_question = "What's the peak of the Eiffel Tower?"
test_context = knowledge_base[0]
initial_prediction = basic_qa(context=test_context, query=test_question)
print(f"Q: {test_question}")
print(f"A: {initial_prediction.reply}")
print(f"Reasoning: {initial_prediction.reasoning}")
print("n🔧 Optimizing with BootstrapFewShot...")
optimizer = dspy.BootstrapFewShot(metric=accuracy_metric, max_bootstrapped_demos=2)
optimized_qa = optimizer.compile(basic_qa, trainset=training_examples)
print("n📈 After Optimization:")
optimized_prediction = optimized_qa(context=test_context, query=test_question)
print(f"Q: {test_question}")
print(f"A: {optimized_prediction.reply}")
print(f"Reasoning: {optimized_prediction.reasoning}")
We begin by defining a simple accuracy metric that checks whether the predicted answer contains the correct response. After initializing our SimpleRAG system and a baseline ChainOfThought QA module, we test it on a sample question before any optimization. Then, using DSPy's BootstrapFewShot optimizer, we fine-tune the QA system with our training examples. This enables the model to automatically generate more effective prompts, leading to improved accuracy, which we verify by comparing responses before and after optimization.
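If we want to reuse the optimized program later, DSPy modules can typically be serialized; the snippet below is a sketch that assumes the installed DSPy version provides save()/load(), and the file path is arbitrary.

# Sketch: persist the bootstrapped demos so compilation need not be repeated.
optimized_qa.save("optimized_qa.json")
reloaded_qa = dspy.ChainOfThought(QuestionAnswering)
reloaded_qa.load("optimized_qa.json")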
def evaluate_system(qa_module, test_cases):
    """Evaluate QA system performance"""
    correct = 0
    total = len(test_cases)

    for example in test_cases:
        prediction = qa_module(context=example.context, question=example.question)
        if accuracy_metric(example, prediction):
            correct += 1

    return correct / total

print(f"\n📊 Evaluation Results:")
print(f"Basic QA Accuracy: {evaluate_system(basic_qa, training_examples):.2%}")
print(f"Optimized QA Accuracy: {evaluate_system(optimized_qa, training_examples):.2%}")
print("n✅ Tutorial Full! Key DSPy Ideas Demonstrated:")
print("1. 🔤 Signatures - Outlined enter/output schemas")
print("2. 🏗️ Modules - Constructed composable QA methods")
print("3. 🔄 Self-correction - Carried out iterative enchancment")
print("4. 🔍 RAG - Created retrieval-augmented era")
print("5. ⚡ Optimization - Used BootstrapFewShot to enhance prompts")
print("6. 📊 Analysis - Measured system efficiency")
print("7. 🆓 Free API - Powered by Google Gemini 1.5 Flash")
We run an Advanced RAG demo by asking several questions across different domains. For each question, the SimpleRAG system retrieves the most relevant context and then uses the self-correcting AdvancedQA module to generate a well-reasoned answer. We print the answers along with a preview of the reasoning, showcasing how DSPy combines retrieval and thoughtful generation to deliver reliable responses, as the sketch below illustrates.
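The demo loop itself is not reproduced above; a minimal sketch of what it might look like, with illustrative questions, is:

# Hypothetical demo loop (questions invented for illustration); rag_system
# is the SimpleRAG instance created earlier.
demo_questions = [
    "What is the height of the Eiffel Tower?",
    "Who created the Python programming language?",
]
for q in demo_questions:
    result = rag_system(question=q)
    print(f"Q: {q}")
    print(f"A: {result.answer}")
    print(f"Reasoning preview: {result.reasoning[:100]}...")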
In conclusion, we have successfully demonstrated the full potential of DSPy for building advanced QA pipelines. We see how DSPy simplifies the design of intelligent modules with clear interfaces, supports self-correction loops, integrates basic retrieval, and enables few-shot prompt optimization with minimal code. With just a few lines, we configure and evaluate our models using real-world examples, measuring performance gains. This hands-on experience shows how DSPy, when combined with Google's Gemini API, empowers us to rapidly prototype, test, and scale sophisticated language applications without boilerplate or complex logic.

Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.