In this tutorial, we demonstrate how to combine Python's data manipulation library Pandas with Google's generative AI capabilities through the google.generativeai package and a Gemini model (the code uses gemini-2.0-flash-lite). By setting up the environment with the necessary libraries, configuring the Google API key, and using the IPython display functionality, the code provides a step-by-step approach to building a data science agent that analyzes a sample sales dataset. The example shows how to convert a DataFrame into markdown format and then use natural language queries to generate insights about the data, highlighting the potential of combining traditional data analysis tools with modern AI-driven methods.
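!pip install pandas google-generativeai tabulate --quiet
First, we quietly install the Pandas, google-generativeai, and tabulate libraries, setting up the environment for data manipulation and AI-powered analysis. Pandas relies on tabulate for DataFrame.to_markdown, which the prompt construction uses later.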
import pandas as pd
import google.generativeai as genai
from IPython.display import Markdown
We import Pandas for data manipulation, google.generativeai for accessing Google's generative AI capabilities, and Markdown from IPython.display to render markdown-formatted outputs.
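The configuration step that follows assigns the API key as a string literal. As a possible alternative, the key can be read from an environment variable so that it never ends up in a shared notebook. The sketch below is only an illustration: `load_api_key` is a hypothetical helper, and `DEMO_API_KEY` is a made-up variable name used so the demonstration does not touch any real key.

```python
import os


def load_api_key(var_name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The default variable name is an assumption for this sketch; use whatever
    name your environment actually defines.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before running the notebook."
        )
    return key


# Demonstration with a dummy value stored under a made-up variable name.
os.environ.setdefault("DEMO_API_KEY", "dummy-key-for-demo")
print(load_api_key("DEMO_API_KEY"))
```

With a real key exported in the environment, the result of `load_api_key()` could be passed to `genai.configure(api_key=...)` in place of the literal string.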
GOOGLE_API_KEY = "Use Your API Key Here"
genai.configure(api_key=GOOGLE_API_KEY)
model = genai.GenerativeModel('gemini-2.0-flash-lite')
We assign a placeholder API key, configure the google.generativeai client with it, and initialize the ‘gemini-2.0-flash-lite’ GenerativeModel for generating content.
data = {'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam', 'Headphones'],
        'Category': ['Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics'],
        'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
        'Units Sold': [150, 200, 180, 120, 90, 250],
        'Price': [1200, 25, 75, 300, 50, 100]}
sales_df = pd.DataFrame(data)
print("Sample Sales Data:")
print(sales_df)
print("-" * 30)
Here, we create a Pandas DataFrame named sales_df containing sample sales data for various products, and then print the DataFrame followed by a separator line to visually distinguish the output.
def ask_gemini_about_data(dataframe, query):
    """
    Asks the Gemini model a question about the given Pandas DataFrame.
    Args:
        dataframe: The Pandas DataFrame to analyze.
        query: The natural language question about the DataFrame.
    Returns:
        The response from the Gemini model as a string.
    """
    # Render the DataFrame as a markdown table and embed it in the prompt.
    prompt = f"""You are a data analysis agent. Analyze the following pandas DataFrame and answer the question.
    DataFrame:
    ```
    {dataframe.to_markdown(index=False)}
    ```
    Question: {query}
    Answer:
    """
    # Send the prompt to the model configured above and return its text.
    response = model.generate_content(prompt)
    return response.text
Here, we construct a markdown-formatted prompt from a Pandas DataFrame and a natural language query, then use the Gemini model to generate and return an analytical response.
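One way to make this step easier to test is to separate prompt construction from the model call. The sketch below is a hedged illustration rather than part of the original notebook: `build_prompt`, `ask_model`, and `EchoModel` are hypothetical names, and the `ValueError` handling assumes that the `.text` accessor raises that exception when a response contains no text. The stand-in model lets the flow run offline without an API key.

```python
from types import SimpleNamespace
from typing import Any

PROMPT_TEMPLATE = (
    "You are a data analysis agent. Analyze the following table "
    "and answer the question.\n\n"
    "Table:\n{table}\n\n"
    "Question: {question}\nAnswer:\n"
)


def build_prompt(table_text: str, question: str) -> str:
    """Combine a text rendering of the data with the user's question."""
    return PROMPT_TEMPLATE.format(table=table_text, question=question)


def ask_model(model: Any, prompt: str) -> str:
    """Call model.generate_content and return its text, or a short error note."""
    try:
        response = model.generate_content(prompt)
        return response.text
    except ValueError as exc:
        # Assumption: reading .text fails with ValueError when no text came back.
        return f"[no text returned: {exc}]"


class EchoModel:
    """Hypothetical offline stand-in exposing the same generate_content method."""

    def generate_content(self, prompt: str) -> SimpleNamespace:
        return SimpleNamespace(text=f"received {len(prompt)} characters")


# A small hand-written markdown table stands in for DataFrame.to_markdown output.
table = "| Product | Units Sold |\n|---|---|\n| Laptop | 150 |"
prompt = build_prompt(table, "Which product sold the most?")
print(ask_model(EchoModel(), prompt))
```

In the notebook, the table text would come from `dataframe.to_markdown(index=False)` and the model object would be the configured `GenerativeModel` instead of the stand-in.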
# Query 1: What is the total number of units sold across all products?
query1 = "What is the total number of units sold across all products?"
response1 = ask_gemini_about_data(sales_df, query1)
print(f"Question 1: {query1}")
print(f"Answer 1:\n{response1}")
print("-" * 30)
# Query 2: Which product had the highest number of units sold?
query2 = "Which product had the highest number of units sold?"
response2 = ask_gemini_about_data(sales_df, query2)
print(f"Question 2: {query2}")
print(f"Answer 2:\n{response2}")
print("-" * 30)
# Query 3: What is the average price of the products?
query3 = "What is the average price of the products?"
response3 = ask_gemini_about_data(sales_df, query3)
print(f"Question 3: {query3}")
print(f"Answer 3:\n{response3}")
print("-" * 30)
# Query 4: Show me the products sold in the 'North' region.
query4 = "Show me the products sold in the 'North' region."
response4 = ask_gemini_about_data(sales_df, query4)
print(f"Question 4: {query4}")
print(f"Answer 4:\n{response4}")
print("-" * 30)
# Query 5 (more complex): Calculate the total revenue for each product.
query5 = "Calculate the total revenue (Units Sold * Price) for each product and present it in a table."
response5 = ask_gemini_about_data(sales_df, query5)
print(f"Question 5: {query5}")
print(f"Answer 5:\n{response5}")
print("-" * 30)
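Because a language model can miscalculate, it helps to check its answers against values computed directly with Pandas. The sketch below rebuilds the same sample data and computes the expected answer to each of the five questions. The variable names (`total_units`, `top_product`, and so on) are illustrative and not part of the original notebook.

```python
import pandas as pd

# Same sample data as in the tutorial.
data = {
    "Product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam", "Headphones"],
    "Category": ["Electronics"] * 6,
    "Region": ["North", "South", "East", "West", "North", "South"],
    "Units Sold": [150, 200, 180, 120, 90, 250],
    "Price": [1200, 25, 75, 300, 50, 100],
}
sales_df = pd.DataFrame(data)

# Question 1: total units sold across all products (990).
total_units = int(sales_df["Units Sold"].sum())

# Question 2: product with the highest number of units sold (Headphones).
top_product = sales_df.loc[sales_df["Units Sold"].idxmax(), "Product"]

# Question 3: average price of the products (1750 / 6, about 291.67).
average_price = float(sales_df["Price"].mean())

# Question 4: products sold in the 'North' region (Laptop and Webcam).
north_products = sales_df.loc[sales_df["Region"] == "North", "Product"].tolist()

# Question 5: revenue per product, computed as Units Sold * Price.
revenue = sales_df.assign(Revenue=sales_df["Units Sold"] * sales_df["Price"])[
    ["Product", "Revenue"]
]

print(f"Total units sold: {total_units}")
print(f"Top product by units: {top_product}")
print(f"Average price: {average_price:.2f}")
print(f"North region products: {north_products}")
print(revenue.to_string(index=False))
```

Comparing these values with the model's responses is a quick way to spot arithmetic slips, especially on the revenue table.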
In conclusion, the tutorial illustrates how combining Pandas, the google.generativeai package, and a Gemini model can turn data analysis tasks into a more interactive process. The approach simplifies querying and interpreting data and opens up avenues for more advanced use cases such as data cleaning, feature engineering, and exploratory data analysis. By using these tools within the familiar Python ecosystem, data scientists can improve their productivity and derive meaningful insights from complex datasets more easily.

Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.