Multi-Modal Agentic System For Stock Insights Using DeepSeek-R1 & Crew AI

Nibedita Dutta Last Updated : 19 Feb, 2025

Multimodal agentic systems represent a revolutionary advancement in the field of artificial intelligence, seamlessly combining diverse data types—such as text, images, audio, and video—into a unified system that significantly enhances the capabilities of intelligent technologies. These systems rely on autonomous intelligent agents that can independently process, analyze, and synthesize information from various sources, facilitating a deeper and more nuanced understanding of complex situations.

By merging multimodal inputs with agentic functionality, these systems can dynamically adapt in real time to changing environments and user interactions, offering a more responsive and intelligent experience. This fusion not only boosts operational efficiency across a range of industries but also elevates human-computer interactions, making them more fluid, intuitive, and contextually aware. As a result, multimodal agentic frameworks are set to reshape the way we interact with and utilize technology, driving innovation in countless applications across sectors.

Learning Objectives

Benefits of agentic AI systems with advanced image analysis
How Crew AI’s Vision Tool enhances agentic AI capabilities
Overview of DeepSeek-R1-Distill-Qwen-7B model and its features
Hands-on Python tutorial integrating Vision Tool with DeepSeek R1
Building a multi-modal, multi-agentic system for stock analysis
Analyzing and comparing stock behaviours using stock charts

This article was published as a part of the Data Science Blogathon.

Agentic AI systems with Image Analysis Capabilities
Building a Multi-Modal Agentic System to Explain Stock Behavior From Stock Charts
Hands-On Python Implementation using Ollama on Google Colab
Another Example of a Multi-Modal Agentic System For Stock Insights
Conclusions
Frequently Asked Questions

Agentic AI systems with Image Analysis Capabilities

Agentic AI systems, fortified with sophisticated image analysis capabilities, are transforming industries by enabling a suite of indispensable functions.

Instantaneous Visual Data Processing: These advanced systems possess the capacity to analyze immense quantities of visual information in real time, dramatically improving operational efficiency across diverse sectors, including healthcare, manufacturing, and retail. This rapid processing facilitates quick decision-making and immediate responses to dynamic conditions.
Superior Precision in Image Recognition: With reported recognition accuracy rates surpassing 95% in some applications, agentic AI can substantially reduce false positives in image recognition tasks. This level of precision leads to more dependable outcomes, which is crucial for applications where accuracy is paramount.
Autonomous Task Execution: By seamlessly incorporating image analysis into their operational frameworks, these intelligent systems can autonomously execute intricate tasks, such as providing medical diagnoses or conducting surveillance operations, all without the need for direct human oversight. This automation not only streamlines workflows but also minimizes the potential for human error, paving the way for increased productivity and reliability.

Crew AI Vision Tool

CrewAI is a cutting-edge, open-source framework designed to orchestrate autonomous AI agents into cohesive teams, enabling them to tackle complex tasks collaboratively. Within CrewAI, each agent is assigned specific roles, equipped with designated tools, and driven by well-defined goals, mirroring the structure of a real-world work crew.

The Vision Tool expands CrewAI’s capabilities, allowing agents to process and understand image-based text data, thus integrating visual information into their decision-making processes. Agents can leverage the Vision Tool to extract text from images by simply providing a URL or a file path, enhancing their ability to gather information from diverse sources. After the text is extracted, agents can then utilize this information to generate comprehensive responses or detailed reports, further automating workflows and enhancing overall efficiency. To effectively use the Vision Tool, it’s necessary to set the OpenAI API key within the environment variables, ensuring seamless integration with language models.
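
To make the two input forms (URL or file path) concrete, here is a minimal sketch of how an image reference might be normalised before it is handed to a vision-capable model. It is not CrewAI's actual implementation: the helper name to_image_reference is hypothetical, and the sketch assumes that remote images are passed through as URLs while local files are embedded as base64 data URIs.

```python
import base64
import mimetypes
from pathlib import Path
from urllib.parse import urlparse


def to_image_reference(path_or_url: str) -> str:
    """Return a URL unchanged, or encode a local image file as a base64 data URI.

    Hypothetical helper for illustration; not part of CrewAI.
    """
    scheme = urlparse(path_or_url).scheme
    if scheme in ("http", "https"):
        return path_or_url  # remote image: pass the URL through as-is

    path = Path(path_or_url)
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognised image file: {path_or_url}")

    # Local image: read the bytes and embed them in a data URI
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Either form can then be placed in the image field of a request to a vision model, which is why the agent only needs to be told where the chart lives.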

Building a Multi-Modal Agentic System to Explain Stock Behavior From Stock Charts

We will construct a multi-modal agentic system that first uses CrewAI’s Vision Tool to interpret and analyze the stock charts (presented as images) of two companies. The system then uses the DeepSeek-R1-Distill-Qwen-7B model to explain each stock’s behaviour, offering reasoned insights into the two companies’ performance and comparing them. Combining visual data analysis with a reasoning language model in this way supports a fuller understanding and comparison of market trends.

DeepSeek-R1-Distill-Qwen-7B

To adapt DeepSeek R1’s advanced reasoning abilities for use in more compact language models, the creators compiled a dataset of 800,000 examples generated by DeepSeek R1 itself. These examples were then used to fine-tune existing models such as Qwen and Llama. The results demonstrated that this relatively simple knowledge distillation method effectively transferred R1’s sophisticated reasoning capabilities to these other models.

The DeepSeek-R1-Distill-Qwen-7B model is one of these distilled models. It is a 7-billion-parameter Qwen model fine-tuned on reasoning data generated by DeepSeek-R1, designed to offer enhanced efficiency while maintaining robust performance. Here are some key features:

The model excels in mathematical tasks, achieving an impressive score of 92.8% on the MATH-500 benchmark, demonstrating its capability to handle complex mathematical reasoning effectively.

In addition to its mathematical prowess, the DeepSeek-R1-Distill-Qwen-7B performs reasonably well on factual question-answering tasks, scoring 49.1% on GPQA Diamond, indicating a good balance between mathematical and factual reasoning abilities.

We will use this model to explain the reasons behind the companies’ stock behaviour once the information has been extracted from the stock chart images.
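
Models in the DeepSeek-R1 family typically emit their chain of reasoning between <think> and </think> tags before the final answer. Whether those tags survive into the crew's final output depends on the library versions in use, so the following is only a sketch of how the reasoning could be separated from the answer if it does appear. The function name split_reasoning is hypothetical.

```python
import re

# Matches the first <think>...</think> block, including newlines inside it
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw_output: str) -> tuple[str, str]:
    """Split a raw completion into (reasoning, answer).

    If no <think> block is present, the reasoning is an empty string.
    """
    match = THINK_BLOCK.search(raw_output)
    if match is None:
        return "", raw_output.strip()

    reasoning = match.group(1).strip()
    # The answer is everything outside the <think> block
    answer = (raw_output[:match.start()] + raw_output[match.end():]).strip()
    return reasoning, answer
```

Keeping the reasoning separate makes it possible to log it for inspection while showing only the answer to the reader.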

Performance Benchmarks of DeepSeek R1 distilled models: Source

Hands-On Python Implementation using Ollama on Google Colab

We will use Ollama to pull the LLM and a T4 GPU on Google Colab to build this multi-modal agentic system.

Step 1. Install Necessary Libraries

!pip install crewai crewai_tools
!sudo apt update
!sudo apt install -y pciutils
!pip install langchain-ollama
!curl -fsSL https://ollama.com/install.sh | sh
!pip install ollama==0.4.2

Step 2. Enablement of Threading to Setup Ollama Server

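import threading
import subprocess
import time

def run_ollama_serve():
    # Launch the Ollama server as a background process
    subprocess.Popen(["ollama", "serve"])

# Start the server from a separate thread so the notebook cell does not block
thread = threading.Thread(target=run_ollama_serve)
thread.start()
# Give the server a few seconds to come up before pulling models
time.sleep(5)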
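
A fixed five-second pause works in most Colab sessions, but the server can occasionally take longer to start. As an optional alternative, the sketch below polls the server until it responds. It assumes Ollama is listening on its default address, http://127.0.0.1:11434; the function name wait_for_server is hypothetical, and any configured proxy is bypassed because the server is local.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str = "http://127.0.0.1:11434",
                    timeout: float = 30.0,
                    interval: float = 0.5) -> bool:
    """Poll `url` until it answers with any HTTP response or `timeout` seconds pass."""
    # Ignore proxy settings: the server runs on the same machine
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=2):
                return True
        except urllib.error.HTTPError:
            return True  # the server is up, even if it returned an error status
        except (urllib.error.URLError, OSError):
            time.sleep(interval)  # not reachable yet, try again shortly
    return False
```

Calling wait_for_server() after thread.start() returns True as soon as the server is reachable and False if it never comes up within the timeout, which gives a clearer failure than a later connection error.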

Step 3. Pulling Ollama Models

!ollama pull deepseek-r1

Step 4. Defining OpenAI API Key and LLM model

import os
from crewai import Agent, Task, Crew, LLM
from crewai_tools import VisionTool

# The Vision Tool calls an OpenAI vision model, so an OpenAI API key is required
os.environ["OPENAI_API_KEY"] = ""  # add your OpenAI API key here
os.environ["OPENAI_MODEL_NAME"] = "gpt-4o-mini"

# Create the tool after the key has been set
vision_tool = VisionTool()

# DeepSeek-R1 model served locally through Ollama
llm = LLM(
    model="ollama/deepseek-r1",
)

Step 5. Defining the Agents, Tasks in the Crew

def create_crew(image_url, image_url1):

    # Agent for extracting information from the stock charts.
    # No llm is passed here, so it falls back to the OpenAI model named in OPENAI_MODEL_NAME.
    stockchartexpert = Agent(
        role="STOCK CHART EXPERT",
        goal=f"Your goal is to EXTRACT INFORMATION FROM THE TWO GIVEN {image_url} & {image_url1} stock charts correctly",
        backstory="""You are a STOCK CHART expert""",
        verbose=True,
        tools=[vision_tool],
        allow_delegation=False,
    )

    # Agent for researching why the stocks behaved in a specific way (runs on DeepSeek-R1)
    stockmarketexpert = Agent(
        role="STOCK BEHAVIOUR EXPERT",
        goal="""BASED ON THE PREVIOUSLY EXTRACTED INFORMATION, RESEARCH ABOUT THE RECENT UPDATES OF THE TWO COMPANIES and EXPLAIN AND COMPARE IN SPECIFIC POINTS WHY THE STOCK BEHAVED THIS WAY.""",
        backstory="""You are a STOCK BEHAVIOUR EXPERT""",
        verbose=True,
        allow_delegation=False,
        llm=llm,
    )

    # Task for extracting information from the stock charts
    task1 = Task(
        description=f"Your goal is to EXTRACT INFORMATION FROM THE GIVEN {image_url} & {image_url1} stock charts correctly",
        expected_output="information in text format",
        agent=stockchartexpert,
    )

    # Task for explaining, with reasons, why the stocks behaved in a specific way
    task2 = Task(
        description="""BASED ON THE PREVIOUSLY EXTRACTED INFORMATION, RESEARCH ABOUT THE RECENT UPDATES OF THE TWO COMPANIES and EXPLAIN AND COMPARE IN SPECIFIC POINTS WHY THE STOCK BEHAVED THIS WAY.""",
        expected_output="Reasons behind stock behavior in BULLET POINTS",
        agent=stockmarketexpert,
    )

    # Define the crew from the agents and tasks above
    crew = Crew(
        agents=[stockchartexpert, stockmarketexpert],
        tasks=[task1, task2],
        verbose=True,  # print detailed execution logs
    )

    result = crew.kickoff()
    return result

Step 6. Running the Crew

The two stock charts below were given as input to the crew.

from pprint import pprint

text = create_crew("https://www.eqimg.com/images/2024/11182024-chart6-equitymaster.gif","https://www.eqimg.com/images/2024/03262024-chart4-equitymaster.gif")
pprint(text)

Final Output

Mamaearth's stock exhibited volatility during the year due to internal
 challenges that led to significant price changes. These included unexpected
 product launches and market controversies which caused both peaks and
 troughs in the share price, resulting in an overall fluctuating trend.

On the other hand, Zomato demonstrated a generally upward trend in its share
 price over the same period. This upward movement can be attributed to
 expanding business operations, particularly with successful forays into
 cities like Bengaluru and Pune, enhancing their market presence. However,
 near the end of 2024, external factors such as a major scandal or regulatory
 issues might have contributed to a temporary decline in share price despite
 the overall positive trend.

In summary, Mamaearth's stock volatility stems from internal inconsistencies
 and external controversies, while Zomato's upward trajectory is driven by
 successful market expansion with minor setbacks due to external events.

As seen from the final output, the agentic system has produced a reasonable analysis and comparison of the share price behaviour in the two stock charts, offering explanations such as expanding business operations and a foray into new cities for the upward trend in Zomato’s share price.

Another Example of a Multi-Modal Agentic System For Stock Insights

Let’s check and compare the share price behaviour from the stock charts of two more companies, Jubilant FoodWorks and Bikaji Foods International Ltd., for the year 2024.

text = create_crew("https://s3.tradingview.com/p/PuKVGTNm_mid.png","https://images.cnbctv18.com/uploads/2024/12/bikaji-dec12-2024-12-b639f48761fab044197b144a2f9be099.jpg?im=Resize,width=360,aspect=fit,type=normal")
print(text)

Final Output

The stock behavior of Jubilant Foodworks and Bikaji can be compared based on
 their recent updates and patterns observed in their stock charts.

Jubilant Foodworks:

Cup & Handle Pattern: This pattern is typically bullish, indicating that the
 buyers have taken control after a price decline. It suggests potential
 upside as the candlestick formation may signal a reversal or strengthening
 buy interest.

Breakout Point: The horizontal dashed line marking the breakout point implies
 that the stock has reached a resistance level and may now test higher
 prices. This is a positive sign for bulls, as it shows strength in the
 upward movement.

Trend Line Trend: The uptrend indicated by the trend line suggests ongoing
 bullish sentiment. The price consistently moves upwards along this line,
 reinforcing the idea of sustained growth.

Volume Correlation: Volume bars at the bottom showing correlation with price
 movements indicate that trading volume is increasing alongside upward price
 action. This is favorable for buyers as it shows more support and stronger
 interest in buying.

Bikaji:

Recent Price Change: The stock has shown a +4.80% change, indicating positive
 momentum in the short term.

Year-to-Date Performance: Over the past year, the stock has increased by
 61.42%, which is significant and suggests strong growth potential. This
 performance could be attributed to various factors such as market
 conditions, company fundamentals, or strategic initiatives.

Time Frame: The time axis spans from January to December 2024, providing a
 clear view of the stock's performance over the next year.

Comparison:

Both companies' stocks are showing upward trends, but Jubilant Foodworks has
 a more specific bullish pattern (Cup & Handle) that supports its current
 movement. Bikaji, on the other hand, has demonstrated strong growth over the
 past year and continues to show positive momentum with a recent price
 increase. The volume in Jubilant Foodworks correlates well with upward
 movements, indicating strong buying interest, while Bikaji's performance
 suggests sustained or accelerated growth.

The stock behavior reflects different strengths: Jubilant Foodworks benefits
 from a clear bullish pattern and strong support levels, whereas Bikaji
 stands out with its year-to-date growth. Both indicate positive
 developments, but the contexts and patterns differ slightly based on their
 respective market positions and dynamics.

As seen from the final output, the agentic system has again produced a reasonable analysis and comparison of the share price behaviour in the stock charts, with detailed explanations of the trends observed, such as Bikaji’s sustained performance in contrast to Jubilant Foodworks’ bullish pattern.

Conclusions

In conclusion, multimodal agentic frameworks mark a transformative shift in AI by blending diverse data types for better real-time decision-making. These systems enhance adaptive intelligence by integrating advanced image analysis and agentic capabilities. As a result, they optimize efficiency and accuracy across various sectors. The Crew AI Vision Tool and DeepSeek R1 model demonstrate how such frameworks enable sophisticated applications, like analyzing stock behaviour. This advancement highlights AI’s growing role in driving innovation and improving decision-making.

Key Takeaways

Multimodal Agentic Frameworks: These frameworks integrate text, images, audio, and video into a unified AI system, enhancing artificial intelligence capabilities. Intelligent agents within these systems independently process, analyze, and synthesize information from diverse sources. This ability allows them to develop a nuanced understanding of complex situations, making AI more adaptable and responsive.
Real-Time Adaptation: By merging multimodal inputs with agentic functionality, these systems adapt dynamically to changing environments. This adaptability enables more responsive and intelligent user interactions. The integration of multiple data types enhances operational efficiency across various sectors, including healthcare, manufacturing, and retail. It improves decision-making speed and accuracy, leading to better outcomes.
Image Analysis Capabilities: Agentic AI systems with advanced image recognition can process large volumes of visual data in real time, delivering precise results for applications where accuracy is critical. These systems autonomously perform intricate tasks, such as medical diagnoses and surveillance, reducing human error and improving productivity.
Crew AI Vision Tool: This tool enables autonomous agents within CrewAI to extract and process text from images, enhancing their decision-making capabilities and improving overall workflow efficiency.
DeepSeek-R1-Distill-Qwen-7B Model: This distilled model delivers robust performance while being more compact, excelling in tasks like mathematical reasoning and factual question answering, making it suitable for analyzing stock behaviour.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Frequently Asked Questions

Q1. What are multimodal agentic frameworks in AI?

Ans. Multimodal agentic frameworks combine diverse data types like text, images, audio, and video into a unified AI system. This integration enables intelligent agents to analyze and process multiple forms of data for more nuanced and efficient decision-making.

Q2. What is Crew AI?

Ans. Crew AI is an advanced, open-source framework designed to coordinate autonomous AI agents into cohesive teams that work collaboratively to complete complex tasks. Each agent within the system is assigned a specific role, equipped with designated tools, and driven by well-defined goals, mimicking the structure and function of a real-world work crew.

Q3. How does the Crew AI Vision Tool enhance multimodal systems?

Ans. The Crew AI Vision Tool allows agents to extract and process text from images. This capability enables the system to understand visual data and integrate it into decision-making processes, further improving workflow efficiency.

Q4. What industries can benefit from agentic AI systems with image analysis capabilities?

Ans. These systems are especially beneficial in industries like healthcare, manufacturing, and retail, where real-time analysis and precision in image recognition are critical for tasks such as medical diagnosis and quality control.

Q5. What are DeepSeek R1’s distilled models?

Ans. DeepSeek-R1’s distilled models are smaller, more efficient versions of the larger DeepSeek-R1 model, created using a process called distillation, which preserves much of the original model’s reasoning power while reducing computational demands. These distilled models are fine-tuned using data generated by DeepSeek-R1. Examples include DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B, and DeepSeek-R1-Distill-Llama-8B, among others.

Nibedita Dutta

Nibedita completed her master’s in Chemical Engineering from IIT Kharagpur in 2014 and is currently working as a Senior Data Scientist. In her current capacity, she works on building intelligent ML-based solutions to improve business processes.
