How to Access Falcon 3?

Yashashwy Alok Last Updated : 09 Jan, 2025
7 min read

Falcon 3 is the newest breakthrough in the Falcon series of large language models, celebrated for its cutting-edge design and open accessibility. Developed by the Technology Innovation Institute (TII), it’s built to meet the growing demands of AI-driven applications, whether that’s generating creative content or analyzing data. What truly sets Falcon 3 apart is its commitment to being open-source, making it easily accessible on platforms like Hugging Face. This ensures researchers, developers, and businesses of all sizes can leverage its capabilities with ease.

Designed for efficiency, scalability, and adaptability, Falcon 3 excels in both training and inference, delivering speed and accuracy without compromising on performance. Its enhanced architecture and fine-tuned parameters make it a versatile powerhouse, paving the way for revolutionary advancements in AI applications.

Falcon 3: Decoder-only Architecture

Falcon 3 represents a leap forward in the AI landscape, offering cutting-edge capabilities in an open-source large language model (LLM). It excels in combining advanced performance with the ability to operate on resource-constrained infrastructures, making it accessible to a broader audience. Unlike traditional LLMs that require high-end GPUs or cloud infrastructure, Falcon 3 can run efficiently on devices as lightweight as laptops, eliminating the dependency on powerful computational resources. This breakthrough democratizes advanced AI, empowering developers, researchers, and businesses to leverage its capabilities without prohibitive costs.

At its core, Falcon 3 adopts a decoder-only architecture, a streamlined design optimized for tasks like text generation, reasoning, and comprehension. This architecture enables the models to focus on generating coherent, contextually relevant outputs, making them particularly effective for applications such as dialogue systems, creative content generation, and summarization. By eschewing the encoder-decoder complexity seen in some architectures, Falcon 3 maintains high efficiency while still achieving state-of-the-art performance in benchmarks.

The Falcon 3 lineup consists of four scalable models: 1B, 3B, 7B, and 10B, each available in both Base and Instruct versions. These models cater to a diverse range of applications:

  • Base models are ideal for general-purpose tasks, such as language understanding and text generation.
  • Instruct models are fine-tuned for instruction-following tasks, making them perfect for applications like customer service chatbots or virtual assistants.

Whether you’re developing generative AI tools, exploring complex reasoning, or implementing specialized instruction-following systems, Falcon 3 offers unparalleled flexibility and efficiency. Its scalable architecture and decoder-focused design ensure that it delivers exceptional results across a wide spectrum of use cases, all while remaining resource-friendly.

Key features of the decoder-only architecture:
  • Falcon 3 is built on a decoder-only architecture, optimized for speed and resource efficiency.
  • Uses Flash Attention 2 and Grouped Query Attention (GQA):
    • GQA reduces memory usage during inference by sharing key and value heads across groups of query heads, enabling faster and more efficient processing.
  • The tokenizer supports a vocabulary of 131K tokens, double that of Falcon 2.
  • Trained with a 32K context size, enabling better handling of long-context data.
  • While this context length is substantial, some other models offer longer context capabilities.
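To make the GQA point concrete, here is a back-of-the-envelope sketch of KV-cache memory with and without grouped queries. The layer and head counts below are illustrative assumptions for a mid-sized model, not Falcon 3’s published configuration:

```python
# Back-of-the-envelope KV-cache size: illustrative numbers, not Falcon 3's
# actual configuration. GQA stores keys/values for a small number of KV heads
# shared by groups of query heads, instead of one K/V pair per query head (MHA).

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_param=2):
    # 2x for keys and values; fp16 = 2 bytes per element
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_param

# Assumed shapes: 32 layers, head_dim 128, the 32K context mentioned above
mha = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, context_len=32_768)
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8,  head_dim=128, context_len=32_768)

print(f"MHA cache: {mha / 2**30:.1f} GiB, GQA cache: {gqa / 2**30:.1f} GiB")
print(f"Reduction: {mha // gqa}x")  # 32 KV heads shrunk to 8 shared heads = 4x smaller cache
```

The exact savings depend on the real head counts, but the shape of the argument holds: the KV cache scales with the number of KV heads, so sharing them directly cuts inference memory.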


Comparison of Falcon 3 with Other Models

Here’s the comparison table:

| Category | Benchmark | Llama3.1-8B | Qwen2.5-7B | Falcon3-7B-Base | Gemma2-9b | Falcon3-10B-Base | Falcon3-Mamba-7B |
|---|---|---|---|---|---|---|---|
| General | MMLU (5-shot) | 65.2 | 74.2 | 67.5 | 70.8 | 73.1 | 64.9 |
| | MMLU-PRO (5-shot) | 32.7 | 43.5 | 39.2 | 41.4 | 42.5 | 30.4 |
| | IFEval | 12.0 | 33.9 | 34.3 | 21.2 | 36.4 | 28.9 |
| Math | GSM8K (5-shot) | 49.4 | 82.9 | 76.2 | 69.1 | 81.4 | 65.9 |
| | MATH Lvl-5 (4-shot) | 4.1 | 15.5 | 18.0 | 10.5 | 22.9 | 19.3 |
| Reasoning | Arc Challenge (25-shot) | 58.2 | 63.2 | 63.1 | 67.5 | 62.6 | 56.7 |
| | GPQA (0-shot) | 31.0 | 33.0 | 35.5 | 33.4 | 34.1 | 31.0 |
| | MUSR (0-shot) | 38.0 | 44.2 | 47.3 | 45.3 | 44.2 | 34.3 |
| | BBH (3-shot) | 46.5 | 54.0 | 51.0 | 54.3 | 59.7 | 46.8 |
| Common Sense Understanding | PIQA (0-shot) | 81.2 | 79.9 | 79.1 | 82.9 | 79.4 | 79.5 |
| | SciQ (0-shot) | 94.6 | 95.2 | 92.4 | 97.1 | 93.5 | 92.0 |
| | Winogrande (0-shot) | 74.0 | 72.9 | 71.0 | 74.2 | 73.6 | 71.3 |
| | OpenbookQA (0-shot) | 44.8 | 47.0 | 43.8 | 47.2 | 45.0 | 45.8 |

1. General Knowledge (MMLU, MMLU-PRO, and IFEval)

These benchmarks test how much the model knows about general topics and professional-level knowledge.

  • Best performer:
    Qwen2.5-7B scores the highest for general knowledge (74.2 in MMLU). It’s like the class topper in this category.
  • Falcon Models:
    • Falcon3-7B-Base: Pretty decent at 67.5—not as great as Qwen but better than most others.
    • Falcon3-10B-Base: Does even better with 73.1, closing in on Qwen.
    • Falcon3-Mamba-7B: This one lags behind at 64.9 in MMLU and struggles with professional-level knowledge (MMLU-PRO, 30.4).
  • What it means:
    If you’re looking for a model to answer general knowledge or professional-level questions, Falcon3-10B is a great choice, but Qwen2.5-7B still edges it out.

2. Math (GSM8K and MATH Level-5)

Here, the benchmarks test the ability to solve math problems, from basic to advanced levels.

  • Best performer:
    Qwen2.5-7B crushes the competition in GSM8K with 82.9. For advanced math (MATH Level-5), Falcon3-10B-Base wins with 22.9, showing it handles tougher problems better.
  • Falcon Models:
    • Falcon3-7B-Base does surprisingly well in GSM8K with 76.2, showing it’s good at basic math problems.
    • Falcon3-Mamba-7B falls behind at 65.9 in GSM8K, which is still decent but not competitive with the best.
  • What it means:
    If you need strong math capabilities, go for Falcon3-10B-Base or Qwen2.5-7B. They’re the math whizzes here.

3. Reasoning (Arc Challenge, GPQA, MUSR, and BBH)

Reasoning tasks test how well the models can think logically and connect ideas.

  • Best performer:
    • Gemma2-9b is the reasoning champ, scoring 67.5 in Arc Challenge and leading in several benchmarks.
    • Falcon3-10B-Base shines in BBH (Big Bench Hard), scoring 59.7, showing it can handle really tough reasoning tasks.
  • Falcon Models:
    • Falcon3-7B-Base is a solid performer in reasoning, especially in MUSR (47.3) and Arc Challenge (63.1). It’s not the best, but it holds its ground.
    • Falcon3-Mamba-7B struggles a bit here, with lower scores like 56.7 in Arc Challenge and 46.8 in BBH.
  • What it means:
    If your task is reasoning-heavy, Gemma2-9b and Falcon3-10B-Base are strong choices. Falcon3-7B is also a good budget option.

4. Common Sense Understanding (PIQA, SciQ, Winogrande, and OpenbookQA)

This category checks how well the models understand real-world logic and common sense.

  • Best performer:
    • Gemma2-9b leads in most tasks, like PIQA (82.9) and SciQ (97.1). It’s great at commonsense and science-based QA.
    • Falcon3-10B-Base is close behind, scoring 93.5 in SciQ and 79.4 in PIQA.
  • Falcon Models:
    • Falcon3-7B-Base does well in PIQA (79.1) and SciQ (92.4)—not the best, but very competitive.
    • Falcon3-Mamba-7B holds steady here, scoring 79.5 in PIQA, but lags behind slightly in tasks like SciQ (92.0).
  • What it means:
    For tasks that involve everyday logic or science, Gemma2-9b and Falcon3-10B-Base are the top picks. Falcon3-7B-Base is still solid if you’re looking for a balanced option.

The Falcon models strike a balance between performance and versatility. While Falcon3-10B-Base is the clear leader in raw power, Falcon3-7B-Base offers a cost-effective option for most tasks, and Falcon3-Mamba-7B caters to specialized needs.
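One way to sanity-check the takeaways above is to put the table into code and let it report the top scorer per benchmark. A minimal sketch, with a few rows transcribed from the comparison table:

```python
# A few benchmark rows transcribed from the comparison table above
scores = {
    "MMLU (5-shot)": {"Llama3.1-8B": 65.2, "Qwen2.5-7B": 74.2, "Falcon3-7B-Base": 67.5,
                      "Gemma2-9b": 70.8, "Falcon3-10B-Base": 73.1, "Falcon3-Mamba-7B": 64.9},
    "GSM8K (5-shot)": {"Llama3.1-8B": 49.4, "Qwen2.5-7B": 82.9, "Falcon3-7B-Base": 76.2,
                       "Gemma2-9b": 69.1, "Falcon3-10B-Base": 81.4, "Falcon3-Mamba-7B": 65.9},
    "MATH Lvl-5 (4-shot)": {"Llama3.1-8B": 4.1, "Qwen2.5-7B": 15.5, "Falcon3-7B-Base": 18.0,
                            "Gemma2-9b": 10.5, "Falcon3-10B-Base": 22.9, "Falcon3-Mamba-7B": 19.3},
    "BBH (3-shot)": {"Llama3.1-8B": 46.5, "Qwen2.5-7B": 54.0, "Falcon3-7B-Base": 51.0,
                     "Gemma2-9b": 54.3, "Falcon3-10B-Base": 59.7, "Falcon3-Mamba-7B": 46.8},
}

# Print the best model per benchmark
for bench, results in scores.items():
    best = max(results, key=results.get)
    print(f"{bench}: {best} ({results[best]})")
```

Running this confirms the narrative above: Qwen2.5-7B tops MMLU and GSM8K, while Falcon3-10B-Base leads the harder MATH Level-5 and BBH benchmarks.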

Accessing Falcon 3-10B Through Ollama in Colab

Falcon 3-10B, a state-of-the-art language model, can be accessed programmatically using Ollama and Python libraries like LangChain. This approach enables seamless integration of the model into a Colab environment for diverse use cases such as content generation, problem-solving, and more. Below are the detailed steps to set up and interact with Falcon 3-10B:

1. Install Ollama and Dependencies

To begin, you need to install the necessary system tools and the Ollama CLI, which acts as a bridge for interacting with Falcon 3-10B. The following commands will:

  • Update your system’s package manager.
  • Install essential utilities like pciutils.
  • Download and install the Ollama CLI directly.

Commands:

!sudo apt update
!sudo apt install -y pciutils
!curl -fsSL https://ollama.com/install.sh | sh

This ensures your environment is ready for the Falcon 3-10B setup.
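One caveat worth noting: the install script only sets up the CLI. A Colab notebook has no background service, so before querying you typically need to start the Ollama server yourself and pull the model weights. A sketch of those extra steps, following a common Colab pattern (the `falcon3:10b` tag matches the model identifier used in the Python example):

```shell
# Start the Ollama server in the background (Colab has no init system to run it as a service)
!nohup ollama serve > ollama.log 2>&1 &

# Download the Falcon 3-10B weights before the first query (a multi-gigabyte download)
!ollama pull falcon3:10b
```

If the later query step fails with a connection error, check `ollama.log` to confirm the server actually started.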

2. Install Required Python Libraries

Once the CLI is installed, you’ll need to install the Python libraries required for programmatic access. The LangChain Core library and its Ollama extension allow you to craft custom prompts and query models seamlessly.

Commands:

!pip install langchain-core
!pip install langchain-ollama
!pip install ipython

These libraries will enable you to design workflows that interact with the Falcon 3-10B model.

3. Query Falcon 3-10B

After the installation, you can interact with Falcon 3-10B using a Python script. The example below demonstrates how to:

  • Define a structured prompt template.
  • Load the model using the Ollama integration.
  • Create a query chain combining the prompt and model.
  • Retrieve and display the model’s response in Markdown format.

Python Code:

from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM
from IPython.display import Markdown, display

# Prompt template that nudges the model toward step-by-step answers
template = """Question: {question}
Answer: Let's think step by step."""
prompt = ChatPromptTemplate.from_template(template)

# Load the Falcon 3-10B model served by Ollama
model = OllamaLLM(model="falcon3:10b")

# Combine the prompt and model into a single pipeline
chain = prompt | model

# Query Falcon 3-10B and render the response as Markdown
display(Markdown(chain.invoke({"question": "fibonacci series code"})))

Explanation of Code:

  • ChatPromptTemplate: Structures the input query to provide step-by-step responses.
  • OllamaLLM: Loads the Falcon 3-10B model, specifying the exact model identifier.
  • Chain: Combines the prompt and model into a single pipeline for querying.
  • Markdown Display: Ensures the response is shown in a clean, readable format.

This script queries the model for Python code to generate a Fibonacci series and displays the result.

Output:

The model responds with a step-by-step explanation followed by Fibonacci code, rendered as Markdown (output screenshot omitted).
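For reference, a typical answer to this prompt looks something like the following runnable snippet. This is an illustrative version, not the model’s verbatim output:

```python
def fibonacci(n):
    """Return the first n numbers of the Fibonacci series."""
    series = []
    a, b = 0, 1
    for _ in range(n):
        series.append(a)
        a, b = b, a + b
    return series

print(fibonacci(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Because the model is sampling text, your actual output may use recursion, a generator, or different variable names on each run.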

4. Automate and Extend

The framework is not limited to basic queries. You can:

  • Automate repetitive tasks like generating multiple content pieces or answering FAQs.
  • Solve complex problems, such as coding tasks or mathematical computations.
  • Integrate Falcon 3-10B into larger applications, like chatbots or data analysis tools.

By modifying the prompt or model identifier, you can tailor this setup for various domains, including technical documentation, creative writing, and educational content.

Conclusion

Falcon 3-10B represents a significant leap forward in the field of open-source large language models, combining state-of-the-art capabilities with the flexibility and accessibility needed for a wide range of applications. By now you should understand how to access Falcon 3-10B; its integration with Ollama and Python libraries like LangChain makes it easier than ever for developers, researchers, and enterprises to harness its power in environments like Google Colab.

With straightforward installation steps, an intuitive querying process, and the ability to automate and extend its functionality, Falcon 3-10B stands out as a versatile tool for tasks such as content generation, problem-solving, and advanced data analysis. The combination of cutting-edge performance and open-source accessibility solidifies Falcon 3-10B as an invaluable asset for those seeking to push the boundaries of natural language processing in their projects.

Whether you’re a developer exploring new possibilities, a researcher diving into NLP innovations, or an enterprise looking for scalable AI solutions, Falcon 3-10B offers a robust and adaptable platform to meet your needs. Its commitment to open-source principles ensures that the latest advancements in AI remain within reach for everyone, empowering the community to innovate and excel.

Frequently Asked Questions

Q1. What are the system requirements to access Falcon 3-10B in Colab?

Ans. To run Falcon 3-10B in Colab, ensure the following:
Colab Environment: Use Google Colab Pro or Pro+ for better performance since Falcon 3-10B is resource-intensive.
Python Version: Python 3.8 or higher.
RAM: A minimum of 16GB is recommended to handle the model effectively.
Dependencies: Install required tools like pciutils and libraries like LangChain Core and Ollama CLI.

Q2. How do I troubleshoot installation issues with Ollama CLI?

Ans. If you encounter issues during the installation of Ollama CLI:
1. Verify your internet connection as the installer fetches files online.
2. Ensure that curl is installed on your system (sudo apt install curl).
3. Check permissions and rerun the command with sudo.
4. If the problem persists, refer to the Ollama documentation or their GitHub page for updates or alternative installation methods.

Q3. Can Falcon 3-10B be fine-tuned for specific applications?

Ans. Yes, Falcon 3-10B supports fine-tuning. While the example focuses on querying the pre-trained model, you can fine-tune Falcon 3-10B using custom datasets for domain-specific tasks. This requires additional computational resources and expertise in fine-tuning large language models.

Q4. How secure is using Falcon 3-10B in cloud-based environments like Colab?

Ans. Using Falcon 3-10B in Colab is generally secure, but follow these practices:
1. Avoid sharing sensitive data directly with the model.
2. Use encrypted connections and APIs if integrating with external systems.
3. Regularly update the libraries and dependencies to patch any security vulnerabilities.

Q5. Can Falcon 3-10B generate outputs in languages other than English?

Ans. Yes, Falcon 3-10B supports multilingual capabilities. You can query the model in various languages, provided the language is supported by its training data. For improved results, structure your prompts clearly and include examples in the desired language.

