Top 12 Free APIs for AI Development

Harsh Mishra Last Updated : 01 Mar, 2025


In today's rapidly evolving digital world, the ability to use artificial intelligence (AI) is becoming essential. Large language models (LLMs) now let businesses improve customer relations, optimize processes, and drive innovation. But how can this potential be realized without a large budget or deep expertise? LLM APIs are the key to incorporating cutting-edge AI capabilities into your apps.

LLM APIs act as intermediaries between your software and the complex world of artificial intelligence, so you can use Natural Language Processing (NLP) and language understanding without building intricate models from scratch. Whether you want to create intelligent coding assistants or improve customer service chatbots, LLM APIs give you the resources you need to succeed.

Understanding LLM APIs
Free API for LLMs Resources
OpenRouter – Free API
Google AI Studio – Free API
Mistral (La Plateforme) – Free API
HuggingFace Serverless Inference – Free API
Cerebras – Free API
Groq – Free API
Scaleway Generative Free API
OVH AI Endpoints – Free API
Together Free API
Cohere – Free API
GitHub Models – Free API
Fireworks AI – Free API
Benefits of Using LLM-Free APIs
Tips for Efficient Use of LLM-Free APIs
Conclusion

Understanding LLM APIs

LLM APIs operate on a straightforward request-response model:

Request Submission: Your application sends a request to the API, formatted in JSON, containing the model variant, prompt, and parameters.
Processing: The API forwards this request to the LLM, which processes it using its NLP capabilities.
Response Delivery: The LLM generates a response, which the API sends back to your application.
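The three steps above can be sketched without calling any real service. The snippet below is a minimal illustration, assuming the OpenAI-style field names (`model`, `messages`, `choices`) that many of the providers in this list accept. The model name `example-model` and the stand-in response are hypothetical, and real providers may require additional fields.

```python
import json


def build_chat_request(model: str, prompt: str, **params) -> str:
    """Serialize a chat request body as JSON (OpenAI-style field names assumed)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    body.update(params)  # optional parameters such as temperature or max_tokens
    return json.dumps(body)


def extract_reply(response_json: str) -> str:
    """Pull the assistant's text out of an OpenAI-style JSON response."""
    data = json.loads(response_json)
    return data["choices"][0]["message"]["content"]


# Step 1: the request your application would send
payload = build_chat_request("example-model", "Hello", temperature=0.7, max_tokens=100)

# Steps 2 and 3: a hand-written stand-in for what a provider might return
fake_response = json.dumps(
    {"choices": [{"message": {"role": "assistant", "content": "Hi there!"}}]}
)

print(payload)
print(extract_reply(fake_response))
```

In practice, the SDKs shown later in this article build and parse these JSON bodies for you.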

Pricing and Tokens

Tokens: In the context of LLMs, tokens are the smallest units of text processed by the model. Pricing is typically based on the number of tokens used, with separate charges for input and output tokens.
Cost Management: Most providers offer pay-as-you-go pricing, allowing businesses to manage costs effectively based on their usage patterns.
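To make token-based pricing concrete, here is a small sketch of the arithmetic. The per-million-token prices used below are hypothetical placeholders, not any provider's actual rates. Check the provider's pricing page for real figures.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate the cost of one request when input and output tokens are billed separately."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# Hypothetical prices: $0.50 per million input tokens, $1.50 per million output tokens
cost = estimate_cost(2_000, 500, 0.50, 1.50)
print(f"Estimated cost: ${cost:.5f}")
```

On a free tier the cost is zero, but the same token counts are what the rate limits below are measured in.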

Free API for LLMs Resources

To help you get started without incurring costs, here's a comprehensive list of free LLM API providers, along with their descriptions, advantages, pricing, and rate limits.

1. OpenRouter – Free API

OpenRouter provides a variety of LLMs for different tasks, making it a versatile choice for developers. The platform allows up to 20 requests per minute and 200 requests per day.
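One way to stay within limits like these is to track request timestamps on the client side. The sketch below is an illustrative helper, not part of any provider's SDK. The `RequestLimiter` class and its method names are hypothetical, and the defaults simply mirror the 20-per-minute and 200-per-day figures quoted above.

```python
import time
from collections import deque


class RequestLimiter:
    """Client-side guard that tracks request timestamps in sliding windows."""

    def __init__(self, per_minute=20, per_day=200, clock=time.monotonic):
        # (window length in seconds, max requests allowed in that window)
        self.limits = [(60.0, per_minute), (86_400.0, per_day)]
        self.clock = clock
        self.sent = deque()  # timestamps of requests sent within the last day

    def wait_time(self) -> float:
        """Return seconds to wait before the next request (0.0 if allowed now)."""
        now = self.clock()
        # Forget requests that are more than a day old
        while self.sent and now - self.sent[0] >= 86_400.0:
            self.sent.popleft()
        wait = 0.0
        for window, limit in self.limits:
            recent = [t for t in self.sent if now - t < window]
            if len(recent) >= limit:
                # Wait until the oldest of the last `limit` requests leaves the window
                wait = max(wait, recent[-limit] + window - now)
        return wait

    def record(self) -> None:
        """Note that a request is being sent now."""
        self.sent.append(self.clock())


limiter = RequestLimiter()
delay = limiter.wait_time()
if delay > 0:
    time.sleep(delay)
limiter.record()
# ... send the API request here
```

The provider still enforces its own limits on the server side, so this only reduces the chance of rejected requests. It does not guarantee that none occur.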

Some of the notable models available include:

DeepSeek R1
Llama 3.3 70B Instruct
Mistral 7B Instruct

All available models: Link
Documentation: Link

Advantages

High request limits.
A diverse range of models.

Pricing: Free tier available.

Example Code

from openai import OpenAI
client = OpenAI(
 base_url="https://openrouter.ai/api/v1",
 api_key="<OPENROUTER_API_KEY>",
)
completion = client.chat.completions.create(
 model="cognitivecomputations/dolphin3.0-r1-mistral-24b:free",
 messages=[
   {
     "role": "user",
     "content": "What is the meaning of life?"
   }
 ]
)
print(completion.choices[0].message.content)

Output

The meaning of life is a profound and multifaceted question explored through
 diverse lenses of philosophy, religion, science, and personal experience.
 Here's a synthesis of key perspectives:

1. **Existentialism**: Philosophers like Sartre argue life has no inherent
 meaning. Instead, individuals create their own purpose through actions and
 choices, embracing freedom and responsibility.

2. **Religion/Spirituality**: Many traditions offer frameworks where meaning
 is found through faith, divine connection, or service to a higher cause. For
 example, in Christianity, it might relate to fulfilling God's will.

3. **Psychology/Philosophy**: Viktor Frankl proposed finding meaning through
 work, love, and overcoming suffering. Others suggest meaning derives from
 personal growth, relationships, and contributing to something meaningful.

4. **Science**: While natural selection emphasizes survival, many see life's
 meaning in consciousness, creativity, or bonds formed with others,
 transcending mere biological imperatives.

5. **Art/Culture**: Through art, music, or literature, individuals express
 their search for meaning, often finding it in beauty, expression, or
 collective storytelling.

**Conclusion**: Ultimately, the meaning of life is subjective. It emerges
 from the interplay of experiences, beliefs, and personal choices. Whether
 through love, contribution, spirituality, or self-discovery, it is a journey
 where individuals define their own purpose. This diversity highlights the
 richness and mystery of existence, inviting each person to explore and craft
 their own answer.

2. Google AI Studio – Free API

Google AI Studio is a powerful platform for AI model experimentation, offering generous limits for developers. It allows up to 1,000,000 tokens per minute and 1,500 requests per day.

Some models available include:

Gemini 2.0 Flash
Gemini 1.5 Flash

All available models: Link
Documentation: Link

Advantages

Access to powerful models.
High token limits.

Pricing: Free tier available.

Example Code

from google import genai
client = genai.Client(api_key="YOUR_API_KEY")
response = client.models.generate_content(
   model="gemini-2.0-flash",
   contents="Explain how AI works",
)
print(response.text)

Output

Okay, let's break down how AI works, from the high-level concepts to some of
the core techniques. It's a vast field, so I'll try to provide a clear and
accessible overview.

**What is AI, Really?**

At its core, Artificial Intelligence (AI) aims to create machines or systems
that can perform tasks that typically require human intelligence. This
includes things like:

* **Learning:** Acquiring information and rules for using the information

* **Reasoning:** Using information to draw conclusions, make predictions,
and solve problems.

* **Problem-solving:** Finding solutions to complex situations.

* **Perception:** Interpreting sensory data (like images, sound, or text).

* **Natural Language Processing (NLP):** Understanding and generating
human language.

* **Planning:** Creating sequences of actions to achieve a goal.

**The Key Approaches & Techniques**

AI isn't a single technology, but rather a collection of different approaches
and techniques. Here are some of the most important:

1. **Machine Learning (ML):**

* **The Foundation:** ML is the most prominent approach to AI today.
Instead of explicitly programming a machine to perform a task, you *train*
it on data. The machine learns patterns from the data and uses those
patterns to make predictions or decisions on new, unseen data.

* **How it works:**

* **Data Collection:** Gather a large dataset relevant to the task
you want the AI to perform. For example, if you want to build an AI to
recognize cats in images, you need a dataset of many images of cats (and
ideally, images that aren't cats).

* **Model Selection:** Choose a suitable ML model. Different
models are good for different types of problems. Examples include:

* **Linear Regression:** For predicting continuous values
(e.g., house prices).

* **Logistic Regression:** For predicting categorical values
(e.g., spam/not spam).

* **Decision Trees:** For making decisions based on a tree-like
structure.

* **Support Vector Machines (SVMs):** For classification
tasks, finding the best boundary between classes.

* **Neural Networks:** Inspired by the structure of the human
brain, excellent for complex tasks like image recognition, natural language
processing, and more.

* **Training:** Feed the data into the chosen model. The model
adjusts its internal parameters (weights, biases, etc.) to minimize errors
and improve its ability to make accurate predictions. This process involves:

* **Forward Propagation:** The input data is passed through the
model to generate a prediction.

* **Loss Function:** A loss function calculates the difference
between the model's prediction and the actual correct answer. The goal is
to minimize this loss.

* **Backpropagation:** The model uses the loss to adjust its
internal parameters (weights and biases) to improve its predictions in the
future. This is how the model "learns."

* **Optimization:** Algorithms (like gradient descent) are used
to find the parameter values that minimize the loss function.

* **Evaluation:** After training, you evaluate the model on a
separate dataset (the "test set") to see how well it generalizes to unseen
data. This helps you determine if the model is accurate enough and if it's
overfitting (performing well on the training data but poorly on new data).

* **Deployment:** If the model performs well, it can be deployed to
make predictions on real-world data.

* **Types of Machine Learning:**

* **Supervised Learning:** The model is trained on labeled data
(data where the correct answer is already known). Examples: classification
(categorizing data) and regression (predicting continuous values).

* **Unsupervised Learning:** The model is trained on unlabeled
data. It tries to find patterns and structures in the data on its own.
Examples: clustering (grouping similar data points together) and
dimensionality reduction (simplifying data while preserving important
information).

* **Reinforcement Learning:** The model learns by interacting with
an environment and receiving rewards or penalties for its actions. It aims
to learn a policy that maximizes its cumulative reward. Examples: training
AI agents to play games or control robots.

2. **Deep Learning:**

* **A Subfield of ML:** Deep learning is a type of machine learning
that uses artificial neural networks with many layers (hence "deep"). These
deep networks are capable of learning very complex patterns.

* **Neural Networks:** Neural networks are composed of interconnected
nodes (neurons) organized in layers. Each connection has a weight associated
with it, which determines the strength of the connection. The network
learns by adjusting these weights.

* **How it works:** Deep learning models are trained in a similar way
to other ML models, but they require significantly more data and
computational power due to their complexity. The layers of the network
learn increasingly abstract features from the data. For example, in image
recognition, the first layers might learn to detect edges and corners, while
the later layers learn to recognize more complex objects like faces or cars.

* **Applications:** Deep learning has achieved remarkable success in
areas like image recognition, natural language processing, speech
recognition, and game playing. Examples include:

* **Computer Vision:** Image classification, object detection,
image segmentation.

* **Natural Language Processing:** Machine translation, text
summarization, sentiment analysis, chatbot development.

* **Speech Recognition:** Converting speech to text.

3. **Natural Language Processing (NLP):**

* **Enabling AI to Understand and Generate Language:** NLP focuses on
enabling computers to understand, interpret, and generate human language.

* **Key Techniques:**

* **Tokenization:** Breaking down text into individual words or
units (tokens).

* **Part-of-Speech (POS) Tagging:** Identifying the grammatical
role of each word (e.g., noun, verb, adjective).

* **Named Entity Recognition (NER):** Identifying and classifying
named entities (e.g., people, organizations, locations).

* **Sentiment Analysis:** Determining the emotional tone of a piece
of text (e.g., positive, negative, neutral).

* **Machine Translation:** Translating text from one language to
another.

* **Text Summarization:** Generating a concise summary of a longer
text.

* **Topic Modeling:** Discovering the main topics discussed in a
collection of documents.

* **Applications:** Chatbots, virtual assistants, machine translation,
sentiment analysis, spam filtering, search engines, and more.

4. **Knowledge Representation and Reasoning:**

* **Symbolic AI:** This approach focuses on representing knowledge
explicitly in a symbolic form (e.g., using logical rules or semantic
networks).

* **Reasoning:** AI systems can use this knowledge to reason and draw
conclusions, often using techniques like:

* **Inference Engines:** Apply logical rules to derive new facts
from existing knowledge.

* **Rule-Based Systems:** Use a set of rules to make decisions or
solve problems.

* **Semantic Networks:** Represent knowledge as a graph of
interconnected concepts.

* **Applications:** Expert systems (systems that provide expert-level
advice in a specific domain), automated reasoning systems, and knowledge-
based systems.

5. **Robotics:**

* **Combining AI with Physical Embodiment:** Robotics combines AI with
mechanical engineering to create robots that can perform physical tasks.

* **Key Challenges:**

* **Perception:** Enabling robots to perceive their environment
using sensors (e.g., cameras, lidar, sonar).

* **Planning:** Planning sequences of actions to achieve a goal.

* **Control:** Controlling the robot's movements and actions.

* **Localization and Mapping:** Enabling robots to determine their
location and build a map of their environment.

* **Applications:** Manufacturing, logistics, healthcare, exploration,
and more.

**The AI Development Process (Simplified)**

Here's a simplified view of how an AI project typically unfolds:

1. **Define the Problem:** Clearly identify the task you want the AI to
perform.

2. **Gather Data:** Collect a relevant dataset. The quality and quantity of
data are crucial for AI success.

3. **Choose an Approach:** Select the appropriate AI technique (e.g., machine learning, deep learning, rule-based system).

4. **Build and Train the Model:** Develop and train the AI model using the
collected data.

5. **Evaluate the Model:** Assess the model's performance and make
adjustments as needed.

6. **Deploy and Monitor:** Deploy the AI system and continuously monitor
its performance, retraining as needed.

**Important Considerations:**

* **Ethics:** AI raises important ethical considerations, such as bias in
algorithms, privacy concerns, and the potential for job displacement.

* **Bias:** AI models can inherit biases from the data they are trained
on, leading to unfair or discriminatory outcomes.

* **Explainability:** Some AI models (especially deep learning models) can
be difficult to understand and explain, which raises concerns about
accountability and trust.

* **Security:** AI systems can be vulnerable to attacks, such as
adversarial attacks that can fool the system into making incorrect
predictions.

**In Summary:**

AI is a broad and rapidly evolving field that aims to create intelligent
machines. It relies on a variety of techniques, including machine learning,
deep learning, natural language processing, knowledge representation, and
robotics. While AI has made remarkable progress in recent years, it also
presents significant challenges and ethical considerations that must be
addressed. It's a field with immense potential to transform many aspects of
our lives, but it's important to approach it responsibly.

3. Mistral (La Plateforme) – Free API

Mistral offers a variety of models for different applications, focusing on high performance. The platform allows 1 request per second and 500,000 tokens per minute. Some models available include:

mistral-large-2402
ministral-8b-latest

All available models: Link
Documentation: Link

Advantages

High request limits.
Focus on experimentation.

Pricing: Free tier available.

Example Code

import os
from mistralai import Mistral
api_key = os.environ["MISTRAL_API_KEY"]
model = "mistral-large-latest"
client = Mistral(api_key=api_key)
chat_response = client.chat.complete(
   model=model,
   messages=[
       {
           "role": "user",
           # The key must be lowercase "content"
           "content": "What is the best French cheese?",
       },
   ]
)
print(chat_response.choices[0].message.content)

Output

The "best" French cheese can be subjective as it depends on personal taste
 preferences. However, some of the most famous and highly regarded French
 cheeses include:

1. Roquefort: A blue-veined sheep's milk cheese from the Massif Central
 region, known for its strong, pungent flavor and creamy texture.

2. Brie de Meaux: A soft, creamy cow's milk cheese with a white rind,
 originating from the Brie region near Paris. It is known for its mild,
 buttery flavor and can be enjoyed at various stages of ripeness.

3. Camembert: Another soft, creamy cow's milk cheese with a white rind,
 similar to Brie de Meaux, but often more pungent and runny. It comes from
 the Normandy region.

4. Comté: A hard, nutty, and slightly sweet cow's milk cheese from the
 Franche-Comté region, often used in fondues and raclettes.

5. Munster: A semi-soft, washed-rind cow's milk cheese from the Alsace
 region, known for its strong, pungent aroma and rich, buttery flavor.

6. Reblochon: A semi-soft, washed-rind cow's milk cheese from the Savoie
 region, often used in fondue and tartiflette.

4. HuggingFace Serverless Inference – Free API

HuggingFace provides a platform for deploying and using various open models. It is limited to models smaller than 10GB and offers variable credits per month.

Some models available include:

GPT-2
DistilBERT

All available models: Link
Documentation: Link

Advantages

Wide range of models.
Easy integration.

Pricing: Variable credits per month.

Example Code

from huggingface_hub import InferenceClient
client = InferenceClient(
    provider="hf-inference",
    api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxx",
)
messages = [
    {
        "role": "user",
        "content": "What is the capital of Germany?"
    }
]
completion = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=messages,
    max_tokens=500,
)
print(completion.choices[0].message)

Output

ChatCompletionOutputMessage(role='assistant', content='The capital of Germany
 is Berlin.', tool_calls=None)

5. Cerebras – Free API

Cerebras provides access to Llama models with a focus on high performance. The platform allows 30 requests per minute and 60,000 tokens per minute.
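When a provider enforces both a request limit and a token limit, whichever runs out first sets your real throughput. The following is a rough sketch of that arithmetic using the figures quoted above. The function name is hypothetical, and it assumes every request consumes about the same number of tokens (input plus output).

```python
def effective_requests_per_minute(rpm_limit: int, tpm_limit: int, tokens_per_request: int) -> int:
    """Requests per minute you can sustain when both limits apply."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    # The token budget caps how many requests of this size fit into one minute
    by_tokens = tpm_limit // tokens_per_request
    return min(rpm_limit, by_tokens)


# 30 requests/minute and 60,000 tokens/minute, as quoted above
print(effective_requests_per_minute(30, 60_000, 1_000))  # request limit binds
print(effective_requests_per_minute(30, 60_000, 4_000))  # token limit binds
```

Short prompts are capped by the request limit, while long prompts or long completions hit the token limit first.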

Some models available include:

Llama 3.1 8B
Llama 3.3 70B

All available models: Link
Documentation: Link

Advantages

High request limits.
Powerful models.

Pricing: Free tier available (requires joining the waitlist).

Example Code

import os
from cerebras.cloud.sdk import Cerebras
client = Cerebras(
 api_key=os.environ.get("CEREBRAS_API_KEY"),
)
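chat_completion = client.chat.completions.create(
 messages=[
   {"role": "user", "content": "Why is fast inference important?"}
 ],
 model="llama3.1-8b",
)
# Print the reply text (the original snippet never displayed the response)
print(chat_completion.choices[0].message.content)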

Output

Fast inference is crucial in various applications because it has several
 benefits, including:

1. **Real-time decision making**: In applications where decisions need to be
 made in real-time, such as autonomous vehicles, medical diagnosis, or online
 recommendation systems, fast inference is essential to avoid delays and
 ensure timely responses.

2. **Scalability**: Machine learning models can process a high volume of data
 in real-time, which requires fast inference to keep up with the pace. This
 ensures that the system can handle large numbers of users or events without
 significant latency.

3. **Energy efficiency**: In deployment environments where power consumption
 is limited, such as edge devices or mobile devices, fast inference can help
 optimize energy usage by reducing the time spent on computations.

4. **Cost-effectiveness**: Faster inference can help reduce computing
 resources, such as GPU or CPU capacity, which can lead to lower costs and
 more efficient usage.

5. **Improved user experience**: Fast inference ensures that users receive
 quick and accurate results, leading to a better overall experience and
 increasing user engagement.

6. **Reduced latency**: In applications where latency is critical, such as
 online gaming, voice assistants, or customer service, fast inference
 minimizes the time between user input and response, resulting in a smoother
 experience.

7. **Optimization for inference engines**: Many inference engines have
 optimized for faster inference speeds for deployment on edge devices. Some
 cloud-based services specifically optimize their inference speed and
 latency.

Key areas where fast inference is essential include:

1. **Computer vision**: Applications like image classification, object
 detection, and facial recognition require fast inference to analyze and
 process visual data in real-time.

2. **Natural Language Processing (NLP)**: NLP models need fast inference to
 understand and process text input, such as chatbots, speech recognition, and
 sentiment analysis.

3. **Recommendation systems**: Online recommendation systems rely on fast
 inference to predict and personalize user experiences.

4. **Autonomous systems**: Autonomous vehicles, drones, and robots require
 fast inference to make real-time decisions about navigation, obstacle
 avoidance, and control.

In summary, fast inference is crucial in various applications where real-time
 decision making, scalability, energy efficiency, cost-effectiveness, user
 experience, and reduced latency are critical factors.

6. Groq – Free API

Groq offers various models for different applications, allowing 1,000 requests per day and 6,000 tokens per minute.

Some models available include:

DeepSeek R1 Distill Llama 70B
Gemma 2 9B Instruct

All available models: Link
Documentation: Link

Advantages

High request limits.
Diverse model options.

Pricing: Free tier available.

Example Code

import os
from groq import Groq
client = Groq(
   api_key=os.environ.get("GROQ_API_KEY"),
)
chat_completion = client.chat.completions.create(
   messages=[
       {
           "role": "user",
           "content": "Explain the importance of fast language models",
       }
   ],
   model="llama-3.3-70b-versatile",
)
print(chat_completion.choices[0].message.content)

Output

Fast language models are crucial for various applications and industries, and
their importance can be highlighted in several ways:

1. **Real-Time Processing**: Fast language models enable real-time processing
of large volumes of text data, which is essential for applications such as:

* Chatbots and virtual assistants (e.g., Siri, Alexa, Google Assistant) that
need to respond quickly to user queries.

* Sentiment analysis and opinion mining in social media, customer feedback,
and review platforms.

* Text classification and filtering in email clients, spam detection, and content moderation.

2. **Improved User Experience**: Fast language models provide instant responses, which is vital for:

* Enhancing user experience in search engines, recommendation systems, and
content retrieval applications.

* Supporting real-time language translation, which is essential for global
communication and collaboration.

* Facilitating quick and accurate text summarization, which helps users to
quickly grasp the main points of a document or article.

3. **Efficient Resource Utilization**: Fast language models:

* Reduce the computational resources required for training and deployment,
making them more energy-efficient and cost-effective.

* Enable the processing of large volumes of text data on edge devices, such
as smartphones, smart home devices, and wearable devices.

4. **Competitive Advantage**: Organizations that leverage fast language models can:

* Respond faster to changing market conditions, customer needs, and competitor activity.

* Develop more accurate and personalized models, which can lead to improved
customer engagement, retention, and acquisition.

5. **Research and Development**: Fast language models accelerate the research
and development process in natural language processing (NLP) and artificial
intelligence (AI), allowing researchers to:

* Quickly test and validate hypotheses, which can lead to new breakthroughs
and innovations.

* Explore new applications and domains, such as multimodal processing,
explainability, and interpretability.

6. **Scalability and Flexibility**: Fast language models can be easily scaled
up or down to accommodate varying workloads, making them suitable for:

* Cloud-based services, where resources can be dynamically allocated and
deallocated.

* On-premises deployments, where models need to be optimized for specific
hardware configurations.

7. **Edge AI and IoT**: Fast language models are essential for edge AI and
IoT applications, where:

* Low-latency processing is critical for real-time decision-making, such as
in autonomous vehicles, smart homes, and industrial automation.

* Limited computational resources and bandwidth require efficient models that
can operate effectively in resource-constrained environments.

In summary, fast language models are essential for various applications,
industries, and use cases, as they enable real-time processing, improve user
experience, reduce computational resources, and provide a competitive
advantage.

7. Scaleway Generative Free API

Scaleway offers a variety of generative models for free, with 100 requests per minute and 200,000 tokens per minute.

Some models available include:

BGE-Multilingual-Gemma2
Llama 3.1 70B Instruct

All available models: Link
Documentation: Link

Advantages

Generous request limits.
Variety of models.

Pricing: Free beta until March 2025.

Example Code

from openai import OpenAI

# Initialize the client with your base URL and API key
client = OpenAI(
   base_url="https://api.scaleway.ai/v1",
   api_key="<SCW_API_KEY>"
)
# Create a chat completion for Llama 3.1 8b instruct
completion = client.chat.completions.create(
   model="llama-3.1-8b-instruct",
   messages=[{"role": "user", "content": "Describe a futuristic city with advanced technology and green energy solutions."}],
   temperature=0.7,
   max_tokens=100
)
# Output the result
print(completion.choices[0].message.content)

Output

**Luminaria City 2125: A Beacon of Sustainability**

Perched on a coastal cliff, Luminaria City is a marvel of futuristic
 architecture and innovative green energy solutions. This self-sustaining
 metropolis of the year 2125 is a testament to humanity's ability to engineer
 a better future.

**Key Features:**

1. **Energy Harvesting Grid**: A network of piezoelectric tiles covering the
 city's streets and buildings generates electricity from footsteps,
 vibrations, and wind currents. This decentralized energy system reduces
 reliance on fossil fuels and makes Luminaria City nearly carbon-neutral.

2. **Solar Skiescraper**: This 100-story skyscraper features a unique double-
glazed facade with energy-generating windows that amplify solar radiation,
 providing up to 300% more illumination and 50% more energy for the city's
 homes and businesses.

3. **Floating Farms**: Aerodynamically designed and vertically integrated
 cities of the future have floating aerial fields providing urban
 communities' with access to fresh locally sourced goods such as organics.

4. **Smart-Grid Management**: An advanced artificial intelligence system,
 dubbed SmartLum, oversees energy distribution, optimizes resource
 allocation, and adjusts energy production according to demand.

5. **Water Management**: Self-healing, concrete-piezoelectric stormwater
 harvesting systems ensure pure drinking water for residents, using the
 potential energy generated by vibrations in stormwater flow for generating
 electrical energy for Luminaria.

6. **Algae-Based Oxygenation**: A 10-kilometer-long algae-based bio-reactor
 embedded in the city's walls and roof helps purify the atmosphere, produce
 oxygen, and create valuable bio-energy molecules.

7. **Electric-Vehicle Infrastructure**: From sleek personal magnetometers to
 large-scale omnibus systems, sustainable urban transportation is entirely
 electric, effortlessly integrated with Luminaria City's omnipresent AI
 network.

8. **Sky Tree**: A slender, aerodynamically-engineered skyscraper extends
 high into the atmosphere, acting as a giant wind turbine and rainwater
 harvester.

9. **Botanical Forestal Architecture**: The innovative "Forest Walls"
 integrate living plants, water-collecting surfaces, and carbon capture
 infrastructure to sustain life in a unique symbiotic process.

10. **Advanced Public Waste Systems**: An ultra-efficient system assimilates,
 recycles and combusts the city's waste efficiently and sustainably due to
 advanced waste-pre-treatment facilities.

**Luminaria City: The Model for a Sustainable Future**

Luminaria City showcases humanity's ability to reimagine urban planning and
 technologies to preserve a thriving planet. By harnessing advanced
 technologies, harnessed new, and maximizing human symbiosis with nature,
 this stunning metropolis will inspire cities around the world to embark on
 their own sustainable journey to a brighter future.

8. OVH AI Endpoints – Free API

OVH provides access to various AI models for free, allowing 12 requests per minute. Some models available include:

CodeLlama 13B Instruct
Llama 3.1 70B Instruct

Documentation and all available models: https://endpoints.ai.cloud.ovh.net/

Advantages

Easy to use.
Variety of models.

Pricing: Free beta available.
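A limit of 12 requests per minute is easy to exceed in a loop, so it can help to throttle calls on the client side. The following is a minimal sketch of a sliding-window limiter and is not part of any OVH SDK; the class name `RateLimiter` and its injectable `clock` and `sleep` parameters are assumptions made for illustration and testing.

```python
import time
from collections import deque


class RateLimiter:
    """Client-side sliding-window limiter (hypothetical helper, not an OVH API)."""

    def __init__(self, max_calls=12, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of calls still inside the window

    def _drop_expired(self, now):
        # Forget calls that are at least one full period old.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()

    def wait(self):
        """Block until another call is allowed, record it, return seconds waited."""
        now = self.clock()
        self._drop_expired(now)
        waited = 0.0
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            waited = self.period - (now - self.calls[0])
            self.sleep(waited)
            now = self.clock()
            self._drop_expired(now)
        self.calls.append(now)
        return waited
```

Calling `limiter.wait()` immediately before each request keeps a script within the window without any server-side feedback.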

Example Code

import os
from openai import OpenAI

# The access token is read from the OVH_AI_ENDPOINTS_ACCESS_TOKEN environment variable.
client = OpenAI(
    base_url='https://llama-2-13b-chat-hf.endpoints.kepler.ai.cloud.ovh.net/api/openai_compat/v1',
    api_key=os.getenv("OVH_AI_ENDPOINTS_ACCESS_TOKEN")
)

def chat_completion(new_message: str) -> str:
    """Send a single user message and return the model's reply text."""
    history_openai_format = [{"role": "user", "content": new_message}]
    response = client.chat.completions.create(
        model="Llama-2-13b-chat-hf",
        messages=history_openai_format,
        temperature=0,
        max_tokens=1024  # replies longer than this are cut off
    )
    return response.choices[0].message.content

if __name__ == '__main__':
    print(chat_completion("Write a story in the style of James Joyce. The story should be about a trip to the Irish countryside in 2083, to see the beautiful scenery and robots."))

Output

Sure, I'd be happy to help! Here's a story in the style of James Joyce, set
 in the Irish countryside in 2083: As I stepped off the pod-train and onto
 the lush green grass of the countryside, the crisp air filled my lungs and
 invigorated my senses. The year was 2083, and yet the rolling hills and
 sparkling lakes of Ireland seemed unchanged by the passage of time. The only
 difference was the presence of robots, their sleek metallic bodies and
 glowing blue eyes a testament to the advancements of technology. I had come
 to this place seeking solace and inspiration, to lose myself in the beauty
 of nature and the wonder of machines. As I wandered through the hills, I
 came across a group of robots tending to a field of crops, their delicate
 movements and precise calculations ensuring a bountiful harvest. One of the
 robots, a sleek and agile model with wings like a dragonfly, fluttered over
 to me and offered a friendly greeting. "Good day, traveler," it said in a
 melodic voice. "What brings you to our humble abode?" I explained my desire
 to experience the beauty of the Irish countryside, and the robot nodded
 sympathetically. "Ah, yes," it said. "There is much to see and explore here.
 Would you like a guided tour?" I eagerly accepted the offer, and the robot
 led me on a journey through the rolling hills and sparkling lakes. We saw
 towering waterfalls and ancient ruins, and the robot shared stories of the
 history and culture of the land. As we walked, the sun began to set, casting
 a golden glow over the landscape. As the stars began to twinkle in the night
 sky, the robot and I sat down on a hill overlooking the countryside. "This
 is a special place," the robot said, its voice filled with a sense of
 wonder. "A place where nature and technology coexist in harmony." I nodded
 in agreement, feeling a sense of awe and gratitude for this wondrous place.
 And as I looked out at the stars, I knew that this trip to the

9. Together Free API

Together is a collaborative platform for accessing various LLMs, with no specific rate limits mentioned. Some models available include:

Llama 3.2 11B Vision Instruct
DeepSeek R1 Distill Llama 70B

All available models: Link
Documentation: Link

Advantages

Access to a range of models.
Collaborative environment.

Pricing: Free tier available.

Example Code

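from together import Together

# Together() picks up the API key from the TOGETHER_API_KEY environment variable.
client = Together()

# stream=True returns the reply incrementally instead of as one response.
stream = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[{"role": "user", "content": "What are the top 3 things to do in New York?"}],
    stream=True,
)

# Print each piece of text as soon as it arrives.
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)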

Output

The city that never sleeps - New York! There are countless things to see and
 do in the Big Apple, but here are the top 3 things to do in New York:

1. **Visit the Statue of Liberty and Ellis Island**: Take a ferry to Liberty
 Island to see the iconic Statue of Liberty up close. You can also visit the
 Ellis Island Immigration Museum to learn about the history of immigration in
 the United States. This is a must-do experience that offers breathtaking
 views of the Manhattan skyline.

2. **Explore the Metropolitan Museum of Art**: The Met, as it's
 affectionately known, is one of the world's largest and most famous museums.
 With a collection that spans over 5,000 years of human history, you'll find
 everything from ancient Egyptian artifacts to modern and contemporary art.
 The museum's grand architecture and beautiful gardens are also worth
 exploring.

3. **Walk across the Brooklyn Bridge**: This iconic bridge offers stunning
 views of the Manhattan skyline, the East River, and Brooklyn. Take a
 leisurely walk across the bridge and stop at the Brooklyn Bridge Park for
 some great food and drink options. You can also visit the Brooklyn Bridge's
 pedestrian walkway, which offers spectacular views of the city.

Of course, there are many more things to see and do in New York, but these
 three experiences are a great starting point for any visitor.

Additional suggestions:

- Visit the Top of the Rock Observation Deck for panoramic views of the city.

- Take a stroll through Central Park, which offers a peaceful escape from the
 hustle and bustle of the city.

- Catch a Broadway show or a performance at one of the many music venues in
 the city.

- Explore the vibrant neighborhoods of Chinatown, Little Italy, and Greenwich
 Village.

- Visit the 9/11 Memorial & Museum to pay respects to the victims of the 9/11 attacks.

Remember to plan your itinerary according to your interests and the time of
 year you visit, as some attractions may have limited hours or be closed due
 to weather or other factors.
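The streaming loop in the example code prints each chunk as it arrives. If the full reply is also needed as a single string, for logging or post-processing, the deltas can be accumulated instead. This is a minimal sketch: `collect_stream` and `fake_chunk` are hypothetical helpers, and the only assumption about the stream is that each chunk exposes `choices[0].delta.content`, as in the example code. The fake chunks stand in for a real stream so the snippet runs without an API key.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # defensive guard in case a chunk has no choices
            continue
        # delta.content may be None, so fall back to an empty string.
        parts.append(chunk.choices[0].delta.content or "")
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streamed chunk (illustration only)."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


demo = [fake_chunk("The city "), fake_chunk("that never sleeps"), fake_chunk(None)]
full_text = collect_stream(demo)
print(full_text)
```

With a real client, the `stream` object returned by `client.chat.completions.create(..., stream=True)` would be passed in place of `demo`.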

10. Cohere – Free API

Cohere provides access to powerful language models for various applications, allowing 20 requests per minute and 1,000 requests per month. Some models available include:

Command-R
Command-R+

All available models: Link
Documentation: Link

Advantages

Easy to use.
Focus on NLP tasks.

Pricing: Free tier available.

Example Code

import cohere
co = cohere.ClientV2("<<apiKey>>")
response = co.chat(
   model="command-r-plus",
   messages=[{"role": "user", "content": "hello world!"}]
)
print(response)

Output

id='703bd967-fbb0-4758-bd60-7fe01b1984c7' finish_reason='COMPLETE'
 prompt=None message=AssistantMessageResponse(role='assistant',
 tool_calls=None, tool_plan=None, content=
[TextAssistantMessageResponseContentItem(type='text', text='Hello! How can I
 help you today?')], citations=None)
 usage=Usage(billed_units=UsageBilledUnits(input_tokens=3.0,
 output_tokens=9.0, search_units=None, classifications=None),
 tokens=UsageTokens(input_tokens=196.0, output_tokens=9.0)) logprobs=None
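Printing the whole response object is useful for inspection, but most applications only need the reply text and the token counts. Judging by the fields visible in the printed output, these can be read as attributes. The sketch below assumes that attribute layout; `summarize` is a hypothetical helper, and the `SimpleNamespace` object is a hand-built stand-in that mirrors the printed output rather than a real API response.

```python
from types import SimpleNamespace


def summarize(response):
    """Extract the reply text and billed token counts from a chat response."""
    # Concatenate every text item in the assistant message.
    text = "".join(
        item.text for item in response.message.content if item.type == "text"
    )
    billed = response.usage.billed_units
    return {
        "text": text,
        "finish_reason": response.finish_reason,
        "billed_input_tokens": billed.input_tokens,
        "billed_output_tokens": billed.output_tokens,
    }


# Stand-in shaped like the printed output (values copied from it).
sample = SimpleNamespace(
    finish_reason="COMPLETE",
    message=SimpleNamespace(
        content=[SimpleNamespace(type="text", text="Hello! How can I help you today?")]
    ),
    usage=SimpleNamespace(
        billed_units=SimpleNamespace(input_tokens=3.0, output_tokens=9.0)
    ),
)
print(summarize(sample))
```

Keeping only these fields also makes it easier to log usage against the monthly request allowance.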

11. GitHub Models – Free API

GitHub offers a collection of various AI models, with rate limits dependent on the subscription tier.

Some models available include:

AI21 Jamba 1.5 Large
Cohere Command R

Documentation and All available models: Link

Advantages

Access to a wide range of models.
Integration with GitHub.

Pricing: Free with a GitHub account.

Example Code

import os
from openai import OpenAI
token = os.environ["GITHUB_TOKEN"]
endpoint = "https://models.inference.ai.azure.com"
model_name = "gpt-4o"
client = OpenAI(
   base_url=endpoint,
   api_key=token,
)
response = client.chat.completions.create(
   messages=[
       {
           "role": "system",
           "content": "You are a helpful assistant.",
       },
       {
           "role": "user",
           "content": "What is the capital of France?",
       }
   ],
   temperature=1.0,
   top_p=1.0,
   max_tokens=1000,
   model=model_name
)
print(response.choices[0].message.content)

Output

The capital of France is **Paris**.
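Because rate limits vary with the subscription tier, a request that succeeds on one account may be rejected on another. A common way to cope is to retry with exponential backoff. The following is a generic sketch and not a GitHub Models feature: `with_backoff`, `RateLimited`, and the delay values are all assumptions for illustration. With the `openai` client used in the example code, the exception to catch would typically be `openai.RateLimitError`.

```python
import time


class RateLimited(Exception):
    """Placeholder for a provider's rate-limit error (hypothetical)."""


def with_backoff(call, retries=4, base_delay=1.0, retry_on=(RateLimited,), sleep=time.sleep):
    """Run call(); on a retryable error wait base_delay * 2**attempt, then retry."""
    for attempt in range(retries):
        try:
            return call()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
```

A request can then be wrapped as `with_backoff(lambda: client.chat.completions.create(...), retry_on=(openai.RateLimitError,))`.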

12. Fireworks AI – Free API

Fireworks offers a range of powerful AI models, with serverless inference of up to 6,000 requests per minute (RPM) and 2.5 billion tokens per day.

Some models available include:

llama-v3p1-405b-instruct
deepseek-r1

All available models: Link
Documentation: Link

Advantages

Cost-effective customization.
Fast inference.

Pricing: Free credits worth $1 are available.
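To put the quoted limits in perspective, a quick calculation shows how the two caps interact. This sketch assumes that both figures (6,000 RPM and 2.5 billion tokens per day) apply to the same account at the same time, which the figures alone do not confirm.

```python
RPM_LIMIT = 6_000                # requests per minute, as quoted for serverless inference
TOKENS_PER_DAY = 2_500_000_000   # tokens per day, as quoted for serverless inference

MINUTES_PER_DAY = 24 * 60
max_requests_per_day = RPM_LIMIT * MINUTES_PER_DAY
avg_tokens_per_request = TOKENS_PER_DAY / max_requests_per_day

print(f"{max_requests_per_day:,} requests/day at the full request rate")
print(f"about {avg_tokens_per_request:.0f} tokens per request before the daily cap is reached")
```

At the full request rate this works out to 8,640,000 requests per day, so the token cap would be reached first unless requests average under roughly 289 tokens each. For sustained workloads the daily token allowance is therefore likely to be the tighter constraint.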

Example Code

from fireworks.client import Fireworks

# Replace the placeholder with your own Fireworks API key.
client = Fireworks(api_key="<FIREWORKS_API_KEY>")

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",
    messages=[{
        "role": "user",
        "content": "Say this is a test",
    }],
)
print(response.choices[0].message.content)

Output

I'm ready for the test! Please go ahead and provide the questions or prompt
 and I'll do my best to respond.

Benefits of Using LLM-Free APIs

Accessibility: No need for deep AI expertise or infrastructure investment.
Customization: Fine-tune models for specific tasks or domains.
Scalability: Handle large volumes of requests as your business grows.

Tips for Efficient Use of LLM-Free APIs

Choose the Right Model: Start with simpler models for basic tasks and scale up as needed.
Monitor Usage: Use dashboards to track token consumption and set spending limits.
Optimize Tokens: Craft concise prompts to minimize token usage while still achieving desired outcomes.
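Where a provider dashboard is not enough, usage can also be tracked inside the application. The sketch below is one possible approach: `UsageTracker` and its budget figure are hypothetical, and the token counts are assumed to come from the `usage` field that many chat APIs return with each response.

```python
class UsageTracker:
    """Accumulate token counts and flag when a self-imposed budget is exceeded."""

    def __init__(self, token_budget):
        self.token_budget = token_budget
        self.input_tokens = 0
        self.output_tokens = 0

    def record(self, input_tokens, output_tokens):
        """Add one request's token counts to the running totals."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    @property
    def total(self):
        return self.input_tokens + self.output_tokens

    @property
    def remaining(self):
        # Never report a negative remainder.
        return max(self.token_budget - self.total, 0)

    def over_budget(self):
        return self.total > self.token_budget


tracker = UsageTracker(token_budget=1_000)
tracker.record(input_tokens=196, output_tokens=9)
print(tracker.total, tracker.remaining, tracker.over_budget())
```

Checking `tracker.over_budget()` before each call gives the application a chance to stop or switch to a smaller model before a provider limit is hit.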

Conclusion

With the availability of these free APIs, developers and businesses can easily integrate advanced AI capabilities into their applications without significant upfront costs. By leveraging these resources, you can enhance user experiences, automate tasks, and drive innovation in your projects. Start exploring these APIs today and unlock the potential of AI in your applications.

Harsh Mishra

Harsh Mishra is an AI/ML Engineer who spends more time talking to Large Language Models than actual humans. Passionate about GenAI, NLP, and making machines smarter (so they don’t replace him just yet). When not optimizing models, he’s probably optimizing his coffee intake. 🚀☕
