Imagine turning any text into a natural, engaging voice at the touch of a button. ElevenLabs makes this possible with its voice synthesis and AI-driven audio tools. This article walks through ElevenLabs’ main features, gives a step-by-step demo of its API, and points to real-world applications along the way. Let’s see how you can use ElevenLabs to improve your audio content.
The ElevenLabs API is a set of programmatic interfaces that lets developers integrate advanced voice synthesis and audio processing capabilities into their applications. It is exposed as RESTful web services, so it integrates easily with most applications, and every request requires an API key for authentication.
Here’s an overview of the key features:
ElevenLabs offers state-of-the-art voice synthesis technology, enabling the creation of lifelike speech from text. The platform supports multiple languages and accents, ensuring a broad reach for global applications.
The text-to-speech (TTS) feature transforms written text into natural-sounding audio. Its high-quality voice output makes it well suited to audiobooks, podcasts, and accessibility tools.
Voice cloning allows users to replicate a specific voice. This feature is particularly useful for media production, gaming, and personalized user experiences.
Real-time voice conversion turns one voice into another as it is spoken, which can be applied in live streaming, virtual assistants, and customer support solutions.
ElevenLabs also lets you create custom voice models tailored to specific needs. This is useful for branding, content creation, and interactive applications.
Make sure Python is installed on your computer. You can download and install Python from the official Python website. The examples below also use the requests library, which you can install with pip install requests.
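Since every request needs an API key, it can help to keep the key out of the source code. Here is a minimal sketch that reads it from an environment variable and builds the request headers. The function name build_headers and the variable name ELEVENLABS_API_KEY are assumptions made for this illustration, not names defined by ElevenLabs.

```python
import os


def build_headers(accept="audio/mpeg", env_var="ELEVENLABS_API_KEY"):
    """Return request headers with the API key read from the environment.

    The environment variable name is an assumption for this sketch;
    use whatever name fits your setup.
    """
    api_key = os.environ.get(env_var, "")
    if not api_key:
        # Fail early instead of sending a request with an empty key
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Accept": accept,
        "Content-Type": "application/json",
        "xi-api-key": api_key,
    }
```

The returned dictionary can be passed as the headers argument of requests.post in place of a dictionary with a hard-coded key.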
import requests

CHUNK_SIZE = 1024  # Number of bytes to write at a time

# The voice ID is the last path segment of the URL
url = "https://api.elevenlabs.io/v1/text-to-speech/EXAVITQu4vr4xnSDxMaL"

headers = {
    "Accept": "audio/mpeg",
    "Content-Type": "application/json",
    "xi-api-key": ""  # Add your ElevenLabs API key here
}

data = {
    "text": '''Born and raised in the charming south,
I can add a touch of sweet southern hospitality
to your audiobooks and podcasts''',
    "model_id": "eleven_monolingual_v1",
    "voice_settings": {
        "stability": 0.5,
        "similarity_boost": 0.5
    }
}

response = requests.post(url, json=data, headers=headers)

if response.status_code == 200:
    # Write the returned audio to disk in chunks
    with open('output.mp3', 'wb') as f:
        for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
            if chunk:
                f.write(chunk)
    print("Audio saved as output.mp3")
else:
    print(f"Error: {response.status_code}")
    print(response.text)
You can use a different voice by changing the voice_id, which is passed as the last path segment of the URL; you can find the available voices and their IDs on the ElevenLabs site.
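To make switching voices less error-prone, the URL can be built from a voice ID instead of being typed by hand. The sketch below also shows one way to turn a voice listing into a name-to-ID lookup. It assumes the listing comes from a GET request to the /v1/voices endpoint and has the shape {"voices": [{"voice_id": ..., "name": ...}]}; the helper names and the placeholder voice names are made up for this example.

```python
BASE_URL = "https://api.elevenlabs.io/v1"


def tts_url(voice_id):
    """Build the text-to-speech URL for a given voice ID."""
    return f"{BASE_URL}/text-to-speech/{voice_id}"


def voice_ids_by_name(voices_response):
    """Map voice names to voice IDs.

    Assumes a parsed JSON body shaped like
    {"voices": [{"voice_id": "...", "name": "..."}]}.
    """
    return {
        voice["name"]: voice["voice_id"]
        for voice in voices_response.get("voices", [])
    }


# Placeholder data for illustration only; the names are hypothetical
sample = {
    "voices": [
        {"voice_id": "EXAVITQu4vr4xnSDxMaL", "name": "Voice A"},
        {"voice_id": "N2lVS1w4EtoT3dr4eOWO", "name": "Voice B"},
    ]
}
lookup = voice_ids_by_name(sample)
url = tts_url(lookup["Voice B"])
```

In a real script, the sample dictionary would be replaced by the parsed JSON of the voices request, sent with the same xi-api-key header as the other calls.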
import requests

CHUNK_SIZE = 1024  # Number of bytes to write at a time

url = "https://api.elevenlabs.io/v1/sound-generation"

payload = {
    "text": "Car Crash",
    "duration_seconds": 5,  # Length of the clip in seconds; see the docs for the allowed range
    "prompt_influence": 0.3  # Between 0 and 1; higher values follow the prompt more closely
}

headers = {
    "Accept": "audio/mpeg",
    "Content-Type": "application/json",
    "xi-api-key": ""  # Add your ElevenLabs API key here
}

# Send the payload defined above
response = requests.post(url, json=payload, headers=headers)

if response.status_code == 200:
    with open('output_sound.mp3', 'wb') as f:
        for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
            if chunk:
                f.write(chunk)
    print("Audio saved as output_sound.mp3")
else:
    print(f"Error: {response.status_code}")
    print(response.text)
You can replace the text in the payload to generate different kinds of sound effects with the ElevenLabs API.
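If you want several effects at once, one option is to prepare a payload and an output filename for each prompt before sending anything. The following is a rough sketch of that idea; slugify and sound_effect_jobs are hypothetical helper names, and the optional fields are simply copied through when given.

```python
import re


def slugify(prompt):
    """Turn a prompt such as 'Car Crash' into a safe file stem."""
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return slug or "sound"


def sound_effect_jobs(prompts, duration_seconds=None, prompt_influence=None):
    """Return a list of (payload, filename) pairs, one per prompt."""
    jobs = []
    for prompt in prompts:
        payload = {"text": prompt}
        # Only include the optional fields when the caller sets them
        if duration_seconds is not None:
            payload["duration_seconds"] = duration_seconds
        if prompt_influence is not None:
            payload["prompt_influence"] = prompt_influence
        jobs.append((payload, f"{slugify(prompt)}.mp3"))
    return jobs


jobs = sound_effect_jobs(["Car Crash", "Rain on a tin roof!"])
```

Each payload can then be posted to the sound-generation URL in a loop, and the response written to the matching filename.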
import requests
import json

CHUNK_SIZE = 1024  # Size of chunks to read/write at a time
XI_API_KEY = ""  # Add your ElevenLabs API key here
VOICE_ID = "N2lVS1w4EtoT3dr4eOWO"  # ID of the voice model to use
AUDIO_FILE_PATH = "output.mp3"  # Input audio: the file saved by the text-to-speech example
OUTPUT_PATH = "output_new.mp3"  # Path to save the output audio file

# Construct the URL for the Speech-to-Speech API request
sts_url = f"https://api.elevenlabs.io/v1/speech-to-speech/{VOICE_ID}/stream"

# Set up headers for the API request, including the API key for authentication
headers = {
    "Accept": "application/json",
    "xi-api-key": XI_API_KEY
}

# Set up the data payload for the API request, including model ID and voice settings
# Note: voice settings are converted to a JSON string
data = {
    "model_id": "eleven_english_sts_v2",
    "voice_settings": json.dumps({
        "stability": 0.5,
        "similarity_boost": 0.8,
        "style": 0.0,
        "use_speaker_boost": True
    })
}

# Open the input audio file and send it with the request;
# the with block closes the file once the upload is done
with open(AUDIO_FILE_PATH, "rb") as audio_file:
    files = {"audio": audio_file}
    # Make the POST request to the STS API with headers, data, and files, enabling streaming response
    response = requests.post(sts_url, headers=headers, data=data, files=files, stream=True)

# Check if the request was successful
if response.ok:
    # Open the output file in write-binary mode
    with open(OUTPUT_PATH, "wb") as f:
        # Read the response in chunks and write to the file
        for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
            f.write(chunk)
    # Inform the user of success
    print("Audio stream saved successfully.")
else:
    # Print the error message if the request was not successful
    print(response.text)
I took the output of the text-to-speech model and used it as the input to the speech-to-speech model. If you listen to the new output audio file, you will notice that the voice has changed.
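All three scripts above repeat the same loop that writes the response to disk in chunks. As a possible refactoring, that loop could live in one small helper. The name save_stream is an assumption for this sketch; it accepts any iterable of byte chunks, so it can be tried without calling the API.

```python
def save_stream(chunks, path):
    """Write an iterable of byte chunks to path and return the bytes written."""
    written = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            # Skip empty keep-alive chunks
            if chunk:
                f.write(chunk)
                written += len(chunk)
    return written
```

With a requests response, the call would look like save_stream(response.iter_content(chunk_size=1024), "output.mp3"). A return value of 0 is a quick hint that the response contained no audio data.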
ElevenLabs offers an AI voice technology suite that covers converting text to speech, cloning voices, modifying voices in real time, and creating custom voice models. Following the steps in this guide will help you explore these features and apply them to a range of creative and practical projects.
Ans. ElevenLabs states that it protects the safety and privacy of voice data through strong encryption and adherence to data protection laws.
Ans. ElevenLabs supports a variety of languages and dialects, accommodating a global user base. You can find the full list of supported languages in the official documentation.
Ans. Yes, ElevenLabs provides a free option with certain usage limits. For details on pricing and usage caps, check the ElevenLabs pricing page.
Ans. Yes. ElevenLabs offers a RESTful API that can be called from most programming languages and platforms.