Text detection from images is an essential technology in many applications, including document processing, image search, and machine translation. With the advancement of optical character recognition (OCR) technology, text detection has become more accurate and efficient, enabling businesses and organizations to extract useful information from images quickly. This article introduces EasyOCR, a powerful and user-friendly OCR library that can detect and extract text from various image formats. We will explore the features of EasyOCR, its advantages over other OCR libraries, and how you can implement it in real-world applications.
This article was published as a part of the Data Science Blogathon
OCR stands for Optical Character Recognition, a technology that has become essential in today's digital world. It is the complete process by which text present in digital images or documents is detected and converted into normal, editable text.
OCR is a technology that enables you to convert different types of documents, such as scanned paper documents, PDF files, or images captured by a digital camera into editable and searchable data.
EasyOCR is a Python package that uses PyTorch as its backend. In my experience, it is the most straightforward way to detect text from images, and having a high-end deep learning library (PyTorch) supporting it in the backend makes its accuracy all the more credible. EasyOCR supports 42+ languages for detection purposes and is created by a company named Jaided AI.
Installing PyTorch as a complete package can be a little tricky, so I would recommend going through the official PyTorch site. When you open it, you will see an interface like the one in the image below.
Now, if you look closely at the above image, you can see that there are numerous options available, letting you generate the command most compatible with your setup.
Let me show you a representation of what I mean!
In the above representation, notice that I have chosen Package: pip and Compute Platform: CPU, and based on those choices I got the command pip install torch torchvision torchaudio. After that, it's a piece of cake: simply run this command in your command prompt and the PyTorch library will be installed successfully.
After installing the PyTorch library successfully, it's quite easy to install the EasyOCR library; one just has to run the following command:
pip3 install easyocr
Then your command prompt interface will be like:
import os
import easyocr
import cv2
from matplotlib import pyplot as plt
import numpy as np
IMAGE_PATH = 'https://blog.aspose.com/wp-content/uploads/sites/2/2020/05/Perform-OCR-using-C.jpg'
In the above code snippet, one can notice that the IMAGE_PATH holds the URL of the image.
IMAGE_PATH = 'Perform-OCR.jpg'
In the above code snippet, one can notice that I have taken the image locally i.e. from the local system.
reader = easyocr.Reader(['en'])
# paragraph expects a boolean; note that the string "False" is truthy, so passing it would still enable paragraph grouping
result = reader.readtext(IMAGE_PATH, paragraph=True)
result
Output:
[[[[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR']]
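The result is a list of detections; with paragraph grouping enabled, each entry is a pair of bounding box and text. Here is a minimal sketch of unpacking that structure, using the output shown above hardcoded for illustration:

```python
# Sample paragraph-mode output from readtext(): each detection is
# [bounding_box, text], where the box lists the four corners in order
# [top-left, top-right, bottom-right, bottom-left].
result = [[[[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR']]

for box, text in result:
    top_left, top_right, bottom_right, bottom_left = box
    width = top_right[0] - top_left[0]    # horizontal extent of the box
    height = bottom_left[1] - top_left[1] # vertical extent of the box
    print(f"'{text}' at {top_left}, {width}x{height} px")
# → 'OCR' at [95, 71], 58x36 px
```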
For your reference, here is the image on which the detection was performed:
Now, finally, we have extracted the text from the given image.
Let’s break down the code line by line:
# Changing the image path
IMAGE_PATH = 'Turkish_text.png'
# Same code here, just changing the language list from ['en'] to ['tr']
reader = easyocr.Reader(['tr'])
result = reader.readtext(IMAGE_PATH, paragraph=True)
result
Output:
[[[[89, 7], [717, 7], [717, 108], [89, 108]], 'Most Common Texting Slang in Turkish'], [[[392, 234], [446, 234], [446, 260], [392, 260]], 'test'], [[[353, 263], [488, 263], [488, 308], [353, 308]], 'yazmak'], [[[394, 380], [446, 380], [446, 410], [394, 410]], 'link'], [[[351, 409], [489, 409], [489, 453], [351, 453]], 'bağlantı'], [[[373, 525], [469, 525], [469, 595], [373, 595]], 'tag etiket'], [[[353, 674], [483, 674], [483, 748], [353, 748]], 'follov takip et']]
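Often you only care about the detected strings, not the boxes. A quick sketch of pulling just the text out of paragraph-mode output, using a shortened, hardcoded copy of the result above:

```python
# A shortened copy of the paragraph-mode output shown above
result = [
    [[[89, 7], [717, 7], [717, 108], [89, 108]], 'Most Common Texting Slang in Turkish'],
    [[[392, 234], [446, 234], [446, 260], [392, 260]], 'test'],
    [[[353, 263], [488, 263], [488, 308], [353, 308]], 'yazmak'],
]

# Keep only the detected strings, dropping the bounding boxes
texts = [text for _box, text in result]
print(texts)
# → ['Most Common Texting Slang in Turkish', 'test', 'yazmak']
```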
For your reference, here is the image on which I performed this Turkish text detection:
Fact
EasyOCR currently supports 42 languages. I have provided the set of all those languages with their notations below. Have fun with it, guys!
Afrikaans (af), Azerbaijani (az), Bosnian (bs), Czech (cs), Welsh (cy), Danish (da), German (de), English (en), Spanish (es), Estonian (et), French (fr), Irish (ga), Croatian (hr), Hungarian (hu), Indonesian (id), Icelandic (is), Italian (it), Japanese (ja), Korean (ko), Kurdish (ku), Latin (la), Lithuanian (lt), Latvian (lv), Maori (mi), Malay (ms), Maltese (mt), Dutch (nl), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Slovak (sk), Slovenian (sl), Albanian (sq), Swedish (sv), Swahili (sw), Thai (th), Tagalog (tl), Turkish (tr), Uzbek (uz), Vietnamese (vi), Chinese (zh) – Source: JaidedAI
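If you want to look up those notations programmatically, a small dictionary works; this is just a sketch covering a handful of the 42 codes listed above, not the full set:

```python
# A few of the language notations listed above (not the full set)
LANG_CODES = {
    'af': 'Afrikaans', 'de': 'German', 'en': 'English',
    'tr': 'Turkish', 'zh': 'Chinese',
}

def describe(codes):
    """Turn a list of EasyOCR language codes into readable names."""
    return [LANG_CODES.get(c, f'unknown ({c})') for c in codes]

print(describe(['en', 'tr']))
# → ['English', 'Turkish']
```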
EasyOCR also provides enough flexibility to choose text detection with or without a GPU.
# Changing the image path
IMAGE_PATH = 'Turkish_text.png'
reader = easyocr.Reader(['en'])
result = reader.readtext(IMAGE_PATH)
result
Output:
[([[89, 7], [717, 7], [717, 75], [89, 75]], 'Most Common Texting Slang', 0.8411301022318493), ([[296, 60], [504, 60], [504, 108], [296, 108]], 'in Turkish', 0.9992136162168752), ([[392, 234], [446, 234], [446, 260], [392, 260]], 'text', 0.955612246445849), ([[353, 263], [488, 263], [488, 308], [353, 308]], 'yazmak', 0.8339281200424168), ([[394, 380], [446, 380], [446, 410], [394, 410]], 'link', 0.8571656346321106), ([[351, 409], [489, 409], [489, 453], [351, 453]], 'baglanti', 0.9827189297769966), ([[393, 525], [446, 525], [446, 562], [393, 562]], 'tag', 0.999996145772132), ([[373, 559], [469, 559], [469, 595], [373, 595]], 'etiket', 0.9999972515293261), ([[378, 674], [460, 674], [460, 704], [378, 704]], 'follow', 0.9879666041306504), ([[353, 703], [483, 703], [483, 748], [353, 748]], 'takip et', 0.9987622244733467)]
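Without paragraph grouping, each detection carries a confidence score as its third element, and a common post-processing step is dropping low-confidence detections. Here is a sketch using a hardcoded subset of the output above, with an assumed threshold of 0.9:

```python
# A subset of the (box, text, confidence) detections shown above
result = [
    ([[296, 60], [504, 60], [504, 108], [296, 108]], 'in Turkish', 0.9992136162168752),
    ([[353, 263], [488, 263], [488, 308], [353, 308]], 'yazmak', 0.8339281200424168),
    ([[393, 525], [446, 525], [446, 562], [393, 562]], 'tag', 0.999996145772132),
]

# Keep only detections the model is reasonably sure about
threshold = 0.9
confident = [text for _box, text, conf in result if conf >= threshold]
print(confident)
# → ['in Turkish', 'tag']
```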
# Changing the image path
IMAGE_PATH = 'Perform-OCR.jpg'
reader = easyocr.Reader(['en'], gpu=False)
result = reader.readtext(IMAGE_PATH)
result
Output:
[([[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR', 0.990493426051807)] # Where 0.9904.. is the confidence level of detection
Note: If you don’t have a GPU and you don’t set gpu=False, you will get the following warning:
top_left = tuple(result[0][0][0])
bottom_right = tuple(result[0][0][2])
text = result[0][1]
font = cv2.FONT_HERSHEY_SIMPLEX
In the above code snippet, result[0][0][0] is the top-left corner of the first detection's bounding box, result[0][0][2] is the diagonally opposite bottom-right corner, and result[0][1] is the detected text; the last line selects the font we will use to draw the text.
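To make that indexing concrete, here is what those expressions evaluate to for the single-detection 'OCR' output shown earlier, with the result hardcoded for illustration:

```python
# The single-detection result from the 'OCR' example, hardcoded for illustration
result = [([[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR', 0.990493426051807)]

top_left = tuple(result[0][0][0])      # first corner of the first box
bottom_right = tuple(result[0][0][2])  # third corner (diagonally opposite)
text = result[0][1]                    # the recognised string

print(top_left, bottom_right, text)
# → (95, 71) (153, 107) OCR
```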
img = cv2.imread(IMAGE_PATH)
img = cv2.rectangle(img,top_left,bottom_right,(0,255,0),3)
img = cv2.putText(img,text,bottom_right, font, 0.5,(0,255,0),2,cv2.LINE_AA)
plt.figure(figsize=(10,10))
plt.imshow(img)
plt.show()
Now that we have the coordinates, let's plot them!
Output:
IMAGE_PATH = 'sign.png'
reader = easyocr.Reader(['en'], gpu=False)
result = reader.readtext(IMAGE_PATH)
result
Output:
[([[19, 181], [165, 181], [165, 201], [19, 201]], 'HEAD PROTECTION', 0.9778256296390029), ([[31, 201], [153, 201], [153, 219], [31, 219]], 'MUST BE WORN', 0.9719649866726915), ([[39, 219], [145, 219], [145, 237], [39, 237]], 'ON THIS SITE', 0.9683973478739152)]
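EasyOCR returns detections roughly top-to-bottom, so you can reassemble the sign's full message by joining the detected strings. A sketch on the hardcoded output above:

```python
# The three detections from the sign, hardcoded for illustration
result = [
    ([[19, 181], [165, 181], [165, 201], [19, 201]], 'HEAD PROTECTION', 0.9778256296390029),
    ([[31, 201], [153, 201], [153, 219], [31, 219]], 'MUST BE WORN', 0.9719649866726915),
    ([[39, 219], [145, 219], [145, 237], [39, 237]], 'ON THIS SITE', 0.9683973478739152),
]

# Joining the strings in order reassembles the message on the sign
message = ' '.join(text for _box, text, _conf in result)
print(message)
# → HEAD PROTECTION MUST BE WORN ON THIS SITE
```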
Getting the coordinates
top_left = tuple(result[0][0][0])
bottom_right = tuple(result[0][0][2])
text = result[0][1]
font = cv2.FONT_HERSHEY_SIMPLEX
Drawing Text and Bounding Boxes
img = cv2.imread(IMAGE_PATH)
img = cv2.rectangle(img,top_left,bottom_right,(0,255,0),3)
img = cv2.putText(img,text,top_left, font, 0.5,(0,0,255),2,cv2.LINE_AA)
plt.figure(figsize=(10,10))
plt.imshow(img)
plt.show()
Output:
But hold on! What if we want to see all the text detections on the image itself?
That’s what I’ll do in this section!
img = cv2.imread(IMAGE_PATH)
spacer = 100
for detection in result:
    top_left = tuple(detection[0][0])
    bottom_right = tuple(detection[0][2])
    text = detection[1]
    img = cv2.rectangle(img, top_left, bottom_right, (0, 255, 0), 3)
    img = cv2.putText(img, text, (20, spacer), font, 0.5, (0, 255, 0), 2, cv2.LINE_AA)
    spacer += 15
plt.figure(figsize=(10,10))
plt.imshow(img)
plt.show()
In the above code snippet, we just need to focus on a few points: the loop draws a green bounding box for every detection, and the spacer variable stacks the detected strings down the left side of the image, moving each new label 15 pixels lower than the previous one.
Output:
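An alternative to the spacer approach is to place each label near its own box instead of stacking them in a column. Here is a pure-Python sketch that only computes those label anchor points (no drawing); the helper name label_positions and the 5-pixel offset are my own assumptions, not part of EasyOCR:

```python
def label_positions(detections, min_y=15):
    """Compute a label anchor just above each box's top-left corner,
    clamped so the label never leaves the top of the image."""
    positions = []
    for box, text, _conf in detections:
        x, y = box[0]  # top-left corner of this detection's box
        positions.append((text, (x, max(y - 5, min_y))))
    return positions

# Two hardcoded detections for illustration; the second sits near the top edge
detections = [
    ([[19, 181], [165, 181], [165, 201], [19, 201]], 'HEAD PROTECTION', 0.98),
    ([[31, 8], [153, 8], [153, 26], [31, 26]], 'MUST BE WORN', 0.97),
]
print(label_positions(detections))
# → [('HEAD PROTECTION', (19, 176)), ('MUST BE WORN', (31, 15))]
```

These (x, y) pairs could then be passed to cv2.putText in place of (20, spacer).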
That last output also concludes my discussion for today 🙂
In conclusion, EasyOCR is an excellent tool for text detection from images, providing a simple and effective way to extract text from images with high accuracy. The library’s easy-to-use interface and powerful algorithms make it an ideal solution for businesses and organizations needing to process large volumes of documents and images quickly.
If you want to learn more about OCR technology and other data science tools and techniques, we invite you to explore our Blackbelt program. Our comprehensive training program provides in-depth instruction on various data science topics, including OCR, machine learning, and artificial intelligence. With our Blackbelt program, you can gain the skills and knowledge you need to advance your career and stay at the forefront of this exciting field. Sign up today to take the first step toward becoming a data scientist!
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.