Text Detection From Images Using EasyOCR: Hands-on guide

Aman Preet Last Updated : 11 Nov, 2024
7 min read

Text detection from images is an essential technology in many applications, including document processing, image search, and machine translation. With the advancement of optical character recognition (OCR) technology, text detection has become more accurate and efficient, enabling businesses and organizations to extract useful information from images quickly. This article introduces EasyOCR, a powerful and user-friendly OCR library that can detect and extract text from various image formats. We will explore the features of EasyOCR, its advantages over other OCR libraries, and how you can implement it in real-world applications.

This article was published as a part of the Data Science Blogathon

What is OCR?

OCR stands for Optical Character Recognition, a technology that has become fundamental to the digital world. It is the complete process of taking images or documents that exist in digital form and converting the text they contain into normal, editable text.

Purpose of OCR

OCR is a technology that enables you to convert different types of documents, such as scanned paper documents, PDF files, or images captured by a digital camera into editable and searchable data.

What is EasyOCR?

EasyOCR is a Python package that uses PyTorch as its backend. It detects text in images, and in my experience it is the most straightforward way to do so; having a high-end deep learning library (PyTorch) behind it also makes its accuracy more credible. EasyOCR supports 42+ languages for detection and is developed by a company called Jaided AI.

Source: GitHub

1. Install Core Dependencies

Pytorch

Installing PyTorch as a complete package can be a little tricky, so I recommend going through the official PyTorch site. When you open it, you will see an interface like the one in the image below.

Image Source: PyTorch

If you look closely at the image above, you can see that there are numerous options to choose from, and the site generates the most compatible install command based on your choices.

Let me show you an example of what I mean.

Image Source: PyTorch

In the representation above, I chose Package: pip and Compute Platform: CPU, and based on those choices I got the command pip install torch torchvision torchaudio. From there it's a piece of cake: simply run this command in your command prompt and PyTorch will be installed.

EasyOCR

After installing PyTorch, installing the EasyOCR library is straightforward; just run the following command:

pip3 install easyocr

Your command prompt will then show the installation progress.

2. Importing Libraries

import os
import easyocr
import cv2
from matplotlib import pyplot as plt
import numpy as np

3. Reading Images

  • Taking an online image: Here we will take an image from a URL (online)
IMAGE_PATH = 'https://blog.aspose.com/wp-content/uploads/sites/2/2020/05/Perform-OCR-using-C.jpg'

In the above code snippet, one can notice that the IMAGE_PATH holds the URL of the image.

  • Taking image as input locally: Here we will take an image from the local system.
IMAGE_PATH = 'Perform-OCR.jpg'

In the above code snippet, one can notice that I have taken the image locally i.e. from the local system.

4. Extracting Text from the Image

English text detection

reader = easyocr.Reader(['en'])
result = reader.readtext(IMAGE_PATH, paragraph=True)  # boolean, not the string "False"; merged output matches the result below
result

Output:

[[[[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR']]

Here is the image this detection was run on:

Image Source: LaptrinhX

We have now extracted the text from the given image.

Let's break the code down line by line:

  1. We instantiate the Reader class from easyocr, passing ['en'] so that it detects only English text; text in other languages, such as Chinese or Japanese, will be ignored.
  2. With the language set, we pass IMAGE_PATH to the readtext() function. The paragraph parameter controls whether EasyOCR merges nearby detections into paragraphs; merged results omit the per-detection confidence score. Beware that passing the string "False" is truthy in Python and actually turns merging on, so pass the boolean False if you want separate detections.
  3. The result comes back as a list of detections, each containing a bounding box and the detected text.
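To make that structure concrete, here is a small sketch that unpacks one detection. The sample data is copied from the output shown above, not produced by a live OCR call:

```python
# One detection as returned in paragraph mode: [bounding box, text].
# The box is four [x, y] corner points: top-left, top-right,
# bottom-right, bottom-left.
result = [[[[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR']]

box, text = result[0]
top_left, bottom_right = box[0], box[2]  # opposite corners of the box
print(text, top_left, bottom_right)  # OCR [95, 71] [153, 107]
```

The same indexing (corner 0 and corner 2) is what the drawing code later in this article relies on.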

Turkish text detection

# Changing the image path
IMAGE_PATH = 'Turkish_text.png'
# Same code, switching the language from ['en'] to ['tr']
reader = easyocr.Reader(['tr'])
result = reader.readtext(IMAGE_PATH, paragraph=True)
result

Output:

[[[[89, 7], [717, 7], [717, 108], [89, 108]],
  'Most Common Texting Slang in Turkish'],
 [[[392, 234], [446, 234], [446, 260], [392, 260]], 'test'],
 [[[353, 263], [488, 263], [488, 308], [353, 308]], 'yazmak'],
 [[[394, 380], [446, 380], [446, 410], [394, 410]], 'link'],
 [[[351, 409], [489, 409], [489, 453], [351, 453]], 'bağlantı'],
 [[[373, 525], [469, 525], [469, 595], [373, 595]], 'tag etiket'],
 [[[353, 674], [483, 674], [483, 748], [353, 748]], 'follov takip et']]

For reference, here is the image on which I performed this Turkish text detection:

Image Source: TurkishClass101

Fact

EasyOCR currently supports 42 languages; below is the full set with their codes. Have fun with them!

Afrikaans (af), Azerbaijani (az), Bosnian (bs), Czech (cs), Welsh (cy), Danish (da), German (de), English (en), Spanish (es), Estonian (et), French (fr), Irish (ga), Croatian (hr), Hungarian (hu), Indonesian (id), Icelandic (is), Italian (it), Japanese (ja), Korean (ko), Kurdish (ku), Latin (la), Lithuanian (lt), Latvian (lv), Maori (mi), Malay (ms), Maltese (mt), Dutch (nl), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Slovak (sk), Slovenian (sl), Albanian (sq), Swedish (sv), Swahili (sw), Thai (th), Tagalog (tl), Turkish (tr), Uzbek (uz), Vietnamese (vi), Chinese (zh) – Source: JaidedAI
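Before constructing a Reader it can help to sanity-check the requested codes against this list, so a typo fails fast rather than at model-download time. A minimal sketch; the SUPPORTED set below is just an excerpt of the list above, and check_langs is a hypothetical helper, not part of the EasyOCR API:

```python
# Excerpt of EasyOCR language codes (see the full list above).
SUPPORTED = {'af', 'de', 'en', 'es', 'fr', 'ja', 'ko', 'tr', 'zh'}

def check_langs(langs):
    """Return any codes not in the supported set, so bad input fails early."""
    return [code for code in langs if code not in SUPPORTED]

print(check_langs(['en', 'tr']))   # []
print(check_langs(['en', 'xx']))   # ['xx']
```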

EasyOCR also gives you the flexibility to run text detection with or without a GPU.

Extracting text from image with GPU

# Changing the image path
IMAGE_PATH = 'Turkish_text.png'
reader = easyocr.Reader(['en'])
result = reader.readtext(IMAGE_PATH)
result

Output:

[([[89, 7], [717, 7], [717, 75], [89, 75]],
  'Most Common Texting Slang',
  0.8411301022318493),
 ([[296, 60], [504, 60], [504, 108], [296, 108]],
  'in Turkish',
  0.9992136162168752),
 ([[392, 234], [446, 234], [446, 260], [392, 260]], 'text', 0.955612246445849),
 ([[353, 263], [488, 263], [488, 308], [353, 308]],
  'yazmak',
  0.8339281200424168),
 ([[394, 380], [446, 380], [446, 410], [394, 410]],
  'link',
  0.8571656346321106),
 ([[351, 409], [489, 409], [489, 453], [351, 453]],
  'baglanti',
  0.9827189297769966),
 ([[393, 525], [446, 525], [446, 562], [393, 562]], 'tag', 0.999996145772132),
 ([[373, 559], [469, 559], [469, 595], [373, 595]],
  'etiket',
  0.9999972515293261),
 ([[378, 674], [460, 674], [460, 704], [378, 704]],
  'follow',
  0.9879666041306504),
 ([[353, 703], [483, 703], [483, 748], [353, 748]],
  'takip et',
  0.9987622244733467)]

Extracting text from image without GPU

# Changing the image path
IMAGE_PATH = 'Perform-OCR.jpg'
reader = easyocr.Reader(['en'], gpu=False)
result = reader.readtext(IMAGE_PATH)
result

Output:

[([[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR', 0.990493426051807)]
# Where 0.9904.. is the confidence level of detection
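Since the default output includes a confidence score per detection, a common follow-up is to drop low-confidence results before using them. A small sketch in the same (box, text, confidence) format; the sample data and the 0.5 threshold are illustrative, not from a live OCR run:

```python
# Detections in EasyOCR's default output format: (box, text, confidence).
detections = [
    ([[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR', 0.99),
    ([[10, 10], [50, 10], [50, 30], [10, 30]], 'noise', 0.21),
]

def filter_by_confidence(dets, threshold=0.5):
    """Keep only detections whose confidence meets the threshold."""
    return [(text, conf) for _, text, conf in dets if conf >= threshold]

print(filter_by_confidence(detections))  # [('OCR', 0.99)]
```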

Note: If you don’t have a GPU and you don’t set gpu=False, EasyOCR will print a warning that it is defaulting to the CPU (the module is much faster with a GPU).

5. Drawing Results on Images

Draw results for single-line text – Example 1

top_left = tuple(result[0][0][0])
bottom_right = tuple(result[0][0][2])
text = result[0][1]
font = cv2.FONT_HERSHEY_SIMPLEX

In the above code snippet:

  1. We gather the coordinates needed to draw the bounding box and text on the image we are running detection on.
  2. The top_left variable holds the coordinates of the top-left corner as a tuple, accessed from the result; bottom_right is obtained the same way for the opposite corner.
  3. The detected text is read from the result entry.
  4. FONT_HERSHEY_SIMPLEX from the cv2 package is chosen as the text font.
img = cv2.imread(IMAGE_PATH)
img = cv2.rectangle(img,top_left,bottom_right,(0,255,0),3)
img = cv2.putText(img,text,bottom_right, font, 0.5,(0,255,0),2,cv2.LINE_AA)
plt.figure(figsize=(10,10))
plt.imshow(img)
plt.show()

Now that we have the coordinates, let's plot them:

  1. Read the image with cv2's imread() function.
  2. Draw the rectangle using the top_left and bottom_right coordinates, with a decent color ((0, 255, 0)) and thickness (3).
  3. Draw the text on the image at the bottom_right coordinate.
  4. Show the image.

Output:

OCR output
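One practical caveat: cv2.rectangle expects integer corner tuples, and EasyOCR boxes can be slightly rotated, in which case corners 0 and 2 alone may mis-place an axis-aligned rectangle. A defensive sketch that derives the rectangle from all four points (to_rect is a hypothetical helper, shown here in pure Python without an OpenCV call):

```python
def to_rect(box):
    """Collapse a 4-point EasyOCR box into integer (top_left, bottom_right)."""
    xs = [int(p[0]) for p in box]
    ys = [int(p[1]) for p in box]
    return (min(xs), min(ys)), (max(xs), max(ys))

box = [[95, 71], [153, 71], [153, 107], [95, 107]]
print(to_rect(box))  # ((95, 71), (153, 107))
```

The returned tuples can be passed straight to cv2.rectangle in place of the corner indexing used above.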

Draw Results for single-line text – Example 2

IMAGE_PATH = 'sign.png'
reader = easyocr.Reader(['en'], gpu=False)
result = reader.readtext(IMAGE_PATH)
result

Output:

[([[19, 181], [165, 181], [165, 201], [19, 201]],
  'HEAD PROTECTION',
  0.9778256296390029),
 ([[31, 201], [153, 201], [153, 219], [31, 219]],
  'MUST BE WORN',
  0.9719649866726915),
 ([[39, 219], [145, 219], [145, 237], [39, 237]],
  'ON THIS SITE',
  0.9683973478739152)]

Getting the coordinates

top_left = tuple(result[0][0][0])
bottom_right = tuple(result[0][0][2])
text = result[0][1]
font = cv2.FONT_HERSHEY_SIMPLEX

Drawing Text and Bounding Boxes

img = cv2.imread(IMAGE_PATH)
img = cv2.rectangle(img,top_left,bottom_right,(0,255,0),3)
img = cv2.putText(img,text,top_left, font, 0.5,(0,0,255),2,cv2.LINE_AA)
plt.figure(figsize=(10,10))
plt.imshow(img)
plt.show()

Output:

Draw Results for single-line text

But hold on! What if we want to see all the text detections drawn on the image itself?

That’s exactly what we’ll do in this section!

Draw Results for Multiple Lines

img = cv2.imread(IMAGE_PATH)
spacer = 100
for detection in result: 
    top_left = tuple(detection[0][0])
    bottom_right = tuple(detection[0][2])
    text = detection[1]
    img = cv2.rectangle(img,top_left,bottom_right,(0,255,0),3)
    img = cv2.putText(img,text,(20,spacer), font, 0.5,(0,255,0),2,cv2.LINE_AA)
    spacer+=15
plt.figure(figsize=(10,10))
plt.imshow(img)
plt.show()

In the above code snippet, we need to focus on just a few points:

  1. Instead of detecting a single line of text, we loop through all the detections, since we want to plot multiple lines of text.
  2. When passing coordinates to cv2.putText, we use an extra variable, spacer, which is incremented by 15 on each iteration so that the text labels don’t collide with each other.
  3. The spacer keeps the labels in order and equally spaced.

Output:

Draw results for multiple lines
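The spacer logic can be sketched on its own: given the detected texts, it yields one label position per line, stepping down the image. The start, step, and x values are copied from the loop above; label_positions is a hypothetical helper for illustration:

```python
def label_positions(texts, start=100, step=15, x=20):
    """Stack label coordinates vertically so drawn text doesn't overlap."""
    positions = []
    y = start
    for text in texts:
        positions.append((text, (x, y)))
        y += step
    return positions

print(label_positions(['HEAD PROTECTION', 'MUST BE WORN', 'ON THIS SITE']))
# [('HEAD PROTECTION', (20, 100)), ('MUST BE WORN', (20, 115)), ('ON THIS SITE', (20, 130))]
```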

That wraps up the model, and with it my discussion for today 🙂

Conclusion

In conclusion, EasyOCR is an excellent tool for text detection from images, providing a simple and effective way to extract text from images with high accuracy. The library’s easy-to-use interface and powerful algorithms make it an ideal solution for businesses and organizations needing to process large volumes of documents and images quickly.

If you want to learn more about OCR technology and other data science tools and techniques, we invite you to explore our Blackbelt program. Our comprehensive training program provides in-depth instruction on various data science topics, including OCR, machine learning, and artificial intelligence. With our Blackbelt program, you can gain the skills and knowledge you need to advance your career and stay at the forefront of this exciting field. Sign up today to take the first step toward becoming a data scientist!

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

