Text detection from images is an essential technology in many applications, including document processing, image search, and machine translation. With the advancement of optical character recognition (OCR) technology, text detection has become more accurate and efficient, enabling businesses and organizations to extract useful information from images quickly. This article introduces EasyOCR, a powerful and user-friendly OCR library that can detect and extract text from various image formats. We will explore the features of EasyOCR, its advantages over other OCR libraries, and how you can implement it in real-world applications.
In this article, you will learn about EasyOCR, a simple tool for reading text from images using Python. We will look at the EasyOCR API, how it works with different languages, and the EasyOCR model that helps recognize text. This guide will help you understand how to use EasyOCR in your projects easily.
This article was published as a part of the Data Science Blogathon
OCR stands for Optical Character Recognition, a technology that has been revolutionary for the digital world. OCR is a complete process in which digital images or documents are analyzed and the text they contain is extracted as normal, editable text.
OCR is a technology that enables you to convert different types of documents, such as scanned paper documents, PDF files, or images captured by a digital camera into editable and searchable data.
EasyOCR is a Python package that uses PyTorch as its backend. In my experience it is the most straightforward way to detect text in images, and having a mature deep learning library (PyTorch) behind it makes its accuracy more credible. EasyOCR supports 42+ languages for detection. It was created by Jaided AI.
Installing PyTorch as a complete package can be a little tricky, so I recommend going through the official PyTorch site. Its installation page presents a selector for your setup.
The selector offers numerous options to choose from and generates the command most compatible with your choices.
Let me show you what I mean!
In my case, I chose Package: pip and Compute platform: CPU, and based on those choices I got the command pip install torch torchvision torchaudio. After that it is a piece of cake: simply run this command in your command prompt and PyTorch will be installed.
After installing PyTorch, installing EasyOCR is easy. Just run the following command:
pip3 install easyocr
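To confirm that the packages installed correctly before going further, a quick check like the following may help. This is a minimal sketch using only the standard library; the helper name check_installed is hypothetical, and the package list is an assumption based on the imports used in this article.

```python
import importlib.util


def check_installed(packages):
    """Return a dict mapping each package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


# Import names used in this article (cv2 is provided by opencv-python).
status = check_installed(["torch", "easyocr", "cv2", "matplotlib"])
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

Any package reported as missing can be installed with pip before running the rest of the code.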
Once the installation finishes, import the required libraries:
import os
import easyocr
import cv2
from matplotlib import pyplot as plt
import numpy as np
IMAGE_PATH = 'https://blog.aspose.com/wp-content/uploads/sites/2/2020/05/Perform-OCR-using-C.jpg'
In the above code snippet, one can notice that the IMAGE_PATH holds the URL of the image.
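EasyOCR can read an image straight from a URL, but cv2.imread, which is used later for drawing, reads only local files. One option is to download the image once and point IMAGE_PATH at the local copy. A minimal sketch, assuming the URL is reachable; the helper names local_name_for and download_image are hypothetical.

```python
import os
import urllib.request
from urllib.parse import urlparse


def local_name_for(url):
    """Derive a local file name from the last path segment of the URL."""
    return os.path.basename(urlparse(url).path)


def download_image(url, target_dir="."):
    """Download the image and return the path of the local copy."""
    path = os.path.join(target_dir, local_name_for(url))
    urllib.request.urlretrieve(url, path)
    return path


IMAGE_URL = 'https://blog.aspose.com/wp-content/uploads/sites/2/2020/05/Perform-OCR-using-C.jpg'
print(local_name_for(IMAGE_URL))  # Perform-OCR-using-C.jpg
# IMAGE_PATH = download_image(IMAGE_URL)  # uncomment to fetch the file
```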
IMAGE_PATH = 'Perform-OCR.jpg'
In the above code snippet, one can notice that I have taken the image locally i.e. from the local system.
reader = easyocr.Reader(['en'])
result = reader.readtext(IMAGE_PATH, paragraph=True)  # paragraph mode merges nearby boxes and omits confidence scores
result
Output:
[[[[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR']]
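The input image is included for reference.
Now, finally, we have extracted the text from the given image.
Let’s break down the code: easyocr.Reader(['en']) loads the English model, and reader.readtext(IMAGE_PATH) returns a list in which each entry holds the four corner points of a bounding box followed by the detected text.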
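Each entry of result is a sequence: the four corner points of the bounding box, the text, and, in the default (non-paragraph) output, a confidence score. The sketch below normalizes both shapes into one record. The helper name unpack is hypothetical, and the sample data is copied from outputs shown in this article.

```python
def unpack(entry):
    """Split one readtext entry into (box, text, confidence).

    Entries without a score (paragraph mode) get confidence None.
    """
    box, text = entry[0], entry[1]
    confidence = entry[2] if len(entry) > 2 else None
    return box, text, confidence


# An entry without a confidence score.
no_score = [[[[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR']]
# The same detection with a confidence score.
with_score = [([[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR', 0.990493426051807)]

for entry in no_score + with_score:
    box, text, confidence = unpack(entry)
    print(text, box[0], confidence)
```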
# Changing the image path
IMAGE_PATH = 'Turkish_text.png'
# Same code here, just changing the language list from ['en'] to ['tr']
reader = easyocr.Reader(['tr'])
result = reader.readtext(IMAGE_PATH, paragraph=True)  # paragraph mode merges nearby boxes and omits confidence scores
result
Output:
[[[[89, 7], [717, 7], [717, 108], [89, 108]], 'Most Common Texting Slang in Turkish'], [[[392, 234], [446, 234], [446, 260], [392, 260]], 'test'], [[[353, 263], [488, 263], [488, 308], [353, 308]], 'yazmak'], [[[394, 380], [446, 380], [446, 410], [394, 410]], 'link'], [[[351, 409], [489, 409], [489, 453], [351, 453]], 'bağlantı'], [[[373, 525], [469, 525], [469, 595], [373, 595]], 'tag etiket'], [[[353, 674], [483, 674], [483, 748], [353, 748]], 'follov takip et']]
For reference, this is the image on which I ran the Turkish text detection.
Fact
EasyOCR currently supports 42 languages. I have listed all of them below with their codes. Have fun with it, guys!
Afrikaans (af), Azerbaijani (az), Bosnian (bs), Czech (cs), Welsh (cy), Danish (da), German (de), English (en), Spanish (es), Estonian (et), French (fr), Irish (ga), Croatian (hr), Hungarian (hu), Indonesian (id), Icelandic (is), Italian (it), Japanese (ja), Korean (ko), Kurdish (ku), Latin (la), Lithuanian (lt), Latvian (lv), Maori (mi), Malay (ms), Maltese (mt), Dutch (nl), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Slovak (sk), Slovenian (sl), Albanian (sq), Swedish (sv), Swahili (sw), Thai (th), Tagalog (tl), Turkish (tr), Uzbek (uz), Vietnamese (vi), Chinese (zh) (Source: JaidedAI)
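Since easyocr.Reader takes a list of these codes, it can help to validate them before loading a model. A minimal sketch using a small subset of the list above; the dictionary LANGUAGES and the helper validate_codes are hypothetical and not part of EasyOCR.

```python
# A few entries from the list above (code -> language name).
LANGUAGES = {
    "en": "English",
    "tr": "Turkish",
    "de": "German",
    "fr": "French",
    "es": "Spanish",
}


def validate_codes(codes):
    """Return the codes as a list, or raise ValueError naming unknown ones."""
    unknown = [code for code in codes if code not in LANGUAGES]
    if unknown:
        raise ValueError(f"Unsupported language code(s): {', '.join(unknown)}")
    return list(codes)


print(validate_codes(["en", "tr"]))  # ['en', 'tr']
# reader = easyocr.Reader(validate_codes(["en", "tr"]))
```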
EasyOCR lets you choose whether to run text detection with or without a GPU.
# Changing the image path
IMAGE_PATH = 'Turkish_text.png'
reader = easyocr.Reader(['en'])
result = reader.readtext(IMAGE_PATH)
result
Output:
[([[89, 7], [717, 7], [717, 75], [89, 75]], 'Most Common Texting Slang', 0.8411301022318493), ([[296, 60], [504, 60], [504, 108], [296, 108]], 'in Turkish', 0.9992136162168752), ([[392, 234], [446, 234], [446, 260], [392, 260]], 'text', 0.955612246445849), ([[353, 263], [488, 263], [488, 308], [353, 308]], 'yazmak', 0.8339281200424168), ([[394, 380], [446, 380], [446, 410], [394, 410]], 'link', 0.8571656346321106), ([[351, 409], [489, 409], [489, 453], [351, 453]], 'baglanti', 0.9827189297769966), ([[393, 525], [446, 525], [446, 562], [393, 562]], 'tag', 0.999996145772132), ([[373, 559], [469, 559], [469, 595], [373, 595]], 'etiket', 0.9999972515293261), ([[378, 674], [460, 674], [460, 704], [378, 704]], 'follow', 0.9879666041306504), ([[353, 703], [483, 703], [483, 748], [353, 748]], 'takip et', 0.9987622244733467)]
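The third element of each entry is a confidence score between 0 and 1, which can be used to discard weak detections. A minimal sketch on a few entries from the output above; the 0.85 threshold and the helper name filter_by_confidence are assumptions for illustration.

```python
def filter_by_confidence(result, threshold=0.85):
    """Keep only detections whose confidence is at least `threshold`."""
    return [(box, text, conf) for box, text, conf in result if conf >= threshold]


sample = [
    ([[89, 7], [717, 7], [717, 75], [89, 75]], 'Most Common Texting Slang', 0.8411301022318493),
    ([[296, 60], [504, 60], [504, 108], [296, 108]], 'in Turkish', 0.9992136162168752),
    ([[353, 263], [488, 263], [488, 308], [353, 308]], 'yazmak', 0.8339281200424168),
    ([[394, 380], [446, 380], [446, 410], [394, 410]], 'link', 0.8571656346321106),
]

kept = filter_by_confidence(sample)
print([text for _, text, _ in kept])  # ['in Turkish', 'link']
```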
# Changing the image path
IMAGE_PATH = 'Perform-OCR.jpg'
reader = easyocr.Reader(['en'], gpu=False)
result = reader.readtext(IMAGE_PATH)
result
Output:
[([[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR', 0.990493426051807)] # Where 0.9904.. is the confidence level of detection
Note: If you don’t have a GPU and do not set gpu=False, EasyOCR prints a warning and falls back to the CPU.
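One way to avoid the warning is to check for a CUDA device first and pass the result to the gpu argument. A minimal sketch, assuming PyTorch is the installed backend; the helper name gpu_available is hypothetical.

```python
def gpu_available():
    """Return True if PyTorch is installed and can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


use_gpu = gpu_available()
print(f"Running EasyOCR with gpu={use_gpu}")
# reader = easyocr.Reader(['en'], gpu=use_gpu)
```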
top_left = tuple(result[0][0][0])
bottom_right = tuple(result[0][0][2])
text = result[0][1]
font = cv2.FONT_HERSHEY_SIMPLEX
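In the above code snippet, result[0][0][0] is the top-left corner of the first detection's bounding box, result[0][0][2] is its bottom-right corner, result[0][1] is the detected text, and font selects the OpenCV font used to draw the label.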
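Indexing corners 0 and 2 works for axis-aligned boxes. For slanted text, or when coordinates come back as non-integer values, taking the minimum and maximum over all four corners and casting to int is a safer input for cv2.rectangle. A minimal sketch; the helper name box_corners is hypothetical.

```python
def box_corners(box):
    """Return integer (top_left, bottom_right) enclosing a four-point box."""
    xs = [point[0] for point in box]
    ys = [point[1] for point in box]
    top_left = (int(min(xs)), int(min(ys)))
    bottom_right = (int(max(xs)), int(max(ys)))
    return top_left, bottom_right


box = [[95, 71], [153, 71], [153, 107], [95, 107]]
print(box_corners(box))  # ((95, 71), (153, 107))
```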
img = cv2.imread(IMAGE_PATH)
img = cv2.rectangle(img,top_left,bottom_right,(0,255,0),3)
img = cv2.putText(img,text,bottom_right, font, 0.5,(0,255,0),2,cv2.LINE_AA)
plt.figure(figsize=(10,10))
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # OpenCV loads BGR; matplotlib expects RGB
plt.show()
Now that we have the coordinates, the code above draws the bounding box and the text on the image.
Output:
IMAGE_PATH = 'sign.png'
reader = easyocr.Reader(['en'], gpu=False)
result = reader.readtext(IMAGE_PATH)
result
Output:
[([[19, 181], [165, 181], [165, 201], [19, 201]], 'HEAD PROTECTION', 0.9778256296390029), ([[31, 201], [153, 201], [153, 219], [31, 219]], 'MUST BE WORN', 0.9719649866726915), ([[39, 219], [145, 219], [145, 237], [39, 237]], 'ON THIS SITE', 0.9683973478739152)]
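When only the text matters, the detections can be joined in reading order. A minimal sketch using the output above; sorting by the top-left corner is an assumption that suits simple single-column layouts, and the helper name join_text is hypothetical.

```python
def join_text(result, sep=" "):
    """Join detected strings, ordered top-to-bottom then left-to-right."""
    ordered = sorted(result, key=lambda entry: (entry[0][0][1], entry[0][0][0]))
    return sep.join(entry[1] for entry in ordered)


sign = [
    ([[19, 181], [165, 181], [165, 201], [19, 201]], 'HEAD PROTECTION', 0.9778256296390029),
    ([[31, 201], [153, 201], [153, 219], [31, 219]], 'MUST BE WORN', 0.9719649866726915),
    ([[39, 219], [145, 219], [145, 237], [39, 237]], 'ON THIS SITE', 0.9683973478739152),
]

print(join_text(sign))  # HEAD PROTECTION MUST BE WORN ON THIS SITE
```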
Getting the coordinates
top_left = tuple(result[0][0][0])
bottom_right = tuple(result[0][0][2])
text = result[0][1]
font = cv2.FONT_HERSHEY_SIMPLEX
Drawing Text and Bounding Boxes
img = cv2.imread(IMAGE_PATH)
img = cv2.rectangle(img,top_left,bottom_right,(0,255,0),3)
img = cv2.putText(img,text,top_left, font, 0.5,(0,0,255),2,cv2.LINE_AA)
plt.figure(figsize=(10,10))
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # OpenCV loads BGR; matplotlib expects RGB
plt.show()
Output:
But hold on! What if we want to see all the detected text on the image itself?
That’s what I’ll do in this section!
img = cv2.imread(IMAGE_PATH)
spacer = 100
for detection in result:
    top_left = tuple(detection[0][0])
    bottom_right = tuple(detection[0][2])
    text = detection[1]
    img = cv2.rectangle(img,top_left,bottom_right,(0,255,0),3)
    img = cv2.putText(img,text,(20,spacer), font, 0.5,(0,255,0),2,cv2.LINE_AA)
    spacer+=15
plt.figure(figsize=(10,10))
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # OpenCV loads BGR; matplotlib expects RGB
plt.show()
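In the above code snippet, we just need to focus on a few points: the loop visits every detection in result and draws its bounding box, and spacer starts at 100 and grows by 15 after each detection, so each piece of text is written on its own line along the left edge of the image.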
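The spacer logic can be isolated into a small function that computes where each label goes. A minimal sketch mirroring the values used above (x = 20, starting at y = 100, 15 pixels per line); the helper name label_positions is hypothetical.

```python
def label_positions(texts, x=20, start_y=100, step=15):
    """Return (text, (x, y)) pairs, one line of text every `step` pixels."""
    return [(text, (x, start_y + i * step)) for i, text in enumerate(texts)]


texts = ['HEAD PROTECTION', 'MUST BE WORN', 'ON THIS SITE']
for text, origin in label_positions(texts):
    print(text, origin)
```

Each origin can then be passed to cv2.putText in place of the running spacer variable.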
Output:
That wraps up the model, and with it my discussion for today 🙂
In conclusion, EasyOCR is an excellent tool for text detection from images, providing a simple and effective way to extract text from images with high accuracy. The library’s easy-to-use interface and powerful algorithms make it an ideal solution for businesses and organizations needing to process large volumes of documents and images quickly.
EasyOCR is used for extracting text from images or scanned documents. It supports multiple languages and works well for tasks like document reading and image-based text recognition.
EasyOCR is often considered better for complex text recognition tasks, especially for multi-language support and reading text in different fonts or handwriting. Tesseract, which is also open-source, is more popular for basic OCR tasks.
You can train EasyOCR on custom datasets if you need better accuracy for specific languages or text styles. It supports custom model training with some technical setup.