Deep learning techniques are widely used on unstructured data such as text and images. Before working with any type of data, one should have a good understanding of it. So in this article, we will discuss images and see how they are stored on a computer. We will learn about pixel values and cover two popular image formats: Grayscale and RGB.
Let’s take an example. Here we have taken a black-and-white image, also known as a Grayscale image.
This is an image of the number 8. If we zoom in and look closely, the image becomes distorted, and you can see small square boxes on it.
These small boxes are called pixels. We often say that the dimension of an image is X x Y. What does that actually mean? It means that the dimension of the image is simply the number of pixels across the image's height (X) and width (Y). In this case, if you count, there are 24 pixels across the height and 16 pixels across the width. Hence the dimension of this image is 24 x 16. Although we see the image in this form, the computer stores it as numbers.
Each of these pixels is represented by a numerical value, and these numbers are called pixel values. A pixel value denotes the intensity of that pixel. For a typical 8-bit grayscale or b&w image, pixel values range from 0 to 255.
The smaller numbers closer to zero represent the darker shade while the larger numbers closer to 255 represent the lighter or the white shade.
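To make the intensity scale concrete, here is a minimal sketch using NumPy. The `strip` array is a made-up row of five pixels (it is not taken from the image of the 8), and it assumes the usual 8-bit storage, where each pixel is an unsigned integer between 0 and 255.

```python
import numpy as np

# Hypothetical 1 x 5 strip of 8-bit grayscale pixels, ordered dark to light.
strip = np.array([[0, 64, 128, 192, 255]], dtype=np.uint8)

darkest = int(strip.min())   # 0 is pure black
lightest = int(strip.max())  # 255 is pure white

print(strip.dtype, darkest, lightest)  # uint8 0 255
```

The values in between (64, 128, 192) are progressively lighter shades of gray.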
So every image in a computer is saved in this form where you have a matrix of numbers, and this matrix is also known as a Channel.
Now, can you guess the shape of this matrix? It will be the same as the number of pixels across the height and width of the image. In this case, the shape of the matrix is 24 x 16.
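As a rough illustration, a 24 x 16 grayscale image can be represented as a 2-D NumPy array. The pixel values below are placeholders (a white bar on a black background), not the actual digit from the example; only the shape matches.

```python
import numpy as np

height, width = 24, 16

# Start with an all-black image: one channel, one value per pixel.
image = np.zeros((height, width), dtype=np.uint8)

# Placeholder content: paint a white vertical bar in the middle.
image[4:20, 6:10] = 255

print(image.shape)  # (24, 16) -> (height, width)
print(image.size)   # 384 pixel values in total
```

Note that NumPy reports the shape as (rows, columns), which corresponds to (height, width).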
Now let’s quickly summarize the points that we’ve learned so far.
Now that we know how grayscale images are stored in a computer, let's look at an example of a colored image. This is an image of a dog:
This image comprises many different colors. Almost all colors can be generated from the three primary colors – Red, Green, and Blue. Therefore, we can say that each colored image is a unique composition of these three colors or 3 channels – Red, Green, and Blue.
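Here is a small sketch of how other colors arise from red, green, and blue. It assumes additive mixing of 8-bit intensities, with each color written as an (R, G, B) triplet; the variable names are just for illustration.

```python
import numpy as np

# Each color is an (R, G, B) triplet of 8-bit intensities.
red = np.array([255, 0, 0], dtype=np.uint8)
green = np.array([0, 255, 0], dtype=np.uint8)
blue = np.array([0, 0, 255], dtype=np.uint8)

# Adding primaries channel by channel gives new colors.
# No channel exceeds 255 here, so uint8 does not overflow.
yellow = red + green
white = red + green + blue

print(yellow.tolist())  # [255, 255, 0]
print(white.tolist())   # [255, 255, 255]
```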
This means that a colored image has more matrices, or channels, than a grayscale image. In this particular example, we have 3 matrices – 1 matrix for red, known as the Red channel,
another matrix for green, known as the Green channel,
and finally, a third matrix for the blue color, known as the Blue channel.
Each of these matrices again has values ranging from 0 to 255, where each number represents the intensity of a pixel. In other words, these values represent different shades of red, green, and blue. All of these channels or matrices are superimposed over one another to form the image when it is loaded into a computer. The computer reads this image as an array of shape N x M x 3,
where N is the number of pixels across the height, M is the number of pixels across the width, and 3 represents the number of channels. In this case, we have 3 channels: R, G, and B. So, in our example, the shape of the colored image would be 6 x 5 x 3, since we have 6 pixels across the height, 5 across the width, and 3 channels.
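The stacking of channels can be sketched with NumPy as follows. The constant values 200, 120, and 40 are made up for illustration (a real photo would have a different value at every pixel), and placing the channel axis last is a common convention, assumed here.

```python
import numpy as np

height, width = 6, 5

# One 6 x 5 matrix per channel, filled with placeholder intensities.
red = np.full((height, width), 200, dtype=np.uint8)
green = np.full((height, width), 120, dtype=np.uint8)
blue = np.full((height, width), 40, dtype=np.uint8)

# Superimpose the three channels along a new last axis.
image = np.stack([red, green, blue], axis=-1)

print(image.shape)           # (6, 5, 3) -> (height, width, channels)
print(image[0, 0].tolist())  # [200, 120, 40], the R, G, B values of one pixel
```

Indexing `image[..., 0]` gives back the Red channel, `image[..., 1]` the Green channel, and `image[..., 2]` the Blue channel.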
Hope you all now understand how computers store images. We’ve learned important terms related to the topic, such as pixels, channels, pixel values, etc. In this article, we have covered the two most common image formats – Grayscale and RGB. There are other formats of images as well, which I will cover in the next article!
Dear Himanshi, Thanks for your article. I would like to share my knowledge as well regarding the Grayscale, B&W, and Colored images. As mentioned in your article, B&W and grayscale images can be called the same, I would rather consider that statement as wrong because B&W image pixels have values of either 0(black) or 255(white). Whereas grayscale image pixels can have values from 0 to 255 (shades of gray) but the three channels of the image will have the same pixel value (R=G=B). Let me know if you would like to share your inference on my comment. Thank you. Regards, Ganesh
Thanks a lot, Ganesh for adding this interesting information. It's always good to learn about new things.
Thank you for sharing that content
I am confused by the line below: "But the three channels of the image will have the same pixel value (R=G=B)"