In the last article, we defined sounds and colors and saw how computers represent their values in bits and bytes. This article extends that discussion and briefly talks about how images and videos are represented in a computer system. It is also the last article in this series on data and data representation in a computer system.

We won’t discuss image types, file types, or compression mechanisms in this article; here, we will focus on bitmap images.

Now, let’s jump into the content.


What are images? We can define images in two different ways. Non-technically, an image is just a mechanism for capturing a moment in our lives; that’s it, right? In technical terms, images are nothing but a grid of pixels. Woohoo, what?! 🤔 What is a pixel, and what do I mean by grid?

Let’s address these questions one by one. I am sure most of us have heard the terms 8MP, 16MP, 24MP, Full HD, 4K, and so on when trying to measure the quality of a camera. But what are we really referring to? The simple answer: the total number of pixels in an image. For example, a 24MP (megapixel) camera can capture an image with 24 million pixels; likewise, an 8MP camera can capture 8 million pixels.

Now, I think you can work out the definition of a pixel from the above explanation; if not, let me give it to you. A pixel is nothing but the smallest element of a picture. That means an image is composed of pixels, and those pixels are laid out like a table, in rows and columns. For example, a 24MP image can be represented as 6000x4000 pixels: 6000 pixels row-wise and 4000 pixels column-wise. When we define images like this (in terms of rows and columns), we are specifying the pixel dimensions, in other words, the resolution. The higher the resolution, the more accurate the image will appear.
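To make that arithmetic concrete, here is a quick sketch of how pixel dimensions relate to megapixels, using the 24MP example from above:

```python
# Total pixels = width * height.
# 6000x4000 are the pixel dimensions of the 24MP example above.
width, height = 6000, 4000

total_pixels = width * height          # 24,000,000 pixels
megapixels = total_pixels / 1_000_000  # 24.0 MP

print(f"{width}x{height} = {total_pixels} pixels = {megapixels}MP")
```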

If you really want to know what pixels look like, open up an image on your laptop or smartphone and zoom in as far as you can; you will start to notice small squares of solid color. I guess we have all played a jigsaw puzzle at some point in our lives, right? Images are just like that: pixels are the smallest pieces of the image placed in the correct spots, and the image is the final, completed puzzle.

What do pixels do?

When we take a picture, we are capturing colors and their intensities, and pixels store that information. Each pixel has its own color and intensity value. Those values are measured as numbers and represented in computer memory in binary, just like every other data type.

To answer how images are represented in a computer system, we first need to know the image’s color depth (how many bits we use to represent a single color). Let’s take one small image and see how its pixels are laid out, and what their values look like, at different color depths. I hope you remember what we discussed about colors in our last article; if you don’t, refer back to it.

In the above black-and-white picture, each pixel has a value of either 1 or 0, nothing else. This kind of image is called monochrome.

In the above grayscale image, each pixel has a value from 0 to 255, just as we discussed in the previous article.

In the above RGB image, each pixel has a value from 0 to 255 for each color channel.
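To tie the three cases together, here is a minimal sketch of what the pixel values of a tiny, made-up 2x2 image could look like at each color depth (the specific values are invented for illustration):

```python
# Invented pixel values for a tiny 2x2 image at three color depths.

# Monochrome: 1 bit per pixel, 0 (black) or 1 (white).
mono = [[0, 1],
        [1, 0]]

# Grayscale: 8 bits per pixel, 0 (black) through 255 (white).
gray = [[0, 128],
        [200, 255]]

# RGB: 24 bits per pixel, one 0-255 value per color channel.
rgb = [[(255, 0, 0), (0, 255, 0)],      # red,  green
       [(0, 0, 255), (255, 255, 255)]]  # blue, white
```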

You might ask: as the resolution and color depth get higher, doesn’t our file size grow too? Yes, but that’s the cost we need to pay. To give you a glimpse of how much space a picture takes at different color depths, let’s look at the following table based on the above images.

The above pictures have a resolution of 480x320 (153,600 pixels); how many bytes do we need?

Color Depth       | # of bits          | # of bytes
1 bit per pixel   | 153,600 * 1 bit    | 19,200 bytes
8 bits per pixel  | 153,600 * 8 bits   | 153,600 bytes
24 bits per pixel | 153,600 * 24 bits  | 460,800 bytes (153,600 * 24 / 8)
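The arithmetic behind those numbers can be sketched in a few lines of code; the helper name here is made up for illustration, not a standard API:

```python
# Uncompressed bitmap size = total pixels * bits per pixel,
# divided by 8 to convert bits into bytes.
def bitmap_size_bytes(width, height, bits_per_pixel):
    total_bits = width * height * bits_per_pixel
    return total_bits // 8

# 480x320 is the resolution of the example pictures above.
for depth in (1, 8, 24):
    print(f"{depth:>2} bpp -> {bitmap_size_bytes(480, 320, depth):,} bytes")
```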

So, we need compression mechanisms. As I said, we won’t discuss compression mechanisms here, but if you want to know more, you can research them on your own.


Again, let’s try to define videos in technical and non-technical terms. Non-technically, videos are moving images. Technically, a video is a series of images combined with sound.

To grasp the concept, take a series of pictures of a moving object, then flip through them quickly, starting from the first. Congratulations, you got yourself a video! Simple, right? That’s what videos are.

Videos are usually measured in FPS (frames per second), i.e., how many pictures are shown per second. The more images per second, the smoother and more realistic the video will appear. Let’s look at the following illustration to grasp the idea.

The above illustration shows the difference between two videos, one at 10 FPS and one at 5 FPS; you can clearly see there is a difference, right?

As we saw in the images section, as the FPS gets higher, the file size increases, so we need compression mechanisms here as well.
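As a rough sketch of why compression matters for video, the per-image arithmetic from earlier can be extended with frame rate and duration (the resolution, FPS, and duration here are illustrative, and sound is ignored):

```python
# Uncompressed video size = bytes per frame * frames per second * seconds.
# The function name is made up for this sketch; audio is not included.
def video_size_bytes(width, height, bits_per_pixel, fps, seconds):
    frame_bytes = width * height * bits_per_pixel // 8
    return frame_bytes * fps * seconds

# One minute of 480x320, 24-bit color, at 30 FPS, with no compression:
size = video_size_bytes(480, 320, 24, fps=30, seconds=60)
print(f"{size / 1_000_000:.1f} MB")  # 829.4 MB for a single minute
```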

So, in general, this is how we represent images and videos in a computer system. And with that, we conclude our series on data representation in a computer system.