💿

3. Data

👉
Become job-ready by solving real-world challenges and build your professional cybersecurity skills with the National Cyber League!

What is Digital Data

Data is what makes computers useful. You have likely heard the term "data" used to refer to the byproduct of a user's interaction with a computer system, such as a digital file saved from a camera application or a record of someone's Internet search history. However, "digital data" has a broader definition that includes other types of digital information, such as computer software itself. For the sake of brevity, we will use the term "data" to refer to the broader "digital data".

Data is an abstract thing - it's not a physical object. The same data can be physically represented in different ways. A digital image could be saved on a laptop hard drive, or it could be saved on a flash drive, or even on a CD. Regardless of how the data is physically stored, the digital image is still the same. This is what allows us to use a phone to post an image online that can be viewed on a different type of phone, a laptop, or a completely different type of device.

You likely interact with non-digital data on a daily basis. If you write a letter on a piece of paper and then use a copier to make a duplicate, you now have two different representations of the data (the letter) on two different mediums (pen/pencil on writing paper and ink on printer paper). Maybe you even get the letter printed with a Braille printer. Either way, the information of the letter is the same, but the physical representation of how the information is recorded is different. As humans, these different representations of the data are easy for us to use, but that's not the case for computers.

It is easier to build a computer that only needs to identify two different possible values per character - this is a binary system, a system where only two different symbols can be used to represent data. Despite this limitation, it is possible for binary to store any value that we could store using our normal human symbols.

Before we dive into how binary works, it is important to remember that symbols do not have any inherent meaning. We give them meaning. The symbol 1 represents "one" in the Arabic Numeral system. In Chinese, "one" is represented with the symbol 一 and in Cyrillic, "one" is represented with А. If you find using the 0 and 1 symbols confusing, try substituting them with entirely different characters - try a and b or you could even make up new symbols entirely.
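To see that the symbols themselves are arbitrary, here is a minimal sketch in Python that rewrites a number written with 0 and 1 using a and b instead. The function name swap_symbols is made up for this illustration and is not part of any library.

```python
def swap_symbols(binary_text, zero="a", one="b"):
    """Rewrite a string of 0s and 1s using two other symbols."""
    # Build a translation table: every "0" becomes `zero`, every "1" becomes `one`.
    table = str.maketrans({"0": zero, "1": one})
    return binary_text.translate(table)

print(swap_symbols("101110"))  # babbba
```

The value being represented does not change; only the symbols used to write it down do.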

⚠️
Below this, values in the red background represent decimal numbers and values in the blue background represent binary numbers.

Counting in Decimal

Our normal number system using the Arabic Numerals is a decimal system (dec- meaning "ten"). Try recalling how you normally count. You first start with the lowest value, 0. You then increment to the next symbol until you reach the symbol that represents the highest possible value for a single digit (...1, 2, 3, 4, 5, 6, 7, 8, 9). Once you reach the maximum value for a single digit, you increase the digit to the left by one and set the current digit back to zero, which brings you to 10.

When you reach 19, you increment the "tens" column by one and reset the "ones" column back to zero, which makes 20.

When you reach 99, the "ones" column has reached the max value, so you reset it back to zero. When you try to increment the "tens" column, you find that it has also reached the max value, so you set the "tens" column to zero and increment the "hundreds" column by one, which makes 100.

Each one of these columns is a power of ten. This is because there are ten possible values for each digit. The "ones" column is actually $10^0$, the "tens" column is actually $10^1$, the "hundreds" column is actually $10^2$, the "thousands" column is actually $10^3$, and so forth. These exact same principles apply to counting in binary.

When you subsequently need to determine the value of the number, you take the digit in each column, multiply it by the value of the column, and then add all of those values together.

For example, the value of the number 327 is equivalent to $(3 \times 10^2) + (2 \times 10^1) + (7 \times 10^0)$.

Simplified, this would be $(3 \times 100) + (2 \times 10) + (7 \times 1)$.
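The column arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name decimal_value is hypothetical, and the number is passed in as a string of digits so that each column can be inspected separately.

```python
def decimal_value(digits):
    """Add up (digit * column value) for a string of decimal digits."""
    total = 0
    # Walk the digits from right to left: position 0 is the "ones" column.
    for position, digit in enumerate(reversed(digits)):
        column_value = 10 ** position  # 1, 10, 100, 1000, ...
        total += int(digit) * column_value
    return total

print(decimal_value("327"))  # 327
```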

Counting in Binary

In the binary number system (bi- meaning "two"), there are only two values: 0 and 1. You start with 0 and then follow that with 1.

At this point, you have reached the maximum value for the column, so you now need to increment the left column by one and reset the current column back to zero, which would make the number 10.

You then increment the rightmost column by one to get 11.

Now both columns are at their max value, so you need to create a third column, increment it by one, and set the other two columns to zero, making 100.
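The counting steps above follow one rule: reset every full column on the right to zero and carry one into the next column to the left. A minimal sketch of that rule in Python follows; the function name increment is made up for this example, and the binary number is kept as a string so the columns stay visible.

```python
def increment(bits):
    """Add one to a binary number given as a string such as '11'."""
    digits = list(bits)
    position = len(digits) - 1           # start at the rightmost column
    while position >= 0 and digits[position] == "1":
        digits[position] = "0"           # column is at its max value, reset it
        position -= 1                    # and carry into the column to the left
    if position < 0:
        digits.insert(0, "1")            # every column was full, so add a new one
    else:
        digits[position] = "1"
    return "".join(digits)

number = "0"
for _ in range(4):
    number = increment(number)
    print(number)  # prints 1, 10, 11, 100 on successive lines
```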

In decimal, these columns represent powers of ten, but in binary, they represent powers of two. Starting from the right side: the first column is $2^0$ (the "ones" column), the second column is $2^1$ (the "twos" column), the third column is $2^2$ (the "fours" column), the fourth column is $2^3$ (the "eights" column), the fifth column is $2^4$ (the "sixteens" column), and so forth.

When you need to calculate the value of a binary number, the process is the same as with a decimal number. You take the digit in each column, multiply it by the value of the column, and then add all of those values together.

For example, the value of the binary number 101110 is equivalent to $(1 \times 2^5) + (0 \times 2^4) + (1 \times 2^3) + (1 \times 2^2) + (1 \times 2^1) + (0 \times 2^0)$.

Simplified, this would be $(1 \times 32) + (0 \times 16) + (1 \times 8) + (1 \times 4) + (1 \times 2) + (0 \times 1)$, which adds up to 46 in decimal.
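If you want to check a conversion like this one, the same column-by-column process can be written as a short Python sketch. The function name binary_to_decimal is hypothetical; int(..., 2) and bin() are Python built-ins that perform the conversion directly.

```python
def binary_to_decimal(bits):
    """Add up (digit * column value) for a string of binary digits."""
    total = 0
    # Position 0 is the rightmost column, worth 2**0 = 1.
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total

print(binary_to_decimal("101110"))  # 46
print(int("101110", 2))             # 46, using the built-in conversion
print(bin(46))                      # 0b101110
```

The 0b prefix in the last line is how Python marks a number as binary.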

Try counting in binary and then converting the binary number back into decimal. The counter below should help you if you get stuck.

image

The first row in purple shows the decimal value of each column in exponential form.

The second row in blue shows the decimal value of each column.

The third row in light orange shows the binary value of each digit. The combination of all of the digits is the current binary number in the counter.

The dark orange box in the bottom right shows the value of the current number in decimal notation.

How Computers Store Data

We mentioned earlier that it is easier to build a machine that only needs to identify two possible values. One way to do this is by using a magnetic material. An example is a hard disk drive, which contains a magnetic surface that is divided into millions or billions of tiny areas. Each one of those areas can be magnetized to store a value of 1 or demagnetized to store a value of 0. A metal arm carrying the hard disk head then moves across the spinning magnetic surface to read or write values, similar to the needle on a record player.

Another way to store data is by using transistors. Instead of dividing magnetic material into billions of tiny areas, a solid state drive uses billions of transistors, each of which holds one of two charge states to represent a 1 or a 0.

How Binary Can Represent Anything

As humans, we have invented many different symbols to convey information. In English-speaking countries, you have the English alphabet (a-z), Arabic Numerals (0-9), and special symbols (!"#$%&'*+,-./:;<=>?).

Imagine that you could only use Arabic Numerals to communicate with people. How would you form words? A very simple approach to solving this problem would be to use numbers to represent characters in the alphabet. "1" could be "A", "2" could be "B" and so forth. You could then substitute the normal alphabet entirely with numbers. Here is an example:

 h  e  l  l  o  w  o  r  l  d
08 05 12 12 15 23 15 18 12 04
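This substitution can be sketched in Python as follows. It is only an illustration of the a=1, b=2 scheme described above; the names LETTERS, encode, and decode are made up for this example, and only lowercase letters are handled.

```python
import string

LETTERS = string.ascii_lowercase  # "abcdefghijklmnopqrstuvwxyz"

def encode(word):
    """Replace each lowercase letter with its position in the alphabet (a=1)."""
    return [LETTERS.index(ch) + 1 for ch in word]

def decode(numbers):
    """Turn a list of alphabet positions back into letters."""
    return "".join(LETTERS[n - 1] for n in numbers)

codes = encode("helloworld")
print(" ".join(f"{n:02d}" for n in codes))  # 08 05 12 12 15 23 15 18 12 04
print(decode(codes))                        # helloworld
```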

Now imagine you want to represent additional information other than the alphabet - maybe some punctuation or maybe you want to differentiate between lower case and upper case characters. To do this, you would just increase the range of the numbers that you use. Instead of 1-26, you'd need to use 1-52 to cover lower + upper case characters and even more if you wanted additional punctuation.

Because binary can only use 1s and 0s, the value for each symbol would need to be written differently (e.g. 1000 in binary instead of 08 in decimal for the h), but the same theory applies. There are multiple character encoding systems that describe how to represent characters using numbers. Common ones include ASCII and UTF-8, which differ in which characters they support.
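Python exposes these encodings directly, so a short sketch can show the numbers behind a few characters. The helper name describe is hypothetical; ord, format, and str.encode are Python built-ins. Note that the values come from the real ASCII table, so they differ from the simple a=1 scheme used above.

```python
def describe(character):
    """Return the numeric code of a character and that code as 7 binary digits."""
    code = ord(character)
    return code, format(code, "07b")

for character in "hH!":
    print(character, *describe(character))
# h 104 1101000
# H 72 1001000
# ! 33 0100001

# ASCII stores "h" as a single value.
print(list("h".encode("ascii")))       # [104]

# "\u00e9" (e with an acute accent) is not in ASCII; UTF-8 stores it as two values.
print(list("\u00e9".encode("utf-8")))  # [195, 169]
```

Lowercase and uppercase letters get different numbers, which is how an encoding tells them apart.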

ASCII maps numeric values to characters. The ASCII table uses 7-digit binary numbers (7 bits), which can represent values from 0 to 127.

⚠️
ASCII characters are usually stored in 8-bit bytes. The 8th bit was reserved for error checking.

Below you can see the ASCII table, which shows how decimal values map to different characters. To convert a binary number into a character using the ASCII table, you would convert the binary number to decimal and then find the corresponding character on the ASCII table.

For example, the binary number 1001011 would convert to decimal 75.

$(1 \times 2^6) + (0 \times 2^5) + (0 \times 2^4) + (1 \times 2^3) + (0 \times 2^2) + (1 \times 2^1) + (1 \times 2^0)$

$(1 \times 64) + (0 \times 32) + (0 \times 16) + (1 \times 8) + (0 \times 4) + (1 \times 2) + (1 \times 1) = 75$

This would correspond to the uppercase letter K.
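The same lookup can be sketched in Python, where the built-in chr plays the role of the ASCII table. The function name ascii_char and the sample message are made up for this example.

```python
def ascii_char(bits):
    """Convert a string of binary digits to decimal, then look up the character."""
    return chr(int(bits, 2))

print(ascii_char("1001011"))  # K

# Several 7-digit binary numbers in a row spell out a word.
message = "1001000 1101001"
print("".join(ascii_char(chunk) for chunk in message.split()))  # Hi
```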

image
