The Language of a Computer

When you press A on your keyboard, the computer displays A on the screen. But what is actually stored inside the computer's main memory? What is the language of the computer? How does it store whatever you type on the keyboard?

Remember that a computer is an electronic device. Electrical signals are used inside the computer to process information. There are two types of electrical signals: analog and digital. Analog signals are continuous waveforms used to represent such things as sound; audio tapes, for example, store data as analog signals. Digital signals represent information with a sequence of 0s and 1s, where a 0 represents a low voltage and a 1 represents a high voltage. Digital signals are more reliable carriers of information than analog signals and can be copied from one device to another with exact precision. You might have noticed that when you make a copy of an audio tape, the sound quality of the copy is not as good as the original tape; when you copy a CD, on the other hand, the copy is as good as the original. Computers use digital signals.

Because digital signals are processed inside a computer, the language of a computer, called machine language, is a sequence of 0s and 1s. The digit 0 or 1 is called a binary digit, or bit. Sometimes a sequence of 0s and 1s is referred to as a binary code or a binary number.

Bit: A binary digit 0 or 1.
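Since each bit holds one of only two values, a sequence of n bits can distinguish 2^n patterns, which is why groups of bits can encode characters and numbers at all. A small illustrative sketch (Python is used here purely for convenience; it is not part of the text):

```python
# Each bit is either 0 or 1, so a sequence of n bits can represent
# 2**n distinct patterns.  Count the patterns for 1 bit, one byte
# (8 bits), and 10 bits.
for n in (1, 8, 10):
    print(f"{n} bits -> {2 ** n} distinct patterns")
```

With 8 bits (one byte), 256 distinct patterns are available, which is more than enough for the 128 ASCII characters discussed below.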

A sequence of eight bits is called a byte. Moreover, 2^10 bytes = 1024 bytes is called a kilobyte (KB). Table 1-1 summarizes the terms used to describe various numbers of bytes.


Every letter, number, or special symbol (such as * or {) on your keyboard is encoded as a sequence of bits, each character having a unique representation. The most commonly used encoding scheme on personal computers is the seven-bit American Standard Code for Information Interchange (ASCII). The ASCII character set consists of 128 characters numbered 0 through 127. That is, in the ASCII character set, the position of the first character is 0, the position of the second character is 1, and so on. In this scheme, A is encoded as the binary number 1000001. In fact, A is the 66th character in the ASCII character set, but its position is 65 because the position of the first character is 0. Furthermore, the binary number 1000001 is the binary representation of 65. The character 3 is encoded as 0110011. Note that in the ASCII character set, the position of the character 3 is 51, so the character 3 is the 52nd character in the set. It also follows that 0110011 is the binary representation of 51. For a complete list of the printable ASCII character set, refer to Appendix C.
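The positions and seven-bit codes above can be checked directly. The following sketch (Python, used here only for illustration) looks up each character's position and renders it as seven binary digits:

```python
# ord() returns a character's position in the ASCII ordering, and
# format(..., "07b") renders that position as seven binary digits.
for ch in ("A", "3"):
    position = ord(ch)
    print(ch, "is at position", position, "=", format(position, "07b"))
# A is at position 65 = 1000001
# 3 is at position 51 = 0110011
```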

The number system that we use in our daily life is called the decimal system, or base 10. Because everything inside a computer is represented as a sequence of 0s and 1s, that is, binary numbers, the number system that a computer uses is called binary, or base 2. We indicated in the preceding paragraph that the number 1000001 is the binary representation of 65. Appendix E describes how to convert a number from base 10 to base 2 and vice versa.
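The hand method for converting base 10 to base 2 (repeated division by 2, collecting remainders) can be sketched in a few lines. This is an illustrative sketch, not taken from the appendix it stands in for:

```python
# Convert a base-10 integer to its base-2 digit string by repeatedly
# dividing by 2 and collecting the remainders (read last to first).
def to_binary(n):
    digits = ""
    while n > 0:
        digits = str(n % 2) + digits
        n //= 2
    return digits or "0"

print(to_binary(65))      # -> 1000001
print(int("1000001", 2))  # -> 65  (the reverse conversion)
```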

TABLE 1-1 Binary Units

Unit      | Symbol | Bits/Bytes
----------|--------|---------------------------------------------------------------------
Byte      |        | 8 bits
Kilobyte  | KB     | 2^10 bytes = 1024 bytes
Megabyte  | MB     | 1024 KB = 2^10 KB = 2^20 bytes = 1,048,576 bytes
Gigabyte  | GB     | 1024 MB = 2^10 MB = 2^30 bytes = 1,073,741,824 bytes
Terabyte  | TB     | 1024 GB = 2^10 GB = 2^40 bytes = 1,099,511,627,776 bytes
Petabyte  | PB     | 1024 TB = 2^10 TB = 2^50 bytes = 1,125,899,906,842,624 bytes
Exabyte   | EB     | 1024 PB = 2^10 PB = 2^60 bytes = 1,152,921,504,606,846,976 bytes
Zettabyte | ZB     | 1024 EB = 2^10 EB = 2^70 bytes = 1,180,591,620,717,411,303,424 bytes
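Every row of Table 1-1 is just the next power of 2^10 = 1024, so the whole table can be regenerated by a short loop (an illustrative sketch, not part of the text):

```python
# Each unit in Table 1-1 is 1024 times the previous one, i.e. the
# i-th unit after the byte holds 1024**i = 2**(10*i) bytes.
units = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB"]
for i, unit in enumerate(units, start=1):
    print(f"1 {unit} = 2^{10 * i} bytes = {1024 ** i:,} bytes")
```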


Inside the computer, every character is represented as a sequence of eight bits, that is, as a byte. The eight-bit binary representation of 65 is 01000001; note that we added a 0 to the left of the seven-bit representation of 65 to convert it to an eight-bit representation. Similarly, the eight-bit binary representation of 51 is 00110011.

ASCII is a seven-bit code. Therefore, to represent each ASCII character inside the computer, you must convert the seven-bit binary representation of an ASCII character to an eight-bit binary representation. This is accomplished by adding a 0 to the left of the seven-bit ASCII encoding of a character. Hence, inside the computer, the character A is represented as 01000001, and the character 3 is represented as 00110011.
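The seven-to-eight-bit padding described above amounts to zero-filling the binary string to a width of eight. As a quick sketch (Python, for illustration only):

```python
# Widening a seven-bit ASCII code to a full byte: a zero-filled width
# of 8 prepends the leading 0 described in the text.
for ch in ("A", "3"):
    code = ord(ch)
    print(ch, format(code, "07b"), "->", format(code, "08b"))
# A 1000001 -> 01000001
# 3 0110011 -> 00110011
```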

There are other encoding schemes, such as EBCDIC (used by IBM) and Unicode, which is a more recent development. EBCDIC consists of 256 characters. The original version of Unicode consists of 65,536 characters; to store such a character, you need two bytes.
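To see why a Unicode character can need two bytes where an ASCII character needs one, consider a character outside ASCII's 128 positions. The sketch below (Python, chosen for illustration; the UTF-16 encoding used here is one concrete two-byte representation, not something named in the text) uses the Greek letter omega:

```python
# The Greek capital omega sits at position 937, well beyond ASCII's
# range of 0-127, so one byte (at most 256 patterns) cannot hold it.
omega = "\u03a9"                    # 'Ω'
print(ord(omega))                   # -> 937
print(format(ord(omega), "016b"))   # its sixteen-bit (two-byte) binary form
print(len(omega.encode("utf-16-be")))  # -> 2 bytes in a 16-bit encoding
```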