Topic 1 Data Representation

1.1 Introduction
Data representation considers how a computer uses numbers to represent data inside the computer. Three types of data are considered at this stage:
1. Numbers, including positive numbers, negative numbers and fractions.
2. Text.
3. Graphics.
1.2 The Binary Number System
Binary (Base 2)
The binary system requires only two symbols: 0 and 1. The columns in binary represent:
2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
128s  64s   32s   16s   8s    4s    2s    units
e.g. the binary number 0 0 0 1 0 1 0 1 is equal to 16 + 4 + 1 = 21 in decimal.
The number 1110 = 8 + 4 + 2 = 14 in decimal.
Try the following. Show your working: The number 1110 = 8+4+2 = 14 in decimal
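The column method can be checked with a short Python sketch. The function name binary_to_decimal is illustrative only and not part of the course material; it simply adds up the column value of every 1.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up the column value of every 1 in a binary string."""
    total = 0
    # The rightmost bit is the units column (2^0), so walk from the right.
    for position, bit in enumerate(reversed(bits.replace(" ", ""))):
        if bit == "1":
            total += 2 ** position
    return total


print(binary_to_decimal("0001 0101"))  # 16 + 4 + 1 = 21
print(binary_to_decimal("1110"))       # 8 + 4 + 2 = 14
```

You can use it to check your own working for any other binary number.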
Remember the units used in the binary system.
1 byte = 8 bits
1 Kilobyte = 1024 bytes
1 Megabyte = 1024 Kilobytes
1 Gigabyte = 1024 Megabytes
1 Terabyte = 1024 Gigabytes

2048 Kilobytes = ?
A. 1024 Megabytes
B. 1 Gigabyte
C. 2 Megabytes
D. 4096 bytes
(Answer: C)

3 Gigabytes = ?
A. 24 Terabytes
B. 3072 Megabytes
C. 24 Kilobytes
D. 3072 Terabytes
(Answer: B)
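As a rough sketch, the unit conversions can be written as a small lookup table in Python. The names BYTES_PER_UNIT and convert are made up for this example.

```python
# Number of bytes in each unit, using the 1024-based units from this topic.
BYTES_PER_UNIT = {
    "bytes": 1,
    "KB": 1024,
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
}


def convert(amount: float, from_unit: str, to_unit: str) -> float:
    """Convert an amount of storage from one unit to another."""
    return amount * BYTES_PER_UNIT[from_unit] / BYTES_PER_UNIT[to_unit]


print(convert(2048, "KB", "MB"))  # 2.0
print(convert(3, "GB", "MB"))     # 3072.0
```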
Here are some useful terms used in binary:
Bit: Binary digit (1 or 0)
Byte: Group of 8 bits, 2^8 = 256 values
Least significant bit (LSB): Bit furthest to the right (units)
Most significant bit (MSB): Bit furthest to the left
Why use binary?
The computer is a two-state (binary) machine. All components inside a computer and all backing storage devices have only two states, e.g.
• a switch is on or off.
• a transistor conducts or does not conduct.
• a signal is a pulse of electricity or no pulse.
• an area of a magnetic disk is positive or negative.
• with laser technology light can reflect in two different directions.
Binary, using the numbers 0 and 1, can be represented by a two-state system.
Advantages of using Binary
1. A simple two-state system is less complex to represent using electrical signals than our decimal ten-state system. Degradation in signal levels does not corrupt the information as easily and so there is less chance of errors.
2. A two-state system is easy to store magnetically and optically.
3. Calculations are simpler. There are only four rules for addition. These can be easily built into the electronic circuits.
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 0 carry 1
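The four addition rules are enough to add binary numbers of any length, column by column from the right. The sketch below is one possible way to write this in Python. The function name add_binary is an assumption for illustration, and the code also has to handle a carry coming in from the previous column (1 + 1 + 1 = 1 carry 1), which follows from applying the rules twice.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary strings column by column, starting at the units."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter number with 0s
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))  # the digit written in this column
        carry = total // 2             # the carry into the next column
    if carry:
        result.append("1")
    return "".join(reversed(result))


print(add_binary("0101", "0011"))  # 5 + 3 = 8, which is 1000
```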
The disadvantages of using binary are that: 1. A binary number has more digits than its decimal equivalent. i.e. it will be longer. This is not a problem for the computer but it makes it harder for us to read and work with. 2. Binary is more difficult than decimal for us to read as we are more used to decimal.
1.3 Integers
An integer is a whole number, positive or negative. Every integer stored in the computer is allocated the same amount of space, whether it is a large integer or a small integer. The number of bits allocated determines the range of numbers which can be stored.
If one byte is allocated then the largest integer would be 11111111, which is 255 in decimal, or 2^8 - 1.
Two bytes would allow 2^16 = 65536 possibilities, a range from 0 to 65535 (2^16 - 1).
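A minimal sketch of the range rule for positive integers, assuming a hypothetical helper called unsigned_range: with n bits there are 2^n possibilities, so the largest value is 2^n - 1.

```python
def unsigned_range(bits: int) -> tuple[int, int]:
    """Smallest and largest positive integer that fits in the given bits."""
    return 0, 2 ** bits - 1


print(unsigned_range(8))   # (0, 255)
print(unsigned_range(16))  # (0, 65535)
```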
Negative integers
If a computer only had to store positive integers then we could easily convert each number into its binary equivalent, as you saw in the examples earlier. However, negative numbers have to be stored too, so we need a method of representing a negative sign using 1s and 0s. Modern computers use the Two’s complement method to represent integers.
Two's Complement
With this method the column of the most significant bit (the one on the far left) is given a negative value. The following examples illustrate the principle using 4 bit numbers to help you understand. A modern computer would use 32 bit numbers for integers. In your NABS and final exams you are likely to be asked to use 8 bit numbers and you will practise with these later.
Two’s Complement
Binary   Decimal
1000     -8
1001     -7
1010     -6
1011     -5
1100     -4
1101     -3
1110     -2
1111     -1
0000      0
0001     +1
0010     +2
0011     +3
0100     +4
0101     +5
0110     +6
0111     +7
In this table the 1 at the far left represents -8 (negative 8). Make sure that you understand this concept.
Note that the range is still 2^4 = 16 numbers = -8 to +7.
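The table can be reproduced with a short Python sketch that treats the leftmost column as negative and every other column as positive. The function name twos_complement_value is illustrative only.

```python
def twos_complement_value(bits: str) -> int:
    """Work out the decimal value of a two's complement bit pattern."""
    width = len(bits)
    value = 0
    for index, bit in enumerate(bits):
        if bit == "1":
            column = 2 ** (width - 1 - index)
            # Only the leftmost column (index 0) carries a negative value.
            value += -column if index == 0 else column
    return value


for pattern in ("1000", "1111", "0000", "0111"):
    print(pattern, twos_complement_value(pattern))
```

The same function works unchanged for 8 bit patterns, where the leftmost column is worth -128.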
Range and Accuracy of Two’s Complement
1. The range of numbers which can be stored depends on the number of bits being used.
4 bit numbers have a range of -8 to +7
8 bit numbers have a range of -128 to +127
2. In a modern computer 32 bits are used to store integers. This gives 2^32 different values, a range of -2,147,483,648 to +2,147,483,647.
3. Numbers stored using two’s complement are always 100% accurate.
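As a quick check of these ranges, here is a sketch assuming a hypothetical helper named signed_range. With n bits, half of the 2^n patterns are used for negative numbers and half for zero and the positive numbers.

```python
def signed_range(bits: int) -> tuple[int, int]:
    """Smallest and largest two's complement integer for the given bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for bits in (4, 8, 32):
    print(bits, signed_range(bits))
```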
8 Bit Two’s complement numbers
Here is an example of how to work out the Two’s complement for the number -80.
-128  64  32  16   8   4   2   1
   1   0   1   1   0   0   0   0
1. The number is negative so put a 1 in the first column (-128).
2. Subtract the 80 from 128: 128 - 80 = 48.
3. Now make 48 from the remaining columns using normal binary rules: 48 - 32 = 16, then 16 - 16 = 0, so put a 1 in the 32 and 16 columns.
-80 = 10110000
Express the following numbers using 8 bit Two’s complement:
Number   -128  64  32  16   8   4   2   1
1. -45      1   1   0   1   0   0   1   1
2. -21      1   1   1   0   1   0   1   1
3. -16      1   1   1   1   0   0   0   0
4. 127      0   1   1   1   1   1   1   1
5. -129   Number out of range
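The three-step method used for -80 can be sketched in Python as follows. The function name to_twos_complement is an assumption for this example. A negative number gets a 1 in the sign column, and the remaining columns hold the difference between the number and the value of the sign column.

```python
def to_twos_complement(number: int, bits: int = 8) -> str:
    """Return the two's complement bit pattern for a decimal integer."""
    lowest, highest = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    if not lowest <= number <= highest:
        raise ValueError(f"{number} is out of range for {bits} bits")
    if number < 0:
        # Step 1: a 1 in the sign column. Step 2: e.g. 128 - 80 = 48.
        remainder = number + 2 ** (bits - 1)
        # Step 3: make the remainder from the other columns.
        return "1" + format(remainder, f"0{bits - 1}b")
    return format(number, f"0{bits}b")


for value in (-80, -45, -21, -16, 127):
    print(value, to_twos_complement(value))
```

Asking for -129 raises a ValueError, which matches the "Number out of range" answer.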
1.3 Floating Point Numbers
Real numbers (numbers with a decimal point in them) are stored using floating point representation. This is like the standard form/scientific notation used in decimal.
1. The binary point is moved to the far left.
2. The point has been moved 4 places to the left so we need to multiply by 2^4. The power 4 = 100 in binary.
1101.101 = .1101101 x 2^100 (with the exponent written in binary)
The general form of this representation is m x b^e where
m = mantissa (the number)
b = base
e = exponent (the power)
As the base is always 2 and the point is always at the far left, we only need to store the mantissa and the exponent, so the number 1101.101 becomes:
Mantissa   Exponent
1101101    100
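A minimal sketch of this normalisation in Python, assuming a positive binary number with at least one 1 before the point. The function name normalise is illustrative. The exponent is simply the number of places the point moves to the left.

```python
def normalise(binary: str) -> tuple[str, str]:
    """Split a positive binary number into mantissa and binary exponent."""
    whole, _, fraction = binary.partition(".")
    whole = whole.lstrip("0")   # leading zeros carry no information
    mantissa = whole + fraction # the digits, with the point at the far left
    exponent = len(whole)       # places the point was moved to the left
    return mantissa, format(exponent, "b")


mantissa, exponent = normalise("1101.101")
print(mantissa, exponent)  # 1101101 100

# Check: .1101101 x 2^4 should equal 1101.101, which is 13.625 in decimal.
print(int(mantissa, 2) / 2 ** len(mantissa) * 2 ** int(exponent, 2))
```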
Range and Precision of Floating Point numbers
1. The range of numbers which can be stored depends on the number of bits being used for the exponent. The exponent has no effect on precision.
2. The precision of the numbers being stored depends on the number of bits being used for the mantissa. The mantissa has no effect on range.
3. In a modern computer, floating point allows:
A 4 byte mantissa: -2^31 to +2^31
A 1 byte exponent: -128 to 127
In decimal this means accuracy to 9 significant figures and a range from 10^-38 to 10^38.
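To see why the mantissa controls precision, the sketch below stores 1101.101 with fewer and fewer mantissa bits. The function name stored_value and the choice to cut off (rather than round) the extra bits are assumptions made for this illustration only.

```python
def stored_value(mantissa: str, exponent: int, mantissa_bits: int) -> float:
    """Value recovered when only the first mantissa_bits bits are kept."""
    kept = mantissa[:mantissa_bits].ljust(mantissa_bits, "0")
    # The point sits at the far left, so divide by 2^mantissa_bits first.
    return int(kept, 2) / 2 ** mantissa_bits * 2 ** exponent


for bits in (7, 5, 4):
    print(bits, stored_value("1101101", 4, bits))
```

With 7 bits the value 13.625 is stored exactly. With 5 bits it becomes 13.5 and with 4 bits it becomes 13.0, while the exponent (and so the range) is unchanged.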
1.4 Text
Text is made up of characters and each character is allocated its own binary code. The set of characters that can be represented by a computer is known as the character set.
Western world alphabets need around 80 characters. These are made up of 26 upper case letters, 26 lower case letters, 10 digits 0-9, and around 20 punctuation marks.
80 characters would need a 7 bit code. This would allow 2^7 = 128 different codes.
It is useful to have a standard code so that text can be transferred between different types of computer easily without the need for translation. ASCII and Unicode are two of the most common codes in use today.
ASCII (American Standard Code for Information Interchange) is a 7 bit code allowing 128 characters. These include 95 printable characters (including the space) and 33 control characters (codes 0 to 31 and code 127), which were designed to control display devices. Examples of control characters include:
Code 8 = Backspace
Code 9 = TAB
Code 10 = Line feed
Code 13 = Carriage Return
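Python's built-in ord function returns the code for a character, so it can be used to look up ASCII codes. The helper name ascii_code is made up for this sketch. It shows each code as a 7 bit pattern.

```python
def ascii_code(character: str) -> str:
    """Return the 7 bit ASCII pattern for a single character."""
    code = ord(character)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "07b")


for character in "Hi!":
    print(character, ord(character), ascii_code(character))

# The control characters listed above, written as Python escape sequences.
print(ord("\b"), ord("\t"), ord("\n"), ord("\r"))  # 8 9 10 13
```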
ASCII is often extended to 8 bits, which allows 2^8 = 256 different characters. The extra codes include accented characters and alphabetic characters used in other languages. This standard became known as extended ASCII and then ISO 8859.
ASCII was designed to cope with Western character sets such as English, French and German but did not include Japanese or Arabic symbol shapes. The increase in worldwide communication led to a need for a larger standard code to cope with other alphabets, technical symbols etc.
Unicode
“Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.” (www.unicode.org)
Unicode was originally designed to use a 16 bit code for each character. This provides a unique code for up to 2^16 = 65,536 characters (later versions of the standard allow more). Unicode includes all the ASCII character codes to ensure compatibility.
Unicode v ASCII
Unicode Advantage: Can represent many more characters than ASCII.
Disadvantage: It takes up more space to store Unicode than it does to store ASCII.
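The storage difference can be seen with Python's built-in text encodings. This is a sketch only: it uses the "utf-16-be" codec as a stand-in for a 16 bit Unicode code, which is an assumption made for illustration, and the sample word is arbitrary.

```python
text = "Data"

ascii_bytes = text.encode("ascii")        # one byte per character
unicode_bytes = text.encode("utf-16-be")  # two bytes per character here

print(len(ascii_bytes), len(unicode_bytes))  # 4 8

# The ASCII codes are kept in Unicode, so "A" has code 65 in both.
print(ord("A"), list("A".encode("ascii")), list("A".encode("utf-16-be")))
```

The same word takes twice as many bytes in the 16 bit code, but characters outside ASCII, such as "é", can only be stored in the Unicode version.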
1.5 Bit mapped graphic representation
The graphic is seen as a matrix of pixels (picture elements) and the colour of each pixel is represented by a binary code.
This simple graphic of a match stick man could be stored as a series of binary numbers.
   ███      000111000
   ███      000111000
█████████   111111111
   ███      000111000
  █   █     001000100
██     ██   110000011
In black and white mode, each pixel requires a one bit code:
0 for white
1 for black
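As an illustration, the six binary numbers can be turned back into the picture with a few lines of Python. The names ROWS and render are made up for this sketch.

```python
# The match stick man, one string of bits per row of pixels.
ROWS = [
    "000111000",
    "000111000",
    "111111111",
    "000111000",
    "001000100",
    "110000011",
]


def render(rows: list[str]) -> str:
    """Draw each 1 as a black block and each 0 as a blank (white) pixel."""
    return "\n".join(row.replace("1", "█").replace("0", " ") for row in rows)


print(render(ROWS))
print(sum(len(row) for row in ROWS), "bits needed")  # 9 x 6 = 54 bits
```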
Resolution refers to the number of pixels in the width and height of the image. The more pixels there are in the image the higher the resolution. A typical 15" TFT screen could have a resolution of 1024 x 768 = 786,432 pixels.
Bit depth refers to the number of bits needed to represent the colour of each pixel. Greyscale simply means shades of grey and so each shade needs its own code.
A 2 colour image would require a 1 bit code, e.g.
0 = red
1 = green
A 16 colour image would need a 4 bit code (16 = 2^4), e.g.
0000 = red
0001 = green
0010 = blue
0011 = yellow
0100 = orange
0101 = etc.
The remaining codes (0110 to 1111) are allocated to other colours in the same way.
Increasing the number of colours that are available increases the size of the code for each colour.
No of colours available = 2^x, where x is the bit depth (no of bits in code).
Bit depth          No of colours
1                  2
4                  16
8                  256
16                 65,536
24 (true colour)   16 million
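The table follows directly from the rule that the number of colours is 2 to the power of the bit depth. A short sketch, with colours_available as an illustrative name:

```python
def colours_available(bit_depth: int) -> int:
    """Number of different colour codes for a given bit depth."""
    return 2 ** bit_depth


for bit_depth in (1, 4, 8, 16, 24):
    print(bit_depth, colours_available(bit_depth))
```

For a bit depth of 24 this gives 16,777,216, which is the exact figure behind "16 million" colours.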
Calculating memory requirements
Here is an example of how to calculate memory requirements for an image on a screen 800 x 600 using 16 million colours.
Number of pixels = 800 x 600 = 480000 pixels
Bit depth: 2^x = 16 million so bit depth = 24, i.e. you need a 24 bit code to represent the colour for each pixel.
The file size is 480000 x 24 bits = 11520000 bits
Divide by 8 to find the number of bytes: 11520000 / 8 = 1440000 bytes
Keep dividing by 1024 until you have an appropriate unit:
1440000 / 1024 = 1406.25 KB
1406.25 / 1024 = 1.37 MB (approximately 1.4 MB)
Remember that the size of an image depends on the number of pixels and the bit depth.
1. Find the number of pixels.
2. Find the bit depth (express the answer in bits).
3. Multiply the pixels by the bit depth to give an answer in bits.
4. Divide by 8 to give the answer in bytes.
5. Keep dividing by 1024 to find the answer in KB, MB or GB.
Resolution    No of colours   File size
1024 x 768    256             768 KB
800 x 600     65,536          937.5 KB
640 x 420     16              131.25 KB
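The five steps can be sketched in Python as below. The function names bitmap_size_bytes and readable are assumptions for this example, and the bit depth is passed in directly rather than worked out from the number of colours.

```python
def bitmap_size_bytes(width: int, height: int, bit_depth: int) -> float:
    """Steps 1 to 4: pixels x bit depth gives bits, then divide by 8."""
    pixels = width * height
    bits = pixels * bit_depth
    return bits / 8


def readable(size_bytes: float) -> tuple[float, str]:
    """Step 5: keep dividing by 1024 until the number is a sensible size."""
    units = ["bytes", "KB", "MB", "GB"]
    size = float(size_bytes)
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    return size, units[index]


print(readable(bitmap_size_bytes(1024, 768, 8)))  # (768.0, 'KB')
print(readable(bitmap_size_bytes(800, 600, 16)))  # (937.5, 'KB')
print(readable(bitmap_size_bytes(640, 420, 4)))   # (131.25, 'KB')
```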
More on Bit depth
Sometimes you are given the bit depth in the question, e.g. 24 bit colour. This makes the question easier.
If you are only told how many colours can be represented then unfortunately you have to calculate the bit depth using the equation:
2^x = number of colours
where x is the bit depth. Use a calculator to do this if necessary.
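If a calculator is not to hand, the equation can be solved with a short Python sketch. The function name bit_depth_for is illustrative. It finds the smallest x for which 2^x is at least the number of colours, so rounded figures such as "16 million" still give the expected answer.

```python
def bit_depth_for(colours: int) -> int:
    """Smallest bit depth x such that 2^x >= the number of colours."""
    # (colours - 1).bit_length() counts the bits needed for the largest code.
    return max(1, (colours - 1).bit_length())


for colours in (2, 16, 256, 65_536, 16_000_000):
    print(colours, bit_depth_for(colours))
```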
A higher bit depth allows more colours so the quality of photographs etc. will improve.
Disadvantage: the file size will increase.
If asked to work out how many images can be stored on a backing storage medium then remember to round down your answer, as you would want to store complete images. Here is a worked example:
How many 8.4 MB images can be stored on a 1 GB memory stick?
1. Make sure that each number is using the same units. So 1 GB = 1024 MB.
2. Divide the capacity by the size of one image: 1024 / 8.4 = 121.904...
3. Round down the answer. (Remember that you wouldn’t store a part of an image!)
You can store 121 images on a 1 GB memory stick.
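The worked example can be written as a small Python sketch. The function name images_that_fit is hypothetical. math.floor does the rounding down in step 3.

```python
import math


def images_that_fit(capacity_gb: float, image_mb: float) -> int:
    """Whole number of images that fit on the storage medium."""
    capacity_mb = capacity_gb * 1024            # step 1: same units
    return math.floor(capacity_mb / image_mb)   # steps 2 and 3


print(images_that_fit(1, 8.4))  # 121
```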
Bit Map graphics - Advantages
1. You can edit individual pixels in the image.
2. It is easy to draw freehand shapes.
Bit Map graphics - Disadvantages
1. File sizes are large as the content of every pixel has to be stored (even blank (background) pixels).
2. Resolution dependent - when a graphic is created at a particular resolution it cannot then take advantage of a higher resolution device. It becomes "blocky" if enlarged.
3. It is difficult to manipulate shapes on the screen (e.g. move, scale, rotate or layer).
1.6 Vector graphic representation
A graphic is seen as being made up of a series of objects. A mathematical description of each object is stored as a set of instructions or formulae.
A straight line can be stored as a set of two co-ordinate pairs, a line colour, thickness, pattern and layer.
A square has four co-ordinate pairs (one for each corner), line colour, thickness, pattern, fill pattern and layer.
This information allows the objects to be represented accurately.
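One way to picture a stored object is as a record of its attributes. The sketch below is an illustration only: the class name Line, the attribute types and the scale helper are assumptions, not the format used by any real drawing package.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Line:
    """A straight line described by its attributes, not by its pixels."""
    start: tuple[int, int]
    end: tuple[int, int]
    colour: str
    thickness: int
    pattern: str
    layer: int


def scale(line: Line, factor: int) -> Line:
    """Scaling only changes the co-ordinates; everything else is kept."""
    return replace(
        line,
        start=(line.start[0] * factor, line.start[1] * factor),
        end=(line.end[0] * factor, line.end[1] * factor),
    )


line = Line(start=(10, 20), end=(100, 50), colour="black",
            thickness=1, pattern="solid", layer=1)
print(scale(line, 2))
```

Because only a handful of values are stored, the object can be moved or scaled by changing numbers rather than redrawing pixels.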
Vector graphics - Advantages
1. Resolution independent - a graphic created at a particular resolution can take advantage of a higher resolution device. It will still look in proportion.
2. It is easy to manipulate shapes on the screen (e.g. move, scale, rotate or layer).
3. File sizes are generally smaller as values do not need to be held for every pixel.
4. Objects can be grouped to form larger objects that can then be manipulated as a single object.
Vector graphics - Disadvantages
1. It is difficult to represent freehand shapes as the computer needs to describe them mathematically.
2. You cannot edit individual pixels.
Bit mapped & vector graphics - File size
Comparing file size
Bit-mapped - At any given resolution and bit depth, the file size will be the same. It doesn’t matter what is actually on the screen. The content of every pixel has to be stored.
Vector - The more objects there are on the screen the bigger the file size will be.
Graphics on screen and at the printer
Bit mapped and Vector are different ways of representing graphics in RAM and on disk. It is important to remember that monitors and printers always display a graphic as a bit-map.
A vector graphic has to be converted into a bit map before it is displayed on the screen or printed out. This is called rasterising or rendering.
Bit mapped packages often have the word Paint or Photo associated with them, e.g. Adobe Photoshop.
Vector packages often contain the words Draw or Design, e.g. Corel Draw.
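Rasterising can be illustrated with a very simplified sketch: a filled rectangle, described only by its corner co-ordinates, is turned into rows of 1s and 0s. The function name rasterise_rectangle and the idea of testing every pixel against the object are assumptions made for this example. Real graphics software uses far more efficient methods.

```python
def rasterise_rectangle(left: int, top: int, right: int, bottom: int,
                        width: int, height: int) -> list[str]:
    """Convert a filled rectangle object into a bit map of the given size."""
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            # A pixel is black (1) if it lies inside the rectangle.
            inside = left <= x <= right and top <= y <= bottom
            row += "1" if inside else "0"
        rows.append(row)
    return rows


for row in rasterise_rectangle(1, 1, 3, 2, width=5, height=4):
    print(row)
```

The object needs only four numbers, but the bit map it produces stores a value for every pixel, whatever the picture contains.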