318489_01_IGCSE_CO_SE_OL_001-014.
indd Page 9 12/07/22 1:09 PM F-0250                                                               /145/HO02580/work/indd
            1.2 Text, sound and images
            1.2.1 Text
            All keyboard characters (including control codes) are represented in a
            computer using the 7-bit American Standard Code for Information Interchange
            (ASCII) or the 8-bit Extended ASCII character set. When a key is pressed
            on the keyboard, its ASCII value is looked up in a stored table. The main
            drawback of the ASCII code system is that it can’t be used to represent
            the characters of non-Western languages, such as Chinese or Japanese.
            One way round this is to use Unicode, which can use up to 4 bytes per
            character (that is, up to 32 bits per character).
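            The idea of a character code can be tried out in a few lines of Python.
            This is a minimal sketch: the helper names are hypothetical, and UTF-8 is
            assumed here as one common way of storing Unicode (the text above does not
            name a particular encoding).

```python
# Character codes: ord() gives the numeric code of a character.
# For the first 128 characters this matches the 7-bit ASCII table.
def char_code(ch: str) -> int:
    return ord(ch)

def fits_7bit_ascii(text: str) -> bool:
    """True if every character has a code in the 7-bit range 0..127."""
    return all(ord(c) < 128 for c in text)

def utf8_length(ch: str) -> int:
    """Number of bytes UTF-8 (assumed encoding) uses for one character."""
    return len(ch.encode("utf-8"))

print(char_code("a"))             # 97
print(fits_7bit_ascii("Hello"))   # True
print(utf8_length("a"))           # 1 byte
print(utf8_length("\u4e2d"))      # 3 bytes (a Chinese character)
print(utf8_length("\U0001F600"))  # 4 bytes (an emoji)
```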
            1.2.2 Sound
            Sound is analogue data. To store sound in a computer, it is necessary to
            convert the analogue data into a digital format. The digital data can then
            be played back through a loudspeaker once it has been converted back to
            electrical signals (see Chapter 3 for more details).
            To convert sound to digital, the sound waves must be sampled at regular
            time intervals. The amplitude (loudness) of each sample is stored using a
            fixed number of bits, which sets the range of values available (for
            example, 4 bits give the values 0 to 15). The greater the number of bits
            used to represent the amplitude, the greater the accuracy of the sampled
            sound. The number of bits per sample is called the sampling resolution;
            the sampling rate is the number of sound samples taken per second. The
            two diagrams show the difference. In the first diagram, only eight values
            (0 to 7, that is, 3 bits) are used to represent the amplitude, whereas
            16 values (0 to 15, that is, 4 bits) are used in the second diagram. The
            second diagram therefore has twice as many distinct values available to
            represent the same amplitude range.
            [Figure: two graphs of sound amplitude against time intervals (0 to 20).
            First graph, amplitude scale 0 to 7: this amplitude value is between 4
            and 5 (therefore, not very accurate, since the value 4 has to be taken).
            Second graph, amplitude scale 0 to 15: the same amplitude value is now
            exactly 11 (therefore, it is a much more accurate representation).]
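            The effect of sampling resolution can be sketched in Python. This is an
            illustration only: the function name, the 0.0 to 1.0 amplitude scale and
            the sample value 0.59 are assumptions, not values read from the diagrams.
            As in the first diagram, a value that falls between two levels is taken
            down to the lower one.

```python
def quantise(sample: float, bits: int) -> int:
    """Map an amplitude in the range 0.0..1.0 onto one of 2**bits levels.

    A value between two levels is taken down to the lower level.
    """
    levels = 2 ** bits
    level = int(sample * levels)
    return min(level, levels - 1)  # keep 1.0 inside the top level

sample = 0.59  # hypothetical amplitude
for bits in (3, 4):
    level = quantise(sample, bits)
    stored = level / 2 ** bits  # amplitude the stored level stands for
    print(bits, "bits -> level", level, "error", round(sample - stored, 4))
```

            With 3 bits the sample is stored as level 4 of 8; with 4 bits it is
            stored as level 9 of 16, which is closer to the original amplitude.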
            1.2.3 Representation of (bitmap) images
            Bitmap images are made up of pixels (picture elements) arranged as a
            two-dimensional matrix. Each pixel can be represented as a binary
            number, so a bitmap image is stored as a series of binary numbers:
            •   a black and white image only requires 1 bit per pixel (1 = white,
                0 = black)
            •   if each pixel is represented by 2 bits, there are 2^2 (= 4) possible
                values (00, 01, 10 and 11), so four colours or four shades of grey
                could be represented
            •   if each pixel is represented by 3 bits, there are 2^3 (= 8) possible
                values, so eight colours or eight shades of grey could be
                represented; and so on.
            The number of bits used to represent the colour of each pixel is called
            the colour depth. Image resolution refers to the number of pixels that
            make up an image; for example, an image could be made up of
            4096 × 3072 (= 12 582 912) pixels. Each pixel will be represented by a
            number of bits (for example, a colour depth of 32 bits).
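            The relationship between colour depth, number of colours and image
            resolution can be checked with a short Python sketch (the function names
            are hypothetical and chosen only for this illustration).

```python
def colours_for_depth(bits_per_pixel: int) -> int:
    """Number of distinct values a pixel can take at a given colour depth."""
    return 2 ** bits_per_pixel

def pixel_count(width: int, height: int) -> int:
    """Image resolution expressed as the total number of pixels."""
    return width * height

for depth in (1, 2, 3):
    print(depth, "bit(s) per pixel ->", colours_for_depth(depth), "values")

print(pixel_count(4096, 3072))  # 12582912
```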
            1.3 Data storage and file compression
            1.3.1 Measurement of data storage
            Recall that a bit refers to each binary digit and is the smallest unit; four
            bits make up a nibble (an old unit) and eight bits make up a byte. Memory
            size and storage size are both measured in terms of bytes:
            •   1 KiB (kibibyte) = 2^10 bytes
            •   1 MiB (mebibyte) = 2^20 bytes
            •   1 GiB (gibibyte) = 2^30 bytes
            •   1 TiB (tebibyte) = 2^40 bytes
            •   1 PiB (pebibyte) = 2^50 bytes
            •   1 EiB (exbibyte) = 2^60 bytes
            1.3.2 Calculation of file size
            The file size of an image is calculated by:
            image resolution (number of pixels) × colour depth (in bits)
            For example, a photograph is taken by a camera that uses a colour depth
            of 32 bits; the photograph is 1024 × 1080 pixels in size. We can work out
            the file size as follows:
            1024 × 1080 × 32 = 35 389 440 bits ≡ 4 423 680 bytes ≡ 4.22 MiB
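            The same calculation can be written as a short Python sketch. The
            function name is hypothetical; the figures are those of the worked
            example above.

```python
def image_file_size_bits(width: int, height: int, colour_depth_bits: int) -> int:
    """File size in bits = number of pixels x colour depth (in bits)."""
    return width * height * colour_depth_bits

size_bits = image_file_size_bits(1024, 1080, 32)
size_bytes = size_bits // 8        # 8 bits per byte
size_mib = size_bytes / 2 ** 20    # 1 MiB = 2^20 bytes

print(size_bits)           # 35389440
print(size_bytes)          # 4423680
print(round(size_mib, 2))  # 4.22
```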
            The file size of a sound file is calculated by:
            sample rate (in Hz) × sample resolution (bits) × length of sample (secs)
            For example, an audio file which is 60 minutes in length uses a sample
            rate of 44 100 Hz and a sample resolution of 16 bits. We can work out
            the file size as follows:
            44 100 × 16 × (60 × 60) = 2 540 160 000 bits ≡ 317 520 000 bytes ≡ 302.8 MiB
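            The sound calculation can be sketched in the same way. The function name
            is hypothetical; the figures are those of the worked example above.

```python
def sound_file_size_bits(sample_rate_hz: int, resolution_bits: int,
                         length_secs: int) -> int:
    """File size in bits = sample rate x sample resolution x length."""
    return sample_rate_hz * resolution_bits * length_secs

length_secs = 60 * 60  # 60 minutes expressed in seconds
size_bits = sound_file_size_bits(44_100, 16, length_secs)
size_bytes = size_bits // 8
size_mib = size_bytes / 2 ** 20

print(size_bits)           # 2540160000
print(size_bytes)          # 317520000
print(round(size_mib, 1))  # 302.8
```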
            1.3.3 Data compression
            Files are often compressed to save storage space, reduce streaming and
            downloading/uploading times, reduce bandwidth requirements and reduce
            costs (for example, when storing files using cloud storage).
            1.3.4 Lossy and lossless file compression
            Two common types of (file) compression are lossy and lossless.
            Lossy
            •   File compression algorithms eliminate unnecessary data.
            •   The original file cannot be reconstructed once it has been
                compressed.
            •   The files are smaller than those produced by lossless algorithms.
            •   Examples include MPEG and JPEG.
            Lossless
            •   Data from the original uncompressed file can be reconstructed
                following compression.
            •   No data is lost following the application of the lossless
                algorithms.
            •   Most common example is RLE.
            Lossy file compression
            Examples of lossy file compression include the following.
            MP3
            •   Reduces music file size by about 90%.
            •   Some quality of sound is lost but most is retained.
            •   Removes sounds outside human ear range.
            •   Eliminates softer sounds using perceptual music shaping.
            MP4
            •   Used to reduce multimedia file size rather than just sound (MP3).
            •   Allows movies to be streamed over the internet with reasonable
                quality.
            JPEG
            •   Reduces file size of an image, thus reducing storage requirement.
            •   Human eyes don’t detect differences in colour shades as well as
                differences in brightness.
            •   Separating pixel colour from brightness allows images to be split
                into 8 × 8 blocks; this allows certain information to be discarded
                without losing too much image quality.
            Lossless file compression
            Run length encoding (RLE) is an example of lossless compression. It
            reduces the size of a string of adjacent, identical data items by
            encoding each repeating unit as two values:
            •   the first value represents the number of identical data items
            •   the second value represents the code (such as ASCII) of the data
                item.
            Using RLE on text data
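            For example, aaaaaaaa/bbbbbbbbbb/c/d/c/d/c/d/eeeeeeee becomes:
            255 08 97 // 255 10 98 // 99 / 100 / 99 / 100 / 99 / 100 // 255 08 101
            In each group of three values:
            •   255 is a flag indicating that the two values that follow are the
                number of repeating units and the ASCII code of the repeating unit
            •   the second value (for example, 08) is the number of times the
                character is repeated
            •   the third value (for example, 97) is the ASCII code of the
                repeating unit.
            Characters that do not repeat (c and d here) are stored as their ASCII
            codes alone (99 and 100).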
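            The flag-based scheme can be sketched in Python as follows. This is one
            possible reading of the example rather than a definitive algorithm: the
            function names and the MIN_RUN threshold are assumptions, the input is
            assumed to be 7-bit ASCII (so the value 255 never occurs as a character
            code), and the test string has eight a's, ten b's, cdcdcd and eight e's.

```python
FLAG = 255   # marker value taken from the example in the text
MIN_RUN = 4  # assumed threshold: shorter runs are stored as plain codes

def rle_encode(text: str) -> list:
    """Encode runs as FLAG, count, code; other characters as their code."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        run = 1
        # Extend the run while the next character is identical (max 255).
        while i + run < len(text) and text[i + run] == ch and run < 255:
            run += 1
        if run >= MIN_RUN:
            out += [FLAG, run, ord(ch)]
        else:
            out += [ord(ch)] * run
        i += run
    return out

def rle_decode(values: list) -> str:
    """Reverse rle_encode: expand each FLAG, count, code group."""
    out = []
    i = 0
    while i < len(values):
        if values[i] == FLAG:
            count, code = values[i + 1], values[i + 2]
            out.append(chr(code) * count)
            i += 3
        else:
            out.append(chr(values[i]))
            i += 1
    return "".join(out)

sample = "a" * 8 + "b" * 10 + "cdcdcd" + "e" * 8
encoded = rle_encode(sample)
print(encoded)
print(rle_decode(encoded) == sample)  # True: no data is lost
```

            The 32 characters of the sample are stored as 15 values, and decoding
            returns the original string exactly, which is what makes RLE lossless.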
            Using RLE with images
            This example shows how the file size of a colour image can be reduced
            using RLE.
            The figure below shows an object in four colours. Each colour is made up
            of red, green and blue (RGB) according to the code on the right.
            Square colour      Red      Green      Blue
            Black                0          0         0
            White              255        255       255
            Green                0        255         0
            Red                255          0         0
            This produces the following data:
            2 0 0 0 4 0 255 0 3 0 0 0 6 255 255 255 1 0 0 0 2 0 255 0 4 255 0 0 4 0
            255 0 1 255 255 255 2 255 0 0 1 255 255 255 4 0 255 0 4 255 0 0 4 0 255
            0 4 255 255 255 2 0 255 0 1 0 0 0 2 255 255 255 2 255 0 0 2 255 255 255
            3 0 0 0 4 0 255 0 2 0 0 0
            The original image (8 × 8 square) would need three bytes per square (to
            include all three RGB values). Therefore, the uncompressed file for this
            image is:
            8 × 8 × 3 = 192 bytes.
            The RLE code has 92 values, which means the compressed file will be 92
            bytes in size. This gives a file reduction of about 52%. It should be noted
            that the file reductions in reality will not be as large as this due to other
            data which needs to be stored with the compressed file (for example, a
            file header).
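            The figures above can be checked with a short Python sketch. It assumes
            that each run is stored as four values (count, red, green, blue) and that
            each value occupies one byte; the function name is hypothetical.

```python
# The 92 RLE values from the text, one run (count, R, G, B) per group.
rle_values = [
    2, 0, 0, 0,        4, 0, 255, 0,      3, 0, 0, 0,
    6, 255, 255, 255,  1, 0, 0, 0,        2, 0, 255, 0,
    4, 255, 0, 0,      4, 0, 255, 0,      1, 255, 255, 255,
    2, 255, 0, 0,      1, 255, 255, 255,  4, 0, 255, 0,
    4, 255, 0, 0,      4, 0, 255, 0,      4, 255, 255, 255,
    2, 0, 255, 0,      1, 0, 0, 0,        2, 255, 255, 255,
    2, 255, 0, 0,      2, 255, 255, 255,  3, 0, 0, 0,
    4, 0, 255, 0,      2, 0, 0, 0,
]

def rle_decode_pixels(values: list) -> list:
    """Expand (count, R, G, B) runs into a flat list of (R, G, B) pixels."""
    pixels = []
    for i in range(0, len(values), 4):
        count, r, g, b = values[i:i + 4]
        pixels.extend([(r, g, b)] * count)
    return pixels

pixels = rle_decode_pixels(rle_values)
uncompressed_bytes = len(pixels) * 3  # three bytes (R, G, B) per square
compressed_bytes = len(rle_values)    # one byte per RLE value (assumed)
reduction_percent = 100 * (1 - compressed_bytes / uncompressed_bytes)

print(len(pixels))               # 64 squares, i.e. the 8 x 8 image
print(uncompressed_bytes)        # 192
print(compressed_bytes)          # 92
print(round(reduction_percent))  # 52
```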