
Madhusanka Liyanage: Lecture 3: Data Representation in Computer Systems

Lecture 3 of COMP 30660 focuses on data representation in computer systems, covering numerical data, character codes, and error detection techniques. It explains the basic units of data, such as bits, bytes, and words, and delves into integer and floating-point representations, including the IEEE-754 standard. The lecture also discusses character encoding schemes like ASCII and Unicode, as well as methods for data recording and transmission, highlighting the importance of error detection and correction in data integrity.


COMP 30660: Computer Architecture and Organization (CONV)

Lecture 3: Data Representation in Computer Systems
http://www.flickr.com/photos/sarahseverson/

Madhusanka Liyanage
School of Computer Science
University College Dublin, Ireland
madhusanka@ucd.ie
1
Learning Objectives

• Understand the fundamentals of numerical data
representation in digital computers.
• Gain familiarity with the most popular character codes.
• Become aware of the differences between how data is
stored in computer memory and how it is transmitted
over networks.
• Understand the concepts of error detecting and
correcting codes.

2
Data and Information

• Data can be defined as a representation of facts,
concepts, or instructions in a formalized manner, which
should be suitable for communication, interpretation, or
processing by human or electronic machine.
• Information is organized or classified data, which has
some meaningful values for the receiver.
• Information is the processed data on which decisions
and actions are based.

3
Basic Unit of Data

• Used to indicate the capacity of a standard
data storage system or communication channel.
• Units derived from
– bit
– Byte
– Nibble
– Crumb
– Word

4
Bit

• A bit is the most basic unit of data in a computer.

– It is a state of “on” or “off” in a digital circuit.
– Sometimes these states are “high” or “low”
voltage instead of “on” or “off”

5
Byte

• A byte is a group of eight bits.

– A byte is the smallest
possible addressable unit
of computer storage.
– The term, “addressable,”
means that a particular
byte can be retrieved
according to its location in
memory.

6
Nibble
• A group of four bits is called a nibble (or nybble).
– Half a byte
– Bytes, therefore, consist of two nibbles: a
“high-order/Upper nibble” and a “low-
order/lower nibble”.
– Nibble is most often used in the context of
hexadecimal number representations, since a
nibble has the same amount of information as
one hexadecimal digit.
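
The relationship between a byte, its two nibbles, and hexadecimal digits can be sketched in a few lines of Python. The function name split_nibbles is made up for this illustration; it is not part of the lecture.

```python
def split_nibbles(byte):
    """Split an 8-bit value into its (high, low) nibbles."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value that fits in one byte")
    high = byte >> 4      # upper four bits
    low = byte & 0x0F     # lower four bits
    return high, low

high, low = split_nibbles(0xA7)
print(format(high, "X"), format(low, "X"))  # prints: A 7
```

Each nibble prints as exactly one hexadecimal digit, which is why the two notions are used together so often.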

7
Crumb
• A pair of two bits or a quarter byte was called a
crumb.
– Quarter of a byte
– Often used in early 8-bit computing.

8
Word

• A word is a contiguous group of
bytes.
– Words can be any number of
bits or bytes.
– Word sizes of 16, 32, or 64
bits are most common.
– In a word-addressable
system, a word is the
smallest addressable unit of
storage.
– The number of bits in a word
is usually defined by the size
of the registers in the
computer's CPU

9
Data Representation

• Computers work with binary numbers.
• Therefore, numbers, letters, and other
symbols must be converted into their binary
equivalents.
Integers

12
Integer Representation (Recap)

• The representation of a positive integer is
quite straightforward
– but we want to represent negative numbers as
well as positive ones.
• Add a sign bit to the representation.
• For a positive number, the sign bit is set to 0, and
for a negative number, the sign bit is set to 1.
Integer Representation (Recap)

▪ An integer can be represented in fixed-point
representation.
▪ The leftmost bit is treated as the sign bit.
▪ The magnitude of the number is represented by the
rest of the bits.

14
Integer Representation (Recap)

▪ The magnitude of the number can be
represented in the following three ways:
1. Signed magnitude representation.
2. Signed 1’s complement representation.
3. Signed 2’s complement representation.
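
As a rough illustration of the three schemes, the sketch below prints the bit pattern of an integer in each of them. The 8-bit width and the function names are assumptions made for this example, and range checking is left out to keep it short.

```python
BITS = 8  # word width assumed for this illustration

def signed_magnitude(n):
    """Sign bit followed by the magnitude in the remaining bits."""
    sign = "1" if n < 0 else "0"
    return sign + format(abs(n), f"0{BITS - 1}b")

def ones_complement(n):
    """Negative values are the bitwise inverse of the positive pattern."""
    if n >= 0:
        return format(n, f"0{BITS}b")
    return format(~abs(n) & (2 ** BITS - 1), f"0{BITS}b")

def twos_complement(n):
    """Masking keeps the low BITS bits of Python's unbounded integer."""
    return format(n & (2 ** BITS - 1), f"0{BITS}b")

for convert in (signed_magnitude, ones_complement, twos_complement):
    print(convert.__name__, convert(5), convert(-5))
```

For +5 all three give 00000101. For -5 they give 10000101, 11111010, and 11111011 respectively.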
But how do we represent floating-point numbers?

16
Floating-Point Representation

• The signed magnitude, one’s
complement, and two’s
complement representations that
we have just presented deal with
integer values only.
• Without modification, these
formats are not useful in
scientific or business applications
that deal with real number
values.
• Floating-point representation
solves this problem.

17
Floating-Point: Scientific Notation
• Scientific notation is a way of expressing numbers
that are too large or too small to be conveniently
written in decimal form.
– For example:
0.125 = 1.25 × 10⁻¹
5,000,000 = 5.0 × 10⁶

18
Scientific Notation
• Scientific Notation: has a single digit to the left of the decimal point.
• Numbers written in scientific notation have three components:

19
Floating-Point Representation
• Computers use a form of scientific notation for
floating-point representation
• Computer representation of a floating-point number
consists of three fixed-size fields:
    sign | exponent | significand
• This is the standard arrangement of these fields.

20
Floating-Point Representation

• The one-bit sign field is the sign of the stored value.
• The size of the exponent field determines the range
of values that can be represented.
• The size of the significand (mantissa) determines the
precision of the representation.

21
Example:
For illustrative purposes, we use a 14-bit model with a 5-bit
exponent and an 8-bit significand.
• Example:
– Express 32₁₀ in the simplified 14-bit floating-point
model.
• We know that 32 is 2⁵. So in (binary) scientific
notation 32 = 1.0 x 2⁵.
• Using this information, we put 101 (= 5₁₀) in the
exponent field and 1 in the significand as shown.

22
Example: synonymous forms
32 = 1.0 x 2⁵ = 0.1 x 2⁶ = 0.01 x 2⁷ = 0.001 x 2⁸ = 0.0001 x 2⁹

• The illustrations shown at
the right are all equivalent
representations for 32
using our simplified model.
• Not only do these
synonymous
representations waste
space, but they can also
cause confusion.

23
Floating-Point Representation: Negative
exponents

• Another problem with our system is that we have made
no allowances for negative exponents.
• E.g., there is no way to express 0.25 = 1/4 = 1.0 x 2⁻² = 0.1 x 2⁻¹
– Notice that there is no sign in the exponent field!

24
IEEE-754 Representation
• A technical standard for floating-point arithmetic by
the Institute of Electrical and Electronics Engineers
(IEEE).
• The standard defines several interchange formats,

26
IEEE-754 Representation: How to Solve
synonymous Issue
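• To resolve the problem of synonymous forms, a rule
(normalization) is established: the first digit of the
significand must be 1. In the 14-bit model used in this
lecture, the integer part must also be zero, so every
significand has the form 0.1xxxxxxx.
• e.g. 32 = 1.0 x 2⁵ = 0.1 x 2⁶, and only 0.1 x 2⁶ is
normalized in this model.
• This results in a unique pattern for each floating-point
number.
– In the IEEE-754 standard itself, normalized numbers have
the form 1.xxx…, and the leading 1 is implied: it is not
stored, and is assumed to the left of the binary point.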

27
IEEE-754 Representation: How to
Solve negative exponents
• To provide for negative exponents, IEEE-754 uses a
biased exponent.
• A bias is a number that is approximately midway in
the range of values expressible by the exponent.
• The exponent field in IEEE-754 is filled by adding the
bias to the real exponent value.
– So we need to subtract the bias from the value in the
exponent field to determine its true value.
• Stored exponent values less than the bias represent negative
exponents, i.e. fractional numbers.
28
IEEE-754 Representation
• The IEEE-754 single-precision floating-point
standard uses a bias of 127 over its 8-bit exponent.
• The double-precision standard has a bias of 1023
over its 11-bit exponent.
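
The single-precision bias can be checked directly by packing a Python float into 32 bits and pulling the fields apart. This is a minimal sketch using the standard struct module; the helper name float32_fields is an assumption for this example. Note that IEEE-754 normalizes to 1.xxx, so 0.25 is stored as 1.0 x 2⁻² with only the fraction bits kept.

```python
import struct

def float32_fields(x):
    """Return (sign, biased exponent, fraction) of x as IEEE-754 single precision."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31                      # 1 bit
    biased_exponent = (raw >> 23) & 0xFF  # 8 bits, stored with bias 127
    fraction = raw & 0x7FFFFF             # 23 bits after the implied leading 1
    return sign, biased_exponent, fraction

sign, exp, frac = float32_fields(0.25)
print(sign, exp, exp - 127, frac)  # prints: 0 125 -2 0
```

Subtracting the bias (125 - 127) recovers the true exponent of -2.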

29
Example 1:
– Express 32₁₀ in the revised 14-bit
floating-point model with a 5-bit
exponent and an 8-bit significand. Use
16 as bias.
• We know that 32 = 1.0 x 2⁵ = 0.1 x 2⁶.
• To use our excess-16 biased exponent, we add 16 to
6, giving 22₁₀ (= 10110₂).
• Graphically: 0 | 10110 | 10000000 (sign | exponent | significand)

30
Example 2: Representation
– Express 0.0625₁₀ in the revised 14-bit
floating-point model with a 5-bit
exponent and an 8-bit significand. Use
16 as bias.
• We know that 0.0625 is 2⁻⁴. So, in (binary) scientific
notation 0.0625 = 1.0 x 2⁻⁴ = 0.1 x 2⁻³.
• To use our excess-16 biased exponent, we add
16 to -3, giving 13₁₀ (= 01101₂).

31
Example 3 (To Do): Representation
– Express -26.625₁₀ in the revised 14-bit
floating-point model with a 5-bit
exponent and an 8-bit significand. Use 16
as bias.
• We find 26.625₁₀ = 11010.101₂. Normalizing, we have:
26.625₁₀ = 0.11010101 x 2⁵.
• To use our excess-16 biased exponent, we add 16 to 5,
giving 21₁₀ (= 10101₂).
• We also need a 1 in the sign bit (for a negative
number).
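
The three worked examples can be reproduced with a short Python sketch of the revised 14-bit model (1 sign bit, 5-bit excess-16 exponent, 8-bit significand normalized to 0.1xxxxxxx). The function name encode14 is made up for this illustration, and truncating extra significand bits instead of rounding is an assumption; the lecture does not say how they should be handled.

```python
import math

BIAS = 16      # excess-16 exponent, as in the examples
EXP_BITS = 5
SIG_BITS = 8

def encode14(value):
    """Return (sign, exponent, significand) bit strings for the 14-bit model."""
    if value == 0:
        raise ValueError("zero has no normalized form in this sketch")
    sign = "1" if value < 0 else "0"
    # math.frexp gives |value| = m * 2**e with 0.5 <= m < 1, i.e. 0.1xxx in binary
    mantissa, exponent = math.frexp(abs(value))
    biased = exponent + BIAS
    if not 0 <= biased < 2 ** EXP_BITS:
        raise OverflowError("exponent does not fit in 5 bits")
    # Keep the first 8 fraction bits; any further bits are truncated
    significand = int(mantissa * 2 ** SIG_BITS)
    return sign, format(biased, "05b"), format(significand, "08b")

for x in (32, 0.0625, -26.625):
    print(x, encode14(x))
```

The exponent fields come out as 10110, 01101, and 10101, matching the biased exponents 22, 13, and 21 computed by hand.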

32
What about Characters?

33
Character Codes

34
Character Codes

• Calculations are not useful until their results can
be displayed in a manner that is meaningful to
people.
• We also need to store the results of calculations and
provide a means for data input.
• Thus, human-understandable characters must be
converted to computer-understandable bit patterns
(and vice versa) using some sort of character
encoding scheme.
• Character codes are used for this purpose.
35
Character Codes :
Binary-coded decimal (BCD)
• The earliest computer coding systems used six bits.
• Binary-coded decimal (BCD) was one of these early
codes.
• In BCD, each digit is represented by a fixed number
of bits, usually four or eight.
• It was used by IBM mainframes in the 1950s and
1960s.
• As computers have evolved, character codes have
evolved.
• Larger computer memories and storage devices
permit richer character codes.
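
To make the four-bits-per-digit form of BCD concrete, here is a minimal Python sketch. The function name to_bcd is an assumption for this example, and it covers only the numeric digits, not the six-bit character variants mentioned above.

```python
def to_bcd(number):
    """Encode each decimal digit of a non-negative integer as a 4-bit group."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    return " ".join(format(int(digit), "04b") for digit in str(number))

print(to_bcd(259))  # prints: 0010 0101 1001
```

Each decimal digit keeps its own group of bits, unlike pure binary, where 259 would be the single pattern 100000011.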

36
Character Codes : EBCDIC

• In 1964, BCD was extended to an 8-bit code,
Extended Binary-Coded Decimal Interchange
Code (EBCDIC).
• EBCDIC was one of the first widely-used computer
codes that supported upper and lowercase
alphabetic characters, in addition to special
characters, such as punctuation and control
characters.
• EBCDIC and BCD are still in use by IBM
mainframes today.
37
ASCII (American Standard Code for
Information Interchange)
• Other computer manufacturers chose the 7-bit
ASCII (American Standard Code for Information
Interchange) as a replacement for 6-bit codes.
• Until recently, ASCII was the dominant character
code outside the IBM mainframe world.
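
A quick way to see the 7-bit ASCII patterns is to let Python do the lookup. The helper name ascii_bits is made up for this sketch.

```python
def ascii_bits(text):
    """Return the 7-bit ASCII pattern of each character in text."""
    # encode("ascii") raises UnicodeEncodeError for non-ASCII characters
    return [format(code, "07b") for code in text.encode("ascii")]

print(ascii_bits("OK"))  # prints: ['1001111', '1001011']
```

The letters O and K have the decimal codes 79 and 75, which are the two patterns printed above.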

39
The ASCII Code

40
41
Unicode
Unicode

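• Many of today’s systems embrace Unicode, a
system that can encode the characters of every
language in the world. It was originally designed as a
16-bit code; its code space now runs from U+0000 to
U+10FFFF (1,114,112 code points).
• As of version 14.0, it defines 144,697 characters covering
159 modern and historic scripts, as well as symbols, emoji,
and non-visual control and formatting codes.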
• Maintained by the Unicode Consortium

43
Unicode

• The Unicode codespace
allocation is
shown at the right.
• The lowest-numbered
Unicode characters
comprise the ASCII
code.
• The highest provide for
user-defined codes.
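
The overlap between ASCII and the lowest Unicode code points can be checked with Python's built-in ord. The sketch also prints the UTF-8 bytes of each character; UTF-8 is one common Unicode encoding and is brought in here as an illustration, since the lecture does not discuss encodings. The helper name describe is an assumption.

```python
def describe(ch):
    """Return the code point label and UTF-8 bytes (hex) of one character."""
    return f"U+{ord(ch):04X}", ch.encode("utf-8").hex(" ")

for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print(describe(ch))
```

The letter A keeps its ASCII value (U+0041, one byte), while the last character has a code point above U+FFFF and so does not fit in 16 bits.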

44
Data Recording and Transmission

45
Codes for Data Recording and
Transmission
• When character codes or numeric values are stored in
computer memory, their values are unambiguous (Fixed).
• However, this is not always the case when data is stored
on magnetic disk or transmitted over a distance of more
than a few feet.
– Owing to the physical irregularities of data
storage and transmission media, bytes can
become distorted or garbled.
• Data errors are reduced by use of suitable coding
methods as well as through the use of various error-
detection techniques.
46
Codes for Data Recording
and Transmission
• To transmit data, pulses of “high” and “low” voltage
are sent across communications media.
• To store data, changes are induced in the magnetic
polarity of the recording medium.
• The period of time during which a bit is transmitted,
or the area of magnetic storage within which a bit is
stored is called a bit cell.

47
Non-Return-to-Zero (NRZ)

• The simplest data recording and transmission code
is the non-return-to-zero (NRZ) code.
• NRZ encodes 1 as “high” and 0 as “low.”
• The coding of OK (in ASCII) is shown below.

The problem with NRZ code is that long strings of
zeros and ones cause synchronization loss.
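
As a text stand-in for a waveform, the sketch below writes one letter per bit cell, H for high and L for low. Padding each ASCII character to eight bits with a leading 0 is an assumption of this example, as are the function names.

```python
def to_bits(text):
    """8-bit pattern per ASCII character (the leading 0 pad is an assumption)."""
    return "".join(format(code, "08b") for code in text.encode("ascii"))

def nrz(bits):
    """NRZ: a 1 is sent as high (H), a 0 as low (L)."""
    return "".join("H" if bit == "1" else "L" for bit in bits)

print(nrz(to_bits("OK")))  # prints: LHLLHHHHLHLLHLHH
print(nrz("00000000"))     # prints: LLLLLLLL
```

The second line shows the synchronization problem: a run of identical bits produces no transitions at all, so the receiver has nothing to align its clock to.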
48
Non-return-to-zero-invert (NRZI)

• Non-Return-to-Zero-Invert (NRZI) reduces this
synchronization loss by providing a transition (either
low-to-high or high-to-low) for each binary 1 and no
transition for binary zero (0).

Although it prevents loss of synchronization over long
strings of binary ones, NRZI coding does nothing to
prevent synchronization loss within long strings of zeros.
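
A minimal Python sketch of the NRZI rule, writing H for high and L for low, one letter per bit cell. The starting level (low) and the function name are assumptions made for this example.

```python
def nrzi(bits, start="L"):
    """NRZI: toggle the signal level on every 1, hold it on every 0."""
    level = start
    cells = []
    for bit in bits:
        if bit == "1":
            level = "H" if level == "L" else "L"
        cells.append(level)
    return "".join(cells)

print(nrzi("1111"))  # prints: HLHL
print(nrzi("0000"))  # prints: LLLL
```

A run of ones now produces a transition in every cell, but a run of zeros is still a flat line.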
49
Manchester coding

• Manchester coding (also known as phase modulation)
prevents this problem by encoding a binary one with an
“up” transition and a binary zero with a “down” transition.
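
Manchester coding can be sketched by splitting each bit cell into two halves, written here as H (high) and L (low). The function name is an assumption for this example; the polarity follows the slide (one is "up", zero is "down").

```python
def manchester(bits):
    """Manchester: 1 is a low-to-high transition, 0 is high-to-low."""
    return "".join("LH" if bit == "1" else "HL" for bit in bits)

print(manchester("10"))    # prints: LHHL
print(manchester("0000"))  # prints: HLHLHLHL
```

Every bit cell contains a transition, even in a long run of zeros or ones. The cost is that the signal may change twice per bit, so it needs roughly twice the bandwidth of NRZ.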

50
Error Detection and Correction

51
Error Detection and Correction

• It is physically impossible for any data recording or
transmission medium to be 100% perfect 100% of the
time over its entire expected useful life.
• As more bits are packed onto a square centimeter of
disk storage, and as communications transmission speeds
increase, the likelihood of error increases.
• Thus, error detection and correction is critical to
accurate data transmission, storage and retrieval.

52
Types of Error

• Single bit error
– Only one bit in the
data unit has
changed.
• Burst error
– Two or more bits
in the data unit
have changed.

53
Error detection/correction

• Error detection
– Check if any error has occurred
– Don’t care about the number of errors
– Don’t care about the positions of errors

• Error correction
– Need to know the number of errors
– Need to know the positions of errors
– More difficult

54
Error Detection

• An error-detecting code includes
only enough redundancy to allow
the receiver to deduce that an error
occurred, but not which error, and
to request a retransmission.
• Error detection uses the concept of
redundancy, which means adding
extra bits for detecting error at the
destination.
55
Redundancy

• For error detection, a
shorter group of bits may
be appended to the end
of each unit.
• This technique is called
Redundancy because the
extra bits are redundant
to the information.
• They are discarded as
soon as the accuracy of
the transmission has
been determined.

56
Error Detection Techniques

• Some popular techniques for error detection are:

– Parity check
– Checksum
– Cyclic redundancy check
– Cryptographic hash function

57
Parity check

• A check bit (parity bit) is added to each block of data.
• Two methods
– Even parity checking
– Odd parity checking
• Even parity checking
– 1 is added to the block if the data
contains an odd number of 1’s,
– 0 is added if the data contains an even
number of 1’s
– Adding the parity bit makes the total
number of 1’s in the data even, which is
why it is called even parity checking.
• Odd parity checking
– 0 is added to the block if the data
contains an odd number of 1’s,
– 1 is added if the data contains an even
number of 1’s
– Adding the parity bit makes the total
number of 1’s in the data odd, which is
why it is called odd parity checking.
• Can detect only odd numbers of errors
• Only useful for detecting errors, not for correcting them
58
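
The parity rules can be written as a short Python sketch. The function names and the choice to append the parity bit at the end of the block are assumptions for this illustration.

```python
def parity_bit(data, even=True):
    """Return the bit to append so the total count of 1s is even (or odd)."""
    bit = data.count("1") % 2   # 1 when the data has an odd number of 1s
    if not even:
        bit ^= 1                # odd parity uses the opposite bit
    return str(bit)

def parity_ok(word, even=True):
    """Check a received word (data plus parity bit)."""
    return word.count("1") % 2 == (0 if even else 1)

data = "1001111"                  # ASCII 'O', which has five 1s
word = data + parity_bit(data)    # "10011111"
print(word, parity_ok(word))      # prints: 10011111 True
print(parity_ok("00011111"))      # one bit flipped, prints: False
print(parity_ok("01011111"))      # two bits flipped, prints: True
```

The last line shows the limitation: two flipped bits cancel out, so an even number of errors goes undetected.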
Checksum
• A small data block derived
from transmitted/stored digital
data for the purpose of
detecting errors that may have
been introduced during its
transmission or storage.
• The procedure which
generates this checksum is
called a checksum function
or checksum algorithm.
• E.g. a checksum of a message
can be a modular arithmetic
sum of message code words of
a fixed word length
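
One simple instance of a modular-sum checksum, sketched in Python. The 8-bit word length, the modulus of 256, and the function names are assumptions for this example; real protocols use a variety of checksum functions.

```python
def checksum(data, modulus=256):
    """Sum the bytes of the message, keeping only the remainder mod 256."""
    return sum(data) % modulus

def verify(data, received_checksum):
    """Recompute the checksum at the receiver and compare."""
    return checksum(data) == received_checksum

sent = b"OK"
check = checksum(sent)
print(check)                  # prints: 154
print(verify(b"OK", check))   # prints: True
print(verify(b"OL", check))   # prints: False
print(verify(b"KO", check))   # prints: True
```

The last line shows a weakness of a plain sum: reordering the bytes leaves the checksum unchanged, so that kind of error is not detected.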

59
Homework

• Find out what each of the following is:

– Cyclic redundancy check
– Cryptographic hash function

60
Summary

• Understand the fundamentals of numerical data
representation in digital computers.
• Gain familiarity with the most popular character
codes.
• Become aware of the differences between how
data is stored in computer memory and how it is
transmitted over telecommunication lines.
• Understand the concepts of error detecting and
correcting codes.

61
Thank You
