Normalization
Normalization
                                  Redundanc
  Process of organizing           y
                          To
        the data          Avoid
                                   Anomal
                                   y
                                                                    Degrades
                                                                   the overall
                                                                    integrity
Normalization -                                                     of Data
Definition
• technique of organizing the data in the database
• systematic approach of decomposing tables to eliminate data
 redundancy(repetition) and undesirable characteristics like Insertion,
 Update and Deletion Anomalies
• multi-step process that puts data into tabular form, removing duplicated
 data from the relation tables
• Normalization rules divides larger tables into smaller tables and links
 them using relationships
    Purpose of Normalization
                      Ensuring data
 Eliminating          dependencies
                     make sense i.e
  redundant
                     data is logically
(useless) data           stored
                          Redundancy
                                  • Data redundancy occurs when
             SID SNAME      AGE
                                   the same piece of data is stored
             1   Ravi       18
                                   in two or more separate places
 Row level   2   Ram        19
Redundancy   3   Nithin     18    • having multiple copies of same
             4   Rahul      20     data in the database
             1   Ravi       18
                                  • This increases the size of
             2   Ram        19
                                   the database
             7   Sara       20
                            Redundancy
             SID SNAME      CID CNAME FID Faculty Name Dept
             1   Ravi       C1   DBMS   F1   John       CSE
             2   Ram        C2   JAVA   F2   Joe        IT
  Column
   level     3   Nithin     C2   JAVA   F2   Joe        IT
Redundancy   4   Rahul      C1   DBMS   F1   John       CSE
             5   Priya      C3   DAA    F3   Maria     ECE
             6   Karthick   C1   DBMS   F1   John      CSE
             7   Sara       C2   JAVA   F2   Joe        IT
            Anomaly
                Caused due to
 too much                       Poor Construction
redundanc                             of Tables
y
        Anomaly -
        Types
Insertion   Updation   Deletion
Anomaly     Anomaly    Anomaly
Insertion
Anomaly
    SID     SNAME      CID     CNAME    FID   Faculty Name    Dept
    1       Ravi       C1      DBMS     F1    John             CSE
    2       Ram        C2      JAVA     F2    Joe               IT
    3       Nithin     C2      JAVA     F2    Joe               IT          Inserting a row in
    4       Rahul      C1      DBMS     F1    John             CSE          this table makes
    5       Priya      C3      DAA      F3    Maria            ECE          to insert
    6       Karthick   C1      DBMS     F1    John             CSE          redundant data
    7       Sara       C2      JAVA     F2    Joe               IT          for every row
                8            Varun     C1       DBMS     F1          John      CSE
Updation
Anomaly
   SID     SNAME       CID    CNAME    FID   Faculty Name         Dept
   1       Ravi        C1     DBMS     F1    John                  CSE
   2       Ram         C2     JAVA     F2    Joe                    IT
   3       Nithin      C2     JAVA     F2    Joe                    IT   Updation of
   4       Rahul       C1     DBMS     F1    John                  CSE   Faculty name
   5       Priya       C3     DAA      F3    Maria                 ECE   information has to
   6       Karthick    C1     DBMS     F1    John                  CSE   be done in all
   7       Sara        C2     JAVA     F2    Joe                    IT   rows
                      C1     DBMS     F4     Steve          CSE
Deletion Anomaly
   SID   SNAME       CID    CNAME   FID    Faculty Name      Dept
                                                                            Details of
   1     Ravi        C1     DBMS    F1     John                  CSE
                                                                            students who
   2     Ram         C2     JAVA    F2     Joe                   IT
                                                                            completed the
   3     Nithin      C2     JAVA    F2     Joe                   IT
                                                                            course has to be
   4     Rahul       C1     DBMS    F1     John                  CSE
                                                                            deleted
   5     Priya       C3     DAA     F3     Maria                 ECE
   6     Karthick    C1     DBMS    F1     John                  CSE
   7     Sara        C2     JAVA    F2     Joe                   IT
                                                                                 Loss of
                                                                              related data
                                                                               when some
                                                                              other data is
            4       Rahul   C1      DBMS           F1     John        CSE
                                                                                 deleted
How Normalization
 Solves this issues?
      Student Course
         Relation
SID         SNAME      CID   CNAME   FID   Faculty Name   Dept
1           Ravi       C1    DBMS    F1    John            CSE   Decompose the tables
                                                                   into the 3 tables –
2           Ram        C2    JAVA    F2    Joe              IT
                                                                  Student, Course and
3           Nithin     C2    JAVA    F2    Joe              IT           Faculty
4           Rahul      C1    DBMS    F1    John            CSE
5           Priya      C3    DAA     F3    Maria           ECE
6           Karthick   C1    DBMS    F1    John            CSE
7           Sara       C2    JAVA    F2    Joe              IT
      Student                Course                      Faculty
SID   SNAME      CID   CID    CNAME   FID   FID   Faculty Name     Dept
1     Ravi       C1    C1     DBMS    F1    F1    John              CSE
2     Ram        C2    C2     JAVA    F2    F2    Joe                IT
3     Nithin     C2    C3     DAA     F3    F3    Maria             ECE
4     Rahul      C1
5     Priya      C3
6     Karthick   C1
7     Sara       C2
                                     Insertion anomaly
                                     Solved
    Student
SID       SNAME       CID
1         Ravi        C1
2         Ram         C2
3         Nithin      C2
4         Rahul       C1
                            Varun can be
5         Priya       C3     added in the
6         Karthick    C1    students table
7         Sara        C2
8             Varun
                                          Avoiding Updation
 Faculty
                                          anomaly
FID   Faculty Name           Dept
F1    John                    CSE
F2    Joe                      IT
F3    Maria                   ECE
                                                 Faculty
                                               Name Steve
                                               is updated
                                                with John
              F4     Steve          CSE
                                    Deletion anomaly Solved
    Student
SID       SNAME      CID
1         Ravi       C1
2         Ram        C2
3         Nithin     C2
4         Rahul      C1        Rahul
5         Priya      C3    completed the
                            course and it
6         Karthick   C1
                           can be deleted
7         Sara       C2
Say how all anomalies occur in this table
emp_id   emp_name emp_address   emp_dept
  101      Rick       Delhi      D001       Insert anomaly
  101      Rick       Delhi      D002          New employee Joins
  123     Maggie      Agra       D890          Not Assigned to any
  166     Glenn      Chennai     D900          dept emp_dept will be
  166     Glenn      Chennai     D004          null
                                              Delete anomaly
Update anomaly
                                                 company closes
  update the address of
                                              the department D890
     Rick
Normalization Types
First Normal Form
       1NF
First Normal Form (1NF)
                                2 NF
           Not in
            1NF
                        1NF
                        First
       Poor DB design
                        Step
                                                                   Every
                                                                   Table
                                                                 should be
                                                                    1NF
          First Normal Form (1NF)
4 Rules
            1. It should only have single(atomic) valued
               attributes/columns.
            2. Values stored in a column should be of the same domain
            3. All the columns in a table should have unique names.
            4. order in which data is stored, does not matter.
                            1 NF – Rule 1
    Each Column should have
         atomic values                Player_Inventory
                                      Relation
Entries like 3 shields, 5
arrows, 30 copper
coins, 7 rings violate
this rule
                     1NF – Rule 2
Column should contain
values that are of the
                                  Beatle_Height
      same type
     Do not inter – mix
   different types of values in
   any column
                        1NF – Rule 3
    Each column should
    have a unique name
                                Beatle_Height
Same name leads to
  confusion at the time of
  data retrieval
                1NF – Rule 4
Order in which data
is stored doesn’t
matter
              Using SQL query, data can
           be retrieved from table in any
           order
Example
:
            How to
          solve this ?
Example
:
           Split the
           values to
          individua
            l rows
Second Normal
       Form 2NF
          Second Normal
          Form
2 Rules
              1. The table should be in the First Normal
                Form.
              2. There should be no Partial
                 Dependency.
               Rule 1
table should be in the First Normal
   Form Already covered as 1NF
2NF – Rule 2 –   Partial
Dependency
                     Dependency ?
                                                         This is called
                                                         Dependency
                                                         or Functional
    Student
                                                         Dependency
Sid is unique which helps to fetch any data
SID             SNAME    SDEPT    SADDR       SPHONENO
1               Ravi     CSE      BANG        8900999890
2               Ram      IT       CBE         9345677788
              every other column depends on
                         the primary key
 Functional
Dependency
                   Functional Dependency
• relationship that exists between two attributes
• exists between the primary key and non-key attribute within a
 table
          Attribute 1             Uniquely Identifies             Attribute 2
  attributes of a table is said to be dependent on each other when an attribute of a
                   table uniquely identifies another attribute of the same table.
STUD_NO->STUD_NAME          STUD_NO->STUD_PHONE
       STUD_NO->STUD_NAME, STUD_NO->STUD_PHONE     Holds
                     STUD_NAME->STUD_ADDR         Do not Hold
                     Functional Dependency
Represented as
        A        B
        A Determines B or B is determined by A
        A - Determinant   B - Dependent
Eg:
      Emp_Id → Emp_Name
Functional Dependency - Types
   Trivial functional dependency
         • dependency of an attribute on a set of attributes
         • A -> B is a trivial functional dependency if B is a subset of A
E.g.,
{Student_Id, Student_Name} ->Student_Id
         • if we know the values of Student_Id and Student_Name then the
           value of Student_Id can be uniquely determined
         • Student_Id -> Student_Id & Student_Name -> Student_Name
           are trivial dependencies too
  Non trivial functional dependency
             • dependency of an attribute on a set of attributes
             • functional dependency X->Y holds true where Y is not a subset of X
E.g.,
     emp_id -> emp_name (emp_name is not a subset of emp_id)
    emp_id -> emp_address (emp_address is not a subset of emp_id)
    {emp_id, emp_name} -> emp_name [emp_name is a subset of {emp_id, emp_name}] is trivial
Completely non-trivial FD:
If an FD X->Y holds true where X intersection Y is null then this dependency is said to be a
completely non trivial function dependency.
What is Partial
 Dependency
Composite Key
– Player_ID and
                  Player_Inventory
Item_Type
Player_Inventory
                   Functional
                   Dependenc
                   y
Player_Inventory    Player_ratin
                    g
                        Player_ID
                    attribute in a table
                    depends on only a
                   part of the primary
  Partial          key and not on the
                         whole key
  Dependency
How to Remove Partial Dependency?
 Decompose the table into Player and
 Player_Inventory
      teacher_id           subject         teacher_age
         111                Maths              38
         111               Physics             38                      Its
         222               Biology             38                      in
         333               Physics             40                      1NF
         333              Chemistry            40
                                                                Candidate Keys: {teacher_id,
teacher_id         teacher_age                                  subject}
                                      teacher_id     subject
   111                  38               111          Maths
                                                                Non prime attribute: teacher_age
   111                  38               111         Physics
   222                  38               222         Biology      “no non-prime attribute is dependent on
   333                  40               333         Physics     the proper subset of any candidate key of
   333                  40               333        Chemistry                   the table
Third Normal Form
       3NF
               3NF
2 Rules   1. It should be in the Second
            Normal form.
          2. It should not have
            Transitive Dependency.
3NF - Example
 Player_Inventory
                    Add the Skill Level to the
                    Player relation
3NF - Example   Inconsistency occurs for the Beginner
                rating marked as 4 in the Skill level
I’m     What is
a      Transitive
3NF   Dependency
3NF - Example   Functional dependency exists as
                Transitive dependency
  3NF - Example                               Functional dependency exists as
                                              Transitive dependency
Some Attribute in a table depends on some non prime attribute and not
                                on prime attribute
   How to remove
Transitive Dependency?
Remove the Player_Rating from the
Player table
Boyce-Codd Normal
        Form BCNF
               BCNF
          1. It should be in the Third
2 Rules
            Normal Form.
          2. for any dependency A → B,
            A should be a super key.
BCNF
   for any dependency A → B, A should be a super
   key.
      EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE        EMP_DEPT_NO
                EMP_ID   →
Dependencie         EMP_COUNTRY
s
                EMP_DEPT →       {DEPT_TYPE,
                EMP_DEPT_NO}
 Candidate key: {EMP-ID, EMP-DEPT}
     not in BCNF because neither EMP_DEPT nor EMP_ID alone
                                  are keys
              BCNF
              A→ B
Non – prime   →
 Attribute
               Not in
               BCNF
Example1
                        Example1
Taking the Loophole definition of 2NF, this table is not in 2NF
Informal definition: 2NF says we can’t have a non-key attribute
that depends on the part of the key.
Formal definition: 2NF says we can’t have a non-prime attribute
that depends on part of a candidate key.
Example1
           Not in
           2 NF
Definitions
                      Example2
Every Attribute in the table is a prime attribute
Example2
           It is in 3NF
                      BCNF
Formal definition: With the exception of trivial functional
dependencies, every functional dependency in a table must be a
dependency on a super key.
Release_Year and _Month is not a super key.
It is not in BCNF
How to satisfy
  BCNF?
Change the Release_Month to contain the month
alone
Fourth Normal Form
        4NF
               4NF
       1. It should be in the Boyce-Codd Normal
 2
Rule     Form.
 s
       2. the table should not have any Multi-
         valued Dependency
      What is Multi-
I’m      Valued
4NF    Dependency
                    4NF
A      B is Multi-Valued Dependency
when
                     B1
            A
                    B2
                          B and C Should be independent of Each other
                           To have Multi-Valued Dependency, there
                                 should be atleast 3 columns
             4NF
Multiple Rows will solve the
problem
                       4NF
For a Table     with A,B, C columns
A       B multi-valued dependency occurs in 3 conditions
   A     B, for single value of A , more than one value of B
   exist
   Table Should have atleast 3 columns
   For this table A,B,C columns , B and C should be independent
4NF
Example
No
Relationship
                                      4NF
A functional dependency exists as a
Multivalued dependency
4NF
Fifth Normal Form
       5NF
                  5NF
       1. It should be in the Fourth
 2       Normal Form.
Rule
       2. It should not have any
 s
         Join Dependency
I’m   What is Join
5NF   Dependency
                    A      Relation with Join Dependency
 If    Join    B               C
Dependency                                              If this break down or
                                                       decomposition of table
exists, then                                           causes any deletion or
                                                              creation of
decompose                                                     entries……
 the tables         A
                   Join
                   Again
 Primary
 Key is a
Composite
   Key
Supplier    SPC      Customer
           Product
                               Customer 1
Supplier            Product
           Can
           Supply
                               Customer 2
                                Product 1
Customer            Supplier
           Buying
           from
                                Product 2
                                Supplier 1
 Product            Customer
           Used
           by
                                Supplier 2
Supplier    SPC      Customer
           Product
            Selling
            to
Supplier    SPC       Customer
           Product
Supplier     SC      Customer
           Product
           ACME Sells 72X SW
           Ford Uses 72X SW
           ACME supplies to
           Ford
ACME Sells 72X SW to Ford
We can get this only from
         this table so
   Don’t decompose
          the table
                 A
             B        C
Additional Information is
      Created Or                       Original
  Information is lost
                            Stick to    Table
                            the
Thank you