Data Cleaning (W3Schools)
Bad data could be:
• Empty cells
• Data in wrong format
• Wrong data
• Duplicates
Empty cells: NaN, NA, Null
NaN => Not a Number; NA => Not Available
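A minimal sketch of how empty cells can be spotted before cleaning. The column names and values are made up for illustration; isna() is the pandas call that marks missing cells.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: np.nan and None both count as empty cells.
df = pd.DataFrame({
    "Duration": [60, 45, np.nan],
    "Date": ["2020/12/01", None, "2020/12/03"],
})

# isna() returns True for every empty cell.
empty_mask = df.isna()

# Summing the mask gives the number of empty cells per column.
empty_counts = empty_mask.sum()
print(empty_counts)
```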
dropna() returns a new DataFrame with no empty cells: every row that
contains an empty cell is removed, and the original DataFrame is unchanged.
dropna(inplace = True) will NOT return a new DataFrame;
it removes all rows containing NULL values from the
original DataFrame.
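The difference between the two calls can be sketched as follows. The sample data is hypothetical; only the dropna() behaviour is the point.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one empty cell in each of two rows.
df = pd.DataFrame({
    "Duration": [60, np.nan, 45],
    "Calories": [409.1, 479.0, np.nan],
})

# Without inplace: a new DataFrame comes back, df itself is untouched.
new_df = df.dropna()
rows_before = len(df)

# With inplace=True: nothing is returned (None), df itself loses the rows.
result = df.dropna(inplace=True)
rows_after = len(df)

print(rows_before, rows_after, result)
```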
Replace Empty Values
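Instead of deleting rows, empty cells can be filled with a value. A small sketch using fillna(); the replacement value 130 and the sample data are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "Duration": [60, np.nan, 45],
    "Calories": [409.1, 479.0, np.nan],
})

# fillna() replaces every empty cell in the whole DataFrame with 130
# and returns a new DataFrame.
filled = df.fillna(130)
print(filled)
```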
Replace Only for Specified Columns
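To limit the replacement to one column, fillna() can be applied to that column alone. A sketch under the same assumptions (made-up data, 130 as an arbitrary fill value):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "Duration": [60, np.nan, 45],
    "Calories": [409.1, 479.0, np.nan],
})

# Fill empty cells in "Calories" only; "Duration" keeps its empty cell.
df["Calories"] = df["Calories"].fillna(130)
print(df)
```

A common variation is to fill with the column's mean, median, or mode instead of a fixed number, for example df["Calories"].fillna(df["Calories"].mean()).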
Data of Wrong Format
You have two options: remove the rows, or
convert all cells in the column into the same format.
NaT => Not a Time (the value pandas uses for an empty or unconvertible date)
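Both options can be sketched with pd.to_datetime(). The column names, the sample values, and the "%Y/%m/%d" date format are assumptions for illustration; errors="coerce" turns anything that cannot be parsed into NaT.

```python
import pandas as pd

# Hypothetical sample data: one empty date and one value that is not a date.
df = pd.DataFrame({
    "Date": ["2020/12/01", "2020/12/02", None, "not a date"],
    "Pulse": [110, 117, 103, 109],
})

# Option "convert": parse the column as dates; unparseable cells become NaT.
df["Date"] = pd.to_datetime(df["Date"], format="%Y/%m/%d", errors="coerce")

# Option "remove": drop the rows whose date is still empty (NaT).
cleaned = df.dropna(subset=["Date"])
print(cleaned)
```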
Wrong Data
"Wrong data" does not have to be "empty cells" or "wrong format", it
can just be wrong.
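A rough sketch of three common ways to handle wrong data: fix a single cell, apply a rule to a whole column, or remove the offending rows. The data, the row index, and the limit of 120 are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical sample data: a Duration of 450 is assumed to be a typo.
df = pd.DataFrame({
    "Duration": [60, 450, 45, 60],
    "Pulse": [110, 117, 103, 109],
})

# Fix one known cell by its row label and column name.
fixed = df.copy()
fixed.loc[1, "Duration"] = 45

# Apply a rule: any Duration above 120 is set to 120.
capped = df.copy()
capped.loc[capped["Duration"] > 120, "Duration"] = 120

# Remove the rows that break the rule.
removed = df[df["Duration"] <= 120]
print(removed)
```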
Duplicates
Discovering Duplicates
duplicated() returns True for every row that is a duplicate, otherwise False.
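A small sketch of finding duplicates with duplicated(), using made-up data. By default the first occurrence of a row is not flagged; only its later copies are.

```python
import pandas as pd

# Hypothetical sample data: rows 2 and 3 repeat row 0.
df = pd.DataFrame({
    "Duration": [60, 45, 60, 60],
    "Pulse": [110, 117, 110, 110],
})

# One boolean per row: True if the row repeats an earlier row.
dup_mask = df.duplicated()
print(dup_mask.tolist())  # [False, False, True, True]
```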
Removing Duplicates
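Duplicates can be removed with drop_duplicates(). A sketch on made-up data; the first occurrence of each row is kept by default.

```python
import pandas as pd

# Hypothetical sample data: rows 2 and 3 repeat row 0.
df = pd.DataFrame({
    "Duration": [60, 45, 60, 60],
    "Pulse": [110, 117, 110, 110],
})

# Returns a new DataFrame without the repeated rows; df is unchanged.
deduped = df.drop_duplicates()
print(deduped)
```

Passing inplace=True changes the original DataFrame instead of returning a new one, the same way it does for dropna().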
M.E.M