Pandas Worksheets ALL
Usage Patterns
Use pd.read_clipboard() for one-off data extractions.
Use the other pd.read_* methods in scripts for repeatable analyses.
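For example, a quick interactive check might use read_clipboard, while a script pins the data source explicitly. This is only a sketch; the file name and column name are hypothetical:
import pandas as pd

# One-off: parse whatever table is currently on the clipboard
df_quick = pd.read_clipboard()

# Repeatable: read the same file every time the script runs
df = pd.read_csv('measurements.csv', parse_dates=['timestamp'])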
Writing Data Structures to Disk
Write data structures to disk:
> s_df.to_csv(filename)
> s_df.to_excel(filename)
Write multiple DataFrames to a single Excel file:
> writer = pd.ExcelWriter(filename)
> df1.to_excel(writer, sheet_name='First')
> df2.to_excel(writer, sheet_name='Second')
> writer.save()

Reading and Writing Data Structures from and to a Database
Read, using SQLAlchemy. Supports multiple databases:
> from sqlalchemy import create_engine
> engine = create_engine(database_url)
> conn = engine.connect()
> df = pd.read_sql(query_str_or_table_name, conn)
Write:
> df.to_sql(table_name, conn)
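A minimal round trip, assuming an in-memory SQLite database (any SQLAlchemy URL works the same way); the 'people' table and its columns are only for illustration:
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite keeps the example self-contained
engine = create_engine('sqlite://')
df = pd.DataFrame({'name': ['Cary', 'Lynn'], 'age': [32, 18]})

df.to_sql('people', engine, index=False)   # write a table
back = pd.read_sql('people', engine)       # read it back by table name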
s     A Series, which maps an index to values. It is like an ordered
      dictionary, or a Numpy array with row labels and a name.
df    A DataFrame, which maps index and column labels to values. It
      is like a dictionary of Series (columns) sharing the same index,
      or like a 2D Numpy array with row and column labels.
s_df  Applies to both Series and DataFrames.
Manipulations of Pandas objects usually return copies.

Indexing and Slicing
Use these attributes on Series and DataFrames for indexing, slicing,
and assignments:
s_df.loc[ ]            Refers only to the index labels
s_df.iloc[ ]           Refers only to the integer location,
                       similar to lists or Numpy arrays
s_df.xs(key, level=L)  Select rows with label key in level L
                       of an object with MultiIndex.

Creating Series and DataFrames
Series
> pd.Series(values, index=index, name=name)
> pd.Series({'idx1' : val1, 'idx2' : val2})
Where values, index, and name are sequences or arrays.
DataFrame
> pd.DataFrame(values,
.       index=index, columns=col_names)
> pd.DataFrame({'col1' : series1_or_seq,
.       'col2' : series2_or_seq})
Where values is a sequence of sequences or a 2D array.

Masking and Boolean Indexing
Create masks with comparisons:
mask = df['X'] < 0
Or isin, for membership mask:
mask = df['X'].isin(list_of_valid_values)
Use masks for indexing:
df.loc[mask] = 0
Combine multiple masks with bitwise operators, and (&), or (|), xor (^),
and group them with parentheses:
mask = (df['X'] < 0) & (df['Y'] == 0)
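Putting creation, label/position indexing, and masking together in one sketch; the data mirrors the Cary/Lynn/Sam example and is only illustrative:
import pandas as pd

df = pd.DataFrame({'Age': [32, 18, 26], 'Gender': ['M', 'F', 'M']},
                  index=['Cary', 'Lynn', 'Sam'])

df.loc['Lynn', 'Age']      # label-based: 18
df.iloc[0, 0]              # position-based: 32

mask = (df['Age'] < 30) & (df['Gender'] == 'M')
df.loc[mask, 'Age'] = 27   # assign Age where the mask is True (Sam)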
Differences Between Pandas Objects and Numpy Arrays
When it comes to Pandas objects and Numpy arrays, aligning objects
on the index (or columns) before calculations might be the most
important difference. There are built-in methods for most common
statistical operations, such as mean or sum, and they apply across
one dimension at a time. To apply custom functions, use one of
three methods to do tablewise (pipe), row- or column-wise (apply),
or elementwise (applymap) operations.

Apply a Function to Each Value
Apply a function to each value in a Series or DataFrame:
s.apply(value_to_value) → Series
df.applymap(value_to_value) → DataFrame

Apply a Function to Each Series
Apply a function that receives a Series and returns a value or a
Series to every column:
df.apply(series_to_series) → DataFrame
To apply the function to every row (across columns), set axis=1:
df.apply(series_to_series, axis=1)

Apply a Function to a DataFrame
Apply a function that receives a DataFrame and returns a Series,
a DataFrame, or a single value:
df.pipe(df_to_series) → Series
df.pipe(df_to_df) → DataFrame
df.pipe(df_to_value) → Value
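A sketch of the three levels, using throwaway lambdas in place of the value_to_value, series_to_series, and df_to_* functions named above:
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 5, 6]})

df.applymap(lambda v: v ** 2)                  # elementwise -> DataFrame
df.apply(lambda col: col.max() - col.min())    # column-wise -> Series
df.apply(lambda row: row.sum(), axis=1)        # row-wise -> Series
df.pipe(lambda d: d / d.sum().sum())           # whole table -> DataFrame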
Rule 1: Operations between multiple Pandas objects implement
auto-alignment based on the index first. Labels present in only one
of the objects produce NaN; use add, sub, mul, and div to set a
fill value instead.

Rule 2: Mathematical operators (+ - * / exp, log, ...) apply
element by element on the values:
df + 1    df.abs()    np.log(df)
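A minimal sketch of Rule 1's alignment (the labels and values are made up):
import pandas as pd

s1 = pd.Series({'a': 1, 'b': 2})
s2 = pd.Series({'b': 4, 'c': 5})

s1 + s2                    # a: NaN, b: 6, c: NaN (unmatched labels -> NaN)
s1.add(s2, fill_value=0)   # a: 1.0, b: 6.0, c: 5.0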
Rule 3: Reduction operations (mean, std, skew, kurt, sum, prod, ...)
are applied column by column by default:
> df.sum() → Series
They operate across rows by default (axis=0, or axis='rows').
Operate across columns with axis=1 or axis='columns'.

count: Number of non-null observations
sum: Sum of values
mean: Mean of values
mad: Mean absolute deviation
median: Arithmetic median of values
min: Minimum
max: Maximum
mode: Mode
prod: Product of values
std: Bessel-corrected sample standard deviation
var: Unbiased variance
sem: Standard error of the mean
skew: Sample skewness (3rd moment)
kurt: Sample kurtosis (4th moment)
quantile: Sample quantile (value at %)
value_counts: Count of unique values

What Happens with Missing Values?
Missing values are represented by NaN (not a number) or NaT
(not a time).
• They propagate in operations across Pandas objects
  (1 + NaN → NaN).
• They are ignored in a "sensible" way in computations;
  they equal 0 in sum, they're ignored in mean, etc.
• They stay NaN with mathematical operations such as
  np.log(NaN) → NaN.
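A short sketch of Rule 3 and of NaN handling (the values are arbitrary):
import numpy as np
import pandas as pd

df = pd.DataFrame({'X': [1.0, np.nan, 3.0], 'Y': [4.0, 5.0, 6.0]})

df.sum()          # column by column: X 4.0, Y 15.0 (NaN counts as 0)
df.mean(axis=1)   # across columns; NaN is ignored in each row's mean
df['X'] + 1       # NaN propagates: [2.0, NaN, 4.0]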
With a Series, Pandas plots values against the index:
> ax = s.plot()
With a DataFrame, Pandas creates one line per column:
> ax = df.plot()
Use Matplotlib to override or add annotations:
> ax.set_xlabel('Time')
> ax.set_ylabel('Value')
> ax.set_title('Experiment A')
Pass labels if you want to override the column names and set the
legend location:
> ax.legend(labels, loc='best')
Note: When plotting the results of complex manipulations with groupby,
it's often useful to stack/unstack the resulting DataFrame to fit the
one-line-per-column assumption.

Kinds of Plots
df.plot.scatter(x, y)    df.plot.bar()    df.plot.hist()    df.plot.box()
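A minimal plotting sketch, assuming Matplotlib is installed; the column names and dates are made up:
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 2], 'B': [2, 2, 4]},
                  index=pd.date_range('2000-01-01', periods=3))

ax = df.plot()                 # one line per column
ax.set_xlabel('Time')
ax.set_ylabel('Value')
ax.legend(['Series A', 'Series B'], loc='best')
plt.show()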
Frequency Offsets
Used by date_range, period_range and resample:
• BM: Business month end
• U, us: Microseconds

Save Yourself Some Pain: Use ISO 8601 Format
To be consistent and minimize the risk of error or confusion, use ISO
8601 format:
YES > pd.to_datetime('2000-01-13')  # 13th January
Merging and Concatenating
Use on if merging on the same column in both DataFrames, otherwise
use left_on and right_on:
> pd.merge(left, right, on=col)
pd.concat "stacks" DataFrames on top of each other:
> pd.concat(list_of_dfs)

Split/Apply/Combine
Split rows into groups with groupby, apply a function to each group,
and combine the results. See the sketch below.
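A sketch of the split/apply/combine pattern; the 'Team' and 'Score' columns are hypothetical:
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'A', 'B', 'B'],
                   'Score': [1, 3, 2, 4]})

df.groupby('Team')['Score'].mean()                  # split, apply mean, combine
df.groupby('Team').agg({'Score': ['mean', 'max']})  # several aggregations at once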
MultiIndex: A Multi-Level Hierarchical Index
Often created as a result of:
> df.groupby(list_of_columns)
> df.set_index(list_of_columns)
Contiguous labels are displayed together but apply to each row. The
concept is similar to multi-level columns.
A MultiIndex allows indexing and slicing one or multiple levels at once.
Using a long-format example with a (Year, Month) MultiIndex:
long.loc[1900]                   All 1900 rows
long.loc[(1900, 'March')]        Value 2
long.xs('March', level='Month')  All March rows
Simpler than using boolean indexing, for example:
> long[long.Month == 'March']

Long to Wide Format and Back with stack() and unstack()
Pivot column level to index, i.e. "stacking the columns" (wide to long):
> df.stack()
Pivot index level to columns, "unstack the columns" (long to wide):
> df.unstack()
If there are multiple indices or column levels, use the level number or
name to stack/unstack:
> df.unstack(1) or > df.unstack('Month')
A common use case for unstacking: plotting group data vs index
after groupby:
> (df.groupby(['A', 'B'])['relevant'].mean()
.    .unstack().plot())
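A sketch of the Long example referred to above, assuming a (Year, Month) MultiIndex and arbitrary values:
import pandas as pd

long = pd.DataFrame(
    {'Value': [1, 2, 3, 4]},
    index=pd.MultiIndex.from_tuples(
        [(1900, 'January'), (1900, 'March'),
         (2000, 'January'), (2000, 'March')],
        names=['Year', 'Month']))

long.loc[1900]                          # all 1900 rows
long.xs('March', level='Month')         # all March rows
wide = long['Value'].unstack('Month')   # long -> wide: Month becomes columns
wide.stack()                            # back to long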
Pivot Tables
> pd.pivot_table(df,
.      index=cols,        keys to group by for index
.      columns=cols2,     keys to group by for columns
.      values=cols3,      columns to aggregate
.      aggfunc='mean')    what to do with repeated values
Omitting index, columns, or values will use all remaining columns
of df. You can "pivot" a table manually using groupby, stack,
and unstack.

From Wide to Long with melt
Specify which columns are identifiers (id_vars, values will be repeated
for each row) and which are "measured variables" (value_vars, will
become values in the variable column; all remaining columns by default):
> pd.melt(df, id_vars=id_cols, value_vars=value_columns)
> pd.melt(team, id_vars=['Color'],
.         value_vars=['A', 'B', 'C'],
.         var_name='Team',
.         value_name='Score')
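A round-trip sketch using a hypothetical team table with a Color column and one column per team, matching the melt call above:
import pandas as pd

team = pd.DataFrame({'Color': ['red', 'blue'],
                     'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})

# Wide -> long: one row per (Color, Team) pair
long = pd.melt(team, id_vars=['Color'],
               value_vars=['A', 'B', 'C'],
               var_name='Team', value_name='Score')

# Long -> wide again, aggregating repeated values with the mean
pd.pivot_table(long, index='Color', columns='Team',
               values='Score', aggfunc='mean')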
df.pivot() vs pd.pivot_table
df.pivot()        Does not deal with repeated values in the index.
                  It's a declarative form of stack and unstack.
pd.pivot_table()  Use if you have repeated values in the index
                  (specify the aggfunc argument).
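For example, with repeated (index, column) pairs df.pivot raises an error, while pd.pivot_table aggregates them; the data here is arbitrary:
import pandas as pd

df = pd.DataFrame({'row': ['r1', 'r1', 'r2'],
                   'col': ['c1', 'c1', 'c2'],
                   'val': [1, 3, 5]})

# df.pivot(index='row', columns='col', values='val')  # ValueError: duplicate entries
pd.pivot_table(df, index='row', columns='col',
               values='val', aggfunc='mean')          # (r1, c1) -> 2.0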