Unit5-Pandas 100 Ques-Ans
Exercise 1:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4], 'Y': [5, 6, 7, 8]}
df = pd.DataFrame(data)
print(df)
Output:
X Y
0 1 5
1 2 6
2 3 7
3 4 8
Exercise 2:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4], 'Y': [5, 6, 7, 8]}
df = pd.DataFrame(data)
print(df.head(3))
Output:
X Y
0 1 5
1 2 6
2 3 7
Exercise 3:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4], 'Y': [5, 6, 7, 8]}
df = pd.DataFrame(data)
print(df['X'])
Output:
0 1
1 2
2 3
3 4
Name: X, dtype: int64
Exercise 4:
Solution:
import pandas as pd
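data = {'X': [1, 2, 3, 4], 'Y': [5, 6, 7, 8]}
df = pd.DataFrame(data)
# Assumed filter condition, consistent with the output below
filtered_df = df[df['X'] > 2]
print(filtered_df)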
Output:
X Y
2 3 7
3 4 8
Exercise 5:
Solution:
import pandas as pd
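data = {'X': [1, 2, 3, 4], 'Y': [5, 6, 7, 8]}
df = pd.DataFrame(data)
# New column Z as the sum of X and Y, matching the output below
df['Z'] = df['X'] + df['Y']
print(df)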
Output:
X Y Z
0 1 5 6
1 2 6 8
2 3 7 10
3 4 8 12
Exercise 6:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4], 'Y': [5, 6, 7, 8], 'Z': [9, 10, 11, 12]}
df = pd.DataFrame(data)
df.drop(columns=['Z'], inplace=True)
print(df)
Output:
X Y
0 1 5
1 2 6
2 3 7
3 4 8
Exercise 7:
Solution:
import pandas as pd
data = {'X': [4, 3, 2, 1], 'Y': [8, 7, 6, 5]}
df = pd.DataFrame(data)
df.sort_values(by='X', inplace=True)
print(df)
Output:
X Y
3 1 5
2 2 6
1 3 7
0 4 8
Exercise 8:
Solution:
import pandas as pd
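# Assumed sample data, consistent with the group means below
data = {'X': [1, 1, 2, 2], 'Y': [5, 7, 6, 8]}
df = pd.DataFrame(data)
grouped_df = df.groupby('X').mean()
print(grouped_df)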
Output:
Y
X
1 6.0
2 7.0
Exercise 9:
Solution:
import pandas as pd
data = {'X': [1, 2, None, 4], 'Y': [5, None, 7, 8]}
df = pd.DataFrame(data)
df.fillna(0, inplace=True)
print(df)
Output:
X Y
0 1.0 5.0
1 2.0 0.0
2 0.0 7.0
3 4.0 8.0
Exercise 10:
Solution:
import pandas as pd
data = {'X': ['2020-01-01', '2020-01-02', '2020-01-03']}
df = pd.DataFrame(data)
df['X'] = pd.to_datetime(df['X'])
print(df)
Output:
X
0 2020-01-01
1 2020-01-02
2 2020-01-03
Exercise 11:
Solution:
import pandas as pd
# Assumed sample data, consistent with the output below
data = {'col1': [1, 2, 3], 'col2': [4, 5, 6]}
df = pd.DataFrame(data)
print(df)
Output:
col1 col2
0 1 4
1 2 5
2 3 6
Exercise 12:
Solution:
import pandas as pd
data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
print(df.sum())
Output:
X 6
Y 15
dtype: int64
Exercise 13:
Solution:
import pandas as pd
data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
print(df.mean(axis=1))
Output:
0 2.5
1 3.5
2 4.5
dtype: float64
Exercise 14:
Solution:
import pandas as pd
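# Assumed inputs: two single-column frames joined side by side
data1 = {'X': [1, 2, 3]}
data2 = {'Y': [4, 5, 6]}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
concatenated_df = pd.concat([df1, df2], axis=1)
print(concatenated_df)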
Output:
X Y
0 1 4
1 2 5
2 3 6
Exercise 15:
Solution:
import pandas as pd
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
print(merged_df)
Output:
Exercise 16:
Solution:
import pandas as pd
data = {'X': ['foo', 'foo', 'bar', 'bar'], 'Y': ['one', 'two', 'one', 'two'], 'Z': [1, 2, 3, 4]}
df = pd.DataFrame(data)
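# Mean of Z for each (X, Y) pair
pivot_table = df.pivot_table(values='Z', index='X', columns='Y', aggfunc='mean')
print(pivot_table)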
Output:
Y one two
X
bar 3.0 4.0
foo 1.0 2.0
Exercise 17:
Solution:
import pandas as pd
data = {'X': ['foo', 'foo', 'bar', 'bar'], 'Y': ['one', 'two', 'one', 'two'], 'Z': [1, 2, 3, 4]}
df = pd.DataFrame(data)
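# Reshape from long to wide: one row per X, one column per Y
wide_df = df.pivot(index='X', columns='Y', values='Z')
print(wide_df)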
Output:
Y one two
X
bar 3 4
foo 1 2
Exercise 18:
Solution:
import pandas as pd
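# Assumed sample data: Y falls as X rises, giving a correlation of -1
data = {'X': [1, 2, 3, 4], 'Y': [4, 3, 2, 1]}
df = pd.DataFrame(data)
correlation = df.corr()
print(correlation)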
Output:
X Y
X 1.0 -1.0
Y -1.0 1.0
Exercise 19:
Solution:
import pandas as pd
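data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
# Assumed step: print the index and both values of each row
for index, row in df.iterrows():
    print(index, row['X'], row['Y'])
Output:
0 1 4
1 2 5
2 3 6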
Exercise 20:
Solution:
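import pandas as pd
# Assumed sample data, consistent with the output below
data = {'X': [2, 4, 6], 'Y': [8, 10, 12]}
df = pd.DataFrame(data)
print(df)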
Output:
X Y
0 2 8
1 4 10
2 6 12
Exercise 21:
Solution:
import pandas as pd
data = {'X': [1, 3], 'Y': [2, 4]}
df = pd.DataFrame(data)
print(df)
Output:
X Y
0 1 2
1 3 4
Exercise 22:
Solution:
import pandas as pd
data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
print(df)
Output:
X Y
0 1 4
1 2 5
2 3 6
Exercise 23:
Solution:
import pandas as pd
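data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
# Assumed selection, consistent with the single row shown below
print(df[df['X'] > 2])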
Output:
X Y
2 3 6
Exercise 24:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4]}
df = pd.DataFrame(data)
df['Cumulative_Sum'] = df['X'].cumsum()
print(df)
Output:
X Cumulative_Sum
0 1 1
1 2 3
2 3 6
3 4 10
Exercise 25:
Solution:
import pandas as pd
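# Assumed sample data: rows 2 and 3 each contain a missing value
data = {'X': [1, 2, None, 4], 'Y': [4, 5, 6, None]}
df = pd.DataFrame(data)
df.dropna(inplace=True)
print(df)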
Output:
X Y
0 1.0 4.0
1 2.0 5.0
Exercise 26:
Solution:
import pandas as pd
# Assumed sample data, consistent with the output below
data = {'X': [1, 2, 3, 4], 'Y': [5, 6, 0, 0]}
df = pd.DataFrame(data)
print(df)
Output:
X Y
0 1 5
1 2 6
2 3 0
3 4 0
Exercise 27:
Solution:
import pandas as pd
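index = pd.MultiIndex.from_tuples([('X', 1), ('X', 2), ('Y', 1), ('Y', 2)], names=['Group', 'Number'])
data = {'Value': [10, 20, 30, 40]}
df = pd.DataFrame(data, index=index)
print(df)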
Output:
Value
Group Number
X 1 10
2 20
Y 1 30
2 40
Exercise 28:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
df['Rolling_Mean'] = df['X'].rolling(window=3).mean()
print(df)
Output:
X Rolling_Mean
0 1 NaN
1 2 NaN
2 3 2.0
3 4 3.0
4 5 4.0
5 6 5.0
Exercise 29:
Solution:
import pandas as pd
data = {'X': [1, 3, 5], 'Y': [2, 4, 6]}
df = pd.DataFrame(data)
print(df)
Output:
X Y
0 1 2
1 3 4
2 5 6
Exercise 30:
Solution:
import pandas as pd
data = {'X': [1, 2, 5], 'Y': [3, 4, 6]}
df = pd.DataFrame(data)
print(df)
Output:
X Y
0 1 3
1 2 4
2 5 6
Exercise 31:
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
print(df)
Output:
X Y Z
0 0.688292 0.950264 0.665916
1 0.497719 0.840536 0.923938
2 0.285218 0.091178 0.722034
3 0.037824 0.248689 0.584696
Exercise 32:
Solution:
import pandas as pd
data = {'X': [3, 1, 4, 1], 'Y': [2, 3, 1, 4]}
df = pd.DataFrame(data)
df['Rank'] = df['X'].rank()
print(df)
Output:
X Y Rank
0 3 2 3.0
1 1 3 1.5
2 4 1 4.0
3 1 4 1.5
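The 1.5 ranks above come from the default method='average' of rank(), which gives tied values the mean of the positions they occupy. A small sketch of how the other tie-breaking methods treat the same column; the `ranks` frame and its column names are illustrative and not part of the original solution:

```python
import pandas as pd

s = pd.Series([3, 1, 4, 1], name='X')

# Compare tie-breaking strategies side by side.
ranks = pd.DataFrame({
    'average': s.rank(method='average'),  # ties share the mean rank
    'min': s.rank(method='min'),          # ties share the lowest rank
    'dense': s.rank(method='dense'),      # like 'min', but without gaps afterwards
    'first': s.rank(method='first'),      # ties broken by order of appearance
})
print(ranks)
```

Which method fits depends on the use case: 'dense' is convenient for leaderboards, while 'first' guarantees unique ranks.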
Exercise 33:
Solution:
import pandas as pd
# Assumed sample data: numbers stored as strings
data = {'X': ['1', '2', '3']}
df = pd.DataFrame(data)
df['X'] = df['X'].astype(int)
print(df)
Output:
X
0 1
1 2
2 3
Exercise 34:
Solution:
import pandas as pd
data = {'X': ['foo', 'bar', 'baz', 'qux']}
df = pd.DataFrame(data)
filtered_df = df[df['X'].str.contains('ba')]
print(filtered_df)
Output:
X
1 bar
2 baz
Exercise 35:
Solution:
import pandas as pd
print(df)
Output:
Exercise 36:
Transpose a DataFrame.
Solution:
import pandas as pd
data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
transposed_df = df.T
print(transposed_df)
Output:
0 1 2
X 1 2 3
Y 4 5 6
Exercise 37:
Solution:
import pandas as pd
data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
df.set_index('X', inplace=True)
print(df)
Output:
Y
X
1 4
2 5
3 6
Exercise 38:
Reset the index of a DataFrame.
Solution:
import pandas as pd
data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
df.set_index('X', inplace=True)
df.reset_index(inplace=True)
print(df)
Output:
X Y
0 1 4
1 2 5
2 3 6
Exercise 39:
Solution:
import pandas as pd
data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
df = df.add_prefix('col_')
print(df)
Output:
col_X col_Y
0 1 4
1 2 5
2 3 6
Exercise 40:
Solution:
import pandas as pd
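# Assumed: ten daily rows, with X counting up from 1
date_range = pd.date_range(start='2020-01-01', periods=10, freq='D')
data = {'X': range(1, 11)}
df = pd.DataFrame(data, index=date_range)
filtered_df = df['2020-01-03':'2020-01-05']
print(filtered_df)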
Output:
X
2020-01-03 3
2020-01-04 4
2020-01-05 5
Exercise 41:
Solution:
import pandas as pd
# Assumed sample data: row 2 duplicates row 1
data = {'X': [1, 2, 2, 3], 'Y': [4, 5, 5, 6]}
df = pd.DataFrame(data)
df.drop_duplicates(inplace=True)
print(df)
Output:
X Y
0 1 4
1 2 5
3 3 6
Exercise 42:
Solution:
import pandas as pd
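index = pd.MultiIndex.from_tuples([('X', 1), ('X', 2), ('Y', 1), ('Y', 2)], names=['Group', 'Number'])
data = {'Value': [10, 20, 30, 40]}
df = pd.DataFrame(data, index=index)
print(df)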
Output:
Value
Group Number
X 1 10
2 20
Y 1 30
2 40
Exercise 43:
Solution:
import pandas as pd
data = {'X': [1, 3, 6, 10]}
df = pd.DataFrame(data)
df['Difference'] = df['X'].diff()
print(df)
Output:
X Difference
0 1 NaN
1 3 2.0
2 6 3.0
3 10 4.0
Exercise 44:
Solution:
import pandas as pd
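columns = pd.MultiIndex.from_product([['X', 'Y'], ['C1', 'C2']], names=['Group', 'Type'])
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
df = pd.DataFrame(data, columns=columns)
print(df)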
Output:
Group X Y
Type C1 C2 C1 C2
0 1 2 3 4
1 5 6 7 8
2 9 10 11 12
Exercise 45:
Filter rows based on the length of strings in a column.
Solution:
import pandas as pd
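# Assumed sample data and threshold: no string is longer than 3 characters
data = {'X': ['a', 'bb', 'ccc']}
df = pd.DataFrame(data)
filtered_df = df[df['X'].str.len() > 3]
print(filtered_df)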
Output:
Empty DataFrame
Columns: [X]
Index: []
Exercise 46:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4]}
df = pd.DataFrame(data)
df['Pct_Change'] = df['X'].pct_change()
print(df)
Output:
X Pct_Change
0 1 NaN
1 2 1.000000
2 3 0.500000
3 4 0.333333
Exercise 47:
Solution:
import pandas as pd
data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
print(df)
Output:
X Y
0 1 4
1 2 5
2 3 6
Exercise 48:
Solution:
import pandas as pd
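data = {'X': [1, 2, 3, 4], 'Y': [5, 6, 7, 8]}
df = pd.DataFrame(data)
# Assumed filter conditions, consistent with the output below
filtered_df = df[(df['X'] > 1) & (df['X'] < 4)]
print(filtered_df)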
Output:
X Y
1 2 6
2 3 7
Exercise 49:
Solution:
import pandas as pd
import numpy as np
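data = {'X': [1, 2, 3, 4], 'Y': [4, 5, 6, 7]}
df = pd.DataFrame(data)
# Population standard deviation (ddof=0) reproduces the values below
df['zscore_A'] = (df['X'] - df['X'].mean()) / df['X'].std(ddof=0)
print(df)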
Output:
X Y zscore_A
0 1 4 -1.341641
1 2 5 -0.447214
2 3 6 0.447214
3 4 7 1.341641
Exercise 50:
Solution:
import pandas as pd
import numpy as np
print(df.describe())
Output:
X Y Z
count 5.000000 5.000000 5.000000
mean 60.600000 71.800000 42.600000
std 38.435661 13.971399 12.218838
min 5.000000 53.000000 28.000000
25% 40.000000 64.000000 34.000000
50% 69.000000 72.000000 41.000000
75% 91.000000 82.000000 55.000000
max 98.000000 88.000000 55.000000
Exercise 51:
Solution:
import pandas as pd
data = {'X': [3, 1, 4, 1], 'Y': [2, 3, 1, 4]}
df = pd.DataFrame(data)
df['Rank_A'] = df['X'].rank()
df['Rank_B'] = df['Y'].rank()
print(df)
Output:
X Y Rank_A Rank_B
0 3 2 3.0 2.0
1 1 3 1.5 3.0
2 4 1 4.0 1.0
3 1 4 1.5 4.0
Exercise 52:
Solution:
import pandas as pd
data = {'X': ['foo', 'bar', 'baz', 'qux']}
df = pd.DataFrame(data)
filtered_df = df[df['X'].str.contains('ba|qu')]
print(filtered_df)
Output:
X
1 bar
2 baz
3 qux
Exercise 53:
Solution:
import pandas as pd
data = {'X': ['foo', 'bar', 'baz', 'qux']}
df = pd.DataFrame(data)
filtered_df = df[df['X'].str.contains('ba|qu')]
print(filtered_df)
Output:
X
1 bar
2 baz
3 qux
Exercise 54:
Create a DataFrame and calculate the kurtosis.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
print(df.kurt())
Output:
X 2.958407
Y -2.639654
Z 2.704430
dtype: float64
Exercise 55:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4]}
df = pd.DataFrame(data)
df['Cumulative_Product'] = df['X'].cumprod()
print(df)
Output:
X Cumulative_Product
0 1 1
1 2 2
2 3 6
3 4 24
Exercise 56:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
df['Rolling_Std'] = df['X'].rolling(window=3).std()
print(df)
Output:
X Rolling_Std
0 1 NaN
1 2 NaN
2 3 1.0
3 4 1.0
4 5 1.0
5 6 1.0
Exercise 57:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
df['Expanding_Mean'] = df['X'].expanding().mean()
print(df)
Output:
X Expanding_Mean
0 1 1.0
1 2 1.5
2 3 2.0
3 4 2.5
4 5 3.0
5 6 3.5
Exercise 58:
Create a DataFrame with random values and calculate the covariance matrix.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
print(df.cov())
Output:
X Y Z
X 0.054079 0.007398 -0.031403
Y 0.007398 0.053211 -0.020480
Z -0.031403 -0.020480 0.048057
Exercise 59:
Create a DataFrame with random values and calculate the correlation matrix.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
print(df.corr())
Output:
X Y Z
X 1.000000 -0.258187 0.541044
Y -0.258187 1.000000 -0.432419
Z 0.541044 -0.432419 1.000000
Exercise 60:
Create a DataFrame and calculate the rolling correlation between two columns.
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4, 5, 6], 'Y': [6, 5, 4, 3, 2, 1]}
df = pd.DataFrame(data)
df['Rolling_Corr'] = df['X'].rolling(window=3).corr(df['Y'])
print(df)
Output:
X Y Rolling_Corr
0 1 6 NaN
1 2 5 NaN
2 3 4 -1.0
3 4 3 -1.0
4 5 2 -1.0
5 6 1 -1.0
Exercise 61:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
df['Expanding_Var'] = df['X'].expanding().var()
print(df)
Output:
X Expanding_Var
0 1 NaN
1 2 0.500000
2 3 1.000000
3 4 1.666667
4 5 2.500000
5 6 3.500000
Exercise 62:
Solution:
import pandas as pd
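# 100 daily rows with X = 0..99 reproduce the monthly sums below
date_range = pd.date_range(start='2020-01-01', periods=100, freq='D')
data = {'X': range(100)}
df = pd.DataFrame(data, index=date_range)
# On pandas 2.2+, the 'M' alias is deprecated in favour of 'ME'
monthly_df = df.resample('M').sum()
print(monthly_df)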
Output:
X
2020-01-31 465
2020-02-29 1305
2020-03-31 2325
2020-04-30 855
Exercise 63:
Solution:
import pandas as pd
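data = {'X': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
# Smoothing factor 0.5 without adjustment reproduces the values below
df['EMA'] = df['X'].ewm(alpha=0.5, adjust=False).mean()
print(df)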
Output:
X EMA
0 1 1.00000
1 2 1.50000
2 3 2.25000
3 4 3.12500
4 5 4.06250
5 6 5.03125
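The EMA values above are consistent with the recursive form of exponential smoothing, y_t = alpha * x_t + (1 - alpha) * y_(t-1), with alpha = 0.5 and y_0 = x_0. A minimal sketch checking that recursion against ewm(alpha=0.5, adjust=False); the smoothing factor is inferred from the printed numbers and is an assumption, as are the names `manual` and `ema`:

```python
import pandas as pd

x = pd.Series([1, 2, 3, 4, 5, 6], dtype=float)
alpha = 0.5  # assumed smoothing factor, inferred from the output

# Hand-rolled recursion: start at the first value, then blend each
# new observation with the previous smoothed value.
manual = [x.iloc[0]]
for value in x.iloc[1:]:
    manual.append(alpha * value + (1 - alpha) * manual[-1])

# pandas equivalent of the same recursion
ema = x.ewm(alpha=alpha, adjust=False).mean()
print(pd.DataFrame({'X': x, 'manual': manual, 'ewm': ema}))
```

With adjust=True (the pandas default) the early values are weighted differently, so the first few rows would not match this recursion.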
Exercise 64:
Solution:
import pandas as pd
import numpy as np
print(df.mode())
Output:
X Y Z
0 2 1.0 2.0
1 3 3.0 7.0
2 5 NaN NaN
3 6 NaN NaN
4 9 NaN NaN
Exercise 65:
Solution:
import pandas as pd
import numpy as np
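data = {'X': [1, 2, 3, 4], 'Y': [4, 5, 6, 7]}
df = pd.DataFrame(data)
# Population standard deviation (ddof=0) reproduces the values below
df['zscore_A'] = (df['X'] - df['X'].mean()) / df['X'].std(ddof=0)
df['zscore_B'] = (df['Y'] - df['Y'].mean()) / df['Y'].std(ddof=0)
print(df)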
Output:
X Y zscore_A zscore_B
0 1 4 -1.341641 -1.341641
1 2 5 -0.447214 -0.447214
2 3 6 0.447214 0.447214
3 4 7 1.341641 1.341641
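The z-scores of about ±1.34 above are what the population standard deviation (ddof=0, NumPy's default) produces; the std() method in pandas defaults to the sample version (ddof=1), which would give about ±1.16 for the same column. A short sketch of the difference, with illustrative column names that are not part of the original solution:

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, 4]})
centered = df['X'] - df['X'].mean()

# Population standard deviation (divide by n), matching the output above.
df['z_population'] = centered / df['X'].std(ddof=0)
# Sample standard deviation (divide by n - 1), the pandas default.
df['z_sample'] = centered / df['X'].std()
print(df)
```

The two conventions converge as the number of rows grows, but on a four-row frame the gap is clearly visible.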
Exercise 66:
Create a DataFrame with random values and calculate the median.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
print(df.median())
Output:
X 0.787042
Y 0.477837
Z 0.696911
dtype: float64
Exercise 67:
Solution:
import pandas as pd
data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
df = df.apply(lambda x: x + 1)
print(df)
Output:
X Y
0 2 5
1 3 6
2 4 7
Exercise 68:
Create a DataFrame with hierarchical index and calculate the mean for each group.
Solution:
import pandas as pd
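index = pd.MultiIndex.from_tuples([('X', 1), ('X', 2), ('Y', 1), ('Y', 2)], names=['Group', 'Number'])
data = {'Value': [10, 20, 30, 40]}
df = pd.DataFrame(data, index=index)
grouped_df = df.groupby('Group').mean()
print(grouped_df)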
Output:
Value
Group
X 15.0
Y 35.0
Exercise 69:
Create a DataFrame and calculate the percentage of missing values in each column.
Solution:
import pandas as pd
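# Assumed sample data: one missing value out of four in each column
data = {'X': [1, 2, None, 4], 'Y': [5, None, 7, 8]}
df = pd.DataFrame(data)
missing_percentage = df.isnull().mean() * 100
print(missing_percentage)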
Output:
X 25.0
Y 25.0
dtype: float64
Exercise 70:
Solution:
import pandas as pd
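data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
# Row-wise sum of X and Y, matching the output below
df['Sum'] = df['X'] + df['Y']
print(df)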
Output:
X Y Sum
0 1 4 5
1 2 5 7
2 3 6 9
Exercise 71:
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
print(df.quantile([0.25, 0.5, 0.75]))
Output:
X Y Z
0.25 0.174265 0.184036 0.520573
0.50 0.468040 0.315593 0.644571
0.75 0.767870 0.436426 0.771297
Exercise 72:
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
Q1 = df.quantile(0.25)
Q3 = df.quantile(0.75)
IQR = Q3 - Q1
print(IQR)
Output:
X 0.354244
Y 0.329573
Z 0.245520
dtype: float64
Exercise 73:
Create a DataFrame with datetime index and calculate the rolling mean.
Solution:
import pandas as pd
date_range = pd.date_range(start='2020-01-01', periods=10, freq='D')
data = {'X': range(10)}
df = pd.DataFrame(data, index=date_range)
df['Rolling_Mean'] = df['X'].rolling(window=3).mean()
print(df)
Output:
X Rolling_Mean
2020-01-01 0 NaN
2020-01-02 1 NaN
2020-01-03 2 1.0
2020-01-04 3 2.0
2020-01-05 4 3.0
2020-01-06 5 4.0
2020-01-07 6 5.0
2020-01-08 7 6.0
2020-01-09 8 7.0
2020-01-10 9 8.0
Exercise 74:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 2, 1]}
df = pd.DataFrame(data)
df['Cumulative_Max'] = df['X'].cummax()
print(df)
Output:
X Cumulative_Max
0 1 1
1 2 2
2 3 3
3 2 3
4 1 3
Exercise 75:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 2, 1]}
df = pd.DataFrame(data)
df['Cumulative_Min'] = df['X'].cummin()
print(df)
Output:
X Cumulative_Min
0 1 1
1 2 1
2 3 1
3 2 1
4 1 1
Exercise 76:
Create a DataFrame with random values and calculate the cumulative variance.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(10, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
df['Cumulative_Var'] = df['X'].expanding().var()
print(df)
Output:
X Y Z Cumulative_Var
0 0.315669 0.900791 0.404858 NaN
1 0.462000 0.463257 0.922495 0.010706
2 0.328968 0.200027 0.967625 0.006548
3 0.630370 0.992849 0.231884 0.021460
4 0.574397 0.968600 0.926893 0.020023
5 0.204077 0.889864 0.589022 0.027130
6 0.386806 0.630882 0.242157 0.022759
7 0.319831 0.935747 0.829739 0.020630
8 0.786435 0.377739 0.879458 0.034407
9 0.523467 0.077937 0.764476 0.031194
Exercise 77:
Solution:
import pandas as pd
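# Create a DataFrame
data = {'X': [1, 2, 3], 'Y': [4, 5, 6]}
df = pd.DataFrame(data)
# Custom function applied to each column
def custom_function(x):
    return x * 2
df = df.apply(custom_function)
print(df)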
Output:
X Y
0 2 8
1 4 10
2 6 12
Exercise 78:
Create a DataFrame with random values and calculate the z-score for each element.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
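df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
# Column-wise z-score using the sample standard deviation (pandas default)
df = (df - df.mean()) / df.std()
print(df)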
Output:
X Y Z
0 1.027393 0.656858 1.032853
1 0.674079 -1.277904 -0.220065
2 -0.996641 -0.298841 0.475217
3 -0.704831 0.919887 -1.288005
Exercise 79:
Create a DataFrame and calculate the cumulative sum for each group.
Solution:
import pandas as pd
data = {'X': ['foo', 'bar', 'foo', 'bar'], 'Y': [1, 2, 3, 4]}
df = pd.DataFrame(data)
df['Cumulative_Sum'] = df.groupby('X')['Y'].cumsum()
print(df)
Output:
X Y Cumulative_Sum
0 foo 1 1
1 bar 2 2
2 foo 3 4
3 bar 4 6
Exercise 80:
Create a DataFrame with random values and calculate the rank for each element.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
df = df.rank()
print(df)
Output:
X Y Z
0 4.0 3.0 3.0
1 3.0 2.0 2.0
2 1.0 4.0 1.0
3 2.0 1.0 4.0
Exercise 81:
Create a DataFrame and calculate the cumulative product for each group.
Solution:
import pandas as pd
data = {'X': ['foo', 'bar', 'foo', 'bar'], 'Y': [1, 2, 3, 4]}
df = pd.DataFrame(data)
df['Cumulative_Product'] = df.groupby('X')['Y'].cumprod()
print(df)
Output:
X Y Cumulative_Product
0 foo 1 1
1 bar 2 2
2 foo 3 3
3 bar 4 8
Exercise 82:
Create a DataFrame with random values and calculate the expanding sum.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
df['Expanding_Sum'] = df['X'].expanding().sum()
print(df)
Output:
X Y Z Expanding_Sum
0 0.815750 0.062819 0.699743 0.815750
1 0.128772 0.843222 0.411903 0.944522
2 0.857516 0.219424 0.234460 1.802038
3 0.011010 0.774375 0.259412 1.813048
Exercise 83:
Create a DataFrame and calculate the expanding minimum for each group.
Solution:
import pandas as pd
data = {'X': ['foo', 'bar', 'foo', 'bar'], 'Y': [1, 2, 3, 4]}
df = pd.DataFrame(data)
df['Expanding_Min'] = df.groupby('X')['Y'].expanding().min().reset_index(level=0, drop=True)
print(df)
Output:
X Y Expanding_Min
0 foo 1 1.0
1 bar 2 2.0
2 foo 3 1.0
3 bar 4 2.0
Exercise 84:
Create a DataFrame with random values and calculate the expanding maximum for
each group.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
df['Expanding_Max'] = df.groupby('X')['Y'].expanding().max().reset_index(level=0, drop=True)
print(df)
Output:
X Y Z Expanding_Max
0 0.751392 0.015856 0.313990 0.015856
1 0.812436 0.701808 0.069307 0.701808
2 0.148614 0.838726 0.290646 0.838726
3 0.764419 0.586510 0.470466 0.586510
Exercise 85:
Create a DataFrame and calculate the expanding variance for each group.
Solution:
import pandas as pd
data = {'X': ['foo', 'bar', 'foo', 'bar'], 'Y': [1, 2, 3, 4]}
df = pd.DataFrame(data)
df['Expanding_Var'] = df.groupby('X')['Y'].expanding().var().reset_index(level=0, drop=True)
print(df)
Output:
X Y Expanding_Var
0 foo 1 NaN
1 bar 2 NaN
2 foo 3 2.0
3 bar 4 2.0
Exercise 86:
Create a DataFrame with random values and calculate the expanding standard
deviation.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
df['Expanding_Std'] = df['X'].expanding().std()
print(df)
Output:
X Y Z Expanding_Std
0 0.693184 0.088273 0.109510 NaN
1 0.031186 0.163005 0.803467 0.468103
2 0.294881 0.409395 0.278145 0.333272
3 0.918778 0.854961 0.791329 0.397322
Exercise 87:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4], 'Y': [4, 3, 2, 1]}
df = pd.DataFrame(data)
df['Expanding_Cov'] = df['X'].expanding().cov(df['Y'])
print(df)
Output:
X Y Expanding_Cov
0 1 4 NaN
1 2 3 -0.500000
2 3 2 -1.000000
3 4 1 -1.666667
Exercise 88:
Create a DataFrame with random values and calculate the expanding correlation.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(4, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
df['Expanding_Corr'] = df['X'].expanding().corr(df['Y'])
print(df)
Output:
X Y Z Expanding_Corr
0 0.094026 0.320246 0.044218 NaN
1 0.422531 0.002172 0.995907 -1.000000
2 0.265459 0.391239 0.589878 -0.751147
3 0.118812 0.061489 0.837821 -0.372750
Exercise 89:
Solution:
import pandas as pd
data = {'X': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
df['Expanding_Median'] = df['X'].expanding().median()
print(df)
Output:
X Expanding_Median
0 1 1.0
1 2 1.5
2 3 2.0
3 4 2.5
4 5 3.0
5 6 3.5
Exercise 90:
Create a DataFrame with datetime index and calculate the expanding mean for
each group.
Solution:
import pandas as pd
data = {'X': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar'], 'Y': range(10)}
date_range = pd.date_range(start='2020-01-01', periods=10, freq='D')
df = pd.DataFrame(data, index=date_range)
df['Expanding_Mean'] = df.groupby('X')['Y'].expanding().mean().reset_index(level=0, drop=True)
print(df)
Output:
X Y Expanding_Mean
2020-01-01 foo 0 0.0
2020-01-02 bar 1 1.0
2020-01-03 foo 2 1.0
2020-01-04 bar 3 2.0
2020-01-05 foo 4 2.0
2020-01-06 bar 5 3.0
2020-01-07 foo 6 3.0
2020-01-08 bar 7 4.0
2020-01-09 foo 8 4.0
2020-01-10 bar 9 5.0
Exercise 91:
Create a DataFrame with random values and calculate the rolling sum for each
group.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(10, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
df['Rolling_Sum'] = df.groupby('X')['Y'].rolling(window=3).sum().reset_index(level=0, drop=True)
print(df)
Output:
X Y Z Rolling_Sum
0 0.342706 0.579330 0.902681 NaN
1 0.182432 0.163406 0.156607 NaN
2 0.983085 0.052785 0.588865 NaN
3 0.756982 0.123991 0.704262 NaN
4 0.876875 0.710953 0.923588 NaN
5 0.359818 0.135520 0.277327 NaN
6 0.693156 0.590918 0.985834 NaN
7 0.892253 0.633529 0.169000 NaN
8 0.084238 0.007579 0.076730 NaN
9 0.663869 0.780832 0.644874 NaN
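Every Rolling_Sum value above is NaN because X holds continuous random values: each row ends up in its own group, so a window of 3 never fills. A sketch of the same calculation with a discrete grouping key, assuming a hypothetical `Group` column of alternating labels and an arbitrarily chosen fixed seed so the run is repeatable:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
df = pd.DataFrame(rng.random((10, 3)), columns=['X', 'Y', 'Z'])
df['Group'] = ['A', 'B'] * 5  # hypothetical discrete grouping key

# Rolling sum of Y within each group; dropping the group level
# restores the original row index so the result aligns with df.
df['Rolling_Sum'] = (
    df.groupby('Group')['Y']
      .rolling(window=3)
      .sum()
      .reset_index(level=0, drop=True)
)
print(df)
```

With five rows per group, only the first two rows of each group stay NaN; the rest hold the sum of the current and two previous rows of the same group.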
Exercise 92:
Create a DataFrame and calculate the rolling mean for each group.
Solution:
import pandas as pd
data = {'X': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar'], 'Y': range(10)}
df = pd.DataFrame(data)
df['Rolling_Mean'] = df.groupby('X')['Y'].rolling(window=3).mean().reset_index(level=0, drop=True)
print(df)
Output:
X Y Rolling_Mean
0 foo 0 NaN
1 bar 1 NaN
2 foo 2 NaN
3 bar 3 NaN
4 foo 4 2.0
5 bar 5 3.0
6 foo 6 4.0
7 bar 7 5.0
8 foo 8 6.0
9 bar 9 7.0
Exercise 93:
Create a DataFrame with random values and calculate the rolling standard
deviation for each group.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(10, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
df['Rolling_Std'] = df.groupby('X')['Y'].rolling(window=3).std().reset_index(level=0, drop=True)
print(df)
Output:
X Y Z Rolling_Std
0 0.154838 0.162793 0.808882 NaN
1 0.740167 0.920318 0.650240 NaN
2 0.033449 0.007883 0.249656 NaN
3 0.983601 0.261995 0.399816 NaN
4 0.883155 0.051084 0.125735 NaN
5 0.986930 0.470328 0.612276 NaN
6 0.981338 0.016731 0.627210 NaN
7 0.670522 0.247346 0.530971 NaN
8 0.978909 0.752500 0.903401 NaN
9 0.185614 0.362602 0.541459 NaN
Exercise 94:
Create a DataFrame and calculate the rolling variance for each group.
Solution:
import pandas as pd
data = {'X': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar'], 'Y': range(10)}
df = pd.DataFrame(data)
df['Rolling_Var'] = df.groupby('X')['Y'].rolling(window=3).var().reset_index(level=0, drop=True)
print(df)
Output:
X Y Rolling_Var
0 foo 0 NaN
1 bar 1 NaN
2 foo 2 NaN
3 bar 3 NaN
4 foo 4 4.0
5 bar 5 4.0
6 foo 6 4.0
7 bar 7 4.0
8 foo 8 4.0
9 bar 9 4.0
Exercise 95:
Create a DataFrame with random values and calculate the rolling correlation for
each group.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(10, 3)
Output:
X Y Z Group Rolling_Corr
0 0.374540 0.950714 0.731994 A NaN
1 0.598658 0.156019 0.155995 A NaN
2 0.058084 0.866176 0.601115 A 0.992633
3 0.708073 0.020584 0.969910 A -0.095420
4 0.832443 0.212339 0.181825 A -0.180021
5 0.183405 0.304242 0.524756 B NaN
6 0.431945 0.291229 0.611853 B NaN
7 0.139494 0.292145 0.366362 A -0.869948
8 0.456070 0.785176 0.199674 B -0.984073
9 0.514234 0.592415 0.046450 B -0.788379
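The solution code for this exercise stops after creating the random array. A minimal sketch of one way to compute a per-group rolling correlation follows; the seed, the alternating `Group` labels, the choice of Y and Z as the correlated pair, and the window of 3 are all assumptions, so the numbers it prints are not expected to match the output above:

```python
import numpy as np
import pandas as pd

np.random.seed(42)  # assumed seed, for repeatability
df = pd.DataFrame(np.random.rand(10, 3), columns=['X', 'Y', 'Z'])
df['Group'] = ['A', 'B'] * 5  # hypothetical group labels

# For each group, correlate Y with Z over a 3-row window.
pieces = [
    g['Y'].rolling(window=3).corr(g['Z'])
    for _, g in df.groupby('Group')
]
# Each piece keeps its original row labels, so assignment aligns by index.
df['Rolling_Corr'] = pd.concat(pieces)
print(df)
```

Looping over the groups keeps the pairing of the two columns explicit; the first two rows of each group stay NaN until the window is full.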
Exercise 96:
Create a DataFrame and calculate the rolling covariance for each group.
Solution:
import pandas as pd
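data = {'X': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar'], 'Y': range(10), 'Z': range(10, 20)}
df = pd.DataFrame(data)
# Rolling covariance of Y and Z within each group (reconstructed step)
rolling_cov = df.groupby('X')[['Y', 'Z']].apply(lambda g: g['Y'].rolling(window=3).cov(g['Z'])).reset_index(level=0, drop=True)
df['Rolling_Cov'] = rolling_cov
print(df)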
Output:
X Y Z Rolling_Cov
0 foo 0 10 NaN
1 bar 1 11 NaN
2 foo 2 12 NaN
3 bar 3 13 NaN
4 foo 4 14 4.0
5 bar 5 15 4.0
6 foo 6 16 4.0
7 bar 7 17 4.0
8 foo 8 18 4.0
9 bar 9 19 4.0
Exercise 97:
Create a DataFrame with random values and calculate the rolling skewness for
each group.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(10, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
df['Rolling_Skew'] = df.groupby('X')['Y'].rolling(window=3).skew().reset_index(level=0, drop=True)
print(df)
Output:
X Y Z Rolling_Skew
0 0.808397 0.304614 0.097672 NaN
1 0.684233 0.440152 0.122038 NaN
2 0.495177 0.034389 0.909320 NaN
3 0.258780 0.662522 0.311711 NaN
4 0.520068 0.546710 0.184854 NaN
5 0.969585 0.775133 0.939499 NaN
6 0.894827 0.597900 0.921874 NaN
7 0.088493 0.195983 0.045227 NaN
8 0.325330 0.388677 0.271349 NaN
9 0.828738 0.356753 0.280935 NaN
Exercise 98:
Create a DataFrame and calculate the rolling kurtosis for each group.
Solution:
import pandas as pd
data = {'X': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar'], 'Y': range(10)}
df = pd.DataFrame(data)
df['Rolling_Kurt'] = df.groupby('X')['Y'].rolling(window=3).kurt().reset_index(level=0, drop=True)
print(df)
Output:
X Y Rolling_Kurt
0 foo 0 NaN
1 bar 1 NaN
2 foo 2 NaN
3 bar 3 NaN
4 foo 4 NaN
5 bar 5 NaN
6 foo 6 NaN
7 bar 7 NaN
8 foo 8 NaN
9 bar 9 NaN
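The Rolling_Kurt column above is NaN throughout because the sample kurtosis estimator pandas uses needs at least four observations, and the window holds only three. A sketch with window=4 on the same data; the wider window is the only change and is an assumption about what the exercise intends:

```python
import pandas as pd

df = pd.DataFrame({'X': ['foo', 'bar'] * 5, 'Y': range(10)})

# Four observations per window is the minimum for a defined kurtosis.
df['Rolling_Kurt'] = (
    df.groupby('X')['Y']
      .rolling(window=4)
      .kurt()
      .reset_index(level=0, drop=True)
)
print(df)
```

Within each group the values are evenly spaced, so every full window should give the same kurtosis of about -1.2; the first three rows of each group remain NaN.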
Exercise 99:
Create a DataFrame with random values and calculate the rolling median for each
group.
Solution:
import pandas as pd
import numpy as np
data = np.random.rand(10, 3)
df = pd.DataFrame(data, columns=['X', 'Y', 'Z'])
df['Rolling_Median'] = df.groupby('X')['Y'].rolling(window=3).median().reset_index(level=0, drop=True)
print(df)
Output:
X Y Z Rolling_Median
0 0.542696 0.140924 0.802197 NaN
1 0.074551 0.986887 0.772245 NaN
2 0.198716 0.005522 0.815461 NaN
3 0.706857 0.729007 0.771270 NaN
4 0.074045 0.358466 0.115869 NaN
5 0.863103 0.623298 0.330898 NaN
6 0.063558 0.310982 0.325183 NaN
7 0.729606 0.637557 0.887213 NaN
8 0.472215 0.119594 0.713245 NaN
9 0.760785 0.561277 0.770967 NaN
Exercise 100:
Create a DataFrame and calculate the expanding sum for each group.
Solution:
import pandas as pd
data = {'X': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar'], 'Y': range(10)}
df = pd.DataFrame(data)
df['Expanding_Sum'] = df.groupby('X')['Y'].expanding().sum().reset_index(level=0, drop=True)
print(df)
Output:
X Y Expanding_Sum
0 foo 0 0.0
1 bar 1 1.0
2 foo 2 2.0
3 bar 3 4.0
4 foo 4 6.0
5 bar 5 9.0
6 foo 6 12.0
7 bar 7 16.0
8 foo 8 20.0
9 bar 9 25.0