0% found this document useful (0 votes)
453 views3 pages

Pandas Commands

The document discusses various commands used for data wrangling in Pandas. It covers commands for creating and importing/exporting dataframes, examining and summarizing data, selecting subsets of data, handling missing values, sorting and grouping data. Some key commands include df.head() to view the top rows, df.describe() for summaries, df.loc and df.iloc for label-based and integer-based slicing, and df.groupby to group and aggregate data.

Uploaded by

vijayagajula1359
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
453 views3 pages

Pandas Commands

The document discusses various commands used for data wrangling in Pandas. It covers commands for creating and importing/exporting dataframes, examining and summarizing data, selecting subsets of data, handling missing values, sorting and grouping data. Some key commands include df.head() to view the top rows, df.describe() for summaries, df.loc and df.iloc for label-based and integer-based slicing, and df.groupby to group and aggregate data.

Uploaded by

vijayagajula1359
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
You are on page 1/ 3

PANDAS

COMMANDS

Creating dataframe:

data_mobile = {'Mobile': ['iPhone', 'Samsung', 'Redmi'], 'Color': ['Red', 'White',


'Black'], 'Price': [High,
Medium,Low]}
df = pd.DataFrame(data_mobile,columns=['Mobile', 'Color', 'Price'])

Importing Data :

pd.read_csv('filename') - It is used to import the csv file into pandas dataframe.


pd.read_excel('filename') - it is used to import the excel sheet into pandas
dataframe.
pd.read_json('filename') - It is used to import the json file into pandas
dataframe.

Exporting Data:

df.to_csv('filename') - It is used to export the cleaned (or) modified data into


the file.
df.to_excel('filename') - It is used to export the modified data into the file.
df.to_json('filename') - It is used to export the modified data into the file.

Important commands :

* df.head(n) - Displays first n rows of the DataFrame.


* df.tail(n) – Displays last n rows of the DataFrame.
* df.shape - Gives the number of rows and columns.
* df.info() - Gives information of Index, Datatype and Memory.
* df.describe() -Summary statistics for numerical columns like
mean,std,25%,50%,75%,min,max.
df.columns - Gives the column names.
pd.set_option('display.max_columns',None) - displays all the columns.
df.dtypes - Gives the datatypes of each column.
df['column_name'].dtypes - Gives the datatype of the particular column.
df.axes - Gives the index range and column names of the dataframe.
df.ndim - Gives the dimensions of the table.
df.size - Gives the total no of rows.
df.values - Displays each row in form of list.
df.mean() - Gives the mean of each column
df.['column_name'].mean() - Gives mean of particular column
df.max() - gives the maximum values of all columns.include column name to know for
particular column.
df.min() - gives the minimum values of all columns.include column nameto know for
particular column.
df.std() - Gives the standard deviation of all columns.include column name to know
for particular column.
df.median() - Gives the middle values of all columns.include column name to know
for particular column.
df.count() - Gives the no of non-null values in each column.include column name to
know for particular column.
df.cumsum() - returns a DataFrame with the cumulative sum for each row.
df['column_name'].unique() - Gives all the unique values in the column.
df.column_name.replace('previous name','new name') - we can replace old with new
name in column.
df['column_name'].value_counts() - displays how many times a value repeated in
column.
df.sample(n=12) - displays 12 random rows from the data frame.
df.nlargest(n, 'value’) - select and order top n entries
df.nsmallest(n, ;value') - select and order bottom n entries.

LOC:

df.loc[1:3, ['cname1','cname2']] - Displays 1-3 rows of columns 1,2.


df.loc[1:3] - Displays 1,2,3 all rows of columns
df.iat[1, 2] - Access a single value for a row/column pair by index.
df.at[2, 'cname'] - Acess a single value for a row/column pair by column name.

ILOC:

df.iloc[0] - displays first row of data frame


df.iloc[1] - displays second row of data frame
df.iloc[-1] - displays last row of data frame
df.iloc[:,0] - displays the first column of dataframe.
df.iloc[:,1] - Select second column of data frame.

Handling Missing Values:

df.isnull() - The null value present in column is displayed as true,remaining all


are false.
df.notnull() - The null value present in column is displayed as false,remaining all
are true.
df.isnull().sum() - Counts all the null values present in each column and shows at
end of each column.
df.fillna(value) - Replace all NA/null data with value.
df.dropna()/df.dropna(thresh=n) - Drop rows with any column having NA/null
data./eliminates the rows which have the values less than n.
df.drop_duplicates() - removes duplicate rows (considers columns).
df.reset_index() - resets index of the row numbers, moves the actual index to
columns.
df.sort_index() - sorting the index.
df.rename(columns = {'y':'year'}) - rename the columns of the dataframe.
df.sort_values('cname') - Order rows by values of a column (low to high).
df.sort_values('cname',ascending=False) - Order rows by values of a column (high
to low).
df.groupby([column1,column2]) - Returns a groupbyobject values from multiple
columns.
df.groupby(column1)[column2].mean() - Returns the mean of the values in column2,
grouped by the values in column1 .
df.groupby(column1)[column2].median() - Returns the median of the values in
column2, grouped by the values in column1.

Extra commands:

pd.melt(df) - Gather columns into rows


df.pivot(columns='var', values='val') - spreads rows into columns.
pd.concat([df1,df2]) - Append rows of DataFrames
pd.concat([df1,df2], axis=1) - Append columns of DataFrames.

You might also like