Python is an open-source, high-level, multipurpose programming language. It offers tools for fast manipulation of large matrices and datasets (similar to MATLAB) and powerful data aggregation and statistics (akin to R), together with thousands of packages for machine learning, visualization, and many other tasks. As a result, a growing number of data scientists are adopting it for their workflows.
Structure: The course will be organized in two modules. Each module comprises three two-hour sessions that mix frontal lectures with hands-on exercises.
Framework and requirements: You will follow the course on your own laptop. The first part will be taught using Google Colab, with no installation required (you will only need a browser and a working internet connection). In the second part we will move to Jupyter Notebooks, to understand how to set up a real-world Python environment that can be used in everyday research work. There are no specific system requirements: we should be able to set it up on Windows, macOS, and Linux (you will have instructions and assistance for doing that!).
Homework: After every lecture, there will be some homework that recapitulates the concepts from the lecture. You are encouraged to complete it week by week!
Material: The material will consist of Jupyter notebooks and Python scripts with the lecture content and exercises. It will be made available on GitHub before the lectures.
Syllabus for the course. Ideally, its incremental nature should ensure that each core concept that is introduced is then revisited and expanded on in every new lecture.
-
0.0. Introduction to Variables and Statements in Python
- Basic elements of Python syntax
- Variable types (numbers, strings)
- Arithmetic and logical operators
-
0.1. Fundamental Data Structures
- Lists, dictionaries, tuples, sets
- Creating, modifying, iterating over structures
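To give a flavor of this session, here is a minimal sketch of the four structures side by side. The sample names and values are made up for illustration and are not taken from the course material.

```python
# Hypothetical example data, not part of the course material.
samples = ["a1", "b2", "a1", "c3"]   # list: ordered, allows duplicates
unique_samples = set(samples)        # set: keeps unique items only
point = (3, 4)                       # tuple: fixed-size and immutable
counts = {}                          # dict: maps keys to values

# Iterate over the list and count how often each name occurs.
for name in samples:
    counts[name] = counts.get(name, 0) + 1

print(counts)
print(len(unique_samples), point[0])
```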
-
0.2. Program Flow Control
- Conditional statements (if/elif/else)
- Loops (for, while)
- Practical use for automating repetitive operations
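As a rough illustration of these topics, the sketch below labels a list of values with a conditional inside a for loop, then uses a while loop to repeat an operation until a condition fails. The measurements and thresholds are arbitrary assumptions chosen for the example.

```python
# Hypothetical measurements; the thresholds are arbitrary.
temperatures = [12.5, 30.1, 21.0, -3.2]
labels = []

# for loop: apply the same decision to every element.
for t in temperatures:
    if t < 0:
        labels.append("freezing")
    elif t < 25:
        labels.append("mild")
    else:
        labels.append("hot")

# while loop: keep halving until the value drops below 1.
value = 40.0
steps = 0
while value >= 1:
    value /= 2
    steps += 1

print(labels)
print(steps)
```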
-
0.3. Organizing Code into Functions
- Defining and calling functions
- Arguments and return values
- Principles of modularity
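A minimal sketch of what defining and calling a function looks like, with one required argument, one optional argument, and a return value. The function name `normalize` and its behavior are hypothetical, chosen only for illustration.

```python
def normalize(values, scale=1.0):
    """Rescale values so that the largest absolute value equals `scale`."""
    peak = max(abs(v) for v in values)
    if peak == 0:
        # Nothing to rescale; return an unchanged copy.
        return list(values)
    return [scale * v / peak for v in values]


# Call with the default scale, then override it by keyword.
unit = normalize([1, 2, 4])
percent = normalize([1, 2, 4], scale=100)
print(unit, percent)
```

Wrapping the logic in a function means it can be reused on any list without copying the loop, which is the core idea behind modularity.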
-
0.4. Classes and Objects
- Introduction to object-oriented programming
- Methods, attributes, properties
- Basic notions of inheritance
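The sketch below shows, under made-up names, how attributes, a method, a property, and a subclass fit together. `Sample` and `LabeledSample` are hypothetical classes invented for this example.

```python
class Sample:
    """A hypothetical lab sample with a name and a mass in grams."""

    def __init__(self, name, mass_g):
        self.name = name        # attribute
        self.mass_g = mass_g    # attribute

    @property
    def mass_kg(self):
        # Property: computed on access, used like an attribute.
        return self.mass_g / 1000

    def describe(self):
        # Method: behavior attached to the object.
        return f"{self.name}: {self.mass_g} g"


class LabeledSample(Sample):
    """Subclass that inherits from Sample and adds a label."""

    def __init__(self, name, mass_g, label):
        super().__init__(name, mass_g)
        self.label = label

    def describe(self):
        # Extend the parent's method instead of rewriting it.
        return f"{super().describe()} [{self.label}]"


s = LabeledSample("s1", 250, "control")
print(s.describe(), s.mass_kg)
```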
-
0.5. Installing and Managing the Local Python Environment
- Configuring a Python distribution
- Using virtual environments
- Installing packages with pip or conda
-
1.0. Introduction to pandas
- Reading and writing CSV files
- Indexing and data visualization
- Creating and modifying columns
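As a small preview, this sketch reads CSV data into a DataFrame, adds a column, selects rows, and writes the result back out as CSV. It assumes pandas is installed. The data is invented and read from an in-memory string so that the example does not depend on any file.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would be a file path.
csv_text = "name,height_cm\nAda,170\nBo,180\n"
df = pd.read_csv(io.StringIO(csv_text))

# Create a new column from an existing one.
df["height_m"] = df["height_cm"] / 100

# Boolean indexing: keep only the rows that satisfy a condition.
tall = df[df["height_cm"] > 175]

# Write the table back to CSV text (pass a path to write a file instead).
csv_out = df.to_csv(index=False)
print(csv_out)
```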
-
1.1. Aggregation and Intelligent Grouping of Data
- Using groupby
- Statistical summaries
- Pivot tables and complex transformations
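A minimal sketch of grouping and pivoting, assuming pandas is installed. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical experiment table.
df = pd.DataFrame({
    "group": ["control", "control", "treated", "treated"],
    "day": [1, 2, 1, 2],
    "score": [1.0, 3.0, 2.0, 6.0],
})

# groupby: one row of summary statistics per group.
summary = df.groupby("group")["score"].agg(["mean", "max"])

# pivot_table: groups as rows, days as columns, scores as cell values.
table = df.pivot_table(index="group", columns="day", values="score")

print(summary)
print(table)
```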
-
1.2. Creating Plots in Python
- Overview of matplotlib and pandas.plot
- Visualizing tabular data (line, bar, scatter plots)
-
1.3. Statistical Analysis with pandas
- Means, standard deviations, correlations
- Built-in statistical tools for dataset exploration
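The kind of exploration covered here can be sketched as follows, assuming pandas is installed. The dataset is invented, and the two columns are deliberately made perfectly linearly related so that the correlation is easy to check.

```python
import pandas as pd

# Hypothetical dataset with two perfectly linearly related columns.
df = pd.DataFrame({
    "height": [150.0, 160.0, 170.0, 180.0],
    "weight": [50.0, 60.0, 70.0, 80.0],
})

means = df.mean()         # column means
stds = df.std()           # sample standard deviations (ddof=1 by default)
corr = df.corr()          # pairwise Pearson correlation matrix
overview = df.describe()  # count, mean, std, min, quartiles, max

print(means)
print(corr)
```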