© 2018 PyEcon.org
Learning Python for econometrics

Knowledge after completing this course:

    You have acquired a basic understanding of programming in general with Python and specific knowledge of working with the standard numerical packages.

    You are able to study Python in depth and absorb new knowledge for your scientific work with Python.

    You know the capabilities and further possibilities of using Python in econometrics.

Learning Python for econometrics

What you should not expect from this course:

    A guide on how to install or maintain an application.
    An introduction to programming for beginners.
    Non-scientific, general-purpose programming (beyond the language essentials).
    An introduction to professional development tools.
    Little content and little effort...

Course organisation

This course can be seen as an applied lecture:

Lecture:
We explain the partly theoretical knowledge of Python with simple, easy-to-understand examples. You can learn the subtleties by reading good literature.

Exercises:
Digital worksheets in the form of Jupyter notebooks with applied tasks are available for each chapter. Sample solutions for all exercises are available in separate notebooks.

Self-tests:
At the end of each of the five chapters there are typical exam questions.

Written exam:
There will be a final exam. It is a pure multiple-choice exam: 60 questions, 90 minutes.

After successful participation in the exam you will receive 6 ECTS.

Literature

The programming language Python is well established and very popular for numerical applications. Some keywords:

    Data science,
    Data wrangling,
    Machine learning,
    Numerical statistics,
    ...

Software: Python 3

We are using Python 3. The migration from Python 2 to version 3 involved a major revision, and the new version is not backwards compatible with the old one.

Python 3 running [command line]
python3 --version

## Python 3.6.6

In the normal execution mode, the Python interpreter processes the instructions in the background; in other numerical programming languages such as R this is known as batch mode. It executes program code that is usually located in a source code file.

The interpreter can also be started in an interactive mode. It is used for testing and analytical purposes, to obtain quick results for simple tasks.

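To make the incompatibility between Python 2 and Python 3 concrete, here is a minimal sketch of two well-known changes: print is a function, and / between two integers is true division. The variable names are purely illustrative.

```python
# In Python 3, print is a function and needs parentheses.
# The Python 2 statement form (print "Hello") is a SyntaxError here.
print("Hello from Python 3")

# In Python 3, / always performs true division, even for two integers.
# (In Python 2, 7 / 2 evaluated to 3.)
true_quotient = 7 / 2     # 3.5
floor_quotient = 7 // 2   # 3, floor division must be requested explicitly

print(true_quotient, floor_quotient)
```

Code written for Python 2 therefore often needs small adjustments before it runs under Python 3.
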
Software: IDEs

For everyday work with Python it would be extremely tedious to make all edits in interactive mode.

There are a number of excellent integrated development environments (IDEs) for Python, two of which are highlighted here:

    Jupyter (and IPython)
    PyCharm (by JetBrains)

Of course, you can also use a simple text editor. However, you would probably miss the comfort of an IDE.

Installing, extending and maintaining Python is not trivial at the beginning. Therefore, as a beginner, you are well advised to download and install the Python distribution Anaconda. Bonus: many standard packages are included or can be installed conveniently later.

Following this course

In this course, with its numerical and analytical focus, we use only Jupyter with the IPython kernel.

That is why we have combined

  1   all the code from the slides, and
  2   all the exercises and solutions

into interactive Jupyter notebooks that you can use online without having to install software locally on your computer. The GWDG has set up a cloud-based Jupyter-Hub for you.

You can access the working environment with your university credentials at

https://jupyter.gwdg.de/

create a profile and get started right away, even from your smart devices. However, so far you still have to upload the course notebooks yourself or rewrite the code from scratch.

Notebook workflow

A Jupyter notebook is divided into individual, vertically arranged cells, which can be executed separately:

[Figure omitted]

The notebook approach is not novel and comes from the field of computer algebra software.

Notebook workflow

Actually, an interactive Python interpreter called IPython is started “in the core”.

    magic commands.

Following this course

Finally, we wish you a lot of fun and success with and in this course!

Practice makes perfect!

Contribution and credits:

Fabian H. C. Raters
Eike Manßen
GWDG for the Jupyter-Hub

Table of contents

1     Essential concepts
1.1   Getting started
1.2   Procedural programming
1.3   Object-orientation
2     Numerical programming
2.1   NumPy package
2.2   NumPy array
2.3   Linear Algebra
3     Data formats and handling
3.1   Pandas
3.2   Series
3.3   DataFrame
3.4   Import/Export data
4     Visual illustrations
4.1   Matplotlib
4.2   Figures and subplots
4.3   Plot types and styles
4.4   Pandas visualization
5     Applications
5.1   Time series
5.2   Moving window
5.3   Financial applications

Chapter 1

Essential concepts

1.1 Getting started

Section 1.1

Essential concepts

Motivation for learning Python

Python can be described as

    a dynamic, strongly typed, multi-paradigm and object-oriented programming language,
    for versatile, powerful, elegant and clear programming,
    with a general, high-level, multi-platform application scope,
    which is being used very successfully in the data science sector and is very much in demand.

Moreover, Python is relatively easy to learn, and its successful language design supports everyone from novices to professional developers.

A short history of time

... of the Python era:

The language was first released in 1991 by Guido van Rossum. Its name is based on Monty Python’s Flying Circus. Its main distinguishing feature is the markup of code blocks by indentation:

Indentation example
password = input("I am your bank. Password please: ")

## I am your bank. Password please: sparkasse

if password == "sparkasse":
    print("You successfully logged in!")
else:
    print("Fail. Will call the police!")

## You successfully logged in!

This increases the readability of code and at the same time encourages programmers to write tidy code. Since source code can be written more compactly in Python, increased efficiency in daily work can be expected.

A short history of time

Overview of the Python development by versions and dates:

[Figure: timeline from 1990 to 2020 with the annotations "Python 2.7 lives forever", "Python 2.7 will die" and "Python 3.6"]

In comparison

Comparing the way Python works with other common programming languages, we briefly discuss a selection of popular competitors:

C/C++:
    CPython is interpreted, not compiled to machine code ahead of time.
    C/C++ are statically typed, complex languages.

Java:
    CPython is not compiled just-in-time.
    Java has a C-type syntax.

MATLAB:
    In Python you primarily follow a scalar way of thinking, while in MATLAB you write matrix-based programs.
    In the numerical context, Python's matrix view and syntax are very similar to those of MATLAB.
    MATLAB is partially compiled just-in-time.

Here, CPython is the reference implementation, the “Original Python”, which is itself implemented in C.

In comparison

R:
    In Python you primarily follow a scalar way of thinking, while in R you write vector-based programs.
    R has a C-type syntax with additions for novel language concepts.

Stata:
    Any comparison would inadequately describe the differences.

Reference semantics

An extremely important difference between the first two languages, C/C++ and Java, together with Python itself, and the last three languages is that the former follow call-by-reference semantics, while MATLAB, R and Stata follow call-by-copy semantics.

Further specific differences and similarities to MATLAB and R will be addressed in other parts of this course.

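What reference semantics means in practice on the Python side can be sketched as follows. The function names append_in_place and append_to_copy are hypothetical and only serve this illustration: a function receives a reference to the caller's object, so changes to a mutable argument are visible outside unless an explicit copy is made.

```python
def append_in_place(values):
    # `values` refers to the caller's list; no copy is made.
    values.append(99)


def append_to_copy(values):
    # Work on an explicit copy; the caller's list stays untouched.
    local = list(values)
    local.append(99)
    return local


data = [1, 2, 3]
append_in_place(data)
print(data)        # [1, 2, 3, 99]

original = [1, 2, 3]
extended = append_to_copy(original)
print(original)    # [1, 2, 3]
print(extended)    # [1, 2, 3, 99]
```

If you come from MATLAB or R, the first case is the one to watch out for: the list passed in is changed without any assignment on the caller's side.
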
Versatility – diversity

Python has become extremely popular:

[Figure omitted]

Source: https://stackoverflow.blog/2017/09/06/incredible-growth-python/

Versatility – diversity

So, you’re on the right track, because who wants to bet on the wrong hoRse?

[Figure omitted]

Source: https://stackoverflow.blog/2017/09/06/incredible-growth-python/

Versatility – diversity

Areas in which Python is used with great success:

    Scripts,
    Console applications,
    GUI applications,
    Game development,
    Website development, and
    Numerical programming.

Places where Python is used:

[Figure omitted]

Yet another outline

In this course we will successively gain the following insights:

  1   General basics of the language.
  2   Numerical programming and handling of data sets.
  3   Application to economic and analytical questions.

Section 1.2

Essential concepts

The first program

Programs can be implemented very quickly; this is a pretty minimal example. You can write this command to a text file of your choice and run it directly on your system:

Hello there
print("Hello there!")

## Hello there!

    The function print() displays its argument (a string) on screen,
    Arguments are passed to the function in parentheses,
    A string must be wrapped in " " or ' ',
    No semicolon at the end.

User input

Let’s add a user input to the program:

Hello you
name = input("Please enter your name: ")

## Please enter your name: Angela Merkel

print("Hello " + name + "!")

## Hello Angela Merkel!

    The function input() is used for interactive text input,
    You can use the equals sign = to assign values to variables (here: name),
    Strings can be joined by the (overloaded) operator +.

Determining weekdays

We now try to find out on which weekday a person was born (Merkel’s birthday is 17-07-1954):

Weekday of birth
from datetime import datetime

answer = input("Your birthday (DD-MM-YYYY): ")

## Your birthday (DD-MM-YYYY): 17-07-1954

birthday = datetime.strptime(answer, "%d-%m-%Y")
print("Your birthday was on a " + birthday.strftime("%A") + "!")

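For running the weekday lookup without typing anything, a minimal sketch with a fixed date string in place of input() could look like this. The variable names weekday_number and weekday_name are our own additions for illustration.

```python
from datetime import datetime

# Fixed input instead of input(), so the example runs unattended.
answer = "17-07-1954"

# strptime() parses the string according to the given format.
birthday = datetime.strptime(answer, "%d-%m-%Y")

# weekday() counts from Monday = 0 to Sunday = 6.
weekday_number = birthday.weekday()

# strftime("%A") returns the weekday name in the active locale.
weekday_name = birthday.strftime("%A")

print(weekday_number, weekday_name)
```

The numeric weekday is handy when you want to compare or group dates, while the name is meant for display.
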
Time since birth

And how many days have passed since then, up to 9 October 2018 (the reference date used in the code)?

Age in days
someday = datetime.strptime("09-10-2018", "%d-%m-%Y")
print("You are " + str((someday - birthday).days) + " days old!")

## You are 23460 days old!

    You can compute time differences, i.e. the operator - is overloaded,
    The difference is a new object with its own attributes, such as days,
    When using the overloaded operator +, you have to explicitly convert the number of days into a string by means of str().

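The following self-contained sketch repeats the calculation with fixed dates and shows what happens when the explicit conversion with str() is left out. The names age and message are chosen only for this illustration.

```python
from datetime import datetime, timedelta

birthday = datetime.strptime("17-07-1954", "%d-%m-%Y")
someday = datetime.strptime("09-10-2018", "%d-%m-%Y")

# Subtracting two datetime objects yields a timedelta object.
age = someday - birthday
print(type(age).__name__)   # timedelta
print(age.days)             # 23460

# Joining a string and an integer with + raises a TypeError ...
try:
    message = "You are " + age.days + " days old!"
except TypeError as error:
    message = "TypeError: " + str(error)
print(message)

# ... so the number has to be converted explicitly.
print("You are " + str(age.days) + " days old!")
```

Python does not guess whether + should mean addition or concatenation here, which is one consequence of strong typing.
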
Time since birth

How many years, months and days do you think that is?

Human readable age
from dateutil.relativedelta import relativedelta
delta = relativedelta(someday, birthday)
print(f"That's {delta.years} years, {delta.months} months "
      f"and {delta.days} days!!")

## That's 64 years, 2 months and 22 days!!

    You don’t have to keep reinventing the wheel: a wealth of packages and individual modules are freely available,
    A lowercase f before "..." provides convenient formatting; there are other options as well,
    Two strings in sequence are implicitly joined together: "That" "'s nice"!

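As a small sketch of the formatting options mentioned above, the same sentence can be built with an f-string or with str.format(). The numbers are hard-coded here so that the example does not depend on dateutil.

```python
years, months, days = 64, 2, 22

# f-string: expressions in braces are evaluated in place.
with_fstring = f"That's {years} years, {months} months and {days} days!!"

# str.format(): placeholders are filled from the arguments in order.
with_format = "That's {} years, {} months and {} days!!".format(years, months, days)

# Implicit concatenation: adjacent string literals are merged into one.
joined = "That" "'s nice"

print(with_fstring)
print(with_format == with_fstring)   # True
print(joined)                        # That's nice
```

Implicit concatenation is mainly useful for splitting a long string literal over several source lines, as in the print() call on the slide.
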
Getting help

When working with the interactive interpreter, e.g. in a notebook, you can quickly get useful information about Python objects:

Help system
help(len)

## Help on built-in function len in module builtins:
##
## len(obj, /)
##     Return the number of items in a container.

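The help system draws on the docstring of an object, so your own functions can be documented the same way. The sketch below uses a hypothetical function days_to_weeks and reads its signature and docstring with the standard inspect module; calling help(days_to_weeks) would display the same information.

```python
import inspect


def days_to_weeks(days):
    """Return the number of full weeks contained in the given days."""
    return days // 7


# The same pieces of information that help() would display:
signature = str(inspect.signature(days_to_weeks))
doc = inspect.getdoc(days_to_weeks)

print("days_to_weeks" + signature)   # days_to_weeks(days)
print(doc)
print(days_to_weeks(23460))          # 3351
```

A short docstring directly below the def line is enough to make a function self-explanatory in a notebook.
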
Lexical structure

As with natural languages, programming languages have a lexical structure. Source code consists of the smallest possible, indivisible elements, the tokens. In Python you can find the following groups of elements:

    Literals
    Variables
    Operators
    Delimiters
    Keywords
    Comments

These terms give us a rock-solid foundation for exploring the heart of a programming language.

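To see tokens in practice, one possible sketch uses the standard tokenize module to split a single line of source code. The sample line and the name token_table are our own choices for illustration.

```python
import io
import tokenize

source = "answer = 6 * 7  # compute\n"

# generate_tokens() expects a readline-style callable returning strings.
tokens = tokenize.generate_tokens(io.StringIO(source).readline)
token_table = [(tokenize.tok_name[tok.type], tok.string) for tok in tokens]

for kind, text in token_table:
    print(f"{kind:10} {text!r}")
```

Note that the tokenizer's categories are coarser than the list above: it reports variables and keywords alike as NAME, and operators and delimiters alike as OP.
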
Literals and variables

Basically, we distinguish between literals and variables:

Assigning variables with literals
myint = 7
myfloat = 4.0
myboat = "nice"
mybool = True

myfloat = myboat

    In this course, we will work with four different kinds of literals: integer (7), float (4.0), string ("nice") and boolean (True),
    Literals are assigned to variables at runtime,
    In Python the data type is derived from the literal and does not have to be declared explicitly,
    It is allowed to assign values of different data types to the same variable (name) sequentially,
    If we don’t assign a literal to a variable, it is lost.

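That the data type belongs to the value and not to the variable name can be checked with the built-in type() function. This is a minimal sketch; the list kinds is only introduced for the illustration.

```python
myfloat = 4.0
print(type(myfloat).__name__)   # float

# Rebinding the same name to a string is allowed.
myfloat = "nice"
print(type(myfloat).__name__)   # str

# The type is derived from each literal.
kinds = [type(value).__name__ for value in (7, 4.0, "nice", True)]
print(kinds)                    # ['int', 'float', 'str', 'bool']
```
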
Operators and delimiters

Most operators and delimiters will be introduced to you during this course. Here is an overview of the operators:

Overview of operators
##   +    -        *     /       **      //
##   %    <<       >>    &       |       ^

                          Arithmetic operators

                          All regular arithmetic operations involving numbers are possible:

                          Pocket calculator
                          10 + 5
                          100 - 20
                          8 / 2
                          4 * (10 + 20)
                          2**3

                          ## 15
                          ## 80
                          ## 4.0
                          ## 120
                          ## 8

                              The result of dividing two integers with / is a floating point number,
                              The conventional rules apply: parentheses first, then multiplication and division, etc.,
                              The operator ** is used for exponentiation.

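As a small complement to the pocket calculator, the following sketch contrasts true division with the floor division and modulo operators listed in the operator overview. The variable names are illustrative only.

```python
quotient = 7 / 2       # true division always yields a float
floored = 7 // 2       # floor division of two integers yields an integer
remainder = 7 % 2      # modulo: the remainder of the floor division
both = divmod(7, 2)    # quotient and remainder in one call

print(quotient, floored, remainder, both)  # 3.5 3 1 (3, 1)

# Floor division rounds toward negative infinity, not toward zero.
negative = -7 // 2
print(negative)  # -4

# Exponentiation is right-associative: 2 ** (3 ** 2).
power = 2 ** 3 ** 2
print(power)  # 512
```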
                          Keywords and comments

                          The programmer explains the structure of the program to the
                          interpreter via a restricted set of short commands, the keywords:

                          Overview of keywords
                          ##   and       as     assert   break    class      continue
                          ##   def       del    elif     else     except     False
                          ##   finally   for    from     global   if         import
                          ##   in        is     lambda   None     nonlocal   not
                          ##   or        pass   raise    return   True       try
                          ##   while     with   yield

                          Logical table
                          # Create table head
                          print("a   b   a and b   a or b   not a\n"
                                "--------------------------------")
                          # Loop through the rows
                          for a in [False, True]:
                              for b in [False, True]:
                                  print(f"{a:1} {b:3} {a and b:6} {a or b:8} {not a:7}")

                          ##   a   b   a and b   a or b   not a
                          ##   --------------------------------
                          ##   0   0      0        0       1
                          ##   0   1      0        1       1
                          ##   1   0      0        1       0
                          ##   1   1      1        1       0

                          Data types

                          Python offers the following basic data types, which we will use in this
                          course:

                          Each data type has its own methods, that is, functions that are
                          applicable specifically to an object of this type.

                          You will gradually get to know new and more complex data types or
                          object classes.

                          Lists

                          A list is an ordered array of objects, accessible via an index:

                          Listing tech companies
                          stocks = ["Google", "Amazon", "Facebook", "Apple"]
                          stocks[1]
                          stocks.append("Twitter")
                          stocks.insert(2, "Microsoft")

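The slide's listing survives without its output, so here is a minimal sketch of what the calls do. The `pop()` and `sort()` steps are additions for illustration and not part of the original listing.

```python
stocks = ["Google", "Amazon", "Facebook", "Apple"]

second = stocks[1]             # indexing starts at 0
stocks.append("Twitter")       # add at the end
stocks.insert(2, "Microsoft")  # insert before position 2

print(second)  # Amazon
print(stocks)  # ['Google', 'Amazon', 'Microsoft', 'Facebook', 'Apple', 'Twitter']

# Lists are mutable: elements can be removed and reordered in place.
removed = stocks.pop()         # removes and returns the last element
stocks.sort()                  # sorts the list in place
print(removed)  # Twitter
print(stocks)   # ['Amazon', 'Apple', 'Facebook', 'Google', 'Microsoft']
```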
                          Tuples

                          Tuples are immutable sequences related to lists; for example, they
                          cannot be extended. The loss of flexibility is offset by advantages
                          in speed and memory usage:

                          Selecting elements in sequences
                          lottery = (1, 8, 9, 12, 24, 28)
                          len(lottery)
                          lottery[1:3]
                          lottery[:4]
                          lottery[-1]
                          lottery[-2:]

                          ##   (1, 8, 9, 12, 24, 28)
                          ##   6
                          ##   (8, 9)
                          ##   (1, 8, 9, 12)
                          ##   28
                          ##   (24, 28)

                          The same operations are also supported when using lists.

                          Dictionaries

                          Dictionaries are associative collections of key-value pairs. The key must
                          be immutable and unique:

                          Internet slang dictionary
                          slang = {"imho": "in my humble opinion",
                                   "lol": "laughing out loud",

                              A dict() is written as a literal with { } and key: value pairs,
                              The keys, values and pairs are iterable; since Python 3.7 they keep their insertion order.

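Since the listing on the slide is cut off, the following sketch shows typical dictionary operations. It reuses the two entries visible above; the `"tl;dr"` entry and the lookup of `"rofl"` are hypothetical additions for illustration.

```python
slang = {"imho": "in my humble opinion",
         "lol": "laughing out loud"}

# Add a new pair (hypothetical entry, not from the slides).
slang["tl;dr"] = "too long; didn't read"

meaning = slang["lol"]                  # look up by key
missing = slang.get("rofl", "unknown")  # default instead of a KeyError
keys = list(slang)                      # iterating yields keys in insertion order

for key, value in slang.items():
    print(f"{key}: {value}")

# Keys must be hashable (immutable types are); a list is rejected.
try:
    bad = {["a", "list"]: "not allowed"}
except TypeError as err:
    key_error = str(err)
    print(key_error)
```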
                          Sets

                          A set is an unordered collection of objects without duplicates:

                          Set operations
                          x = {"o", "n", "y", "t"}
                          y = {"p", "h", "o", "n"}
                          x & y
                          x | y

                              The set type defines its own operators by overloading existing ones (e.g. & and |),
                              An empty set is created via set(), because {} already creates an empty dict().

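The output of the set operations is missing above. A minimal sketch follows, with results sorted for printing because sets have no defined order; the difference and symmetric difference operators are added for illustration.

```python
x = {"o", "n", "y", "t"}
y = {"p", "h", "o", "n"}

common = x & y   # intersection
union = x | y    # union
only_x = x - y   # difference
either = x ^ y   # symmetric difference

print(sorted(common))  # ['n', 'o']
print(sorted(union))   # ['h', 'n', 'o', 'p', 't', 'y']
print(sorted(only_x))  # ['t', 'y']
print(sorted(either))  # ['h', 'p', 't', 'y']

# {} is an empty dict; an empty set needs set().
empty_dict = {}
empty_set = set()
print(type(empty_dict).__name__, type(empty_set).__name__)  # dict set
```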
                          Control flow: Conditional statements

                          Python has only one kind of conditional statement – if-elif-else:

                          Computer data sizes
                          bytes = 100000000 / 8 # e.g. DSL 100000
                          if bytes >= 1e9:
                              print(f"{bytes/1e9:6.2f} GByte")
                          elif bytes >= 1e6:

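The listing above breaks off after the second condition. The following is one possible complete if-elif-else chain, not the original code: the function name `format_size`, the unit labels other than GByte, and the lower thresholds are assumptions made for illustration. The sketch also avoids the name `bytes`, which would shadow a built-in type.

```python
def format_size(n_bytes):
    """Return a human-readable size string (hypothetical helper)."""
    if n_bytes >= 1e9:
        return f"{n_bytes / 1e9:6.2f} GByte"
    elif n_bytes >= 1e6:
        return f"{n_bytes / 1e6:6.2f} MByte"  # assumed branch
    elif n_bytes >= 1e3:
        return f"{n_bytes / 1e3:6.2f} KByte"  # assumed branch
    else:
        return f"{n_bytes:6.0f} Byte"         # assumed fallback


dsl = format_size(100000000 / 8)  # 100 Mbit/s expressed in bytes per second
print(dsl)
```

Only the first matching branch is executed, so the conditions are ordered from largest to smallest threshold.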
                          Control flow: continue and break

                          Loops can skip iterations (continue):

                          Continue the loop
                          for x in ["a", "b", "c"]:
                              a = x.upper()
                              continue
                              print(x)
                          print(a)

                          ## C

                          Or a loop can be aborted instantly (break):

                          Breaking the habit
                          y = 0
                          for i in [7, 3, 4, "x", 6, 15]:
                              if not isinstance(i, int):
                                  break
                              y += i
                          print(f"The total sum is {y}.")

                          Loops also accept the keyword else. Python only executes the else
                          branch if the loop was not terminated by break:

                          Favorite lottery number
                          import random
                          n = 0
                          favorite = 7
                          while n < 100:
                              n += 1
                              draw = random.randint(1, 49) # e.g. German lottery
                              if draw == favorite:
                                  print("Got my number! :)")
                                  break
                          else:
                              print("My favorite did not show up! :(")
                          print(f"I tried {n} times!")

                          ## Got my number! :)
                          ## I tried 15 times!

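Because the lottery example is random, a deterministic sketch may make the loop-else rule easier to verify. The function `first_non_int` is a hypothetical helper built on the same list as the break example.

```python
def first_non_int(values):
    """Return the first element that is not an int, or None if all are ints."""
    for value in values:
        if not isinstance(value, int):
            found = value
            break              # leaving via break skips the else branch
    else:
        found = None           # runs only if the loop finished without break
    return found


print(first_non_int([7, 3, 4, "x", 6, 15]))  # x
print(first_non_int([7, 3, 4]))              # None
```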
                          Functions

                          Functions are defined using the keyword def. As with control flow, the
                          structure of function signature and body is specified by indentation:

                          Drawing lottery numbers
                          def draw_sample(n, first=1, last=49):
                              numbers = list(range(first, last + 1))

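The body of `draw_sample` is cut off after its first line. The sketch below is one plausible completion, assuming the function draws `n` distinct numbers without replacement; the use of `random.sample` and the sorting are assumptions, not the original code.

```python
import random


def draw_sample(n, first=1, last=49):
    numbers = list(range(first, last + 1))
    # Assumed completion: draw n distinct numbers and return them sorted.
    return sorted(random.sample(numbers, n))


sample = draw_sample(6)              # default arguments: numbers 1..49
small = draw_sample(3, last=10)      # keyword argument overrides a default
print(sample)
print(small)
```

Parameters with default values (`first`, `last`) can be omitted in the call or set by name.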
                          Functions

                          Functions are callable objects, can be defined as closures, and can be
                          created and used like other objects:

                          Prime numbers
                          def primes(n):
                              numbers = [2]
                              # is_prime is a closure: it reads numbers from the enclosing scope
                              def is_prime(num):
                                  for i in numbers:
                                      if num % i == 0:
                                          return False
                                  return True
                              if n == 2:
                                  return numbers
                              for i in range(3, n + 1):
                                  if is_prime(i):
                                      numbers.append(i)
                              return numbers
                          primes(50)

                          ## [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

                          Python is object-oriented

                          There are three widely known programming paradigms: procedural,
                          functional and object-oriented programming (OOP). Python supports
                          them all.

                          You have learned how to handle predefined data types in Python.
                          Actually, we have already encountered classes and instances; take
                          dict(), for example.

                          In this section you will learn the basics of dealing with (your own)
                          classes:
                            1   References
                            2   Classes
                            3   Instances
                            4   Main principles
                            5   Garbage collection

                          OOP is a wide field and challenging for beginners. Don’t get discouraged
                          and, if you notice gaps in your understanding, consult the literature.

                          References

                          When you assign to a variable, a reference to an object is set:

                          Equal but not identical
                          a = ["Star", "Trek"]
                          b = ["Star", "Trek"]
                          c = a
                          a == b
                          a == c
                          a is b
                          a is c

                          ##   ['Star', 'Trek']
                          ##   ['Star', 'Trek']
                          ##   ['Star', 'Trek']
                          ##   True
                          ##   True
                          ##   False
                          ##   True

                              Two equal but not identical objects are created,
                              Variables a and c link to the same object.

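A practical consequence of shared references is that a change made through one name is visible through the other. The sketch below extends the example; the appended string is an arbitrary illustration.

```python
a = ["Star", "Trek"]
b = ["Star", "Trek"]   # equal content, separate object
c = a                  # second name for the same object

c.append("Discovery")  # modify the object through the name c

print(a)  # ['Star', 'Trek', 'Discovery']
print(b)  # ['Star', 'Trek']

# id() returns an object's identity; "is" compares identities.
same_object = id(a) == id(c)
different_object = id(a) != id(b)
print(same_object, different_object)  # True True
```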
                          Copying objects

                          When we introduced lists, we initially did not mention that they are a
                          prime example of mutable objects:

                              There are side effects,
                              Referenced mutable objects might be modified,
                              Referenced immutable objects might be copied.

                          Copying objects

                          We are able to make an exact copy of the object:

                          Copying
                          def last_element(x):
                              y = x.copy()
                              return y.pop(-1)
                          a = stocks

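To make the side effect concrete, the sketch below compares a function that modifies its argument with the copying version from the slide. The name `last_element_destructive` and the contents of `stocks` are assumptions for illustration.

```python
def last_element_destructive(x):
    # pop() modifies the list object that the caller passed in
    return x.pop(-1)


def last_element(x):
    y = x.copy()        # work on a copy, leave the caller's list untouched
    return y.pop(-1)


stocks = ["Google", "Amazon", "Facebook", "Apple"]

safe = last_element(stocks)
length_after_safe = len(stocks)       # still 4

unsafe = last_element_destructive(stocks)
length_after_unsafe = len(stocks)     # now 3: the side effect

print(safe, length_after_safe)      # Apple 4
print(unsafe, length_after_unsafe)  # Apple 3
```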
                          Deep and shallow copying

                          However, keep in mind that, in most cases, a method copy() will
                          create shallow copies, while only deep copying also duplicates the
                          contents of a mutable object with a complex structure:

                          Cloning fast food
                          fastfood = [["burgers", "hot dogs"], ["pizza", "pasta"]]
                          italian = fastfood.copy()
                          italian.pop(0)
                          american = list(fastfood)
                          american.pop(1)
                          american[0] = american[0].copy()
                          fastfood[0][1] = "chicken wings"
                          fastfood[1][0] = "risotto"
                          italian
                          american

                          ## [['risotto', 'pasta']]
                          ## [['burgers', 'hot dogs']]

                          Both approaches, copy() and list(), create new list objects
                          containing new references to the original sub-lists. But for a deep copy,
                          you have to recursively create duplicates of all its objects.

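The standard library's `copy` module provides `deepcopy()`, which performs this recursive duplication. A minimal sketch, reusing the fast food list:

```python
import copy

fastfood = [["burgers", "hot dogs"], ["pizza", "pasta"]]

shallow = copy.copy(fastfood)    # new outer list, shared sub-lists
deep = copy.deepcopy(fastfood)   # new outer list and new sub-lists

fastfood[0][1] = "chicken wings"

print(shallow[0])  # ['burgers', 'chicken wings']  (shared sub-list changed)
print(deep[0])     # ['burgers', 'hot dogs']       (independent copy)
```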
                          Classes

                          In Python everything is an object and more complex objects consist of
                          several other objects.

                          Class definition

                          Specifically, we want to create a “rectangle object” and define a
                          separate Rectangle class for it:

                          Rectangle class
                          class Rectangle:
                              width = 0
                              height = 0
                              def area(self):
                                  return self.width * self.height
                          myrectangle = Rectangle()
                          myrectangle.width = 10
                          myrectangle.height = 20
                          print(myrectangle.area())

                          ## 200

                              New classes are defined using the keyword class,
                              The variable self always refers to the instance itself.

                          Class constructor

                          We add a constructor (method) __init__(), that is called to initialize
                          an object of Rectangle:

                          Rectangle class with constructor
                          class Rectangle:
                              width = 0
                              height = 0
                              def __init__(self, width, height):
                                  self.width = width
                                  self.height = height
                              def area(self):
                                  return self.width * self.height
                          myrectangle = Rectangle(15, 30)
                          print(myrectangle.area())

                          ## 450

                          In our example, we use the constructor to set the attributes. Methods
                          with names matching __fun__() have a special, standardized meaning
                          in Python.

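Besides `__init__()`, other double-underscore methods hook into built-in behavior. The following sketch adds `__repr__()` and `__eq__()` to the Rectangle class; this extension is an illustration and not part of the original slides.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

    def __repr__(self):
        # Used by repr() and when the object is shown in the interpreter
        return f"Rectangle({self.width}, {self.height})"

    def __eq__(self, other):
        # Used by the == operator
        if not isinstance(other, Rectangle):
            return NotImplemented
        return (self.width, self.height) == (other.width, other.height)


first = Rectangle(15, 30)
second = Rectangle(15, 30)

print(first)            # Rectangle(15, 30)
print(first == second)  # True  (equal by value)
print(first is second)  # False (two distinct instances)
```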
                          Class inheritance

                          One of the most important concepts of OOP is inheritance. A class
                          inherits all attributes and methods of its parent class and can add new
                          ones or override existing ones:

                          Square inherits Rectangle
                          class Square(Rectangle):
                              def __init__(self, length):
                                  super().__init__(length, length)
                              def diagonal(self):
                                  return (self.width**2 + self.height**2)**0.5
                          mysquare = Square(15)
                          print(f"Area: {mysquare.area()}")
                          print(f"Diagonal length: {mysquare.diagonal():7.4f}")

                          ## Area: 225
                          ## Diagonal length: 21.2132

                          The methods of the parent class, including the constructor, may be
                          referenced by super().

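The Square example adds a method; the sketch below shows the other case, replacing an inherited method while still calling the parent's version through `super()`. The method `describe()` is a hypothetical helper introduced only for this illustration.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

    def describe(self):  # hypothetical method, not in the slides
        return f"{self.width} x {self.height} rectangle"


class Square(Rectangle):
    def __init__(self, length):
        super().__init__(length, length)

    def describe(self):
        # Replaces the inherited method but reuses it via super()
        return f"square, i.e. a {super().describe()}"


mysquare = Square(15)
print(mysquare.describe())              # square, i.e. a 15 x 15 rectangle
print(mysquare.area())                  # 225 (inherited unchanged)
print(isinstance(mysquare, Rectangle))  # True
```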
                          Garbage collection

                          You do not have to worry about memory management in Python. The
                          garbage collector will tidy up for you.

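To observe the garbage collector at work, one option is a weak reference, which does not keep its target alive. The sketch below assumes the standard `gc` and `weakref` modules; the class `Node` is a hypothetical example, and the exact moment of collection is an implementation detail, which is why `gc.collect()` is called explicitly.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.partner = None


a = Node("a")
b = Node("b")
a.partner = b
b.partner = a               # the two objects now reference each other

probe = weakref.ref(a)      # weak reference: does not prevent collection
alive_before = probe() is not None

del a, b                    # remove the last names bound to the objects
gc.collect()                # ask the collector to clean up the cycle now

alive_after = probe() is not None
print(alive_before, alive_after)  # True False
```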
                          Namespaces

                          Reference names from the local namespace mask the same names in
                          an outer or in the global namespace:

                          Namespaces
                          def multiplier(x):
                              x = 4 * x

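The listing above is incomplete. The following sketch shows the masking effect under the assumption that `multiplier` returns its local `x`; the `return` statement and the helper `read_outer` are additions for illustration.

```python
x = 10


def multiplier(x):
    x = 4 * x      # rebinds the local name x only
    return x       # assumed completion of the truncated listing


def read_outer():
    return x       # no local x here, so the outer x is looked up


result = multiplier(x)

print(result)        # 40
print(x)             # 10 (the outer name is unchanged)
print(read_outer())  # 10
```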
                          Namespaces

                          In fact, functions defined in Python are themselves objects that
                          remember and can access the context in which they were created. This
                          concept comes from functional programming and is called closure:

                          Closures
                          def gen_multiplier(a):
                              def fun(x):
                                  return a * x
                              return fun
                          multi1 = gen_multiplier(4)
                          multi2 = gen_multiplier(5)
                          multi1
                          multi1("EH")
                          multi2("EH")

                          ## <function gen_multiplier.<locals>.fun at 0x7f042eaa6048>
                          ## EHEHEHEH
                          ## EHEHEHEHEH

                          Managing code

                          In order to provide, maintain and extend modular functionality with
                          Python, its code-containing components can be described hierarchically:

                          Packages
                              Modules
                                  Classes
                                      Functions

                          In the latter case, all classes and functions, but no instances, are
                          imported from the datetime namespace.

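The import examples that the last sentence refers to are not preserved here. As a hedged illustration of common import forms, using the standard `datetime` module mentioned in the text (the alias `dt` and the chosen dates are arbitrary):

```python
# Import the module; names are accessed through the module namespace.
import datetime

# Import the module under a shorter alias.
import datetime as dt

# Import selected names directly into the current namespace.
from datetime import date, timedelta

d1 = datetime.date(2018, 1, 1)
d2 = dt.date(2018, 1, 1)
d3 = date(2018, 1, 1)

next_day = d3 + timedelta(days=1)
print(d1 == d2 == d3)  # True
print(next_day)        # 2018-01-02
```

A wildcard import (`from datetime import *`) would bring all public names of the module into the current namespace; it is usually avoided because it hides where a name comes from.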
                          The Zen of Python

                          The Zen of Python
                          import this

                          ##   The Zen of Python, by Tim Peters
                          ##
                          ##   Beautiful is better than ugly.
                          ##   Explicit is better than implicit.
                          ##   Simple is better than complex.
                          ##   Complex is better than complicated.
                          ##   Flat is better than nested.
                          ##   Sparse is better than dense.
                          ##   Readability counts.
                          ##   Special cases aren't special enough to break the rules.
                          ##   Although practicality beats purity.
                          ##   Errors should never pass silently.
                          ##   Unless explicitly silenced.
                          ##   In the face of ambiguity, refuse the temptation to guess.
                          ##   ...

                          Further topics

                          A selection of exciting topics that are among the advanced basics but
                          are not covered in this lecture:

                              Dynamic language concepts, such as duck typing,
                              Further, complex type classes, such as ChainMap or OrderedDict,
                              Iterators and generators in detail,
                              Exception handling, raising exceptions, catching errors,
                              Debugging, introspection and annotations.

                          Chapter 2
                          Numerical programming

                          2.1 NumPy package

                          Section 2.1
                          Numerical programming

                          The NumPy package

                          Motivation

                          Element-wise addition
                          import numpy as np  # conventional alias used throughout
                          vec1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
                          vec2 = np.array(vec1)
                          print(vec1 + vec1)

                          ## [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]

                          print(vec2 + vec2)

                          ## [ 2   4   6   8 10 12 14 16 18]

                          for i in range(len(vec1)):
                              vec1[i] += vec1[i]
                          print(vec1)

                          ## [2, 4, 6, 8, 10, 12, 14, 16, 18]

                          Motivation

                          Matrix multiplication
                          mat1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
                          mat2 = np.array(mat1)
                          print(np.dot(mat2, mat2))

                          ## [[ 30  36  42]
                          ##  [ 66  81  96]
                          ##  [102 126 150]]

                          mat3 = np.zeros([3, 3])
                          for i in range(3):
                              for k in range(3):
                                  for j in range(3):
                                      mat3[i][k] = mat3[i][k] + mat1[i][j] * mat1[j][k]
                          print(mat3)

                          ## [[ 30.  36.  42.]
                          ##  [ 66.  81.  96.]
                          ##  [102. 126. 150.]]

                          Motivation

                          Time comparison
                          import time
                          mat1 = np.random.rand(50, 50)
                          mat2 = np.array(mat1)
                          t = time.time()
                          mat3 = np.dot(mat2, mat2)
                          nptime = time.time() - t
                          mat3 = np.zeros([50, 50])
                          t = time.time()
                          for i in range(50):
                              for k in range(50):
                                  for j in range(50):
                                      mat3[i][k] = mat3[i][k] + mat1[i][j] * mat1[j][k]
                          pytime = time.time() - t
                          times = str(pytime / nptime)
                          print("NumPy is " + times + " times faster!")

                          ## NumPy is 35.166825796371846 times faster!

                          Section 2.2
                          Numerical programming

                          Creating NumPy arrays

                          np.array(list): converts a Python list into a NumPy array.
                          array.ndim: returns the number of dimensions of the array.
                          array.shape: returns the shape of the array as a tuple.

                          Creation
                          arr1 = [4, 8, 2]
                          arr1 = np.array(arr1)
                          arr2 = np.array([24.3, 0., 8.9, 4.4, 1.65, 45])
                          arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])
                          print(arr1.ndim)

                          ## 1

                          print(arr3.shape)

                          ## (3, 3)

                          From now on, the name array refers to a NumPy array, i.e. an object
                          created by np.array().

                          Array creation functions

                          np.arange(start, stop, step): array of evenly spaced values from
                          start up to, but not including, stop.
                          np.zeros((rows, columns)): array with all values set to 0.
                          np.identity(dimension): identity matrix of the given dimension.

                          Creation functions

                          print(np.zeros((4, 3)))

                          ## [[0. 0. 0.]
                          ##  [0. 0. 0.]
                          ##  [0. 0. 0.]
                          ##  [0. 0. 0.]]

                          print(np.arange(6))

                          ## [0 1 2 3 4 5]

                          print(np.identity(3))

                          ## [[1. 0. 0.]
                          ##  [0. 1. 0.]
                          ##  [0. 0. 1.]]

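The slide only calls np.arange() with a single argument. A small sketch of the three-argument form, with values chosen here purely for illustration:

```python
import numpy as np

# start=2, stop=11, step=3: the stop value itself is never included
stepped = np.arange(2, 11, 3)
print(stepped)                  # [2 5 8]

# A float step is allowed as well; this yields 0.0, 0.5, 1.0, 1.5
halves = np.arange(0, 2, 0.5)
print(halves)
```

For float ranges, np.linspace() is often the safer choice because the number of elements is given explicitly.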
                          Array creation functions

                          np.linspace(start, stop, n): array of n evenly spaced values
                          from start to stop (both included).
                          np.full((rows, columns), k): array with all values set to k.

                          Array creation

                          print(np.linspace(0, 80, 5))

                          ## [ 0. 20. 40. 60. 80.]

                          print(np.full((5, 4), 7))

                          ## [[7 7 7 7]
                          ##  [7 7 7 7]
                          ##  [7 7 7 7]
                          ##  [7 7 7 7]
                          ##  [7 7 7 7]]

                          Array creation functions

                          np.random.rand(rows, columns): array of random floats in the
                          half-open interval [0, 1).
                          np.random.randint(k, size=(rows, columns)): array of random
                          integers between 0 and k-1.

                          Array of random numbers

                          print(np.random.rand(3, 3))

                          ## [[0.65417053 0.63215654 0.72761157]
                          ##  [0.30757468 0.64874108 0.69997956]
                          ##  [0.74054193 0.57131055 0.77555459]]

                          print(np.random.randint(10, size=(5, 4)))

                          ## [[4 2 0 4]
                          ##  [3 1 0 9]
                          ##  [9 6 0 0]
                          ##  [3 8 1 9]
                          ##  [9 7 6 7]]

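Random arrays differ from run to run, so the numbers above will not be reproduced exactly. A minimal sketch of how results can be made repeatable with np.random.seed(); the seed value 42 is an arbitrary choice for this illustration.

```python
import numpy as np

np.random.seed(42)                       # fix the state of the global generator
first = np.random.randint(10, size=(2, 3))

np.random.seed(42)                       # same seed, same sequence of numbers
second = np.random.randint(10, size=(2, 3))

print(np.array_equal(first, second))     # True

floats = np.random.rand(3, 3)
print(floats.min() >= 0.0 and floats.max() < 1.0)   # True: values lie in [0, 1)
```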
                          Copy arrays

                          Reference

                          print(arr3)

                          ## [[4 8 5]
                          ##  [9 3 4]
                          ##  [1 0 6]]

                          arr = arr3
                          arr[1, 1] = 777
                          print(arr3)

                          ## [[  4   8   5]
                          ##  [  9 777   4]
                          ##  [  1   0   6]]

                          arr3[1, 1] = 3

                          call-by-reference

                          arr = arr3 binds the name arr to the existing array arr3. Both
                          names refer to the same object, so a change made through one name
                          is visible through the other.

                          Copy array

                          array.copy(): returns an independent copy of the array
                          (call-by-value); changes to the copy do not affect the original.

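A short sketch of array.copy(), reusing the values of arr3 from the earlier slides. The check with np.shares_memory() is an addition for this illustration.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

arr = arr3.copy()        # independent copy with its own data
arr[1, 1] = 777          # modifies the copy only

print(arr3[1, 1])        # 3
print(arr[1, 1])         # 777
print(np.shares_memory(arr, arr3))   # False
```

In contrast to the plain assignment arr = arr3, the original array keeps its value at position (1, 1).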
                          Overview: array creation functions

                          Function                  Description
                          array                     Converts input data into a NumPy array
                          arange(start,stop,step)   Creates array of evenly spaced values in a given range
                          ones                      Creates array containing only ones
                          zeros                     Creates array containing only zeros
                          empty                     Allocates array without initializing its values
                          eye, identity             Creates N x N identity matrix
                          linspace                  Creates array of evenly spaced values
                          full                      Creates array with all values set to one number
                          random.rand               Creates array of random floats
                          random.randint            Creates array of random integers

                          Data types of arrays

                          array.dtype: data type of the array elements.
                          array.astype(np.type): explicit type cast; returns a new array.

                          Data types

                          print(arr1.dtype)

                          ## int64

                          print(arr2.dtype)

                          ## float64

                          ## int64

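A minimal sketch of an explicit type cast with astype(), using the values of arr2 from the creation slide. The target type np.int64 is chosen here for illustration.

```python
import numpy as np

arr2 = np.array([24.3, 0., 8.9, 4.4, 1.65, 45])

as_int = arr2.astype(np.int64)   # fractional parts are truncated toward zero
print(as_int)                    # [24  0  8  4  1 45]
print(as_int.dtype)              # int64
print(arr2.dtype)                # float64, the original array is unchanged
```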
                          Array operations

                          Element-wise operations

                          Arithmetic operators on NumPy arrays operate element-wise.

                          Element-wise operations

                          print(arr3)

                          ## [[4 8 5]
                          ##  [9 3 4]
                          ##  [1 0 6]]

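A few element-wise operations on arr3, as a sketch of what the statement above means in practice. The particular operators shown are a selection made for this illustration.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

plus_ten = arr3 + 10      # adds 10 to every element
squared = arr3 * arr3     # element-wise product, not a matrix product
halved = arr3 / 2         # true division returns floats

print(plus_ten)
print(squared)
print(halved)
```

Note that * multiplies matching elements; the matrix product is a separate operation.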
                          Slicing

                          array[start : stop : step]: selects a subset of the data. The
                          element at index stop is not included.

                          Slicing in one dimension

                          arr = np.arange(10)
                          print(arr)

                          ## [0 1 2 3 4 5 6 7 8 9]

                          print(arr[4])

                          ## 4

                          print(arr[3:7])

                          ## [3 4 5 6]

                          print(arr)

                          ## [0 1 2 3 4 5 6 7 8 9]

                          Slicing

                          Slicing in one dimension with steps

                          print(arr[:7])

                          ## [0 1 2 3 4 5 6]

                          print(arr[-3:])

                          ## [7 8 9]

                          print(arr[::-1])

                          ## [9 8 7 6 5 4 3 2 1 0]

                          print(arr[::2])

                          ## [0 2 4 6 8]

                          print(arr[:5:-1])

                          ## [9 8 7 6]

                          Slicing

                          Slicing in higher dimensions

                          In n-dimensional arrays the element at each index is an
                          (n-1)-dimensional array.

                          Indexing in two dimensions

                          print(arr3)

                          ## [[4 8 5]
                          ##  [9 3 4]
                          ##  [1 0 6]]

                          vec = arr3[1]
                          print(vec)

                          ## [9 3 4]

                          print(arr3[1, 0])

                          ## 9

                          Slicing

                          Slicing in two dimensions

                          print(arr3)

                          ## [[4 8 5]
                          ##  [9 3 4]
                          ##  [1 0 6]]

                          print(arr3[0:2, 0:2])

                          ## [[4 8]
                          ##  [9 3]]

                          print(arr3[2:, :])

                          ## [[1 0 6]]

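Two-dimensional slices accept the same start : stop : step notation per axis. The following sketch adds a few further combinations on arr3; the variable names are chosen for this illustration only.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

last_two_cols = arr3[:, 1:]      # all rows, columns 1 and 2
every_other_row = arr3[::2, :]   # rows 0 and 2, all columns
reversed_rows = arr3[::-1, :]    # rows in reverse order

print(last_two_cols)
print(every_other_row)
print(reversed_rows)
```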
                          Views on arrays

                          Selecting by index numbers and slicing, as used so far, is called
                          basic indexing in NumPy. Basic indexing returns NO COPY of your data
                          but a so-called view of the existing data: a different perspective
                          on the same memory.

                          A view of an array can be seen as a reference to a rectangular memory
                          area of its values. The view is intended to

                              edit a rectangular part of a matrix, e.g., a sub-matrix, a column,
                              or a single value,

                              change the shape of the matrix or the arrangement of its elements,
                              e.g., transpose or reshape a matrix,

                              change the interpretation of the stored values, e.g., to read the
                              memory of a float array as an int array,

                              make the same values available in other parts of a program.

                          The crucial point is efficiency: arrays in working memory do not have
                          to be copied again and again for simple index operations, which would
                          cause excessive additional writes to the computer memory.

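Whether two arrays refer to the same memory can be checked directly. The sketch below uses np.shares_memory() for this; the array names data, window and independent are chosen for the illustration.

```python
import numpy as np

data = np.arange(12)

window = data[2:6]               # basic indexing: a view, no data is copied
print(np.shares_memory(window, data))        # True

window[:] = 0                    # writing through the view changes data
print(data)                      # positions 2 to 5 are now zero

independent = data[2:6].copy()   # an explicit copy owns its own memory
print(np.shares_memory(independent, data))   # False
```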
                          Creating views implicitly

                          A view is created automatically when you do basic indexing such as
                          slicing:

                          Create a view by slicing

                          column = arr3[:, 1]
                          print(column)

                          ## [8 3 0]

                          print(column.base)

                          ## [[4 8 5]
                          ##  [9 3 4]
                          ##  [1 0 6]]

                          column[1] = 100
                          print(arr3)

                          ## [[  4   8   5]
                          ##  [  9 100   4]
                          ##  [  1   0   6]]

                          Creating views implicitly

                          Create a view by slicing

                          elem = column[1:2]
                          print(elem.base)

                          ## [[  4   8   5]
                          ##  [  9 100   4]
                          ##  [  1   0   6]]

                          elem[0] = 3
                          print(arr3)

                          ## [[4 8 5]
                          ##  [9 3 4]
                          ##  [1 0 6]]

                              The middle column is a view of the base array referenced by arr3.

                              Any changes to the values of a view directly affect the base data.

                              A view of a view is another view of the same base matrix.

                          Obtaining views explicitly

                          In addition, an array provides methods and attributes that return a
                          view of its data:

                          ## False

                          Obtaining views explicitly

                          Obtain a view

                          arr3_v = arr3.view()
                          print(arr3_v.flags.owndata)

                          ## False

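As a sketch, the following compares three ways of obtaining a view explicitly. The selection (.T, reshape() and view()) is an assumption about typical candidates, not a complete list; note that reshape() returns a view only when the memory layout allows it, which is the case for this freshly created array.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

transposed = arr3.T              # attribute returning a view
reshaped = arr3.reshape(1, 9)    # a view here, because arr3 is contiguous
explicit = arr3.view()           # explicit view with the same shape

for candidate in (transposed, reshaped, explicit):
    # each view refers back to arr3 and does not own its data
    print(candidate.base is arr3, candidate.flags.owndata)   # True False

print(arr3.flags.owndata)        # True: arr3 owns the data
print(arr3.copy().base is None)  # True: a copy has no base array
```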
                          Fancy indexing

                          The behavior described above changes with advanced indexing, i.e., if
                          at least one component of the index tuple is not a scalar index number
                          or slice. The case of fancy indexing is described below:

                          Advanced and basic indexing

                          print(arr3)

                          ## [[4 8 5]
                          ##  [9 3 4]
                          ##  [1 0 6]]

                          arr = arr3[[0, 2], [0, 2]]
                          print(arr)

                          ## [4 6]

                          print(arr.base)

                          ## None

                          Fancy indexing

                          Advanced and basic indexing

                          arr = arr3[0:3:2, 0:3:2]
                          print(arr)

                          ## [[4 5]
                          ##  [1 6]]

                          print(arr.base)

                          ## [[4 8 5]
                          ##  [9 3 4]
                          ##  [1 0 6]]

                              Contrary to intuition, fancy indexing does not return a
                              (2 × 2)-matrix, but a vector of the matrix elements (0, 0) and
                              (2, 2). This is a complete copy: a new object and not a view of
                              the original matrix.

                              A submatrix (view) with the corner elements of the initial matrix
                              can be obtained with slicing.

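If the row and column numbers are given as lists rather than slices, the corner submatrix can still be selected with the helper np.ix_(). This is a technique added here for illustration and not part of the slides; the result is a copy, as with any advanced indexing.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

pairs = arr3[[0, 2], [0, 2]]              # elements (0, 0) and (2, 2)
print(pairs)                              # [4 6]

corners = arr3[np.ix_([0, 2], [0, 2])]    # rows 0, 2 combined with columns 0, 2
print(corners.shape)                      # (2, 2)

corners[0, 0] = -1                        # changes the copy only
print(arr3[0, 0])                         # 4
```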
                          Conditional indexing

                          Filter arrays without using loops by conditional indexing.

                          Find and replace values in arrays, condition: smaller

                          print(arr3)

                          ## [[4 8 5]
                          ##  [9 3 4]
                          ##  [1 0 6]]

                          arr = arr3.copy()
                          arr[arr < 5] = 0
                          print(arr)

                          ## [[0 8 5]
                          ##  [9 0 0]
                          ##  [0 0 6]]

                          Conditional indexing

                          Find and replace values in arrays, condition: equal

                          print(arr3)

                          ## [[4 8 5]
                          ##  [9 3 4]
                          ##  [1 0 6]]

                          arr = arr3.copy()
                          arr[arr == 4] = 100
                          print(arr)

                          ## [[100   8   5]
                          ##  [  9   3 100]
                          ##  [  1   0   6]]

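Conditions can also be combined. The sketch below assumes the element-wise operators & (and) and | (or) for this purpose; each comparison needs its own parentheses because these operators bind more tightly than < or ==.

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

mask = (arr3 > 2) & (arr3 < 8)   # boolean array with the shape of arr3
selected = arr3[mask]            # 1-D copy of all matching elements
print(selected)                  # [4 5 3 4 6]

arr = arr3.copy()
arr[(arr == 0) | (arr > 8)] = -1 # replace zeros and values above 8
print(arr)
```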
                          Reshaping arrays

                          array.reshape((rows, columns)): returns the array in a new shape;
                          the number of elements must stay the same.
                          array.resize((rows, columns)): changes the array shape in place to
                          rows x columns and fills new values with 0.

                          Reshape

                          arr = np.arange(15)
                          print(arr.reshape((3, 5)))

                          ## [[ 0  1  2  3  4]
                          ##  [ 5  6  7  8  9]
                          ##  [10 11 12 13 14]]

                          arr.resize((3, 7))
                          print(arr)

                          ## [[ 0  1  2  3  4  5  6]
                          ##  [ 7  8  9 10 11 12 13]
                          ##  [14  0  0  0  0  0  0]]

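Two details of reshape() that the example does not show, sketched under the assumption of a contiguous 1-D input: a dimension given as -1 is inferred from the array size, and the result is a view of the original data. A shape that does not match the number of elements raises a ValueError.

```python
import numpy as np

arr = np.arange(15)

mat = arr.reshape((3, -1))       # -1 lets NumPy infer the second dimension
print(mat.shape)                 # (3, 5)

mat[0, 0] = 99                   # mat is a view, so arr changes as well
print(arr[0])                    # 99

try:
    arr.reshape((4, 4))          # 15 elements do not fit into 4 x 4
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)         # True
```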
                          Adding and removing elements of arrays

                          np.append(array, value): appends value to the end of array.
                          np.insert(array, index, value): inserts value before index.
                          np.delete(array, index, axis): deletes the row or column at index.

                          Append, insert, delete

                          a = np.arange(5)
                          a = np.append(a, 8)
                          a = np.insert(a, 3, 77)
                          print(a)

                          ## [ 0  1  2 77  3  4  8]

                          a.resize((3, 3))
                          print(np.delete(a, 1, axis=0))

                          ## [[0 1 2]
                          ##  [8 0 0]]

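All three functions return a new array and leave their input untouched, which is why the slide reassigns the result to a. A short sketch with a 3 x 3 example matrix chosen for illustration, this time working along both axes:

```python
import numpy as np

a = np.arange(9).reshape((3, 3))

no_middle_col = np.delete(a, 1, axis=1)           # drop column 1
with_row = np.insert(a, 1, [10, 11, 12], axis=0)  # new row before row 1

print(no_middle_col)
print(with_row)
print(a.shape)                                    # (3, 3): a itself is unchanged
```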
                          Combining and splitting

                          np.concatenate((arr1, arr2), axis): joins a sequence of arrays
                          along an existing axis.
                          np.split(array, n): splits an array into n equal sub-arrays.
                          np.hsplit(array, n): splits an array into n equal sub-arrays
                          horizontally (column-wise).

                          ##  [ 8  0  0]
                          ##  [ 0  1  2]
                          ##  [ 3  4  5]]

                          print(np.split(np.arange(8), 4))

                          ## [array([0, 1]), array([2, 3]), array([4, 5]), array([6, 7])]

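A self-contained sketch of joining and splitting along both axes. The arrays top and bottom are hypothetical inputs for this illustration and are not the arrays used on the slide.

```python
import numpy as np

top = np.arange(6).reshape((2, 3))
bottom = np.arange(6, 12).reshape((2, 3))

stacked = np.concatenate((top, bottom), axis=0)   # rows appended, shape (4, 3)
side = np.concatenate((top, bottom), axis=1)      # columns appended, shape (2, 6)

upper, lower = np.split(stacked, 2)    # two equal parts along axis 0
left, right = np.hsplit(side, 2)       # two equal parts along the columns

print(stacked.shape, side.shape)
print(np.array_equal(upper, top), np.array_equal(right, bottom))   # True True
```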
                          Transposing array

                          array.T: transposed array (as a view).

                          Transpose

                          print(arr3)

                          ## [[4 8 5]
                          ##  [9 3 4]
                          ##  [1 0 6]]

                          print(arr3.T)

                          ## [[4 9 1]
                          ##  [8 3 0]
                          ##  [5 4 6]]

                          print(np.eye(3).T)

                          Matrix multiplication

                          np.dot(array1, array2): matrix multiplication of array1 and array2.

                          Matrix multiplication

                          res = np.dot(arr3, np.arange(18).reshape((3, 6)))
                          print(res)

                          ## [[108 125 142 159 176 193]
                          ##  [ 66  82  98 114 130 146]
                          ##  [ 72  79  86  93 100 107]]

                          res = np.dot(np.eye(4), np.arange(16).reshape((4, 4)))
                          print(res)

                          ## [[ 0.  1.  2.  3.]
                          ##  [ 4.  5.  6.  7.]
                          ##  [ 8.  9. 10. 11.]
                          ##  [12. 13. 14. 15.]]

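For two-dimensional arrays, the @ operator (Python 3.5 and later) computes the same matrix product as np.dot(). The sketch below also shows that the inner dimensions have to match; the names a and b are chosen for this illustration.

```python
import numpy as np

a = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])   # shape (3, 3)
b = np.arange(18).reshape((3, 6))                 # shape (3, 6)

print(np.array_equal(a @ b, np.dot(a, b)))        # True

try:
    np.dot(b, a)                 # (3, 6) times (3, 3): inner dimensions differ
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
print(shapes_compatible)         # False
```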
                          Array functions

                          Element-wise functions

                          print(arr3)

                          ## [[4 8 5]
                          ##  [9 3 4]
                          ##  [1 0 6]]

                          print(np.sqrt(arr3))

                          ## [[2.         2.82842712 2.23606798]
                          ##  [3.         1.73205081 2.        ]
                          ##  [1.         0.         2.44948974]]

                          print(np.exp(arr3))

                          ## [[5.45981500e+01 2.98095799e+03 1.48413159e+02]
                          ##  [8.10308393e+03 2.00855369e+01 5.45981500e+01]
                          ##  [2.71828183e+00 1.00000000e+00 4.03428793e+02]]

                          Overview: element-wise array functions

                          Function            Description
                          abs                 Absolute value of integers and floating point numbers
                          sqrt                Square root
                          exp                 Exponential function
                          log, log10, log2    Natural logarithm, log base 10, log base 2
                          sign                Sign (1: positive, 0: zero, -1: negative)
                          ceil                Round up to integer
                          floor               Round down to integer
                          rint                Round to nearest integer
                          modf                Returns fractional and integral parts
                          sin, cos, tan, sinh, cosh, tanh, arcsin, ...
                                              Trigonometric and hyperbolic functions

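A brief sketch of the rounding-related functions from the table. The input values are chosen for illustration; note that np.rint() rounds exact halves to the nearest even number and that np.modf() returns two arrays.

```python
import numpy as np

values = np.array([-1.7, 0.0, 2.5])

signs = np.sign(values)          # -1, 0, 1
up = np.ceil(values)             # -1, 0, 3
down = np.floor(values)          # -2, 0, 2
nearest = np.rint(values)        # -2, 0, 2 (2.5 is rounded to the even 2)
frac, whole = np.modf(values)    # fractional parts and integral parts

print(signs, up, down, nearest)
print(frac, whole)
```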
                          Binary functions

                          Binary

                          x = np.array([3, -6, 8, 4, 3, 5])
                          y = np.array([3, 5, 7, 3, 5, 9])
                          print(np.maximum(x, y))

                          ## [3 5 8 4 5 9]

                          print(np.greater_equal(x, y))

                          ## [ True False  True  True False False]

                          print(np.add(x, y))

                          ## [ 6 -1 15  7  8 14]

                          print(np.mod(x, y))

                          ## [0 4 1 1 3 5]

                          Overview: binary functions

                          Function               Description
                          add                    Add elements of arrays
                          subtract               Subtract elements in the second from the first array
                          multiply               Multiply elements
                          divide                 Divide elements
                          power                  Raise elements in first array to powers in second
                          maximum                Element-wise maximum
                          minimum                Element-wise minimum
                          mod                    Element-wise modulus
                          greater, less, equal   Element-wise comparison, returns a boolean array

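Most binary functions in the table have an operator counterpart that gives the same result. A minimal sketch with the arrays x and y from the previous example:

```python
import numpy as np

x = np.array([3, -6, 8, 4, 3, 5])
y = np.array([3, 5, 7, 3, 5, 9])

# Each pair computes the same element-wise result
print(np.array_equal(np.subtract(x, y), x - y))   # True
print(np.array_equal(np.multiply(x, y), x * y))   # True
print(np.array_equal(np.mod(x, y), x % y))        # True
print(np.array_equal(np.greater(x, y), x > y))    # True

smaller = np.minimum(x, y)       # no operator form: element-wise minimum
print(smaller)                   # [ 3 -6  7  3  3  5]
```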
                          Data processing

                          np.meshgrid(array1, array2): coordinate matrices from coordinate
                          arrays.

                          Evaluate the function $f(x, y) = \sqrt{x^2 + y^2}$ on a 10 x 10 grid

                          p = np.arange(-5, 5, 0.01)
                          x, y = np.meshgrid(p, p)
                          print(x)

                          ## [[-5.   -4.99 -4.98 ...  4.97  4.98  4.99]
                          ##  [-5.   -4.99 -4.98 ...  4.97  4.98  4.99]
                          ##  [-5.   -4.99 -4.98 ...  4.97  4.98  4.99]
                          ##  ...
                          ##  [-5.   -4.99 -4.98 ...  4.97  4.98  4.99]
                          ##  [-5.   -4.99 -4.98 ...  4.97  4.98  4.99]
                          ##  [-5.   -4.99 -4.98 ...  4.97  4.98  4.99]]

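What np.meshgrid() returns is easier to see on a very small grid. The coordinate arrays px and py below are made up for this illustration.

```python
import numpy as np

px = np.array([0, 1, 2])         # coordinates along the x axis
py = np.array([10, 20])          # coordinates along the y axis

x, y = np.meshgrid(px, py)       # both results have shape (len(py), len(px))
print(x)                         # px repeated in every row
print(y)                         # py repeated in every column

total = x + y                    # evaluates x + y at every grid point
print(total)
```

Every pair (x[i, j], y[i, j]) is one grid point, so element-wise expressions evaluate a function on the whole grid without loops.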
                          Data processing

                          Evaluate the function $f(x, y) = \sqrt{x^2 + y^2}$ on a 10 x 10 grid.

                          import matplotlib.pyplot as plt

                          val = np.sqrt(x**2 + y**2)
                          plt.figure(figsize=(2, 2))
                          plt.imshow(val, cmap="hot")
                          plt.colorbar()

                          Data processing                                                      110
Essential
concepts
                          Evaluate the function $f(x, y) = \sqrt{x^2 + y^2}$ on a 10 x 10 grid.
                          plt.show()
                          [Figure: heat map of f(x, y) on the grid, with colorbar]
                          Conditional logic                                             111
Essential
concepts
                          np.where(condition, a, b): where condition is True, take the value
                          from a, otherwise take it from b (element-wise).
 Object-orientation
Applications
 Time series
 Moving window
 Financial applications
© 2018 PyEcon.org
                          Conditional logic                                112
Essential
concepts
 Getting started
 Procedural               Conditional logic, examples
 programming
 Object-orientation       print(arr3)
Numerical
programming               ## [[4 8 5]
 NumPy package
 NumPy array
                          ## [9 3 4]
 Linear Algebra           ## [1 0 6]]
Data formats and
handling                  res = np.where(arr3 < 5, 0, arr3)
 Pandas                   print(res)
 Series
 DataFrame
 Import/Export data
                          ## [[0 8 5]
                          ## [9 0 0]
Visual
illustrations             ## [0 0 6]]
 Matplotlib
 Figures and subplots     even = np.where(arr3 % 2 == 0, arr3, arr3 + 1)
 Plot types and styles
 Pandas visualization
                          print(even)
Applications
                          ## [[ 4   8   6]
 Time series
 Moving window
                          ## [10    4   4]
 Financial applications   ## [ 2    0   6]]
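
                          The examples above can be reproduced in a self-contained way. In this
                          sketch the definition of arr3 is an assumption reconstructed from its
                          printed values; the boolean-mask variant and the one-argument form of
                          np.where are additions for comparison:

```python
import numpy as np

# arr3 as printed on the slide (its definition is assumed here)
arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])

# Three-argument form: take 0 where the condition holds, else the old value
res = np.where(arr3 < 5, 0, arr3)

# Equivalent result using a boolean mask on a copy
masked = arr3.copy()
masked[masked < 5] = 0

# One-argument form: returns the indices where the condition is True
rows, cols = np.where(arr3 < 5)
print(rows)   # [0 1 1 2 2]
print(cols)   # [0 1 2 0 1]
```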
© 2018 PyEcon.org
                          Statistical methods                         113
Essential
concepts
 Getting started          array.mean(): mean of all array elements.
 Procedural
 programming              array.sum(): sum of all array elements.
 Object-orientation
Numerical
programming
                          Statistical methods
 NumPy package            print(arr3)
 NumPy array
 Linear Algebra
                          ## [[4 8 5]
Data formats and
handling
                          ## [9 3 4]
 Pandas                   ## [1 0 6]]
 Series
 DataFrame                print(arr3.mean())
 Import/Export data
Visual                    ## 4.444444444444445
illustrations
 Matplotlib
 Figures and subplots
                          print(arr3.sum())
 Plot types and styles
 Pandas visualization     ## 40
Applications
 Time series              print(arr3.argmin())
 Moving window
 Financial applications
                          ## 7
© 2018 PyEcon.org
                          Overview: statistical methods                               114
Essential
concepts
 Getting started
 Procedural
                              Method           Description
                              sum              Sum of all array elements
                              mean             Mean of all array elements
                              std, var         Standard deviation, variance
                              min, max         Minimum and maximum value in the array
                              argmin, argmax   Indices of the minimum and maximum value
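
                          A short sketch of the remaining methods from the table, again assuming
                          arr3 has the values printed earlier. Note that argmin and argmax return
                          a position in the flattened array; np.unravel_index is one way to
                          convert it back to a row and column:

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])  # assumed definition

print(arr3.min(), arr3.max())      # 0 9
print(arr3.std(), arr3.var())      # population versions (ddof=0) by default

# argmin/argmax index into the flattened array ...
flat = arr3.argmin()               # 7
# ... np.unravel_index converts that back to (row, column)
row, col = np.unravel_index(flat, arr3.shape)
print(row, col)                    # 2 1
```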
                          Axis                                                             115
Essential
concepts
                           Axes are defined for arrays with more than one dimension. A
                           two-dimensional array has two axes. The first runs vertically
                           downwards across the rows (axis=0), the second runs horizontally
                           across the columns (axis=1).
 NumPy package
 NumPy array
 Linear Algebra
                          Axis
Data formats and           print(arr3)
handling
 Pandas
 Series
                           ## [[4 8 5]
 DataFrame                 ## [9 3 4]
 Import/Export data        ## [1 0 6]]
Visual
illustrations              print(arr3.sum(axis=0))
 Matplotlib
                           ## [14 11 15]
 Figures and subplots
 Plot types and styles
 Pandas visualization
Applications
                           print(arr3.sum(axis=1))
 Time series
 Moving window             ## [17 16     7]
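
                           The axis argument works the same way for the other reductions. A
                           minimal sketch, assuming the arr3 values printed above:

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])  # assumed definition

col_means = arr3.mean(axis=0)      # collapses the rows: one value per column
row_max = arr3.max(axis=1)         # collapses the columns: one value per row
row_argmin = arr3.argmin(axis=1)   # column position of each row's minimum

print(col_means)
print(row_max)                     # [8 9 6]
print(row_argmin)                  # [0 1 1]
```

                           A useful way to read axis=k is "the axis that disappears from the
                           result".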
 Financial applications
© 2018 PyEcon.org
                          Sorting                                           116
Essential
concepts
                          array.sort(axis): sort the array in place along an axis.
 Procedural
 programming
 Object-orientation       Sorting one-dimensional arrays
Numerical
programming               print(arr2)
 NumPy package
 NumPy array              ## [24.3      0.     8.9   4.4    1.65 45.    ]
                          Sorting                                                        117
Essential
concepts
 Getting started
 Procedural
                          Sorting two-dimensional arrays
 programming
 Object-orientation       print(arr3)
Numerical
programming               ## [[4 8 5]
 NumPy package            ## [9 3 4]
 NumPy array
 Linear Algebra
                          ## [1 0 6]]
Data formats and
handling
                          arr3.sort()
 Pandas                   print(arr3)
 Series
 DataFrame                ## [[4 5 8]
                          ## [3 4 9]
 Import/Export data
Visual
illustrations
                          ## [0 1 6]]
 Matplotlib
 Figures and subplots     arr3.sort(axis=0)
 Plot types and styles    print(arr3)
 Pandas visualization
Applications              ## [[0 1 6]
 Time series
                          ## [3 4 8]
 Moving window
 Financial applications
                          ## [4 5 9]]
                          The default axis of sort() is -1, which means sorting along the
                          last axis (in this case axis 1).
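
                          Because array.sort() modifies the array in place, the two calls above
                          build on each other. The sketch below contrasts this with np.sort,
                          which returns a sorted copy (arr3 is assumed to start from its original
                          values):

```python
import numpy as np

arr3 = np.array([[4, 8, 5], [9, 3, 4], [1, 0, 6]])  # assumed definition

# np.sort returns a sorted copy and leaves arr3 untouched
by_row = np.sort(arr3)             # default axis=-1: sort within each row
by_col = np.sort(arr3, axis=0)     # sort within each column

# arr3.sort() sorts in place and returns None
result = arr3.sort()
print(result)                      # None
print(arr3)                        # now equal to by_row
```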
                          Section 2.3             118
Essential
concepts
 Getting started
 Procedural
 programming
 Object-orientation
                          Numerical programming
                          Inverse matrix                                               119
Essential
concepts
 Getting started
 Procedural               Import numpy.linalg
 programming
 Object-orientation       import numpy.linalg as nplin
Numerical
programming
 NumPy package            nplin.inv(array): inverse matrix.
 NumPy array
 Linear Algebra
                          np.allclose(array1, array2): returns True if two arrays are
                          element-wise equal within a tolerance.
handling
 Pandas
 Series
                          Inverse
 DataFrame                inv = nplin.inv(arr3)
 Import/Export data
                          print(inv)
Visual
illustrations
 Matplotlib
                          ## [[ 4. -21. 16.]
 Figures and subplots     ## [ -5. 24. -18.]
 Plot types and styles    ## [ 1. -4.    3.]]
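
                          One way to check an inverse is to multiply it with the original matrix
                          and compare against the identity using np.allclose. This sketch assumes
                          arr3 is in the state left by the in-place sorts on the sorting slides:

```python
import numpy as np
import numpy.linalg as nplin

# arr3 after the two in-place sorts (assumed state)
arr3 = np.array([[0, 1, 6], [3, 4, 8], [4, 5, 9]])

inv = nplin.inv(arr3)

# Floating-point results are compared with a tolerance, not with ==
identity = np.eye(3)
print(np.allclose(np.dot(arr3, inv), identity))   # True
print(np.allclose(np.dot(inv, arr3), identity))   # True
```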
 Pandas visualization
© 2018 PyEcon.org
                          Matrix functions                                        120
Essential
concepts
                          nplin.det(array): compute determinant.
 Procedural
 programming              np.trace(array): compute trace.
 Object-orientation
                          np.diag(array): return diagonal elements as an array.
Numerical
programming
 NumPy package            Linear algebra functions
 NumPy array
 Linear Algebra           print(nplin.det(arr3))
Data formats and
handling                  ## -1.0
 Pandas
 Series
 DataFrame
                          print(np.trace(arr3))
 Import/Export data
                          ## 13
Visual
illustrations
 Matplotlib               print(np.diag(arr3))
 Figures and subplots
 Plot types and styles
                          ## [0 4 9]
 Pandas visualization
Applications
 Time series
 Moving window
 Financial applications
© 2018 PyEcon.org
                          Eigenvalues and eigenvectors                                     121
Essential
concepts
                          nplin.eig(array): returns a tuple of the array of eigenvalues and the
                          array of eigenvectors (one eigenvector per column).
 Object-orientation
Applications
 Time series
 Moving window
 Financial applications
© 2018 PyEcon.org
                          Eigenvalues and eigenvectors                            122
Essential
concepts
 Getting started
 Procedural               Check eigenvalues and eigenvectors
 programming
 Object-orientation       print(eigenval * eigenvec)
Numerical
programming               ## [[-0.         -0.40824829 -1.41421356]
 NumPy package
 NumPy array
                          ## [-0.          -0.81649658 -1.41421356]
 Linear Algebra           ## [-1.          -0.40824829 0.         ]]
Data formats and
handling                  print(np.dot(A, eigenvec))
 Pandas
 Series                   ## [[ 0.         -0.40824829 -1.41421356]
 DataFrame
 Import/Export data
                          ## [ 0.          -0.81649658 -1.41421356]
                          ## [-1.          -0.40824829 0.         ]]
Visual
illustrations
 Matplotlib
 Figures and subplots
                          $$\begin{pmatrix} 3 & -1 & 0 \\ 2 & 0 & 0 \\ -2 & 2 & -1 \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = (-1) \cdot \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -1 \end{pmatrix}$$
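
                          The check above can be sketched end to end. The matrix A is taken from
                          the worked equation; the variable names eigenval and eigenvec follow the
                          slide, and how they were assigned there is an assumption:

```python
import numpy as np
import numpy.linalg as nplin

# Matrix A as it appears in the worked equation
A = np.array([[3, -1, 0], [2, 0, 0], [-2, 2, -1]])

eigenval, eigenvec = nplin.eig(A)

# Column eigenvec[:, i] belongs to eigenval[i];
# broadcasting scales column i by eigenval[i]
print(np.allclose(np.dot(A, eigenvec), eigenval * eigenvec))   # True

# The same check for a single eigenpair
i = 0
v = eigenvec[:, i]
print(np.allclose(np.dot(A, v), eigenval[i] * v))              # True
```

                          The order in which eig returns the eigenvalues is not guaranteed, so the
                          check compares whole columns instead of relying on a fixed position.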
 Financial applications
© 2018 PyEcon.org
                          QR decomposition                                               123
Essential
concepts
                          nplin.qr(array): QR decomposition, returns Q and R as arrays.
 Procedural
 programming
 Object-orientation       QR decomposition
Numerical
programming               Q, R = nplin.qr(arr3)
 NumPy package            print(Q)
 NumPy array
 Linear Algebra
                          ## [[ 0.          0.98058068 0.19611614]
Data formats and          ## [-0.6          0.15689291 -0.78446454]
handling
 Pandas
                          ## [-0.8         -0.11766968 0.58834841]]
 Series
 DataFrame                print(R)
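
                          The defining properties of the decomposition can be verified directly.
                          A minimal sketch, assuming arr3 is in its sorted state from the earlier
                          slides:

```python
import numpy as np
import numpy.linalg as nplin

arr3 = np.array([[0, 1, 6], [3, 4, 8], [4, 5, 9]])  # assumed state

Q, R = nplin.qr(arr3)

print(np.allclose(np.dot(Q, R), arr3))          # Q R reproduces the matrix
print(np.allclose(np.dot(Q.T, Q), np.eye(3)))   # Q has orthonormal columns
print(np.allclose(R, np.triu(R)))               # R is upper triangular
```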
 Import/Export data
© 2018 PyEcon.org
                          Linear system                                                    124
Essential
concepts
                          nplin.solve(A, b): return the solution of the linear system Ax = b.

                          Solve linear systems
Numerical
programming               b = np.array([7, 4, 8])
 NumPy package            x = nplin.solve(A, b)
 NumPy array
                          print(x)
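
                          A solution can be checked by substituting it back into the system. In
                          this sketch A is assumed to be the matrix from the eigenvalue example;
                          the slide does not show its definition:

```python
import numpy as np
import numpy.linalg as nplin

# A is assumed to be the matrix from the eigenvalue example
A = np.array([[3, -1, 0], [2, 0, 0], [-2, 2, -1]])
b = np.array([7, 4, 8])

x = nplin.solve(A, b)

# Substituting x back must reproduce b (up to rounding)
print(np.allclose(np.dot(A, x), b))    # True

# Same result via the explicit inverse; solve() avoids forming it
x_via_inv = np.dot(nplin.inv(A), b)
print(np.allclose(x, x_via_inv))       # True
```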
 Linear Algebra
© 2018 PyEcon.org
                          Overview: linear algebra                                125
Essential
concepts
 Getting started
 Procedural
                                    Function      Description
                                    np.dot        Matrix multiplication
                                    np.trace      Sum of the diagonal elements
                                    np.diag       Diagonal elements as an array
                                    nplin.det     Matrix determinant
                                    nplin.eig     Eigenvalues and eigenvectors
                                    nplin.inv     Inverse matrix
                                    nplin.qr      QR decomposition
                                    nplin.solve   Solve linear system
                          Chapter 3                   126
Essential
concepts
 Getting started
 Procedural
 programming
 Object-orientation
                          Data formats and handling
Numerical
programming
 NumPy package
 NumPy array
                          3.1   Pandas
 Linear Algebra
Applications
 Time series
 Moving window
 Financial applications
© 2018 PyEcon.org
                          Section 3.1                 127
Essential
concepts
 Getting started
 Procedural
 programming
 Object-orientation
                          Data formats and handling
                          Pandas                                                             128
                          Motivation                                                         129
Essential
concepts
 Getting started          With pandas you can import and visualize financial data in only a few
 Procedural
 programming              lines of code.
 Object-orientation
Numerical                 Motivation
programming
 NumPy package
                          import pandas as pd
 NumPy array              import matplotlib.pyplot as plt
 Linear Algebra
                          fig = plt.figure()
Data formats and          ax = fig.add_subplot(1, 1, 1)
handling
 Pandas
                          dow = pd.read_csv("data/dji.csv", index_col=0, parse_dates=True)
 Series                   close = dow["Close"]
 DataFrame                close.plot(ax=ax)
 Import/Export data
                          ax.set_xlabel("Date")
Visual
illustrations
                          ax.set_ylabel("Price")
 Matplotlib               ax.set_title("DJI")
 Figures and subplots     fig.savefig("out/dji.pdf", format="pdf")
 Plot types and styles
 Pandas visualization
Applications
 Time series
 Moving window
 Financial applications
© 2018 PyEcon.org
                          Motivation                                                               130
Essential
concepts
 Getting started
 Procedural
 programming
                          [Figure: line plot titled "DJI"; x-axis: Date (2006 to 2018), y-axis: Price]
© 2018 PyEcon.org
                          Section 3.2                 131
Essential
concepts
 Getting started
 Procedural
 programming
 Object-orientation
                          Data formats and handling
                          Series                                                             132
Essential
concepts
 Getting started           Series are a data structure in pandas.
 Procedural
 programming
 Object-orientation
                                 One-dimensional array-like object,
                                 Containing a sequence of values and a corresponding array of
                                 labels, called the index,
                                 The string representation of a Series displays the index on the
                                 left and the values on the right,
                                 The default index consists of the integers 0 through N-1.
 DataFrame
 Import/Export data
Visual
illustrations              String representation of a Series
                           ##   0     3
 Matplotlib
 Figures and subplots
 Plot types and styles     ##   1     7
 Pandas visualization      ##   2    -8
Applications               ##   3     4
 Time series
                           ##   4    26
                           ##   dtype: int64
 Moving window
 Financial applications
© 2018 PyEcon.org
                          Create Series                                                   133
Essential
concepts
 Getting started
 Procedural               Import pandas
 programming
 Object-orientation       import pandas as pd
Numerical
programming
 NumPy package            pd.Series(): one-dimensional array-like object including values and
 NumPy array
 Linear Algebra
                          an index.
Data formats and
handling
                          Series
 Pandas                   obj = pd.Series([2, -5, 9, 4])
 Series
 DataFrame
                          print(obj)
 Import/Export data
Visual
                          ##   0    2
illustrations             ##   1   -5
 Matplotlib               ##   2    9
 Figures and subplots
 Plot types and styles
                          ##   3    4
 Pandas visualization     ##   dtype: int64
Applications
 Time series
 Moving window
 Financial applications
                                Simple Series formed only from a list,
                                Index is added automatically.
© 2018 PyEcon.org
                          Create Series                                                      134
Essential
concepts
 Getting started
 Procedural
                          Series indexing vs. Numpy indexing
 programming
 Object-orientation       obj2 = pd.Series([2, -5, 9, 4], index=["a", "b", "c", "d"])
Numerical                 npobj = np.array([2, -5, 9, 4])
programming               print(obj2)
 NumPy package
                          ##   a    2
 NumPy array
 Linear Algebra
Visual
                          print(obj2["b"])
illustrations
 Matplotlib               ## -5
 Figures and subplots
 Plot types and styles
                          print(npobj[1])
 Pandas visualization
Applications              ## -5
 Time series
 Moving window
 Financial applications
Visual
illustrations
                          Pandas Series can be created from:
 Matplotlib
 Figures and subplots
                                Lists,
 Plot types and styles
 Pandas visualization
                                NumPy arrays,
Applications                    Dicts.
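
                          The three input types can be compared side by side. This is a minimal
                          sketch; the variable names are hypothetical and chosen for illustration:

```python
import numpy as np
import pandas as pd

from_list = pd.Series([2, -5, 9, 4])
from_array = pd.Series(np.array([2, -5, 9, 4]), index=["a", "b", "c", "d"])
from_dict = pd.Series({"a": 2, "b": -5, "c": 9, "d": 4})

# Lists and arrays get a default integer index unless one is passed;
# a dict supplies its keys as the index
print(from_list.index.tolist())    # [0, 1, 2, 3]
print(from_dict.index.tolist())    # ['a', 'b', 'c', 'd']

# Same labels and same values, regardless of the input type
print((from_array == from_dict).all())   # True
```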
 Time series
 Moving window
 Financial applications
© 2018 PyEcon.org
                          Create Series                                                      136
Essential
concepts
 Getting started
 Procedural
 programming
                                The index of the Series can be set manually,
                                Unlike with a NumPy array, you can use the index labels to
                                select single values,
                                Data contained in a dict can be passed to a Series. The index of
                                the resulting Series consists of the dict’s keys.
Data formats and
handling
 Pandas
 Series
                          Series from dicts
 DataFrame
 Import/Export data       dictdata = {"Göttingen": 117665, "Northeim": 28920,
Visual                                "Hannover": 532163, "Berlin": 3574830}
illustrations
                          obj3 = pd.Series(dictdata)
 Matplotlib
 Figures and subplots
                          print(obj3)
 Plot types and styles
 Pandas visualization     ##   Göttingen     117665
Applications              ##   Northeim       28920
 Time series              ##   Hannover      532163
 Moving window
                          ##   Berlin       3574830
 Financial applications
                          ##   dtype: int64
© 2018 PyEcon.org
                          Create Series                                                      137
Essential
concepts
 Getting started
 Procedural               Dict to Series with manual index
 programming
 Object-orientation       cities = ["Hamburg", "Göttingen", "Berlin", "Hannover"]
Numerical                 obj4 = pd.Series(dictdata, index=cities)
programming               print(obj4)
 NumPy package
 NumPy array
 Linear Algebra
                          ##   Hamburg             NaN
                          ##   Göttingen      117665.0
Data formats and
handling                  ##   Berlin       3574830.0
 Pandas                   ##   Hannover      532163.0
 Series
                          ##   dtype: float64
 DataFrame
 Import/Export data
Visual
illustrations
 Matplotlib
                                When passing a dict to a Series, the index can be set manually,
                                NaN (not a number) marks missing values where the index and the
                                dict do not match.
Applications
 Time series
 Moving window
 Financial applications
© 2018 PyEcon.org
                          Series properties                                                  138
Essential
concepts
 Getting started           Series.values: returns the values of a Series.
 Procedural
 programming               Series.index: returns the index of a Series.
 Object-orientation
Visual                     print(obj2.index)
illustrations
 Matplotlib
                           ## Index(['a', 'b', 'c', 'd'], dtype='object')
 Figures and subplots
 Plot types and styles
 Pandas visualization
                               The values and the index of a Series can be printed separately
                               (as an array and an Index object),
                               The default index is given as a RangeIndex.
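
                               Both properties can be inspected as follows. The sketch assumes obj
                               and obj2 are defined as on the earlier slides:

```python
import numpy as np
import pandas as pd

obj = pd.Series([2, -5, 9, 4])
obj2 = pd.Series([2, -5, 9, 4], index=["a", "b", "c", "d"])

# .values returns the data as a NumPy array
print(isinstance(obj2.values, np.ndarray))   # True

# Without an explicit index, pandas creates a RangeIndex
default_index = obj.index
print(default_index)       # RangeIndex(start=0, stop=4, step=1)

# The index can be replaced as a whole by assignment
obj.index = ["w", "x", "y", "z"]
print(obj["y"])            # 9
```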
© 2018 PyEcon.org
                          Selecting and manipulating values                                139
Essential
concepts
 Getting started
 Procedural               Series manipulation
 programming
 Object-orientation       print(obj2[["c", "d", "a"]])
Numerical
programming               ##   c    9
 NumPy package
 NumPy array
                          ##   d    4
 Linear Algebra           ##   a    2
Data formats and
                          ##   dtype: int64
handling
 Pandas                   print(obj2[obj2 < 0])
 Series
 DataFrame
 Import/Export data
                          ## b   -5
                          ## dtype: int64
Visual
illustrations
 Matplotlib
 Figures and subplots
                          NumPy-like functions can be applied to Series
                                To filter data,
                                To do scalar multiplication or apply math functions,
                                The index-value link is preserved.
 Financial applications
© 2018 PyEcon.org
                          Selecting and manipulating values                                140
Essential
concepts
 Getting started
 Procedural
                          Series functions
                          print(obj2 * 2)
 programming
 Object-orientation
Numerical
programming               ##   a     4
 NumPy package            ##   b   -10
 NumPy array              ##   c    18
 Linear Algebra
                          ##   d     8
Data formats and
handling
                          ##   dtype: int64
 Pandas
 Series                   print(np.exp(obj2)["a":"c"])
 DataFrame
 Import/Export data
                          ##   a       7.389056
Visual                    ##   b       0.006738
illustrations
 Matplotlib
                          ##   c    8103.083928
 Figures and subplots     ##   dtype: float64
 Plot types and styles
 Pandas visualization     print("c" in obj2)
Applications
 Time series              ## True
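
                          Two details of the examples above are easy to miss: slicing by label
                          includes the end label, and the in operator looks at the index. A
                          minimal sketch, assuming obj2 as defined earlier:

```python
import pandas as pd

obj2 = pd.Series([2, -5, 9, 4], index=["a", "b", "c", "d"])

by_label = obj2["a":"c"]        # label slice: includes the end label "c"
by_position = obj2.iloc[0:2]    # position slice: excludes position 2

print(by_label.index.tolist())      # ['a', 'b', 'c']
print(by_position.index.tolist())   # ['a', 'b']

# "in" tests membership in the index, not in the values
print("c" in obj2)   # True
print(9 in obj2)     # False
```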
 Moving window
 Financial applications
Numerical
programming
                          NaN
 NumPy package            print(pd.isnull(obj4))
 NumPy array
 Linear Algebra
                          ##   Hamburg       False
Data formats and
handling
                          ##   Göttingen     False
 Pandas                   ##   Berlin        False
 Series                   ##   Hannover      False
 DataFrame
                          ##   dtype: bool
 Import/Export data
Visual                    print(pd.notnull(obj4))
illustrations
 Matplotlib
 Figures and subplots     ##   Hamburg       True
 Plot types and styles    ##   Göttingen     True
 Pandas visualization
                          ##   Berlin        True
Applications              ##   Hannover      True
 Time series
 Moving window
                          ##   dtype: bool
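
                          For comparison, here is how the two functions behave when a value is
                          actually missing. The Series s below is hypothetical and not part of the
                          slides:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value
s = pd.Series([1.5, np.nan, 3.0], index=["x", "y", "z"])

print(pd.isnull(s).tolist())    # [False, True, False]
print(s.isnull().sum())         # number of missing values: 1

# The boolean mask from notnull() can be used to filter
print(s[s.notnull()].index.tolist())   # ['x', 'z']

# Reductions such as sum() skip NaN by default
print(s.sum())                  # 4.5
```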
 Financial applications
© 2018 PyEcon.org
                          Align differently indexed data                                      143
Essential
concepts
                           Hamburg and Northeim each appear in only one of the two Series, so
                           there are no two values to align and the result is marked with NaN
                           (not a number).
 Object-orientation
Numerical
programming
 NumPy package
                                Data 1                         Data 2
 NumPy array
                                print(obj3)                    print(obj4)
 Linear Algebra
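
                          The alignment can be sketched as follows. The definitions of obj3 and
                          obj4 are assumed to be the ones from the "Create Series" slides; any
                          later changes to their values are not reflected here:

```python
import pandas as pd

dictdata = {"Göttingen": 117665, "Northeim": 28920,
            "Hannover": 532163, "Berlin": 3574830}
cities = ["Hamburg", "Göttingen", "Berlin", "Hannover"]

obj3 = pd.Series(dictdata)
obj4 = pd.Series(dictdata, index=cities)

# Addition matches values by index label, not by position
total = obj3 + obj4
print(total)

# Labels without a valid value on both sides end up as NaN
missing = total[total.isnull()].index.tolist()
print(missing)
```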
Numerical                 Naming
programming
 NumPy package            obj4.name = "population"
 NumPy array
                          obj4.index.name = "city"
 Linear Algebra
                          print(obj4)
Data formats and
handling
 Pandas
                          ##   city
 Series                   ##   Hamburg      1900000.0
 DataFrame                ##   Göttingen     117665.0
                          ##   Berlin       3600000.0
 Import/Export data
Visual
illustrations
                          ##   Hannover     1100000.0
 Matplotlib               ##   Name: population, dtype: float64
 Figures and subplots
 Plot types and styles
 Pandas visualization
                                Setting the attribute name changes the name of the existing Series,
                                Neither the Series nor the index has a name by default.
 Financial applications
© 2018 PyEcon.org
                          Series vs. NumPy arrays                                         145
Essential
concepts
 Getting started
 Procedural
 programming
                              NumPy arrays are accessed by their integer position.
                              Series can be defined and accessed by your own index, including
                              letters and numbers.
                              Different Series can be aligned efficiently by the index.
                              Series can work with missing values, so operations do not
                              automatically fail.
                          Section 3.3                 146
Essential
concepts
 Getting started
 Procedural
 programming
 Object-orientation
                          Data formats and handling
                          DataFrame                                                                 147
Essential
concepts
 Getting started
 Procedural
 programming
                                   DataFrames are the primary structure of pandas,
                                   A DataFrame represents a table of data with an ordered
                                   collection of columns,
                                   Each column can have a different data type,
                                   A DataFrame can be thought of as a dict of Series sharing the
                                   same index,
                                   Physically a DataFrame is two-dimensional, but by using
                                   hierarchical indexing it can represent higher-dimensional data.
 Import/Export data
Visual
illustrations             String representation of a DataFrame
 Matplotlib
 Figures and subplots     ##        company    price    volume
 Plot types and styles
                          ##   0    Daimler    69.20   4456290
                          ##   1       E.ON     8.11   3667975
 Pandas visualization
Applications
                          ##   2    Siemens   110.92   3669487
 Time series
 Moving window
                          ##   3       BASF    87.28   1778058
 Financial applications   ##   4        BMW    87.81   1824582
© 2018 PyEcon.org
                          DataFrame                                                                   148
Essential
concepts
                          pd.DataFrame(): a DataFrame is a table-like structure. It is
                          two-dimensional and has labeled axes (rows and columns).
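
                          A DataFrame like the one in the string representation shown earlier
                          could be built from a dict of lists. This is a sketch only: the values
                          are copied from that printout, and the variable names data and frame are
                          assumptions:

```python
import pandas as pd

# Values copied from the printed DataFrame; names are assumptions
data = {"company": ["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
        "price": [69.20, 8.11, 110.92, 87.28, 87.81],
        "volume": [4456290, 3667975, 3669487, 1778058, 1824582]}
frame = pd.DataFrame(data)

print(frame.shape)              # (5, 3)
print(frame.dtypes)             # one dtype per column
print(frame.index)              # default RangeIndex, as for a Series
print(frame.columns.tolist())   # ['company', 'price', 'volume']
```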
 Object-orientation
Applications
© 2018 PyEcon.org
                          Inputs to DataFrame constructor                                     150
Essential
concepts
 Getting started
 Procedural
                            Type                               Description
                            2D NumPy array                     A matrix of data
                            dict of arrays, lists, or tuples   Each sequence becomes a column
                            dict of Series                     Each value becomes a column
                            dict of dicts                      Each inner dict becomes a column
                            List of dicts or Series            Each item becomes a row
                            List of lists or tuples            Treated like a 2D NumPy array
                            Another DataFrame                  Same indexes
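
                          Three of the input types from the table, sketched with small
                          hypothetical data (the labels "u" and "v" and the variable names are
                          made up for illustration):

```python
import numpy as np
import pandas as pd

# 2D NumPy array: a matrix of data, column labels passed separately
from_array = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["u", "v"])

# dict of dicts: outer keys become columns, inner keys become the index
from_nested = pd.DataFrame({"price": {"BASF": 87.28, "BMW": 87.81},
                            "volume": {"BASF": 1778058, "BMW": 1824582}})

# list of dicts: each dict becomes a row, keys become columns
from_rows = pd.DataFrame([{"company": "BASF", "price": 87.28},
                          {"company": "BMW", "price": 87.81}])

print(from_array.shape)              # (3, 2)
print(from_nested.index.tolist())    # ['BASF', 'BMW']
print(from_rows.columns.tolist())    # ['company', 'price']
```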
Indexing and adding DataFrames

Add data to DataFrame

frame2["change"] = [1.2, -3.2, 0.4, -0.12, 2.4]
print(frame2["change"])

##   0    1.20
##   1   -3.20
##   2    0.40
##   3   -0.12
##   4    2.40
##   Name: change, dtype: float64

Selecting a column of a DataFrame returns a Series.
Attribute-like access, e.g., frame2.change, is also possible.
The returned Series has the same index as the initial DataFrame.

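The following sketch illustrates the three remarks on column access. The small frame2 built here is a stand-in for the one on the slides. Attribute access only works when the column name is a valid Python identifier that does not clash with a DataFrame method, and a new column has to be created with the bracket syntax.

```python
import pandas as pd

# stand-in for frame2, with values from the slides
frame2 = pd.DataFrame({
    "company": ["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
    "price": [69.20, 8.11, 110.92, 87.28, 87.81],
})

# assigning a list of matching length adds a new column
frame2["change"] = [1.2, -3.2, 0.4, -0.12, 2.4]

# bracket access and attribute access return the same Series
by_key = frame2["change"]
by_attr = frame2.change
print(by_key.equals(by_attr))

# the Series shares the row index of the DataFrame
print(by_key.index.equals(frame2.index))
```
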
Indexing DataFrames

Indexing DataFrames

print(frame2[["company", "change"]])

##        company   change
##   0    Daimler     1.20
##   1       E.ON    -3.20
##   2    Siemens     0.40
##   3       BASF    -0.12
##   4        BMW     2.40

Changing DataFrames

del DataFrame[column]: delete column from DataFrame.

DataFrame delete column

del frame2["volume"]
print(frame2)

##       company    price   change
##   0   Daimler    69.20     1.20
##   1      E.ON     8.11    -3.20
##   2   Siemens   110.92     0.40
##   3      BASF    87.28    -0.12
##   4       BMW    87.81     2.40

print(frame2.columns)

## Index(['company', 'price', 'change'], dtype='object')

Naming DataFrames

Naming properties

frame2.index.name = "number:"
frame2.columns.name = "feature:"
print(frame2)

##   feature:   company   price    change
##   number:
##   0          Daimler    69.20    1.20
##   1             E.ON     8.11   -3.20
##   2          Siemens   110.92    0.40
##   3             BASF    87.28   -0.12
##   4              BMW    87.81    2.40

In DataFrames there is no default name for the index or the columns.

Reindexing

DataFrame.reindex(): creates a new DataFrame with data conformed to a new index; the initial DataFrame is not changed.

Reindexing

frame3 = frame.reindex([0, 2, 3, 4])
print(frame3)

##        company    price    volume
##   0    Daimler    69.20   4456290
##   2    Siemens   110.92   3669487
##   3       BASF    87.28   1778058
##   4        BMW    87.81   1824582

Index values that are not already present are filled with NaN by default.
There are many options for filling missing values.

Reindexing

Filling missing values

frame4 = frame.reindex(index=[0, 2, 3, 4, 5], fill_value=0,
                       columns=["company", "price", "market cap"])
print(frame4)

##       company    price   market cap
##   0   Daimler    69.20            0
##   2   Siemens   110.92            0
##   3      BASF    87.28            0
##   4       BMW    87.81            0
##   5         0     0.00            0

frame4 = frame.reindex(index=[0, 2, 3, 4], fill_value=np.nan,
                       columns=["company", "price", "market cap"])
print(frame4)

##       company    price   market cap
##   0   Daimler    69.20          NaN
##   2   Siemens   110.92          NaN
##   3      BASF    87.28          NaN
##   4       BMW    87.81          NaN

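A short sketch of reindexing on a reduced, numeric-only frame (an assumption made to keep the example small): a label that is missing from the new index is dropped, a label that is new is filled with NaN or with fill_value, and the original frame stays untouched.

```python
import pandas as pd

# reduced stand-in for frame with numeric columns only
frame = pd.DataFrame({
    "price": [69.20, 8.11, 110.92],
    "volume": [4456290, 3667975, 3669487],
})

# label 1 is dropped, label 5 is new and therefore filled with NaN
frame_nan = frame.reindex([0, 2, 5])

# the same call with an explicit fill value for new entries
frame_zero = frame.reindex([0, 2, 5], fill_value=0)

print(frame_nan)
print(frame_zero)
print(frame.shape)  # the original frame is unchanged
```
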
Fill NaN

DataFrame.fillna(value): fill NaN entries with value.

Filling NaN

print(frame4[:3])

##      company    price   market cap
## 0    Daimler    69.20          NaN
## 2    Siemens   110.92          NaN
## 3       BASF    87.28          NaN

frame4.fillna(1000000, inplace=True)
print(frame4[:3])

##      company    price   market cap
## 0    Daimler    69.20    1000000.0
## 2    Siemens   110.92    1000000.0
## 3       BASF    87.28    1000000.0

The option inplace=True modifies the current DataFrame (here frame4). Without inplace, a new DataFrame with the NaN values filled is returned and the original DataFrame keeps its NaN values.

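To make the effect of omitting inplace concrete, here is a minimal sketch on a small stand-in for frame4 (the variable filled is illustrative). The call returns a new DataFrame; the original only changes if the result is assigned back.

```python
import numpy as np
import pandas as pd

# stand-in for frame4 with an all-NaN column
frame4 = pd.DataFrame({
    "price": [69.20, 110.92, 87.28],
    "market cap": [np.nan, np.nan, np.nan],
})

# without inplace: a new DataFrame is returned, frame4 keeps its NaN values
filled = frame4.fillna(1000000)
print(frame4["market cap"].isna().all())
print(filled["market cap"].tolist())

# assigning the result back is a common alternative to inplace=True
frame4 = frame4.fillna(1000000)
print(frame4["market cap"].isna().any())
```
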
Dropping entries

DataFrame.drop(index, axis): returns a new object with labels in requested axis removed.

Dropping index

frame5 = frame
print(frame5)

##       company    price    volume
##   0   Daimler    69.20   4456290
##   1      E.ON     8.11   3667975
##   2   Siemens   110.92   3669487
##   3      BASF    87.28   1778058
##   4       BMW    87.81   1824582

print(frame5.drop([1, 2]))

## 0     Daimler   69.20    4456290
## 3        BASF   87.28    1778058
## 4         BMW   87.81    1824582

Dropping entries

Dropping column

print(frame5[:2])

##       company   price    volume
## 0     Daimler   69.20   4456290
## 1        E.ON    8.11   3667975

print(frame5.drop("price", axis=1)[:3])

##       company    volume
## 0     Daimler   4456290
## 1        E.ON   3667975
## 2     Siemens   3669487

print(frame5.drop(2, axis=0))

##   0   Daimler   69.20   4456290
##   1      E.ON    8.11   3667975
##   3      BASF   87.28   1778058
##   4       BMW   87.81   1824582

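The sketch below collects the drop variants in one place and checks that the source frame is left unchanged. The keyword form drop(columns=...) is an equivalent spelling that the slides do not use; the result names are illustrative.

```python
import pandas as pd

frame5 = pd.DataFrame({
    "company": ["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
    "price": [69.20, 8.11, 110.92, 87.28, 87.81],
    "volume": [4456290, 3667975, 3669487, 1778058, 1824582],
})

rows_dropped = frame5.drop([1, 2])            # axis=0 is the default: drop rows
cols_dropped = frame5.drop("price", axis=1)   # axis=1: drop a column
same_cols = frame5.drop(columns="price")      # keyword form of the line above

print(rows_dropped.index.tolist())
print(cols_dropped.columns.tolist())
print(frame5.shape)  # drop returned new objects, frame5 is unchanged
```
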
Indexing, selecting and filtering

Indexing of DataFrames works like indexing a NumPy array; you can use the default index values or a manually set index.

Indexing

print(frame)

##        company    price    volume
##   0    Daimler    69.20   4456290
##   1       E.ON     8.11   3667975
##   2    Siemens   110.92   3669487
##   3       BASF    87.28   1778058
##   4        BMW    87.81   1824582

print(frame[2:])

##        company    price    volume
## 2      Siemens   110.92   3669487
## 3         BASF    87.28   1778058
## 4          BMW    87.81   1824582

Indexing, selecting and filtering

Indexing

frame6 = pd.DataFrame(data, index=["a", "b", "c", "d", "e"])
print(frame6)

##        company    price    volume
##   a    Daimler    69.20   4456290
##   b       E.ON     8.11   3667975
##   c    Siemens   110.92   3669487
##   d       BASF    87.28   1778058
##   e        BMW    87.81   1824582

print(frame6["b":"d"])

##        company    price    volume
## b         E.ON     8.11   3667975
## c      Siemens   110.92   3669487
## d         BASF    87.28   1778058

When slicing with labels the end element is inclusive.

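A small sketch contrasting the two slicing rules on a one-column stand-in for frame6: a label slice includes its end label, while an integer slice excludes its end position, as in plain Python.

```python
import pandas as pd

frame6 = pd.DataFrame(
    {"price": [69.20, 8.11, 110.92, 87.28, 87.81]},
    index=["a", "b", "c", "d", "e"],
)

by_label = frame6["b":"d"]   # label slice: the end label "d" is included
by_position = frame6[1:3]    # integer slice: position 3 is excluded

print(by_label.index.tolist())
print(by_position.index.tolist())
```
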
Indexing, selecting and filtering

DataFrame.loc[]: select a subset of rows and columns from a DataFrame using axis labels.
DataFrame.iloc[]: select a subset of rows and columns from a DataFrame using integer positions.
Both are indexers and are used with square brackets, not called like functions.

Selection with loc and iloc

print(frame6.loc["c", ["company", "price"]])

## company    Siemens
## price       110.92
## Name: c, dtype: object

print(frame6.iloc[2, [0, 1]])

## company    Siemens
## price       110.92
## Name: c, dtype: object

Indexing, selecting and filtering

Selection with loc and iloc

print(frame6.loc[["c", "d", "e"], ["volume", "price", "company"]])

##       volume    price   company
## c    3669487   110.92   Siemens
## d    1778058    87.28      BASF
## e    1824582    87.81       BMW

print(frame6.iloc[2:, ::-1])

##       volume    price   company
## c    3669487   110.92   Siemens
## d    1778058    87.28      BASF
## e    1824582    87.81       BMW

Both indexers work with slices and with lists: of labels for loc, of integer positions for iloc.
There are many ways to select and rearrange pandas objects.

DataFrame indexing options

Type                 Description
df[val]              Select single column or set of columns
df.loc[val]          Select single row or set of rows
df.loc[:, val]       Select single column or set of columns
df.loc[val1, val2]   Select row and column by label
df.iloc[where]       Select row or set of rows by integer position
df.iloc[:, where]    Select column or set of columns by integer position
df.iloc[w1, w2]      Select row and column by integer position

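To tie the indexing options together, the sketch below selects the same cell and the same block once by label and once by integer position. The frame is rebuilt from the values shown on the slides; block_loc and block_iloc are illustrative names.

```python
import pandas as pd

frame6 = pd.DataFrame(
    {
        "company": ["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
        "price": [69.20, 8.11, 110.92, 87.28, 87.81],
        "volume": [4456290, 3667975, 3669487, 1778058, 1824582],
    },
    index=["a", "b", "c", "d", "e"],
)

# one cell, addressed by label and by integer position
print(frame6.loc["c", "price"])
print(frame6.iloc[2, 1])

# a block of rows and columns, again selected both ways
block_loc = frame6.loc["c":"e", ["price", "volume"]]   # end label included
block_iloc = frame6.iloc[2:5, 1:3]                     # end position excluded
print(block_loc.equals(block_iloc))
```
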
Hierarchical indexing

Hierarchical indexing enables you to have multiple index levels.

Multiindex

ind = [["a", "a", "a", "b", "b"], [1, 2, 3, 1, 2]]
frame6 = pd.DataFrame(np.arange(15).reshape((5, 3)),
                      index=ind,
                      columns=["first", "second", "third"])
print(frame6)

##       first   second   third
## a 1       0        1       2
##   2       3        4       5
##   3       6        7       8
## b 1       9       10      11
##   2      12       13      14

frame6.index.names = ["index1", "index2"]
print(frame6.index)

## MultiIndex(levels=[['a', 'b'], [1, 2, 3]],
##            labels=[[0, 0, 0, 1, 1], [0, 1, 2, 0, 1]],
##            names=['index1', 'index2'])

Hierarchical indexing

Selecting of a multiindex

print(frame6.loc["a"])

##            first   second   third
##   index2
##   1            0        1      2
##   2            3        4      5
##   3            6        7      8

print(frame6.loc["b", 1])

##   first      9
##   second    10
##   third     11
##   Name: (b, 1), dtype: int64

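A self-contained sketch of selection on a two-level index. The frame is named frame_mi here to avoid confusion with the earlier frame6, and the xs call for selecting on the inner level is an addition that the slides do not show.

```python
import numpy as np
import pandas as pd

ind = [["a", "a", "a", "b", "b"], [1, 2, 3, 1, 2]]
frame_mi = pd.DataFrame(
    np.arange(15).reshape((5, 3)),
    index=ind,
    columns=["first", "second", "third"],
)
frame_mi.index.names = ["index1", "index2"]

# outer level only: returns the sub-frame indexed by the inner level
print(frame_mi.loc["a"])

# a tuple addresses both levels at once and returns one row as a Series
print(frame_mi.loc[("b", 1)])

# xs selects on the inner level across all outer labels
print(frame_mi.xs(1, level="index2"))
```
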
Operations between DataFrames and Series

Series and DataFrames

frame7 = frame[["price", "volume"]]
frame7.index = ["Daimler", "E.ON", "Siemens", "BASF", "BMW"]
series = frame7.iloc[2]
print(frame7)

##              price    volume
##   Daimler    69.20   4456290
##   E.ON        8.11   3667975
##   Siemens   110.92   3669487
##   BASF       87.28   1778058
##   BMW        87.81   1824582

print(series)

## price         110.92
## volume    3669487.00
## Name: Siemens, dtype: float64

Here the Series was generated from the third row of the DataFrame (integer position 2, Siemens).

Operations between DataFrames and Series

Operations between Series and DataFrames down the rows

print(frame7 + series)

##              price      volume
##   Daimler   180.12   8125777.0
##   E.ON      119.03   7337462.0
##   Siemens   221.84   7338974.0
##   BASF      198.20   5447545.0
##   BMW       198.73   5494069.0

Operations between DataFrames and Series

Operations between Series and DataFrames down the columns

series2 = frame7["price"]
print(frame7.add(series2, axis=0))

##              price       volume
##   Daimler   138.40   4456359.20
##   E.ON       16.22   3667983.11
##   Siemens   221.84   3669597.92
##   BASF      174.56   1778145.28
##   BMW       175.62   1824669.81

Here, the Series was generated from the price column.
The arithmetic operation is broadcast column by column, matching on the DataFrame's row index (axis=0).

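The two broadcasting directions can be compared side by side. This sketch uses a three-row stand-in for frame7; the names row, col, by_columns and by_rows are illustrative.

```python
import pandas as pd

frame7 = pd.DataFrame(
    {"price": [69.20, 8.11, 110.92], "volume": [4456290, 3667975, 3669487]},
    index=["Daimler", "E.ON", "Siemens"],
)

# a row as Series: its index holds the column labels, so "+" matches on columns
row = frame7.loc["Siemens"]
by_columns = frame7 + row

# a column as Series: its index holds the row labels, so axis=0 is required
col = frame7["price"]
by_rows = frame7.add(col, axis=0)

print(by_columns)
print(by_rows)
```

Without axis=0, frame7 + col would try to match the company names against the column labels price and volume and produce only NaN values.
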
Operations between DataFrames and Series

Pandas vs Numpy

nparr = np.arange(12.).reshape((3, 4))
row = nparr[0]
print(nparr-row)

## [[0. 0. 0. 0.]
##  [4. 4. 4. 4.]
##  [8. 8. 8. 8.]]

Operations between DataFrames and Series are similar to operations between one- and two-dimensional NumPy arrays.
As with DataFrames and Series, the arithmetic operation is broadcast down the rows.

NumPy functions on DataFrames

DataFrame.apply(np.function, axis): applies a NumPy function along the given DataFrame axis.
See also statistical and mathematical NumPy functions.

Numpy functions on DataFrames

print(frame7[:2])

##           price    volume
## Daimler   69.20   4456290
## E.ON       8.11   3667975

print(frame7.apply(np.mean))

## price          72.664
## volume    3079278.400
## dtype: float64

print(frame7.apply(np.sqrt)[:2])

##              price        volume
## Daimler   8.318654   2110.992657
## E.ON      2.847806   1915.195812

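The axis argument decides whether the function sees one column or one row at a time. The sketch below assumes the default axis=0 for the column means and uses axis=1 for a row-wise maximum; col_means and row_max are illustrative names.

```python
import numpy as np
import pandas as pd

frame7 = pd.DataFrame(
    {"price": [69.20, 8.11, 110.92, 87.28, 87.81],
     "volume": [4456290, 3667975, 3669487, 1778058, 1824582]},
    index=["Daimler", "E.ON", "Siemens", "BASF", "BMW"],
)

col_means = frame7.apply(np.mean)         # axis=0 (default): one value per column
row_max = frame7.apply(np.max, axis=1)    # axis=1: one value per row

# the built-in reductions give the same numbers without apply
print(np.allclose(col_means, frame7.mean()))
print(np.allclose(row_max, frame7.max(axis=1)))
```
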
Grouping DataFrames

DataFrame.groupby([col1, col2]): group DataFrame by columns (grouping by a single column or by more than two columns is also possible).
See also how to import data from CSV files.

Groupby

vote = pd.read_csv("data/vote.csv")[["Party", "Member", "Vote"]]
print(vote.head())

##         Party      Member     Vote
##   0   CDU/CSU    Abercron      yes
##   1   CDU/CSU      Albani      yes
##   2   CDU/CSU   Altenkamp      yes
##   3   CDU/CSU    Altmaier   absent
##   4   CDU/CSU      Amthor      yes

Adding the functions count() or mean() to groupby() returns the count or the mean of the grouped columns.

Grouping DataFrames

Groupby

res = vote.groupby(["Party", "Vote"]).count()
print(res)

##                         Member
##   Party        Vote
##   AfD          absent        6
##                no           86
##   BÜ90/GR      absent        9
##                no           58
##   CDU/CSU      absent        7
##                yes         239
##   DIE LINKE.   absent        7
##                no           62
##   FDP          absent        5
##                no           75
##   Fraktionslos absent        1
##                no            1
##   SPD          absent        6
##                yes         147

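Since vote.csv is not reproduced here, the following sketch uses a small made-up table in the same three-column layout (the party labels and member names are placeholders, not real data). It shows that count() returns the number of entries per group, and that size() is a related call returning a Series.

```python
import pandas as pd

# small made-up sample in the layout of vote.csv (not the real data)
vote = pd.DataFrame({
    "Party": ["A", "A", "A", "B", "B"],
    "Member": ["m1", "m2", "m3", "m4", "m5"],
    "Vote": ["yes", "yes", "absent", "no", "no"],
})

# count(): number of non-missing entries per group, not a sum
counts = vote.groupby(["Party", "Vote"]).count()
print(counts)

# size(): number of rows per group, returned as a Series
sizes = vote.groupby(["Party", "Vote"]).size()
print(sizes)
```
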
Section 3.4

Data formats and handling

Reading data in text format

ex1.csv

   a, b, c, d, hello
   1, 2, 3, 4, world
   5, 6, 7, 8, python
   2, 3, 5, 7, pandas

Reading data in text format

tab.txt

   a| b| c| d| hello
   1| 2| 3| 4| world
   5| 6| 7| 8| python
   2| 3| 5| 7| pandas

Reading data in text format

ex2.csv

   1, 2, 3, 4, world
   5, 6, 7, 8, python
   2, 3, 5, 7, pandas

Reading data in text format

ex3.csv

   1, 2, 3, 4, world
   #+#-.,.-'*'-.,
   5, 6, 7, 8, python
   87646756754456978
   2, 3, 5, 7, pandas

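The example files above can be read with pd.read_csv. The sketch below is one plausible way to do it, not necessarily the calls used in the course: the file contents are typed in as strings (without the spaces after the separators) so that it runs without the data folder, and the column names passed to names are an assumption. For files that do contain spaces after the separator, skipinitialspace=True strips them.

```python
import io
import pandas as pd

# file contents as strings, so no data folder is needed
ex1 = "a,b,c,d,hello\n1,2,3,4,world\n5,6,7,8,python\n2,3,5,7,pandas\n"
tab = "a|b|c|d|hello\n1|2|3|4|world\n5|6|7|8|python\n2|3|5|7|pandas\n"
ex2 = "1,2,3,4,world\n5,6,7,8,python\n2,3,5,7,pandas\n"
ex3 = "1,2,3,4,world\n#+#-.,.-'*'-.,\n5,6,7,8,python\n87646756754456978\n"

df1 = pd.read_csv(io.StringIO(ex1))               # first line is the header
df2 = pd.read_csv(io.StringIO(tab), sep="|")      # custom separator
df3 = pd.read_csv(io.StringIO(ex2), header=None)  # file has no header line
df4 = pd.read_csv(io.StringIO(ex2),
                  names=["a", "b", "c", "d", "hello"])  # supply column names
df5 = pd.read_csv(io.StringIO(ex3), header=None,
                  skiprows=[1, 3])                # skip the two junk lines

print(df1.equals(df2))
print(df3.columns.tolist())
print(df4.columns.tolist())
print(df5.shape)
```
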
Writing data to text file

DataFrame.to_csv("filename"): write a DataFrame to a CSV file.

Write to CSV

df = pd.read_csv("data/ex3.csv", skiprows=[1, 3])
df.to_csv("out/out1.csv")

out1.csv

   ,1, 2, 3, 4, world
   0,5,6,7,8, python
   1,2,3,5,7, pandas

The .csv file includes both the index and the header. The index column has no name, which is why the header line starts with ",1".

                          Writing data to text file                               182
Essential
concepts
 Getting started
 Procedural                Write to CSV and settings
 programming
 Object-orientation        df = pd.read_csv("data/ex3.csv", skiprows=[1, 3])
Numerical                  df.to_csv("out/out2.csv", index=False, header=False)
programming
 NumPy package
 NumPy array               out2.csv
 Linear Algebra
Visual
illustrations
 Matplotlib
 Figures and subplots
 Plot types and styles
 Pandas visualization
Applications
 Time series
 Moving window
 Financial applications
Writing data to text file

Write to CSV and specify header

df = pd.read_csv("data/ex3.csv", skiprows=[1, 3, 4])
df.to_csv("out/out3.csv", index=False,
          header=["a", "b", "c", "d", "e"])

out3.csv

   a,b,c,d,e
   5,6,7,8, python

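The three to_csv variants can be compared side by side without touching the file system. The following is a minimal sketch, not part of the original slides: the string raw is an assumed stand-in for data/ex3.csv, and to_csv is called without a path so that it returns the CSV text instead of writing a file.

```python
import io

import pandas as pd

# Assumed stand-in for data/ex3.csv: rows 1 and 3 are junk and get skipped
raw = (
    "1, 2, 3, 4, world\n"
    "#+#-.,.-'*'-.,\n"
    "5, 6, 7, 8, python\n"
    "87646756754456978\n"
    "2, 3, 5, 7, pandas\n"
)
df = pd.read_csv(io.StringIO(raw), skiprows=[1, 3])

# Without a path argument, to_csv returns the CSV text as a string
with_index = df.to_csv()                              # index and header
plain = df.to_csv(index=False, header=False)          # data rows only
renamed = df.to_csv(index=False,
                    header=["a", "b", "c", "d", "e"])  # custom header

print(with_index)
print(plain)
print(renamed)
```

Printing the three strings shows how index and header change the first column and the first line of the output.
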
Reading Excel files

pd.read_excel("file.xls"): read an .xls file into a DataFrame.

Reading Excel files

Excel as a DataFrame

print(xls_frame[["Adj Close", "Volume", "High"]])

##         Adj Close    Volume          High
##   0   1169.939941   1538700   1173.000000
##   1   1167.699951   2412100   1174.000000
##   2   1111.900024   4857900   1123.069946
##   3   1055.800049   3798300   1110.000000
##   4   1080.599976   3448000   1081.709961
##   5   1048.579956   2341700   1081.780029

Remote data access

Extract financial data from Internet sources into a DataFrame. Different
sources offer different kinds of data. Some sources are:

   Robinhood
   IEX
   World Bank
   OECD
   Eurostat

A complete list of the sources and their usage can be found in the
pandas-datareader documentation.

Import pandas-datareader

from pandas_datareader import data

Data access: Robinhood

data.DataReader("stock symbol", "source", "start", "end"):
get financial data for a stock over a given time period.

Robinhood get data

ford = data.DataReader("F", "robinhood", "1/1/2017", "1/31/2018")
print(ford.head()[["close_price", "volume"]])

Data access: Robinhood

Robinhood handle data

print(ford.index)

## MultiIndex(levels=[[F], [2017-01-02 00:00:00, 2017-01-03...
## names=[Symbol, Date])

print(ford.loc["F", "1/26/2018"])

##   close_price     11.063900
##   high_price      11.111400
##   interpolated        False
##   low_price       10.921500
##   open_price      11.007000
##   session               reg
##   volume           52496001
##   Name: (F, 2018-01-26 00:00:00), dtype: object

DataFrame index

The index of the DataFrame differs from source to source. Always check
DataFrame.index!

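A two-level row index like the one above can be explored offline. This is a minimal sketch with made-up prices, not downloaded data: the frame ford is built by hand to mimic the (Symbol, Date) index, and all values are hypothetical.

```python
import pandas as pd

# Hypothetical prices with a (Symbol, Date) index, built by hand
dates = pd.to_datetime(["2018-01-25", "2018-01-26"])
index = pd.MultiIndex.from_product([["F"], dates], names=["Symbol", "Date"])
ford = pd.DataFrame(
    {"close_price": [11.0, 11.1], "volume": [100, 200]}, index=index
)

# Always look at the index first: here it has two named levels
print(ford.index.names)

# One row is addressed with a (symbol, date) tuple
row = ford.loc[("F", pd.Timestamp("2018-01-26"))]

# xs() drops the symbol level and leaves a plain date index
by_date = ford.xs("F", level="Symbol")
print(by_date.loc[pd.Timestamp("2018-01-26"), "close_price"])
```

After xs() the frame is indexed by date only, which matches the layout of sources that return a single-level date index.
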
Data access: IEX

IEX

sap = data.DataReader("SAP", "iex", "1/1/2017", "1/31/2018")
print(sap[25:27])

##                  open      high       low     close   volume
## date
## 2017-02-08   89.5382    90.0263   89.4405   89.6065   653804
## 2017-02-09   89.7139    89.9738   89.5284   89.5284   548787

print(sap.loc["2017-02-08"])

##   open          89.5382
##   high          90.0263
##   low           89.4405
##   close         89.6065
##   volume    653804.0000
##   Name: 2017-02-08, dtype: float64

Data access: Eurostat

Eurostat

population = data.DataReader("tps00001", "eurostat", "1/1/2007",
                             "1/1/2018")
print(population.columns)

## MultiIndex(levels=[[Population on 1 January - total], [Albania,
## Andorra, Armenia, Austria, Azerbaijan, Belarus, Belgium, ...

print(population["Population on 1 January - total", "France"][0:5])

##    FREQ                   Annual
##    TIME_PERIOD
##    2007-01-01         63645065.0
##    2008-01-01         64007193.0
##    2009-01-01         64350226.0
##    2010-01-01         64658856.0
##    2011-01-01         64978721.0

   Eurostat Database

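Here the two levels sit on the columns rather than on the rows. The sketch below rebuilds that layout by hand, with hypothetical numbers and only two countries, to show how a single column is selected and how the constant outer level can be dropped.

```python
import pandas as pd

# Hypothetical figures with a two-level column index, built by hand
columns = pd.MultiIndex.from_product(
    [["Population on 1 January - total"], ["France", "Germany"]]
)
index = pd.to_datetime(["2007-01-01", "2008-01-01"])
population = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]], index=index, columns=columns
)

# A tuple addresses one column across both levels
france = population[("Population on 1 January - total", "France")]

# Dropping the constant outer level leaves plain country names
flat = population.droplevel(0, axis=1)
print(flat.columns.tolist())
```
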
Read data from HTML

Website used for the example: Econometrics

Beautiful Soup

from bs4 import BeautifulSoup
import requests
url = "www.uni-goettingen.de/de/applied-econometrics/412565.html"
r = requests.get("https://" + url)
d = r.text
soup = BeautifulSoup(d, "lxml")
print(soup.title)

## <title>Applied Econometrics - Georg-August-... ...</title>

Reading data from HTML in detail is beyond the scope of this course.
If you are interested in this way of importing data, you can find detailed
information in the Beautiful Soup documentation.

Motivation

Bollinger

import matplotlib.pyplot as plt  # introduced in Section 4.1

sap = data.DataReader("SAP", "iex", "1/1/2017", "8/31/2018")
sap.index = pd.to_datetime(sap.index)
# 20-day rolling mean and standard deviation of the closing price
boll = sap["close"].rolling(window=20, center=False).mean()
std = sap["close"].rolling(window=20, center=False).std()
# Bands two standard deviations above and below the rolling mean
upp = boll + std * 2
low = boll - std * 2
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
boll.plot(ax=ax, label="20 days Rolling mean")
upp.plot(ax=ax, label="Upper Band")
low.plot(ax=ax, label="Lower Band")
sap["close"].plot(ax=ax, label="SAP Price")
ax.legend(loc="best")
fig.savefig("out/boll.pdf")

Motivation

[Figure: SAP price with its 20-day rolling mean, upper band and lower band; X-axis: date]

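The band calculation itself does not depend on a data download. As a minimal sketch, the following reproduces it on a synthetic random walk; the series close, the seed and the dates are assumptions made for illustration and are not SAP prices.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices (a random walk), standing in for downloaded data
rng = np.random.default_rng(0)
dates = pd.date_range("2017-01-02", periods=250, freq="B")
close = pd.Series(100 + rng.standard_normal(250).cumsum(), index=dates)

window = 20
mid = close.rolling(window=window).mean()   # rolling mean
std = close.rolling(window=window).std()    # rolling standard deviation
upper = mid + 2 * std
lower = mid - 2 * std

# The first window - 1 values are NaN because the window is not yet full
print(mid.isna().sum())

# Share of days on which the price stays inside the bands
inside = ((close >= lower) & (close <= upper))[window - 1:]
print(inside.mean())
```

The leading NaN values explain why the bands in such a plot start later than the price line.
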
Chapter 4

Visual illustrations

4.1   Matplotlib

Section 4.1

Visual illustrations

matplotlib

   Image plot, Contour plot, Scatter plot, Polar plot, Line plot, 3-D plot,
   Variety of hardcopy formats,
   Works in Python scripts, the Python and IPython shell and the Jupyter notebook,
   Interactive environments.

matplotlib

Usage of matplotlib

matplotlib has a vast number of functions and options, which are hard
to remember. But for almost every task there is an example you can
take code from. A great source of information is the examples gallery
on the matplotlib homepage. Also note the Best practice Quick

Simple plot

plt.plot(array): plot the values of a list or array. By default, the
X-axis runs over the indices 0, 1, ..., n-1.

Import matplotlib and simple example

import matplotlib.pyplot as plt
import numpy as np

[Figure: simple line plot; both axes run from 0 to 8]

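The default X-values can be read back from the line that plt.plot() returns. A minimal sketch follows; the array values and the choice of the non-interactive Agg backend are assumptions made so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

values = np.array([3, 1, 4, 1, 5])
(line,) = plt.plot(values)  # only Y-values are given

# matplotlib fills in the X-values 0, 1, ..., n-1
print(line.get_xdata())
print(line.get_ydata())
```
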
Section 4.2

Visual illustrations

Figures

Plots in matplotlib reside in a Figure object:

plt.figure(figsize): create a new Figure object; figsize is one of several options.
plt.gcf(): return a reference to the active figure.

Create Figures

fig = plt.figure(figsize=(16, 8))
print(plt.gcf())

## Figure(1600x800)

   A Figure object can be considered as an empty window,
   The Figure object has a number of options, such as the size or the aspect ratio,

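The relation between plt.figure() and plt.gcf() can be checked directly. This is a small sketch under the assumption that no other figure is open; the variable names are chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(16, 8))    # width and height in inches
same = plt.gcf() is fig              # the new figure is the active one
size = fig.get_size_inches()

second = plt.figure(figsize=(4, 3))  # creating a figure makes it active
active_is_second = plt.gcf() is second

plt.close("all")                     # release the figures when done
```

The pixel size shown when a Figure is printed is the size in inches multiplied by the figure's dpi setting.
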
Saving plots to file

plt.savefig("filename"): save the active figure to a file.

Available file formats include:

   Filename extension    Description
   .png                  Portable Network Graphics
   .pdf                  Portable Document Format
   .svg                  Scalable Vector Graphics
   .jpeg                 JPEG File Interchange Format
   .jpg                  JPEG File Interchange Format
   .ps                   PostScript
   .raw                  Raw image format

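The file format is normally inferred from the filename extension. The sketch below writes into in-memory buffers instead of files, which is an assumption made to keep the example self-contained; in that case the format has to be passed explicitly.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3])

# With a buffer there is no extension, so format= is given explicitly
png = io.BytesIO()
fig.savefig(png, format="png", dpi=150)

pdf = io.BytesIO()
fig.savefig(pdf, format="pdf")

# Formats offered by the active backend
supported = fig.canvas.get_supported_filetypes()
print(sorted(supported))
```

With a real path, fig.savefig("out/plot.png") would pick the PNG writer from the extension alone.
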
Subplots

fig.add_subplot(): add a subplot to the Figure fig.
Example: fig.add_subplot(2, 2, 1) lays out a 2 x 2 grid of subplots and
adds (and selects) the first one.

Adding subplots

ax1 = fig.add_subplot(2, 2, 1)
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)
ax4 = fig.add_subplot(2, 2, 4)
fig.savefig("out/subplots.pdf")

   The Figure object is filled with subplots in which the plots reside,
   When plt.plot() is called without a subplot having been created in advance, matplotlib creates a Figure object and a subplot automatically,
   The Figure object and its subplots can be created in one line.

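Counting the subplots of a figure makes the behaviour of add_subplot() visible: each call adds exactly one subplot, and the grid arguments only describe where it goes. A minimal sketch, with variable names chosen for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig = plt.figure()
n_before = len(fig.axes)          # an empty figure has no subplots

ax1 = fig.add_subplot(2, 2, 1)    # top-left cell of a 2 x 2 grid
n_after_one = len(fig.axes)       # only the requested cell exists

ax4 = fig.add_subplot(2, 2, 4)    # bottom-right cell
n_after_two = len(fig.axes)

# plt.plot() without preparation creates a figure and a subplot itself
plt.close("all")
plt.plot([1, 2, 3])
auto_axes = len(plt.gcf().axes)
```
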
Subplots

[Figure: the four empty subplots created above]

Subplots

Filling subplots with content

from numpy.random import randn
ax1.plot([5, 7, 4, 3, 1])
ax2.hist(randn(100), bins=20, color="r")
ax3.scatter(np.arange(30), np.arange(30)*randn(30))
ax4.plot(randn(40), "k--")
fig.savefig("out/content.pdf")

   The subplots in one Figure object can be filled with different plot types,
   Using only plt.plot(), matplotlib draws the plot in the last Figure object and last subplot selected.

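Which subplot plt.plot() draws into can be verified by counting the lines in each subplot. The following sketch assumes a fresh figure with two subplots; plt.sca() is used to change the selected subplot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig = plt.figure()
ax1 = fig.add_subplot(1, 2, 1)
ax2 = fig.add_subplot(1, 2, 2)

plt.plot([1, 2, 3])       # lands in ax2, the subplot added last
lines_after_first = (len(ax1.lines), len(ax2.lines))

plt.sca(ax1)              # make ax1 the current subplot
plt.plot([3, 2, 1])       # now lands in ax1
lines_after_second = (len(ax1.lines), len(ax2.lines))

# Calling the method on the subplot itself leaves no ambiguity
ax2.plot([2, 2, 2])
```

Calling plot() on the subplot object, as the slides do, is usually easier to follow than relying on the current selection.
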
Subplots

[Figure: the four subplots filled with a line plot, a histogram, a scatter plot and a dashed line plot]

Standard creation of plots

plt.subplots(nrows, ncols, sharex, sharey): create a figure and its
subplots in one line. If sharex or sharey is True, all subplots share
the same X- or Y-ticks.

Standard creation

fig, axes = plt.subplots(2, 3, figsize=(16, 8), sharey=True)
axes[1, 1].plot(np.arange(7), color="r")
axes[0, 2].plot(np.arange(10, 0, -1))
fig.savefig("out/standard.pdf")

[Figure: 2 x 3 grid of subplots with a shared Y-axis; two of the subplots contain line plots]

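The effect of sharey can be inspected through the Y-limits of the subplots. This is a minimal sketch; the squeeze=False call at the end is an addition not used on the slide and is shown only because it keeps the returned array two-dimensional.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(2, 3, sharey=True)
print(axes.shape)   # axes is a 2-D NumPy array of subplots

axes[0, 2].plot(np.arange(10, 0, -1))   # values from 10 down to 1

# With sharey=True the Y-limits of every subplot follow the plotted one
limits = [ax.get_ylim() for ax in axes.flat]

# squeeze=False keeps the array 2-D even for a single row or column
fig2, axes2 = plt.subplots(1, 3, squeeze=False)
```
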
Section 4.3

Visual illustrations

Plot types

ax.scatter(x, y): create a scatter plot of x vs y.
ax.hist(x, bins): create a histogram.
ax.fill_between(x, y, a): plot y against x and fill the area between y and a.

Types

fig, ax = plt.subplots(1, 3, figsize=(16, 8))
ax[0].hist([1, 2, 3, 4, 5, 4, 3, 2, 3, 4, 2, 3, 4, 4], bins=5,
           color="yellow")
x = np.arange(0, 10, 0.1)
y = np.sin(x)
ax[1].fill_between(x, y, 0, color="green")
ax[2].scatter(x, y)
fig.savefig("out/types.pdf")

A vast number of plot types can be found in the examples gallery.

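The plotting methods also return objects that can be inspected. As a sketch building on the same data, the following reads the bar heights back from hist() and uses two optional arguments that the slide does not use, where= for fill_between() and c= for scatter().

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(1, 3)

data = [1, 2, 3, 4, 5, 4, 3, 2, 3, 4, 2, 3, 4, 4]
# hist() returns the bar heights, the bin edges and the drawn bars
counts, edges, _ = ax[0].hist(data, bins=5)

x = np.arange(0, 10, 0.1)
y = np.sin(x)
# where= restricts the filled area, here to the positive half-waves
ax[1].fill_between(x, y, 0, where=(y > 0), color="green")

# c= colors every point by a value, here its height
ax[2].scatter(x, y, c=y)

print(counts)
```
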
Plot types

[Figure: histogram, filled sine curve and scatter plot of the sine curve, side by side]

Adjusting the spacing around subplots

plt.subplots_adjust(left, bottom, ..., hspace): set the space
between the subplots. wspace and hspace set the width and height of the
spacing between subplots, expressed as a fraction of the average subplot
width and height, respectively.

Adjust spacing

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
for i in range(2):
    for j in range(2):
        axes[i][j].plot(randn(10))
plt.subplots_adjust(wspace=0, hspace=0)
fig.savefig("out/spacing.pdf")

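The spacing settings are stored on the figure and can be read back, as can the resulting subplot positions. A minimal sketch, using the method fig.subplots_adjust() on the figure instead of the plt function:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)

default_gap = (fig.subplotpars.wspace, fig.subplotpars.hspace)

fig.subplots_adjust(wspace=0, hspace=0)   # remove the gaps
touching = (fig.subplotpars.wspace, fig.subplotpars.hspace)

# With no gap, the top row ends exactly where the bottom row begins
top = axes[0, 0].get_position()
bottom = axes[1, 0].get_position()
print(top.y0, bottom.y1)
```
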
Adjusting the spacing around subplots

[Figure: 2 x 2 grid of line plots with shared axes and no spacing between the subplots]

Colors, markers and line styles

ax.plot(data, linestyle, color, marker): set data and styles
of subplot ax.

Styles

fig, ax = plt.subplots(1, figsize=(15, 6))
ax.plot(randn(10), linestyle="--", color="darkcyan", marker="p")

[Figure: dashed dark-cyan line plot with pentagon markers]

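Styles can be given as keywords or packed into a short format string such as the "k--" used earlier. The sketch below sets both forms next to each other and reads the resulting styles back from the line objects; the data is an assumed simple range.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
data = np.arange(10)

# Keyword form: every style is named explicitly
(explicit,) = ax.plot(data, linestyle="--", color="darkcyan", marker="p")

# Format-string form: "k--" means black ("k") and dashed ("--")
(short,) = ax.plot(data + 1, "k--")

# Format string with a marker: green, circles, solid line
(with_marker,) = ax.plot(data + 2, "go-")

print(short.get_color(), short.get_linestyle())
```

The keyword form is longer but accepts any named color, whereas the format string only knows the single-letter color codes.
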
Plot colors

Plot line styles

Plot markers

   Marker   Description
   "."      point
   ","      pixel
   "o"      circle
   "v"      triangle_down

© 2018 PyEcon.org
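A possible way to compare the markers listed above is to draw one row of points per marker. This is only a sketch: the layout, the marker size and the label texts are choices made here, not part of the slides.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
import numpy as np

# The four markers from the table above.
markers = {".": "point", ",": "pixel", "o": "circle", "v": "triangle_down"}

fig, ax = plt.subplots(1, figsize=(8, 4))
for row, (marker, name) in enumerate(markers.items()):
    # One horizontal row of five points per marker; linestyle="" hides the line.
    ax.plot(np.arange(5), np.full(5, row), marker=marker, linestyle="",
            markersize=12, label=f'"{marker}" {name}')
ax.legend(loc="best")
```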
                          Ticks and labels                                                     216
                          ax.set_xticks(): set the list of X-ticks; analogous for the Y-axis
                          (ax.set_yticks()).
                          ax.set_xlabel(): set the X-label.
                          ax.set_title(): set the subplot title.

                          Ticks and labels - default
                          fig, ax = plt.subplots(1, figsize=(15, 10))
                          ax.plot(randn(1000).cumsum())
                          fig.savefig("out/withoutlabls.pdf")

                              Here a Figure object and a subplot were created and filled with a
                              plot.
                              By default, matplotlib distributes the ticks evenly along the
                              data range. Individual ticks can be set with ax.set_xticks().
                              By default, there is no axis label or title.
                          Ticks and labels                              217
                          [Figure: cumulative sum of 1000 random values with default ticks (X-axis 0 to 1000 in steps of 200), no axis labels and no title.]
                          Ticks and labels                                                      218
                          Set ticks and labels
                          ax.set_xticks([0, 250, 500, 750, 1000])
                          ax.set_xlabel("Days", fontsize=20)
                          ax.set_ylabel("Change", fontsize=20)
                          ax.set_title("Simulation", fontsize=30)
                          fig.savefig("out/labels.pdf")
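Put together, the default plot and the tick and label calls might look like the following sketch. The import of randn and the Agg backend are assumptions; the figure is not saved here so that the example does not depend on an out/ directory.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
from numpy.random import randn  # assumption: origin of randn in the slides

fig, ax = plt.subplots(1, figsize=(15, 10))
ax.plot(randn(1000).cumsum())  # random walk with 1000 steps

# Replace the default ticks and add labels and a title.
ax.set_xticks([0, 250, 500, 750, 1000])
ax.set_xlabel("Days", fontsize=20)
ax.set_ylabel("Change", fontsize=20)
ax.set_title("Simulation", fontsize=30)

# The getters return what was just set.
print(ax.get_xticks(), ax.get_xlabel(), ax.get_ylabel(), ax.get_title())
```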
                          Ticks and labels                                      219
                          [Figure: the same plot titled "Simulation", with X-label "Days" and X-ticks at 0, 250, 500, 750 and 1000, and Y-label "Change".]
                          Legends                                                             220
                          When several plots share one subplot, a legend is needed to tell
                          them apart.
                          ax.legend(loc): show the legend at location loc.
                          Some options: "best", "upper right", "center left", ...

                          Set legend
                          fig = plt.figure(figsize=(15, 10))
                          ax = fig.add_subplot(1, 1, 1)
                          ax.plot(randn(1000).cumsum(), label="first")
                          ax.plot(randn(1000).cumsum(), label="second")
                          ax.plot(randn(1000).cumsum(), label="third")
                          ax.legend(loc="best", fontsize=20)
                          fig.savefig("out/legend.pdf")

                              The legend displays the label and the color of the associated plot.
                              With the option "best", the legend is placed where it
                              interferes least with the plots.
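The legend example can be written as a loop over the labels. The sketch below assumes the randn import and the Agg backend; it also reads the legend entries back to show that they come from the label arguments.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
from numpy.random import randn  # assumption: origin of randn in the slides

fig = plt.figure(figsize=(15, 10))
ax = fig.add_subplot(1, 1, 1)

# One random walk per label; the label is what the legend will show.
for name in ["first", "second", "third"]:
    ax.plot(randn(1000).cumsum(), label=name)

legend = ax.legend(loc="best", fontsize=20)
labels = [text.get_text() for text in legend.get_texts()]
print(labels)
```

Plots without a label argument do not get a legend entry.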
                          Legends                                         221
                          [Figure: three cumulative random walks with a legend listing first, second and third.]
                          Annotations on a subplot                                       222
                          ax.text(x, y, "text", fontsize): insert text into a subplot.
                          ax.annotate("text", xy, xytext, arrowprops): insert an arrow with
                          an annotation.

                          Annotations
                          ax.text(400, -30, "here", fontsize=50)
                          ax.annotate("there",
                                      fontsize=40,
                                      xy=(0, 0),
                                      xytext=(400, 8),
                                      arrowprops=dict(facecolor="black",
                                                      shrink=0.05))
                          ax.set_yticks([-40, -30, -20, -10, 0, 10, 20, 30, 40])
                          fig.savefig("out/arrow.pdf")
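A self-contained version of the annotation calls could look as follows. In ax.annotate, xy is the point the arrow tip points at and xytext is where the text is placed. The data, the randn import and the Agg backend are assumptions of this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
from numpy.random import randn  # assumption: origin of randn in the slides

fig, ax = plt.subplots(1, figsize=(15, 10))
ax.plot(randn(1000).cumsum())

# Plain text at data coordinates (400, -30).
text = ax.text(400, -30, "here", fontsize=50)

# Text at (400, 8) with an arrow whose tip is at the data point (0, 0).
arrow = ax.annotate("there",
                    fontsize=40,
                    xy=(0, 0),
                    xytext=(400, 8),
                    arrowprops=dict(facecolor="black", shrink=0.05))
ax.set_yticks([-40, -30, -20, -10, 0, 10, 20, 30, 40])
```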
                          Annotations                                        223
                          [Figure: the legend plot with the text "here", an annotation "there" with an arrow, and Y-ticks from -40 to 40.]
                          Annotations                                                        224
                          Annotation Lehman
                          import pandas as pd
                          from datetime import datetime
                          date = datetime(2008, 9, 15)
                          fig = plt.figure(figsize=(16, 8))
                          ax = fig.add_subplot(1, 1, 1)
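The rest of this example is not available here. As a rough sketch of the idea, a date can be annotated on a time series as shown below. The price series is a synthetic random walk and purely hypothetical; the names days, prices, level and note are introduced only for this illustration and are not taken from the slides.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from datetime import datetime

# Hypothetical stand-in for a price series: synthetic random walk on business days.
days = pd.date_range("2008-01-01", "2008-12-31", freq="B")
prices = pd.Series(100 + np.random.randn(len(days)).cumsum(), index=days)

date = datetime(2008, 9, 15)
level = prices.asof(date)  # last available value on or before the date

fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
ax.plot(prices.index, prices.values)

# Arrow tip at the series value on that date, text slightly above it.
note = ax.annotate("Lehman Bankruptcy",
                   xy=(date, level),
                   xytext=(date, level + 5),
                   arrowprops=dict(facecolor="black", shrink=0.05))
```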
                          Annotations                                                           225
                          [Figure: chart annotated with "Lehman Bankruptcy".]
                          Drawing on a subplot                                                      226
                          plt.Rectangle((x, y), width, height, angle): create a rectangle.
                          plt.Circle((x, y), radius): create a circle.

                          Drawing
                          fig = plt.figure(figsize=(6, 6))
                          ax = fig.add_subplot(1, 1, 1)
                          ax.set_xticks([0, 1, 2, 3, 4, 5])
                          ax.set_yticks([0, 1, 2, 3, 4, 5])
                          rectangle = plt.Rectangle((1.5, 1),
                                                    width=0.8, height=2,
                                                    color="red", angle=30)
                          circ = plt.Circle((3, 3),
                                            radius=1, color="blue")
                          ax.add_patch(rectangle)
                          ax.add_patch(circ)
                          fig.savefig("out/draw.pdf")

                          A list of all available patches can be found here:   matplotlib-patches
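Other patches are added in the same way. The sketch below uses Ellipse and Polygon from matplotlib.patches as two further examples; the shapes, positions and colors are arbitrary choices made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse, Polygon

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(1, 1, 1)
ax.set_xticks([0, 1, 2, 3, 4, 5])
ax.set_yticks([0, 1, 2, 3, 4, 5])

# Ellipse centred at (2, 3.5), rotated by 30 degrees.
ellipse = Ellipse((2, 3.5), width=2, height=1, angle=30, color="green")
# Triangle defined by its three corner points.
triangle = Polygon([(3, 1), (4.5, 1), (3.75, 2.5)], closed=True, color="orange")

# A patch only becomes visible once it is added to a subplot.
ax.add_patch(ellipse)
ax.add_patch(triangle)
```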
                          Drawing on a subplot                   227
                          [Figure: a red rectangle rotated by 30 degrees and a blue circle on axes from 0 to 5.]
                          Best practice: Visual illustrations                               228
                          Step 1
                          Create a Figure object and subplots

                          Best practice Step 1
                          fig, ax = plt.subplots(1, 1, figsize=(16, 8))

                          Best practice Step 2
                          x = np.arange(0, 10, 0.1)
                          y = np.sin(x)
                          ax.scatter(x, y)
                          Best practice: Visual illustrations            229
                          [Figure: scatter plot of sin(x) for x from 0 to 10.]
                          Best practice: Visual illustrations            230
                          Step 3
                          Set colors, markers and line styles

                          Best practice Step 3
                          ax.scatter(x, y, color="green", marker="s")
                          Best practice: Visual illustrations                    231
                          [Figure: scatter plot titled "Sine wave" with X-label "x-value".]
                          Best practice: Visual illustrations                           232
                          Step 5
                          Set labels

                          [Figure: plot titled "Sine wave" with X-label "x-value" and a legend listing Linear and Sine.]
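The steps that are shown above can be combined into one short script. This is a sketch under assumptions: the title "Sine wave", the X-label "x-value" and the legend entry "Sine" are read off the figures, the code for those calls is not shown on the slides, and the second ("Linear") series from the last figure is left out because its data is not given.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
import numpy as np

# Create a Figure object and a subplot.
fig, ax = plt.subplots(1, 1, figsize=(16, 8))

# Plot the data and set color and marker in the same call.
x = np.arange(0, 10, 0.1)
y = np.sin(x)
ax.scatter(x, y, color="green", marker="s", label="Sine")

# Title, axis label and legend (texts taken from the figures).
ax.set_title("Sine wave")
ax.set_xlabel("x-value")
legend = ax.legend(loc="best")
```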
                          Section 4.4              234
                          Visual illustrations
                          Line plots                                                        235
                          DataFrame/Series.plot(): plot a DataFrame or a Series.

                          Simple line plot
                          plt.close("all")
                          p = pd.Series(np.random.rand(10).cumsum(),
                                        index=np.arange(0, 1000, 100))
                          print(p)

                          ## 0      0.888442
                          ## 100    1.549929
                          ## 200    2.258732
                          ## 300    2.485168
                          ## 400    3.156098
                          ## 500    3.373227
                          ## 600    4.102376
                          ## 700    4.307634
                          ## 800    5.019096
                          ## 900    5.687669
                          ## dtype: float64

                          p.plot()
                          plt.savefig("out/line.pdf")
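Series.plot() returns the matplotlib Axes object it drew on, so the Axes methods from the previous section can still be applied afterwards. A minimal sketch (the X-label text and the Agg backend are assumptions made here):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

plt.close("all")
p = pd.Series(np.random.rand(10).cumsum(),
              index=np.arange(0, 1000, 100))

# plot() draws the values against the index and returns the Axes.
ax = p.plot()
ax.set_xlabel("index")  # regular Axes methods still work

line = ax.lines[0]
print(len(line.get_ydata()))
```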
                          Line plots                                   236
                          [Figure: line plot of the Series p against its index.]
                          Line plots                                                       237
                          Line plots
                          df = pd.DataFrame(np.random.randn(10, 3), index=np.arange(10),
                                            columns=["a", "b", "c"])
                          print(df)

                          ##           a         b         c
                          ## 0  0.362041  0.350474 -1.992641
                          ## 1 -0.481396  1.250534 -0.017076
                          ## 2 -1.007017 -0.843875 -1.163215
                          ## 3 -0.043806  0.896435  0.279640
                          ## 4 -0.011092 -0.714289  0.762072
                          ## 5 -1.758891  1.332606 -0.931393
                          ## 6 -0.361416 -1.811150 -0.677346
                          ## 7  0.503350 -0.806999  0.129074
                          ## 8 -0.100652 -0.958269 -1.053158
                          ## 9 -1.747851 -0.064166  0.267087

                          df.plot(figsize=(15, 12))
                          plt.savefig("out/line2.pdf")
                          Line plots                          238
                          [Figure: line plot of the columns a, b and c with an automatic legend.]
                          Plotting and pandas                                                 239
                          The plot method applied to a DataFrame plots each column as a
                          different line and shows the legend automatically. When plotting
                          DataFrames, there are several arguments to change the style of the plot:

                              Argument      Description
                              kind          "line", "bar", etc.
                              logy          Logarithmic scale on the Y-axis
                              use_index     If True, use the index for tick labels
                              rot           Rotation of tick labels
                              xticks        Values for X-ticks
                              yticks        Values for Y-ticks
                              grid          Show the grid: True or False
                              xlim          X-axis limits
                              ylim          Y-axis limits
                              subplots      Plot each DataFrame column in a new subplot
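A few of these arguments can be combined in one call, as in the following sketch. The chosen values (bar plot, 45 degree rotation, Y-limits of -3 and 3) are arbitrary and only serve as an illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 3), index=np.arange(10),
                  columns=["a", "b", "c"])

# Bar plot with grid, rotated tick labels and fixed Y-axis limits.
ax = df.plot(kind="bar", grid=True, rot=45, ylim=(-3, 3), figsize=(15, 10))

print(ax.get_ylim())
print(ax.get_xticklabels()[0].get_rotation())
```

With kind="bar", every value becomes one bar, so the 10 x 3 DataFrame produces 30 bars.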
                          Pandas plot                                                  240
                          Separated line plots
                          df.plot(grid=True, rot=45, subplots=True, title="Example",
                                  figsize=(15, 10))
                          plt.savefig("out/pandas.pdf")

                          [Figure: "Example", three subplots below each other, one for each of the columns a, b and c.]
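With subplots=True, the plot method returns an array with one Axes object per column, which can then be styled individually. A short sketch of this (the Y-label texts are an addition made here for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 3), index=np.arange(10),
                  columns=["a", "b", "c"])

# One subplot per column; the return value is an array of Axes.
axes = df.plot(grid=True, rot=45, subplots=True, title="Example",
               figsize=(15, 10))

# Each Axes can be modified on its own.
for ax, column in zip(axes, df.columns):
    ax.set_ylabel(f"column {column}")
```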
                          Standard creation of plots and pandas                       241
                          DataFrame.plot(ax=subplot): plot a DataFrame into an existing
                          subplot.

                          Standard creation
                          fig = plt.figure(figsize=(6, 6))
                          ax = fig.add_subplot(1, 1, 1)
                          guests = np.array([[1334, 456], [1243, 597], [1477, 505],
                          Standard creation of plots and pandas                             242
                          Bar plot
                          canteen.plot(ax=ax, kind="bar")
                          ax.set_ylabel("guests", fontsize=20)
                          ax.set_title("Canteen use in Göttingen", fontsize=20)
                          fig.savefig("out/canteen.pdf")
                          Bar plot                                                              243
                          [Figure: bar chart "Canteen use in Göttingen", Y-axis "guests", bars for Mon to Sat, legend: Zentral, Turm.]
                          Bar plot                                                244
                          Bar plot - stacked
                          # New figure and subplot: plotting into the previous ax again
                          # would draw on top of the first bar plot.
                          fig = plt.figure(figsize=(6, 6))
                          ax = fig.add_subplot(1, 1, 1)
                          canteen.plot(ax=ax, kind="bar", stacked=True)
                          ax.set_ylabel("guests", fontsize=20)
                          ax.set_title("Canteen use in Göttingen", fontsize=20)
                          fig.savefig("out/canteenstacked.pdf")
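Since the construction of the canteen DataFrame is not fully shown, the following sketch rebuilds a reduced version of it. It uses only the three rows of guests that are visible on the slides. Which weekday each row belongs to and which column is "Zentral" or "Turm" is an assumption based on the figure labels.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Only the rows visible on the slides; the remaining days are not shown there.
guests = np.array([[1334, 456], [1243, 597], [1477, 505]])
# Assumption: rows are Mon to Wed, columns are the canteens from the legend.
canteen = pd.DataFrame(guests, index=["Mon", "Tue", "Wed"],
                       columns=["Zentral", "Turm"])

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(1, 1, 1)

# Plot into the existing subplot; stacked=True piles the columns on top of each other.
canteen.plot(ax=ax, kind="bar", stacked=True)
ax.set_ylabel("guests", fontsize=20)
ax.set_title("Canteen use in Göttingen", fontsize=20)
```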
                          Bar plot                                                              245
                          [Figure: stacked bar chart "Canteen use in Göttingen", Y-axis "guests", bars for Mon to Sat, legend: Zentral, Turm.]
                          Plot financial data                                                246
                          BTC chart
                          fig = plt.figure(figsize=(16, 8))
                          ax = fig.add_subplot(1, 1, 1)
                          ax.set_ylabel("price", fontsize=20)
                          ax.set_xlabel("Date", fontsize=20)
                          BTC = pd.read_csv("data/btc-eur.csv", index_col=0, parse_dates=True)
                          BTCclose = BTC["Close"]
                          BTCclose.plot(ax=ax)
                          ax.set_title("BTC-EUR", fontsize=20)
                          fig.savefig("out/btc.pdf")
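The effect of index_col=0 and parse_dates=True can be checked on a small in-memory CSV. The three rows below are hypothetical values invented for this sketch, not data from data/btc-eur.csv, and the column layout (Date, Open, Close) is likewise an assumption.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display required
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample in the assumed layout of the CSV file.
csv_text = """Date,Open,Close
2018-01-01,100.0,101.5
2018-01-02,101.5,99.0
2018-01-03,99.0,102.25
"""

# First column becomes the index and is parsed into dates.
btc = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)
close = btc["Close"]

fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
close.plot(ax=ax)  # the DatetimeIndex is used for the X-axis
ax.set_title("BTC-EUR", fontsize=20)

print(type(btc.index).__name__)
```

Because the index holds dates rather than strings, pandas can format the X-axis as a time axis.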
                          Plot financial data                                                                              247
                          [Figure: "BTC-EUR" closing price over time (2012 to 2019), Y-axis "price", X-axis "Date".]
                          Plot financial data                                                 248
                          Compare - bad illustration
                          amazon = pd.read_csv("data/amzn.csv", index_col=0,
                                               parse_dates=True)["Close"]
                          siemens = pd.read_csv("data/sie.de.csv", index_col=0,
                                                parse_dates=True)["Close"]
                          fig = plt.figure(figsize=(16, 8))
                          ax = fig.add_subplot(1, 1, 1)
                          ax.set_ylabel("price")
                          amazon.plot(ax=ax, label="Amazon")
                          siemens.plot(ax=ax, label="Siemens")
                          ax.legend(loc="best")
                          fig.savefig("out/compare.pdf")

                              In this illustration, the trends of the two stocks can hardly be
                              compared because their price levels differ too much.
                              Using pandas, each series can be standardized in one line.
                          Plot financial data                                                                                         249
                          [Figure: Amazon and Siemens closing prices on a common price axis (2017-03 to 2018-03), X-axis "Date", legend: Amazon, Siemens.]
                          Plot financial data                     250
                          Compare - good illustration

                          # Rescale each series so that its first observation equals 100
                          amazon = amazon / amazon.iloc[0] * 100
                          siemens = siemens / siemens.iloc[0] * 100
                          fig = plt.figure(figsize=(16, 8))
                          ax = fig.add_subplot(1, 1, 1)
                          ax.set_ylabel("percentage")
                          amazon.plot(ax=ax, label="Amazon")
                          siemens.plot(ax=ax, label="Siemens")
                          ax.legend(loc="best")
                          fig.savefig("out/comparenew.pdf")
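The rescaling step can be tried on a small example. This is only a sketch: the two price series and the names `expensive` and `cheap` are made-up assumptions, not the course data. Dividing by the first observation and multiplying by 100 puts both series on a common scale.

```python
import pandas as pd

# Hypothetical prices on very different levels (illustrative values only)
dates = pd.date_range("2018-01-01", periods=4, freq="D")
expensive = pd.Series([1000.0, 1100.0, 1050.0, 1200.0], index=dates)
cheap = pd.Series([50.0, 52.0, 49.0, 55.0], index=dates)

# Rescale so that the first observation of each series equals 100
expensive_idx = expensive / expensive.iloc[0] * 100
cheap_idx = cheap / cheap.iloc[0] * 100

# Last value of each rescaled series, in percent of the starting value
print(expensive_idx.iloc[-1])
print(cheap_idx.iloc[-1])
```

After rescaling, both series start at 100, so their later values can be read as percentages of the starting price.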
                          Plot financial data                                                                                             251
                          [Figure: Amazon and Siemens rescaled to 100 at the first observation, plotted against Date, 2017-03 to 2018-03; y-axis: percentage.]
                          Chapter 5                    252
                          Applications

                          5.1 Time series
                          Section 5.1     253
                          Applications
                          Date and time data types                                      254
                          Data types for date and time are included in the Python standard
                          library.

                          Datetime creation

                          from datetime import datetime
                          now = datetime.now()
                          print(now)

                          ## 2018-10-08 22:53:27.197198

                          print(now.day)

                          ## 8

                          print(now.hour)

                          ## 22

                          From a datetime object you can get the attributes year, month, day,
                          hour, minute, second and microsecond.
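A short sketch of the attribute access described above. It uses a fixed timestamp (the value printed on the slide) instead of datetime.now() so that the result is reproducible; the variable names `stamp` and `parts` are illustrative.

```python
from datetime import datetime

# Fixed timestamp instead of datetime.now(), so the result is reproducible
stamp = datetime(2018, 10, 8, 22, 53, 27, 197198)

# Each component is available as a read-only attribute
parts = (stamp.year, stamp.month, stamp.day,
         stamp.hour, stamp.minute, stamp.second, stamp.microsecond)
print(parts)

# weekday() counts from Monday = 0 to Sunday = 6
print(stamp.weekday())
```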
                          Set datetime                                                     255
                          datetime(year, month, day, hour, minute, second): create a datetime
                          object for a given date and time.
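The code of this slide is not preserved here. A minimal sketch, assuming an exam date of 2018-11-09 10:00 (an assumption, chosen because it is consistent with the time difference printed on the following slide):

```python
from datetime import datetime

# Assumed exam date; the arguments are year, month, day, hour, minute, second
exam = datetime(2018, 11, 9, 10, 0, 0)
print(exam)

# Time components that are left out default to zero
midnight = datetime(2018, 11, 9)
print(midnight.hour, midnight.minute, midnight.second)
```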
                          Time difference                                                       256
                          timedelta(days, seconds): represent the difference between two
                          datetime objects.

                          Datetime difference

                          from datetime import timedelta
                          delta = exam - now
                          print(delta)

                          ## 31 days, 11:06:32.802802

                          print("The exam will take place in " + str(delta.days) + " days.")
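A small sketch of timedelta arithmetic with fixed dates. The dates and the names `start`, `end` and `one_week_later` are illustrative assumptions.

```python
from datetime import datetime, timedelta

start = datetime(2018, 10, 8, 22, 0, 0)
end = datetime(2018, 11, 9, 10, 0, 0)

# Subtracting two datetime objects yields a timedelta
delta = end - start
print(delta.days, delta.seconds)   # whole days and the remaining seconds
print(delta.total_seconds())       # complete difference in seconds

# A timedelta can also be added to a datetime
one_week_later = start + timedelta(days=7)
print(one_week_later)
```

Note that delta.seconds only holds the part of the difference that is shorter than one day; total_seconds() returns the full duration.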
                          Convert string and datetime                                   257
                          datetime.strftime("format"): convert a datetime object into a string.

                          datetime.strptime(datestring, "format"): convert a date given as a
                          string into a datetime object.

                          Convert Datetime

                          stamp = datetime(2018, 4, 12)
                          print(stamp)

                          ## 2018-04-12 00:00:00

                          print("German date format: " + stamp.strftime("%d.%m.%Y"))

                          ## German date format: 12.04.2018

                          val = "2018-5-5"
                          d = datetime.strptime(val, "%Y-%m-%d")
                          print(d)

                          ## 2018-05-05 00:00:00
                          Convert string and datetime                                         258
                          Converting examples

                          val = "31.01.2012"
                          d = datetime.strptime(val, "%d.%m.%Y")
                          print(d)

                          ## 2012-01-31 00:00:00

                          print(now.strftime("Today is %A and we are in week %W of the year %Y."))

                          ## Today is Monday and we are in week 41 of the year 2018.

                          print(now.strftime("%c"))

                          ## Mon 08 Oct 2018 10:53:27 PM
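The two conversions are inverse to each other when the same format is used. A minimal sketch of such a round trip, and of what happens when a string does not match the format (the format string `fmt` is an illustrative choice):

```python
from datetime import datetime

fmt = "%d.%m.%Y %H:%M"

# String -> datetime -> string gives back the original text
text = "31.01.2012 14:30"
parsed = datetime.strptime(text, fmt)
assert parsed.strftime(fmt) == text

# A string that does not match the format raises a ValueError
try:
    datetime.strptime("2012-01-31", fmt)
except ValueError as err:
    print("Could not parse:", err)
```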
                          Overview: datetime formats                          259
                          Type   Description

                          %Y     4-digit year
                          %m     2-digit month [01, 12]
                          %d     2-digit day [01, 31]
                          %H     Hour (24-hour clock) [00, 23]
                          %I     Hour (12-hour clock) [01, 12]
                          %M     2-digit minute [00, 59]
                          %S     Second [00, 61]
                          %W     Week number of the year [00, 53]
                          %F     Shortcut for %Y-%m-%d
                          Overview: datetime formats                           260

                          Type   Description

                          %a     Abbreviated weekday name
                          %A     Full weekday name
                          %b     Abbreviated month name
                          %B     Full month name
                          %c     Full date and time
                          %x     Locale-appropriate formatted date
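A few of the format codes from the overview, applied to a fixed timestamp. This is a sketch only; the results of the name codes (%a, %A, %b, %B) depend on the locale and are therefore printed without an expected value.

```python
from datetime import datetime

stamp = datetime(2018, 10, 8, 22, 53, 27)

# Numeric codes do not depend on the locale
print(stamp.strftime("%Y-%m-%d"))   # 2018-10-08
print(stamp.strftime("%H:%M:%S"))   # 22:53:27
print(stamp.strftime("%I"))         # 10 (12-hour clock)
print(stamp.strftime("%W"))         # 41

# Name codes follow the current locale
print(stamp.strftime("%a %A %b %B"))
```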
                          Generating date ranges with pandas                                261
                          pd.date_range(start, end, freq): generate a date range.

                          Date ranges

                          import pandas as pd
                          index = pd.date_range("2018-01-01", now)
                          print(index[0:2])
                          print(index[15:16])
                          index = pd.date_range("2018-01-01", now, freq="M")
                          print(index[0:2])

                          ## DatetimeIndex(['2018-01-01', '2...ype='datetime64[ns]', freq='D')
                          ## DatetimeIndex(['2018-01-16'], dtype='datetime64[ns]', freq='D')
                          ## DatetimeIndex(['2018-01-31', '2...ype='datetime64[ns]', freq='M')
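A sketch of further ways to call pd.date_range with fixed dates, so that the results do not depend on the current time. The variable names are illustrative; the periods argument is an alternative to an end date.

```python
import pandas as pd

# Start and end date; the default frequency is daily
days = pd.date_range("2018-01-01", "2018-01-31")
print(len(days))

# Start date and number of periods instead of an end date
first_week = pd.date_range("2018-01-01", periods=7)
print(first_week[-1])

# Business days only (Monday to Friday, holidays are not considered)
bdays = pd.date_range("2018-01-01", "2018-01-14", freq="B")
print(len(bdays))
```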
                          Overview: time series frequencies                     262
                          Alias                   Offset type

                          D                       Day
                          B                       Business day
                          H                       Hour
                          T                       Minute
                          S                       Second
                          M                       Month end
                          BM                      Business month end
                          Q-JAN, Q-FEB, ...       Quarter end
                          A-JAN, A-FEB, ...       Year end
                          AS-JAN, AS-FEB, ...     Year begin
                          BA-JAN, BA-FEB, ...     Business year end
                          BAS-JAN, BAS-FEB, ...   Business year begin

                          Since pandas 2.2 several of these aliases are deprecated in favour of
                          new names, for example h (hour), min (minute), s (second), ME (month
                          end) and BME (business month end).
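Each alias stands for an offset class in pd.offsets. As a sketch, the offset objects can be passed to pd.date_range directly instead of the alias strings; the date ranges below are illustrative assumptions.

```python
import pandas as pd

# Offset objects used in place of the alias strings
month_ends = pd.date_range("2018-01-01", "2018-06-30",
                           freq=pd.offsets.MonthEnd())
bmonth_ends = pd.date_range("2018-01-01", "2018-06-30",
                            freq=pd.offsets.BMonthEnd())
hours = pd.date_range("2018-01-01", periods=4, freq=pd.offsets.Hour())

print(month_ends[0])     # first calendar month end in the range
print(bmonth_ends[-1])   # last business month end in the range
print(hours[-1])         # fourth hourly timestamp
```

30 June 2018 is a Saturday, so the last business month end in the range falls on 29 June.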
                          Resample date ranges                                                263
                          DataFrame.resample("frequency"): resample the frequency of a time
                          series.

                          print(df.head())

                          ##                0
                          ##   2016-01-01   0
                          ##   2016-01-02   1
                          ##   2016-01-03   2
                          ##   2016-01-04   3
                          ##   2016-01-05   4

                          print(df.resample("3BM").sum().head())

                          ##                    0
                          ##   2016-01-29     406
                          ##   2016-04-29    6734
                          ##   2016-07-29   15015
                          ##   2016-10-31   24205
                          ##   2017-01-31   32246
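The construction of df is not shown on the slide. The following sketch assumes daily data counting up from 0 and starting on 2016-01-01, which is consistent with the printed output. It passes the offset object pd.offsets.BMonthEnd(3) rather than the alias string "3BM"; that substitution is an assumption made here so that the example does not depend on the alias spelling of a particular pandas version.

```python
import pandas as pd

# Assumed construction of df: daily data counting up from 0
index = pd.date_range("2016-01-01", periods=500, freq="D")
df = pd.DataFrame(range(500), index=index)

# Sum over bins that end on every third business month end
quarterly = df.resample(pd.offsets.BMonthEnd(3)).sum()
print(quarterly.head())

# Downsample to calendar month ends and take the mean instead
monthly_mean = df.resample(pd.offsets.MonthEnd()).mean()
print(monthly_mean.head())
```

The first bin ends on 2016-01-29 and contains the values 0 to 28, whose sum is 406.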
                          Section 5.2       264
                          Applications
                          Moving window functions                                              265
                          DataFrame.rolling(window): conduct rolling window computations.

                          Rolling mean

                          import matplotlib.pyplot as plt
                          amazon = pd.read_csv("data/amzn.csv", index_col=0,
                                               parse_dates=True)["Adj Close"]
                          fig = plt.figure(figsize=(16, 8))
                          ax = fig.add_subplot(1, 1, 1)
                          ax.set_ylabel("price")
                          amazon.plot(ax=ax, label="Amazon")
                          amazon.rolling(window=20).mean().plot(ax=ax, label="Rolling mean")
                          ax.legend(loc="best")
                          ax.set_title("Amazon price and rolling mean", fontsize=25)
                          fig.savefig("out/amzn.pdf")

                          Rolling functions include mean(), median(), sum(), var(), std(),
                          min() and max().
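To see what a rolling window computes, a small sketch with a made-up series (the values are illustrative) helps. It also shows the min_periods argument, which is not used on the slide.

```python
import pandas as pd

prices = pd.Series([10.0, 12.0, 11.0, 13.0, 15.0])

# Mean over the current and the two preceding observations;
# the first window - 1 results are NaN because the window is not yet full
roll = prices.rolling(window=3).mean()
print(roll.tolist())

# min_periods allows results from partially filled windows
roll_partial = prices.rolling(window=3, min_periods=1).mean()
print(roll_partial.tolist())
```

This explains why the rolling mean line in a plot starts later than the price line: with window=20 the first 19 values are missing.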
                          Moving window functions                                                                                            266
                          [Figure: "Amazon price and rolling mean"; lines: Amazon, Rolling mean; x-axis: Date, 2017-03 to 2018-03; y-axis: price, 900 to 1500.]
                          Moving window functions                               267
                          Standard deviation

                          fig = plt.figure(figsize=(16, 8))
                          ax = fig.add_subplot(1, 1, 1)
                          pfizer = pd.read_csv("data/pfe.csv", index_col=0,
                                               parse_dates=True)["Adj Close"]
                          pg = pd.read_csv("data/pg.csv", index_col=0,
                                           parse_dates=True)["Adj Close"]
                          # Note: the name "all" shadows the built-in function all()
                          all = pd.DataFrame(index=amazon.index)
                          # Series are aligned on the index when assigned as columns
                          all["amazon"] = amazon
                          all["pfizer"] = pfizer
                          all["pg"] = pg
                          all_std = all.rolling(window=20).std()
                          all_std.plot(ax=ax)
                          ax.set_title("Standard deviation", fontsize=25)
                          fig.savefig("out/std.pdf")
                          Moving window functions                                                                          268
                          [Figure: "Standard deviation"; lines: amazon, pfizer, pg; x-axis: Date, 2017-05 to 2018-03; y-axis ticks from 0 to 70.]
                          Moving window functions                                       269
                          Logarithmic standard deviation

                          fig = plt.figure(figsize=(16, 8))
                          ax = fig.add_subplot(1, 1, 1)
                          all_std.plot(ax=ax, logy=True)
                          ax.set_title("Logarithmic standard deviation", fontsize=25)
                          fig.savefig("out/std_log.pdf")
                          Moving window functions                                                                                270
                          [Figure: "Logarithmic standard deviation"; lines: amazon, pfizer, pg; x-axis: Date, 2017-05 to 2018-03; logarithmic y-axis with ticks at 10^1 and 10^2.]
                          Exponentially weighted functions                                     271
                          DataFrame.ewm(span): compute exponentially weighted rolling window
                          functions.
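The code of this slide is not preserved here. As a minimal sketch under stated assumptions, the example below applies ewm to a made-up series and compares the result with the recursion written out by hand. The span translates into a smoothing factor alpha = 2 / (span + 1). The argument adjust=False is chosen here to obtain the simple recursive form; the pandas default is adjust=True, which weights the first observations differently.

```python
import pandas as pd

prices = pd.Series([1.0, 2.0, 3.0, 4.0])
span = 3
alpha = 2 / (span + 1)  # smoothing factor implied by the span

# adjust=False selects the simple recursive form
ewm_mean = prices.ewm(span=span, adjust=False).mean()

# The same recursion written out by hand:
# y[0] = x[0], y[t] = (1 - alpha) * y[t-1] + alpha * x[t]
manual = [prices.iloc[0]]
for x in prices.iloc[1:]:
    manual.append((1 - alpha) * manual[-1] + alpha * x)

print(ewm_mean.tolist())
print(manual)
```

Recent observations receive more weight than older ones, so the exponentially weighted mean reacts faster to changes than an equally weighted rolling mean.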
                          Exponentially weighted functions                                                                             272
                          [Figure: "Exponentially weighted functions"; lines: Rolling mean, Exp mean, Amazon price; x-axis: Date, 2017-03 to 2018-03; y-axis from 900 to 1500.]
                          Binary moving window functions                             273
                          DataFrame.pct_change(): get the percentage change between consecutive
                          observations (the daily change for daily data).

                          Percentage change

                          fig = plt.figure(figsize=(16, 8))
                          ax = fig.add_subplot(1, 1, 1)
                          returns = all.pct_change()
                          print(returns.head())

                          ##                 amazon    pfizer        pg
                          ##   Date
                          ##   2017-02-23       NaN       NaN       NaN
                          ##   2017-02-24 -0.008155  0.005872 -0.000878
                          ##   2017-02-27  0.004023  0.000584 -0.001757
                          ##   2017-02-28 -0.004242 -0.004668  0.001980
                          ##   2017-03-01  0.009514  0.008792  0.006479

                          returns.plot(ax=ax)
                          ax.set_title("Returns", fontsize=25)
                          fig.savefig("out/returns.pdf")
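A small sketch of what pct_change computes, using made-up prices. The equivalent formulation with shift() is added here for illustration and is not part of the slide.

```python
import pandas as pd

prices = pd.Series([100.0, 102.0, 99.96])

# Relative change with respect to the previous observation
returns = prices.pct_change()

# Equivalent formulation with shift()
manual = prices / prices.shift(1) - 1

print(returns.tolist())
print(manual.tolist())
```

The first entry is NaN because the first observation has no predecessor, which matches the first row of the output above.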
                          Binary moving window functions                                                                     274
                          [Figure: "Returns"; lines: amazon, pfizer, pg; x-axis: Date, 2017-03 to 2018-03; y-axis ticks from -0.050 to 0.125.]
                          Binary moving window functions                           275
                          DataFrame.rolling().corr(benchmark): compute the rolling correlation
                          between two time series.

                          Correlation

                          fig = plt.figure(figsize=(16, 8))
                          ax = fig.add_subplot(1, 1, 1)
                          DJI = pd.read_csv("data/dji.csv", index_col=0,
                                            parse_dates=True)["Adj Close"]
                          DJI_ret = DJI.pct_change()
                          corr = returns.rolling(window=20).corr(DJI_ret)
                          corr.plot(ax=ax)
                          ax.grid()
                          ax.set_title("20 days correlation", fontsize=25)
                          fig.savefig("out/corr.pdf")
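A sketch of the rolling correlation with made-up data whose result is known in advance: one column moves exactly with the benchmark, the other exactly against it. All names and values are illustrative assumptions.

```python
import pandas as pd

x = pd.Series([1.0, 2.0, 4.0, 7.0, 11.0, 16.0])   # plays the role of the benchmark
same_direction = 2 * x + 1       # perfectly positively related to x
opposite_direction = -x          # perfectly negatively related to x

frame = pd.DataFrame({"same": same_direction,
                      "opposite": opposite_direction})

# Correlation of every column with x over a window of three observations
corr = frame.rolling(window=3).corr(x)
print(corr)
```

The first two rows are NaN because the window is not yet full; afterwards the columns are close to 1 and -1 (up to floating point error).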
                          Binary moving window functions                                                                                      276
                          [Figure: "20 days correlation"; lines: amazon, pfizer, pg; x-axis: Date, 2017-05 to 2018-03; y-axis ticks from -0.4 to 0.8.]
                          Section 5.3                277
                          Applications
Cumulative returns

Returns
fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
# Compound the simple returns into a cumulative return index
ret_index = (1 + returns).cumprod()
stocks = ["amazon", "pfizer", "pg"]
# Set the first row to the base value 1 with a single .loc assignment;
# chained assignment such as ret_index[i][0] = 1 is not reliable in recent pandas versions
ret_index.loc[ret_index.index[0], stocks] = 1
print(ret_index.tail())

##               amazon    pfizer        pg
## Date
## 2018-02-15  1.715298  1.088693  0.932322
## 2018-02-16  1.699961  1.105461  0.934471
## 2018-02-20  1.723031  1.097840  0.920217
## 2018-02-21  1.740128  1.090218  0.907772
## 2018-02-22  1.742968  1.090218  0.914560

ret_index.plot(ax=ax)
ax.set_title("Cumulative returns", fontsize=25)
fig.savefig("out/cumret.pdf")

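As a minimal sketch of how the cumulative return index is built, the following example uses a small made-up price table. The `prices` values and the column names `a` and `b` are illustrative assumptions, not the course data. It shows that compounding simple returns reproduces each price relative to its first observation.

```python
import pandas as pd

# Hypothetical prices, chosen only for illustration.
prices = pd.DataFrame(
    {"a": [100.0, 110.0, 99.0], "b": [50.0, 50.0, 55.0]},
    index=pd.date_range("2018-01-01", periods=3, freq="D"),
)

returns = prices.pct_change()        # simple returns; the first row is NaN
ret_index = (1 + returns).cumprod()  # compound the returns over time
ret_index.iloc[0] = 1                # base value of the index

# Reference: each price divided by its first price.
relative_prices = prices / prices.iloc[0]

print(ret_index)
```

Under these assumptions, `ret_index` and `relative_prices` agree up to floating-point error, which is a quick sanity check for the index.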
Cumulative returns

[Figure: "Cumulative returns". Lines for amazon, pfizer and pg; x-axis: Date.]

Cumulative returns

Monthly returns
# Take the last index value of each business month, then compute the
# month-over-month change; pandas 2.2 and later use the alias "BME" instead of "BM"
returns_m = ret_index.resample("BM").last().pct_change()
print(returns_m.head())

##               amazon    pfizer        pg
## Date
## 2017-02-28       NaN       NaN       NaN
## 2017-03-31  0.049110  0.002638 -0.013396
## 2017-04-28  0.043371 -0.008477 -0.020604
## 2017-05-31  0.075276 -0.028124  0.008703
## 2017-06-30 -0.026764  0.028790 -0.010671

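The monthly return obtained from month-end values of the cumulative index is the same as the compounded daily returns within that month. The sketch below illustrates this with randomly generated daily returns. The data, the seed and the names `daily`, `monthly` and `check` are assumptions for illustration. It groups by calendar month via `to_period("M")` rather than `resample`, so it does not depend on a particular frequency alias.

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns on business days (illustrative assumption).
idx = pd.bdate_range("2017-01-02", "2017-03-31")
rng = np.random.default_rng(0)
daily = pd.Series(rng.normal(0.0, 0.01, len(idx)), index=idx)

# Variant 1: compound the daily returns within each calendar month.
monthly = (1 + daily).groupby(daily.index.to_period("M")).prod() - 1

# Variant 2: relative change between month-end values of the cumulative index.
ret_index = (1 + daily).cumprod()
month_end = ret_index.groupby(ret_index.index.to_period("M")).last()
check = month_end.pct_change()

print(monthly)
print(check)
```

The first entry of `check` is NaN because there is no previous month end to compare with, which matches the NaN row in the output above. From the second month onward both variants coincide.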
Volatility calculation

Volatility
fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
# Rolling 20-day standard deviation of returns, scaled to a 20-day horizon
vola = returns.rolling(window=20).std() * np.sqrt(20)
vola.plot(ax=ax)
ax.set_title("Volatility", fontsize=25)
fig.savefig("out/vola.pdf")

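To make the rolling calculation concrete, the following sketch applies the same expression to randomly generated returns. The data, the seed and the variable `manual` are illustrative assumptions. It checks that the last rolling value is simply the sample standard deviation of the last 20 observations times the square root of 20, a scaling that presumes uncorrelated returns.

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns (illustrative assumption).
rng = np.random.default_rng(1)
idx = pd.bdate_range("2017-01-02", periods=60)
returns = pd.Series(rng.normal(0.0, 0.01, len(idx)), index=idx)

window = 20
vola = returns.rolling(window=window).std() * np.sqrt(window)

# The first window - 1 entries are NaN because the window is not yet full.
n_missing = int(vola.isna().sum())
print(n_missing)  # 19

# Manual computation for the last window.
manual = returns.iloc[-window:].std() * np.sqrt(window)
print(np.isclose(vola.iloc[-1], manual))
```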
Volatility calculation

[Figure: "Volatility". Lines for amazon, pfizer and pg.]

Group analysis

DataFrame.describe(): show summary statistics for each column.

Describe
print(all.describe())

##              amazon      pfizer          pg
## count    252.000000  251.000000  252.000000
## mean    1044.521903   33.892665   87.934304
## std      158.041844    1.694680    2.728659
## min      843.200012   30.872143   79.919998
## 25%      953.567474   32.593733   86.241475
## 50%      988.680023   33.147469   87.863598
## 75%     1136.952484   35.331834   90.363035
## max     1485.339966   38.661823   92.988976

Return analysis

Histogram
fig, ax = plt.subplots(3, 1, figsize=(10, 8), sharex=True)
for i in range(3):
    ax[i].set_title(stocks[i])
    returns[stocks[i]].hist(ax=ax[i], bins=50)
fig.savefig("out/return_hist.pdf")

Return analysis

[Figure: histograms of the returns in three stacked panels titled amazon, pfizer and pg, sharing one x-axis.]

Ordinary Least Squares

Using the statsmodels module to estimate regressions:

Series.tolist(): return a list containing the Series values.
sm.OLS(Y, X).fit(): fit an OLS regression of Y (dependent variable) on X (regressors).

Regression data
import statsmodels.api as sm
fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
# Dependent variable: amazon values in the first half of January 2018
Y = np.array(amazon.loc["2018-1-1":"2018-1-15"].tolist())
# Regressor: observation index 0, 1, 2, ...
X = np.arange(len(Y))
ax.scatter(x=X, y=Y, marker="o", color="red")
fig.savefig("out/reg_data.pdf")

Ordinary Least Squares

[Figure: scatter plot of the regression data (red markers) against the observation index 0 to 8.]

Ordinary Least Squares

Regression
# Add a column of ones so that the model includes an intercept
X_reg = sm.add_constant(X)
res = sm.OLS(Y, X_reg).fit()
# params holds the intercept first, then the slope
b, a = res.params
ax.plot(X, a*X + b)
fig.savefig("out/ols.pdf")

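The same intercept and slope can be obtained without statsmodels. The sketch below swaps in `numpy.linalg.lstsq` for `sm.OLS` and uses made-up data. The `Y` values and the noise terms are illustrative assumptions. It builds the design matrix by hand to show what `sm.add_constant` contributes.

```python
import numpy as np

# Hypothetical data lying roughly on a line (illustrative assumption).
X = np.arange(9, dtype=float)
noise = np.array([1.0, -2.0, 0.5, 1.5, -1.0, 0.0, 2.0, -1.5, -0.5])
Y = 1200.0 + 10.0 * X + noise

# Design matrix with a leading column of ones for the intercept.
X_reg = np.column_stack([np.ones_like(X), X])

# Least-squares solution of X_reg @ beta ~ Y.
beta, *_ = np.linalg.lstsq(X_reg, Y, rcond=None)
b, a = beta  # intercept first, slope second

print(f"intercept: {b:.3f}, slope: {a:.3f}")
```

The point estimates should match those of an OLS fit on the same data. Standard errors, t-statistics and the other diagnostics of the regression summary are not produced by this shortcut.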
Ordinary Least Squares

Summary of the OLS regression. To print it in Python, use res.summary().

Ordinary Least Squares

[Figure: scatter plot of the regression data against the observation index 0 to 8, with the fitted regression line.]

Newton-Raphson

The Newton-Raphson method is an algorithm for finding successively
better approximations to the roots of real-valued functions.

Let $F: \mathbb{R}^k \to \mathbb{R}^k$ be a continuously differentiable function and $J_F(x_n)$
the Jacobian matrix of $F$ at $x_n$. The recursive Newton-Raphson method to
find a root of $F$ is given by:

    $x_{n+1} = x_n - J_F(x_n)^{-1} F(x_n)$

Accordingly, we can determine an optimum of a function $f$ by
applying the method to $f' = df/dx$ instead, since $f'(x) = 0$ at an interior optimum.

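For the multivariate case, a minimal sketch of the recursion might look as follows. The function name `newton_raphson_nd`, the tolerance and iteration defaults, and the two-equation test system are assumptions made for illustration. The step is computed by solving a linear system rather than by inverting the Jacobian explicitly.

```python
import numpy as np

def newton_raphson_nd(F, J, x0, tol=1e-10, max_iter=50):
    """Iterate x <- x - J(x)^{-1} F(x) until the norm of F(x) is below tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        # Solve J(x) @ step = F(x) instead of forming the inverse.
        step = np.linalg.solve(J(x), Fx)
        x = x - step
    return x

# Hypothetical test system: x^2 + y^2 = 4 and x = y.
def F(v):
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x - y])

def J(v):
    x, y = v
    return np.array([[2.0 * x, 2.0 * y], [1.0, -1.0]])

root = newton_raphson_nd(F, J, x0=[1.0, 2.0])
print(root)
```

From this starting point the iteration approaches the solution with both components equal to the square root of 2. As in the scalar case, convergence depends on the initial guess and on the Jacobian being invertible along the path.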
Newton-Raphson

As an illustrative application, we consider the function

    $f(x) = 3x^3 + 3x^2 - 5x, \quad x \in \mathbb{R},$

which is represented by the blue line in the following diagram. The
figure depicts the iterative solution path of the Newton-Raphson
method for finding the root, i.e., the x solving $f(x) = 0$, by tangent points
and tangents starting from the initial guess $x_0 = -1$.

[Figure: f(x) (blue line) with tangents and tangent points of the solution path; the iterates x0, x1, x2 and x3 are marked on the x-axis.]

Newton-Raphson implementation

The first step involves the definition of the function $f(x)$ and its
derivative $f'(x)$ in Python. We also specify a delta function that
determines the absolute deviation of the target function from the target
value, i.e., 0:

Newton-Raphson requirements
def f(x):
    return 3*x**3 + 3*x**2 - 5*x

def df(x):
    return 9*x**2 + 6*x - 5

def dx(f, x):
    # Absolute deviation of f(x) from the target value 0
    return abs(f(x))

Finally, we implement the Newton-Raphson algorithm as outlined above.
In addition, for a better understanding, we plot the solution path using
the tangent points for $x_0, x_1, \ldots, x_N$. The solution point is colored
black. The lines starting with ax.scatter() are not part of
the algorithm; they rely on the global variables ax and f and are included just for the
visual illustration.

Newton-Raphson implementation

Newton-Raphson
def newton_raphson(fun, dfun, x0, e):
    # Iterate until |fun(x0)| is within the tolerance e
    delta = dx(fun, x0)
    while delta > e:
        # Plot only: mark the current iterate on the curve of f
        ax.scatter(x0, f(x0), color="red", s=80)
        # Newton-Raphson update step
        x0 = x0 - fun(x0) / dfun(x0)
        delta = dx(fun, x0)
    # Plot only: mark the solution point in black
    ax.scatter(x0, f(x0), color="black", s=80)
    return x0

fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
x = np.arange(-1.5, 1.7, 0.001)
ax.plot(x, f(x))
ax.grid()
x_root = newton_raphson(f, df, -1, 0.1)
fig.savefig("out/newton_raphson_root.pdf")
print(f"Root at: {x_root:.4f}")

## Root at: 0.8878

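The algorithm can also be run without any plotting. The following sketch is a variant, not the implementation above: the name `newton_raphson_path`, the returned list of iterates and the `max_iter` guard against non-convergence are additions assumed here for illustration.

```python
def f(x):
    return 3*x**3 + 3*x**2 - 5*x

def df(x):
    return 9*x**2 + 6*x - 5

def newton_raphson_path(fun, dfun, x0, e, max_iter=100):
    """Return the approximate root and the list of all iterates."""
    path = [x0]
    while abs(fun(x0)) > e:
        if len(path) > max_iter:
            raise RuntimeError("no convergence within max_iter iterations")
        x0 = x0 - fun(x0) / dfun(x0)  # Newton-Raphson update step
        path.append(x0)
    return x0, path

# Same start value and tolerance as in the example above.
x_root, path = newton_raphson_path(f, df, -1, 0.1)
print(f"Root at: {x_root:.4f}")
print(f"Iterates: {len(path)}")

# A tighter tolerance moves the result closer to the exact root.
x_tight, _ = newton_raphson_path(f, df, -1, 1e-10)
print(f"Root with tight tolerance: {x_tight:.4f}")
```

With the tolerance 0.1 the loop stops as soon as |f(x)| falls below 0.1, so the reported root is only a coarse approximation. The positive root of 3x^2 + 3x - 5 = 0 is (-3 + sqrt(69))/6, roughly 0.8844, which the tight-tolerance run recovers.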
Newton-Raphson implementation

[Figure: f(x) over the range -1.5 to 1.5 with the Newton-Raphson solution path to the root; iterates in red, solution point in black.]

Newton-Raphson optimization

With the definition of the second derivative $f''$, i.e., the derivative of the
derivative, we can employ the Newton-Raphson method to obtain an
optimum of the target function $f(x)$ numerically. Hence, the previous
example needs only minimal modifications:

Newton-Raphson
def ddf(x):
    return 18*x + 6

fig = plt.figure(figsize=(16, 8))
ax = fig.add_subplot(1, 1, 1)
x = np.arange(-1.5, 1.7, 0.001)
ax.plot(x, f(x))
ax.grid()
# Find the root of the first derivative, starting from x0 = 1
x_opt = newton_raphson(df, ddf, 1, 0.1)
fig.savefig("out/newton_raphson_optimum.pdf")
print(f"Minimum at: {x_opt:.4f}")

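A root of the first derivative is only a candidate for an optimum, so it is worth checking the sign of the second derivative at the result. The sketch below does this with a plot-free helper. The name `newton`, its default tolerance and the closed-form comparison value are assumptions added for illustration.

```python
import math

def df(x):
    return 9*x**2 + 6*x - 5

def ddf(x):
    return 18*x + 6

def newton(fun, dfun, x0, e=1e-10, max_iter=100):
    """Plain Newton-Raphson iteration with an iteration cap."""
    for _ in range(max_iter):
        if abs(fun(x0)) <= e:
            break
        x0 = x0 - fun(x0) / dfun(x0)
    return x0

x_opt = newton(df, ddf, 1)

# Closed-form stationary point of f: positive solution of 9x^2 + 6x - 5 = 0.
x_exact = (-1 + math.sqrt(6)) / 3

is_minimum = ddf(x_opt) > 0  # second-order condition for a local minimum
print(f"x_opt = {x_opt:.6f}, exact = {x_exact:.6f}, minimum: {is_minimum}")
```

Because the second derivative is positive at the stationary point found from this start value, the point is a local minimum. A start value left of x = -1/3, where the second derivative changes sign, would be expected to lead to the local maximum instead.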
Newton-Raphson optimization

[Figure: f(x) over the range -1.5 to 1.5 with the Newton-Raphson solution path to the optimum; iterates in red, solution point in black.]

The End... but not finally