Ipeadatapy Doc
Release
Luan Borelli
1 About Ipeadatapy
1.1 Purpose
1.2 License
2 Installation
2.1 PyPI
2.2 Git
2.3 Dependencies
3 Getting started
3.1 Prerequisites
3.2 Usage overview
3.3 The basics
3.4 Metadata
3.5 Advanced filtering using metadata
3.6 Data Visualization
3.7 Data Extraction
4 Functions
4.1 list_series
4.2 describe
4.3 timeseries
4.4 metadata
4.5 latest_updates
4.6 sources
4.7 themes
4.8 territories
4.9 countries
4.10 api_call
ipeadatapy Documentation, Release
ipeadatapy is a Python package for manipulating, visualizing and extracting data and metadata through the official
API of the Ipeadata database. In essence, it is an API wrapper.
CHAPTER
ONE
ABOUT IPEADATAPY
1.1 Purpose
The main purpose of the Ipeadatapy package is to provide a way of extracting data from Ipeadata through Python using
Ipeadata’s API. In this sense, Ipeadatapy is an API wrapper. The goal of the package, however, goes beyond
extracting data: Ipeadatapy also treats and cleans the data provided by the API, makes it easier to understand, and
provides filtering and search mechanisms. In short, Ipeadatapy aims to help users search and analyze time series
data and metadata from the Ipeadata database using Python.
1.2 License
CHAPTER
TWO
INSTALLATION
2.1 PyPI
2.2 Git
2.3 Dependencies
• pandas
• requests
CHAPTER
THREE
GETTING STARTED
3.1 Prerequisites
The only technical prerequisites are the dependencies and, of course, Python itself. The only knowledge
prerequisite is a basic understanding of the Python language. If you have never worked with Python before, it is
highly recommended that you read Python’s official Beginner’s Guide to Python before starting with this package.
Although ipeadatapy runs in any Python environment, the package is all about data, so a notebook-style interactive
interpreter gives a better experience. The most convenient option is Jupyter Notebook.
Jupyter Notebook is a web application for creating Jupyter notebooks. A Jupyter notebook is a JSON document
containing an ordered list of input/output cells which can contain code, text, mathematics, plots and rich media.
Jupyter notebooks can be converted to a number of open standard output formats (HTML, HTML presentation slides,
LaTeX, PDF, reStructuredText, Markdown, Python) through ‘Download As’ in the web interface or jupyter nbconvert
in a shell.
If you are looking for series on a specific subject, you can restrict the function output to series whose names
contain a given keyword. For example, let’s keep only the time series containing the word ‘BPM6’ in their names.
This works as a search mechanism.
>>> ipeadatapy.list_series('BPM6')
CODE NAME
6721 BPAG_AR Ativos de reserva (Nova metod. - BPM6)
6722 BPAG_BC Balanca comercial - Saldo (Nova metod. - BPM6)
6723 BPAG_BCM Balanca comercial - Importacoes (Nova metod. -...
6724 BPAG_BCX Balanca comercial - Exportacoes (Nova metod. -...
6725 BPAG_CF Conta Financeira - Saldo (Captacoes - Concesso...
6726 BPAG_CK Conta capital - Saldo (Nova metod. - BPM6)
6727 BPAG_CKD Conta capital - Desp. (Nova metod. - BPM6)
6728 BPAG_CKR Conta capital - Rec. (Nova metod. - BPM6)
6729 BPAG_CRCO Outros invest. - Créd. comerciais e adiantamen...
6730 BPAG_CRCOA Outros invest. - Créd. comerciais e adiantamen...
... ... ...
You can use any other keyword you want. Note that the keyword match is case sensitive.
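Because the match is case sensitive, a search for 'bpm6' would miss the series listed above. The following is a minimal sketch of a case-insensitive search done locally. It assumes that list_series() called without a keyword returns every series with the CODE and NAME columns shown in the output above; filter_by_keyword and search_series are hypothetical helpers, not part of ipeadatapy.

```python
def filter_by_keyword(rows, keyword):
    """Keep (code, name) pairs whose name contains keyword, ignoring case."""
    needle = keyword.casefold()
    return [(code, name) for code, name in rows if needle in name.casefold()]


def search_series(keyword):
    """Hypothetical helper: case-insensitive search over all Ipeadata series."""
    import ipeadatapy  # imported lazily; the call below needs network access

    table = ipeadatapy.list_series()  # assumed to return every series
    return filter_by_keyword(zip(table["CODE"], table["NAME"]), keyword)


# Two names copied from the output above, so the filter can be tried offline.
sample = [
    ("BPAG_AR", "Ativos de reserva (Nova metod. - BPM6)"),
    ("BPAG_CK", "Conta capital - Saldo (Nova metod. - BPM6)"),
]
matches = filter_by_keyword(sample, "bpm6")
```

Filtering locally means downloading the full list once, which is slower than passing the keyword to list_series() but does not depend on how the keyword is capitalized.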
Let’s suppose that we have found the time series we want and that its code is BPAG_AR. We can use a handy function
called describe() to confirm the details of this time series.
>>> ipeadatapy.describe('BPAG_AR')
Ativos de reserva (Nova metod. - BPM6)
Name Ativos de reserva (Nova metod. - BPM6)
Code BPAG_AR
Big Theme Macroeconomico
Theme Balanco de pagamentos
Source Banco Central do Brasil, Balanco de Pagamentos...
Source acronym Bacen/BP (BPM6)
Comment Metodologia do Manual de Balanco de Pagamentos...
Last update 2019-03-14T13:48:00.803-03:00
Frequency Anual
Measure US$
Unit milhoes
Status A
As you can see, this function returns some details about the specified time series: its name, its code, the big theme
and theme to which it belongs, its source, the source acronym, the comment, the date and time of its last update,
its frequency, measure, unit and status. This makes the function a good way to get an overview of a specific time
series.
If this information is not enough, you can check a more complete metadata data frame for the series by running:
>>> ipeadatapy.metadata('BPAG_AR')
BIG THEME FNTEXTURL FNTID SOURCE SOURCE ACRONYM ... SERTEMMUN THEME CODE TEMCODIGOPAI THEME MEASURE
Now that you are sure about your selected time series, you might be wondering how to observe what really matters:
the data. For this purpose, use the function timeseries():
>>> ipeadatapy.timeseries('BPAG_AR')
YEAR DAY MONTH CODE DATE VALUE (US$)
0 1995 1 1 BPAG_AR 1995-01-01T00:00:00-02:00 12918.900000
1 1996 1 1 BPAG_AR 1996-01-01T00:00:00-02:00 8666.100000
2 1997 1 1 BPAG_AR 1997-01-01T00:00:00-02:00 -7907.159127
3 1998 1 1 BPAG_AR 1998-01-01T00:00:00-02:00 -7970.207388
4 1999 1 1 BPAG_AR 1999-01-01T00:00:00-02:00 -7822.039996
5 2000 1 1 BPAG_AR 2000-01-01T00:00:00-02:00 -2261.654351
6 2001 1 1 BPAG_AR 2001-01-01T00:00:00-02:00 3306.600484
7 2002 1 1 BPAG_AR 2002-01-01T00:00:00-02:00 302.087225
8 2003 1 1 BPAG_AR 2003-01-01T00:00:00-02:00 8495.650494
9 2004 1 1 BPAG_AR 2004-01-01T00:00:00-02:00 2244.029835
10 2005 1 1 BPAG_AR 2005-01-01T00:00:00-02:00 4319.463872
11 2006 1 1 BPAG_AR 2006-01-01T00:00:00-02:00 30569.117416
12 2007 1 1 BPAG_AR 2007-01-01T00:00:00-02:00 87484.245682
13 2008 1 1 BPAG_AR 2008-01-01T00:00:00-02:00 2969.072068
14 2009 1 1 BPAG_AR 2009-01-01T00:00:00-02:00 46650.987800
15 2010 1 1 BPAG_AR 2010-01-01T00:00:00-02:00 49100.503587
16 2011 1 1 BPAG_AR 2011-01-01T00:00:00-02:00 58636.807211
17 2012 1 1 BPAG_AR 2012-01-01T00:00:00-02:00 18899.552358
18 2013 1 1 BPAG_AR 2013-01-01T00:00:00-02:00 -5926.487151
19 2014 1 1 BPAG_AR 2014-01-01T00:00:00-02:00 10832.657276
20 2015 1 1 BPAG_AR 2015-01-01T00:00:00-02:00 1568.772099
21 2016 1 1 BPAG_AR 2016-01-01T00:00:00-02:00 9237.436064
22 2017 1 1 BPAG_AR 2017-01-01T00:00:00-02:00 5092.868662
23 2018 1 1 BPAG_AR 2018-01-01T00:00:00-02:00 2927.674626
24 2019 1 1 BPAG_AR 2019-01-01T00:00:00-02:00 813.549854
If you just want the data for a specific year you can use the parameter year:
>>> ipeadatapy.timeseries("GM366_ERC366", year=2019)
YEAR DAY MONTH CODE DATE VALUE (R$)
12078 2019 2 1 GM366_ERC366 2019-01-02T00:00:00-02:00 3.8589
12079 2019 3 1 GM366_ERC366 2019-01-03T00:00:00-02:00 3.7677
12080 2019 4 1 GM366_ERC366 2019-01-04T00:00:00-02:00 3.7621
12081 2019 7 1 GM366_ERC366 2019-01-07T00:00:00-02:00 3.7056
12082 2019 8 1 GM366_ERC366 2019-01-08T00:00:00-02:00 3.7202
12083 2019 9 1 GM366_ERC366 2019-01-09T00:00:00-02:00 3.6925
12084 2019 10 1 GM366_ERC366 2019-01-10T00:00:00-02:00 3.6863
12085 2019 11 1 GM366_ERC366 2019-01-11T00:00:00-02:00 3.7135
12086 2019 14 1 GM366_ERC366 2019-01-14T00:00:00-02:00 3.7255
12087 2019 15 1 GM366_ERC366 2019-01-15T00:00:00-02:00 3.7043
If you just want the data for a specific month of a specific year, use both the parameters year and month:
>>> ipeadatapy.timeseries("GM366_ERC366", year=2019, month=4)
YEAR DAY MONTH CODE DATE VALUE (R$)
12139 2019 1 4 GM366_ERC366 2019-04-01T00:00:00-03:00 3.8676
12140 2019 2 4 GM366_ERC366 2019-04-02T00:00:00-03:00 3.8655
12141 2019 3 4 GM366_ERC366 2019-04-03T00:00:00-03:00 3.8430
12142 2019 4 4 GM366_ERC366 2019-04-04T00:00:00-03:00 3.8707
12143 2019 5 4 GM366_ERC366 2019-04-05T00:00:00-03:00 3.8616
12144 2019 8 4 GM366_ERC366 2019-04-08T00:00:00-03:00 3.8652
12145 2019 9 4 GM366_ERC366 2019-04-09T00:00:00-03:00 3.8557
12146 2019 10 4 GM366_ERC366 2019-04-10T00:00:00-03:00 3.8339
... ... ... ... ... ... ...
Similarly, if you just want the data for a specific day of a specific month of a specific year, use the parameters
year, month and day together:
>>> ipeadatapy.timeseries("GM366_ERC366", year=2019, month=4, day=1)
YEAR DAY MONTH CODE DATE VALUE (R$)
12139 2019 1 4 GM366_ERC366 2019-04-01T00:00:00-03:00 3.8676
Another option is to return only data relative to years greater than some year, say, 2017. For this, use the parameter
yearGreaterThan:
>>> ipeadatapy.timeseries("GM366_ERC366", yearGreaterThan=2017)
YEAR DAY MONTH CODE DATE VALUE (R$)
11828 2018 2 1 GM366_ERC366 2018-01-02T00:00:00-02:00 3.2691
11829 2018 3 1 GM366_ERC366 2018-01-03T00:00:00-02:00 3.2529
11830 2018 4 1 GM366_ERC366 2018-01-04T00:00:00-02:00 3.2312
11831 2018 5 1 GM366_ERC366 2018-01-05T00:00:00-02:00 3.2403
11832 2018 8 1 GM366_ERC366 2018-01-08T00:00:00-02:00 3.2351
11833 2018 9 1 GM366_ERC366 2018-01-09T00:00:00-02:00 3.2391
11834 2018 10 1 GM366_ERC366 2018-01-10T00:00:00-02:00 3.2461
11835 2018 11 1 GM366_ERC366 2018-01-11T00:00:00-02:00 3.2295
11836 2018 12 1 GM366_ERC366 2018-01-12T00:00:00-02:00 3.2192
11837 2018 15 1 GM366_ERC366 2018-01-15T00:00:00-02:00 3.1957
... ... ... ... ... ... ...
[340 rows x 6 columns]
You can also select an interval of years, say, from 2017 to 2018, by combining yearGreaterThan with the parameter
yearSmallerThan. Both bounds are exclusive, so the call below passes 2016 and 2019:
>>> ipeadatapy.timeseries("GM366_ERC366", yearGreaterThan=2016, yearSmallerThan=2019)
YEAR DAY MONTH CODE DATE VALUE (R$)
11579 2017 2 1 GM366_ERC366 2017-01-02T00:00:00-02:00 3.2723
11580 2017 3 1 GM366_ERC366 2017-01-03T00:00:00-02:00 3.2626
11581 2017 4 1 GM366_ERC366 2017-01-04T00:00:00-02:00 3.2327
11582 2017 5 1 GM366_ERC366 2017-01-05T00:00:00-02:00 3.2123
11583 2017 6 1 GM366_ERC366 2017-01-06T00:00:00-02:00 3.2051
11584 2017 9 1 GM366_ERC366 2017-01-09T00:00:00-02:00 3.2091
11585 2017 10 1 GM366_ERC366 2017-01-10T00:00:00-02:00 3.1912
11586 2017 11 1 GM366_ERC366 2017-01-11T00:00:00-02:00 3.2148
11587 2017 12 1 GM366_ERC366 2017-01-12T00:00:00-02:00 3.1655
11588 2017 13 1 GM366_ERC366 2017-01-13T00:00:00-02:00 3.2028
11589 2017 16 1 GM366_ERC366 2017-01-16T00:00:00-02:00 3.2228
The same logic applies to the parameters monthGreaterThan and monthSmallerThan. For example, you can restrict the
function output to an interval of months (e.g. from June to December) of a specific year, say, 2018.
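A minimal sketch of such a call, assuming the month bounds are exclusive in the same way as the year bounds in the examples above. inclusive_month_bounds and fetch_months are hypothetical helpers, not part of ipeadatapy; they only translate an inclusive month range into the two strict parameters.

```python
def inclusive_month_bounds(first_month, last_month):
    """Turn an inclusive month range into the two strict bound parameters."""
    if not 1 <= first_month <= last_month <= 12:
        raise ValueError("expected 1 <= first_month <= last_month <= 12")
    return {
        "monthGreaterThan": first_month - 1,  # strict lower bound
        "monthSmallerThan": last_month + 1,   # strict upper bound
    }


def fetch_months(code, year, first_month, last_month):
    """Hypothetical helper: data for an inclusive month range of one year."""
    import ipeadatapy  # imported lazily; the call below needs network access

    bounds = inclusive_month_bounds(first_month, last_month)
    return ipeadatapy.timeseries(code, year=year, **bounds)


# June to December: the keyword arguments that would be passed to timeseries().
june_to_december = inclusive_month_bounds(6, 12)
```

With these helpers, fetch_months("GM366_ERC366", 2018, 6, 12) would request June to December 2018.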
From now on, use your creativity. There are a lot of possibilities with these parameter combinations. The available
parameters for the function timeseries() can be found using the function help(): help(ipeadatapy.timeseries).
3.4 Metadata
Every Ipeadata time series is accompanied by a set of metadata, that is, data about data. Examples of these metadata
are country, big theme, theme, source and unit of measure. Some specific kinds of metadata have their own function
in the Ipeadata API. Let’s see some of them:
3.4.1 Countries
You can have a look at the countries available in Ipeadata by running the countries() function:
>>> ipeadatapy.countries()
ID COUNTRY
0 ZAF África do Sul
1 DEU Alemanha
2 LATI América Latina
3 AGO Angola
4 SAU Arábia Saudita
5 DZA Argélia
6 ARG Argentina
7 AUS Austrália
8 AUT Áustria
9 BEL Bélgica
10 BOL Bolívia
.. ... ...
3.4.2 Themes
You can also have a look at the themes available in Ipeadata using the function themes():
>>> ipeadatapy.themes()
ID NAME MACRO REGIONAL SOCIAL
0 28 Agropecuária NaN 1.0 NaN
1 23 Assistência social NaN NaN 1.0
2 25 Avaliação do governo NaN NaN NaN
3 10 Balanço de pagamentos 1.0 NaN NaN
4 7 Câmbio 1.0 NaN NaN
5 5 Comércio exterior 1.0 1.0 NaN
6 2 Consumo e vendas 1.0 1.0 NaN
7 8 Contas nacionais 1.0 NaN NaN
8 81 Contas Regionais NaN 1.0 NaN
9 24 Correção monetária 1.0 NaN NaN
10 37 Demografia NaN NaN 1.0
.. .. ... ... ... ...
Let’s suppose you want to know which Ipeadata themes are related to the Macroeconomics big theme. The parameter
macro solves this problem:
>>> ipeadatapy.themes(macro=1)
ID NAME MACRO REGIONAL SOCIAL
3 10 Balanço de pagamentos 1.0 NaN NaN
4 7 Câmbio 1.0 NaN NaN
5 5 Comércio exterior 1.0 1.0 NaN
6 2 Consumo e vendas 1.0 1.0 NaN
7 8 Contas nacionais 1.0 NaN NaN
9 24 Correção monetária 1.0 NaN NaN
.. .. ... ... ... ...
Let’s now suppose that you want the function to return only themes that are related to both the macroeconomics and
regional big themes. For this, use the macro and regional parameters together.
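A minimal sketch of the combined filter. The call themes(macro=1, regional=1) is an assumption made by analogy with the macro=1 example above. To show the intended result without network access, the same condition is also applied locally to four rows copied from the themes() output; macro_and_regional and fetch_macro_regional_themes are hypothetical helpers.

```python
# (ID, NAME, MACRO, REGIONAL, SOCIAL) rows copied from the themes() output
# above; None stands in for NaN.
SAMPLE_THEMES = [
    (2, "Consumo e vendas", 1.0, 1.0, None),
    (8, "Contas nacionais", 1.0, None, None),
    (81, "Contas Regionais", None, 1.0, None),
    (37, "Demografia", None, None, 1.0),
]


def macro_and_regional(rows):
    """Names of themes flagged as both macroeconomic and regional."""
    return [
        name
        for _theme_id, name, macro, regional, _social in rows
        if macro == 1 and regional == 1
    ]


def fetch_macro_regional_themes():
    """Hypothetical helper: ask ipeadatapy for the same selection."""
    import ipeadatapy  # imported lazily; the call below needs network access

    # Assumption: regional accepts 1 in the same way macro does.
    return ipeadatapy.themes(macro=1, regional=1)


local_result = macro_and_regional(SAMPLE_THEMES)
```

On these four sample rows only "Consumo e vendas" carries both flags; the full table contains other matches.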
The parameter social is also available and works in the same way as macro and regional. For more parameters
available for the function themes(), run help(ipeadatapy.themes).
3.4.3 Sources
Another important piece of metadata is the source, which has its own function, sources(). Let’s have a look:
>>> ipeadatapy.sources()
0 Abia
1 Abinee
2 ABPO
3 Abracal
4 Abras
5 ACSP/IEGV
6 Anac
7 Anatel
8 Anbima
9 Anbima
10 Anda
.. ...
3.4.4 Territories
For regional time series we also have some information about Brazilian territories through the function
territories():
>>> ipeadatapy.territories()
                     NAME       ID  ...       AREA  CAPITAL
0          (não definido)           ...        NaN     None
1                  Brasil        0  ...  8531507.6    False
2            Região Norte        1  ...  3869637.9    False
3                Rondônia       11  ...   238512.8    False
4   Alta Floresta D'Oeste  1100015  ...     7111.8    False
5               Ariquemes  1100023  ...     4995.3    False
6                  Cabixi  1100031  ...     1530.7    False
7                  Cacoal  1100049  ...     3808.4    False
8              Cerejeiras  1100056  ...     2645.0    False
9       Colorado do Oeste  1100064  ...     1442.4    False
10             Corumbiara  1100072  ...     3079.7    False
...                   ...      ...  ...        ...      ...
>>> ipeadatapy.territories(areaGreaterThan=1000000)
NAME ID LEVEL AREA CAPITAL
1 Brasil 0 Brasil 8531507.6 False
2 Região Norte 1 Regiões 3869637.9 False
138 Amazonas 13 Estados 1577820.2 False
386 Pará 15 Estados 1253164.5 False
1161 Região Nordeste 2 Regiões 1558200.4 False
17960 Região Centro-oeste 5 Regiões 1612077.2 False
18452 AMC1872_1997 001 513AMC1872_1997001 AMC 1872-00 1947986.1 None
18454 AMC2097 001 51AMC2097001 AMC 20-00 1061175.7 None
Let’s now check the territories whose area is between 1000000 and 1500000:
>>> ipeadatapy.territories(areaGreaterThan=1000000, areaSmallerThan=1500000)
NAME ID LEVEL AREA CAPITAL
386 Pará 15 Estados 1253164.5 False
18454 AMC2097 001 51AMC2097001 AMC 20-00 1061175.7 None
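Both area bounds are strict. A minimal sketch that reproduces the selection locally on areas copied from the output above; area_between is a hypothetical helper (not part of ipeadatapy) that mimics the filter on plain tuples.

```python
# (NAME, AREA) pairs copied from the areaGreaterThan=1000000 output above.
LARGE_TERRITORIES = [
    ("Brasil", 8531507.6),
    ("Amazonas", 1577820.2),
    ("Pará", 1253164.5),
    ("AMC2097 001", 1061175.7),
]


def area_between(rows, lower, upper):
    """Keep rows whose area is strictly greater than lower and smaller than upper."""
    return [(name, area) for name, area in rows if lower < area < upper]


selected = area_between(LARGE_TERRITORIES, 1_000_000, 1_500_000)
```

Amazonas is dropped because its area exceeds the upper bound, which matches the two rows returned by the call above.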
Although only four kinds of metadata from Ipeadata have their own function, many more are available for the time
series in the database. The function metadata() returns all Ipeadata time series in a data frame together with all
of their metadata. Each column of the data frame represents one kind of metadata.
>>> ipeadatapy.metadata()
         BIG THEME                                             SOURCE SOURCE ACRONYM  ... SERIES STATUS THEME CODE   MEASURE
0   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
1   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
2   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
3   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1    Cabeça
4   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1    Cabeça
5   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1    Cabeça
6   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             I          1  Tonelada
7   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
8   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
9   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
10  Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
11  Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
12  Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
13  Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             I          1  Tonelada
14  Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             I          1    Cabeça
As you can see, this data frame is too big to be shown here in full: it has 8549 rows and 15 columns.
Each column represents one kind of metadata. The columns are BIG THEME, SOURCE, SOURCE ACRONYM,
SOURCE URL, UNIT, COUNTRY, FREQUENCY, LAST UPDATE, CODE, COMMENT, NAME, NUMERICA,
SERIES STATUS, THEME CODE, and MEASURE. In the next section, we will learn how to use these metadata as
filtering options to improve our search.
3.5 Advanced filtering using metadata
Now that you know some of the metadata of Ipeadata, let’s take a closer look at the function metadata(). Like
list_series(), this function returns all Ipeadata time series in a data frame. The difference is that metadata()
returns not only the time series but also their metadata. You might then ask why both functions exist, since
metadata() offers all of the list_series() information plus metadata. The answer is that list_series() is intended
as a simpler version, designed to be friendly to inexperienced users, whereas metadata() is more complete but also
harder to read because of the amount of information returned. Let’s run the function:
>>> ipeadatapy.metadata()
         BIG THEME                                             SOURCE SOURCE ACRONYM  ... SERIES STATUS THEME CODE   MEASURE
0   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
1   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
2   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
3   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1    Cabeça
4   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1    Cabeça
5   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1    Cabeça
6   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             I          1  Tonelada
7   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
8   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
9   Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
10  Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
11  Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
12  Macroeconômico  Instituto Brasileiro de Geografia e Estatístic...    IBGE/Coagro  ...             A          1  Tonelada
Why is this function so powerful and important? The first, obvious answer is that it gives you more information
about the time series. The not-so-obvious answer is that it allows you to filter the Ipeadata time series more
precisely. Let’s state an illustrative problem:
The Ipeadata API has 8565 time series in total. Suppose you are doing macroeconomic research
about the United States but, for some specific reason, you are only interested in data published
by The Economist, and the data also needs to be published quarterly. How can this problem be solved
using the ipeadatapy Python package?
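One way to approach this problem is to pass one filter per requirement to metadata(). The sketch below uses the parameter names big_theme, country, source and frequency from the metadata() reference; the values ("USA", "The Economist", "Trimestral") are assumptions about how Ipeadata labels them and should be checked with countries(), sources() and the metadata() output. metadata_filters and find_series are hypothetical helpers.

```python
def metadata_filters(**candidates):
    """Drop filters left as None so that only the chosen ones are passed on."""
    return {key: value for key, value in candidates.items() if value is not None}


def find_series(**candidates):
    """Hypothetical helper: call metadata() with the selected filters only."""
    import ipeadatapy  # imported lazily; the call below needs network access

    return ipeadatapy.metadata(**metadata_filters(**candidates))


filters = metadata_filters(
    big_theme="Macroeconômico",
    country="USA",            # assumed country ID, see countries()
    source="The Economist",   # assumed source spelling, see sources()
    frequency="Trimestral",   # assumed label for quarterly series
    status=None,              # left unset, so it is not passed on
)
```

Calling find_series(**filters) would then return only the series that satisfy all four conditions at once.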
Gotcha! Other metadata can also be used as filtering parameters. For all parameters, run help(ipeadatapy.metadata).
3.6 Data Visualization
Although data visualization is not the main purpose of the Ipeadatapy package, one of its dependencies (pandas)
can produce plots. Because ipeadatapy deals with time series, graphical representations are very important, so we
include pandas’ plot() function here in our documentation. It is good to know that time series can be plotted
directly through Python. As an example, let’s plot the data of the GM366_ERC366 time series for April 2019.
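A minimal sketch of such a plot, assuming the data frame has the DATE and VALUE (R$) columns shown in the timeseries() outputs earlier and that matplotlib is installed, since pandas needs it for plotting. value_column and plot_april_2019 are hypothetical helpers, not part of ipeadatapy.

```python
def value_column(columns):
    """Return the first column title that starts with 'VALUE'."""
    for title in columns:
        if title.startswith("VALUE"):
            return title
    raise ValueError("no VALUE column found")


def plot_april_2019(code="GM366_ERC366"):
    """Hypothetical helper: plot one month of a series with pandas."""
    import ipeadatapy  # imported lazily; the call below needs network access

    data = ipeadatapy.timeseries(code, year=2019, month=4)
    # x axis first, y axis second, both given as column titles.
    return data.plot("DATE", value_column(data.columns))


# Column titles copied from the timeseries() output shown earlier.
columns_seen_above = ["YEAR", "DAY", "MONTH", "CODE", "DATE", "VALUE (R$)"]
y_axis = value_column(columns_seen_above)
```

Looking the value column up by prefix avoids hard-coding the unit, which changes from series to series (US$ for BPAG_AR, R$ for GM366_ERC366).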
Note that the plot() function parameters are, respectively, the x and y axes of the graph. These parameters must
match the column titles of the desired data in the timeseries() data frame. To fill them in correctly, first run
the timeseries() function alone for your desired time series, check the column names, and then use these column
titles as parameters for the plot() function.
3.7 Data Extraction
In this section we will show you how to extract data from Ipeadata using the package together with one of its
dependencies (pandas) and other built-in Python features. One of the most useful aspects of an API wrapper
is the option of extracting data in a more efficient and practical way than downloading the same data from
the database’s website. With this package it is possible to extract not only data but also quantitative and
descriptive metadata from the Ipeadata database. As we will see, you can even extract large numbers of spreadsheets
at once and filter what is extracted according to your needs. Let’s start with the basics.
Let’s suppose that we already know which time series we want to extract and have already found its code using the
list_series() function. Let this time series be, for example, the one whose code is GM366_ERC366. We can then
show the data by calling the following function:
>>> ipeadatapy.timeseries('GM366_ERC366')
YEAR DAY MONTH CODE DATE VALUE (R$)
0 1985 2 1 GM366_ERC366 1985-01-02T00:00:00-02:00 1.152000e-09
1 1985 3 1 GM366_ERC366 1985-01-03T00:00:00-02:00 1.152000e-09
2 1985 4 1 GM366_ERC366 1985-01-04T00:00:00-02:00 1.152000e-09
>>> ipeadatapy.timeseries('GM366_ERC366').to_excel('.../path/yourFileName.xlsx')
If you prefer, you can extract the data in csv format instead of xlsx by using the function to_csv() in the same
way. In the path passed to these functions, ‘.../path/’ represents the directory where the file will be placed and
‘yourFile.csv’ the name of the file. If you set only the file name, without a directory path, the file will be
saved in the directory where you are running Python. Be careful not to omit the filename extension, ‘.csv’ or
‘.xlsx’.
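A minimal sketch of a csv export that guards against a missing extension. output_file and export_csv are hypothetical helpers, and the directory name "exports" is only a placeholder; the export itself relies on pandas’ to_csv(), which accepts a path object.

```python
from pathlib import Path


def output_file(directory, file_name, extension=".csv"):
    """Build the target path, appending the extension when the name lacks it."""
    path = Path(directory) / file_name
    if path.suffix != extension:
        path = path.with_name(path.name + extension)
    return path


def export_csv(code, directory, file_name):
    """Hypothetical helper: save one time series as a csv file."""
    import ipeadatapy  # imported lazily; the call below needs network access

    target = output_file(directory, file_name)
    ipeadatapy.timeseries(code).to_csv(target)
    return target


# "exports" is a placeholder directory; the extension is added automatically.
example = output_file("exports", "yourFile")
```

For instance, export_csv("GM366_ERC366", "exports", "yourFile") would write exports/yourFile.csv, provided the directory already exists.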
Let’s suppose that we want to extract more than one time series at once. We can do it by defining a list containing
the codes of the time series that we want to extract and then running a ‘for’ loop over this list, extracting
a file for each of the time series contained there. For illustration purposes, let’s suppose that we want to
extract these three time series: GM366_ERC366, IBMEC12_TJTIT12 and PIBE. We first define a list containing them
and then loop over it.
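The list-and-loop extraction described above can be sketched as follows. SERIES_CODES, csv_name and export_all are hypothetical names chosen for this illustration; each file is named after its series code and written with pandas’ to_csv().

```python
SERIES_CODES = ["GM366_ERC366", "IBMEC12_TJTIT12", "PIBE"]


def csv_name(code):
    """File name used for a given series code."""
    return f"{code}.csv"


def export_all(codes=SERIES_CODES):
    """Save one csv file per series code in the current directory."""
    import ipeadatapy  # imported lazily; the calls below need network access

    for code in codes:
        ipeadatapy.timeseries(code).to_csv(csv_name(code))


# File names that export_all() would create, computed without any download.
planned_files = [csv_name(code) for code in SERIES_CODES]
```

Calling export_all() performs the three downloads and writes the files.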
The output will be three ‘.csv’ files: GM366_ERC366.csv, IBMEC12_TJTIT12.csv and PIBE.csv, saved in the directory
where you ran Python.
Ipeadatapy’s functions were defined to always return data in the form of data frames. Thus, every function output
can be extracted using pandas’ to_csv() or to_excel() functions in the same way as in the previous example.
CHAPTER
FOUR
FUNCTIONS
4.1 list_series
4.2 describe
describe(series)
Describes the specified time series. series must be the time series’ code.
series str Time series code
4.3 timeseries
4.4 metadata
series       str, optional     Time series code. For the available time series, run list_series().
big_theme    str, optional     Big theme by which the return will be filtered. Options: “Macroeconômico”, “Regional” or “Social”.
source       str, optional     Source by which the return will be filtered. For available sources, run the sources() function.
country      str, optional     Country ID by which the return will be filtered. For available countries and their IDs, run the countries() function.
frequency    str, optional     Frequency by which the return will be filtered.
unit         str, optional     Unit by which the return will be filtered.
measure      str, optional     Measure by which the return will be filtered.
status       str, optional     Status by which the return will be filtered. Available options: “A” and “I”.
source_ext   str, optional     Source extended name by which the return will be filtered.
source_url   str, optional     Source URL by which the return will be filtered.
last_update  str, optional     Last update date by which the return will be filtered.
code         str, optional     Time series code by which the return will be filtered.
comment      str, optional     Time series comment by which the return will be filtered.
name         str, optional     Time series name by which the return will be filtered.
numerica     bool, optional    Whether the series is numeric: True or False.
theme_id     str, optional     Theme by which the return will be filtered. For available themes, run the themes() function.
return       pandas.DataFrame  If no keyword is specified, returns a data frame containing all Ipeadata time series. Otherwise, returns only the ones that match the specified parameters.
4.5 latest_updates
latest_updates()
Returns the latest time series updates from Ipeadata, ordered from the most recently to the least recently updated series.
4.6 sources
sources()
Returns the available Ipeadata sources in the form of a data frame.
4.7 themes
4.8 territories
name             str, optional     Territory name by which the return will be filtered.
level            str, optional     Territory level by which the return will be filtered.
territory_id     str, optional     Territory ID by which the return will be filtered.
area             float, optional   Territorial area by which the return will be filtered.
areaGreaterThan  float, optional   Territorial area restriction by which the return will be filtered. The function will return only territories with area strictly greater than the submitted value.
areaSmallerThan  float, optional   Territorial area restriction by which the return will be filtered. The function will return only territories with area strictly smaller than the submitted value.
capital          bool, optional    Return only capitals? True or False.
return           pandas.DataFrame  Returns available Ipeadata territories.
4.9 countries
countries()
Returns all available Ipeadata countries and country IDs in the form of a data frame.
4.10 api_call
api_call(api)
For advanced users. Returns raw Ipeadata API data in the form of a data frame.
api str