<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
  <title>rOpenGov R packages for open government data analytics</title>
  <link>http://ropengov.org/</link>
  <description>Recent content on rOpenGov R packages for open government data analytics</description>
  <generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Mon, 09 Nov 2020 19:55:24 +0200</lastBuildDate>

<atom:link href="http://ropengov.org/index.xml" rel="self" type="application/rss+xml" />


<item>
  <title>usdoj: For Accessing U.S. Department of Justice (DOJ) Open Data</title>
  <link>http://ropengov.org/2023/04/usdoj-cran-release/</link>
  <pubDate>Sat, 01 Apr 2023 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2023/04/usdoj-cran-release/</guid>
  <description>&lt;p&gt;usdoj, a package for fetching data from the United States (US) Department of Justice (DOJ) API, was released as part of the rOpenGov project. usdoj provides easy access to US DOJ press releases, blog entries, and speeches. Optional parameters allow users to specify the number of results starting from the earliest or latest entries, and whether these results contain keywords. Data is cleaned for analysis and returned in a data frame.&lt;/p&gt;
&lt;p&gt;US DOJ press releases, blog posts, and speeches are official media through which the United States government publicizes front-line information about law, enforcement, and crime that may be of interest to members of the public, researchers and analysts, and members of other government branches. They cover divisions such as the Federal Bureau of Investigation (FBI), the Offices of the United States Attorneys (USAO), the National Security Division (NSD), the Civil Division, the Tax Division, the Bureau of Alcohol, Tobacco, Firearms and Explosives (ATF), the Drug Enforcement Administration (DEA), and more. New media are published on a regular basis.&lt;/p&gt;
&lt;p&gt;usdoj makes these media accessible in an analysis-ready format through three functions that search for and return relevant results: &lt;code&gt;doj_press_releases()&lt;/code&gt;, &lt;code&gt;doj_blog_posts()&lt;/code&gt;, and &lt;code&gt;doj_speeches()&lt;/code&gt;. Data is cleaned and structured before it is returned as a data frame with fields for the body text, date, title, URL, and the name of the corresponding division, to name just a few.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;library&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;usmap&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;library&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;lubridate&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;library&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;tidyverse&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;library&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;usdoj&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;

&lt;span style=&#34;color:#8f5902;font-style:italic&#34;&gt;# press_releases &amp;lt;- doj_press_releases(n_results = 100000, search_direction = &amp;#34;DESC&amp;#34;)&lt;/span&gt;
&lt;span style=&#34;color:#8f5902;font-style:italic&#34;&gt;# write_csv(press_releases, &amp;#34;press_releases_doj_intro.csv&amp;#34;)&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;read_csv&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;press_releases_doj_intro.csv&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;

&lt;span style=&#34;color:#000&#34;&gt;state&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;statepop&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;full&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;count&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;list&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;()&lt;/span&gt;

&lt;span style=&#34;color:#000&#34;&gt;for&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;state_name&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;in&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;state&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;{&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;count&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;append&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;count&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;sum&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;str_count&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;state_name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)))&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;}&lt;/span&gt;

&lt;span style=&#34;color:#000&#34;&gt;df&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;data.frame&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;state&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;unlist&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;state&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;),&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;count&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;unlist&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;count&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt;

&lt;span style=&#34;color:#000&#34;&gt;earliest_date&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;ymd&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;min&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;earliest_date&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;paste0&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;month&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;earliest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;label&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#204a87;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;),&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;day&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;earliest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;),&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;, &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;year&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;earliest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt;
  
&lt;span style=&#34;color:#000&#34;&gt;latest_date&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;ymd&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;max&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;latest_date&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;paste0&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;month&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;latest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;label&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#204a87;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;),&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;day&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;latest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;),&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;, &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;year&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;latest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt;

&lt;span style=&#34;color:#000&#34;&gt;plot_usmap&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;data&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;df&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; 
           &lt;span style=&#34;color:#000&#34;&gt;values&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;count&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; 
           &lt;span style=&#34;color:#000&#34;&gt;color&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;#4682b4&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;+&lt;/span&gt; 
  &lt;span style=&#34;color:#000&#34;&gt;scale_fill_continuous&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;low&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;white&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; 
                        &lt;span style=&#34;color:#000&#34;&gt;high&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;#4682b4&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; 
                        &lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;n&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; 
                        &lt;span style=&#34;color:#000&#34;&gt;label&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;scales&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;comma&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;+&lt;/span&gt; 
  &lt;span style=&#34;color:#000&#34;&gt;theme&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;legend.position&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;right&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;+&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;labs&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;title&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;US DOJ Press Releases Involving the FBI Corresponding to State&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; 
       &lt;span style=&#34;color:#000&#34;&gt;subtitle&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;paste0&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;Raw Count From &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;earliest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34; to &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;latest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;),&lt;/span&gt; 
       &lt;span style=&#34;color:#000&#34;&gt;caption&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;This plot was generated using data from usdoj. It visualizes the raw count of press releases that are tagged 
&lt;/span&gt;&lt;span style=&#34;color:#4e9a06&#34;&gt;       as involving both the FBI and a state&amp;#39;s office of the United States Attorney.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div style=&#34;text-align: center;&#34;&gt;
&lt;figure &gt;
    &lt;a href=&#34;http://ropengov.org/post/2023-04-01-usdoj-cran-release.en_files/unnamed-chunk-1-1.png&#34;&gt;
        &lt;img src=&#34;http://ropengov.org/post/2023-04-01-usdoj-cran-release.en_files/unnamed-chunk-1-1.png&#34; width=&#34;800&#34;
alt=&#34;A choropleth map of the United States, visualizing the number of press releases containing mentions of the FBI in each state&#34;  /&gt;
    &lt;/a&gt;
    
&lt;/figure&gt;
&lt;/div&gt;
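&lt;p&gt;The counting loop above can also be written without growing a list inside a &lt;code&gt;for&lt;/code&gt; loop. The following is a minimal sketch on made-up data, not output from usdoj: the &lt;code&gt;division_names&lt;/code&gt; vector and its &amp;ldquo;USAO - State&amp;rdquo; values are hypothetical stand-ins for &lt;code&gt;press_releases$name&lt;/code&gt;. It also shows that plain substring matching counts &amp;ldquo;Virginia&amp;rdquo; inside &amp;ldquo;West Virginia&amp;rdquo;, which is worth keeping in mind when reading raw counts.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(stringr)

# Hypothetical stand-in for press_releases$name (values are made up)
division_names &amp;lt;- c(
  &amp;#34;USAO - Virginia, Eastern&amp;#34;,
  &amp;#34;USAO - West Virginia, Southern&amp;#34;,
  &amp;#34;Tax Division; USAO - Texas, Northern&amp;#34;,
  &amp;#34;Criminal Division&amp;#34;
)
states &amp;lt;- c(&amp;#34;Texas&amp;#34;, &amp;#34;Virginia&amp;#34;, &amp;#34;West Virginia&amp;#34;)

# For each state, total the literal matches across all entries
counts &amp;lt;- vapply(
  states,
  function(s) sum(str_count(division_names, fixed(s))),
  integer(1)
)

# &amp;#34;Virginia&amp;#34; is counted twice because it also matches &amp;#34;West Virginia&amp;#34;
df &amp;lt;- data.frame(state = states, count = unname(counts))
df
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;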
&lt;h2 id=&#34;demonstration-text-mining-united-states-department-of-justice-open-data&#34;&gt;Demonstration: Text Mining United States Department of Justice Open Data&lt;/h2&gt;
&lt;p&gt;The data returned by usdoj is in a format that can easily undergo additional processing for analysis. This section shows one way of doing this by walking through the steps of a TF-IDF (term frequency-inverse document frequency) analysis, which reveals which words are characteristic of certain divisions and not others.&lt;/p&gt;
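&lt;p&gt;As a rough illustration of what TF-IDF measures, the sketch below runs tidytext&amp;rsquo;s &lt;code&gt;bind_tf_idf()&lt;/code&gt; on a tiny hand-made table of word counts. The office names, words, and counts are invented for illustration and are not usdoj output. A word that appears in every document gets an inverse document frequency of zero, so only words concentrated in one document receive a positive score.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(tibble)
library(tidytext)

# Hypothetical word counts for two offices (all values are made up)
toy_counts &amp;lt;- tribble(
  ~name,           ~word,     ~n,
  &amp;#34;USAO - Texas&amp;#34;,  &amp;#34;fraud&amp;#34;,   4,
  &amp;#34;USAO - Texas&amp;#34;,  &amp;#34;border&amp;#34;,  6,
  &amp;#34;USAO - Alaska&amp;#34;, &amp;#34;fraud&amp;#34;,   5,
  &amp;#34;USAO - Alaska&amp;#34;, &amp;#34;fishing&amp;#34;, 5
)

# tf  = n / total words in the document
# idf = ln(number of documents / number of documents containing the word)
toy_tf_idf &amp;lt;- bind_tf_idf(toy_counts, word, name, n)

# &amp;#34;fraud&amp;#34; occurs in both documents, so its tf_idf is 0;
# &amp;#34;border&amp;#34; and &amp;#34;fishing&amp;#34; each occur in one, so theirs is positive
toy_tf_idf
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;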
&lt;h4 id=&#34;installing-and-loading-libraries&#34;&gt;Installing and Loading Libraries&lt;/h4&gt;
&lt;p&gt;usdoj can be installed from CRAN (using &lt;code&gt;install.packages(&amp;quot;usdoj&amp;quot;)&lt;/code&gt;) or from rOpenGov&amp;rsquo;s r-universe. For this tutorial we will also use the tidyverse and tidytext libraries.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;library&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;usdoj&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;library&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;usmap&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;library&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;tidyverse&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;library&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;tidytext&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;library&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;lubridate&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We will start by collecting US DOJ press releases with the corresponding function, &lt;code&gt;doj_press_releases()&lt;/code&gt;. By default, the most recently published records are returned. Passing &lt;code&gt;search_direction = &amp;quot;ASC&amp;quot;&lt;/code&gt; to the function will instead return data starting at the earliest published records. usdoj automatically flattens nested fields. The resulting data frame is easily text mined.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;doj_press_releases&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;n_results&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#0000cf;font-weight:bold&#34;&gt;700&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We will also save the date range present in the data for use in our visualization later on.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;earliest_date&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;ymd&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;min&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;earliest_date&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;paste0&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;month&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;earliest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;label&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#204a87;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;),&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;day&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;earliest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;),&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;, &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;year&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;earliest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt;
  
&lt;span style=&#34;color:#000&#34;&gt;latest_date&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;ymd&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;max&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;latest_date&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;paste0&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;month&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;latest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;label&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#204a87;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;),&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;day&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;latest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;),&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;, &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;year&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;latest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;write_csv&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;intro_pt_2.csv&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;read_csv&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;intro_pt_2.csv&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;show_col_types&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#204a87;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;A single field may contain multiple values. For example, the field &amp;ldquo;name&amp;rdquo; contains the (sometimes multiple) US DOJ divisions related to a press release, as shown by lines 7 and 9. A single press release may relate to USAOs across multiple states or may implicate multiple offices.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;head&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0000cf;font-weight:bold&#34;&gt;10&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre&gt;&lt;code&gt;[1] &amp;quot;Office of the Attorney General&amp;quot;            
[2] &amp;quot;Civil Rights Division&amp;quot;                     
[3] &amp;quot;Civil Division&amp;quot;                            
[4] &amp;quot;Criminal Division&amp;quot;                         
[5] &amp;quot;Environment and Natural Resources Division&amp;quot;
[6] &amp;quot;Office of the Deputy Attorney General&amp;quot;     
[7] &amp;quot;Environment and Natural Resources Division&amp;quot;
[8] &amp;quot;Tax Division&amp;quot;                              
[9] &amp;quot;Criminal Division&amp;quot;                         
[10] &amp;quot;Tax Division&amp;quot; 
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;In this demonstration we will process the text body, transforming the dense blocks of natural language text into a structure that is more easily quantifiable.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;tail&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;body&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0000cf;font-weight:bold&#34;&gt;2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For this demonstration, we will just compare the words relating to United States Attorney Offices (USAOs) across different states. We will do this by removing mentions of the other divisions from the &amp;ldquo;name&amp;rdquo; field and filtering for just press releases that contain USAO as a division.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;state_names&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;paste&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;statepop&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;full&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;collapse&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;|USAO - &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;

&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt;  &lt;span style=&#34;color:#000&#34;&gt;str_extract&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;paste0&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;USAO - &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;state_names&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt;

&lt;span style=&#34;color:#000&#34;&gt;usao_press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;filter&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;str_detect&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;USAO&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The following code tokenizes the body text, a process through which dense paragraphs are separated into one-word-per-row.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;tokenized_press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;usao_press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;select&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;body&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;unnest_tokens&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;word&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;body&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For this demonstration we will remove digits because they occur frequently in the data set and, for our purposes, they don&amp;rsquo;t reveal much meaningful information.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;cleaned_tokenized_press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;tokenized_press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;filter&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;!&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;str_detect&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;word&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;[[:digit:]]&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In preparation for a TF-IDF analysis, we will count the number of times a word appears in each unique &amp;ldquo;name&amp;rdquo; grouping. In other words, if the same word appears once in &amp;ldquo;Civil Division&amp;rdquo; and once in &amp;ldquo;Antitrust Division,&amp;rdquo; then the count will be &amp;ldquo;one&amp;rdquo; for each division (as opposed to &amp;ldquo;two,&amp;rdquo; reflecting the overall count). To remove typos and other such errors, we will also remove words that appear five times or fewer within a grouping.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;counted_tokenized_press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;cleaned_tokenized_press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;count&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;word&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;filter&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;n&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#0000cf;font-weight:bold&#34;&gt;5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;head&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;counted_tokenized_press_releases&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We will now gather the overall word count per &amp;ldquo;name&amp;rdquo; grouping and use &lt;code&gt;bind_tf_idf()&lt;/code&gt; to see which words are characteristic of one grouping and not the others.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;total_words_per_group&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;counted_tokenized_press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  &lt;span style=&#34;color:#000&#34;&gt;group_by&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  &lt;span style=&#34;color:#000&#34;&gt;summarize&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;total&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;sum&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;n&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;ungroup&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;()&lt;/span&gt;

&lt;span style=&#34;color:#000&#34;&gt;counts_and_totals&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;left_join&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;counted_tokenized_press_releases&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;total_words_per_group&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;by&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;name&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;

&lt;span style=&#34;color:#000&#34;&gt;usao_press_releases_tf_idf&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;counts_and_totals&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;bind_tf_idf&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;word&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;n&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We can now visualize which words are characteristic of one &amp;ldquo;name&amp;rdquo; grouping and not another. In the following code we take the top 10 words per name grouping and plot them based on their TF-IDF scores.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#000&#34;&gt;top_usao_press_releases&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;usao_press_releases_tf_idf&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;group_by&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;arrange&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;desc&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;tf_idf&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;slice&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0000cf;font-weight:bold&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#0000cf;font-weight:bold&#34;&gt;10&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;ungroup&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;()&lt;/span&gt; 

&lt;span style=&#34;color:#000&#34;&gt;ggplot&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;top_usao_press_releases&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt;
       &lt;span style=&#34;color:#000&#34;&gt;aes&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;tf_idf&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;reorder_within&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;word&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;tf_idf&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;),&lt;/span&gt;
           &lt;span style=&#34;color:#000&#34;&gt;fill&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;+&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;labs&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;title&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;TF-IDF Scores By USAO Grouping&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt;
       &lt;span style=&#34;color:#000&#34;&gt;subtitle&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;paste0&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;US DOJ Press Releases From &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;earliest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34; to &amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;latest_date&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;))&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;+&lt;/span&gt; 
  &lt;span style=&#34;color:#000&#34;&gt;geom_col&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;show.legend&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#204a87;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;+&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;facet_wrap&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;~&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;ncol&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#0000cf;font-weight:bold&#34;&gt;2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;scales&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;free&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;+&lt;/span&gt;
  &lt;span style=&#34;color:#000&#34;&gt;scale_y_reordered&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;()&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;+&lt;/span&gt; 
  &lt;span style=&#34;color:#000&#34;&gt;labs&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#000&#34;&gt;x&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#4e9a06&#34;&gt;&amp;#34;tf-idf&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#000&#34;&gt;y&lt;/span&gt; &lt;span style=&#34;color:#ce5c00;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#204a87;font-weight:bold&#34;&gt;NULL&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div style=&#34;text-align: center;&#34;&gt;
&lt;figure &gt;
    &lt;a href=&#34;http://ropengov.org/post/2023-04-01-usdoj-cran-release.en_files/unnamed-chunk-14-1.png&#34;&gt;
        &lt;img src=&#34;http://ropengov.org/post/2023-04-01-usdoj-cran-release.en_files/unnamed-chunk-14-1.png&#34; width=&#34;800&#34;
alt=&#34;A graph containing bar charts for each US state, visualizing the TF-IDF scores by USAO grouping&#34;  /&gt;
    &lt;/a&gt;
    
&lt;/figure&gt;
&lt;/div&gt;
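&lt;p&gt;To make the TF-IDF scores more concrete, the following base R sketch computes them by hand on a small table of counts. It assumes the usual definition applied by &lt;code&gt;bind_tf_idf()&lt;/code&gt;: term frequency multiplied by the natural log of the number of groupings divided by the number of groupings that contain the word. The grouping names and counts are made up for illustration and are not taken from the DOJ data.&lt;/p&gt;

```r
# Made-up counts: one row per (name, word) pair, shaped like the output of count(name, word).
counts = data.frame(
  name = c("USAO - Ohio", "USAO - Ohio", "USAO - Texas", "USAO - Texas"),
  word = c("fraud", "cleveland", "fraud", "houston"),
  n    = c(8, 2, 6, 4)
)

# Term frequency: each word's share of all words counted in its grouping.
counts$tf = counts$n / ave(counts$n, counts$name, FUN = sum)

# Inverse document frequency: natural log of
# (number of groupings / number of groupings that contain the word).
n_groups = length(unique(counts$name))
groups_with_word = ave(rep(1, nrow(counts)), counts$word, FUN = sum)
counts$idf = log(n_groups / groups_with_word)

# TF-IDF is the product of the two.
counts$tf_idf = counts$tf * counts$idf
print(counts)
```

&lt;p&gt;In this toy table &amp;ldquo;fraud&amp;rdquo; appears in both groupings and therefore scores zero, while the two city names, each of which appears in only one grouping, receive positive scores.&lt;/p&gt;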
</description> 
  </item>
  
<item>
  <title>Regions package for Eurostat sub-national statistics</title>
  <link>http://ropengov.org/2021/06/regions-cran-release/</link>
  <pubDate>Wed, 16 Jun 2021 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2021/06/regions-cran-release/</guid>
  <description>
&lt;script src=&#34;http://ropengov.org/2021/06/regions-cran-release/index.en_files/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;
&lt;style type=&#34;text/css&#34;&gt;
pre &gt; code.sourceCode { white-space: pre; position: relative; }
pre &gt; code.sourceCode &gt; span { display: inline-block; line-height: 1.25; }
pre &gt; code.sourceCode &gt; span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode &gt; span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre &gt; code.sourceCode { white-space: pre-wrap; }
pre &gt; code.sourceCode &gt; span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
  { counter-reset: source-line 0; }
pre.numberSource code &gt; span
  { position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code &gt; span &gt; a:first-child::before
  { content: counter(source-line);
    position: relative; left: -1em; text-align: right; vertical-align: baseline;
    border: none; display: inline-block;
    -webkit-touch-callout: none; -webkit-user-select: none;
    -khtml-user-select: none; -moz-user-select: none;
    -ms-user-select: none; user-select: none;
    padding: 0 4px; width: 4em;
    color: #aaaaaa;
  }
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa;  padding-left: 4px; }
div.sourceCode
  {  background-color: #f8f8f8; }
@media screen {
pre &gt; code.sourceCode &gt; span &gt; a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ef2929; } /* Alert */
code span.an { color: #8f5902; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #c4a000; } /* Attribute */
code span.bn { color: #0000cf; } /* BaseN */
code span.cf { color: #204a87; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4e9a06; } /* Char */
code span.cn { color: #000000; } /* Constant */
code span.co { color: #8f5902; font-style: italic; } /* Comment */
code span.cv { color: #8f5902; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #8f5902; font-weight: bold; font-style: italic; } /* Documentation */
code span.dt { color: #204a87; } /* DataType */
code span.dv { color: #0000cf; } /* DecVal */
code span.er { color: #a40000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #0000cf; } /* Float */
code span.fu { color: #000000; } /* Function */
code span.im { } /* Import */
code span.in { color: #8f5902; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #204a87; font-weight: bold; } /* Keyword */
code span.op { color: #ce5c00; font-weight: bold; } /* Operator */
code span.ot { color: #8f5902; } /* Other */
code span.pp { color: #8f5902; font-style: italic; } /* Preprocessor */
code span.sc { color: #000000; } /* SpecialChar */
code span.ss { color: #4e9a06; } /* SpecialString */
code span.st { color: #4e9a06; } /* String */
code span.va { color: #000000; } /* Variable */
code span.vs { color: #4e9a06; } /* VerbatimString */
code span.wa { color: #8f5902; font-weight: bold; font-style: italic; } /* Warning */
&lt;/style&gt;


&lt;p&gt;The new version of our &lt;a href=&#34;https://ropengov.org/&#34;&gt;rOpenGov&lt;/a&gt; R package &lt;a href=&#34;https://regions.dataobservatory.eu/&#34;&gt;regions&lt;/a&gt; was released today on CRAN. This package is one of the engines of our experimental open data-as-service &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34;&gt;Green Deal Data Observatory&lt;/a&gt;, &lt;a href=&#34;https://economy.dataobservatory.eu/&#34;&gt;Economy Data Observatory&lt;/a&gt;, and &lt;a href=&#34;https://music.dataobservatory.eu/&#34;&gt;Digital Music Observatory&lt;/a&gt; prototypes, which aim to place open data packages into open-source applications.&lt;/p&gt;
&lt;p&gt;In international comparisons, the use of nationally aggregated indicators often has many disadvantages: they exhibit very different levels of homogeneity, and the data are often very limited in the number of observations available for a cross-sectional analysis. When comparing European countries, a few missing cases can limit the cross-section to around 20 countries, which rules out many analytical methods.&lt;/p&gt;
&lt;p&gt;Working with sub-national statistics has many advantages: the similarity of the aggregation level and high number of observations can allow more precise control of model parameters and errors, and the number of observations grows from 20 to 200-300.&lt;/p&gt;
&lt;div class=&#34;figure&#34; style=&#34;text-align: center&#34;&gt;&lt;span id=&#34;fig:original-map&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;indicator_with_map.png&#34; alt=&#34;The change from national to sub-national level comes with a huge data processing price: internal administrative boundaries, their names and codes change very frequently.&#34; width=&#34;80%&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 1: The change from national to sub-national level comes with a huge data processing price: internal administrative boundaries, their names and codes change very frequently.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Yet the change from national to sub-national level comes with a huge data processing price. National boundaries are relatively stable, with only a handful of changes in each recent decade, because changing them requires a more-or-less global consensus. States, however, are free to change their internal administrative boundaries, and they do so frequently. This means that the names, identification codes and boundary definitions of sub-national regions change very often. Joining data from different sources and different years can therefore be very difficult.&lt;/p&gt;
&lt;div class=&#34;figure&#34; style=&#34;text-align: center&#34;&gt;&lt;span id=&#34;fig:recoded-map&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;recoded_indicator_with_map.png&#34; alt=&#34;Our regions R package helps the data processing, validation and imputation of sub-national, regional datasets and their coding.&#34; width=&#34;80%&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 2: Our regions R package helps the data processing, validation and imputation of sub-national, regional datasets and their coding.
&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Switching the analysis from the national to the sub-national level has numerous advantages. It does, however, require more effort in data processing, validation and imputation. Our &lt;a href=&#34;https://regions.dataobservatory.eu/&#34;&gt;regions&lt;/a&gt; package aims to help in this process.&lt;/p&gt;
&lt;p&gt;You can review the problem, and the code that created the two map comparisons, in the &lt;a href=&#34;https://regions.dataobservatory.eu/articles/maping.html&#34;&gt;Mapping Regional Data, Mapping Metadata Problems&lt;/a&gt; vignette article of the package. A more detailed problem description can be found in the &lt;a href=&#34;https://regions.dataobservatory.eu/articles/Regional_stats.html&#34;&gt;Working With Regional, Sub-National Statistical Products&lt;/a&gt; vignette.&lt;/p&gt;
&lt;p&gt;This package is an offspring of the &lt;a href=&#34;https://ropengov.github.io/eurostat/&#34;&gt;eurostat&lt;/a&gt; package on &lt;a href=&#34;https://ropengov.github.io/&#34;&gt;rOpenGov&lt;/a&gt;. It started as a tool to validate and re-code regional Eurostat statistics, but it aims to be a general solution for all sub-national statistics. It will be developed in parallel with other rOpenGov packages.&lt;/p&gt;
&lt;div id=&#34;get-the-package&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Get the Package&lt;/h2&gt;
&lt;p&gt;You can install the development version from &lt;a href=&#34;https://github.com/&#34;&gt;GitHub&lt;/a&gt; with:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb1&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb1-1&#34;&gt;&lt;a href=&#34;#cb1-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;devtools&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;install_github&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;rOpenGov/regions&amp;quot;&lt;/span&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;or the released version from CRAN:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb2&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb2-1&#34;&gt;&lt;a href=&#34;#cb2-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;install.packages&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;regions&amp;quot;&lt;/span&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can review the complete package documentation on &lt;a href=&#34;https://regions.dataobservatory.eu/&#34;&gt;regions.dataobservatory.eu&lt;/a&gt;. If you find any problems with the code, please raise an issue on &lt;a href=&#34;https://github.com/rOpenGov/regions&#34;&gt;GitHub&lt;/a&gt;. Pull requests are welcome if you agree with the &lt;a href=&#34;https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html&#34;&gt;Contributor Code of Conduct&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If you use &lt;code&gt;regions&lt;/code&gt; in your work, please &lt;a href=&#34;https://doi.org/10.5281/zenodo.4965909&#34;&gt;cite the package&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
</description> 
  </item>
  
<item>
  <title>Economic and environmental impact analysis with iotables</title>
  <link>http://ropengov.org/2021/06/iotables-cran-release/</link>
  <pubDate>Fri, 04 Jun 2021 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2021/06/iotables-cran-release/</guid>
  <description>
&lt;script src=&#34;http://ropengov.org/2021/06/iotables-cran-release/index.en_files/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;
&lt;style type=&#34;text/css&#34;&gt;
pre &gt; code.sourceCode { white-space: pre; position: relative; }
pre &gt; code.sourceCode &gt; span { display: inline-block; line-height: 1.25; }
pre &gt; code.sourceCode &gt; span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode &gt; span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre &gt; code.sourceCode { white-space: pre-wrap; }
pre &gt; code.sourceCode &gt; span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
  { counter-reset: source-line 0; }
pre.numberSource code &gt; span
  { position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code &gt; span &gt; a:first-child::before
  { content: counter(source-line);
    position: relative; left: -1em; text-align: right; vertical-align: baseline;
    border: none; display: inline-block;
    -webkit-touch-callout: none; -webkit-user-select: none;
    -khtml-user-select: none; -moz-user-select: none;
    -ms-user-select: none; user-select: none;
    padding: 0 4px; width: 4em;
    color: #aaaaaa;
  }
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa;  padding-left: 4px; }
div.sourceCode
  {  background-color: #f8f8f8; }
@media screen {
pre &gt; code.sourceCode &gt; span &gt; a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ef2929; } /* Alert */
code span.an { color: #8f5902; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #c4a000; } /* Attribute */
code span.bn { color: #0000cf; } /* BaseN */
code span.cf { color: #204a87; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4e9a06; } /* Char */
code span.cn { color: #000000; } /* Constant */
code span.co { color: #8f5902; font-style: italic; } /* Comment */
code span.cv { color: #8f5902; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #8f5902; font-weight: bold; font-style: italic; } /* Documentation */
code span.dt { color: #204a87; } /* DataType */
code span.dv { color: #0000cf; } /* DecVal */
code span.er { color: #a40000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #0000cf; } /* Float */
code span.fu { color: #000000; } /* Function */
code span.im { } /* Import */
code span.in { color: #8f5902; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #204a87; font-weight: bold; } /* Keyword */
code span.op { color: #ce5c00; font-weight: bold; } /* Operator */
code span.ot { color: #8f5902; } /* Other */
code span.pp { color: #8f5902; font-style: italic; } /* Preprocessor */
code span.sc { color: #000000; } /* SpecialChar */
code span.ss { color: #4e9a06; } /* SpecialString */
code span.st { color: #4e9a06; } /* String */
code span.va { color: #000000; } /* Variable */
code span.vs { color: #4e9a06; } /* VerbatimString */
code span.wa { color: #8f5902; font-weight: bold; font-style: italic; } /* Warning */
&lt;/style&gt;


&lt;p&gt;We have released a new version of &lt;a href=&#34;https://iotables.dataobservatory.eu/&#34;&gt;iotables&lt;/a&gt; as part of the &lt;a href=&#34;http://ropengov.org/&#34;&gt;rOpenGov&lt;/a&gt; project. The package, as the name suggests, works with European symmetric input-output tables (SIOTs). SIOTs are among the most complex governmental statistical products. They show how each country’s 64 agricultural, industrial, service, and sometimes household sectors relate to each other. They are estimated from various components of GDP and tax collection at least every five years.&lt;/p&gt;
&lt;p&gt;SIOTs offer great value to policy-makers and analysts, who can make more than educated guesses about how a million euros, pounds or Czech korunas spent on a certain sector will impact other sectors of the economy, employment or GDP. What happens when a bank starts to give new loans and advertise them? How is an increase in economic activity going to affect the amount of wages paid, and where will consumers most likely spend their wages? As the national economies begin to reopen after the COVID-19 pandemic lockdowns, we can utilize SIOTs to calculate direct and indirect employment effects or value added effects of government grant programs to sectors such as cultural and creative industries or actors such as venues for performing arts, movie theaters, bars and restaurants.&lt;/p&gt;
&lt;p&gt;Making such calculations requires a bit of matrix algebra and a solid understanding of input-output economics, direct and indirect effects, and multipliers. Economists, grant designers, and policy makers have those skills, but until now such calculations were made either in cumbersome Excel sheets or in proprietary software, because the key to these calculations is to keep vectors and matrices, which have at least one dimension of 64, perfectly aligned. We made this process reproducible with &lt;a href=&#34;https://iotables.dataobservatory.eu/&#34;&gt;iotables&lt;/a&gt; and &lt;a href=&#34;https://CRAN.R-project.org/package=eurostat&#34;&gt;eurostat&lt;/a&gt; on &lt;a href=&#34;http://ropengov.org/&#34;&gt;rOpenGov&lt;/a&gt;.&lt;/p&gt;
&lt;div id=&#34;accessing-and-tidying-the-data-programmatically&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Accessing and tidying the data programmatically&lt;/h2&gt;
&lt;p&gt;The iotables package is in a way an extension to the &lt;em&gt;eurostat&lt;/em&gt; R package, which provides programmatic access to the &lt;a href=&#34;https://ec.europa.eu/eurostat&#34;&gt;Eurostat&lt;/a&gt; data warehouse. The reason for releasing a new package is that working with SIOTs requires plenty of meticulous data wrangling based on various &lt;em&gt;metadata&lt;/em&gt; sources, apart from only accessing the &lt;em&gt;data&lt;/em&gt; itself. When working with matrix equations, the bar is higher than with tidy data. Not only must your rows and columns match, but their ordering must strictly conform to the quadrants of the matrix system, including the connecting trade or tax matrices.&lt;/p&gt;
&lt;p&gt;When you download a country’s SIOT, you receive a very, very long data frame in long form, which contains the matrix values and their labels like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;## Table naio_10_cp1700 cached at /var/folders/nb/sxk6cbzd5455n3_rhxnw2xnw0000gn/T//Rtmp7lAZZa/eurostat/naio_10_cp1700_date_code_FF.rds&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb2&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb2-1&#34;&gt;&lt;a href=&#34;#cb2-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# we save it for further reference here &lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb2-2&#34;&gt;&lt;a href=&#34;#cb2-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;saveRDS&lt;/span&gt;(naio_10_cp1700, &lt;span class=&#34;st&#34;&gt;&amp;quot;not_included/naio_10_cp1700_date_code_FF.rds&amp;quot;&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb2-3&#34;&gt;&lt;a href=&#34;#cb2-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb2-4&#34;&gt;&lt;a href=&#34;#cb2-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# should you need to retrieve the large tempfiles, they are in &lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb2-5&#34;&gt;&lt;a href=&#34;#cb2-5&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;dir&lt;/span&gt; (&lt;span class=&#34;fu&#34;&gt;file.path&lt;/span&gt;(&lt;span class=&#34;fu&#34;&gt;tempdir&lt;/span&gt;(), &lt;span class=&#34;st&#34;&gt;&amp;quot;eurostat&amp;quot;&lt;/span&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb3&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb3-1&#34;&gt;&lt;a href=&#34;#cb3-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;dplyr&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;slice_head&lt;/span&gt;(naio_10_cp1700, &lt;span class=&#34;at&#34;&gt;n =&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;5&lt;/span&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 5 x 7
##   unit    stk_flow induse  prod_na geo       time        values
##   &amp;lt;chr&amp;gt;   &amp;lt;chr&amp;gt;    &amp;lt;chr&amp;gt;   &amp;lt;chr&amp;gt;   &amp;lt;chr&amp;gt;     &amp;lt;date&amp;gt;       &amp;lt;dbl&amp;gt;
## 1 MIO_EUR DOM      CPA_A01 B1G     EA19      2019-01-01 141873.
## 2 MIO_EUR DOM      CPA_A01 B1G     EU27_2020 2019-01-01 174976.
## 3 MIO_EUR DOM      CPA_A01 B1G     EU28      2019-01-01 187814.
## 4 MIO_EUR DOM      CPA_A01 B2A3G   EA19      2019-01-01      0 
## 5 MIO_EUR DOM      CPA_A01 B2A3G   EU27_2020 2019-01-01      0&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The metadata reads like this: the units are in millions of euros, we are analyzing domestic flows, and the rows show the national account items &lt;code&gt;B1-B2&lt;/code&gt; for the industry &lt;code&gt;A01&lt;/code&gt;. The information of a 64x64 matrix (the SIOT) and its connecting matrices, such as taxes, or employment, or &lt;span class=&#34;math inline&#34;&gt;\(CO_{2}\)&lt;/span&gt; emissions, must be placed in exactly one correct ordering of columns and rows. A data wrangling mistake will usually lead either to an outright error (the matrix equation has no solution) or, what is worse, to an algebraic error that is very difficult to trace. Our package not only labels this data meaningfully, but also creates tidy data frames that contain each necessary matrix or vector with a key column.&lt;/p&gt;
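&lt;p&gt;The ordering requirement can be illustrated with a minimal base R sketch. It does not use iotables; the three sector names and all values are made up for illustration, and only the column names &lt;code&gt;prod_na&lt;/code&gt;, &lt;code&gt;induse&lt;/code&gt; and &lt;code&gt;values&lt;/code&gt; follow the Eurostat output above. Fixing the factor levels before reshaping is one way to make sure that rows and columns end up in the same, known order, no matter how the long-form rows happen to be sorted.&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;# Hypothetical sector names and values, for illustration only
sectors &amp;lt;- c(&amp;quot;agriculture&amp;quot;, &amp;quot;industry&amp;quot;, &amp;quot;services&amp;quot;)

# Long-form data: one row per (row item, column item) pair
long &amp;lt;- expand.grid(prod_na = sectors, induse = sectors,
                    stringsAsFactors = FALSE)
long$values &amp;lt;- c(10, 20, 5, 15, 40, 25, 8, 30, 50)

# Shuffle the rows to mimic an arbitrary download order
long &amp;lt;- long[c(3, 7, 1, 9, 5, 2, 8, 4, 6), ]

# Fix the ordering explicitly: factor levels define row and column order
long$prod_na &amp;lt;- factor(long$prod_na, levels = sectors)
long$induse  &amp;lt;- factor(long$induse,  levels = sectors)

# Reshape to a square matrix; rows are prod_na, columns are induse
flow_matrix &amp;lt;- tapply(long$values, list(long$prod_na, long$induse), sum)
flow_matrix&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Because the levels are set before reshaping, the row and column names of &lt;code&gt;flow_matrix&lt;/code&gt; are identical and in the same order, which is the property that the matrix equations rely on.&lt;/p&gt;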
&lt;p&gt;The iotables package contains the abbreviations and human-readable labels of three statistical vocabularies: the so-called &lt;code&gt;COICOP&lt;/code&gt; product codes, the &lt;code&gt;NACE&lt;/code&gt; industry codes, and the vocabulary of the &lt;code&gt;ESA2010&lt;/code&gt; definition of national accounts (which is the government equivalent of corporate accounting).&lt;/p&gt;
&lt;p&gt;Our package currently solves all equations for direct and indirect effects, multipliers and inter-industry linkages. Backward linkages show what happens with the suppliers of an industry, such as catering or advertising in the case of music festivals, if the festivals reopen. Forward linkages show how much extra demand this creates for connecting services that treat festivals as a ‘supplier’, such as cultural tourism.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;example&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Example&lt;/h2&gt;
&lt;p&gt;Let’s take Slovakia’s employment data as an example and match it with the latest structural information from the &lt;a href=&#34;http://appsso.eurostat.ec.europa.eu/nui/show.do?wai=true&amp;amp;dataset=naio_10_cp1700&#34;&gt;Symmetric input-output table at basic prices (product by product)&lt;/a&gt; Eurostat product.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;## Downloading employment data from the Eurostat database.&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Table lfsq_egan22d cached at /var/folders/nb/sxk6cbzd5455n3_rhxnw2xnw0000gn/T//Rtmp7lAZZa/eurostat/lfsq_egan22d_date_code_FF.rds&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A quick look at the Eurostat website already shows that there is a lot of work ahead to make the data look like an actual symmetric input-output table.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;iotable_get()&lt;/code&gt; function of iotables downloads the data and does basic labelling and preprocessing on the raw Eurostat files. Because of the size of the unfiltered dataset on Eurostat, the following code may take several minutes to run.&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb7&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb7-1&#34;&gt;&lt;a href=&#34;#cb7-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;sk_io &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt;  &lt;span class=&#34;fu&#34;&gt;iotable_get&lt;/span&gt; ( &lt;span class=&#34;at&#34;&gt;labelled_io_data =&lt;/span&gt; &lt;span class=&#34;cn&#34;&gt;NULL&lt;/span&gt;, &lt;/span&gt;
&lt;span id=&#34;cb7-2&#34;&gt;&lt;a href=&#34;#cb7-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;                        &lt;span class=&#34;at&#34;&gt;source =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;naio_10_cp1700&amp;quot;&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;geo =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;SK&amp;quot;&lt;/span&gt;, &lt;/span&gt;
&lt;span id=&#34;cb7-3&#34;&gt;&lt;a href=&#34;#cb7-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;                        &lt;span class=&#34;at&#34;&gt;year =&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;2015&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;unit =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;MIO_EUR&amp;quot;&lt;/span&gt;, &lt;/span&gt;
&lt;span id=&#34;cb7-4&#34;&gt;&lt;a href=&#34;#cb7-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;                        &lt;span class=&#34;at&#34;&gt;stk_flow =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;TOTAL&amp;quot;&lt;/span&gt;,&lt;/span&gt;
&lt;span id=&#34;cb7-5&#34;&gt;&lt;a href=&#34;#cb7-5&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;                        &lt;span class=&#34;at&#34;&gt;labelling =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;iotables&amp;quot;&lt;/span&gt; )&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## Reading cache file /var/folders/nb/sxk6cbzd5455n3_rhxnw2xnw0000gn/T//Rtmp7lAZZa/eurostat/naio_10_cp1700_date_code_FF.rds&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Table  naio_10_cp1700  read from cache file:  /var/folders/nb/sxk6cbzd5455n3_rhxnw2xnw0000gn/T//Rtmp7lAZZa/eurostat/naio_10_cp1700_date_code_FF.rds&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Saving 808 input-output tables into the temporary directory
## /var/folders/nb/sxk6cbzd5455n3_rhxnw2xnw0000gn/T//Rtmp7lAZZa&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Saved the raw data of this table type in temporary directory /var/folders/nb/sxk6cbzd5455n3_rhxnw2xnw0000gn/T//Rtmp7lAZZa/naio_10_cp1700.rds.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;input_coefficient_matrix_create()&lt;/code&gt; function creates an input coefficient matrix, which is used by most of the analytical functions.&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math inline&#34;&gt;\(a_{ij}\)&lt;/span&gt; = &lt;span class=&#34;math inline&#34;&gt;\(X_{ij}\)&lt;/span&gt; / &lt;span class=&#34;math inline&#34;&gt;\(x_j\)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;It checks that the columns are in the correct order, and additionally it replaces 0 values with 0.000001 to avoid division by zero.&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb12&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb12-1&#34;&gt;&lt;a href=&#34;#cb12-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;input_coeff_matrix_sk &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;input_coefficient_matrix_create&lt;/span&gt;(&lt;/span&gt;
&lt;span id=&#34;cb12-2&#34;&gt;&lt;a href=&#34;#cb12-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;at&#34;&gt;data_table =&lt;/span&gt; sk_io&lt;/span&gt;
&lt;span id=&#34;cb12-3&#34;&gt;&lt;a href=&#34;#cb12-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## Columns and rows of real_estate_imputed_a, extraterriorial_organizations are all zeros and will be removed.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then you can create the Leontief inverse, which contains all the structural information about the relationships among the 64 sectors of the chosen country (in this case, Slovakia), ready for the main equations of input-output economics:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb14&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb14-1&#34;&gt;&lt;a href=&#34;#cb14-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;I_sk &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;leontieff_inverse_create&lt;/span&gt;(input_coeff_matrix_sk)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And extract the primary inputs:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb15&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb15-1&#34;&gt;&lt;a href=&#34;#cb15-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;primary_inputs_sk &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;coefficient_matrix_create&lt;/span&gt;(&lt;/span&gt;
&lt;span id=&#34;cb15-2&#34;&gt;&lt;a href=&#34;#cb15-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;at&#34;&gt;data_table =&lt;/span&gt; sk_io, &lt;/span&gt;
&lt;span id=&#34;cb15-3&#34;&gt;&lt;a href=&#34;#cb15-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;at&#34;&gt;total =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;#39;output&amp;#39;&lt;/span&gt;, &lt;/span&gt;
&lt;span id=&#34;cb15-4&#34;&gt;&lt;a href=&#34;#cb15-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;at&#34;&gt;return =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;#39;primary_inputs&amp;#39;&lt;/span&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## Columns and rows of real_estate_imputed_a, extraterriorial_organizations are all zeros and will be removed.&lt;/code&gt;&lt;/pre&gt;
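&lt;p&gt;To see what the two matrices created above mean, here is a minimal base R sketch on a hypothetical two-sector economy. All numbers are made up, and the code follows the textbook definitions, &lt;span class=&#34;math inline&#34;&gt;\(a_{ij} = X_{ij} / x_j\)&lt;/span&gt; and &lt;span class=&#34;math inline&#34;&gt;\((I - A)^{-1}\)&lt;/span&gt;; it is not meant to reproduce the internals of the iotables functions, which also handle labelling, ordering and zero values.&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;# Hypothetical two-sector economy, made-up values in millions of euros
sectors &amp;lt;- c(&amp;quot;agriculture&amp;quot;, &amp;quot;industry&amp;quot;)

# Inter-industry flows: row i supplies column j
X &amp;lt;- matrix(c(10, 20,
              15, 40),
            nrow = 2, byrow = TRUE,
            dimnames = list(sectors, sectors))

# Total output of each sector
x &amp;lt;- c(agriculture = 100, industry = 200)

# Input coefficients: divide every column j by the output of sector j
A &amp;lt;- sweep(X, 2, x, &amp;quot;/&amp;quot;)

# Leontief inverse: the inverse of (I - A)
L &amp;lt;- solve(diag(nrow(A)) - A)
L&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Each column of &lt;code&gt;L&lt;/code&gt; shows how much output every sector has to produce, directly and indirectly, to deliver one extra unit of final demand for the column’s product.&lt;/p&gt;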
&lt;p&gt;Now let’s try to figure out what happens when the government tries to stimulate the economy in three sectors: agriculture, car manufacturing, and R&amp;amp;D with 1 billion euros. Direct effects measure the initial, direct impact of the change in demand and supply for a product. When production goes up, it will create demand in all supply industries (backward linkages) and create opportunities in the industries that use the product themselves (forward linkages).&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb17&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb17-1&#34;&gt;&lt;a href=&#34;#cb17-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;direct_effects_create&lt;/span&gt;( primary_inputs_sk, I_sk ) &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb17-2&#34;&gt;&lt;a href=&#34;#cb17-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;select&lt;/span&gt; ( &lt;span class=&#34;fu&#34;&gt;all_of&lt;/span&gt;(&lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;iotables_row&amp;quot;&lt;/span&gt;, &lt;span class=&#34;st&#34;&gt;&amp;quot;agriculture&amp;quot;&lt;/span&gt;,&lt;/span&gt;
&lt;span id=&#34;cb17-3&#34;&gt;&lt;a href=&#34;#cb17-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;                    &lt;span class=&#34;st&#34;&gt;&amp;quot;motor_vechicles&amp;quot;&lt;/span&gt;, &lt;span class=&#34;st&#34;&gt;&amp;quot;research_development&amp;quot;&lt;/span&gt;))) &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb17-4&#34;&gt;&lt;a href=&#34;#cb17-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;filter&lt;/span&gt; (.data&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;iotables_row &lt;span class=&#34;sc&#34;&gt;%in%&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;gva_effect&amp;quot;&lt;/span&gt;, &lt;span class=&#34;st&#34;&gt;&amp;quot;wages_salaries_effect&amp;quot;&lt;/span&gt;, &lt;/span&gt;
&lt;span id=&#34;cb17-5&#34;&gt;&lt;a href=&#34;#cb17-5&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;                                    &lt;span class=&#34;st&#34;&gt;&amp;quot;imports_effect&amp;quot;&lt;/span&gt;, &lt;span class=&#34;st&#34;&gt;&amp;quot;output_effect&amp;quot;&lt;/span&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;##            iotables_row agriculture motor_vechicles research_development
## 1        imports_effect   1.3684350       2.3028203            0.9764921
## 2 wages_salaries_effect   0.2713804       0.3183523            0.3828014
## 3            gva_effect   0.9669621       0.9790771            0.9669467
## 4         output_effect   2.2876287       3.9840251            2.2579634&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Car manufacturing requires a large number of imported components, so increased demand for cars will also create growth in importing activities. Increase in R&amp;amp;D activity will mostly affect local wages because research is job-intensive. As we can see, the effect on imports, wages, gross value added (which will end up in the GDP) and output changes are very different in these three sectors.&lt;/p&gt;
&lt;p&gt;This is not the total effect, because some of the increased production will translate into income, which in turn will be used to create further demand in all parts of the domestic economy. The total effect is characterized by multipliers.&lt;/p&gt;
&lt;p&gt;The multipliers can be solved with the following function:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb19&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb19-1&#34;&gt;&lt;a href=&#34;#cb19-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;multipliers_sk &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;input_multipliers_create&lt;/span&gt;( &lt;/span&gt;
&lt;span id=&#34;cb19-2&#34;&gt;&lt;a href=&#34;#cb19-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  primary_inputs_sk &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb19-3&#34;&gt;&lt;a href=&#34;#cb19-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;    &lt;span class=&#34;fu&#34;&gt;filter&lt;/span&gt; (.data&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;iotables_row &lt;span class=&#34;sc&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;gva&amp;quot;&lt;/span&gt;), I_sk ) &lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And select a few industries:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb20&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb20-1&#34;&gt;&lt;a href=&#34;#cb20-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;set.seed&lt;/span&gt;(&lt;span class=&#34;dv&#34;&gt;12&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb20-2&#34;&gt;&lt;a href=&#34;#cb20-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;multipliers_sk &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt; &lt;/span&gt;
&lt;span id=&#34;cb20-3&#34;&gt;&lt;a href=&#34;#cb20-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  tidyr&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;pivot_longer&lt;/span&gt; ( &lt;span class=&#34;sc&#34;&gt;-&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;all_of&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;iotables_row&amp;quot;&lt;/span&gt;), &lt;/span&gt;
&lt;span id=&#34;cb20-4&#34;&gt;&lt;a href=&#34;#cb20-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;                        &lt;span class=&#34;at&#34;&gt;names_to =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;industry&amp;quot;&lt;/span&gt;, &lt;/span&gt;
&lt;span id=&#34;cb20-5&#34;&gt;&lt;a href=&#34;#cb20-5&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;                        &lt;span class=&#34;at&#34;&gt;values_to =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;GVA_multiplier&amp;quot;&lt;/span&gt;) &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb20-6&#34;&gt;&lt;a href=&#34;#cb20-6&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;select&lt;/span&gt; (&lt;span class=&#34;sc&#34;&gt;-&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;all_of&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;iotables_row&amp;quot;&lt;/span&gt;)) &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb20-7&#34;&gt;&lt;a href=&#34;#cb20-7&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;arrange&lt;/span&gt;( &lt;span class=&#34;sc&#34;&gt;-&lt;/span&gt;.data&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;GVA_multiplier) &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb20-8&#34;&gt;&lt;a href=&#34;#cb20-8&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  dplyr&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;sample_n&lt;/span&gt;(&lt;span class=&#34;dv&#34;&gt;8&lt;/span&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 8 x 2
##   industry               GVA_multiplier
##   &amp;lt;chr&amp;gt;                           &amp;lt;dbl&amp;gt;
## 1 motor_vechicles                  7.81
## 2 wood_products                    2.27
## 3 mineral_products                 2.83
## 4 human_health                     1.53
## 5 post_courier                     2.23
## 6 sewage                           1.82
## 7 basic_metals                     4.16
## 8 real_estate_services_b           1.48&lt;/code&gt;&lt;/pre&gt;
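&lt;p&gt;The relationship between effects and multipliers can be sketched on a hypothetical two-sector economy in base R. The coefficients below are made up, and the sketch assumes the common textbook convention in which the effect is the input coefficient vector multiplied by the Leontief inverse, and the multiplier is that effect divided by the direct coefficient. Whether iotables uses exactly this normalization internally is an assumption here, so treat this as an illustration rather than a re-implementation.&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;# Hypothetical input coefficient matrix for two sectors (made-up values)
sectors &amp;lt;- c(&amp;quot;agriculture&amp;quot;, &amp;quot;industry&amp;quot;)
A &amp;lt;- matrix(c(0.10, 0.10,
              0.15, 0.20),
            nrow = 2, byrow = TRUE,
            dimnames = list(sectors, sectors))

# Leontief inverse
L &amp;lt;- solve(diag(nrow(A)) - A)

# Hypothetical gross value added per unit of output in each sector
gva_coeff &amp;lt;- c(agriculture = 0.5, industry = 0.3)

# GVA generated in the whole economy by one unit of final demand
gva_effect &amp;lt;- drop(gva_coeff %*% L)

# Multiplier: total effect relative to the direct coefficient
gva_multiplier &amp;lt;- gva_effect / gva_coeff
gva_multiplier&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Under this convention a sector with a small direct coefficient but strong supplier links gets a large multiplier, which is one way to read the high value for motor vehicles in the table above.&lt;/p&gt;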
&lt;/div&gt;
&lt;div id=&#34;package-vignettes&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Package vignettes&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&#34;https://iotables.dataobservatory.eu/articles/germany_1990.html&#34;&gt;Germany 1990&lt;/a&gt; vignette provides an introduction to input-output economics and re-creates the examples of the &lt;a href=&#34;https://iotables.dataobservatory.eu/articles/germany_1990.html&#34;&gt;Eurostat Manual of Supply, Use and Input-Output Tables&lt;/a&gt;, by Jörg Beutel (Eurostat Manual).&lt;/p&gt;
&lt;p&gt;The &lt;a href=&#34;https://iotables.dataobservatory.eu/articles/united_kingdom_2010.html&#34;&gt;United Kingdom Input-Output Analytical Tables&lt;/a&gt; vignette (by Daniel Antal, based on the work edited by Richard Wild) is a use case on how to correctly import data from outside Eurostat (i.e. not with &lt;code&gt;eurostat::get_eurostat()&lt;/code&gt;) and join it properly to a SIOT. We also used this example to create unit tests of our functions from a published, official government statistical release.&lt;/p&gt;
&lt;p&gt;Finally, &lt;a href=&#34;https://iotables.dataobservatory.eu/articles/working_with_eurostat.html&#34;&gt;Working With Eurostat Data&lt;/a&gt; is a detailed use case that covers all the current functionality of the package by comparing two economies, Czechia and Slovakia, and guides you through many more examples than this short blog post.&lt;/p&gt;
&lt;p&gt;Our package was originally developed to calculate GVA and employment effects for the Slovak music industry, and similar calculations for the Hungarian film tax shelter. We can now programmatically create reproducible multipliers for all European economies in the &lt;a href=&#34;https://music.dataobservatory.eu/&#34;&gt;Digital Music Observatory&lt;/a&gt;, and create further indicators for economic policy making in the &lt;a href=&#34;https://economy.dataobservatory.eu/&#34;&gt;Economy Data Observatory&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;environmental-impact-analysis&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Environmental Impact Analysis&lt;/h2&gt;
&lt;p&gt;Our package allows the calculation of various economic policy scenarios, such as the effects of changing the VAT on meat or of re-opening music festivals on aggregate demand, GDP, tax revenues, or employment. But what about the &lt;span class=&#34;math inline&#34;&gt;\(CO_{2}\)&lt;/span&gt;, methane and other greenhouse gas effects of reopening festivals or increasing meat prices?&lt;/p&gt;
&lt;p&gt;Technically our package can already calculate such effects, but to do so, you have to carefully match further statistical vocabulary items used by the European Environment Agency for air pollutants and greenhouse gases.&lt;/p&gt;
&lt;p&gt;The latest released version of &lt;em&gt;iotables&lt;/em&gt; is 0.4.4, archived on Zenodo as “Importing and Manipulating Symmetric Input-Output Tables” (&lt;a href=&#34;https://zenodo.org/record/4897472&#34;&gt;https://doi.org/10.5281/zenodo.4897472&lt;/a&gt;), and we are already working on a new major release. In that release, we plan to build the necessary vocabulary into the metadata functions to extend the functionality of the package, and to create new indicators for our &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34;&gt;Green Deal Data Observatory&lt;/a&gt;. This experimental data observatory is creating new, high-quality statistical indicators from open governmental and open science data sources that have not seen daylight yet.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;ropengov-datathon&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;rOpenGov and the EU Datathon Challenges&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;http://ropengov.org/&#34;&gt;rOpenGov&lt;/a&gt; is a community of open governmental data and statistics developers with many packages that make programmatic access to, and work with, open data possible in the R language. &lt;a href=&#34;https://reprex.nl/&#34;&gt;Reprex&lt;/a&gt; is a Dutch startup that teamed up with rOpenGov and other open collaboration partners to create a technologically and financially feasible service to exploit reproducible research products for the wider business, scientific and evidence-based policy design community. Open data is a legal concept: it means that you have the right to reuse the data, but often the reuse requires significant programming and statistical know-how. We entered the annual &lt;a href=&#34;https://reprex.nl/project/eu-datathon_2021/&#34;&gt;EU Datathon&lt;/a&gt; competition in all three challenges with applications that provide not only open-source software, but also daily updated, validated, documented, high-quality statistical indicators as open data in an open database. Our &lt;a href=&#34;https://iotables.dataobservatory.eu/&#34;&gt;iotables&lt;/a&gt; package is one of our many open-source building blocks to make open data more accessible to all.&lt;/p&gt;
&lt;/div&gt;
</description> 
  </item>
  
<item>
  <title>Visualizing City of Helsinki procurements with geofi-package</title>
  <link>http://ropengov.org/2021/04/helsinki-ostodata/</link>
  <pubDate>Thu, 01 Apr 2021 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2021/04/helsinki-ostodata/</guid>
  <description>
&lt;script src=&#34;http://ropengov.org/2021/04/helsinki-ostodata/index.en_files/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;
&lt;style type=&#34;text/css&#34;&gt;
pre &gt; code.sourceCode { white-space: pre; position: relative; }
pre &gt; code.sourceCode &gt; span { display: inline-block; line-height: 1.25; }
pre &gt; code.sourceCode &gt; span:empty { height: 1.2em; }
code.sourceCode &gt; span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre &gt; code.sourceCode { white-space: pre-wrap; }
pre &gt; code.sourceCode &gt; span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
  { counter-reset: source-line 0; }
pre.numberSource code &gt; span
  { position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code &gt; span &gt; a:first-child::before
  { content: counter(source-line);
    position: relative; left: -1em; text-align: right; vertical-align: baseline;
    border: none; display: inline-block;
    -webkit-touch-callout: none; -webkit-user-select: none;
    -khtml-user-select: none; -moz-user-select: none;
    -ms-user-select: none; user-select: none;
    padding: 0 4px; width: 4em;
    color: #aaaaaa;
  }
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa;  padding-left: 4px; }
div.sourceCode
  {  background-color: #f8f8f8; }
@media screen {
pre &gt; code.sourceCode &gt; span &gt; a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ef2929; } /* Alert */
code span.an { color: #8f5902; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #c4a000; } /* Attribute */
code span.bn { color: #0000cf; } /* BaseN */
code span.cf { color: #204a87; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4e9a06; } /* Char */
code span.cn { color: #000000; } /* Constant */
code span.co { color: #8f5902; font-style: italic; } /* Comment */
code span.cv { color: #8f5902; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #8f5902; font-weight: bold; font-style: italic; } /* Documentation */
code span.dt { color: #204a87; } /* DataType */
code span.dv { color: #0000cf; } /* DecVal */
code span.er { color: #a40000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #0000cf; } /* Float */
code span.fu { color: #000000; } /* Function */
code span.im { } /* Import */
code span.in { color: #8f5902; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #204a87; font-weight: bold; } /* Keyword */
code span.op { color: #ce5c00; font-weight: bold; } /* Operator */
code span.ot { color: #8f5902; } /* Other */
code span.pp { color: #8f5902; font-style: italic; } /* Preprocessor */
code span.sc { color: #000000; } /* SpecialChar */
code span.ss { color: #4e9a06; } /* SpecialString */
code span.st { color: #4e9a06; } /* String */
code span.va { color: #000000; } /* Variable */
code span.vs { color: #4e9a06; } /* VerbatimString */
code span.wa { color: #8f5902; font-weight: bold; font-style: italic; } /* Warning */
&lt;/style&gt;


&lt;p&gt;City of Helsinki public procurements have been &lt;a href=&#34;https://hri.fi/data/dataset//helsingin-kaupungin-ostot&#34;&gt;available as open data since 2014&lt;/a&gt;. High quality data like this is obviously of great interest to many, and several interesting applications and visualizations have been made available.&lt;/p&gt;
&lt;p&gt;With some additional &lt;a href=&#34;http://avoindata.prh.fi&#34;&gt;open data from the Finnish Patent and Registration Office&lt;/a&gt; and data wrangling, the location of the city’s supplier companies could be made visible. The &lt;a href=&#34;https://CRAN.R-project.org/package=geofi&#34;&gt;geofi package&lt;/a&gt;, just released on CRAN, provides excellent tools for this sort of task, alongside the dplyr package’s lightning-fast join and mutate operations.&lt;/p&gt;
&lt;p&gt;The Patent and Registration Office data could be accessed by making API calls with the unique Business ID (“Y-tunnus”) of each company. A limitation was that information was available only for limited companies, cooperatives and similar entities, leaving out public institutions, third sector (independent sector) actors and sole proprietor type enterprises.&lt;/p&gt;
&lt;p&gt;Hadley Wickham’s httr-package vignette &lt;a href=&#34;https://cran.r-project.org/web/packages/httr/vignettes/api-packages.html&#34;&gt;Best practices for API packages&lt;/a&gt; provided a good starting point for building our own custom function “prh_api”, which made it possible to access company information with relative ease. In practice the task was not all smooth sailing, as the API had a limit of 300 calls per minute, to be shared between all API users. Downloading information took approximately 4 seconds for one company (15 calls per minute), which added up to roughly 33 hours for a dataset with over 30,000 unique Business IDs.&lt;/p&gt;
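&lt;p&gt;A shared rate limit is easier to respect when the pause between calls is built into the download loop. The base R sketch below is a hypothetical helper, not the actual “prh_api” function: &lt;code&gt;fetch_throttled()&lt;/code&gt;, its arguments and the one-second default are assumptions made for illustration, and &lt;code&gt;fake_fetch()&lt;/code&gt; merely stands in for a real API call.&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;# Hypothetical helper: call `fetch` once per unique ID and make each
# iteration (call plus pause) last at least `min_interval` seconds
fetch_throttled &amp;lt;- function(ids, fetch, min_interval = 1) {
  ids &amp;lt;- unique(ids)                      # never download the same ID twice
  results &amp;lt;- vector(&amp;quot;list&amp;quot;, length(ids))
  names(results) &amp;lt;- ids
  for (i in seq_along(ids)) {
    started &amp;lt;- Sys.time()
    # A failed call is stored as NULL instead of aborting the whole run
    results[i] &amp;lt;- list(tryCatch(fetch(ids[[i]]), error = function(e) NULL))
    elapsed &amp;lt;- as.numeric(difftime(Sys.time(), started, units = &amp;quot;secs&amp;quot;))
    Sys.sleep(max(0, min_interval - elapsed))
  }
  results
}

# Stand-in for a real API call; returns a small list for each ID
fake_fetch &amp;lt;- function(id) list(business_id = id)

# The duplicated ID is fetched only once
companies &amp;lt;- fetch_throttled(c(&amp;quot;0494571-4&amp;quot;, &amp;quot;0494571-4&amp;quot;),
                             fake_fetch, min_interval = 0)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Storing failures as &lt;code&gt;NULL&lt;/code&gt; rather than stopping matters for a job that runs for many hours, because the failed IDs can be retried later without repeating the successful calls.&lt;/p&gt;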
&lt;div id=&#34;downloading-and-processing-the-data&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Downloading and processing the data&lt;/h3&gt;
&lt;p&gt;Rows with invalid Business IDs (BIDs) can be removed with the hetu package’s bid_ctrl function:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb1&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb1-1&#34;&gt;&lt;a href=&#34;#cb1-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;library&lt;/span&gt;(hetu)&lt;/span&gt;
&lt;span id=&#34;cb1-2&#34;&gt;&lt;a href=&#34;#cb1-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;library&lt;/span&gt;(dplyr)&lt;/span&gt;
&lt;span id=&#34;cb1-3&#34;&gt;&lt;a href=&#34;#cb1-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;helsingin_ostot &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;read.csv&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;http://openspending.hel.ninja/files/ostot/helsingin-ostot-all.csv&amp;quot;&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb1-4&#34;&gt;&lt;a href=&#34;#cb1-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;helsingin_ostot&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;valid_ytunnus &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;bid_ctrl&lt;/span&gt;(helsingin_ostot&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;toimittaja_ytunnus)&lt;/span&gt;
&lt;span id=&#34;cb1-5&#34;&gt;&lt;a href=&#34;#cb1-5&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;helsingin_ostot2 &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; helsingin_ostot[&lt;span class=&#34;fu&#34;&gt;which&lt;/span&gt;(helsingin_ostot&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;valid_ytunnus),]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;At this point I produced a vector of unique Business IDs from the dataset, so that the same information would not be downloaded more than once, and used this vector to download data from the Patent and Registration Office API. However, as the process is so time-consuming, I will not reproduce it here. Below is an example with just one Business ID, “0494571-4”:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb2&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb2-1&#34;&gt;&lt;a href=&#34;#cb2-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# A vector containing unique Business IDs&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb2-2&#34;&gt;&lt;a href=&#34;#cb2-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;unique_ytunnus &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;unique&lt;/span&gt;(helsingin_ostot2&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;toimittaja_ytunnus)&lt;/span&gt;
&lt;span id=&#34;cb2-3&#34;&gt;&lt;a href=&#34;#cb2-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb2-4&#34;&gt;&lt;a href=&#34;#cb2-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Example: Getting information for one Business ID and filtering the data&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb2-5&#34;&gt;&lt;a href=&#34;#cb2-5&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb2-6&#34;&gt;&lt;a href=&#34;#cb2-6&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;yrityksen_tiedot &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; jsonlite&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;fromJSON&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;http://avoindata.prh.fi/tr/v1/0494571-4&amp;quot;&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;simplifyVector =&lt;/span&gt; &lt;span class=&#34;cn&#34;&gt;TRUE&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb2-7&#34;&gt;&lt;a href=&#34;#cb2-7&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;poimitut_tiedot &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;cn&#34;&gt;NULL&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb2-8&#34;&gt;&lt;a href=&#34;#cb2-8&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;poimitut_tiedot&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;businessId &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; yrityksen_tiedot&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;results&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;businessId&lt;/span&gt;
&lt;span id=&#34;cb2-9&#34;&gt;&lt;a href=&#34;#cb2-9&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;poimitut_tiedot&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;street &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; yrityksen_tiedot&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;results&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;addresses[[&lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;]]&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;street[&lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;]&lt;/span&gt;
&lt;span id=&#34;cb2-10&#34;&gt;&lt;a href=&#34;#cb2-10&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;poimitut_tiedot&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;city &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; yrityksen_tiedot&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;results&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;addresses[[&lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;]]&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;city[&lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;]&lt;/span&gt;
&lt;span id=&#34;cb2-11&#34;&gt;&lt;a href=&#34;#cb2-11&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;poimitut_tiedot&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;postCode &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; yrityksen_tiedot&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;results&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;addresses[[&lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;]]&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;postCode[&lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;]&lt;/span&gt;
&lt;span id=&#34;cb2-12&#34;&gt;&lt;a href=&#34;#cb2-12&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb2-13&#34;&gt;&lt;a href=&#34;#cb2-13&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;poimitut_tiedot &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;as.data.frame&lt;/span&gt;(poimitut_tiedot)&lt;/span&gt;
&lt;span id=&#34;cb2-14&#34;&gt;&lt;a href=&#34;#cb2-14&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb2-15&#34;&gt;&lt;a href=&#34;#cb2-15&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;helsingin_ostot3 &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;left_join&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;x =&lt;/span&gt; helsingin_ostot2, &lt;span class=&#34;at&#34;&gt;y =&lt;/span&gt; poimitut_tiedot, &lt;span class=&#34;at&#34;&gt;by =&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;toimittaja_ytunnus&amp;quot;&lt;/span&gt; &lt;span class=&#34;ot&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;businessId&amp;quot;&lt;/span&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With more than one Business ID, the code above can be wrapped in a function and applied to each ID with lapply.&lt;/p&gt;
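&lt;p&gt;A minimal sketch of that idea, under some assumptions: the helper names &lt;code&gt;poimi&lt;/code&gt; and &lt;code&gt;hae_yritys&lt;/code&gt; and the arguments &lt;code&gt;hae&lt;/code&gt; and &lt;code&gt;tauko&lt;/code&gt; are hypothetical and not part of the original analysis, and the API response is assumed to have the same structure as in the single-ID example. A failed call (such as error 404) yields a row of missing values instead of stopping the whole run:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;# Hypothetical helper: first element of x as text, or NA if x is empty
poimi &amp;lt;- function(x) if (length(x) == 0) NA_character_ else as.character(x[[1]])

# Hypothetical wrapper around the single-ID example.
# hae:   function that downloads and parses the JSON (assumed default: jsonlite::fromJSON)
# tauko: pause in seconds before each call, to stay within the shared rate limit
hae_yritys &amp;lt;- function(ytunnus, hae = jsonlite::fromJSON, tauko = 4) {
  Sys.sleep(tauko)
  url &amp;lt;- paste0(&amp;quot;http://avoindata.prh.fi/tr/v1/&amp;quot;, ytunnus)
  # A failed call (e.g. error 404) becomes NULL instead of an error
  tiedot &amp;lt;- tryCatch(hae(url), error = function(e) NULL)
  osoitteet &amp;lt;- tiedot$results$addresses
  osoite &amp;lt;- if (length(osoitteet) == 0) NULL else osoitteet[[1]]
  # One row per Business ID; address fields are NA when nothing was returned
  data.frame(
    businessId = ytunnus,
    street = poimi(osoite$street),
    city = poimi(osoite$city),
    postCode = poimi(osoite$postCode),
    stringsAsFactors = FALSE
  )
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With the vector of unique Business IDs, this could be applied as &lt;code&gt;dplyr::bind_rows(lapply(unique_ytunnus, hae_yritys))&lt;/code&gt;, which returns one data frame with a row per company.&lt;/p&gt;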
&lt;p&gt;If the company information could be downloaded from the API, it most likely contained the zip code, address and city of the company. If these were missing, the most likely reason was that the API call returned error 404. The plot below shows the number of rows with and without a zip code for each year:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb3&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb3-1&#34;&gt;&lt;a href=&#34;#cb3-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Load a prepared dataset to which the operations above have been applied&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb3-2&#34;&gt;&lt;a href=&#34;#cb3-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;load&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;~/helsingin_ostot3.RData&amp;quot;&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb3-3&#34;&gt;&lt;a href=&#34;#cb3-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;library&lt;/span&gt;(ggplot2)&lt;/span&gt;
&lt;span id=&#34;cb3-4&#34;&gt;&lt;a href=&#34;#cb3-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;ggplot&lt;/span&gt;(helsingin_ostot3, &lt;span class=&#34;fu&#34;&gt;aes&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;fill=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;is.na&lt;/span&gt;(postCode), &lt;span class=&#34;at&#34;&gt;x=&lt;/span&gt;year)) &lt;span class=&#34;sc&#34;&gt;+&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb3-5&#34;&gt;&lt;a href=&#34;#cb3-5&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;geom_bar&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;position=&lt;/span&gt;&lt;span class=&#34;st&#34;&gt;&amp;quot;stack&amp;quot;&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;stat=&lt;/span&gt;&lt;span class=&#34;st&#34;&gt;&amp;quot;count&amp;quot;&lt;/span&gt;) &lt;span class=&#34;sc&#34;&gt;+&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb3-6&#34;&gt;&lt;a href=&#34;#cb3-6&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;labs&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;x =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;Year&amp;quot;&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;y =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;Rows&amp;quot;&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;fill =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;Missing &lt;/span&gt;&lt;span class=&#34;sc&#34;&gt;\n&lt;/span&gt;&lt;span class=&#34;st&#34;&gt;zip code&amp;quot;&lt;/span&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2021/04/helsinki-ostodata/index.en_files/figure-html/puuttuvat_visualisointi-1.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The closer we are to present day, the smaller the proportion of missing data becomes.&lt;/p&gt;
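&lt;p&gt;The same trend could also be checked numerically by computing the share of rows without a zip code for each year. The sketch below uses a small toy data frame with invented values in place of helsingin_ostot3; the names &lt;code&gt;esimerkki&lt;/code&gt;, &lt;code&gt;puuttuvat&lt;/code&gt; and &lt;code&gt;osuus_puuttuu&lt;/code&gt; are hypothetical:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;library(dplyr)

# Toy stand-in for helsingin_ostot3; the values are invented for illustration
esimerkki &amp;lt;- data.frame(
  year = c(2012, 2012, 2019, 2019),
  postCode = c(NA, &amp;quot;00100&amp;quot;, &amp;quot;00100&amp;quot;, &amp;quot;70100&amp;quot;)
)

# Share of rows with a missing zip code, per year
puuttuvat &amp;lt;- esimerkki %&amp;gt;%
  group_by(year) %&amp;gt;%
  summarise(osuus_puuttuu = mean(is.na(postCode)))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;In this toy data the share is 0.5 for 2012 and 0 for 2019. Run against the real dataset, the same pipeline would give the yearly proportions that the bar chart shows visually.&lt;/p&gt;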
&lt;/div&gt;
&lt;div id=&#34;top-20-municipalities-with-most-procurements&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Top-20 municipalities with most procurements&lt;/h3&gt;
&lt;p&gt;A company’s zip code is a good starting point for determining where purchased services, items and materials come from. The data could be visualized by zip code area, but that would produce a hard-to-read map with too much detail. A municipality-level visualization is adequate for our purposes.&lt;/p&gt;
&lt;p&gt;While zip code areas and municipality borders do not always align perfectly, each zip code area can be assigned to the municipality that contains the majority of its buildings &lt;a href=&#34;https://www.tilastokeskus.fi/tup/karttaaineistot/postinumeroalueet.html&#34;&gt;(Tilastokeskus 2020)&lt;/a&gt;. Keen readers may have noticed that the data from the API already had city and even street-level information, but as city names can be in Finnish or in Swedish, it is simpler to look up municipality names by using the unambiguous zip code.&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb4&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb4-1&#34;&gt;&lt;a href=&#34;#cb4-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;library&lt;/span&gt;(geofi)&lt;/span&gt;
&lt;span id=&#34;cb4-2&#34;&gt;&lt;a href=&#34;#cb4-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;library&lt;/span&gt;(dplyr)&lt;/span&gt;
&lt;span id=&#34;cb4-3&#34;&gt;&lt;a href=&#34;#cb4-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;zipcodes &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; geofi&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;get_zipcodes&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;year =&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;2021&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb4-4&#34;&gt;&lt;a href=&#34;#cb4-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb4-5&#34;&gt;&lt;a href=&#34;#cb4-5&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Transform sf-object to a regular data frame&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb4-6&#34;&gt;&lt;a href=&#34;#cb4-6&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;zipcodes &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;as.data.frame&lt;/span&gt;(zipcodes) &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt; &lt;/span&gt;
&lt;span id=&#34;cb4-7&#34;&gt;&lt;a href=&#34;#cb4-7&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;select&lt;/span&gt;(kuntanro, posti_alue)&lt;/span&gt;
&lt;span id=&#34;cb4-8&#34;&gt;&lt;a href=&#34;#cb4-8&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;helsingin_ostot4 &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; dplyr&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;left_join&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;x =&lt;/span&gt; helsingin_ostot3, &lt;span class=&#34;at&#34;&gt;y =&lt;/span&gt; zipcodes, &lt;span class=&#34;at&#34;&gt;by=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;postCode&amp;quot;&lt;/span&gt; &lt;span class=&#34;ot&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;posti_alue&amp;quot;&lt;/span&gt;))&lt;/span&gt;
&lt;span id=&#34;cb4-9&#34;&gt;&lt;a href=&#34;#cb4-9&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb4-10&#34;&gt;&lt;a href=&#34;#cb4-10&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;municipalities &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; geofi&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;get_municipalities&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;year =&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;2021&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb4-11&#34;&gt;&lt;a href=&#34;#cb4-11&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;municipalities &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; municipalities &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt; &lt;/span&gt;
&lt;span id=&#34;cb4-12&#34;&gt;&lt;a href=&#34;#cb4-12&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;select&lt;/span&gt;(kunta, kunta_name)&lt;/span&gt;
&lt;span id=&#34;cb4-13&#34;&gt;&lt;a href=&#34;#cb4-13&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb4-14&#34;&gt;&lt;a href=&#34;#cb4-14&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;helsingin_ostot4 &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; dplyr&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;right_join&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;x =&lt;/span&gt; municipalities, &lt;span class=&#34;at&#34;&gt;y =&lt;/span&gt; helsingin_ostot4, &lt;span class=&#34;at&#34;&gt;by=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;kunta&amp;quot;&lt;/span&gt; &lt;span class=&#34;ot&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;kuntanro&amp;quot;&lt;/span&gt;))&lt;/span&gt;
&lt;span id=&#34;cb4-15&#34;&gt;&lt;a href=&#34;#cb4-15&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb4-16&#34;&gt;&lt;a href=&#34;#cb4-16&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Group procurements by municipality&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb4-17&#34;&gt;&lt;a href=&#34;#cb4-17&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;helsingin_ostot5 &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; helsingin_ostot4 &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt; &lt;/span&gt;
&lt;span id=&#34;cb4-18&#34;&gt;&lt;a href=&#34;#cb4-18&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;group_by&lt;/span&gt;(kunta_name) &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt; &lt;/span&gt;
&lt;span id=&#34;cb4-19&#34;&gt;&lt;a href=&#34;#cb4-19&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;summarise&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;kunta_summa =&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;sum&lt;/span&gt;(&lt;span class=&#34;fu&#34;&gt;as.numeric&lt;/span&gt;(summa), &lt;span class=&#34;at&#34;&gt;na.rm =&lt;/span&gt; &lt;span class=&#34;cn&#34;&gt;FALSE&lt;/span&gt;)) &lt;/span&gt;
&lt;span id=&#34;cb4-20&#34;&gt;&lt;a href=&#34;#cb4-20&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb4-21&#34;&gt;&lt;a href=&#34;#cb4-21&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Print top 20 municipalities&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb4-22&#34;&gt;&lt;a href=&#34;#cb4-22&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;slice_max&lt;/span&gt;(helsingin_ostot5, &lt;span class=&#34;at&#34;&gt;order_by =&lt;/span&gt; kunta_summa, &lt;span class=&#34;at&#34;&gt;n =&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;20&lt;/span&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre class=&#34;img-fluid&#34;&gt;&lt;code&gt;## Simple feature collection with 20 features and 2 fields (with 1 geometry empty)
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 215353.4 ymin: 6640920 xmax: 588843.4 ymax: 7298654
## projected CRS:  ETRS89 / TM35FIN(E,N)
## # A tibble: 20 x 3
##    kunta_name  kunta_summa                                                  geom
##    &amp;lt;chr&amp;gt;             &amp;lt;dbl&amp;gt;                                    &amp;lt;MULTIPOLYGON [m]&amp;gt;
##  1 &amp;lt;NA&amp;gt;       12164973424.                                                 EMPTY
##  2 Helsinki    5214226687. (((402737.7 6680700, 402069.8 6680535, 400326.8 6678…
##  3 Espoo        741586021. (((375773.7 6691597, 377355.9 6680366, 379983.8 6681…
##  4 Vantaa       629136624. (((392811.8 6694857, 399192.7 6692524, 396012.8 6689…
##  5 Kuopio       470253261. (((581015.6 7009317, 585462.3 7007121, 588843.4 7002…
##  6 Kouvola      149758496. (((511075 6780902, 512323 6772856, 508132 6769735, 5…
##  7 Tuusula      129772884. (((397271.6 6711736, 391885.8 6710060, 392411.8 6702…
##  8 Vaasa         90367794. (((259700 7001591, 253892.6 6990776, 242914.5 700120…
##  9 Turku         61425478. (((251038.1 6731422, 245370.1 6713651, 244865.9 6708…
## 10 Tampere       60124288. (((346626.3 6854536, 347270.1 6836030, 334112.1 6814…
## 11 Hyvinkää      51569708. (((396836.4 6726577, 392888.2 6717250, 385516.6 6715…
## 12 Kerava        49833397. (((399192.7 6692524, 392811.8 6694857, 392727.4 6700…
## 13 Raasepori     44527631. (((299881.1 6640940, 297472.1 6640920, 296412.5 6642…
## 14 Oulu          42513844. (((418101.2 7220618, 417351.5 7219858, 415092.4 7219…
## 15 Lahti         42463258. (((448838.6 6774406, 453144.3 6766188, 452204.9 6761…
## 16 Nurmijärvi    22549476. (((385516.6 6715109, 388809.1 6711136, 382590.6 6697…
## 17 Kemi          21518412. (((396561 7287772, 392601.9 7283067, 392361.4 728345…
## 18 Raisio        20195233. (((239031.6 6717088, 236093.7 6712495, 230992.2 6711…
## 19 Porvoo        18877028. (((441843.9 6673817, 440194.5 6673207, 436276.1 6673…
## 20 Padasjoki     18830945. (((422133.8 6800321, 420573.8 6797563, 415159 679860…&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As expected, the largest sums for procurements were from Helsinki itself and the neighbouring cities of Espoo and Vantaa. Somewhat surprisingly, Kuopio wedges ahead of Kouvola and Tuusula, which are located geographically closer to the Helsinki metropolitan area.&lt;/p&gt;
&lt;p&gt;However, the largest amount is credited to the NA group, with about 12 billion euros over 8 years. This probably consists mostly of procurements from third sector entities and public sector organizations, for which the API had no information, highlighting their large role in the Finnish economy.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;choropleth-and-flow-map-of-the-top-20-municipalities&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Choropleth and flow map of the top 20 municipalities&lt;/h3&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb6&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb6-1&#34;&gt;&lt;a href=&#34;#cb6-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;library&lt;/span&gt;(sf)&lt;/span&gt;
&lt;span id=&#34;cb6-2&#34;&gt;&lt;a href=&#34;#cb6-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;library&lt;/span&gt;(dplyr)&lt;/span&gt;
&lt;span id=&#34;cb6-3&#34;&gt;&lt;a href=&#34;#cb6-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb6-4&#34;&gt;&lt;a href=&#34;#cb6-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Remove NA group&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb6-5&#34;&gt;&lt;a href=&#34;#cb6-5&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;helsingin_ostot6 &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; helsingin_ostot5 &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt; &lt;/span&gt;
&lt;span id=&#34;cb6-6&#34;&gt;&lt;a href=&#34;#cb6-6&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;filter&lt;/span&gt;(&lt;span class=&#34;sc&#34;&gt;!&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;is.na&lt;/span&gt;(kunta_name))&lt;/span&gt;
&lt;span id=&#34;cb6-7&#34;&gt;&lt;a href=&#34;#cb6-7&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb6-8&#34;&gt;&lt;a href=&#34;#cb6-8&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;kunnat_top20_summat &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;slice_max&lt;/span&gt;(helsingin_ostot6, &lt;span class=&#34;at&#34;&gt;order_by =&lt;/span&gt; kunta_summa, &lt;span class=&#34;at&#34;&gt;n =&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;20&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb6-9&#34;&gt;&lt;a href=&#34;#cb6-9&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb6-10&#34;&gt;&lt;a href=&#34;#cb6-10&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Highlighting the top 20 with red borders&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb6-11&#34;&gt;&lt;a href=&#34;#cb6-11&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;ggplot&lt;/span&gt;() &lt;span class=&#34;sc&#34;&gt;+&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb6-12&#34;&gt;&lt;a href=&#34;#cb6-12&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;geom_sf&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;data =&lt;/span&gt; helsingin_ostot6, &lt;span class=&#34;fu&#34;&gt;aes&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;fill =&lt;/span&gt; kunta_summa), &lt;span class=&#34;at&#34;&gt;color =&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;alpha&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;white&amp;quot;&lt;/span&gt;, &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;sc&#34;&gt;/&lt;/span&gt;&lt;span class=&#34;dv&#34;&gt;3&lt;/span&gt;)) &lt;span class=&#34;sc&#34;&gt;+&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb6-13&#34;&gt;&lt;a href=&#34;#cb6-13&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;labs&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;fill =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;Helsingin ostot, €&amp;quot;&lt;/span&gt;) &lt;span class=&#34;sc&#34;&gt;+&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb6-14&#34;&gt;&lt;a href=&#34;#cb6-14&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;scale_fill_gradient2&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;n.breaks =&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;6&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;trans =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;log10&amp;quot;&lt;/span&gt;) &lt;span class=&#34;sc&#34;&gt;+&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb6-15&#34;&gt;&lt;a href=&#34;#cb6-15&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;geom_sf&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;data =&lt;/span&gt; kunnat_top20_summat, &lt;span class=&#34;at&#34;&gt;col=&lt;/span&gt;&lt;span class=&#34;st&#34;&gt;&amp;quot;red&amp;quot;&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;size=&lt;/span&gt;&lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2021/04/helsinki-ostodata/index.en_files/figure-html/top20_kunnat-1.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The geofi package includes municipality central localities as POINT geometries. With a small modification these can be turned into LINESTRINGs that start at the municipality and end in Helsinki, and can be thought of as flow markers. The example below is very rudimentary, but &lt;a href=&#34;https://jcheshire.com/visualisation/mapping-flows/&#34;&gt;at their best flow maps can be very beautiful&lt;/a&gt; and convey information in fresh and elegant ways.&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb7&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb7-1&#34;&gt;&lt;a href=&#34;#cb7-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; geofi&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;municipality_central_localities&lt;/span&gt;
&lt;span id=&#34;cb7-2&#34;&gt;&lt;a href=&#34;#cb7-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-3&#34;&gt;&lt;a href=&#34;#cb7-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Turn ALL CAPS municipality names to Capital Case with the capwords function from the examples in base R&#39;s ?toupper help page&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-4&#34;&gt;&lt;a href=&#34;#cb7-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;capwords &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;cf&#34;&gt;function&lt;/span&gt;(s, &lt;span class=&#34;at&#34;&gt;strict =&lt;/span&gt; &lt;span class=&#34;cn&#34;&gt;FALSE&lt;/span&gt;) {&lt;/span&gt;
&lt;span id=&#34;cb7-5&#34;&gt;&lt;a href=&#34;#cb7-5&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;    cap &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;cf&#34;&gt;function&lt;/span&gt;(s) &lt;span class=&#34;fu&#34;&gt;paste&lt;/span&gt;(&lt;span class=&#34;fu&#34;&gt;toupper&lt;/span&gt;(&lt;span class=&#34;fu&#34;&gt;substring&lt;/span&gt;(s, &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;, &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;)),&lt;/span&gt;
&lt;span id=&#34;cb7-6&#34;&gt;&lt;a href=&#34;#cb7-6&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;                  {s &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;substring&lt;/span&gt;(s, &lt;span class=&#34;dv&#34;&gt;2&lt;/span&gt;); &lt;span class=&#34;cf&#34;&gt;if&lt;/span&gt;(strict) &lt;span class=&#34;fu&#34;&gt;tolower&lt;/span&gt;(s) &lt;span class=&#34;cf&#34;&gt;else&lt;/span&gt; s},&lt;/span&gt;
&lt;span id=&#34;cb7-7&#34;&gt;&lt;a href=&#34;#cb7-7&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;                             &lt;span class=&#34;at&#34;&gt;sep =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;&amp;quot;&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;collapse =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot; &amp;quot;&lt;/span&gt; )&lt;/span&gt;
&lt;span id=&#34;cb7-8&#34;&gt;&lt;a href=&#34;#cb7-8&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;    &lt;span class=&#34;fu&#34;&gt;sapply&lt;/span&gt;(&lt;span class=&#34;fu&#34;&gt;strsplit&lt;/span&gt;(s, &lt;span class=&#34;at&#34;&gt;split =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot; &amp;quot;&lt;/span&gt;), cap, &lt;span class=&#34;at&#34;&gt;USE.NAMES =&lt;/span&gt; &lt;span class=&#34;sc&#34;&gt;!&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;is.null&lt;/span&gt;(&lt;span class=&#34;fu&#34;&gt;names&lt;/span&gt;(s)))&lt;/span&gt;
&lt;span id=&#34;cb7-9&#34;&gt;&lt;a href=&#34;#cb7-9&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;}&lt;/span&gt;
&lt;span id=&#34;cb7-10&#34;&gt;&lt;a href=&#34;#cb7-10&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-11&#34;&gt;&lt;a href=&#34;#cb7-11&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;teksti &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;capwords&lt;/span&gt;(keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;teksti, &lt;span class=&#34;at&#34;&gt;strict =&lt;/span&gt; &lt;span class=&#34;cn&#34;&gt;TRUE&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb7-12&#34;&gt;&lt;a href=&#34;#cb7-12&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-13&#34;&gt;&lt;a href=&#34;#cb7-13&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;left_join&lt;/span&gt;(keskukset, &lt;span class=&#34;fu&#34;&gt;as.data.frame&lt;/span&gt;(helsingin_ostot6)[,&lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;sc&#34;&gt;:&lt;/span&gt;&lt;span class=&#34;dv&#34;&gt;2&lt;/span&gt;], &lt;span class=&#34;at&#34;&gt;by =&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;teksti&amp;quot;&lt;/span&gt; &lt;span class=&#34;ot&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;kunta_name&amp;quot;&lt;/span&gt;))&lt;/span&gt;
&lt;span id=&#34;cb7-14&#34;&gt;&lt;a href=&#34;#cb7-14&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-15&#34;&gt;&lt;a href=&#34;#cb7-15&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Calculate the distance between each municipality and Helsinki for later use (converted from metres to kilometres below)&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-16&#34;&gt;&lt;a href=&#34;#cb7-16&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;distance_to_hel &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;cn&#34;&gt;NULL&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-17&#34;&gt;&lt;a href=&#34;#cb7-17&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;distance_to_hel &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;st_distance&lt;/span&gt;(keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;geom, &lt;span class=&#34;at&#34;&gt;y=&lt;/span&gt;keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;geom[&lt;span class=&#34;dv&#34;&gt;210&lt;/span&gt;,])&lt;/span&gt;
&lt;span id=&#34;cb7-18&#34;&gt;&lt;a href=&#34;#cb7-18&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;distance_to_hel &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;as.integer&lt;/span&gt;(keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;distance_to_hel &lt;span class=&#34;sc&#34;&gt;/&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;1000&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb7-19&#34;&gt;&lt;a href=&#34;#cb7-19&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-20&#34;&gt;&lt;a href=&#34;#cb7-20&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Make linestrings&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-21&#34;&gt;&lt;a href=&#34;#cb7-21&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset_linestring &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;st_cast&lt;/span&gt;(&lt;span class=&#34;fu&#34;&gt;st_union&lt;/span&gt;(keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;geom[&lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;,], keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;geom[&lt;span class=&#34;dv&#34;&gt;210&lt;/span&gt;,], &lt;span class=&#34;at&#34;&gt;by_feature=&lt;/span&gt;&lt;span class=&#34;cn&#34;&gt;TRUE&lt;/span&gt;),&lt;span class=&#34;st&#34;&gt;&amp;quot;LINESTRING&amp;quot;&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb7-22&#34;&gt;&lt;a href=&#34;#cb7-22&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;cf&#34;&gt;for&lt;/span&gt; (i &lt;span class=&#34;cf&#34;&gt;in&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;sc&#34;&gt;:&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;nrow&lt;/span&gt;(keskukset)) {&lt;/span&gt;
&lt;span id=&#34;cb7-23&#34;&gt;&lt;a href=&#34;#cb7-23&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  keskukset_linestring[i] &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;st_cast&lt;/span&gt;(&lt;span class=&#34;fu&#34;&gt;st_union&lt;/span&gt;(keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;geom[i,], keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;geom[&lt;span class=&#34;dv&#34;&gt;210&lt;/span&gt;,], &lt;span class=&#34;at&#34;&gt;by_feature=&lt;/span&gt;&lt;span class=&#34;cn&#34;&gt;TRUE&lt;/span&gt;),&lt;span class=&#34;st&#34;&gt;&amp;quot;LINESTRING&amp;quot;&lt;/span&gt;) &lt;/span&gt;
&lt;span id=&#34;cb7-24&#34;&gt;&lt;a href=&#34;#cb7-24&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;}&lt;/span&gt;
&lt;span id=&#34;cb7-25&#34;&gt;&lt;a href=&#34;#cb7-25&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-26&#34;&gt;&lt;a href=&#34;#cb7-26&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset_helsinkiin &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; keskukset&lt;/span&gt;
&lt;span id=&#34;cb7-27&#34;&gt;&lt;a href=&#34;#cb7-27&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-28&#34;&gt;&lt;a href=&#34;#cb7-28&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset_helsinkiin&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;geom &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; keskukset_linestring&lt;/span&gt;
&lt;span id=&#34;cb7-29&#34;&gt;&lt;a href=&#34;#cb7-29&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-30&#34;&gt;&lt;a href=&#34;#cb7-30&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset_helsinkiin &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; keskukset_helsinkiin[&lt;span class=&#34;fu&#34;&gt;which&lt;/span&gt;(keskukset_helsinkiin&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;teksti &lt;span class=&#34;sc&#34;&gt;%in%&lt;/span&gt; kunnat_top20_summat&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;kunta_name),]&lt;/span&gt;
&lt;span id=&#34;cb7-31&#34;&gt;&lt;a href=&#34;#cb7-31&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-32&#34;&gt;&lt;a href=&#34;#cb7-32&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Line thickness: 0 for Helsinki, Espoo and Vantaa, and then 4,3,2,2,1...&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-33&#34;&gt;&lt;a href=&#34;#cb7-33&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;ggplot&lt;/span&gt;() &lt;span class=&#34;sc&#34;&gt;+&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-34&#34;&gt;&lt;a href=&#34;#cb7-34&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;geom_sf&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;data =&lt;/span&gt; helsingin_ostot6, &lt;span class=&#34;fu&#34;&gt;aes&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;fill =&lt;/span&gt; kunta_summa), &lt;span class=&#34;at&#34;&gt;color =&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;alpha&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;white&amp;quot;&lt;/span&gt;, &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;sc&#34;&gt;/&lt;/span&gt;&lt;span class=&#34;dv&#34;&gt;3&lt;/span&gt;)) &lt;span class=&#34;sc&#34;&gt;+&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-35&#34;&gt;&lt;a href=&#34;#cb7-35&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;labs&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;fill =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;Helsingin ostot, €&amp;quot;&lt;/span&gt;) &lt;span class=&#34;sc&#34;&gt;+&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-36&#34;&gt;&lt;a href=&#34;#cb7-36&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;scale_fill_gradient2&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;n.breaks =&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;6&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;trans =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;log10&amp;quot;&lt;/span&gt;) &lt;span class=&#34;sc&#34;&gt;+&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb7-37&#34;&gt;&lt;a href=&#34;#cb7-37&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;geom_sf&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;data =&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;arrange&lt;/span&gt;(keskukset_helsinkiin, &lt;span class=&#34;fu&#34;&gt;desc&lt;/span&gt;(kunta_summa)), &lt;span class=&#34;at&#34;&gt;col=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;alpha&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;red&amp;quot;&lt;/span&gt;, &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;sc&#34;&gt;/&lt;/span&gt;&lt;span class=&#34;dv&#34;&gt;2&lt;/span&gt;), &lt;span class=&#34;at&#34;&gt;size=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;dv&#34;&gt;0&lt;/span&gt;,&lt;span class=&#34;dv&#34;&gt;0&lt;/span&gt;,&lt;span class=&#34;dv&#34;&gt;0&lt;/span&gt;,&lt;span class=&#34;dv&#34;&gt;4&lt;/span&gt;,&lt;span class=&#34;dv&#34;&gt;3&lt;/span&gt;,&lt;span class=&#34;dv&#34;&gt;2&lt;/span&gt;,&lt;span class=&#34;dv&#34;&gt;2&lt;/span&gt;,&lt;span class=&#34;fu&#34;&gt;rep&lt;/span&gt;(&lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;, &lt;span class=&#34;dv&#34;&gt;13&lt;/span&gt;)))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2021/04/helsinki-ostodata/index.en_files/figure-html/flowmap-1.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;For the above example to work, the data object must keep its “sf” class so that &lt;a href=&#34;https://github.com/tidyverse/ggplot2/issues/3391#issuecomment-508527985&#34;&gt;ggplot2 can find the geometry column without trouble&lt;/a&gt;.&lt;/p&gt;
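&lt;p&gt;A minimal sketch of that point, using two made-up points rather than the procurement data (the object names are illustrative assumptions): &lt;code&gt;as.data.frame()&lt;/code&gt; drops the “sf” class, and &lt;code&gt;st_as_sf()&lt;/code&gt; restores it as long as the geometry column is still present.&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;library(sf)

# Toy data (hypothetical): two points in WGS84
pts &amp;lt;- st_sf(
  name = c(&amp;quot;a&amp;quot;, &amp;quot;b&amp;quot;),
  geom = st_sfc(st_point(c(24.94, 60.17)), st_point(c(23.76, 61.50)), crs = 4326)
)

# Converting to a plain data frame drops the sf class but keeps the geometry list-column
df &amp;lt;- as.data.frame(pts)
inherits(df, &amp;quot;sf&amp;quot;)

# st_as_sf() detects the geometry column and restores the class
restored &amp;lt;- st_as_sf(df)
inherits(restored, &amp;quot;sf&amp;quot;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Checking &lt;code&gt;inherits(x, &amp;quot;sf&amp;quot;)&lt;/code&gt; right before plotting is a cheap way to catch a join or subsetting step that returned a plain data frame.&lt;/p&gt;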
&lt;/div&gt;
&lt;div id=&#34;effect-of-distance-and-number-of-companies-in-a-municipality&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Effect of distance and number of companies in a municipality&lt;/h3&gt;
&lt;p&gt;Finally, I will illustrate how the number of companies in a municipality and the municipality’s distance from Helsinki affect how much the City of Helsinki buys from it.&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb8&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb8-1&#34;&gt;&lt;a href=&#34;#cb8-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;library&lt;/span&gt;(sf)&lt;/span&gt;
&lt;span id=&#34;cb8-2&#34;&gt;&lt;a href=&#34;#cb8-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-3&#34;&gt;&lt;a href=&#34;#cb8-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Get number of companies in each municipality from Statfin&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-4&#34;&gt;&lt;a href=&#34;#cb8-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# /PXWeb/api/v1/fi/StatFin/yri/alyr/statfin_alyr_pxt_11dc.px&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-5&#34;&gt;&lt;a href=&#34;#cb8-5&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;library&lt;/span&gt;(pxweb)&lt;/span&gt;
&lt;span id=&#34;cb8-6&#34;&gt;&lt;a href=&#34;#cb8-6&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;library&lt;/span&gt;(fuzzyjoin)&lt;/span&gt;
&lt;span id=&#34;cb8-7&#34;&gt;&lt;a href=&#34;#cb8-7&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-8&#34;&gt;&lt;a href=&#34;#cb8-8&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;pxweb_query_list &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;/span&gt;
&lt;span id=&#34;cb8-9&#34;&gt;&lt;a href=&#34;#cb8-9&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;list&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;Vuosi&amp;quot;&lt;/span&gt;&lt;span class=&#34;ot&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;2019&amp;quot;&lt;/span&gt;),&lt;/span&gt;
&lt;span id=&#34;cb8-10&#34;&gt;&lt;a href=&#34;#cb8-10&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;       &lt;span class=&#34;st&#34;&gt;&amp;quot;Kunta&amp;quot;&lt;/span&gt;&lt;span class=&#34;ot&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;*&amp;quot;&lt;/span&gt;),&lt;/span&gt;
&lt;span id=&#34;cb8-11&#34;&gt;&lt;a href=&#34;#cb8-11&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;       &lt;span class=&#34;st&#34;&gt;&amp;quot;Tiedot&amp;quot;&lt;/span&gt;&lt;span class=&#34;ot&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;Tplukumaara2&amp;quot;&lt;/span&gt;))&lt;/span&gt;
&lt;span id=&#34;cb8-12&#34;&gt;&lt;a href=&#34;#cb8-12&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-13&#34;&gt;&lt;a href=&#34;#cb8-13&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Download data &lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-14&#34;&gt;&lt;a href=&#34;#cb8-14&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;px_data &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;/span&gt;
&lt;span id=&#34;cb8-15&#34;&gt;&lt;a href=&#34;#cb8-15&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;span class=&#34;fu&#34;&gt;pxweb_get&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;url =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;https://pxnet2.stat.fi/PXWeb/api/v1/fi/StatFin/yri/alyr/statfin_alyr_pxt_11dc.px&amp;quot;&lt;/span&gt;,&lt;/span&gt;
&lt;span id=&#34;cb8-16&#34;&gt;&lt;a href=&#34;#cb8-16&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;            &lt;span class=&#34;at&#34;&gt;query =&lt;/span&gt; pxweb_query_list)&lt;/span&gt;
&lt;span id=&#34;cb8-17&#34;&gt;&lt;a href=&#34;#cb8-17&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-18&#34;&gt;&lt;a href=&#34;#cb8-18&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Convert to data.frame &lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-19&#34;&gt;&lt;a href=&#34;#cb8-19&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;px_data_frame &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;as.data.frame&lt;/span&gt;(px_data, &lt;span class=&#34;at&#34;&gt;column.name.type =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;text&amp;quot;&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;variable.value.type =&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;text&amp;quot;&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb8-20&#34;&gt;&lt;a href=&#34;#cb8-20&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-21&#34;&gt;&lt;a href=&#34;#cb8-21&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;yritykset &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;left_join&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;x =&lt;/span&gt; px_data_frame, &lt;span class=&#34;at&#34;&gt;y =&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;as.data.frame&lt;/span&gt;(helsingin_ostot5), &lt;span class=&#34;at&#34;&gt;by=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;Kunta&amp;quot;&lt;/span&gt;&lt;span class=&#34;ot&#34;&gt;=&lt;/span&gt;&lt;span class=&#34;st&#34;&gt;&amp;quot;kunta_name&amp;quot;&lt;/span&gt;))&lt;/span&gt;
&lt;span id=&#34;cb8-22&#34;&gt;&lt;a href=&#34;#cb8-22&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-23&#34;&gt;&lt;a href=&#34;#cb8-23&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Remove &amp;quot;KOKO SUOMI&amp;quot; (whole of Finland), &amp;quot;Tuntematon&amp;quot; (Unknown) and municipalities from which Helsinki made no purchases&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-24&#34;&gt;&lt;a href=&#34;#cb8-24&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;yritykset &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; yritykset[&lt;span class=&#34;fu&#34;&gt;which&lt;/span&gt;(&lt;span class=&#34;sc&#34;&gt;!&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;is.na&lt;/span&gt;(yritykset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;kunta_summa)),]&lt;/span&gt;
&lt;span id=&#34;cb8-25&#34;&gt;&lt;a href=&#34;#cb8-25&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Remove geom-column&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-26&#34;&gt;&lt;a href=&#34;#cb8-26&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;yritykset &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;as.data.frame&lt;/span&gt;(yritykset)&lt;/span&gt;
&lt;span id=&#34;cb8-27&#34;&gt;&lt;a href=&#34;#cb8-27&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;yritykset &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; yritykset[,&lt;span class=&#34;sc&#34;&gt;-&lt;/span&gt;&lt;span class=&#34;dv&#34;&gt;5&lt;/span&gt;]&lt;/span&gt;
&lt;span id=&#34;cb8-28&#34;&gt;&lt;a href=&#34;#cb8-28&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-29&#34;&gt;&lt;a href=&#34;#cb8-29&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; geofi&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;municipality_central_localities&lt;/span&gt;
&lt;span id=&#34;cb8-30&#34;&gt;&lt;a href=&#34;#cb8-30&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-31&#34;&gt;&lt;a href=&#34;#cb8-31&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;left_join&lt;/span&gt;(keskukset, &lt;span class=&#34;fu&#34;&gt;as.data.frame&lt;/span&gt;(helsingin_ostot6)[,&lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;sc&#34;&gt;:&lt;/span&gt;&lt;span class=&#34;dv&#34;&gt;2&lt;/span&gt;], &lt;span class=&#34;at&#34;&gt;by =&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;teksti&amp;quot;&lt;/span&gt; &lt;span class=&#34;ot&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;kunta_name&amp;quot;&lt;/span&gt;))&lt;/span&gt;
&lt;span id=&#34;cb8-32&#34;&gt;&lt;a href=&#34;#cb8-32&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-33&#34;&gt;&lt;a href=&#34;#cb8-33&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;distance_to_hel &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;cn&#34;&gt;NULL&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-34&#34;&gt;&lt;a href=&#34;#cb8-34&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;distance_to_hel &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;st_distance&lt;/span&gt;(keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;geom, &lt;span class=&#34;at&#34;&gt;y=&lt;/span&gt;keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;geom[&lt;span class=&#34;dv&#34;&gt;210&lt;/span&gt;,])&lt;/span&gt;
&lt;span id=&#34;cb8-35&#34;&gt;&lt;a href=&#34;#cb8-35&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;distance_to_hel &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;as.integer&lt;/span&gt;(keskukset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;distance_to_hel &lt;span class=&#34;sc&#34;&gt;/&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;1000&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb8-36&#34;&gt;&lt;a href=&#34;#cb8-36&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-37&#34;&gt;&lt;a href=&#34;#cb8-37&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;keskukset &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; keskukset &lt;span class=&#34;sc&#34;&gt;%&amp;gt;%&lt;/span&gt; &lt;/span&gt;
&lt;span id=&#34;cb8-38&#34;&gt;&lt;a href=&#34;#cb8-38&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  dplyr&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;select&lt;/span&gt;(teksti, distance_to_hel, geom)&lt;/span&gt;
&lt;span id=&#34;cb8-39&#34;&gt;&lt;a href=&#34;#cb8-39&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;  &lt;/span&gt;
&lt;span id=&#34;cb8-40&#34;&gt;&lt;a href=&#34;#cb8-40&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Fuzzyjoin-package removes the need for custom functions &lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-41&#34;&gt;&lt;a href=&#34;#cb8-41&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;yritykset &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; fuzzyjoin&lt;span class=&#34;sc&#34;&gt;::&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;regex_left_join&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;x =&lt;/span&gt; yritykset, &lt;span class=&#34;at&#34;&gt;y =&lt;/span&gt; keskukset, &lt;span class=&#34;at&#34;&gt;by=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;st&#34;&gt;&amp;quot;Kunta&amp;quot;&lt;/span&gt; &lt;span class=&#34;ot&#34;&gt;=&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;teksti&amp;quot;&lt;/span&gt;), &lt;span class=&#34;at&#34;&gt;ignore_case =&lt;/span&gt; &lt;span class=&#34;cn&#34;&gt;TRUE&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb8-42&#34;&gt;&lt;a href=&#34;#cb8-42&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-43&#34;&gt;&lt;a href=&#34;#cb8-43&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Remove outlier, Helsinki&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-44&#34;&gt;&lt;a href=&#34;#cb8-44&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;yritykset &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; yritykset[&lt;span class=&#34;sc&#34;&gt;-&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;which&lt;/span&gt;(yritykset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;Kunta &lt;span class=&#34;sc&#34;&gt;==&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;&amp;quot;Helsinki&amp;quot;&lt;/span&gt;),]&lt;/span&gt;
&lt;span id=&#34;cb8-45&#34;&gt;&lt;a href=&#34;#cb8-45&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-46&#34;&gt;&lt;a href=&#34;#cb8-46&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Draw scatter plots with smoothed (LOESS) curves&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb8-47&#34;&gt;&lt;a href=&#34;#cb8-47&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;par&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;mfrow=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;c&lt;/span&gt;(&lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;,&lt;span class=&#34;dv&#34;&gt;2&lt;/span&gt;))&lt;/span&gt;
&lt;span id=&#34;cb8-48&#34;&gt;&lt;a href=&#34;#cb8-48&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;scatter.smooth&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;x=&lt;/span&gt;yritykset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;&lt;span class=&#34;st&#34;&gt;`&lt;/span&gt;&lt;span class=&#34;at&#34;&gt;Yritysten toimipaikat (lkm)&lt;/span&gt;&lt;span class=&#34;st&#34;&gt;`&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;y=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;log10&lt;/span&gt;(yritykset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;kunta_summa), &lt;span class=&#34;at&#34;&gt;span =&lt;/span&gt; &lt;span class=&#34;dv&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;sc&#34;&gt;/&lt;/span&gt;&lt;span class=&#34;dv&#34;&gt;5&lt;/span&gt;)&lt;/span&gt;
&lt;span id=&#34;cb8-49&#34;&gt;&lt;a href=&#34;#cb8-49&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;scatter.smooth&lt;/span&gt;(&lt;span class=&#34;at&#34;&gt;x=&lt;/span&gt;yritykset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;distance_to_hel, &lt;span class=&#34;at&#34;&gt;y=&lt;/span&gt;&lt;span class=&#34;fu&#34;&gt;log10&lt;/span&gt;(yritykset&lt;span class=&#34;sc&#34;&gt;$&lt;/span&gt;kunta_summa))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2021/04/helsinki-ostodata/index.en_files/figure-html/scatterplotit_ja_regressiot-1.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
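&lt;p&gt;The distance calculation above picks Helsinki by its row position (210) in the central-localities table, which would silently point at another municipality if the row order ever changed. A hedged alternative is to look the reference point up by name. The sketch below uses three made-up points with approximate WGS84 coordinates; the object name &lt;code&gt;centres&lt;/code&gt; is an assumption, and only the &lt;code&gt;teksti&lt;/code&gt; and &lt;code&gt;distance_to_hel&lt;/code&gt; column names come from the code above.&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;library(sf)

# Toy stand-in for the central-localities table (approximate coordinates)
centres &amp;lt;- st_sf(
  teksti = c(&amp;quot;Espoo&amp;quot;, &amp;quot;Helsinki&amp;quot;, &amp;quot;Tampere&amp;quot;),
  geom = st_sfc(
    st_point(c(24.66, 60.21)),
    st_point(c(24.94, 60.17)),
    st_point(c(23.76, 61.50)),
    crs = 4326
  )
)

# Select the reference row by name and make sure exactly one row matched
hel &amp;lt;- centres[centres$teksti == &amp;quot;Helsinki&amp;quot;, ]
stopifnot(nrow(hel) == 1)

# st_distance() returns metres here; drop the units and convert to whole kilometres
centres$distance_to_hel &amp;lt;- as.integer(as.numeric(st_distance(centres, hel)) / 1000)
centres&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;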
&lt;div class=&#34;sourceCode&#34; id=&#34;cb9&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb9-1&#34;&gt;&lt;a href=&#34;#cb9-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# Compare two different regression models&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb9-2&#34;&gt;&lt;a href=&#34;#cb9-2&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;fit1 &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;lm&lt;/span&gt;(&lt;span class=&#34;fu&#34;&gt;log10&lt;/span&gt;(kunta_summa) &lt;span class=&#34;sc&#34;&gt;~&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;`&lt;/span&gt;&lt;span class=&#34;at&#34;&gt;Yritysten toimipaikat (lkm)&lt;/span&gt;&lt;span class=&#34;st&#34;&gt;`&lt;/span&gt;, &lt;span class=&#34;at&#34;&gt;data=&lt;/span&gt;yritykset)&lt;/span&gt;
&lt;span id=&#34;cb9-3&#34;&gt;&lt;a href=&#34;#cb9-3&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;fit2 &lt;span class=&#34;ot&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&#34;fu&#34;&gt;lm&lt;/span&gt;(&lt;span class=&#34;fu&#34;&gt;log10&lt;/span&gt;(kunta_summa) &lt;span class=&#34;sc&#34;&gt;~&lt;/span&gt; &lt;span class=&#34;st&#34;&gt;`&lt;/span&gt;&lt;span class=&#34;at&#34;&gt;Yritysten toimipaikat (lkm)&lt;/span&gt;&lt;span class=&#34;st&#34;&gt;`&lt;/span&gt; &lt;span class=&#34;sc&#34;&gt;+&lt;/span&gt; distance_to_hel, &lt;span class=&#34;at&#34;&gt;data=&lt;/span&gt;yritykset)&lt;/span&gt;
&lt;span id=&#34;cb9-4&#34;&gt;&lt;a href=&#34;#cb9-4&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb9-5&#34;&gt;&lt;a href=&#34;#cb9-5&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# If needed, draw regression plots&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb9-6&#34;&gt;&lt;a href=&#34;#cb9-6&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# abline(lm(log10(kunta_summa) ~ `Yritysten toimipaikat (lkm)`, data=yritykset))&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb9-7&#34;&gt;&lt;a href=&#34;#cb9-7&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;co&#34;&gt;# abline(lm(log10(kunta_summa) ~ `Yritysten toimipaikat (lkm)` + distance_to_hel, data=yritykset))&lt;/span&gt;&lt;/span&gt;
&lt;span id=&#34;cb9-8&#34;&gt;&lt;a href=&#34;#cb9-8&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;/span&gt;
&lt;span id=&#34;cb9-9&#34;&gt;&lt;a href=&#34;#cb9-9&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;summary&lt;/span&gt;(fit1)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre class=&#34;img-fluid&#34;&gt;&lt;code&gt;## 
## Call:
## lm(formula = log10(kunta_summa) ~ `Yritysten toimipaikat (lkm)`, 
##     data = yritykset)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.2431 -0.7991  0.0834  0.9631  2.4039 
## 
## Coefficients:
##                                 Estimate Std. Error t value            Pr(&amp;gt;|t|)
## (Intercept)                   4.75557723 0.07845464   60.62 &amp;lt;0.0000000000000002
## `Yritysten toimipaikat (lkm)` 0.00039387 0.00003411   11.55 &amp;lt;0.0000000000000002
##                                  
## (Intercept)                   ***
## `Yritysten toimipaikat (lkm)` ***
## ---
## Signif. codes:  0 &amp;#39;***&amp;#39; 0.001 &amp;#39;**&amp;#39; 0.01 &amp;#39;*&amp;#39; 0.05 &amp;#39;.&amp;#39; 0.1 &amp;#39; &amp;#39; 1
## 
## Residual standard error: 1.159 on 294 degrees of freedom
## Multiple R-squared:  0.312,  Adjusted R-squared:  0.3097 
## F-statistic: 133.3 on 1 and 294 DF,  p-value: &amp;lt; 0.00000000000000022&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;sourceCode&#34; id=&#34;cb11&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;&lt;span id=&#34;cb11-1&#34;&gt;&lt;a href=&#34;#cb11-1&#34; aria-hidden=&#34;true&#34; tabindex=&#34;-1&#34;&gt;&lt;/a&gt;&lt;span class=&#34;fu&#34;&gt;summary&lt;/span&gt;(fit2)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre class=&#34;img-fluid&#34;&gt;&lt;code&gt;## 
## Call:
## lm(formula = log10(kunta_summa) ~ `Yritysten toimipaikat (lkm)` + 
##     distance_to_hel, data = yritykset)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.3314 -0.7943  0.1029  0.9196  2.7113 
## 
## Coefficients:
##                                  Estimate  Std. Error t value
## (Intercept)                    5.24186205  0.13519748  38.772
## `Yritysten toimipaikat (lkm)`  0.00036832  0.00003366  10.943
## distance_to_hel               -0.00158104  0.00036166  -4.372
##                                           Pr(&amp;gt;|t|)    
## (Intercept)                   &amp;lt; 0.0000000000000002 ***
## `Yritysten toimipaikat (lkm)` &amp;lt; 0.0000000000000002 ***
## distance_to_hel                          0.0000172 ***
## ---
## Signif. codes:  0 &amp;#39;***&amp;#39; 0.001 &amp;#39;**&amp;#39; 0.01 &amp;#39;*&amp;#39; 0.05 &amp;#39;.&amp;#39; 0.1 &amp;#39; &amp;#39; 1
## 
## Residual standard error: 1.126 on 292 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.3543, Adjusted R-squared:  0.3499 
## F-statistic: 80.13 on 2 and 292 DF,  p-value: &amp;lt; 0.00000000000000022&lt;/code&gt;&lt;/pre&gt;
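&lt;p&gt;The second summary reports one observation deleted due to missingness, so &lt;code&gt;fit1&lt;/code&gt; and &lt;code&gt;fit2&lt;/code&gt; were estimated on slightly different sets of municipalities. A formal comparison of nested models needs both fitted to the same rows. The sketch below shows one way to do that on simulated stand-in data; the data frame and its column names are hypothetical, and the numbers it produces say nothing about the procurement data.&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;# Simulated stand-in data (hypothetical column names)
set.seed(1)
d &amp;lt;- data.frame(
  n_companies = runif(50, 10, 5000),
  distance = runif(50, 5, 800)
)
d$log_sum &amp;lt;- 4.7 + 0.0004 * d$n_companies - 0.0015 * d$distance + rnorm(50)
d$distance[3] &amp;lt;- NA  # one missing distance, mimicking the deleted observation

# Keep only rows that are complete for every variable used in either model
d_complete &amp;lt;- d[complete.cases(d[, c(&amp;quot;log_sum&amp;quot;, &amp;quot;n_companies&amp;quot;, &amp;quot;distance&amp;quot;)]), ]

m1 &amp;lt;- lm(log_sum ~ n_companies, data = d_complete)
m2 &amp;lt;- lm(log_sum ~ n_companies + distance, data = d_complete)

# F-test for the added distance term; both fits now share the same rows
anova(m1, m2)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;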
&lt;p&gt;We notice that both the number of companies in a municipality and its distance from Helsinki are statistically significant predictors of how much companies from that municipality sell to Helsinki: purchases grow with the number of companies and shrink with distance. The models explain only about a third of the variation (adjusted R-squared of 0.31 without distance and 0.35 with it), and there are some interesting outliers among smaller municipalities that punch above their weight in Helsinki’s procurements. The dataset provides an excellent starting point for identifying these companies and, perhaps, learning from their example.&lt;/p&gt;
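&lt;p&gt;Because the response is on a log10 scale, the coefficients are easier to read as multipliers. As a rough sketch, using the estimates copied from &lt;code&gt;summary(fit2)&lt;/code&gt; above (the variable names are mine), a change of &lt;code&gt;delta&lt;/code&gt; in a predictor multiplies the expected purchase sum by &lt;code&gt;10^(b * delta)&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;sourceCode&#34;&gt;&lt;pre class=&#34;sourceCode r&#34;&gt;&lt;code class=&#34;sourceCode r&#34;&gt;# Estimates copied from summary(fit2)
b_companies &amp;lt;- 0.00036832   # per additional business establishment
b_distance  &amp;lt;- -0.00158104  # per additional kilometre from Helsinki

# Multiplicative change in the purchase sum for a given change in each predictor
mult_1000_companies &amp;lt;- 10^(b_companies * 1000)
mult_100_km &amp;lt;- 10^(b_distance * 100)

round(c(mult_1000_companies, mult_100_km), 2)
# 2.34 0.69&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Under this model, 1,000 additional establishments go with roughly 2.3 times the purchase sum, and each additional 100 km with roughly 30% less, holding the other variable fixed. These are associations in the fitted model, not causal effects.&lt;/p&gt;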
&lt;/div&gt;
&lt;div id=&#34;conclusion&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;Many of the largest companies in Finland have their headquarters in the capital region (cf. &lt;a href=&#34;https://kaks.fi/wp-content/uploads/2019/04/helsinki-vs-muu-suomi_manninen_tolli.pdf&#34;&gt;Manninen &amp;amp; Tölli 2019&lt;/a&gt;), which may explain why Helsinki, Espoo and Vantaa are so well represented in Helsinki’s procurements. In the future, it would be interesting to examine whether regional capitals such as Turku and Tampere also buy the majority of their goods and services from the capital region or whether they have their own local ecosystems.&lt;/p&gt;
&lt;p&gt;Idealized conditions of perfect competition (no barriers to entry or exit, perfect information, zero transaction costs, etc.) do not exist even within a relatively homogeneous national framework, let alone within a heterogeneous single market area such as the EU. Industry advocacy groups, government organizations and companies see great potential in wider access to the EU single market, and active policy measures aim to lower barriers to entry in order to foster competitiveness. Perhaps there is still work left undone in opening up access to local markets such as Helsinki.&lt;/p&gt;
&lt;/div&gt;
</description> 
  </item>
  
<item>
  <title>Hetu-package for handling of Finnish personal identity codes</title>
  <link>http://ropengov.org/2020/10/hetu-package-for-handling-of-finnish-personal-identity-codes/</link>
  <pubDate>Thu, 29 Oct 2020 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2020/10/hetu-package-for-handling-of-finnish-personal-identity-codes/</guid>
  <description>
&lt;script src=&#34;http://ropengov.org/2020/10/hetu-package-for-handling-of-finnish-personal-identity-codes/index_files/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;


&lt;div id=&#34;general-information&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;General information&lt;/h2&gt;
&lt;p&gt;Hetu-package for R is meant for algorithmic handling of Finnish personal identity numbers (PINs). The package is especially useful for those who wish to extract information from or validate a large number of PINs at a time.&lt;/p&gt;
&lt;p&gt;The toolset for analyzing Finnish PINs was initially developed as a part of the sorvi-package, but was later made into a separate package. The development of the hetu-package reached an important milestone in fall 2020 when it was published on &lt;a href=&#34;https://CRAN.R-project.org/package=hetu&#34;&gt;CRAN&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The development of hetu-package is closely related to &lt;a href=&#34;https://CRAN.R-project.org/package=sweidnumbr&#34;&gt;sweidnumbr&lt;/a&gt;, a similar package meant for analyzing Swedish personal identity numbers (PINs) and organizational identity numbers (OINs). Hetu-package shares similar function names with sweidnumbr, when applicable.&lt;a href=&#34;#fn1&#34; class=&#34;footnote-ref&#34; id=&#34;fnref1&#34;&gt;&lt;sup&gt;1&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;finnish-personal-identification-code-hetu&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Finnish personal identification code, hetu&lt;/h2&gt;
&lt;p&gt;A personal identification code (or: national identification number, national identity number, personal identification number or PIN) is meant to be a unique identifier for an individual. The Finnish personal identification number (henkilötunnus, hetu for short) consists of the date of birth (DDMMYY), a century marker (-, + or A), a personal number (NNN) and a check character (C). Males have an odd personal number and females an even personal number.&lt;a href=&#34;#fn2&#34; class=&#34;footnote-ref&#34; id=&#34;fnref2&#34;&gt;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Personal identity codes are widely used in the public and private sectors alike. They are not confidential or secret information, but as with all personal information, handling hetu-codes requires consent from the individual or a valid reason.&lt;/p&gt;
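&lt;p&gt;The role of the check character can be illustrated with a short base R sketch. It assumes the commonly documented rule that the nine digits of the date and the personal number, read as one number, are divided by 31 and the remainder selects a character from a fixed 31-character alphabet. The helper &lt;code&gt;hetu_check_char()&lt;/code&gt; is hypothetical and is not part of the hetu-package; it also skips all input validation.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical helper for illustration only, not a function of the hetu-package
hetu_check_char &amp;lt;- function(pin) {
  # Join the date part (characters 1-6) and the personal number (characters 8-10)
  digits &amp;lt;- paste0(substr(pin, 1, 6), substr(pin, 8, 10))
  # Assumed rule: the remainder after division by 31 indexes this alphabet
  alphabet &amp;lt;- strsplit(&amp;quot;0123456789ABCDEFHJKLMNPRSTUVWXY&amp;quot;, &amp;quot;&amp;quot;)[[1]]
  alphabet[as.numeric(digits) %% 31 + 1]
}

# Compare the result with the last character of each imaginary PIN
hetu_check_char(c(&amp;quot;010101-0101&amp;quot;, &amp;quot;111111-111C&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If the rule holds, the returned characters match the final character of each PIN, which is what a validity check compares.&lt;/p&gt;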
&lt;/div&gt;
&lt;div id=&#34;algorithmic-handling-of-hetu-pins&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Algorithmic handling of hetu-pins&lt;/h2&gt;
&lt;p&gt;Analyzing and extracting information from Finnish personal identity numbers is rather straightforward even with the naked eye. The hetu-package excels in handling large numbers of PINs, which would be cumbersome otherwise.&lt;/p&gt;
&lt;p&gt;Hetu-package has functions to extract the following information:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;hetu_date / pin_date: date of birth&lt;/li&gt;
&lt;li&gt;hetu_sex / pin_sex: sex, Male or Female&lt;/li&gt;
&lt;li&gt;hetu_age / pin_age: age in years, months or days (at the time of the query or at a desired date)&lt;/li&gt;
&lt;li&gt;hetu_ctrl / pin_ctrl: validity check for the PIN, TRUE or FALSE&lt;/li&gt;
&lt;/ul&gt;
&lt;div id=&#34;use-of-hetu-package&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Use of hetu-package&lt;/h3&gt;
&lt;p&gt;Installing the package in R from CRAN:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;install.packages(&amp;quot;hetu&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Loading the package and setting a few imaginary PINs for testing:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(hetu)
example_pins &amp;lt;- c(&amp;quot;010101-0101&amp;quot;, &amp;quot;111111-111C&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The hetu-function is the backbone of the package, and the majority of the information that can be extracted is available as a simple data frame:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;knitr::kable(hetu(example_pins))&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;hetu&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;sex&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;p.num&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;checksum&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;date&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;day&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;month&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;year&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;century&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;valid.pin&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;010101-0101&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;Female&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;010&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;1901-01-01&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1901&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;-&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;TRUE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;111111-111C&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;Male&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;111&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;C&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;1911-11-11&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;11&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;11&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1911&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;-&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;TRUE&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;There are several ways to extract specific information about a group of PINs, for example the date of birth. If the output of the hetu-function is saved as an object, all columns can be subsetted as usual. For the convenience of the end user, the information in the data frame can also be extracted by using the extract-parameter in the hetu-function or by using one of the specialized functions:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Extracting sex
hetu(example_pins, extract = &amp;quot;sex&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Female&amp;quot; &amp;quot;Male&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;hetu_sex(example_pins)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Female&amp;quot; &amp;quot;Male&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Extracting date of birth
hetu(example_pins, extract = &amp;quot;date&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;1901-01-01&amp;quot; &amp;quot;1911-11-11&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;hetu_date(example_pins)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;1901-01-01&amp;quot; &amp;quot;1911-11-11&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Extracting information on validity
hetu(example_pins, extract = &amp;quot;valid.pin&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] TRUE TRUE&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;hetu_ctrl(example_pins)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] TRUE TRUE&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Information that can be extracted only with extract-parameter
hetu(example_pins, extract = &amp;quot;p.num&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;010&amp;quot; &amp;quot;111&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In contrast to other information, extracting age works only with a specialized function. This example also introduces the ability to generate random PINs with the rhetu-function:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;example_pins2 &amp;lt;- rhetu(5, start = &amp;quot;1950-01-01&amp;quot;, end = &amp;quot;1995-05-07&amp;quot;)
# Age in years
hetu_age(example_pins2)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## The age in years has been calculated at 2021-01-31.&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 33 69 62 31 43&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Age in months
hetu_age(example_pins2, timespan = &amp;quot;months&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## The age in months has been calculated at 2021-01-31.&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 403 839 752 383 521&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Age in 2011
hetu_age(example_pins2, date = &amp;quot;2011-01-01&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## The age in years has been calculated at 2011-01-01.&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 23 59 52 21 33&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Visualization: boxplot grouped by sex
example_pins3 &amp;lt;- rhetu(20, start = &amp;quot;1950-01-01&amp;quot;, end = &amp;quot;1995-05-07&amp;quot;, p.male = 0.5)
boxplot(hetu_age(example_pins3)~hetu_sex(example_pins3), xlab = &amp;quot;&amp;quot;, ylab = &amp;quot;Age in years&amp;quot;, col=c(&amp;quot;cyan&amp;quot;, &amp;quot;magenta&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## The age in years has been calculated at 2021-01-31.&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2020/10/hetu-package-for-handling-of-finnish-personal-identity-codes/index_files/figure-html/unnamed-chunk-4-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;In some cases diagnostic information for invalid PINs might be useful:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;hetu_diagnostic(&amp;quot;321399-000G&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##           hetu is.temp valid.p.num valid.checksum correct.checksum valid.date
## 21 321399-000G   FALSE       FALSE          FALSE            FALSE      FALSE
##    valid.day valid.month valid.year valid.length valid.century
## 21     FALSE       FALSE       TRUE         TRUE          TRUE&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Print only certain columns
hetu_diagnostic(&amp;quot;321399-000G&amp;quot;, extract = c(&amp;quot;valid.p.num&amp;quot;, &amp;quot;valid.length&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##           hetu valid.p.num valid.length
## 21 321399-000G       FALSE         TRUE&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;business-identity-numbers-y-tunnus-bid&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Business Identity Numbers (Y-tunnus, BID)&lt;/h2&gt;
&lt;p&gt;As in sweidnumbr, the hetu-package has two functions that can be used with Finnish Business Identity Numbers (y-tunnus). Finnish business identity numbers have the form 1234567-8, where the last number is a check digit.&lt;a href=&#34;#fn3&#34; class=&#34;footnote-ref&#34; id=&#34;fnref3&#34;&gt;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; The following functions are available:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;bid_ctrl(bid): checks the validity of the BID, TRUE or FALSE&lt;/li&gt;
&lt;li&gt;rbid(n): generates n BIDs&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;example_bids &amp;lt;- rbid(2)
example_bids&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;7128741-6&amp;quot; &amp;quot;1963928-5&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;bid_ctrl(example_bids)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] TRUE TRUE&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;No additional information can be extracted from BIDs.&lt;/p&gt;
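&lt;p&gt;To show what such a validity check involves, here is a minimal base R sketch of the check digit calculation. It assumes the commonly documented rule: the first seven digits are multiplied by the weights 7, 9, 10, 5, 8, 4 and 2, the products are summed, and the remainder modulo 11 determines the check digit. The helper &lt;code&gt;bid_check_digit()&lt;/code&gt; is hypothetical, handles one BID at a time and is not part of the hetu-package.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical helper for illustration only, not a function of the hetu-package
bid_check_digit &amp;lt;- function(bid) {
  weights &amp;lt;- c(7, 9, 10, 5, 8, 4, 2)
  # First seven digits of the BID as an integer vector
  digits &amp;lt;- as.integer(strsplit(substr(bid, 1, 7), &amp;quot;&amp;quot;)[[1]])
  remainder &amp;lt;- sum(digits * weights) %% 11
  if (remainder == 0) return(0L)
  # Assumed rule: a remainder of 1 never occurs in a valid BID
  if (remainder == 1) return(NA_integer_)
  11L - as.integer(remainder)
}

# Compare with the digit after the hyphen in the generated BIDs above
bid_check_digit(&amp;quot;7128741-6&amp;quot;)
bid_check_digit(&amp;quot;1963928-5&amp;quot;)&lt;/code&gt;&lt;/pre&gt;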
&lt;/div&gt;
&lt;div id=&#34;references&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;References&lt;/h2&gt;
&lt;/div&gt;
&lt;div class=&#34;footnotes&#34;&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id=&#34;fn1&#34;&gt;&lt;p&gt;More information about sweidnumbr can be found e.g. from this blogpost: Magnusson, Mans &amp;amp; Bulow, Erik. 2015. R made personal (at least for swedes)!. URL: &lt;a href=&#34;https://ropengov.org/2015/08/r-made-personal-at-least-for-swedes/&#34; class=&#34;uri&#34;&gt;https://ropengov.org/2015/08/r-made-personal-at-least-for-swedes/&lt;/a&gt;&lt;a href=&#34;#fnref1&#34; class=&#34;footnote-back&#34;&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&#34;fn2&#34;&gt;&lt;p&gt;Digital and Population Data Services Agency (Digi- ja väestötietovirasto). The personal identity code. URL: &lt;a href=&#34;https://dvv.fi/en/personal-identity-code&#34; class=&#34;uri&#34;&gt;https://dvv.fi/en/personal-identity-code&lt;/a&gt;&lt;a href=&#34;#fnref2&#34; class=&#34;footnote-back&#34;&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&#34;fn3&#34;&gt;&lt;p&gt;Finnish Patent and Registration Office. The Business Information System (BIS). URL: &lt;a href=&#34;https://www.prh.fi/en/kaupparekisteri/rekisterointipalvelut/ytj.html&#34; class=&#34;uri&#34;&gt;https://www.prh.fi/en/kaupparekisteri/rekisterointipalvelut/ytj.html&lt;/a&gt;&lt;a href=&#34;#fnref3&#34; class=&#34;footnote-back&#34;&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description> 
  </item>
  
<item>
  <title>geofi R-package for accessing Statistics Finland spatial data</title>
  <link>http://ropengov.org/2020/02/geofi-en/</link>
  <pubDate>Tue, 11 Feb 2020 10:53:45 +0000</pubDate>
  
<guid>http://ropengov.org/2020/02/geofi-en/</guid>
  <description>
&lt;script src=&#34;http://ropengov.org/2020/02/geofi-en/index_files/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;


&lt;p&gt;Together with the &lt;a href=&#34;https://ropengov.github.io/&#34;&gt;ropengov&lt;/a&gt;-posse, we have slowly started developing &lt;a href=&#34;https://ropengov.github.io/geofi/index.html&#34;&gt;&lt;code&gt;geofi&lt;/code&gt;&lt;/a&gt;, a follow-up to the &lt;a href=&#34;https://ropengov.github.io/gisfin/&#34;&gt;gisfin&lt;/a&gt;-package. The package provides access to a few sources of &lt;em&gt;Finnish open geospatial data&lt;/em&gt; from R. We are focusing on administrative regions at the moment, and our primary source of data is Statistics Finland and their &lt;code&gt;wfs&lt;/code&gt;-api. You can use the functions in &lt;code&gt;geofi&lt;/code&gt; to fetch data such as &lt;em&gt;municipality borders&lt;/em&gt;, &lt;em&gt;postal code areas&lt;/em&gt; and &lt;em&gt;population grids&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;geofi&lt;/code&gt; is not published on CRAN yet, so you cannot install it using &lt;code&gt;install.packages()&lt;/code&gt;. You can, however, install it directly from GitHub with &lt;code&gt;remotes::install_github(&#34;ropengov/geofi&#34;)&lt;/code&gt; and try out the following examples. For quick access, try our Shiny app at: &lt;a href=&#34;https://muuankarski.shinyapps.io/geofi_selain/&#34;&gt;&lt;code&gt;https://muuankarski.shinyapps.io/geofi_selain/&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Municipality borders&lt;/strong&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(geofi)
library(ggplot2)

municipalities &amp;lt;- get_municipalities(year = 2020, scale = 4500)
ggplot(municipalities) + 
  geom_sf(aes(fill = as.integer(kunta))) +
  scale_fill_viridis_c()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2020/02/geofi-en/index_files/figure-html/municipality_map-1.png&#34; width=&#34;900&#34; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Postal code areas&lt;/strong&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;zipcodes &amp;lt;- get_zipcodes(year = 2020) 
ggplot(zipcodes) + 
  geom_sf(aes(fill = as.integer(posti_alue)), color = alpha(&amp;quot;white&amp;quot;, 1/3)) +
  scale_fill_viridis_c()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2020/02/geofi-en/index_files/figure-html/zipcode_map-1.png&#34; width=&#34;900&#34; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Population grids&lt;/strong&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;pop_grid &amp;lt;- get_population_grid(year = 2018, resolution = 5)
ggplot(pop_grid) + 
  geom_sf(aes(fill = objectid), color = alpha(&amp;quot;white&amp;quot;, 1/3)) +
  scale_fill_viridis_c()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2020/02/geofi-en/index_files/figure-html/population_grid_data-1.png&#34; width=&#34;900&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Regions (maakunnat), health care districts (sairaanhoitopiirit) and many other regional breakdowns are based on the municipality division. The &lt;code&gt;get_municipalities()&lt;/code&gt;-function returns data containing &lt;a href=&#34;https://ropengov.github.io/geofi/reference/municipality_key_2020.html&#34;&gt;these attribute variables&lt;/a&gt; (year 2020), which you can use to aggregate from the municipality level upwards.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
municipalities &amp;lt;- get_municipalities(year = 2019, scale = 4500)
regions &amp;lt;- municipalities %&amp;gt;% 
  group_by(maakunta_name_fi) %&amp;gt;% summarise()
ggplot(regions) + 
  geom_sf(aes(fill = maakunta_name_fi)) +
  scale_fill_viridis_d()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2020/02/geofi-en/index_files/figure-html/aggregate-1.png&#34; width=&#34;900&#34; /&gt;&lt;/p&gt;
&lt;p&gt;You can also join &lt;code&gt;geofi&lt;/code&gt; data with other attribute data. Below is an example of how to get (non-spatial statistical) data from Statistics Finland and create a map of population at the municipality level.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(tidyr)
library(pxweb)
library(janitor)

municipalities17 &amp;lt;- get_municipalities(year = 2017)

# pull municipality data from Statistics Finland
pxweb_query_list &amp;lt;-
  list(&amp;quot;Alue 2019&amp;quot;=c(&amp;quot;*&amp;quot;),
       &amp;quot;Tiedot&amp;quot;=c(&amp;quot;*&amp;quot;),
       &amp;quot;Vuosi&amp;quot;=c(&amp;quot;2017&amp;quot;))
px_data &amp;lt;-
  pxweb_get(url = &amp;quot;http://pxnet2.stat.fi/PXWeb/api/v1/fi/Kuntien_avainluvut/2019/kuntien_avainluvut_2019_aikasarja.px&amp;quot;,
            query = pxweb_query_list)
# Convert to data.frame
tk_data &amp;lt;- as.data.frame(px_data, column.name.type = &amp;quot;text&amp;quot;, variable.value.type = &amp;quot;text&amp;quot;)
tk_data2 &amp;lt;- tk_data %&amp;gt;%
  rename(name = `Alue 2019`) %&amp;gt;%
  mutate(name = as.character(name),
         # Paste Tiedot and Vuosi
         Tiedot = paste(Tiedot, Vuosi)) %&amp;gt;%
  select(-Vuosi) %&amp;gt;%
  spread(Tiedot, `Kuntien avainluvut`) %&amp;gt;%
  as_tibble()
tk_data3 &amp;lt;- janitor::clean_names(tk_data2)

# Join with Statistics Finland attribute data
dat &amp;lt;- left_join(municipalities17, tk_data3)

ggplot(dat) + 
  geom_sf(aes(fill = vakiluku_2017), color = alpha(&amp;quot;white&amp;quot;, 1/3)) +
  scale_fill_viridis_c(trans = &amp;quot;sqrt&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2020/02/geofi-en/index_files/figure-html/municipalities_with_data-1.png&#34; width=&#34;900&#34; /&gt;&lt;/p&gt;
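&lt;p&gt;The join above lets &lt;code&gt;left_join()&lt;/code&gt; pick the key from the column names that the two tables share. The following minimal sketch uses toy data frames to show how the key can be named explicitly and how rows without a match can be found. The shared key column &lt;code&gt;name&lt;/code&gt; and all values are assumptions made up for illustration; they do not come from geofi or Statistics Finland.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# Toy stand-ins for the spatial data and the statistical table (made-up values)
areas &amp;lt;- data.frame(name = c(&amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;, &amp;quot;C&amp;quot;), area_id = 1:3)
stats &amp;lt;- data.frame(name = c(&amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;), population = c(100, 250))

# Name the join key explicitly instead of relying on shared column names
joined &amp;lt;- left_join(areas, stats, by = &amp;quot;name&amp;quot;)

# Rows that found no match keep NA in the joined columns
filter(joined, is.na(population))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Checking for unmatched rows is worthwhile when the two sources refer to different years, since area names and codes can change between them.&lt;/p&gt;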
&lt;p&gt;Take a look at the GitHub site and join us!&lt;/p&gt;
</description> 
  </item>
  
<item>
  <title>Cost of academic publishing in Finland 2010-2017</title>
  <link>http://ropengov.org/2018/12/cost-of-academic-publishing-in-finland-2010-2017/</link>
  <pubDate>Wed, 05 Dec 2018 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2018/12/cost-of-academic-publishing-in-finland-2010-2017/</guid>
  <description>
&lt;script src=&#34;http://ropengov.org/2018/12/cost-of-academic-publishing-in-finland-2010-2017/index_files/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;


&lt;div id=&#34;subscription-costs-to-scientific-publishers-in-finland-2010-2017&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Subscription costs to scientific publishers in Finland 2010-2017&lt;/h2&gt;
&lt;p&gt;This post provides a brief overview of the subscription prices paid by Finnish research institutions to academic publishers in 2010-2017.&lt;/p&gt;
&lt;p&gt;Finland is possibly the only country that has systematically released subscription prices that research libraries pay to academic publishers &lt;a href=&#34;https://avointiede.fi/fi/avoimet-julkaisut/kustantajahintatietoja&#34;&gt;as open data&lt;/a&gt;. The data is available for all major research institutions in Finland. Recently, an updated data set for 2010–2017 was made openly available at &lt;a href=&#34;https://avointiede.fi/fi/avoimet-julkaisut/kustantajahintatietoja&#34;&gt;avointiede.fi&lt;/a&gt;. In addition, &lt;a href=&#34;https://www.kansalliskirjasto.fi/extra/finelib_julkinen/&#34;&gt;full text agreements&lt;/a&gt; with many publishers have been &lt;a href=&#34;http://finelib.fi/negotiations/agreements/&#34;&gt;made available&lt;/a&gt;. The subscription price data was initially provided by Finnish Ministry of Education and Culture, and its Open Science and Research Initiative funded 2014–2017, after a successful Freedom of Information request by the Finnish Open Science community, as summarized &lt;a href=&#34;https://www.mostlyphysics.net/blog/2016/6/13/finland-takes-leading-role-in-the-openness-of-academic-journal-pricing&#34;&gt;elsewhere&lt;/a&gt;. This post updates our &lt;a href=&#34;http://ropengov.github.io/r/2016/06/10/FOI/&#34;&gt;earlier analysis&lt;/a&gt;. For source code, see &lt;a href=&#34;https://github.com/rOpenGov/Finland-Subscription-Costs&#34;&gt;main.R&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;overall-subscription-costs-2010-2017&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Overall subscription costs 2010-2017&lt;/h2&gt;
&lt;p&gt;Based on the data collected by the Ministry of Education, Finland paid in total
198.7 million EUR in subscription and other
fees for scientific publishing in 2010-2017. The average annual costs in Finland were 25 MEUR.&lt;/p&gt;
&lt;p&gt;Data for the top-10 publishers in the UK 2010-2014 is available in &lt;a href=&#34;https://olh.openlibhums.org/articles/10.16995/olh.72/#B45&#34;&gt;Lawson, Meghreblian &amp;amp; Brook, 2017&lt;/a&gt; (&lt;a href=&#34;https://olh.openlibhums.org/articles/10.16995/olh.72&#34;&gt;Table 1&lt;/a&gt;). During this period the UK paid altogether 4319 MEUR (rough estimate based on the exchange rate June 12, 2016) for the top-10 publishers. Finland paid 62 MEUR for the same top-10 publishers in 2010-2014. This is 17.1% of the UK expenditure &lt;em&gt;per capita&lt;/em&gt;. It could be that the data is not directly comparable but this will require further investigation.&lt;/p&gt;
&lt;p&gt;Information for Finland is available by agreement type, organization
type, and subscription category.&lt;/p&gt;
&lt;div id=&#34;costs-by-publisher&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Costs by publisher&lt;/h3&gt;
&lt;p&gt;Overall, the Finnish data covers 376 unique publishers. The figure indicates the total subscription fees paid to the top publishers in 2010-2017. One third of the total costs goes to Elsevier, which has often been &lt;a href=&#34;https://gowers.wordpress.com/2014/04/24/elsevier-journals-some-facts/&#34;&gt;criticized&lt;/a&gt; for its huge &lt;a href=&#34;http://journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0127502&#34;&gt;profit margins&lt;/a&gt;. The costs are given per bundle, so we cannot compare average journal prices among individual publishers based on this data.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2018/12/cost-of-academic-publishing-in-finland-2010-2017/index_files/figure-html/foi-totalcosts2b-1.png&#34; width=&#34;960&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The total costs paid to scientific publishers by Finland have increased roughly 10% per year in 2010-2017 (annual increase is indicated in the left figure). The top-10 publishers correspond to 75% of the overall costs (right figure).&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2018/12/cost-of-academic-publishing-in-finland-2010-2017/index_files/figure-html/foi-costbytime-1.png&#34; width=&#34;860px&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Let us compare the relative increase in publisher costs. The costs are normalized to 1 in 2010, and the top 10 publishers with the highest cost increase in 2010-2017 are shown. The 275 publishers that did not have declared costs in 2010 or 2017 are excluded.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2018/12/cost-of-academic-publishing-in-finland-2010-2017/index_files/figure-html/foi-timebypublisher2b-1.png&#34; width=&#34;1344&#34; /&gt;&lt;/p&gt;
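&lt;p&gt;The normalization used in this comparison can be sketched in base R as follows. The publisher names and cost values are toy numbers made up for illustration, not figures from the Finnish data, and the column names are assumptions.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Toy data: yearly costs for two hypothetical publishers (made-up values)
costs &amp;lt;- data.frame(
  publisher = rep(c(&amp;quot;P1&amp;quot;, &amp;quot;P2&amp;quot;), each = 3),
  year = rep(c(2010, 2011, 2012), times = 2),
  cost = c(100, 110, 121, 50, 75, 100)
)

# Baseline cost of each publisher in 2010
baseline &amp;lt;- costs[costs$year == 2010, c(&amp;quot;publisher&amp;quot;, &amp;quot;cost&amp;quot;)]

# Divide every cost by the 2010 cost of the same publisher
costs$relative &amp;lt;- costs$cost / baseline$cost[match(costs$publisher, baseline$publisher)]
costs&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Publishers without a 2010 value would get &lt;code&gt;NA&lt;/code&gt; as their relative cost in this sketch, which is why such publishers have to be excluded from the comparison.&lt;/p&gt;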
&lt;/div&gt;
&lt;div id=&#34;costs-by-organization&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Costs by organization&lt;/h3&gt;
&lt;p&gt;The Finnish data collection includes 82 organizations. The universities (‘yliopisto’) are responsible for 78.9% of all costs (left figure); the University of Helsinki had the highest total costs in 2010-2017 (37.9 MEUR; top institutions shown in the right figure).&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2018/12/cost-of-academic-publishing-in-finland-2010-2017/index_files/figure-html/foi-totalcosts2d-1.png&#34; width=&#34;420px&#34; /&gt;&lt;img src=&#34;http://ropengov.org/2018/12/cost-of-academic-publishing-in-finland-2010-2017/index_files/figure-html/foi-totalcosts2d-2.png&#34; width=&#34;420px&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Finally, let us compare the relative share of costs per
institution. The top organizations with the highest total costs are
shown.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2018/12/cost-of-academic-publishing-in-finland-2010-2017/index_files/figure-html/foi-timebyorganization2c-1.png&#34; width=&#34;1344&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</description> 
  </item>
  
<item>
  <title>Presenting at useR2017 in Brussels</title>
  <link>http://ropengov.org/2017/07/user2017/</link>
  <pubDate>Wed, 05 Jul 2017 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2017/07/user2017/</guid>
  <description>&lt;p&gt;The annual &lt;a href=&#34;https://user2017.brussels/&#34;&gt;useR&lt;/a&gt;-conference is organised this year in Brussels, Belgium. rOpenGov will be presenting on two occasions on Thursday 6th. Leo Lahti will give a general overview of rOpenGov in his lightning talk &lt;a href=&#34;https://github.com/rOpenGov/slides/raw/master/20170706-UseR-Bru/2017-useR-rOpenGov.pdf&#34;&gt;rOpenGov - Community project for open government data&lt;/a&gt; at 17.50 in room 3.02. At 13.30 in the Plenary room Markus Kainu will speak about &lt;a href=&#34;http://software.markuskainu.fi/ropengov/user2017_slides/slides.pdf&#34;&gt;Community-based learning and knowledge sharing - Teaching R withing organisation using edu -package&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/images/ropengov_user2017.jpg&#34; alt=&#34;Last minute preparation with critical comments from Joona Lehtomäki&#34;&gt;&lt;/p&gt;
</description> 
  </item>
  
<item>
  <title>eurostat package published in R Journal</title>
  <link>http://ropengov.org/2017/04/eurostat-preprint/</link>
  <pubDate>Fri, 14 Apr 2017 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2017/04/eurostat-preprint/</guid>
  <description>
&lt;script src=&#34;http://ropengov.org/2017/04/eurostat-preprint/index_files/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;


&lt;p&gt;A formal publication on the &lt;code&gt;eurostat&lt;/code&gt; &lt;a href=&#34;http://cran.r-project.org/&#34;&gt;CRAN&lt;/a&gt; R package has now appeared on-line in &lt;a href=&#34;https://journal.r-project.org/archive/2017/RJ-2017-019/index.html&#34;&gt;R Journal&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Further details on installation and use are provided at the &lt;a href=&#34;http://ropengov.github.io/eurostat&#34;&gt;package homepage&lt;/a&gt;, &lt;a href=&#34;http://ropengov.github.io/eurostat/articles/eurostat_tutorial.html&#34;&gt;tutorial&lt;/a&gt; and &lt;a href=&#34;https://github.com/rstudio/cheatsheets/raw/master/source/pdfs/eurostat_cheatsheet.pdf&#34;&gt;cheat sheet&lt;/a&gt;. &lt;a href=&#34;http://ropengov.github.io/eurostat/articles/blogposts.html&#34;&gt;Blog posts&lt;/a&gt; provide further and more advanced examples on the package use.&lt;/p&gt;
&lt;p&gt;We are also collecting information about publications using the eurostat R package; suggestions are &lt;a href=&#34;http://ropengov.github.io/eurostat/articles/publications.html&#34;&gt;welcome&lt;/a&gt;.&lt;/p&gt;
</description> 
  </item>
  
<item>
  <title>Scientific journal subscription costs in Finland 2010-2015</title>
  <link>http://ropengov.org/2016/06/scientific-journal-subscription-costs-in-finland-2010-2015/</link>
  <pubDate>Fri, 10 Jun 2016 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2016/06/scientific-journal-subscription-costs-in-finland-2010-2015/</guid>
  <description>
&lt;script src=&#34;http://ropengov.org/2016/06/scientific-journal-subscription-costs-in-finland-2010-2015/index_files/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;


&lt;p&gt;&lt;a href=&#34;http://openscience.fi/-/transparency-and-openness-to-scientific-publishing-the-finnish-research-organisations-pay-millions-of-euros-annually-to-the-large-publishers&#34;&gt;Detailed information on journal subscription costs paid to individual
publishers by the Finnish research
institutions&lt;/a&gt; has been released by the Finnish Ministry of Education and Culture, and its Open Science and Research Initiative funded 2014–2017 (&lt;a href=&#34;http://urn.fi/urn:nbn:fi:csc-kata20160609091336769027&#34;&gt;Academic Publishing Costs in Finland 2010–2015&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;With this, &lt;strong&gt;Finland becomes to our knowledge
the first country where annual subscription fees for all individual
publishers and all major research institutions have been made
available&lt;/strong&gt;, spanning the years 2010-2015. Similar information has
been previously released for some, but not all publishers and research
institutions in the UK and US; and related activities are ongoing in
several countries (see the recent &lt;a href=&#34;http://stuartlawson.org/2016/06/publicly-available-data-on-international-journal-subscription-costs&#34;&gt;blog post by Stuart
Lawson&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Access to literature is fundamental to academic research, but this has been challenged by the rapidly increasing prices of academic journals, which university libraries find increasingly difficult to fund. According to the &lt;a href=&#34;http://www.lib.washington.edu/scholpub/facts/economics&#34;&gt;data from the US Association of Research Libraries&lt;/a&gt;, academic journal subscription charges increased 4x faster than inflation in 1986-2007. The relative variation in prices is also considerable among the publishers (&lt;a href=&#34;http://www.econ.ucsb.edu/~tedb/Journals/PNAS-2014-Bergstrom-1403006111.pdf&#34;&gt;Bergstrom et al. PNAS 2014&lt;/a&gt;). Limited access to detailed pricing information and agreement details is likely to result in suboptimal contracts (&lt;a href=&#34;https://olh.openlibhums.org/articles/10.16995/olh.72/#B7&#34;&gt;Cockerill, 2006&lt;/a&gt;; &lt;a href=&#34;https://olh.openlibhums.org/articles/10.16995/olh.72/#B66&#34;&gt;Shieber, 2009&lt;/a&gt;). Improved access to subscription costs can hence be expected to lead to better deals and lower costs for the universities. It can also facilitate the transition to the Open Access (OA) publishing model.&lt;/p&gt;
&lt;p&gt;Motivated by all this, and following &lt;a href=&#34;http://gowers.wordpress.com/2014/04/24/elsevier-journals-some-facts/&#34;&gt;Tim Gowers successful FOI request on
the subscription costs for scientific journals in
UK&lt;/a&gt;,
we made a similar request in Finland together with the &lt;a href=&#34;https://www.facebook.com/groups/241398182642057/permalink/411482855633588&#34;&gt;Open Knowledge
Finland association and other Open Science
advocates&lt;/a&gt;. After
the Finnish universities &lt;a href=&#34;https://github.com/okffi-science/2014-tietopyynto-lisenssimaksut&#34;&gt;turned down my FOI request in summer
2014&lt;/a&gt;,
we appealed to the court, which ruled in our favour in August
2015. For an English summary of this process, see a separate post on the &lt;a href=&#34;http://www.mostlyphysics.net/blog/&#34;&gt;MostlyPhysics&lt;/a&gt; blog.&lt;/p&gt;
&lt;p&gt;Here I provide a brief preliminary analysis of the
&lt;a href=&#34;http://avointiede.fi/ajankohtaista/-/asset_publisher/UJglmibGKmbR/content/lapinakyvyytta-ja-avoimuutta-tieteelliseen-julkaisemiseen-tutkimusorganisaatioilta-vuosittain-miljoonia-euroja-suurille-kustantajille?_101_INSTANCE_UJglmibGKmbR_viewMode=view&#34;&gt;data&lt;/a&gt;
on journal subscription fees that was collected and released by the
Ministry of Education Open Science Initiative in Finland. I have
abbreviated some terms, as detailed in the source code of
this analysis, which is &lt;a href=&#34;https://github.com/rOpenGov/2016-Finland-SubscriptionCosts&#34;&gt;maintained on
GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;div id=&#34;overall-subscription-costs-2010-2015&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Overall subscription costs 2010-2015&lt;/h2&gt;
&lt;p&gt;Finland paid a total of 131.1 million EUR in
subscription and other fees for scientific publishing in 2010-2015. The overall breakdown of the costs is available as a separate &lt;a href=&#34;http://data.okf.fi/ropengov/20160613-FOI/dashboard.html&#34;&gt;table&lt;/a&gt;. The
average annual costs in Finland were 22 MEUR in
2010-2015; this is roughly one third of the annual subscription costs in
Austria (70 MEUR; &lt;a href=&#34;https://olh.openlibhums.org/articles/10.16995/olh.72/#B4&#34;&gt;Bauer et al.,
2015&lt;/a&gt;), and
two thirds of the annual expenditure (31 MEUR) in &lt;a href=&#34;http://publicaddress.net/9549&#34;&gt;New
Zealand&lt;/a&gt;. Data for the top-10
publishers in the UK 2010-2014 is available in &lt;a href=&#34;https://olh.openlibhums.org/articles/10.16995/olh.72/#B45&#34;&gt;Lawson, Meghreblian &amp;amp;
Brook,
2015&lt;/a&gt;
(&lt;a href=&#34;https://olh.openlibhums.org/articles/10.16995/olh.72&#34;&gt;Table
1&lt;/a&gt;). During this
period the UK paid altogether 4319 MEUR (rough estimate based on the
exchange rate June 12, 2016) for the top-10 publishers. Finland paid
61 MEUR for the same top-10 publishers in the same period, which is
roughly 17% of the UK expenditure per capita (unexpectedly low?). The costs
in the other countries seem unexpectedly high compared to Finland, and I
wonder what the explanation is. Either Finland can negotiate
considerably better deals, or, perhaps more likely, the figures from
the different countries are not directly comparable; this will
require further investigation.&lt;/p&gt;
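&lt;p&gt;The per capita comparison can be retraced with a few lines of R. This is only a rough sketch: the cost figures are the ones quoted above, whereas the population sizes (about 5.5 million for Finland and 65 million for the UK) are approximate values assumed here for illustration and are not part of the data set.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Payments to the top-10 publishers in 2010-2014 (MEUR), as quoted in the text
cost_meur &amp;lt;- c(Finland = 61, UK = 4319)

# Approximate population sizes in millions (assumed, not from the data set)
population_million &amp;lt;- c(Finland = 5.5, UK = 65)

# MEUR divided by millions of inhabitants gives EUR per person
per_capita &amp;lt;- cost_meur / population_million
round(per_capita, 1)

# Finnish per capita cost as a percentage of the UK per capita cost
ratio &amp;lt;- unname(per_capita[&amp;quot;Finland&amp;quot;] / per_capita[&amp;quot;UK&amp;quot;])
round(100 * ratio)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With these assumed populations the ratio comes out at roughly 17%, in line with the figure given above.&lt;/p&gt;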
&lt;p&gt;The data set covers all Finnish universities, major public
institutions, and a number of special libraries, information services
and other smaller institutions. Open access article processing charges
(APCs) are not included in this data collection as far as I can see,
although they would be interesting in their own right. It was recently
reported by &lt;a href=&#34;https://avointiede.fi/documents/10864/12232/Avoimen+julkaisemisen+tuen+malli/73838e9b-7924-446c-9c7a-cc8f759919bb&#34;&gt;Naukkarinen
(2016)&lt;/a&gt;
that in 2014 &lt;a href=&#34;https://l.facebook.com/l.php?u=https%3A%2F%2Favointiede.fi%2Fdocuments%2F10864%2F12232%2FAvoimen%2Bjulkaisemisen%2Btuen%2Bmalli%2F73838e9b-7924-446c-9c7a-cc8f759919bb&amp;amp;h=-AQFlKomT&#34;&gt;18% of the articles in Finnish universities were
published as open
access&lt;/a&gt;. It
was also estimated that publishing all articles as Open Access would
have cost 17 MEUR, whereas the subscription fees in 2014 were 22
MEUR. This suggests that the transition to the Open Access model might
be a good idea. Overall, there are roughly 35 000 peer-reviewed
academic journals globally (&lt;a href=&#34;https://olh.openlibhums.org/articles/10.16995/olh.72/#B71&#34;&gt;Ware &amp;amp; Mabe,
2015&lt;/a&gt;);
less than a third of these (11 000) are open access (&lt;a href=&#34;https://olh.openlibhums.org/articles/10.16995/olh.72/#B12&#34;&gt;DOAJ,
2016&lt;/a&gt;). A
common complaint regarding the (gold) OA model is the relatively
high cost of the article processing charges, typically paid from
primary research funding. With this funding model, money going to OA is
frequently seen as being taken away from other research activities,
such as experiments or research personnel. In contrast, publication
costs in the conventional subscription model are funded through
university libraries under non-disclosure agreements with the
publishers and are hence largely hidden from the research community. In
order to assess the true costs of scientific publishing and
to facilitate the shift to OA, the costs of the subscription model must be
made more transparent and money used for conventional subscriptions
must be diverted to funding the costs of OA publishing.&lt;/p&gt;
&lt;p&gt;The full data set also breaks down the subscription fees by agreement
type, organization type, and &lt;a href=&#34;http://data.okf.fi/ropengov/20160613-FOI/dashboard.html&#34;&gt;subscription
category&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;costs-by-publisher&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Costs by publisher&lt;/h2&gt;
&lt;p&gt;The Finnish data covers 244 individual publishers (&lt;a href=&#34;http://data.okf.fi/ropengov/20160613-FOI/dashboard.html&#34;&gt;see annual costs by publisher&lt;/a&gt;). The figure below shows the total subscription fees paid to the top publishers in 2010-2015. Over one third of the total costs go to Elsevier, which has often been &lt;a href=&#34;https://gowers.wordpress.com/2014/04/24/elsevier-journals-some-facts/&#34;&gt;criticized&lt;/a&gt; for its huge &lt;a href=&#34;http://journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0127502&#34;&gt;profit margins&lt;/a&gt;. The costs are given per bundle, so we cannot compare individual publishers on a per article or per citation basis based on this data set. It was recently estimated elsewhere, however, that Elsevier’s prices per citation are roughly 3x higher than those of non-profit publishers; Emerald, Sage, and Taylor &amp;amp; Francis had roughly 10x higher prices (&lt;a href=&#34;http://www.econ.ucsb.edu/~tedb/Journals/PNAS-2014-Bergstrom-1403006111.pdf&#34;&gt;Bergstrom et al. PNAS 2014&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2016/06/scientific-journal-subscription-costs-in-finland-2010-2015/index_files/figure-html/foi-totalcosts2b-1.png&#34; width=&#34;960&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The total costs paid to scientific publishers by Finland increased by roughly 10% per year in 2010-2015 (the annual increase is indicated in the left figure). The top-10 publishers account for 77% of the overall costs (right figure). See a separate table for the full &lt;a href=&#34;http://data.okf.fi/ropengov/20160613-FOI/dashboard.html&#34;&gt;annual costs by publisher&lt;/a&gt; (the top publishers are shown below).&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2016/06/scientific-journal-subscription-costs-in-finland-2010-2015/index_files/figure-html/foi-costbytime-1.png&#34; width=&#34;870px&#34; /&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;Publisher (costs in MEUR)&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;2010&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;2011&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;2012&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;2013&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;2014&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;2015&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;Total&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;%&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Total&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;17.30&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;19.01&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;21.33&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;22.13&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;24.31&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;27.03&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;131.10&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;100.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Elsevier&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;6.41&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;6.84&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;7.38&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;7.65&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;8.10&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;8.58&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;44.96&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;34.30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Wiley&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.66&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.06&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.29&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.43&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.57&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.65&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;13.67&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;10.43&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Ebsco&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.81&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.76&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.97&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.09&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.70&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;10.33&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;7.88&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Springer&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.30&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.36&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.41&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.45&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.48&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.50&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;8.49&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;6.48&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;ProQuest&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.75&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.83&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.97&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.98&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.01&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.24&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;5.77&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;4.40&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Thomson Reuters&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.49&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.54&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.67&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.73&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.77&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.94&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;4.13&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.15&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;SAGE Publications&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.50&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.55&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.65&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.66&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.71&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.86&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.92&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.99&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;American Chemical Society (ACS)&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.51&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.53&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.59&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.62&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.63&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.78&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.66&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.79&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Nature Publishing Group&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.38&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.42&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.58&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.64&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.67&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.62&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.30&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.52&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
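&lt;p&gt;The growth figure can be checked against the Total row of the table above. The following sketch, which is not part of the original analysis code, computes the year-on-year changes and the compound annual growth rate from the rounded table values.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Annual totals (MEUR) copied from the Total row of the table
total &amp;lt;- setNames(c(17.30, 19.01, 21.33, 22.13, 24.31, 27.03), 2010:2015)

# Year-on-year change in percent (2011 vs 2010, ..., 2015 vs 2014)
yoy &amp;lt;- 100 * (total[-1] / total[-length(total)] - 1)
round(yoy, 1)

# Compound annual growth rate over the five year-to-year steps
n_steps &amp;lt;- length(total) - 1
cagr &amp;lt;- unname((total[&amp;quot;2015&amp;quot;] / total[&amp;quot;2010&amp;quot;])^(1 / n_steps) - 1)
round(100 * cagr, 1)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The compound rate comes out at a little over 9% per year, consistent with the rough 10% quoted above.&lt;/p&gt;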
&lt;p&gt;Let us compare the &lt;a href=&#34;http://data.okf.fi/ropengov/20160613-FOI/dashboard.html&#34;&gt;relative increase in publisher costs&lt;/a&gt;. The costs are normalized to 1 in 2010, and the top 10 publishers with the highest cost increase in 2010-2015 are shown. The 120 publishers that did not have declared costs in 2010 or 2015 (see &lt;a href=&#34;http://data.okf.fi/ropengov/20160613-FOI/dashboard.html&#34;&gt;here&lt;/a&gt;) are excluded.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2016/06/scientific-journal-subscription-costs-in-finland-2010-2015/index_files/figure-html/foi-timebypublisher2b-1.png&#34; width=&#34;1344&#34; /&gt;&lt;/p&gt;
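&lt;p&gt;The normalization used in the figure amounts to dividing each publisher’s annual costs by its 2010 value. A small sketch of the idea follows; it is not the original analysis code and uses only three publishers from the table above for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Annual costs (MEUR) for three publishers, copied from the table
costs &amp;lt;- rbind(
  Elsevier = c(6.41, 6.84, 7.38, 7.65, 8.10, 8.58),
  Wiley    = c(1.66, 2.06, 2.29, 2.43, 2.57, 2.65),
  Ebsco    = c(0.81, 1.76, 1.97, 2.09, 2.00, 1.70)
)
colnames(costs) &amp;lt;- 2010:2015

# Divide every row by its 2010 value so that each series starts at 1
normalized &amp;lt;- costs / costs[, &amp;quot;2010&amp;quot;]
round(normalized, 2)

# Rank the publishers by their relative increase from 2010 to 2015
ranking &amp;lt;- sort(normalized[, &amp;quot;2015&amp;quot;], decreasing = TRUE)
round(ranking, 2)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Among these three, Ebsco shows the largest relative increase: its 2015 costs are roughly twice its 2010 costs.&lt;/p&gt;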
&lt;/div&gt;
&lt;div id=&#34;costs-by-organization&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Costs by organization&lt;/h2&gt;
&lt;p&gt;The Finnish data collection includes 63 organizations (&lt;a href=&#34;http://data.okf.fi/ropengov/20160613-FOI/dashboard.html&#34;&gt;see annual costs by organization&lt;/a&gt;). The universities (‘yliopisto’) are responsible for 79% of all costs (left figure); the University of Helsinki had the highest total costs in 2010-2015 (24.4 MEUR; top institutions shown in the right figure). The table shows the annual costs for the top organizations.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;Organization (costs in MEUR)&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;2010&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;2011&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;2012&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;2013&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;2014&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;2015&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;Total&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;%&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Total&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;17.30&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;19.01&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;21.33&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;22.13&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;24.31&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;27.03&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;131.10&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;100.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;University of Helsinki&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.32&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.52&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.91&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;4.12&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;4.46&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;5.05&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;24.39&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;18.60&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Aalto University&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.25&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.40&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.67&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.71&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.88&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.07&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;15.98&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;12.19&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;University of Turku&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.50&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.60&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.76&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.79&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.04&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.35&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;11.04&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;8.42&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;University of Oulu&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.49&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.57&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.74&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.83&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.03&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.17&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;10.84&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;8.26&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;University of Eastern Finland&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.90&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.35&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.43&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.53&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.69&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.82&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;8.73&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;6.66&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;University of Jyväskylä&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.07&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.14&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.27&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.31&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.57&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.73&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;8.08&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;6.16&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;University of Tampere&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.91&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.09&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.14&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.28&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.41&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;6.82&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;5.20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Tampere University of Technology&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.90&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.05&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.17&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.19&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.21&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.30&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;6.82&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;5.20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Åbo Akademi University&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.75&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.78&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.79&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.78&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.89&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.97&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;4.96&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.78&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2016/06/scientific-journal-subscription-costs-in-finland-2010-2015/index_files/figure-html/foi-totalcosts2d-1.png&#34; width=&#34;420px&#34; /&gt;&lt;img src=&#34;http://ropengov.org/2016/06/scientific-journal-subscription-costs-in-finland-2010-2015/index_files/figure-html/foi-totalcosts2d-2.png&#34; width=&#34;420px&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Finally, let us compare the &lt;a href=&#34;http://data.okf.fi/ropengov/20160613-FOI/dashboard.html&#34;&gt;relative increase in costs across the institutions&lt;/a&gt;. The costs are normalized to 1 in 2010, and the top 10 organizations with the highest cost increase in 2010-2015 are shown. The 12 organizations that did not have declared costs in 2010 or 2015 (see &lt;a href=&#34;http://data.okf.fi/ropengov/20160613-FOI/dashboard.html&#34;&gt;here&lt;/a&gt;) are excluded.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2016/06/scientific-journal-subscription-costs-in-finland-2010-2015/index_files/figure-html/foi-timebyorganization2b-1.png&#34; width=&#34;1344&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
</description> 
  </item>
  
<item>
  <title>R made personal (at least for swedes)!</title>
  <link>http://ropengov.org/2015/08/r-made-personal-at-least-for-swedes/</link>
  <pubDate>Thu, 20 Aug 2015 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2015/08/r-made-personal-at-least-for-swedes/</guid>
  <description>


&lt;div id=&#34;background&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Background&lt;/h3&gt;
&lt;p&gt;– Who are you? asked Mr Doe.&lt;/p&gt;
&lt;p&gt;– I’m a Hindu! Namrata from India replied.&lt;/p&gt;
&lt;p&gt;– I’m a statistician! said Günther from Germany.&lt;/p&gt;
&lt;p&gt;People of different nationalities tend to identify themselves using different characteristics. In India, your identity might rely on your religion, while in other countries your profession might take its place. In Sweden, you might identify yourself with your almost-world-known (!?) personal identification number (“pin”). This 10-digit number is given to you almost immediately after birth and it often stays with you until your very last breath. The number is similar to a “social security number” but it has a much broader use and it is considered public. It is used in public registers (for education, work, tax payment, healthcare, car ownership etc.) and it often serves as a membership number or customer ID within companies and member unions. It is also essential, for example, in the public health and quality registers maintained in Sweden (and other Scandinavian countries) and used for research.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;motivation&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Motivation&lt;/h3&gt;
&lt;p&gt;Naturally, the “pin” is used extensively to distinguish individuals in data sets analysed by R. The number also helps to match data from different sources and it can bring some demographic background data into the bargain, such as birth date (age), sex and geographic origin (depending on your birth year).&lt;/p&gt;
&lt;p&gt;Up until now, however, in the absence of a consistent R convention for handling “pins”, the number might be treated as a 10- or 12-digit numeric (with or without century prefix), a character string (with a hyphen or a ‘+’ sign to separate the birth date from the suffix digits) or a factor variable. But the pin is not a number (adding, subtracting or taking the logarithm of pins is just nonsense) and it contains more information than is captured by the individual characters in a string. Luckily, the new R package &lt;code&gt;sweidnumbr&lt;/code&gt; (released on CRAN) is here to the rescue!&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;example&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Example&lt;/h3&gt;
&lt;p&gt;Let’s look at some data (all pins are fake; they have a valid syntax but do not identify any real individuals):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(sweidnumbr)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## sweidnumbr: R tools to handle swedish identity numbers.
## https://github.com/rOpenGov/sweidnumbr&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;knitr::kable(tail(fake_pins,10))&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;pin&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;53&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;19471130-3022&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;TWIST, LIS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;54&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;19440311-1131&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;NOBLESSE, RAGNAR JOHN&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;55&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;20000805-0523&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;NILSSON, CHOK&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;56&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;19240622-2286&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;CADBURY, LOVISA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;57&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;19020517-1798&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;PLOPP, AUGUST&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;58&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;20050111-1123&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;MINT, MARIA ADA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;59&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;19370215-1590&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;NILSSON, BARRY&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;60&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;19970430-3023&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;BERG, ANTO&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;61&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;20031010-1023&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;CENTER, PALL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;62&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;20010218-1823&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;CACAO, EDA&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;So far, pin is just a standard character vector but let’s change that to benefit from all of &lt;code&gt;sweidnumbr&lt;/code&gt;’s features:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;pin &amp;lt;- as.pin(fake_pins$pin)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Assumption: Pin of format YYMMDDNNNC is assumed to be less than 100 years old&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;str(pin)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##  &amp;#39;AsIs&amp;#39; chr [1:62] &amp;quot;191212121212&amp;quot; &amp;quot;201212121212&amp;quot; &amp;quot;191212121212&amp;quot; ...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can now also investigate some demographic characteristics almost on the fly (note that pins contained geographical information only up to 1989):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;par(mfrow = c(1,2))
hist(pin_age(pin), 20, col = &amp;quot;lightgreen&amp;quot;, main = &amp;quot;Age distribution&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## The age has been calculated at 2021-01-29.&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;pie(table(pin_sex(pin)), main = &amp;quot;Sex distribution&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;http://ropengov.org/2015/08/r-made-personal-at-least-for-swedes/index_files/figure-html/unnamed-chunk-3-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;pin_birthplace(pin[1:8])&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] Stockholms län              Born after 31 december 1989
## [3] Stockholms län              Born after 31 december 1989
## [5] Born after 31 december 1989 Stockholm stad             
## [7] Stockholms län              Born after 31 december 1989
## 28 Levels: Stockholm stad Stockholms län Uppsala län ... Born after 31 december 1989&lt;/code&gt;&lt;/pre&gt;
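&lt;p&gt;To give an idea of where this information comes from, here is a minimal base R sketch that reads the birth date and sex directly from a 12-digit pin. It is not how &lt;code&gt;sweidnumbr&lt;/code&gt; is implemented; it only illustrates the encoding, in which the first eight digits are the birth date and the second-to-last digit is odd for men and even for women. The variable names are made up for this illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# A fake pin in the 12-digit format YYYYMMDDNNNC
pin12 &amp;lt;- &amp;quot;191212121212&amp;quot;

# The first eight digits hold the birth date
birth_date &amp;lt;- as.Date(substr(pin12, 1, 8), format = &amp;quot;%Y%m%d&amp;quot;)

# The second-to-last digit encodes sex: odd = male, even = female
sex_digit &amp;lt;- as.integer(substr(pin12, 11, 11))
sex &amp;lt;- if (sex_digit %% 2 == 1) &amp;quot;Male&amp;quot; else &amp;quot;Female&amp;quot;

birth_date
sex&lt;/code&gt;&lt;/pre&gt;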
&lt;/div&gt;
&lt;div id=&#34;formats&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Formats&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;as.pin&lt;/code&gt; can recognize pins in several different formats such as:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;as.pin(c(&amp;quot;191212121212&amp;quot;, &amp;quot;1212121212&amp;quot;, &amp;quot;121212-1212&amp;quot;, &amp;quot;121212+1212&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Assumption: Pin of format YYMMDDNNNC is assumed to be less than 100 years old&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;191212121212&amp;quot; &amp;quot;201212121212&amp;quot; &amp;quot;201212121212&amp;quot; &amp;quot;191212121212&amp;quot;
## Personal identity number(s)&lt;/code&gt;&lt;/pre&gt;
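&lt;p&gt;The century handling seen in this output can be sketched in base R. This is an illustrative simplification and not the package’s implementation: the helper name &lt;code&gt;expand_century&lt;/code&gt; and its &lt;code&gt;reference_year&lt;/code&gt; argument are hypothetical, only pins with a separator are handled, and the comparison is made on the year alone.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Expand &amp;quot;YYMMDD-NNNC&amp;quot; or &amp;quot;YYMMDD+NNNC&amp;quot; to the 12-digit format.
# expand_century() and reference_year are hypothetical names for this sketch.
expand_century &amp;lt;- function(pin10, reference_year) {
  yy &amp;lt;- as.integer(substr(pin10, 1, 2))
  separator &amp;lt;- substr(pin10, 7, 7)

  # Start from the century of the reference year
  year &amp;lt;- (reference_year %/% 100) * 100 + yy

  # A birth year cannot lie in the future, so step back one century
  if (year &amp;gt; reference_year) year &amp;lt;- year - 100

  # A &amp;#39;+&amp;#39; separator marks a person aged 100 or more
  if (separator == &amp;quot;+&amp;quot;) year &amp;lt;- year - 100

  # Century-expanded year, then MMDD, then the four suffix digits
  paste0(year, substr(pin10, 3, 6), substr(pin10, 8, 11))
}

expand_century(&amp;quot;121212-1212&amp;quot;, reference_year = 2021)
expand_century(&amp;quot;121212+1212&amp;quot;, reference_year = 2021)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With 2021 as the reference year, the two calls return &lt;code&gt;201212121212&lt;/code&gt; and &lt;code&gt;191212121212&lt;/code&gt;, the same values as in the output above.&lt;/p&gt;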
&lt;p&gt;It also checks that the numbers follow the correct pin syntax:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;as.pin(&amp;quot;181212121212&amp;quot;) # Pins were introduced in 1946 and only for people not deceased before that&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning in as.pin.character(&amp;quot;181212121212&amp;quot;): Erroneous pin(s) (set to NA).&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] NA
## Personal identity number(s)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;pin_ctrl(&amp;quot;191212121211&amp;quot;) # The last digit is a control number that is checked against preceding digits&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] FALSE&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;luhn_algo(&amp;quot;191212121211&amp;quot;) # The correct control number can be calculated by the Luhn algorithm&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## &amp;#39;multiplier&amp;#39; set to: c(0, 0, 2, 1, 2, 1, 2, 1, 2, 1, 2, 0)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 2&lt;/code&gt;&lt;/pre&gt;
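&lt;p&gt;For illustration, the control digit can also be computed by hand in base R. The sketch below is a minimal reimplementation of the Luhn algorithm for the nine digits YYMMDDNNN, not the package’s own code; the function name &lt;code&gt;luhn_check_digit&lt;/code&gt; is hypothetical. The alternating multipliers 2, 1, 2, … correspond to the non-zero entries of the multiplier vector printed above.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Compute the Luhn control digit for the nine digits YYMMDDNNN.
# luhn_check_digit() is a hypothetical helper for this sketch.
luhn_check_digit &amp;lt;- function(nine_digits) {
  digits &amp;lt;- as.integer(strsplit(nine_digits, &amp;quot;&amp;quot;)[[1]])
  stopifnot(length(digits) == 9, !anyNA(digits))

  # Multiply the digits alternately by 2 and 1
  products &amp;lt;- digits * c(2, 1, 2, 1, 2, 1, 2, 1, 2)

  # Replace two-digit products by their digit sum (14 becomes 1 + 4 = 5),
  # which is the same as subtracting 9
  products &amp;lt;- ifelse(products &amp;gt; 9, products - 9, products)

  # The control digit brings the total up to a multiple of 10
  (10 - sum(products) %% 10) %% 10
}

luhn_check_digit(&amp;quot;121212121&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For the digits &lt;code&gt;121212121&lt;/code&gt; this gives 2, matching the &lt;code&gt;luhn_algo&lt;/code&gt; result above.&lt;/p&gt;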
&lt;/div&gt;
&lt;div id=&#34;organisational-numbers&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Organisational numbers&lt;/h3&gt;
&lt;p&gt;Not only individuals have identification numbers; so do companies and NGOs. These organisational numbers are handled by the &lt;em&gt;oin&lt;/em&gt; group of
functions in the package. Feel free to try them out …&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;other-countries&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Other countries&lt;/h3&gt;
&lt;p&gt;An analogous conversion function is available for Finnish social security numbers in the &lt;a href=&#34;https://github.com/rOpenGov/sorvi/blob/master/vignettes/sorvi_tutorial.md&#34;&gt;sorvi&lt;/a&gt; package.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;keep-in-touch&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Keep in touch!&lt;/h3&gt;
&lt;p&gt;… and feel free to suggest enhancements and report bugs to &lt;a href=&#34;https://github.com/rOpenGov/sweidnumbr/issues&#34; class=&#34;uri&#34;&gt;https://github.com/rOpenGov/sweidnumbr/issues&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
</description> 
  </item>
  
<item>
  <title>Digital humanities with R</title>
  <link>http://ropengov.org/2015/06/digital-humanities-with-r/</link>
  <pubDate>Fri, 12 Jun 2015 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2015/06/digital-humanities-with-r/</guid>
  <description>


&lt;p&gt;Digital humanities is one of the fields where reproducible research is
now becoming increasingly popular. We had the opportunity to highlight the
latest advances in this field at the Paris &lt;a href=&#34;http://dhdhi.hypotheses.org/2428&#34;&gt;Digital Humanities
event&lt;/a&gt; on June 12 at the &lt;a href=&#34;http://www.dhi-paris.fr/&#34;&gt;Deutsches
Historisches Institut Paris&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The ‘digital humanities lab’ is an interactive experiment where the
audience actively participates in designing the analysis of massive
library catalogues. In particular, we focused on publishing activity
in the early modern period by mining the &lt;a href=&#34;http://estc.bl.uk/F/?func=file&amp;amp;file_name=login-bl-estc&#34;&gt;British Library ESTC
catalogue
1470-1800&lt;/a&gt;. To
facilitate this interactive session, we prepared reproducible
presentation slides with
&lt;a href=&#34;http://rmarkdown.rstudio.com/&#34;&gt;R Markdown&lt;/a&gt;. The slides will be completed in the
workshop, where we carry out the data analysis on the fly together with the
audience. To reproduce the preliminary slides (we will complete them during the
workshop!), clone the &lt;a href=&#34;https://github.com/rOpenGov/slides&#34;&gt;rOpenGov slide
repository&lt;/a&gt; and run the following
commands in R:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(rmarkdown)
# Render the slides; run this from the directory that contains the cloned slides repository
render(&amp;quot;slides/20150612-Paris/20150611-Paris.Rmd&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Unfortunately the ESTC data itself is not public, so we could not
share it. The slides and source code are fully reproducible, however,
so you can modify the template for your own purposes and check out our
summaries of the ESTC data collection at our &lt;a href=&#34;https://github.com/rOpenGov/estc&#34;&gt;estc
site&lt;/a&gt;. You may also like to read the
related &lt;a href=&#34;http://douglasduhaime.com/blog/mapping-the-early-english-book-trade&#34;&gt;excellent blog
post&lt;/a&gt;
by Douglas Duhaime on the same data set.&lt;/p&gt;
&lt;p&gt;We used a combination of &lt;a href=&#34;http://www.rstudio.com/&#34;&gt;RStudio&lt;/a&gt; and the
&lt;a href=&#34;http://github.com/rOpenGov/estc&#34;&gt;estc&lt;/a&gt; and
&lt;a href=&#34;http://github.com/rOpenGov/bibliographica&#34;&gt;bibliographica&lt;/a&gt; R packages
that are designed for bibliographic data analysis. Combined with the
vast analytical capabilities of the R statistical ecosystem, these
custom tools for digital humanities provide a rapid development
toolkit for reproducible research of historical document collections.&lt;/p&gt;
</description> 
  </item>
  
<item>
  <title>A hierarchical model of Finnish apartment prices</title>
  <link>http://ropengov.org/2015/06/regional-trends-in-finnish-apartment-prices/</link>
  <pubDate>Thu, 11 Jun 2015 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2015/06/regional-trends-in-finnish-apartment-prices/</guid>
  <description>
&lt;h2 id=&#34;probabilistic-programming-approach-for-regional-trends-in-apartment-prices&#34;&gt;Probabilistic programming approach for regional trends in apartment prices&lt;/h2&gt;
&lt;p&gt;Based on open data from &lt;a href=&#34;http://www.stat.fi/index_en.html&#34;&gt;Statistics Finland&lt;/a&gt;, we at &lt;a href=&#34;http://reaktor.com/datascience&#34;&gt;Reaktor&lt;/a&gt; modelled Finnish apartment prices and their trends at the zip code level for the years 2005–2014. Estimates from the model are available as an &lt;a href=&#34;http://kannattaakokauppa.fi/#/en/&#34;&gt;interactive visualization&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;why-do-statistical-modeling&#34;&gt;Why do (statistical) modeling?&lt;/h2&gt;
&lt;p&gt;The original price data consists of local (geometric) mean sales prices per year. The number of sales is available as well. If there are fewer than six sales, the mean price is censored.&lt;/p&gt;
&lt;p&gt;Partly missing data and noise from the low number of transactions make it hard to evaluate local price levels, let alone their changes, except in the most urban areas.&lt;/p&gt;
&lt;p&gt;Yearly numbers of transactions for a few random zip codes are depicted on the left below. Censored slots are shown in red. On the right, all year-zip slots are ordered on the x-axis by the number of sales behind them. 17.5% of the slots are censored, and about half of the mean prices are either missing or based on fewer than 30 transactions.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://raw.githubusercontent.com/reaktor/Neliohinnat/master/figs/harvuus-en.png&#34; alt=&#34;Data are sparse&#34;&gt;&lt;/p&gt;
&lt;p&gt;A mean price based on 6&amp;ndash;30 sales &lt;em&gt;is not a reliable estimate of the local mean&lt;/em&gt;, and deriving trends from so few sales is not going to be successful. (Still, it is repeatedly tried: several top and bottom lists of apartment prices and their development have been published in the Finnish media lately. The media estimates are based on this raw data.)&lt;/p&gt;
&lt;p&gt;A statistical model has a concept of a &lt;em&gt;price level behind individual sales&lt;/em&gt;.
It can then make a distinction between systematic variation of the underlying price level over time and place, and &lt;em&gt;random&lt;/em&gt; variation that is not explainable within the model. This is in contrast to looking at raw data without a model; then all variation is taken at face value.&lt;/p&gt;
&lt;p&gt;When the model is &lt;em&gt;estimated&lt;/em&gt;, it produces the underlying price level as its output. Of course, because the model cannot explain all variation in the data, the price level estimates will also have a random component: instead of a fixed value, we get a probability distribution. Means, trends, confidence intervals, etc., can be computed from these &lt;em&gt;posterior distributions&lt;/em&gt;. Provided the model is sensible, these estimates of underlying price trends are more informative than the raw data. The uncertainty of the estimates also reveals when there is not enough data to draw any conclusions.&lt;/p&gt;
&lt;p&gt;Some properties of zip code areas, like population density, will correlate strongly with apartment prices. Such properties, if known and included in the model, allow us to &lt;em&gt;generalize&lt;/em&gt; over zip codes: estimates of the price level become available even for places where the data is sparse or there is no data at all. Of course, uncertainty will then be higher, and the model will tell us that.&lt;/p&gt;
&lt;p&gt;Adjacency of the areas, or their hierarchy, can be used in a similar way to allow generalization between nearby areas. Once topography or hierarchy is parameterized into the model, it can see correlations between areas that are close to each other, or within the same larger area in the hierarchy. It will then generalize over areas, which helps especially when an area has few observations of its own.&lt;/p&gt;
&lt;p&gt;Note that, as with demographic covariates, no strict assumptions about geographic dependency are coded into the model. Rather, including these auxiliary parts allows the model to use dependency where it exists.&lt;/p&gt;
&lt;p&gt;Below, the map on the left shows raw mean prices over the whole period 2005&amp;ndash;2014. White areas are without any available data. The map on the right shows the (mean) price level estimated from the model.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://raw.githubusercontent.com/reaktor/Neliohinnat/master/figs/raw-vs-model-en.png&#34; alt=&#34;Mean prices and model estimates from Espoo&#34;&gt;&lt;/p&gt;
&lt;p&gt;Below, yearly mean prices and estimates of the underlying price level are depicted for some zip codes in Espoo, part of the capital area of Finland. Shading around the lines indicates the uncertainty of the estimates. Even within this relatively urban region, estimates from some areas are quite noisy: 02150 or Otaniemi, 02240 or Friisilä, 02330 or Kattilalaakso, etc. Some areas have no sales at all. (But they may not have apartments either. The model does not know whether apartments exist.)&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://raw.githubusercontent.com/reaktor/Neliohinnat/master/figs/espoota-en.png&#34; alt=&#34;Espoo curves&#34;&gt;&lt;/p&gt;
&lt;p&gt;The model can be used for forecasting, but future prices or trends will have large uncertainty, even larger than indicated by the model. The current model has a quadratic shape for the temporal dependency. It was chosen to fit the data of the last decade and to give an idea of past price development that is easy to summarise. There is no reason why future changes in the economy and policy would follow the same pattern. Although the relative development of areas is more accurately predicted than absolute price levels or trends, &lt;em&gt;the model is at its best at describing the past development of apartment prices, especially their spatial differences. There is no guarantee that the future will follow the same pattern&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The model, data, and environment are described in more detail below.&lt;/p&gt;
&lt;h2 id=&#34;environment-and-data&#34;&gt;Environment and data&lt;/h2&gt;
&lt;p&gt;We used &lt;a href=&#34;http://www.r-project.org&#34;&gt;R&lt;/a&gt; for almost all data manipulation, modeling and visualizations. The model itself was estimated with &lt;a href=&#34;http://mc-stan.org&#34;&gt;Stan&lt;/a&gt;. Source code for the project, except for the web site, is available in our &lt;a href=&#34;https://github.com/reaktor/Neliohinnat&#34;&gt;GitHub repo&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The libraries &lt;a href=&#34;http://cran.r-project.org/web/packages/pxweb/index.html&#34;&gt;pxweb&lt;/a&gt; and &lt;a href=&#34;https://github.com/ropengov/gisfin&#34;&gt;gisfin&lt;/a&gt; make it easy to get &lt;a href=&#34;http://www.stat.fi/til/ashi/index.html&#34;&gt;apartment prices&lt;/a&gt; and other data from the public APIs. The libraries are developed in the &lt;a href=&#34;http://louhos.github.io/&#34;&gt;Louhos&lt;/a&gt; and &lt;a href=&#34;http://ropengov.github.io/&#34;&gt;rOpenGov&lt;/a&gt; projects. Zip code areas are from &lt;a href=&#34;http://www.palomaki.info/apps/pnro/&#34;&gt;Duukkis&lt;/a&gt;. Scripts used for downloading and manipulating the data are available in our &lt;a href=&#34;https://github.com/reaktor/Neliohinnat&#34;&gt;repo&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;map&#34;&gt;Map&lt;/h2&gt;
&lt;p&gt;Zip code polygons are available at least through the &lt;a href=&#34;http://www.stat.fi/tup/rajapintapalvelut/paavo.html&#34;&gt;Paavo API&lt;/a&gt; and from &lt;a href=&#34;http://www.palomaki.info/apps/pnro/&#34;&gt;Duukkis&lt;/a&gt;. Finland&amp;rsquo;s archipelago is extensive and complex, so zip codes extend well onto the Baltic Sea. Paavo offers two versions of the polygons, with or without the sea area. The zip code areas with sea can be intersected with the shoreline, giving us quite a beautiful Finnish zip code map. The problem is its size: over 20MB in the GeoJSON format. We ended up using the polygons from Duukkis: they are a good compromise between size and accuracy.&lt;/p&gt;
&lt;p&gt;The set of zip codes varied a little from source to source, so some zip code areas may be missing from the visualizations. A few small areas have no population, or the information about population is missing. These polygons are without a price estimate and appear grey on the maps.&lt;/p&gt;
&lt;p&gt;Note that the model produces price and trend estimates even for areas with no apartments: just the relative position of the zip code and the local population density are enough for computing the estimate.&lt;/p&gt;
&lt;h2 id=&#34;model&#34;&gt;Model&lt;/h2&gt;
&lt;p&gt;For past sales, the model uses the yearly (geometric) mean price per location and the associated number of sales, unless these are censored ($n&amp;lt;6$). The number of sales scales the variance of the mean as an estimator of the population mean. The population mean is here the hypothetical mean of all potential &lt;em&gt;apartment sales&lt;/em&gt; in the area. Of course, not all apartments are sold at the same rate, so the mean is biased towards the prices of apartments that are sold more often.&lt;/p&gt;
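&lt;p&gt;As a rough illustration of this scaling, and assuming independent sales with a common within-slot standard deviation $\sigma_w$ (the notation of the model equations below), the standard error of a mean of $n$ sales is&lt;/p&gt;
&lt;p&gt;$$
\mathrm{se}(n) = \frac{\sigma_w}{\sqrt{n}} ,
\qquad
\frac{\mathrm{se}(6)}{\mathrm{se}(30)} = \sqrt{\frac{30}{6}} = \sqrt{5} \approx 2.24 ,
$$&lt;/p&gt;
&lt;p&gt;so under these assumptions a mean based on 6 sales is a bit more than twice as noisy as one based on 30 sales.&lt;/p&gt;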
&lt;p&gt;Sparseness of the data is a problem especially for estimates of temporal price changes, and also for comparison of areas. Predictive covariates for the zip code areas are therefore valuable. A quite extensive set of demographic variables is available in the &lt;a href=&#34;http://www.stat.fi/tup/rajapintapalvelut/paavo.html&#34;&gt;Paavo data&lt;/a&gt;, but so far the model includes only population density. It is probably the most predictive of the covariates, although not necessarily causal from the economics point of view.&lt;/p&gt;
&lt;p&gt;Spatial structure is included as a zip code prefix hierarchy. For example, 02940 is within the Uusimaa district (0), the city of Espoo (02), and northern Espoo (029). The hierarchy allows the model to see similarity within these and other equivalent nested areas. Real spatial continuity in the form of a Markov field or a latent Gaussian field would be an alternative, but it would be much harder to estimate with the chosen tools, and may not be better at modelling administrative areas that &lt;em&gt;are&lt;/em&gt; nested, after all.&lt;/p&gt;
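&lt;p&gt;As a minimal sketch of the prefix hierarchy described above, the R snippet below splits five-digit zip codes into their nested prefix levels. The helper name &lt;code&gt;zip_prefixes()&lt;/code&gt; and its column names are hypothetical and only illustrate the idea; the actual model code in the repository may organise the hierarchy differently.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical helper (not from the project sources): derive the nested
# prefix levels of five-digit zip codes given as character strings.
zip_prefixes &amp;lt;- function(zip) {
  stopifnot(is.character(zip), all(nchar(zip) == 5))
  data.frame(
    zip    = zip,
    level1 = substr(zip, 1, 1),  # first digit, e.g. the district
    level2 = substr(zip, 1, 2),  # first two digits, e.g. the city
    level3 = substr(zip, 1, 3),  # first three digits, e.g. a part of the city
    stringsAsFactors = FALSE
  )
}

# 02940 falls under the prefixes 0, 02 and 029
zip_prefixes(c(&amp;quot;02940&amp;quot;, &amp;quot;02150&amp;quot;, &amp;quot;00100&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Zip codes are kept as character strings here so that leading zeros are preserved.&lt;/p&gt;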
&lt;p&gt;Temporal change of prices is as interesting as their overall level. The model could have a separate price level for each year, but continuity over time would then be lost, and there would be no predictions. Also, the relationships between price trends and covariates (population density) and between trends and spatial or hierarchical structure would be hard to define, for there would be no unique trend. These reasons and simplicity favor a simple temporal parameterization, such as the current quadratic model. Combining hierarchy and covariates with a more flexible temporal model, like a Gaussian process, is an interesting research question.&lt;/p&gt;
&lt;p&gt;In total, there are three parameters on the zip code level affecting log-scale prices: price level, its trend, and change of trend. On the next geographic hierarchy level three other parameters appear: the influences of (logarithmic) population density on price, trend and its change. These six parameters, three plus their interactions with population density, appear also on upper hierarchy levels. On each hierarchy level, the model has multinormal priors for the three or six parameters, and hyperpriors for the variance and covariance of the multinormal distribution. The covariances bind different parameters together, so that for example price level helps the estimation of price trend, or the influence of population density.&lt;/p&gt;
&lt;p&gt;In summary, the lowest level of the model for the log prices is&lt;/p&gt;
&lt;p&gt;$$
\log h_{it} =
\beta_{i1} + \beta_{i2} t + \beta_{i3} t^2 + \beta_{i'4} d_i + \beta_{i'5} d_i t + \beta_{i'6} d_i t^2 ,
$$&lt;/p&gt;
&lt;p&gt;$$
\log y_{it} \sim
\textrm{t}\ \left(\log h_{it}, \sqrt{\sigma^2_y + \frac{\sigma^2_w}{n_{it}}}, \nu\right) ,
$$&lt;/p&gt;
&lt;p&gt;where $i$ refers to the zip code area, $t$ is time, $\beta$ are coefficients specific to the zip code $i$, $i'$ is the first prefix hierarchy level of the zip code (population density parameters are constant within each $i'$-area), $d_i$ is the (logarithmic) population density of the area, $\textrm{t}(\cdot)$ is the t-distribution, $\sigma_y$ is the standard deviation of the underlying (log) price levels over years, $\sigma_w$ is the standard deviation of the prices within the measurement unit (year $\times$ zip), and $\nu$ is the degrees of freedom of the residual t-distribution. Note that the linear model is for log-scale prices. The complete model is best described by the &lt;a href=&#34;https://github.com/reaktor/Neliohinnat/blob/master/source/m4.stan&#34;&gt;source code&lt;/a&gt;.
The estimate for $\nu$ is around 6.5; that is, the residuals have somewhat heavier tails than a normal distribution. From the covariance parameters (&lt;em&gt;Omega&lt;/em&gt; in the source) one sees that price level and trend correlate at the lowest level ($r$=0.28), as do trend change and price level ($r$=0.43). So price differences between areas have been growing during the last ten years, probably due to urbanisation, a global trend.&lt;/p&gt;
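&lt;p&gt;To make the roles of the coefficients concrete, one can differentiate the quadratic log-price curve above with respect to time. This is only a sketch that follows from the first equation; how $t$ is centred and scaled is an assumption here and should be checked against the source code.&lt;/p&gt;
&lt;p&gt;$$
\frac{\partial \log h_{it}}{\partial t} =
\beta_{i2} + \beta_{i'5} d_i + 2 \left( \beta_{i3} + \beta_{i'6} d_i \right) t ,
$$&lt;/p&gt;
&lt;p&gt;so, if $t$ is measured in years, $\beta_{i2} + \beta_{i'5} d_i$ is approximately the relative yearly price change at $t=0$, and $2(\beta_{i3} + \beta_{i'6} d_i)$ is the rate at which that trend itself changes per year.&lt;/p&gt;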
&lt;p&gt;Plotting area-wise prices and their changes against population density, one sees the expected correlation: remote areas are losing in terms of price, trend &lt;em&gt;and&lt;/em&gt; trend change.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://raw.githubusercontent.com/reaktor/Neliohinnat/master/figs/tiheys-korrelaatiot-en.png&#34; alt=&#34;Correlation of population density and price trends&#34;&gt;&lt;/p&gt;
&lt;p&gt;The model has been written and estimated with the probabilistic programming language Stan (&lt;a href=&#34;http://mc-stan.org/&#34;&gt;http://mc-stan.org/&lt;/a&gt;). Stan produces a Monte Carlo estimation algorithm from a generative model description.&lt;/p&gt;
&lt;h2 id=&#34;possible-improvements&#34;&gt;Possible improvements&lt;/h2&gt;
&lt;p&gt;The model could have more demographic covariates. Dealers have reminded us that the sales volume is predictive as well. At present, the number of sales has no predictive role in the model.&lt;/p&gt;
&lt;p&gt;Apartments vary by their size, age, etc., but this heterogeneity is not taken into account. Trends are biased towards apartments that are sold more often. Some public data is available where sales are separated by size and age of the condos, but these data are even more sparse than the aggregate data used here.&lt;/p&gt;
&lt;p&gt;On the Finnish equivalent of this blog, Herra Huu suggested a whitened parameterization, which may help with estimation.&lt;/p&gt;
</description> 
  </item>
  
<item>
  <title>Reproducible posters with R, LaTeX, tikz and Sweave</title>
  <link>http://ropengov.org/2015/06/reproducible-posters-with-r/</link>
  <pubDate>Sun, 07 Jun 2015 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2015/06/reproducible-posters-with-r/</guid>
  <description>
&lt;script src=&#34;http://ropengov.org/2015/06/reproducible-posters-with-r/index_files/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;


&lt;p&gt;rOpenGov is all about reproducible research, so preparing a
&lt;strong&gt;reproducible poster&lt;/strong&gt; became our mission after having the chance to present at the &lt;a href=&#34;http://iccss2015.eu/index.html&#34;&gt;International Conference on Computational Social Science (ICCSS 2015)&lt;/a&gt; in Helsinki, June 8-11, 2015 (poster 36 in Monday session 15:30-17:00).&lt;/p&gt;
&lt;p&gt;We blended R code and output with the standard poster contents. The poster sources download data with our &lt;a href=&#34;http://github.com/rOpenGov/eurostat&#34;&gt;eurostat R package&lt;/a&gt; and automatically generate the final figures and the overall layout.&lt;/p&gt;
&lt;p&gt;Whereas several ready-made layouts, such as
&lt;a href=&#34;http://www.brian-amberg.de/uni/poster/&#34;&gt;baposter&lt;/a&gt;,
&lt;a href=&#34;https://github.com/deselaers/latex-beamerposter&#34;&gt;beamerposter&lt;/a&gt;, &lt;a href=&#34;http://www.ctan.org/pkg/tikzposter&#34;&gt;tikzposter&lt;/a&gt;,
&lt;a href=&#34;http://www.latextemplates.com/cat/conference-posters&#34;&gt;latextemplates.com&lt;/a&gt; and &lt;a href=&#34;http://tex.stackexchange.com/questions/341/how-to-create-posters-using-latex&#34;&gt;other options&lt;/a&gt; were available and could be useful for fast poster design, they also limit the available options as the graphical elements are laid out as tightly defined text boxes. This is not suitable for all purposes, and mixing ready-made styles with free design is potentially confusing. We also ran into some problems when incorporating R code with some of these templates.&lt;/p&gt;
&lt;p&gt;Therefore we ended up using the plain &lt;a href=&#34;http://www.latex-project.org/&#34;&gt;LaTeX&lt;/a&gt;/&lt;a href=&#34;http://sourceforge.net/projects/pgf/&#34;&gt;tikz&lt;/a&gt; combination, which allows reproducible design of arbitrary poster layouts and schematic figures, as well as automated numbering of figures and references. The
&lt;a href=&#34;http://www.r-project.org&#34;&gt;R&lt;/a&gt;/&lt;a href=&#34;https://www.statistik.lmu.de/~leisch/Sweave/&#34;&gt;Sweave&lt;/a&gt;
combination allows the incorporation of R code and output (figures, tables, text). The &lt;a href=&#34;http://www.ctan.org/tex-archive/macros/latex/contrib/a0poster&#34;&gt;a0poster style&lt;/a&gt; provided appropriate font sizes and other LaTeX utilities for
posters. If you are a frequent LaTeX user, we warmly recommend familiarizing yourself with &lt;a href=&#34;http://www.texample.net/tikz/&#34;&gt;tikz&lt;/a&gt;. For further details, see the &lt;a href=&#34;https://github.com/rOpenGov/poster/blob/master/2015-ICCSS/poster.Rnw&#34;&gt;poster sources&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To reproduce &lt;a href=&#34;https://github.com/rOpenGov/poster/blob/master/2015-ICCSS/poster.pdf&#34;&gt;the A0 poster (PDF)&lt;/a&gt;, clone the &lt;a href=&#34;https://github.com/rOpenGov/poster&#34;&gt;rOpenGov poster repository&lt;/a&gt; and run the following commands in R:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(knitr)
# Knit the Sweave source and compile the resulting LaTeX to PDF (requires a LaTeX installation)
knit2pdf(&amp;quot;poster/2015-ICCSS/poster.Rnw&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;http://ropengov.org/images/201506-poster.png&#34; alt=&#34;&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;Finished poster&lt;/p&gt;
&lt;/div&gt;
</description> 
  </item>
  
<item>
  <title>Finnish Meteorological Institute open data added to rOpenGov</title>
  <link>http://ropengov.org/2014/09/finnish-meteorological-institute-open-data/</link>
  <pubDate>Tue, 30 Sep 2014 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2014/09/finnish-meteorological-institute-open-data/</guid>
  <description>&lt;p&gt;An R package &lt;a href=&#34;https://github.com/rOpenGov/fmi&#34;&gt;fmi&lt;/a&gt; for the Finnish Meteorological Institute
&lt;a href=&#34;https://en.ilmatieteenlaitos.fi/open-data&#34;&gt;open data API&lt;/a&gt; has been released.
The package provides access from R to many data sets, including weather conditions,
climate scenarios, sea-related observations and solar radiation in Finland.
Installation instructions, examples and other details are provided in the
&lt;a href=&#34;https://github.com/rOpenGov/fmi/blob/master/vignettes/fmi_tutorial.md&#34;&gt;tutorial&lt;/a&gt;.
The package is in beta; user feedback and contributions are welcome!
For contact details, see the package &lt;a href=&#34;https://github.com/rOpenGov/fmi&#34;&gt;home page&lt;/a&gt;.
The package has been developed jointly with the &lt;a href=&#34;https://github.com/rOpenGov/rwfs&#34;&gt;rwfs&lt;/a&gt; package,
which provides generic access to WFS interfaces in R.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt; Oct 10, 2020: the package is now &lt;a href=&#34;http://ropengov.github.io/fmi2&#34;&gt;fmi2&lt;/a&gt;&lt;/p&gt;
</description> 
  </item>
  
<item>
  <title>Berlin Open Knowledge Festival July 15-17, 2014</title>
  <link>http://ropengov.org/2014/07/berlin-okfest/</link>
  <pubDate>Fri, 18 Jul 2014 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2014/07/berlin-okfest/</guid>
  <description>&lt;p&gt;&lt;a href=&#34;http://ropengov.github.io&#34;&gt;rOpenGov&lt;/a&gt; developers met at the &lt;a href=&#34;http://2014.okfestival.org/programme/&#34;&gt;Open
Knowledge Festival&lt;/a&gt; in Berlin,
July 15-17. We discussed the development of representation formats for
social/political science data sets. The
&lt;a href=&#34;https://github.com/rOpenGov/psData/tree/devPanel&#34;&gt;psData&lt;/a&gt; package for
panel series data is now under active development, and further plans
include similar structures for election and other data
sources. Standardized representation formats will allow the
development and application of analysis algorithms beyond particular
applications. Robert Gentleman from Bioconductor has
&lt;a href=&#34;http://www.nature.com/nbt/journal/v31/n10/full/nbt.2721.html&#34;&gt;said&lt;/a&gt;:
&amp;lsquo;If everybody puts their - - data into the same kind of box, it
doesn&amp;rsquo;t matter how the data came about, but that box is the same and
can be used by analytic tools. Really, I think it&amp;rsquo;s data structures
that drive interoperability.&amp;rsquo;&lt;/p&gt;
&lt;p&gt;Overall, the OKFest 2014 was a great success with an interesting
program and good atmosphere. Thanks to all organizers and
participants!&lt;/p&gt;
</description> 
  </item>
  
<item>
  <title>NIPS/MLOSS workshop on Machine Learning Open Source Software</title>
  <link>http://ropengov.org/2013/12/nips-mloss-workshop/</link>
  <pubDate>Tue, 10 Dec 2013 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2013/12/nips-mloss-workshop/</guid>
  <description>&lt;p&gt;The &lt;a href=&#34;http://ropengov.github.io&#34;&gt;rOpenGov&lt;/a&gt; project was today highlighted at the Neural Information Processing Systems &lt;a href=&#34;http://nips.cc/&#34;&gt;(NIPS)&lt;/a&gt; conference, one of the main forums for machine learning and scientific computation. We gave a presentation at the &lt;a href=&#34;http://nips.cc/Conferences/2013/Program/event.php?ID=3710&#34;&gt;Machine Learning Open Source Software (MLOSS) workshop&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A number of other interesting projects were present, including &lt;a href=&#34;http://ipython.org/&#34;&gt;IPython&lt;/a&gt;, &lt;a href=&#34;http://factorie.cs.umass.edu/&#34;&gt;Factorie&lt;/a&gt;, &lt;a href=&#34;http://scikit-learn.org/stable/&#34;&gt;scikit-learn&lt;/a&gt;, and many others. For a full list, see the &lt;a href=&#34;http://mloss.org/workshop/nips13/&#34;&gt;MLOSS&lt;/a&gt; site.&lt;/p&gt;
&lt;p&gt;While the rOpenGov ecosystem is based on R, we actively seek connections with related initiatives in other languages, such as Python or Julia. &lt;a href=&#34;http://ipython.org/&#34;&gt;IPython&lt;/a&gt; is particularly interesting in this regard, with its fluent tools for mixing code from various languages with automated document generation tools that support LaTeX and markdown. Adding R support to this project would extend the applicability of rOpenGov toolkits beyond the R community. Looking forward!&lt;/p&gt;
</description> 
  </item>
  
<item>
  <title>R Ecosystem for Open Government Data and Computational Social Science</title>
  <link>http://ropengov.org/2013/12/r-ecosystem/</link>
  <pubDate>Mon, 09 Dec 2013 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/2013/12/r-ecosystem/</guid>
  <description>&lt;p&gt;Hello world!&lt;/p&gt;
&lt;p&gt;This is to announce the &lt;a href=&#34;http://ropengov.github.io&#34;&gt;rOpenGov&lt;/a&gt; project and encourage new contributions to promote community-driven development of R tools for open government data and computational social science.&lt;/p&gt;
&lt;p&gt;A community of independent package developers around open government data analytics is now emerging at rOpenGov. Independent &lt;a href=&#34;http://ropengov.github.io/projects/&#34;&gt;projects&lt;/a&gt; dedicated to &lt;a href=&#34;https://github.com/rOpenGov/govdat&#34;&gt;US&lt;/a&gt;, &lt;a href=&#34;http://markuskainu.fi/rustfare/index.html&#34;&gt;Russia&lt;/a&gt;, &lt;a href=&#34;http://louhos.github.io/sorvi/&#34;&gt;Finland&lt;/a&gt;, &lt;a href=&#34;http://smarterpoland.pl&#34;&gt;Poland&lt;/a&gt;, &lt;a href=&#34;https://github.com/skasberger/grazwahl2012&#34;&gt;Austria&lt;/a&gt;, and &lt;a href=&#34;http://osmar.r-forge.r-project.org/&#34;&gt;OpenStreetMap&lt;/a&gt; have already joined in; more are coming and we will keep you posted through this blog.&lt;/p&gt;
&lt;p&gt;The rapidly emerging governmental and other open data streams provide novel opportunities for social sciences, data journalism, and citizen participation across the globe while computational tools to utilize these resources are lacking.  A community-driven software ecosystem provides a scalable solution and a potential to revolutionize the field, taking advantage of the lessons learned in &lt;a href=&#34;http://www.bioconductor.org&#34;&gt;Bioconductor&lt;/a&gt;, &lt;a href=&#34;http://ropensci.org&#34;&gt;rOpenSci&lt;/a&gt;, and related projects. The umbrella site focusing on open government data R tools will give added visibility and recognition for independent package developers, an opportunity to attract contributors, and a forum for exchanging ideas and information.&lt;/p&gt;
&lt;p&gt;The project is in beta, and more information will be added soon.  You can follow us on &lt;a href=&#34;http://ropengov.github.io/&#34;&gt;rOpenGov blog&lt;/a&gt;, &lt;a href=&#34;https://twitter.com/ropengov&#34;&gt;Twitter&lt;/a&gt;, &lt;a href=&#34;https://plus.google.com/u/0/communities/108289259916380218460&#34;&gt;Google+&lt;/a&gt;, and IRC (ropengov@Freenode). If you are working on related projects, don&amp;rsquo;t hesitate to get in touch!&lt;/p&gt;
</description> 
  </item>
  
<item>
  <title>Community</title>
  <link>http://ropengov.org/community/</link>
  <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/community/</guid>
  <description>&lt;h3 id=&#34;developers&#34;&gt;Developers&lt;/h3&gt;
&lt;p&gt;The network is coordinated by &lt;a href=&#34;http://www.iki.fi/Leo.Lahti&#34;&gt;Leo Lahti&lt;/a&gt;, &lt;a href=&#34;https://github.com/pitkant&#34;&gt;Pyry Kantanen&lt;/a&gt; and &lt;a href=&#34;https://github.com/muuankarski&#34;&gt;Markus Kainu&lt;/a&gt;. It all started from the Finnish &lt;a href=&#34;http://louhos.github.io&#34;&gt;Louhos blog&lt;/a&gt; back in 2010.&lt;/p&gt;
&lt;p&gt;We are grateful to all contributors! For a full list, see &lt;a href=&#34;https://ropengov.r-universe.dev/contributors&#34;&gt;R-Universe&lt;/a&gt; and the individual project pages on &lt;a href=&#34;http://github.com/ropengov&#34;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The rOpenGov blog is part of &lt;a href=&#34;http://r-bloggers.com&#34;&gt;R bloggers&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;contact&#34;&gt;Contact&lt;/h3&gt;
&lt;p&gt;You can follow us on &lt;a href=&#34;https://twitter.com/rOpenGov&#34;&gt;Twitter&lt;/a&gt;, reach us on &lt;a href=&#34;https://gitter.im/rOpenGov/home&#34;&gt;Gitter&lt;/a&gt; or via email: ropengov - at - googlegroups.com&lt;/p&gt;
&lt;p&gt;In administrative matters, &lt;strong&gt;contact&lt;/strong&gt; the coordinators or other developers as relevant.&lt;/p&gt;
&lt;h3 id=&#34;how-to-contribute&#34;&gt;How to contribute?&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;http://ropengov.org&#34;&gt;rOpenGov&lt;/a&gt; is a community of independent R package developers. If you are working on related projects, don&amp;rsquo;t hesitate to get in touch!&lt;/p&gt;
&lt;p&gt;Adding new packages to rOpenGov:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Open a new repository or transfer an existing repository to the rOpenGov &lt;a href=&#34;https://github.com/ropengov/&#34;&gt;GitHub organization&lt;/a&gt; (contact the coordinators to obtain the necessary permissions).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Add the package under &lt;a href=&#34;https://github.com/rOpenGov/universe&#34;&gt;R-Universe&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Contribute to the existing rOpenGov packages:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Open an issue or a pull request for the package of your choice on &lt;a href=&#34;https://github.com/ropengov/&#34;&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Possible ways to contribute:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Use the R-packages in your projects&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Contribute to development&lt;/strong&gt;. We will gladly accept code contributions, bug reports or other suggestions for improvements, or new R packages that fit the scope. We are also welcoming tutorials and guidelines.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Write a blog post&lt;/strong&gt; in your own blog or send your writing to gain visibility through the rOpenGov blog.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Provide feedback&lt;/strong&gt; on packages, blog posts, website, or other issues.&lt;/li&gt;
&lt;/ol&gt;
</description> 
  </item>
  
<item>
  <title>Projects</title>
  <link>http://ropengov.org/projects/</link>
  <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
  
<guid>http://ropengov.org/projects/</guid>
  <description>
&lt;script src=&#34;http://ropengov.org/projects/index.en_files/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;


&lt;p&gt;The development versions of the rOpenGov R packages are maintained on &lt;a href=&#34;https://github.com/ropengov/&#34;&gt;GitHub&lt;/a&gt;. To contribute, check out our &lt;a href=&#34;http://ropengov.org/community/&#34;&gt;Community page&lt;/a&gt; to get in touch, or simply make a pull request on GitHub.&lt;/p&gt;
&lt;table class=&#34;table-striped table&#34;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;
Package
&lt;/th&gt;
&lt;th&gt;
Description
&lt;/th&gt;
&lt;th&gt;
Homepage
&lt;/th&gt;
&lt;th&gt;
Forks
&lt;/th&gt;
&lt;th&gt;
Issues
&lt;/th&gt;
&lt;th&gt;
&lt;i class=&#34;fa fa-star&#34;&gt;&lt;/i&gt;
&lt;/th&gt;
&lt;th&gt;
Updated
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody id=&#34;repo_tbl&#34;&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;script src = &#34;./project_tbl.js&#34;&gt;&lt;/script&gt;
</description> 
  </item>
  
</channel>
  </rss>