DSM020 - Dr Sean McGrath
Structure
 ●
     1. Data structures
 ●
     2. Reading and writing data on the filesystem
 ●
     3. Retrieving data from the web
 ●
     4. Retrieving data from databases using query languages
 ●
     5. Cleaning and restructuring data, part 1
 ●
     6. Cleaning and restructuring data, part 2
 ●
     7. Data plotting
 ●
     8. Version control systems
 ●
     9. Unit tests
 ●
     10. Data processing pipelines
Assessments
 ●
     Project Proposal (CW1): exploratory data analysis, 30% of your grade
 ●
     Finished project (CW2): comprehensive data analysis, 70% of your grade
Purpose
 ●
     Get you coding
 ●
     Get you up to speed with the technologies
 ●
     Give you a taste of what it means to do your
     own data science project - albeit small scale
Module design
 ●
     Activity driven - you learn to code by writing
     code
 ●
     Lots of opportunities for peer review - no such
     thing as a ‘one size fits all’ solution
 ●
     Expressive with opportunities to be creative and
     innovate
Be aware
 ●
     This is an MSc course, so we move quickly!
 ●
     This module is perhaps the broadest in terms of
     scope. Lots of learning opportunities.
 ●
     Students produce some exceptional work on
     this course. See pinned discussion posts for
     examples from this cohort.
Challenges
 ●
     If you have not used Python before, the learning curve
     may be steep at first.
 ●
     The activities are the main source of learning. This
     requires you to build confidence in exploring data.
 ●
     We do a lot of reading. Python for Data Analysis: Data
     Wrangling with Pandas, NumPy, and IPython, 2e (2017)
     is a good place to start.
Outcomes
 ●
     Our students are (mostly) very happy with the
     outcomes of the course.
 ●
     There are lots of opportunities to build a repository of
     learning - particularly with the coursework assignments.
 ●
     We learn a lot of skills, from how websites and
     databases work to examples of good coding practice.
Consider
     Distinction (70-79%)
     An answer falling into the mark range 70 to 79% demonstrates:
 ●
     a capacity to develop a sophisticated and intelligent argument;
 ●
     clear evidence of wide and relevant reading, referencing and an engagement with the conceptual
     issues;
 ●
     original thinking and a willingness to take risks;
 ●
     a significant ability to plan, organise and execute independently a research project, coursework
     assignment or examination question;
 ●
     rigorous use and a sophisticated understanding of relevant source materials, balancing
     appropriately between factual detail and key theoretical issues. Materials are evaluated directly
     and their assumptions and arguments challenged and/or appraised;
 ●
     significant ability to analyse data critically;
 ●
     correct referencing.
Welcome to UOL!
 ●
     Most students say that they really enjoy this course
 ●
     We see lots of exceptional work from students with
     zero programming experience when they arrive
 ●
     The forums (peer discussion), the tutor forums (academic
     support) and the activities are the core components of
     this course. Use them wisely!