APAN 5205 Introduction

Frameworks II is a course focusing on unsupervised learning techniques, emphasizing mathematical foundations and practical applications. Key topics include clustering, dimension reduction, text mining, and neural networks, with assessments based on assignments, a team project, and exams. Students are encouraged to engage with the lecturer and TAs for support and are required to form teams for a project due on March 8th, with final presentations on May 1st.

Introduction to Frameworks II
Module 1 – January 23, 2025
Lecturer: Andrew Assing
Welcome to Frameworks II
Frameworks II focuses on unsupervised learning techniques. Rather than trying to predict a target
variable, we are mostly concerned with finding patterns or characteristics in the data. The
exception is Neural Networks, which are a family of supervised models.

Our class emphasizes the mathematical underpinnings of the models. We will review concepts from
linear algebra to verify the results of the R commands. This will help make the models seem less
like a black box and better prepare you for future courses.

Topics
Clustering (2 weeks)
Dimension Reduction (2 weeks)
Text Mining (2 weeks)
Association Rules
Recommender System
Time Series (2 weeks)
Neural Networks
Spatial Analysis
How to Contact Me
The best way to get in touch with me is by email – aa4357@columbia.edu

You are welcome to ask questions about the course materials, lectures, homework, etc.

Keep in mind I’m busiest on Thursdays (before our class).


Office Hours
Office hours will be held on Zoom at 5PM EST on Sundays, starting on February 2nd.

Each module will have an office hours page that contains the link for that week’s Zoom. I will
wait until 5:30PM for students. If you arrive at 5:31PM or later, please email me and I will re-open
the Zoom.

The office hours are usually student directed (we talk about your questions). Occasionally I will do a
mini-lecture where I go more in depth about a topic that we could not cover in class.

Office hours are recorded as long as we have some attendance. You are not required to attend office
hours.

You are also welcome to email me to schedule an office hour by appointment. If you do this, please
suggest a few times that work for you.
Class Engagement
You are expected to attend each class in person. As we are designated on-campus, we are not
allowed to record lectures. In lieu of taking attendance, I use quizzes to record class engagement (10% of
your grade).

The quizzes are not meant to be difficult. They are usually 1-2 questions that review an important
concept from the previous lecture. You have 3 attempts to answer. The quiz opens at 5:30pm and is
due at the end of class. Missing 1 or 2 quizzes will not materially affect your final grade. You will see
a link in the current week’s module for Class Engagement.

The quizzes will likely start on February 6th.


Get to Know Our TAs
We have 2 great TAs –

Umay Ayyub (umay.ayyub@columbia.edu)


Omead Eftekhari (oe2191@columbia.edu)

Please engage with them and get to know them (questions on homework, course and career).
Student Projects
You are required to submit a team project. Everyone is responsible for finding their own team. Please
target 4-5 team members, with no more than 5 students per team. Please start thinking about your team
members. “I don’t know anyone in class” is not an excuse for not forming a team.

The project is relatively open-ended. You are free to propose any topic you are interested in but you
must use techniques we covered in class (Frameworks I and/or II).

You are responsible for getting data on your own. Kaggle data is not allowed.

We also will not allow projects on Citi Bike, NYC Taxi or pairs trading with stocks.

The project proposal is due on March 8th. In it, you will explain what you would like to do and share
your data with us so we can provide feedback.

You will present your project as a team on the last day of class (May 1st). A final report with your
work and code will be due on May 4th.
Exams
Similar to Frameworks I, there is a mid-term exam that covers topics from the first few lectures. The
mid-term will open on Canvas on March 6th and close on March 13th. It will cover topics from
Modules 2-4 only (Clustering and Dimension Reduction).

The final exam will only cover topics discussed after the midterm.
Assessment
Assignments: 16%
    Four assignments, each worth 4%

Project: 30%
    Proposal: 5%
    Report: 20%
    Presentation: 5%

Class Participation and Engagement: 10%

Exams: 44%
    Midterm Assessment: 4%
    Final: 40%
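
As a quick sanity check, the weights above can be added up in R. The sketch below assumes the final grade is a plain weighted average of the component scores, which the slides do not state explicitly. The scores are made-up numbers for illustration only.

```r
# Component weights in percent, copied from the Assessment list
weights <- c(
  assignments = 4 * 4,      # four assignments at 4% each
  project     = 5 + 20 + 5, # proposal + report + presentation
  engagement  = 10,
  midterm     = 4,
  final       = 40
)

# The weights should account for the whole grade
stopifnot(sum(weights) == 100)

# Hypothetical component scores out of 100 (illustrative, not real data)
scores <- c(assignments = 90, project = 85, engagement = 100,
            midterm = 80, final = 75)

# Assumed grading rule: weighted average of the component scores
course_grade <- sum(weights * scores[names(weights)]) / 100
course_grade
```

Changing a single entry in scores shows how much each component moves the total. For example, the final exam carries ten times the weight of the midterm.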
Please Install R Version 4.4.2
Go to the R CRAN website and install the current version of R (4.4.2), if you do not already have this.
Please do not update your R version for the remainder of the course unless I ask you to.

https://cran.r-project.org/

You can check your R version by running R.Version() in the console.
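
R.Version() returns a long list, so a shorter check may be easier to read. The following is a minimal sketch using base R only. The variable names required and installed are chosen here for illustration.

```r
# Print the one-line version string for the running R session
print(R.version.string)

# Compare the running version with the one requested for the course
required  <- "4.4.2"        # version named on this slide
installed <- getRversion()  # base R; returns a comparable version object

if (installed == required) {
  message("R ", as.character(installed), " matches the course version.")
} else {
  message("R ", as.character(installed), " is installed; the course asks for ",
          required, ".")
}
```

If the versions differ, installing 4.4.2 from the CRAN link above should bring your setup in line with the class.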


Books for the Course
You are not required to purchase any books for this course. All books are available electronically through the
Columbia University library, except where noted below. Moreover, the lecture notes and examples are relatively self-contained.

• Chapman, C., & Feit, E. M. (2015). R for Marketing Research and Analytics. Springer International Publishing. Retrieved from: https://clio.columbia.edu/catalog/11485551. ISBN-13: 9783319144351 (Referred to as “Analytics Text”)
• Silge, J., & Robinson, D. (2017). Text Mining with R: A Tidy Approach. Beijing: O’Reilly. Retrieved from: https://www.tidytextmining.com. ISBN-13: 9781491981658 (Referred to as “Text Mining Text”)
• Gorakala, S. K., & Usuelli, M. (2015). Building a Recommendation System with R. Packt Publishing. Retrieved from: https://clio.columbia.edu/catalog/13674495. ISBN-13: 9781783554492 (Referred to as “Recommendation Text”)
• Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: Principles and Practice, second edition. Heathmont: OTexts. Retrieved from: https://otexts.org/fpp2/. ISBN-13: 9780987507112 (Referred to as “Forecasting Text”)
• Chollet, F., & Allaire, J. J. (2018). Deep Learning with R. Manning Publications Company. ISBN-13: 9781617295546 (Referred to as “Deep Learning Text”). Only available in hard copy form in Butler Library. Not a required reading.
• Bivand, R., Pebesma, E., & Gómez-Rubio, V. (2013). Applied Spatial Data Analysis with R. Springer International Publishing. Retrieved from: https://clio.columbia.edu/catalog/7935086. ISBN-13: 978-1-4614-7618-4 (Referred to as “Spatial Text”)
Please Get in Touch if You Are Struggling
If you find yourself struggling with the material or falling behind, please get in touch with me or
the TAs for help.
