IavTavares/clustering-problem

 
 

The Locations clustering POC

Locations clustering proof-of-concept test task.

The dataset contains around 60k location addresses, each with city, road, house_number, postcode and state fields: Download the Locations dataset

The data look like this: (image: data)

The candidate is expected to create a proof-of-concept solution in Python that clusters (or simply groups) addresses that refer to the same place (the same address). The candidate is free to use any machine learning model or hand-crafted solution to show how they would approach the problem.

Instructions:

  • Fork this repo or create a new public GitHub repo for the solution.
  • The solution should read the location data CSV file and write the grouped/clustered result to an output CSV file.
  • There's no restriction on how the candidate would choose to approach the problem.
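
As an illustration of the read, group, and write flow described above, here is a minimal sketch of one possible hand-crafted baseline. It assumes the CSV header uses the column names listed earlier (city, road, house_number, postcode, state); the function names, the `cluster_id` output column, and the file paths are hypothetical. It only groups rows whose fields match exactly after simple normalization, so it would not catch typos or abbreviations; a fuzzy-matching or learned approach could replace `address_key`.

```python
import csv
import re

# Column names taken from the task description; their order here is arbitrary.
FIELDS = ("house_number", "road", "city", "state", "postcode")


def normalize(value):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    value = (value or "").lower()
    value = re.sub(r"[^\w\s]", " ", value)
    return " ".join(value.split())


def address_key(row):
    """Build a comparable key from the normalized address fields."""
    return tuple(normalize(row.get(field)) for field in FIELDS)


def cluster_rows(rows):
    """Return copies of the rows with a cluster_id; equal keys share an id."""
    ids = {}
    clustered = []
    for row in rows:
        key = address_key(row)
        # First time a key is seen it receives the next unused integer id.
        cluster_id = ids.setdefault(key, len(ids))
        clustered.append({**row, "cluster_id": cluster_id})
    return clustered


def cluster_csv(in_path, out_path):
    """Read addresses from in_path and write them with cluster ids to out_path."""
    with open(in_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    clustered = cluster_rows(rows)
    if not clustered:
        return
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(clustered[0].keys()))
        writer.writeheader()
        writer.writerows(clustered)
```

Calling `cluster_csv("locations.csv", "clustered.csv")` (both paths are placeholders) would produce an output file with the original columns plus `cluster_id`.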

Please email long@shake.io if you have any questions.

Good luck!
