GitHub - HelioNeves/mut: A data analysis pipeline about career opportunities announced at Indeed :wrench:

MUT

Market Understanding Tool

About

This project builds a data analysis pipeline for data science job opportunities posted on Indeed. The pipeline is not limited to data science, though: it can classify job opportunities from any sector.

This pipeline generates a .html file with:

Clusters 2D Graph

Clusters Keywords Ranking

TF-IDF Ranking
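
As a rough illustration of what a TF-IDF ranking measures, here is a minimal sketch using only the Python standard library. It is not the code used by this project: the function name `tfidf_ranking`, the whitespace tokenization, and the sample postings are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_ranking(documents, top_n=5):
    """Rank terms by their TF-IDF score summed over all documents."""
    # Hypothetical tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = Counter()
    for tokens in tokenized:
        if not tokens:
            continue
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)                # term frequency
            idf = math.log(n_docs / doc_freq[term])  # inverse document frequency
            scores[term] += tf * idf

    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Made-up postings, for illustration only.
postings = [
    "python sql data science",
    "python spark data engineering",
    "excel sql data reporting",
]
ranking = tfidf_ranking(postings, top_n=3)
```

A term that appears in every posting (here, "data") gets an inverse document frequency of zero, so it drops to the bottom of the ranking, while terms specific to a few postings rise to the top.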

Check out "Brazilian Data Science Jobs Market: A Deep Analysis" on the web!

Project Details

Folders

Folder	Description
db/	Folder where your Scrapy database will be saved
output/	Folder where your graphs and results will be saved

Files

ARGS	USAGE
[db-title]	Title of your Scrapy database (e.g., datascience_db)
[urls-file]	Name of the file with your Indeed URLs (see sample.urls)
[toxicwords-file]	Name of the file listing words to exclude from the analysis (see sample.toxicwords)
[num-clusters]	Number of clusters to identify, either a range (e.g., 2-8) or a single value (e.g., 8)
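
To make the two accepted forms of [num-clusters] concrete, the sketch below expands a single value or a range into the list of cluster counts to try. The helper name `parse_num_clusters` is hypothetical, and treating the range as inclusive at both ends is an assumption; app.py may handle the argument differently.

```python
def parse_num_clusters(value):
    """Turn '8' into [8] and '2-8' into [2, 3, 4, 5, 6, 7, 8]."""
    if "-" in value:
        # Range form: assumed here to be inclusive at both ends.
        low_text, high_text = value.split("-", 1)
        low, high = int(low_text), int(high_text)
    else:
        # Single form: only one cluster count is tried.
        low = high = int(value)
    if low < 1 or high < low:
        raise ValueError(f"invalid cluster specification: {value!r}")
    return list(range(low, high + 1))


single = parse_num_clusters("8")
span = parse_num_clusters("2-8")
```

With a range, a pipeline of this kind would typically run the clustering once per value and compare the results.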

Requirements

Paraphrasing The Beatles: "All you need is Docker 🐳"

Install

1. Clone this repo 🍕

git clone https://github.com/HelioNeves/mut.git
cd mut

2. Basic building 🔧

docker build . -t mut

Running this awesome docker image

1. Load the Ubuntu layer 🌈

docker run -ti --name MUT-env mut /bin/bash

2. Once inside Ubuntu, run the pipeline's Python scripts 🐍

Scrapy

python3 webscraper.py [db-title] [urls-file]

Analytics app

python3 app.py [db-title] [toxicwords-file] [num-clusters]
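
The [toxicwords-file] argument names the words to leave out of the analysis. A minimal sketch of that kind of filtering follows, assuming the file holds one word per line (the format of sample.toxicwords is not described here) and that matching is case-insensitive. Both helper names are hypothetical and not taken from app.py.

```python
def load_toxic_words(lines):
    """Build a lowercase set of words to exclude, skipping blank lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def remove_toxic_words(text, toxic_words):
    """Return the lowercase tokens of `text` that are not in `toxic_words`."""
    return [token for token in text.lower().split() if token not in toxic_words]


# Lines as they would come from iterating over an open file.
toxic = load_toxic_words(["Experience\n", "\n", "team\n"])
tokens = remove_toxic_words("Experience with Python in a data team", toxic)
```

Because `load_toxic_words` accepts any iterable of lines, an open file handle can be passed to it directly.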
