
Cleiton Ambrosio's Portfolio

About

I currently work as a Junior Data Scientist, and I also hold an MSc degree in Physics.

I have been studying Data Science since April 2021 and have 1.5 years of professional experience in the area, built mainly through the projects detailed below, in addition to what I have learned at work.

I am experienced in developing every stage of a business solution with the concepts and tools of Data Science, from understanding the business problem to deploying the final model to production.

Moreover, I believe Data Science depends more on a good understanding of the business problem, the nature of the available data, and the actionability of the results than on the application of a particular set of tools.

🧰 Toolbox

  • Main tools: Python, SQL, Git, GitHub Actions, DBT, Kubernetes, Prefect, AWS, Linux, Power BI, Heroku
  • Databases: Oracle, Snowflake
  • Other languages: Matlab, Fortran, LaTeX

📫 Reach me

  • LinkedIn
  • Gmail

Data Science Projects (5/2021 - Present)

"Insiders" Fidelity Program - 2nd cycle (improvement) under development

Creation of a customer loyalty ("fidelity") program for an online retail store by segmenting the customer base and identifying the most valuable customers.

Although the project is not yet finished (it is not fully in production), a complete 1st cycle has been concluded. It produced 5 cohesive and well-separated clusters, each accompanied by a suggested action for the business team.
A 2nd cycle is in progress, exploring other feature engineering possibilities and other clustering algorithms.

Knowing the income for the next few weeks in advance can be decisive for steering a store toward faster growth.

In this project, I developed a Machine Learning solution to predict the sales for each unit of a chain of stores for the next 6 weeks.
The final model generated daily forecasts with at least 85% accuracy for 97.04% of the stores. Across all units, the actual income of $289,571,750.00 for the 6 weeks was predicted as $286,722,368.00, an underestimation of only 0.98%.
The predictions per store were conveniently made available through a Telegram bot.
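The aggregate figure above can be checked with a short calculation. This is a minimal sketch: the two totals come from the text, while the function name `relative_error_pct` and the sign convention (positive means the forecast was too low) are assumptions made for illustration, not taken from the project code.

```python
def relative_error_pct(actual: float, predicted: float) -> float:
    """Signed relative error in percent; positive means the forecast was too low."""
    return (actual - predicted) / actual * 100.0


# Totals for the 6-week horizon, as quoted in the text.
actual_income = 289_571_750.00
predicted_income = 286_722_368.00

error_pct = relative_error_pct(actual_income, predicted_income)
print(f"Underestimation: {error_pct:.2f}%")
```

The same function could be applied per store to see how the error is distributed across units, although the per-store accuracy metric used in the project is not stated here.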

These are not full Data Science projects but technical notebooks that explore a specific algorithm or method, kept as material for future reference.

So far, it contains only some studies of different clustering scores for use with centroid-based algorithms (in particular, KMeans), but the repo is intended to grow significantly over time.
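As an illustration of what a clustering score measures, here is a minimal sketch of the silhouette coefficient in plain Python. The text does not say which scores the notebooks cover, so the choice of silhouette is an assumption, and the toy points and labels below are made up for the example (they are hand-written, not the output of KMeans).

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point itself contributes a distance of 0 to the sum).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two compact, well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible groups
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

good = silhouette_score(points, good_labels)
bad = silhouette_score(points, bad_labels)
print(f"good labelling: {good:.3f}, bad labelling: {bad:.3f}")
```

Values close to 1 indicate cohesive, well-separated clusters, and values near 0 or below indicate overlapping or misassigned points. In practice the score is typically computed for several candidate numbers of clusters and compared.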

Popular repositories

  1. Rossmann-Stores-Sales-Forecast (Public)

    This project studies the units of a chain of stores, building a Machine Learning model that predicts the income for each of them for the next few weeks.

    Jupyter Notebook 2

  2. CleitonAmbrosio (Public)

  3. demo-dbt (Public)

  4. misc-studies (Public)

    Jupyter Notebook

  5. Insiders-Fidelity-Program (Public)

    This project studies the sales of an online retail store, building a Machine Learning model that clusters the customers in order to create a loyalty program.

    Jupyter Notebook

  6. rossmann-telegram-bot (Public)

    Python