
🚀 Juan Carlos González - Lead Data Scientist & AI Developer


👋 About Me

I am a Lead Data Scientist with over 8 years of experience in machine learning, reinforcement learning, data engineering, and AI applications. My work spans various industries, including banking, energy, and government research projects. I specialize in algorithm optimization, large-scale data analysis, and AI-driven decision-making.

🎮 Fun Fact

I also enjoy gaming! You can find me on Fortnite under the username dataperromx 🕹️.


🎤 Workshops & Presentations

I am passionate about sharing knowledge and mentoring others in the fields of data science, machine learning, and AI. Below are some of the key talks and workshops I have led:

| Event | Topic | Location | Year | Links |
| --- | --- | --- | --- | --- |
| SG Data Day 2022 | Archivos de la Represión & AI for Social Good | Mexico City, Mexico | 2022 | Talk Details |
| RIIA Hackathon | Machine Learning for Historical Archives | Virtual | 2021 | YouTube |
| Introduction to Data Science with Spark | Hands-on Workshop on Apache Spark | Mexico City, Mexico | 2018 | GitHub Repo |
| RIIA Workshop | AI Applications in Historical Analysis | Mexico City, Mexico | 2022 | PDF |

📂 Featured Projects

🔥 Projects for Society

I am passionate about using AI for social good. One of my key projects involves applying machine learning to analyze historical security files, helping uncover hidden patterns in national archives.

| Project Name | Description | Technologies | Year | Repository |
| --- | --- | --- | --- | --- |
| DFS OCR Reader | Applied machine learning and OCR to extract and analyze text from national security archives. | YOLOv5, PyTorch, Machine Learning, Python, OpenCV, Tesseract OCR, RegEx | 2022 | GitHub Repo |

📊 Data Engineering

Data is at the core of my work. Here are some projects focused on data pipelines, big data, and analytics.

| Project Name | Description | Technologies | Year | Repository |
| --- | --- | --- | --- | --- |
| LinkedIn Web Scraping | Automated data extraction from LinkedIn using Selenium to gather job listings and profile insights. | Python, Selenium, Web Scraping | 2017 | GitHub Repo |

🤖 Machine Learning & AI

Building intelligent systems is a core part of my work.

| Project Name | Description | Technologies | Year | Repository |
| --- | --- | --- | --- | --- |
| MSC_AI_KW_Project | A team-based project that leverages metaheuristic algorithms and semantic data integration (YAGO & Wikidata) to optimize personalized music recommendations. | Python, SPARQL, Flask, JavaScript, Metaheuristic Algorithms | 2025 | GitHub Repo |

🧬 Genetic Algorithms & Optimization

I have worked on metaheuristic algorithms to solve complex optimization problems.

| Project Name | Description | Technologies | Year | Repository |
| --- | --- | --- | --- | --- |
| MSC_AI_Assigment | Evolutionary algorithm for the Travelling Salesman Problem (TSP), using a genetic algorithm (GA) and branch and bound (B&B). | Python, Genetic Algorithms, Branch and Bound, Jupyter Notebooks, LaTeX | 2024 | GitHub Repo |
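
For readers unfamiliar with the approach named in the MSC_AI_Assigment entry, here is a minimal sketch of a genetic algorithm for the TSP. It is an illustration only, not code from the repository: the function names, the order crossover, swap mutation, tournament selection, and every parameter value are assumptions chosen for brevity, and the branch and bound part is not covered.

```python
import math
import random


def tour_length(tour, cities):
    """Total length of the closed tour visiting cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def order_crossover(parent_a, parent_b, rng):
    """Copy a random slice from parent_a, fill the rest in parent_b's order."""
    n = len(parent_a)
    lo, hi = sorted(rng.sample(range(n), 2))
    kept = set(parent_a[lo:hi + 1])
    rest = iter(city for city in parent_b if city not in kept)
    return [parent_a[i] if lo <= i <= hi else next(rest) for i in range(n)]


def swap_mutation(tour, rng, rate):
    """With probability `rate`, swap two cities in a copy of the tour."""
    tour = list(tour)
    if rng.random() < rate:
        i, j = rng.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]
    return tour


def solve_tsp_ga(cities, pop_size=50, generations=200, mutation_rate=0.2, seed=0):
    """Hypothetical GA driver: tournament selection plus one-tour elitism."""
    rng = random.Random(seed)
    n = len(cities)

    def fitness(tour):
        return tour_length(tour, cities)

    # Start from random permutations of the city indices.
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        next_gen = [best]  # elitism: the best tour always survives
        while len(next_gen) < pop_size:
            parent_a = min(rng.sample(population, 3), key=fitness)
            parent_b = min(rng.sample(population, 3), key=fitness)
            child = order_crossover(parent_a, parent_b, rng)
            next_gen.append(swap_mutation(child, rng, mutation_rate))
        population = next_gen
        best = min(population, key=fitness)
    return best, fitness(best)
```

Calling `solve_tsp_ga([(0, 0), (0, 1), (1, 1), (1, 0)])` returns a permutation of the city indices together with its tour length. Because the best tour is carried over unchanged each generation, the reported length never gets worse as generations increase.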

🎮 Reinforcement Learning & LLMs

I explore autonomous decision-making models and large language models.

| Project Name | Description | Technologies | Year | Repository |
| --- | --- | --- | --- | --- |
| reinforcement_learning_examples | A demonstration of a Deep Q-Network (DQN) implementation using Keras for reinforcement learning. It features a custom training loop with target network updates, reward computation, and grid-based environment simulations for autonomous decision-making. | Python, TensorFlow, Keras, NumPy, scikit-learn, matplotlib | 2018 | GitHub Repo |
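
As a rough illustration of the "target network updates" mentioned for reinforcement_learning_examples, the sketch below shows the two ideas in plain Python: computing Bellman targets from a frozen target network, and periodically copying the online weights into it. This is not the repository's Keras code. The names `td_targets` and `maybe_sync`, the batch layout, and the values of `gamma` and `sync_every` are all assumptions made for the example.

```python
import copy


def td_targets(batch, target_q, gamma=0.99):
    """Bellman targets r + gamma * max_a' Q_target(s', a').

    `batch` is a list of (state, action, reward, next_state, done) tuples
    (an assumed layout); terminal transitions get no bootstrap term.
    """
    targets = []
    for _state, _action, reward, next_state, done in batch:
        bootstrap = 0.0 if done else gamma * max(target_q(next_state))
        targets.append(reward + bootstrap)
    return targets


def maybe_sync(step, online_weights, target_weights, sync_every=100):
    """Return a copy of the online weights every `sync_every` steps,
    otherwise keep the existing target weights unchanged."""
    if step % sync_every == 0:
        return copy.deepcopy(online_weights)
    return target_weights


# Toy usage with a made-up target network that ignores the state.
batch = [("s0", 0, 1.0, "s1", False), ("s1", 1, 2.0, "s2", True)]
print(td_targets(batch, target_q=lambda state: [1.0, 3.0], gamma=0.9))
```

Keeping the target network fixed between syncs means the regression targets do not shift on every gradient step, which is the usual motivation for this design in DQN.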

🛠️ Tech Stack

🔹 Programming Languages: Python, Scala, SQL, R
🔹 Scripting & Markup: LaTeX, Markdown
🔹 Deep Learning & AI: TensorFlow, Keras, PyTorch, YOLOv5, Tesseract, Scikit-learn, XGBoost
🔹 Metaheuristic Algorithms: Genetic Algorithms, Branch and Bound
🔹 Data Engineering & Big Data: PySpark, GraphX, Kafka, AWS Glue, Redis, Timestream
🔹 Data Analysis & Notebooks: Jupyter Notebooks, Pandas
🔹 Visualization: Google Data Studio, Amazon QuickSight, Qlik
🔹 DevOps & Cloud: AWS, Docker, Git


🤝 How to Contribute

Interested in collaborating? Feel free to:

  • ⭐ Star the repository
  • 🛠️ Open a pull request with improvements
  • 🗣️ Reach out via LinkedIn

📫 Contact

📌 LinkedIn: juancarlosgonzalezaguilar
📌 GitHub: perrosdatos
📌 Email: carlosgonzagular@email.com
📌 Discord: @dataperromx


🚀 Thanks for Visiting!

I’m always open to new ideas and collaborations! Feel free to explore my repositories, and if you find something interesting, let’s connect!
