
Hi there, I'm Hyder Reza 👋


🚀 About Me

I am a Data Engineer and Analyst with a Master’s in Computer Science, specializing in designing and building scalable, reliable data solutions that drive business impact. My technical expertise spans end-to-end data pipeline development, cloud-native architectures, and machine learning workflows.

I have hands-on experience with large-scale data processing using Apache Spark, Python, and SQL, as well as orchestrating complex workflows with tools like Airflow and dbt. Skilled in cloud platforms (primarily AWS and GCP), I build and optimize data lakes, ETL pipelines, and data warehouses using services such as BigQuery and Dataproc, with Terraform for infrastructure as code.
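As a hedged illustration of the orchestration idea (not code from any production pipeline, and the task names are hypothetical): the core of what Airflow does is run tasks in dependency order, which can be sketched with Python's standard library alone:

```python
# Minimal sketch of DAG-style task orchestration, the idea behind tools
# like Airflow. Task names are hypothetical; stdlib only.
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "dbt_models": {"load"},
}

def run_pipeline(dag):
    """Execute tasks in an order that respects their dependencies."""
    order = list(TopologicalSorter(dag).static_order())
    for task in order:
        print(f"running {task}")  # a real orchestrator would invoke the task here
    return order

executed = run_pipeline(dag)
```

A real orchestrator adds retries, scheduling, and state tracking on top of this ordering, but the dependency graph is the shared core.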

Beyond batch processing, I develop real-time and automated pipelines that improve efficiency and reduce costs, integrating CI/CD practices with Docker and monitoring solutions like Prometheus and Grafana to ensure system reliability and performance. I also bring a strong foundation in data science, including feature engineering, model deployment, and building interactive data visualizations.
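The monitoring side of this ultimately comes down to exporting counters and timings per task. Below is a minimal stdlib-only sketch of that pattern; in production these metrics would be exposed via prometheus_client and scraped by Prometheus, and the task name here is hypothetical:

```python
# Stdlib-only sketch of the per-task counters and timings a pipeline
# might export to Prometheus (prometheus_client plays this role in
# production). The "ingest" task is a hypothetical stand-in.
import time
from collections import defaultdict
from functools import wraps

METRICS = {"calls": defaultdict(int), "seconds": defaultdict(float)}

def instrumented(name):
    """Decorator recording call counts and wall-clock time per task."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                METRICS["calls"][name] += 1
                METRICS["seconds"][name] += time.perf_counter() - start
        return wrapper
    return decorator

@instrumented("ingest")
def ingest(rows):
    return len(rows)

ingest([1, 2, 3])
ingest([4, 5])
```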

Committed to delivering data-driven solutions, I focus on transforming complex, multi-source data into actionable insights that empower smarter decision-making.

🛠️ Tech Stack

  • Data Engineering: Apache Spark, Airflow, dbt, Astronomer, Terraform, Docker
  • Cloud (GCP): Google Cloud Storage, BigQuery, Dataproc
  • Monitoring: Prometheus, Grafana
  • ML & Data Science: Pandas, Scikit-learn, TensorFlow, Jupyter, Flask
  • Languages & Tools: Python, Bash, Git, Anaconda

📝 Recent Projects

ade-pipeline

  • Goal: Build a scalable, end-to-end data pipeline to process 100GB+ OpenFDA pharmaceutical datasets for medication safety analytics.
  • Highlights: Developed robust data ingestion and transformation workflows using Python, Apache Spark, and Airflow, with full test coverage and version control. Provisioned and optimized cloud infrastructure on GCP using Terraform, BigQuery, and Docker. Improved performance with memory-optimized ingestion, chunking, and distributed processing, cutting ingestion time by 30% through a hybrid metadata structure. Built advanced SQL models with dbt and ensured system reliability with Prometheus monitoring and Grafana dashboards.
  • Technologies: Python, Apache Spark, Airflow, dbt, Terraform, BigQuery, Docker, Prometheus, Grafana, Cloud Storage, Git
  • Results: Delivered a reliable, scalable platform enabling complex pharmaceutical data analytics with real-time monitoring, facilitating faster and more accurate medication safety insights.
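The memory-optimized, chunked ingestion described above can be sketched in plain Python as a simplified stand-in (the real pipeline used Spark's distributed readers; the record format and chunk size here are hypothetical):

```python
# Simplified sketch of chunked ingestion: process a large file in
# fixed-size batches so memory stays bounded regardless of file size.
# The record format is a hypothetical stand-in for the real data.
import csv
import io

def read_in_chunks(text_stream, chunk_size=10_000):
    """Yield lists of parsed CSV rows, at most chunk_size rows per list."""
    reader = csv.reader(text_stream)
    chunk = []
    for row in reader:
        chunk.append(row)
        if len(chunk) >= chunk_size:
            yield chunk
            chunk = []
    if chunk:  # flush the final partial chunk
        yield chunk

# Usage with an in-memory stream standing in for a 100GB+ file:
data = io.StringIO("\n".join(f"id{i},drug{i}" for i in range(25)))
chunks = list(read_in_chunks(data, chunk_size=10))
```

Because the function is a generator, only one chunk is held in memory at a time; a distributed engine like Spark applies the same idea across partitions.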
zoomcamp

  • Goal: Build an end-to-end data pipeline processing large-scale NYC taxi trip data with Kestra orchestration, dbt modeling, and Spark transformations.
  • Highlights: Implemented infrastructure as code with Terraform for consistent cloud deployment on GCP, and created automated workflows handling schema changes and data lineage.
  • Technologies: Kestra, PostgreSQL, dbt, Apache Spark, Terraform, BigQuery, Looker
  • Results: Delivered a scalable, reliable pipeline enabling timely and accurate business intelligence reporting, reducing manual intervention and supporting data-driven decisions with up-to-date analytics.
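A minimal sketch of the schema-change handling mentioned above, assuming additive evolution (new columns are accepted and backfilled with nulls so downstream models don't break; the column names are hypothetical):

```python
# Minimal sketch of additive schema evolution: unknown columns in an
# incoming batch widen the schema, and existing rows are backfilled
# with None. Column names are hypothetical.
def evolve_schema(schema, batch):
    """Return (updated_schema, normalized_rows) for a batch of dict rows."""
    updated = list(schema)
    for row in batch:
        for col in row:
            if col not in updated:
                updated.append(col)  # additive change: column order stays stable
    normalized = [{col: row.get(col) for col in updated} for row in batch]
    return updated, normalized

schema = ["trip_id", "fare"]
batch = [
    {"trip_id": 1, "fare": 12.5},
    {"trip_id": 2, "fare": 8.0, "tip": 2.0},  # upstream added a column
]
schema, rows = evolve_schema(schema, batch)
```

Treating schema changes as additive-only is a common simplification; renames and type changes usually need explicit migration rules instead.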
ml-pipeline

  • Goal: Predict income levels using the Adult Income Census dataset.
  • Highlights: Built an end-to-end ML pipeline covering data ingestion, preprocessing, model training (Random Forest, Decision Tree, Logistic Regression), and deployment with Flask.
  • Technologies: Anaconda, Python, Scikit-learn, Jupyter, Pandas, Numpy, Flask
  • Results: Achieved 85% accuracy with hyperparameter-tuned models; the deployed Flask service returns real-time predictions in under 200ms.
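A hedged sketch of that pipeline's core train-and-evaluate step with scikit-learn, using synthetic data as a stand-in for the census dataset (the actual project also trained Random Forest and Decision Tree models and tuned hyperparameters):

```python
# Minimal sketch of the train/evaluate stage with scikit-learn.
# Synthetic data stands in for the Adult Income Census dataset.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                    # stand-in features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # stand-in income label

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Preprocessing and model chained in a single Pipeline, so the same
# transformations are applied at training and prediction time.
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
```

Bundling preprocessing into the Pipeline is what makes Flask deployment straightforward: the serialized object accepts raw feature rows directly.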

📫 How to reach me

Email LinkedIn GitHub
