- bento.me/tawfik
- https://orcid.org/0009-0007-1846-825X
- in/tawfikyasser
- itistawfik
Stars
Flower detection using yolov5 trained on a custom dataset.
All the materials you need for Federated Learning: blogs, videos, papers, software, etc.
A tool for creating and running Linux containers using lightweight virtual machines on a Mac. It is written in Swift, and optimized for Apple silicon.
Official Dockerfile for Delta Lake
Resources for the preparation course for the Databricks Data Engineer Associate certification exam.
This repo contains "Databricks Certified Data Engineer Associate" questions and related docs.
Building a Data Lakehouse with open-source technology. Supports an end-to-end data pipeline, from source data on AWS S3 to the Lakehouse, with visualization.
Distinguished technical content across various fields of software engineering, simplifying complex programming concepts smoothly with stunning illustrative images.
End-to-end data pipeline that ingests, processes, and stores data. It uses Apache Airflow to schedule scripts that fetch data from an API, send the data to Kafka, and process it with Spark before storing it.
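A minimal sketch of the scheduling half of such a pipeline, assuming a hypothetical API URL, a local Kafka broker on localhost:9092, and the kafka-python client; the Airflow and KafkaProducer APIs are real, everything else is illustrative:

    # Airflow DAG sketch: fetch JSON from an API and publish it to Kafka.
    # Assumes Airflow 2.4+ (the `schedule` argument) and kafka-python.
    import json
    from datetime import datetime

    import requests
    from airflow import DAG
    from airflow.operators.python import PythonOperator
    from kafka import KafkaProducer

    def fetch_and_publish():
        # Hypothetical endpoint; replace with the real source API.
        records = requests.get("https://api.example.com/users", timeout=30).json()
        producer = KafkaProducer(
            bootstrap_servers="localhost:9092",
            value_serializer=lambda v: json.dumps(v).encode("utf-8"),
        )
        for record in records:
            producer.send("users_created", record)  # illustrative topic name
        producer.flush()

    with DAG(
        dag_id="api_to_kafka",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        PythonOperator(task_id="fetch_and_publish", python_callable=fetch_and_publish)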
Examples of advanced Apache Airflow functionality.
This repository is built in association with our position paper on "Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers". As part of this release, we share the information.
This is a repo with links to everything you'd ever want to learn about data engineering
A self-contained, ready-to-run Airflow ELT project. Can be run locally or within Codespaces.
Fully dockerized data warehouse (DWH) using Airflow, dbt, and PostgreSQL, with a dashboard built in Redash.
Code for "Efficient Data Processing in Spark" Course
This project demonstrates how to use Apache Airflow to submit jobs to an Apache Spark cluster in different programming languages, using Python, Scala, and Java as examples.
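The usual way to do this from Airflow is the SparkSubmitOperator shipped in the apache-airflow-providers-apache-spark package; a sketch with illustrative job paths and a hypothetical main class:

    # Sketch: submitting a PySpark script and a packaged Scala/Java job from Airflow.
    # Requires apache-airflow-providers-apache-spark; paths are illustrative.
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

    with DAG(dag_id="spark_jobs", start_date=datetime(2024, 1, 1), schedule=None) as dag:
        python_job = SparkSubmitOperator(
            task_id="python_job",
            application="/opt/jobs/wordcount.py",   # PySpark script
            conn_id="spark_default",                # Airflow connection to the cluster
        )
        scala_job = SparkSubmitOperator(
            task_id="scala_job",
            application="/opt/jobs/wordcount.jar",  # packaged Scala/Java job
            java_class="com.example.WordCount",     # hypothetical main class
            conn_id="spark_default",
        )
        python_job >> scala_job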
Uses Airflow, Postgres, Kafka, Spark, Cassandra, and GitHub Actions to establish an end-to-end data pipeline.
📃 hire me! resume built with jekyll and hosted on https://insanj.github.io/resume/
This project implements an ELT (Extract, Load, Transform) data pipeline with the Goodreads dataset, using Dagster (orchestration), Spark (computation), and dbt (transformation).
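Dagster models this kind of pipeline as software-defined assets; a minimal sketch with an illustrative CSV path and a pandas stand-in for the Spark/dbt steps:

    # Dagster sketch: two software-defined assets forming a tiny ELT graph.
    # The dataset path and column names are illustrative.
    import pandas as pd
    from dagster import Definitions, asset

    @asset
    def raw_books() -> pd.DataFrame:
        # Extract/Load: read the raw dataset (hypothetical local path).
        return pd.read_csv("data/goodreads_books.csv")

    @asset
    def top_rated_books(raw_books: pd.DataFrame) -> pd.DataFrame:
        # Transform: keep highly rated books (stand-in for the Spark/dbt work).
        return raw_books[raw_books["average_rating"] >= 4.5]

    defs = Definitions(assets=[raw_books, top_rated_books])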
Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka.
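With PyFlink's Table API, a Kafka source can be declared in Flink SQL and queried continuously; a sketch assuming a local broker and an illustrative clicks topic (the Kafka SQL connector jar must be on the classpath):

    # PyFlink sketch: declare a Kafka source via Flink SQL, then run a
    # continuous aggregation. Topic, broker, and schema are illustrative.
    from pyflink.table import EnvironmentSettings, TableEnvironment

    t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

    t_env.execute_sql("""
        CREATE TABLE clicks (
            user_id STRING,
            url     STRING,
            ts      TIMESTAMP(3)
        ) WITH (
            'connector' = 'kafka',
            'topic' = 'clicks',
            'properties.bootstrap.servers' = 'localhost:9092',
            'scan.startup.mode' = 'earliest-offset',
            'format' = 'json'
        )
    """)

    # Unbounded query: running click count per user.
    t_env.execute_sql(
        "SELECT user_id, COUNT(*) AS clicks FROM clicks GROUP BY user_id"
    ).print()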
A real-time Reddit data streaming pipeline for sentiment analysis of various subreddits.
Learn PySpark from basics to advanced. Check out the YouTube series: [PySpark - Zero to Hero].
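For orientation, a self-contained PySpark starter in the spirit of such a series, showing DataFrame creation, a filter, and a grouped aggregation:

    # PySpark basics sketch: build a DataFrame, filter, and aggregate.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("pyspark_basics").getOrCreate()

    df = spark.createDataFrame(
        [("alice", "eng", 90), ("bob", "eng", 75), ("carol", "ops", 88)],
        ["name", "team", "score"],
    )

    (df.filter(F.col("score") > 80)    # narrow transformation, no shuffle
       .groupBy("team")                # wide transformation, triggers a shuffle
       .agg(F.avg("score").alias("avg_score"))
       .show())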
Code for Data Pipelines with Apache Airflow
Get data from an API, run a scheduled script with Airflow, send data to Kafka and consume it with Spark, then write to Cassandra.
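A sketch of the Spark-to-Cassandra leg, assuming the spark-sql-kafka and spark-cassandra-connector packages are on the classpath and using illustrative topic, keyspace, and table names; streaming writes to Cassandra typically go through foreachBatch:

    # Structured Streaming sketch: consume Kafka, parse JSON, write to Cassandra.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StringType, StructField, StructType

    spark = (SparkSession.builder.appName("kafka_to_cassandra")
             .config("spark.cassandra.connection.host", "localhost")
             .getOrCreate())

    schema = StructType([
        StructField("id", StringType()),
        StructField("name", StringType()),
    ])

    stream = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", "localhost:9092")
              .option("subscribe", "users_created")   # illustrative topic
              .load()
              .select(from_json(col("value").cast("string"), schema).alias("v"))
              .select("v.*"))

    def write_batch(batch_df, batch_id):
        # Cassandra writes use the batch connector inside foreachBatch.
        (batch_df.write.format("org.apache.spark.sql.cassandra")
         .options(keyspace="demo_keyspace", table="users")  # illustrative names
         .mode("append").save())

    stream.writeStream.foreachBatch(write_batch).start().awaitTermination()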
Educational project on how to build an ETL (Extract, Transform, Load) data pipeline, orchestrated with Airflow.