dataengineergaurav
🌎
Working from Earth

Starred repositories

Run Claude Code 100% on-device with local AI on Apple Silicon. MLX-native Anthropic-API server, 65 tok/s Qwen 3.5 122B, Llama 3.3 70B, Gemma 4 31B. Private, offline, airgap-ready. Built for NDA / l…

Python 2,755 522 Updated Jun 5, 2026

Beginner, advanced, expert level Rust training material

Rust 14,578 1,138 Updated Jun 11, 2026

A GitHub Action that implements smart caching for rust/cargo projects

TypeScript 1,833 177 Updated Jun 8, 2026

📋 Survey papers summarizing advances in deep learning, NLP, CV, graphs, reinforcement learning, recommendations, etc.

2,895 289 Updated Mar 17, 2023

All-in-one guide to getting a tech job abroad 🌎

4,427 456 Updated Mar 19, 2026

This is the main repository for SDF documentation found at docs.sdf.com, as well as public schemas, benchmarks, and examples

Shell 125 21 Updated Feb 5, 2025

A pipeline orchestration tool

Rust 35 1 Updated Aug 2, 2024

Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

TypeScript 192,442 58,529 Updated Jun 14, 2026

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Python 45,803 17,237 Updated Jun 14, 2026

Implementing best practices for PySpark ETL jobs and applications.

Python 1 Updated Jan 1, 2023

LLM101n: Let's build a Storyteller

37,319 2,051 Updated Aug 1, 2024

Docker Apache Airflow

Shell 3,807 585 Updated Mar 1, 2023

Implementing best practices for PySpark ETL jobs and applications.

Python 2,111 805 Updated Jan 1, 2023

PySpark RDD, DataFrame, and Dataset examples in Python

Python 1,359 985 Updated Dec 7, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 97,134 14,854 Updated Jun 2, 2026

Fabric is an open-source framework for augmenting humans using AI. It provides a modular system for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.

Go 42,346 4,172 Updated Jun 9, 2026

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

80,120 9,326 Updated Feb 5, 2026

Ship RAG based LLM web apps in seconds.

Python 1,004 99 Updated Jan 29, 2024

LangServe 🦜️🏓

JavaScript 2,331 273 Updated May 5, 2026

Modular Python framework for AI agents and workflows with chain-of-thought reasoning, tools, and memory.

Python 2,541 237 Updated Jun 13, 2026

An Awesome List of Open-Source Data Engineering Projects

3,211 553 Updated Oct 4, 2024

🦀 Small exercises to get you used to reading and writing Rust code!

Rust 1 Updated May 14, 2023

pgAdmin is the most popular and feature-rich open-source administration and development platform for PostgreSQL, the most advanced open-source database in the world.

Python 3,668 862 Updated Jun 12, 2026