24 Oct 25

WhatTheDuck is an open-source web application built on DuckDB. It allows users to upload CSV and Parquet files, store them in tables, and perform SQL queries on the data. WhatTheDuck is a Python library available on GitHub that serves as a high-performance bridge for seamless data transfer and integration between the DuckDB analytical database and Pandas DataFrames.

by tmfnk 3 months ago

This is a repo with links to everything you’d ever want to learn about data engineering. The Data Engineering Handbook on GitHub is a comprehensive, open-source guide and curriculum intended to help aspiring and current professionals master the skills and tools required to become a Data Engineer.

by tmfnk 3 months ago saved 2 times

The Fullstack_data_course GitHub repository contains the curriculum and materials for a comprehensive course designed to teach users the skills required to become a full-stack data professional.

by tmfnk 3 months ago

🪄 Create rich visualizations with AI. Data Formulator is a Microsoft Research tool, available on GitHub, that uses AI to transform data and build visualizations from it.

by tmfnk 3 months ago

This example uses the datamapplot visualization library to create a 2D map visualization of Hacker News post data, showing clusters of related topics and discussions.

by tmfnk 3 months ago

This is an “awesome list” repository that curates fully functional, click-and-run Google Colaboratory notebooks and repositories covering a broad spectrum of topics in Data Science, Deep Learning, and various AI applications.

by tmfnk 3 months ago