Official repository for the paper "Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search" [ICML'25]
Updated Nov 12, 2025 - Python
Parallel and LAzY Analyzer for PDFs 🏖️
Optimized PySpark jobs by analyzing query execution plans and rewriting transformations for efficiency. Applied techniques such as reducing shuffles, tuning partitions, selecting efficient operators, and choosing optimal data formats. Demonstrates performance tuning for large-scale Spark ETL workloads using Python and PySpark.
Analyze Data with Pandas-based Networks. Documentation:
Built a MySQL database from scratch to serve a football stats API
Partition and Configuration Manager for SpiNNaker
Official PyTorch implementation of Superpoint Transformer introduced in [ICCV'23] "Efficient 3D Semantic Segmentation with Superpoint Transformer" and SuperCluster introduced in [3DV'24 Oral] "Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering"
A project about algorithms for dividing area and perimeter into equal parts
A scalable real-time chat backend built with FastAPI, Redis, and Celery. It supports one-to-one and group chats, offline message persistence, user tracking through Redis, and rate limiting with a sliding window algorithm. The system uses a Celery task queue for throttled email notifications and maintains partitioned PostgreSQL message tables.
PyFluxPro V3.4 is a significant upgrade from previous versions. It has several new features and improved stability, and was introduced ahead of the 2021 OzFlux Data Workshop.
Minimal library that enables partitioning of iterable collections in a concise manner.
Windows-first PySpark batch pipeline: ingest raw → bronze Parquet, run DQ checks, publish curated silver. PowerShell wrapper adds Spark hygiene, parallelism controls, and step logs.
Hedonic Games for Network Clustering
This project demonstrates key PySpark performance optimization techniques using a synthetic banking transactions dataset (~5,000 records). Built using Databricks and Delta Lake.
Python library for manipulating data split into a collection of groups stored in Zarr format.
Lab work from the Database Design course, 4th semester, ITMO University (2025)
Simple function for building ensembles of iterables that are disjoint partitions of an overall Cartesian product.
Tools to encrypt/partition files, as well as decrypt them and extract.
Naive k-means clustering from scratch in vanilla Python