Stars
👻 Ghostty is a fast, feature-rich, and cross-platform terminal emulator that uses platform-native UI and GPU acceleration.
A unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerful framework.
Official inference repo for FLUX.1 models
[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents
A high-throughput and memory-efficient inference and serving engine for LLMs
Efficient Triton Kernels for LLM Training
Minimalistic large language model 3D-parallelism training
Minimalistic 4D-parallelism distributed training framework for educational purposes
Toolkit for linearizing PDFs for LLM datasets/training
Optimized primitives for collective multi-GPU communication
Script to download all your Snapchat memories
High performance self-hosted photo and video management solution.
A Survey on Data Selection for Language Models
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
MINT-1T: A one trillion token multimodal interleaved dataset.
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Utilities intended for use with Llama models.
Code to accompany blog post https://reorchestrate.com/posts/sqlite-transactions
Beyond file syncing and sharing, a new way to organize your files with extensible file properties and flexible views
Build and share delightful machine learning apps, all in Python. Star to support our work!
Flash OS images to SD cards & USB drives, safely and easily.
Open-Sora: Democratizing Efficient Video Production for All
Open data product with real estate listings from Idealista. The datasets are for three major cities in Spain and the year 2018. https://doi.org/10.1177/23998083241242844
This repository is a curated collection of the most exciting and influential CVPR 2024 papers. 🔥 [Paper + Code + Demo]
Communications from the Steering Council