- Computer Vision Center
- Barcelona
- http://dali92002.github.io/
Stars
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Get your documents ready for gen AI
Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout
The official repo of the Comics Survey: "A missing piece in Vision and Language: A Survey on Comics Understanding"
A collection of diffusion model papers categorized by subarea
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
Annotation system for labelling bounding boxes using OpenCV
[ICDAR 2024] (Best Student Paper🏆) Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation
Official Implementation for "Transferring Unconditional to Conditional GANs with Hyper-Modulation" CVPRW 22 https://arxiv.org/abs/2112.02219
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev…
Refine high-quality datasets and visual AI models
Python implementation accompanying the Transformer Inertial Poser paper at SIGGRAPH Asia 2022
A real-time system that simultaneously captures human pose, reconstructs the scene in sparse 3D points, and localizes the human in the scene with 6 IMUs and a body-worn phone camera
Official Code for ECCV 2022 paper "AvatarPoser: Articulated Full-Body Pose Tracking from Sparse Motion Sensing"
DocILE: Document Information Localization and Extraction Benchmark
Implementation of Denoising Diffusion Probabilistic Model in Pytorch
A latent text-to-image diffusion model
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
High-Resolution Image Synthesis with Latent Diffusion Models
Handwriting Synthesis with RNNs ✏️
Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
OCR Annotations from Amazon Textract for Industry Documents Library
Convert Machine Learning Code Between Frameworks
Generative Adversarial Convolutional Neural Network