  • Carnegie Mellon University
  • Pittsburgh, PA, USA

Organizations

@oaqa @lapps @asyml

HugeCTR is a high-efficiency GPU framework designed for Click-Through-Rate (CTR) estimation training

C++ 1,055 204 Updated Mar 12, 2026

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Python 6,189 572 Updated Aug 22, 2025

SPLADE: sparse neural search (SIGIR21, SIGIR22)

Python 988 95 Updated May 3, 2024

Awesome Search - this is all about the (e-commerce, but not only) search and its awesomeness

Shell 1,528 132 Updated Apr 5, 2026

Full text search that feels like a numpy array

Python 307 12 Updated Feb 1, 2026

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Python 12,151 1,067 Updated Mar 8, 2026

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Python 5,023 617 Updated Jul 2, 2024

Generative Representational Instruction Tuning

Jupyter Notebook 689 50 Updated Jun 25, 2025

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

Python 42,260 3,365 Updated Apr 4, 2026

A Pythonic framework to simplify AI service building

Python 2,802 193 Updated Jan 31, 2026

An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

Python 8,282 962 Updated Feb 25, 2022

An industrial deep learning framework for high-dimension sparse data

PureBasic 4,305 1,027 Updated Sep 25, 2024

A high performance and generic framework for distributed DNN training

Python 3,715 493 Updated Oct 3, 2023

Navigating Spreading-out Graph For Approximate Nearest Neighbor Search

C++ 725 165 Updated Sep 26, 2025

2020 MIND news recommendation first-place solution

Jupyter Notebook 93 25 Updated Mar 10, 2021

Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

Python 11,409 883 Updated Jan 13, 2026

Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP)

5,971 991 Updated Feb 15, 2023

Python 759 87 Updated May 22, 2023

The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.

TypeScript 3,648 371 Updated Mar 26, 2026

Distributed Peer-to-Peer Web Search Engine and Intranet Search Appliance

Java 3,863 475 Updated Apr 1, 2026

Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index (DenSPI)

Python 200 24 Updated Jul 6, 2023

TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/

Python 3,398 440 Updated Jul 10, 2025

A comprehensive list of awesome contrastive self-supervised learning papers.

1,307 126 Updated Sep 10, 2024

Beyond Accuracy: Behavioral Testing of NLP models with CheckList

Jupyter Notebook 2,051 209 Updated Jan 9, 2024

Label Studio is a multi-type data labeling and annotation tool with standardized output format

TypeScript 26,935 3,465 Updated Apr 6, 2026

🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…

Python 66,225 6,664 Updated Jan 22, 2026

Dense Passage Retriever is a set of tools and models for open-domain Q&A tasks.

Python 1,863 316 Updated Apr 6, 2023

Tools for cleaning and normalizing text data

R 257 30 Updated Mar 5, 2026

Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"

Python 1,611 191 Updated Aug 12, 2020

A fast, high-quality neural vocoder.

Python 297 51 Updated Jul 18, 2023