swj0419

Weijia Shi swj0419

https://weijia-shi.netlify.app/

Stars

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 159,302 32,858 Updated Apr 13, 2026

deepset-ai / haystack

Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing…

MDX 24,825 2,714 Updated Apr 13, 2026

uwdata / visualization-curriculum

A data visualization curriculum of interactive notebooks.

Jupyter Notebook 1,358 275 Updated Apr 13, 2026

harvard-edge / cs249r_book

Machine Learning Systems

JavaScript 23,587 2,831 Updated Apr 13, 2026

camel-ai / camel

🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org

Python 16,681 1,865 Updated Apr 13, 2026

rllm-org / rllm

Democratizing Reinforcement Learning for LLMs

Python 5,420 539 Updated Apr 13, 2026

castorini / pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Python 2,046 503 Updated Apr 12, 2026

EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.

Python 12,154 3,179 Updated Apr 8, 2026

openai / gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 20,002 2,062 Updated Mar 27, 2026

zhaochen0110 / Awesome_Think_With_Images

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

1,416 42 Updated Mar 9, 2026

lucidrains / transfusion-pytorch

Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Python 1,336 71 Updated Jan 27, 2026

texttron / tevatron

Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.

Python 733 125 Updated Jan 26, 2026

PaddlePaddle / ERNIE

The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade development toolkit based on PaddlePaddle.

Python 7,697 1,449 Updated Jan 4, 2026

ML-KULeuven / PySDD

Python package for Sentential Decision Diagrams (SDD)

C 73 22 Updated Dec 15, 2025

facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 32,199 6,676 Updated Sep 30, 2025

allenai / OLMoE

OLMoE: Open Mixture-of-Experts Language Models

Jupyter Notebook 1,007 107 Updated Sep 23, 2025

art-ai / pypsdd

The Python PSDD Package

Python 18 7 Updated Jul 20, 2025

1jsingh / negtome

Official Implementation for paper: Negative Token Merging: Image-based Adversarial Feature Guidance

Jupyter Notebook 74 2 Updated Jun 23, 2025

Open-Reasoner-Zero / Open-Reasoner-Zero

Official Repo for Open-Reasoner-Zero

Python 2,089 119 Updated Jun 2, 2025

InfiAgent / InfiAgent

InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks (ICML 2024)

Python 187 23 Updated May 29, 2025

salesforce / factCC

Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper

Python 308 30 Updated May 1, 2025

alvin-zyl / CoLA

Implementation of CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation

Python 26 2 Updated Feb 18, 2025

pltrdy / rouge

A full Python Implementation of the ROUGE Metric (not a wrapper)

Python 718 100 Updated Nov 19, 2024

sangminwoo / awesome-vision-and-language

A curated list of awesome vision and language resources (still under construction... stay tuned!)

560 45 Updated Nov 4, 2024

r2llab / wrangl

Parallel data preprocessing for NLP and ML.

Python 34 2 Updated Nov 1, 2024

huggingface / OBELICS

Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M documents, 115B text tokens and 353M images.

Python 212 11 Updated Aug 28, 2024

pliang279 / awesome-multimodal-ml

Reading list for research topics in multimodal machine learning

6,861 899 Updated Aug 20, 2024

complementizer / wcep-mds-dataset

Python 61 15 Updated Aug 20, 2024

hamishivi / EasyLM

Forked from young-geng/EasyLM

Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.

Python 78 16 Updated Aug 17, 2024

geektutu / interview-questions

Machine learning, deep learning, Python, and Go interview and written-test questions

Jupyter Notebook 1,149 210 Updated Aug 6, 2024
