


Unsupervised text tokenizer for Neural Network-based text generation.

C++ 11,718 1,332 Updated Mar 26, 2026

Debugging, monitoring and visualization for Python Machine Learning and Data Science

Jupyter Notebook 3,467 360 Updated Mar 17, 2026

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv

Python 6,762 390 Updated Mar 16, 2026

Code for collecting, processing, and preparing datasets for the Common Pile

Python 253 25 Updated Feb 11, 2026

A full spaCy pipeline and models for scientific/biomedical documents.

Python 1,936 249 Updated Dec 4, 2025

Modeling, training, eval, and inference code for OLMo

Python 6,432 725 Updated Nov 24, 2025

Acceptance rates for the major AI conferences

Jupyter Notebook 4,740 316 Updated Sep 23, 2025

Apache PDFBox extension for precisely extracting character/symbol locations and identities from born-digital PDF files.

Java 19 7 Updated Sep 16, 2025

Jupyter Notebook 1,043 165 Updated Jul 9, 2025

A collection of scripts that build docker images for various use-cases.

Dockerfile 3 5 Updated Feb 3, 2025

Code for the paper "Language Models are Unsupervised Multitask Learners"

Python 24,712 5,864 Updated Aug 14, 2024

TensorFlow code and pre-trained models for BERT

Python 39,945 9,708 Updated Jul 23, 2024

Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON)

Python 461 89 Updated Apr 11, 2024

Code for Defending Against Neural Fake News, https://rowanzellers.com/grover/

Python 919 218 Updated May 22, 2023

Code and Data for Evaluation WG

Python 42 24 Updated May 4, 2022

Code for the paper SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts (AKBC 2021). https://openreview.net/forum?id=OFLbgUP04nC

Python 30 5 Updated Oct 17, 2021

We evaluate many models used for biomedical and clinical NLP tasks and train new models that perform much better.

Python 163 26 Updated Jul 29, 2021

An Interactive Tool for Scalable and Reproducible Error Analysis.

Python 109 11 Updated Jul 22, 2021

Library to scrape and clean web pages to create massive datasets.

Python 2,241 321 Updated Nov 11, 2020

Replication code for "With Little Power Comes Great Responsibility"

Jupyter Notebook 39 1 Updated Oct 15, 2020

A large (>5k) collection of search questions asked about Coronavirus 🦠

Python 14 1 Updated Mar 21, 2020