Stanford & UT Austin
- https://jiaxin-pei.github.io/
Stars
ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents, NeurIPS 2025
robots.txt-style permission manifest for web agents
Official code of "The Automated but Risky Game: Modeling Agent-to-Agent Negotiations and Transactions in Consumer Markets"
Data and experiment code for "Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals’ Subjective Text Perceptions" (ACL 2025)
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
The official repo for SocKET: Social Knowledge Evaluation Tests
A curated list of awesome Active Learning
potato: portable text annotation tool
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
Code for the paper "Modeling Information Change in Science Communication with Semantically Matched Paraphrases" from EMNLP 2022
A suite of tools for managing crowdsourcing tasks from the inception through to data packaging for research use.
An NLP processing pipeline for characters in fanfiction. Developed by students at Carnegie Mellon University from 2019-2021.
Tools for collecting social media data around focal events
Official repository for the ICWSM '21 paper "More than meets the tie: Examining the Role of Interpersonal Relationships in Social Networks"
A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).
A PyTorch implementation of "TextFuseNet: Scene Text Detection with Richer Fused Features".
Computational Psychiatry Online Journal Club (CPoJC)
Topic Modeling for The New York Times News Dataset
A dataset containing 37 million Douban Dushu comments
Demographic and Economic Data for Tracts and Counties
For associated data that can be mashed up with ours. This is data like Census demographics, CDC flu rates, and hospital beds