- The University of Tokyo
- http://jeonghunbaek.net/
Stars
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
Fast and memory-efficient exact attention
Official repository for "NoLiMa: Long-Context Evaluation Beyond Literal Matching"
Breakthrough Method for Agile AI-Driven Development
DeepSeek Coder: Let the Code Write Itself
Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
This repo contains the code for generating the ToxiGen dataset, published at ACL 2022.
Segmentation of text in manga images
LEAKED SYSTEM PROMPTS FOR CHATGPT, GEMINI, GROK, CLAUDE, PERPLEXITY, CURSOR, DEVIN, REPLIT, AND MORE! - AI SYSTEMS TRANSPARENCY FOR ALL! 👐
A final sanity checklist to help your CS paper get accepted, not desk rejected.
LSP helper for ruff - an extremely fast Python linter, written in Rust.
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
Janus-Series: Unified Multimodal Understanding and Generation Models
This is a book on the Phi family of SLMs for getting started with Phi models. Phi is a family of open-source AI models developed by Microsoft. Phi models are the most capable and cost-effective small langua…
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
A lightweight framework for evaluating visual-language models.
A simple tool to estimate the reading order of comic panels