Lists (32)
Agents
Browser Automation
Character model
Clustering
Clustering loss
Curriculum Learning
Database Subsetting
Dataset distillation
Deberta
Dependencies Graph
Efficient Training
Embeddings
Encoder pretraining
General training
Geolocation
GitHub
Graph Visualization
Hard Negatives
Instruction Embeddings
Job titles
OSINT
Recommendation systems
Repository Parsing
Scraping
Search
Testing
Tools
Topic Modeling
Utils
Vibe Coding
Visualization
Starred repositories
Youtu-Embedding is an industry-leading, general-purpose text representation model developed by Tencent Youtu Lab.
DeepDiff: Deep Difference and search of any Python object/data. DeepHash: Hash of any object based on its contents. Delta: Use deltas to reconstruct objects by adding deltas together.
E2Rank: Your Text Embedding can Also be an Effective and Efficient Listwise Reranker
A Python tool that automatically sorts class methods by visibility and type
Add properties and method specializations to Python enumeration values with a simple declarative syntax.
EvaByte: Efficient Byte-level Language Models at Scale
Code repository for the paper "MrT5: Dynamic Token Merging for Efficient Byte-level Language Models."
Skylos is the watchdog for your repository. It maps your code's structure to hunt down dead logic, trace tainted data, and kill security rot
Vocabulary Trimming (VT) is a model compression technique, which reduces a multilingual LM vocabulary to a target language by deleting irrelevant tokens from its vocabulary. This repository contain…
An opinionated approach to productive development with Claude Code
📑 PageIndex: Document Index for Reasoning-based RAG
A full guide to Claude tips and tricks: how to get the most out of Claude Code, with an attempt to cover every available command, including hidden ones.
A curated list of awesome commands, files, and workflows for Claude Code
multilspy is an LSP client library in Python, intended for building applications around language servers.
Intelligent automation and multi-agent orchestration for Claude Code
OmniMatch: Joinability Discovery in Data Products
This is the code repo for our paper "Learning More Effective Representations for Dense Retrieval through Deliberate Thinking Before Search".
State-of-the-art paired encoder and decoder models (17M-1B params)