- San Jose, California
- in/nickhillprofile
Stars
Tools for Python coroutines and advanced scheduling for `asyncio`
TPU inference for vLLM, with unified JAX and PyTorch support.
Achieve state-of-the-art inference performance with modern accelerators on Kubernetes
A high-throughput and memory-efficient inference and serving engine for LLMs
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
High-performance Netty- and Thrift-based microservice RPC library for Java
Abstracted helper classes providing consistent key-value store functionality, with ZooKeeper and etcd3 implementations
Fake XRandR configurations for multi-head setups with crappy video drivers, like fakexinerama but with xrandr
Java utilities for working with CompletionStages
Netty project - an event-driven asynchronous network application framework
The Java gRPC implementation. HTTP/2-based RPC