- https://snorkel.ai/
- Palo Alto, CA
- https://www.linkedin.com/in/hiromuhota
- @HiromuHota
Stars
💫 Industrial-strength Natural Language Processing (NLP) in Python
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.
The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while control…
Low-code framework for building custom LLMs, neural networks, and other AI models
Ready-to-run Docker images containing Jupyter applications
A system for quickly generating training data with weak supervision
Representation learning on large graphs using stochastic graph convolutions.
A command line utility to display dependency tree of the installed Python packages
Deep neural network to extract intelligent information from invoice documents.
Simple wrapper of tabula-java: extract tables from a PDF into a pandas DataFrame
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
A benchmark for LLMs on complicated tasks in the terminal
Generate modern Python clients from OpenAPI
A web interface to extract tabular data from PDFs
Strip output from Jupyter and IPython notebooks
Harbor is a framework for running agent evaluations and creating and using RL environments.
Simple reference implementation of GraphSAGE.
A Japanese NLP Library using spaCy as framework based on Universal Dependencies
🌲 A tool for converting PDF into hOCR with text, tables, and figures being recognized and preserved.
Snorkel MeTaL: A framework for training models with multi-task weak supervision
A knowledge base construction engine for richly formatted data
Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.
Neural Symbolic Machines is a framework to integrate neural networks and symbolic representations using reinforcement learning, with applications in program synthesis and semantic parsing.