Stars
Resources and presentations for the XML-TEI course for M2 TNAH students at the École des chartes
🌐 Jekyll is a blog-aware static site generator in Ruby
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
A Python library for automating interaction with websites.
Open source annotation tool for machine learning practitioners.
Accessible large language models via k-bit quantization for PyTorch.
CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation. 【Based on PyTor…
Page-wise text recognition with lower-supervision line data models
Tesseract Open Source OCR Engine (main repository)
Image Retrieval in Digital Libraries - A Multicollection Experimentation of Machine Learning techniques
Image restoration with neural networks but without learning.
Ultralytics YOLOv5 in PyTorch > ONNX > CoreML > TFLite
scikit-learn: machine learning in Python
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Thesis and Python module project for retrieving the texts of the French-language branch and performing several statistical operations on them.