Python package for consolidated and extensive univariate and bivariate data analysis and visualization, catering to both categorical and continuous datasets.

Python 204 28 Updated Sep 28, 2016

Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼

Jupyter Notebook 33,966 7,194 Updated Dec 18, 2025

Tesseract Open Source OCR Engine (main repository)

C++ 71,464 10,427 Updated Dec 15, 2025

NLP, before and after spaCy

Python 2,234 249 Updated Sep 22, 2023

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 154,028 31,482 Updated Dec 19, 2025

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

Python 37,290 6,256 Updated Jul 26, 2024

✨ Fast Coreference Resolution in spaCy with Neural Networks

C 2,889 473 Updated Apr 13, 2023

☁️ Build multimodal AI applications with cloud-native stack

Python 21,810 2,239 Updated Mar 24, 2025

⚒️ Data preprocessing is the process of transforming raw data into an understandable format. It is also an important step in data mining as we cannot work with raw data. The quality of the data sho…

Jupyter Notebook 9 2 Updated Nov 15, 2021

MLOps examples

Jupyter Notebook 2,054 582 Updated Aug 2, 2024

A full spaCy pipeline and models for scientific/biomedical documents.

Python 1,907 248 Updated Dec 4, 2025

🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy

Python 1,402 176 Updated Nov 7, 2025

Multiplatform plotting library based on the Grammar of Graphics

Kotlin 1,725 55 Updated Dec 18, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 41,030 4,669 Updated Dec 18, 2025

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

Jupyter Notebook 29,706 13,240 Updated Jun 13, 2024

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

28,550 3,827 Updated Jul 18, 2024

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

Jupyter Notebook 14,663 3,389 Updated Aug 12, 2024

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 36,033 5,085 Updated Dec 16, 2025

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

Python 21,002 3,038 Updated Dec 16, 2025

Datasets, Transforms and Models specific to Computer Vision

Python 17,381 7,183 Updated Dec 18, 2025

Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).

Python 15,384 3,617 Updated Nov 29, 2025

Version control for machine learning

Python 1,671 72 Updated Feb 25, 2025

ImageMagick is a free, open-source software suite for creating, editing, converting, and displaying images. It supports 200+ formats and offers powerful command-line tools and APIs for automation, …

C 15,178 1,516 Updated Dec 18, 2025

Synthetic data generators for tabular and time-series data

Jupyter Notebook 1,597 256 Updated Dec 18, 2025

🪐 End-to-end NLP workflows from prototype to production

Python 1,405 469 Updated Oct 15, 2024

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data…

MDX 23,668 2,528 Updated Dec 18, 2025