Hi there, I'm Shailesh Kumar 👋!
- A Machine Learning / GenAI Engineer based in India.
- Working as Lead Data Scientist at Katonic AI.
- I love math, programming, data science, and books.
- Creator of ExplainIt, an open-source package for drift detection & data quality management.
- Open Source Enthusiast.
- See my portfolio at shaileshkumar97.github.io.
- Writing Python, SQL, HTML/CSS, PostgreSQL, MySQL, Redis, etc.
- Contributing to Open Source.
- Mostly active on LinkedIn.
- Currently building end-to-end, production-ready generative AI assistants/agents that handle different types of knowledge bases for a variety of use cases.
- Previously built an end-to-end production pipeline for processing short videos across different use cases.
- 🎛 Machine Learning Operations (MLOps):
  - Language: Python • SQL
  - Framework: MLflow • Kubeflow • Elyra • Dash • FastAPI • Streamlit
  - Databases: PostgreSQL • MySQL • Redis • Snowflake
  - Concepts: Data Pipeline • Feature Store • Data Governance • Model Pipeline • Model Deployment • App Deployment • Model Monitoring • Drift Detection • Model Explainability
- 👨‍💻 Python Developer:
  - Open Source Projects:
    - Explainit: A modern, enterprise-ready business intelligence web application SDK for drift detection, monitoring & data quality management.
  - In-House SDK (for katonic.ai):
    - Feature Store: To manage the end-to-end life cycle of features and integrate with existing data stores, feature pipelines, data governance, and ML platforms.
    - Connectors: To access data from different databases/warehouses and store it at a given destination.
    - FileManager: To access, store, and update/manipulate objects within the Katonic file browser.
    - Pipeline: To convert an existing notebook into a Kubeflow pipeline.
    - AutoML: To build, train & log different machine learning and deep learning models.
    - Log: To quickly register trained models with MLflow on the platform for deployment to the production environment.
- 🧮 Machine Learning:
  - Language: Python • SQL
  - Framework: Scikit-Learn • XGBoost • CatBoost • Pandas • Plotly • Matplotlib • PySpark
  - Databases: PostgreSQL • MySQL
  - Big Data: Spark • Data Lake (Delta, Hudi, Hive)
  - Protocol: REST
- 🤖 Deep Learning:
  - Language: Python
  - Framework: PyTorch • TensorFlow • Keras • OpenCV • Librosa
- 🗄️ Backend:
  - Language: Python
  - Framework: FastAPI • Flask • Streamlit • Dash
  - Databases: PostgreSQL • MySQL • AWS S3 • Redis • Snowflake
  - System Architecture: Monolithic • Modular
  - Protocol: REST
- 🖥 Frontend:
  - Language: HTML • CSS • Python
  - Framework/Library: Dash • Streamlit
  - Utils: Bootstrap • Modular CSS
- 🎡 Ecosystem:
  - Containerization: Docker
  - Version Control: Git • GitHub
  - CI/CD: GitHub Actions
  - Project Management: GitHub Projects