Bharat Manu
New York | bm4135805@gmail.com | +1 (650) 480 3312
Summary
Adaptable Data Science professional with 15+ years of experience delivering machine learning, AI, and GenAI-powered solutions for enterprise applications across
the finance, insurance, and technology sectors. Specialized in building and deploying ML pipelines, LLM-backed APIs, and intelligent data products using Python, Azure ML, and
OpenAI/Llama/Mistral frameworks. Proven ability to lead end-to-end projects, from raw data ingestion to real-time model deployment, using scalable architectures and modern
DevOps practices. Strong focus on MLOps, reproducibility, data quality, and measurable business impact.
Professional Work Experience
Jefferies Group Remote: New York City
Lead Engineer / DS Jan 2021 - Present
Developed scalable, production-grade Python microservices for AI/ML model inference and business rule enforcement. Designed and maintained secure RESTful APIs
(Flask) for GenAI model interaction and integration with internal claim systems.
Built ML models for transaction classification, fraud scoring, and risk assessment using Python, scikit-learn, and XGBoost on millions of financial events. Designed a
retrieval-augmented generation (RAG) system with LangChain + OpenAI to summarize customer portfolios and support queries via a chatbot UI.
Monitored AI/ML models in production environments, identified performance bottlenecks, and implemented improvements for reliability and accuracy. Automated
deployment and CI/CD pipelines for AI/ML models using Azure DevOps, improving release efficiency and reducing manual errors.
Developed ML pipelines in Python and R, integrating caret, randomForest, and glmnet for model development. Used NLP techniques for document classification and
keyword extraction from claims documents using NLTK, spaCy, and TF-IDF.
Developed real-time data pipelines using PySpark and Azure Data Factory for ingesting and transforming third-party credit and KYC data. Deployed Python-based REST
APIs (FastAPI) exposing scoring and summarization endpoints, containerized using Docker and deployed on Azure Kubernetes Service (AKS).
Built end-to-end data pipelines using Pandas, PySpark, and Azure Data Factory for claims preprocessing, document OCR, and enrichment. Integrated OpenAI’s GPT-4 and
LangChain agents with custom prompt templates for underwriting and policy recommendations. Deployed containerized APIs using Docker and Azure Kubernetes Service
(AKS), with role-based access control and autoscaling.
Created streaming data ingestion modules for real-time policy updates using Azure Event Hubs and Python consumers. Implemented model monitoring dashboards with
Python scripts for performance metrics, data drift, and API usage.
Refactored legacy scripts into modular, testable Python packages following PEP-8 and TDD best practices. Built automated validation checks for document ingestion
pipelines with retry logic, schema checks, and logging. Led a team of 3 junior Python engineers and provided internal tools for API testing, mocking, and load simulation.
FinEdge Analytics Remote: New York City
Software Engineer III Feb 2015 - Jan 2021
Built and deployed supervised learning models (Logistic Regression, Random Forest, XGBoost, SVM) to predict credit default risk and customer churn, improving prediction accuracy.
Implemented unsupervised learning techniques (K-Means, Hierarchical Clustering, DBSCAN) for customer segmentation, used in targeted marketing campaigns.
Leveraged Azure Machine Learning Studio for model training, versioning, and operationalization of pipelines. Integrated data preprocessing workflows using PySpark within
Azure Databricks and managed datasets in Azure Data Lake Storage Gen2.
Secured AI/ML infrastructure using managed identities, role-based access controls (RBAC), and Azure Key Vault. Ensured responsible AI implementation by incorporating
fairness, transparency, and explainability metrics into deployed models.
Built and deployed Python-based APIs using FastAPI to expose LLM capabilities via secure, tokenized endpoints. Created a hybrid data pipeline combining Azure Data Lake
+ SQL + Pandas for portfolio analysis, risk scoring, and personalization.
Used TypeScript + React on the front end to consume AI-backed APIs for live financial assistant dashboards. Integrated Cohere and OpenAI models with dynamic routing
based on latency and accuracy, controlled via API parameters.
Built robust data mapping frameworks between various vendor schemas and the unified canonical data model to streamline processing and analytics readiness. Designed and
deployed CI/CD pipelines using Jenkins, Docker, and Git to automate testing and deployment of ETL workflows.
Amdocs Remote: New York City
Senior Software Engineer Jan 2013 - Jan 2015
Designed and implemented RESTful APIs and a microservices architecture for seamless data exchange between financial applications. Developed financial risk models and
algorithmic trading components using Python libraries (NumPy, SciPy, Pandas).
Utilized Azure Synapse Analytics to query and transform large volumes of historical tick and trade data, supporting deep factor analysis. Monitored quant systems using
Azure Monitor and Log Analytics, proactively identifying performance bottlenecks and downtime.
Designed RESTful web services using Django, with emphasis on improved security for the service using Django-HTTP Auth with HTTPS. Migrated the
Django database from SQLite to MySQL to PostgreSQL with complete data integrity, and designed, developed, and deployed CSV parsing using a big data approach.
Tuned hyperparameters using GridSearchCV, RandomizedSearchCV, and Bayesian Optimization. Designed a model monitoring dashboard to track performance drift using
Power BI and Azure Monitor.
Deloitte Stanford, CA
Software Engineer Sep 2011 - Jan 2013
Utilized SQL Server Management Studio (SSMS) for query tuning, troubleshooting, and optimizing ETL workflows, significantly improving data pipeline performance and
reducing latency.
Designed, deployed, and supported scalable APIs using Azure API Management (APIM) to expose internal and third-party services securely. Developed custom connectors
and proxies using Azure Functions, enabling backend integration with SQL, Blob, Dynamics, and third-party services.
Designed and implemented early machine learning models (Linear Regression, Naive Bayes, Decision Trees) using scikit-learn for predictive analytics (e.g., customer
response prediction).
Used matplotlib and seaborn to visualize trends and correlations across financial and operational datasets. Worked with SQLite and MySQL databases to fetch, aggregate, and
export data for analysis.
Created basic REST APIs using Python’s Flask framework for internal tools and dashboard data feeds.
Infosys Stanford, CA
System Engineer Feb 2010 - Aug 2011
Designed and developed the UI of the website using HTML5, CSS3, JavaScript, React, and jQuery. Used GraphQL to provide a complete and understandable description of the data in
our API, giving clients the power to ask for exactly what they need.
Created scheduled jobs using Celery and cron to automate data synchronization between online transaction systems and reporting databases. Built data ingestion pipelines,
importing large volumes of transaction logs and customer activity into HDFS for batch processing.
Integrated Azure Functions to trigger portfolio rebalancing workflows based on real-time market events and model outputs. Developed secure APIs for strategy execution and
analytics using Azure API Management, Azure App Services, and Flask/FastAPI.
Education
BS in ESC (Engineering Science Course), RGPV University, India (2007)
MS in IT, Stratford University (2008)