svarshneysjsu/README.md

Hi there! I'm Saumya Varshney 👋

🎓 Master's in Applied Data Science | San Jose State University
📍 San Jose, California


About Me

I'm a passionate AI Engineer and Data Scientist currently working on real-world applications of Large Language Models (LLMs), Generative AI, and multi-agent systems. I specialize in designing cloud-based ML pipelines, building autonomous AI agents, and experimenting with Chain-of-Thought (CoT) and Chain-of-Draft (CoD) prompting techniques to improve reasoning in LLMs.

My work bridges the gap between cutting-edge research and scalable systems, whether it's building decision-making bots, fine-tuning transformer models, or enabling intelligent data pipelines on AWS and GCP.


🛠️ Technical Skills

🤖 AI & LLMs

  • LLMs & Agentic AI: GPT-4, Claude, OpenAI API, LangChain, CrewAI, Retrieval-Augmented Generation (RAG)
  • NLP & Transformers: Hugging Face Transformers, BERTScore, Generative AI, NLP, Explainability & Interpretability

📈 Machine Learning & Data Science

  • Core ML: Scikit-learn, TensorFlow, PyTorch, Predictive Modeling, Anomaly Detection, Reinforcement Learning, Time-Series Forecasting
  • Data Science: Feature Engineering, A/B Testing, Model Evaluation, Statistical Analysis, KPI Reporting

🧪 Data Engineering & ETL

  • ETL & Pipelines: Apache Airflow, AWS Glue, PySpark
  • Databases: MySQL, MongoDB, Google BigQuery, AWS RDS, Redshift
  • Query Languages: SQL, NoSQL

☁️ Cloud & MLOps

  • Cloud Platforms:
    • 🟧 AWS: S3, Lambda, Glue, RDS, Redshift, Step Functions
    • 🟦 GCP: Vertex AI Workflows, Cloud Functions
  • MLOps & DevOps: Docker, Git, GitHub Actions, CI/CD workflows

💻 Programming & Frameworks

  • Languages: Python, SQL, Bash, PowerShell
  • Frameworks: Flask, FastAPI

📊 Visualization & BI Tools

  • Tableau, Power BI, Microsoft Excel, Google Sheets, Google Apps Script

🚀 Deployment & Interfaces

  • Gradio, Hugging Face Model Hub & Spaces (ZeroGPU), REST APIs

🚀 Featured Projects

1. Go2Bot OpenAI Integration

Description:
Integrates the Unitree Go2 robot with OpenAI, developed during a summer research project. Features voice command processing and AI-driven task execution.

Technologies Used:
Python, OpenAI API, Robotics Integration

GitHub Repository:
Go2Bot-OpenAI-Integration


2. AWS-Enabled Data Pipeline for Weather Data Analysis

Description:
Developed an AWS-based data pipeline for real-time weather data analysis. The system automates ingestion, processing, storage, and analysis of NOAA datasets.

Technologies Used:
AWS (S3, Lambda, Glue, EC2), Python, Apache Airflow, Pandas, Matplotlib

GitHub Repository:
AWS-Enabled-Data-Pipeline-for-Weather-Data-Analysis


3. Paraphrase Detection

Description:
Developed a machine learning model that detects whether two sentences are paraphrases of each other, a text-similarity task common in NLP applications.

Technologies Used:
Python, NLP, Scikit-Learn

GitHub Repository:
Paraphrase-Detection-with-Quora-Question-Pairs


📫 Let's Connect!


Thanks for visiting!

Pinned

  1. Go2Bot-OpenAI-Integration (Public)

    This repository showcases the integration of the Unitree Go2 robot with OpenAI, developed during a summer research project. It features voice command processing and AI-driven task execution. Built…

    Python 8 1

  2. AWS-Enabled-Data-Pipeline-for-Weather-Data-Analysis (Public)

    The "AWS-Enabled Data Pipeline for Weather Data Analysis" project is a sophisticated solution designed to streamline the collection, processing, and analysis of vast datasets related to weather pat…

    Jupyter Notebook

  3. Paraphrase-Detection-with-Quora-Question-Pairs (Public)

    This project uses an LSTM model to determine whether two Quora questions are similar. The model is trained with fastText and GloVe word embeddings.

    Jupyter Notebook

  4. VTA-Ridership-Forecast (Public)

    This project uses machine learning techniques to forecast ridership for the Valley Transportation Authority (VTA).

    Jupyter Notebook

  5. LinkedIn_JobPostingAnalysis_NoSQL (Public)

    LinkedIn_JobPostingAnalysis_Using_NoSQL

    Python

  6. Spotify-Data-Analysis-And-Visualization (Public)

    Forked from SreenidhiHayagreevan/Spotify-Data-Analysis-And-Visualization

    Jupyter Notebook