Sasank Chithirala
703-801-9727 | schihtius@gmail.com | LinkedIn: Sasank Chithirala | GitHub: Sasank Chithirala
Summary
Data Analyst with expertise in SQL, Python, and statistical analysis. Skilled in data visualization (Tableau, Power BI,
Seaborn, Excel) and forecasting. Experienced with databases (SQL Server, MySQL, PostgreSQL, MongoDB) and big data
tools (Spark, Hadoop). Proficient in machine learning (TensorFlow, scikit-learn) and cloud platforms (AWS, Google Cloud,
Databricks).
Skills
Languages: Python, Java, C++, R, SQL
Databases: SQL Server, MySQL, Oracle, PostgreSQL, MongoDB
Machine Learning: Spark MLlib, TensorFlow, NumPy, scikit-learn
Business Intelligence: Tableau, Power BI, Seaborn, Excel
Big Data: Apache Spark (PySpark), Hadoop, NoSQL (MongoDB)
Cloud Technologies: AWS (EC2, S3, RDS, Glue), Google Cloud (Cloud SQL, Looker), Databricks
Education
George Mason University 07/2023 – Present
M.S. in Data Analytics Engineering
Courses taken: Big Data, Database Management Systems, Statistics,
Decision Analysis, Marketing Research, Data Mining
Vellore Institute of Technology 07/2019 – 04/2023
B.Tech in Electronics and Computer Engineering
Courses taken: Machine Learning Algorithms, Data Analytics and Visualization,
Cyber-Physical Systems, Probability Theory
Projects
Analysis of the Pandemic’s Impact on Healthcare Systems and Happiness Index (GitHub)
– Built a real-time analytics pipeline from the COVID Python library, World Happiness Report, and vaccination
datasets, and explored non-linear relationships with Matplotlib and Seaborn.
– Applied interquartile range classification to segment the data, yielding deeper insight into country-level
conditions.
– Trained Random Forest, Decision Tree, Naive Bayes, and Extra Tree classifiers, reaching 90% accuracy, then
raised it to 92.5% with ensemble learning (voting classifiers).
Customer Churn Prediction in Telecom Industry (GitHub)
– Developed a NoSQL database using MongoDB on Docker (for sandboxing) for scalable and flexible data storage,
and leveraged Databricks for efficient big data processing.
– Preprocessed the data with PySpark and Pandas, ensuring data quality and consistent categorization.
– Conducted an in-depth analysis of churn patterns using demographic and usage data, uncovering key churn factors.
– Achieved a model accuracy of 80% and a precision of 67.2% with a gradient boosting classifier for customer
churn prediction.
Target Retail Market Analysis and Strategic Recommendations (GitHub)
– Conducted a research study on Target’s sales decline using 101 survey responses, with Qualtrics for the survey
and Tableau for data visualization.
– Analyzed shopping trends, revealing that 76.1% of respondents prefer online shopping, highlighting a shift in
consumer behavior towards digital retail platforms.
– Proposed strategic recommendations, including aggressive pricing, digital platform upgrades, and
omnichannel expansion, aimed at strengthening Target’s market position and increasing sales.
Optimized Climate Data Integration with Real-Time LLM Querying (GitHub)
– Preprocessed large-scale climate datasets from NOAA and ECMWF using advanced data wrangling techniques,
ensuring data integrity and robust data management in SQLite.
– Engineered custom MCP servers to integrate SQLite with a Large Language Model (LLM), enabling real-time
query responses and AI-driven data insights.
– Optimized SQLite queries for low-latency retrieval of relevant data, improving overall query performance.
Experience
George Mason University 08/2024 – Present
Student Teaching and Research Assistant - Computational Data Science
– Analyzed student grade data using K-Means and Hierarchical clustering techniques to identify performance
trends, enabling targeted improvement strategies and boosting assignment completion rates.
– Created interactive Power BI dashboards to visualize real-time metrics, providing data-driven insights and
increasing overall comprehension.
– Leveraged the AWS CLI to configure AWS S3 for scalable data storage and integrated virtualization practices,
resulting in efficient cloud computing workflows and enhanced accuracy through systematic troubleshooting.
ACS Solutions 05/2022 – 07/2022
IoT Intern
– Engineered an interactive dashboard in ThingSpeak Cloud for an IoT-based pollution detection system, enabling
real-time pollutant monitoring and data visualization.
– Incorporated real-time sensor data over Wi-Fi, creating a dynamic dataset with seven parameters for continuous
monitoring and analysis.
– Examined and visualized pollutant levels using MATLAB, revealing a notable 15% increase in particulate matter.
Certifications
– Big Data and Hadoop – Udemy
– Learning Python for Data Analytics and Visualization – Udemy
– Advanced C++, Java, MySQL – Spoken Tutorial Project by IIT Bombay
– SQL Essential Training – LinkedIn
– Docker Foundations Professional Certificate – LinkedIn