
SAHITHI DEVI

+1(313)687-4486 || sahithiredz@gmail.com

PROFESSIONAL SUMMARY:

 Over 10 years of experience in Big Data ecosystems using Hadoop, Pig, Hive, HDFS, HBase, MapReduce,
Sqoop, Storm, Spark, Scala, Airflow, NiFi, Snowflake, Flume, Kafka, YARN, Oozie, and ZooKeeper.
 Experience in installing, configuring, managing, supporting, and monitoring Hadoop clusters using
distributions such as Apache Hadoop, Cloudera, and Hortonworks.
 Experience with Apache Spark using Spark Core, SparkContext, Spark SQL, Spark MLlib, DataFrames, and RDDs.
 Experience in developing Spark Streaming jobs built on RDDs (Resilient Distributed Datasets) using
Scala, PySpark, and the Spark shell.
 Extensive experience in Amazon Web Services (AWS), including EC2, ECS, S3, VPC, ELB, IAM, DynamoDB,
CloudFront, CloudWatch, Route 53, Elastic Beanstalk, Auto Scaling, Security Groups, CodeBuild, CodeDeploy,
Redshift, CloudFormation, CloudTrail, OpsWorks, Kinesis, SQS, SNS, and SES.
 Experienced in ingesting data into HDFS from relational databases such as MySQL, Oracle, DB2, Teradata, SQL
Server, and PostgreSQL using Sqoop. Experience with Hadoop file formats such as Parquet, ORC, and Avro.
 Experience in working with NoSQL Databases like HBase, DynamoDB, Cassandra and MongoDB.
 Analyzed data using HiveQL and Pig Latin, and extended Hive and Pig core functionality with custom UDFs.
 Experience in working with CI/CD pipeline using tools like Jenkins and Chef.
 Hands-on experience in setting up workflows using Apache Airflow and the Oozie workflow engine for managing
and scheduling Hadoop jobs. Experience in data warehousing concepts and ETL processes.
 Experience in building Data Models and Dimensional Modeling with 3NF, Star and Snowflake schemas for
OLAP and Operational data store (ODS) applications.
 Experience in designing ETL workflows on Tableau.
 Experience in job/workflow scheduling and monitoring tools like Oozie, AWS Data Pipeline, and Autosys.
 Participated at varying levels of responsibility in building and supporting ETL processes.
 Worked with Cloudera and Hortonworks distributions.
 Experience with design, coding, debugging, reporting, and data analysis of web applications using Python.
 Experienced with the Spark Streaming API to ingest data into the Spark engine from Kafka.
 Hands-on experience in GCP: BigQuery, GCS buckets, Cloud Functions, Google Cloud Composer, Cloud
Dataflow, Pub/Sub, Cloud Shell, gsutil, the bq command-line utility, Dataproc, and Stackdriver.
 Experience in project management services like JIRA for tracking issues and bugs related to code, and
GitHub for code reviews. Worked with version control tools such as CVS, Git, and SVN.
 Experienced in using IDEs and tools like Eclipse, NetBeans, GitHub, Jenkins, Maven, and IntelliJ.
 Experience in shell scripting, SQL Server, UNIX, Linux, and OpenStack, with expertise in Python scripting.
 Strong experience in writing scripts using the Python, PySpark, and Spark APIs for analyzing data (an
illustrative PySpark snippet follows this summary).
 Extensively used Python Libraries PySpark, Pytest, PyExcel, Boto3, embedPy, NumPy and Beautiful Soup.
 Experience in migrating SQL databases to Azure Data Lake, Azure Data Lake Analytics, Azure SQL
Database, Azure Databricks, and Azure SQL Data Warehouse, controlling and granting database access, and
migrating on-premises databases to Azure Data Lake Store using Azure Data Factory.
 Knowledge of the OpenShift platform for managing Docker containers using Docker and Kubernetes clusters.
 Strong experience in working with UNIX/LINUX environments, writing shell scripts.
 Experienced in working within the SDLC using Agile and Waterfall methodologies.
 Experience working with GitHub, Jenkins, and Maven.
 Conducted comprehensive analysis and optimization of SAP tables, creating efficient CDS views that
streamlined data access and reduced redundancy, leading to more accurate and timely business insights.
 Integrated Collibra DGC via Collibra Connect (Mule ESB) with third-party tools such as Ataccama, IBM IGC,
and Tableau to apply DQ rules, import technical lineage, and create reports using the metadata in Collibra
DGC.
 Integrated Ataccama with Collibra using the Mule ESB connector and published DQ rule results to Collibra via
REST API calls.
 Successfully integrated ABAP CDS views with SAP Fiori applications, enabling real-time data visualization
and interactive reporting for end users, which improved decision-making processes and user satisfaction.
 Implemented complex ABAP Core Data Services (CDS) views to enhance data modeling and reporting
capabilities.
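
A minimal PySpark sketch of the analysis scripting referenced in the summary above; the file paths and column names are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Hypothetical example: summarize sales records stored as Parquet
    spark = SparkSession.builder.appName("sales_analysis").getOrCreate()

    sales = spark.read.parquet("/data/warehouse/sales")

    summary = (sales
               .filter(F.col("amount") > 0)
               .groupBy("region")
               .agg(F.sum("amount").alias("total_amount"),
                    F.countDistinct("customer_id").alias("customers"))
               .orderBy(F.desc("total_amount")))

    summary.write.mode("overwrite").parquet("/data/warehouse/sales_summary")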

TECHNICAL SKILLS:

 Big Data Ecosystem: HDFS, MapReduce, YARN, Spark, Hive, Impala, StreamSets, Sqoop, HBase, Pig, Oozie,
ZooKeeper, Azure, Amazon Web Services (AWS), EMR.
 Hadoop Distributions: Apache Hadoop 2.x/1.x, Cloudera CDP, Hortonworks HDP
 Programming Languages: Python, Scala, Java, R, JavaScript, Shell Scripting, Pig Latin, HiveQL.
 NoSQL Database: Cassandra, MongoDB.
 Database: MySQL, Oracle, MS SQL SERVER, PostgreSQL, DB2.
 Cloud Technologies: AWS (EMR, EC2, RDS, S3, Athena), Microsoft Azure, GCP.
 ETL/BI: Informatica, SSIS, SSRS, SSAS, Tableau, Power BI.
 Web Development: Spring, J2EE, JDBC, .Net MVC, Tomcat, JavaScript, Node.js, HTML, CSS.
 Operating systems: Linux (Ubuntu), Windows (XP/7/8/10)
 IDE: IntelliJ, Eclipse, Spyder, Jupyter
 Others: Machine Learning, Spring Boot, Jupyter Notebook, Jira, ServiceNow

PROFESSIONAL WORK EXPERIENCE:

Verizon, TX May 2023 – Present


Sr. Big Data Engineer

Key Responsibilities:

● Designed and executed an extensive data migration plan to shift data from Hadoop to Azure, utilizing Azure
Data Factory for streamlined and automated data transfer operations.
● Utilized Azure Data Lake Storage Gen2 as the designated repository for the migrated data, ensuring scalability,
data security, and robust availability to meet the demands of large-scale data storage requirements.
● Integrated Azure Databricks into the migration process to handle data transformation and processing tasks,
capitalizing on its scalable computational resources and collaborative analytics environment.
● Utilized Azure Data Factory's Copy Activity functionality to coordinate the transfer of data from Hadoop
Distributed File System (HDFS) to Azure Data Lake Storage Gen2, ensuring integrity of data migration
process.
● Leveraged Azure Data Factory's Data Management Gateway to establish seamless connectivity and facilitate
data transfer between on-premises Hadoop clusters and Azure cloud services.
● Implemented Azure Data Factory's Data Flows to execute intricate data transformations and manipulations
during the migration, ensuring compatibility and optimization for Azure data storage solutions.
● Expert in using Databricks with Azure Data Factory (ADF) to process large volumes of data.

● Performed ETL operations in Azure Databricks by connecting to different relational database source systems
using ODBC connectors.
● Developed automated process in Azure cloud to ingest data daily from web service and load into Azure SQL
DB.
● Deployed data replication and synchronization mechanisms across Azure Cosmos DB to ensure continuous
availability, disaster recovery preparedness, and global data dissemination across multiple Azure regions.
● Developed streaming pipelines using Azure Event Hubs and Stream Analytics to analyze data for dealer
efficiency and open-table counts from IoT-enabled poker and other pit tables.
● Analyzed data where it lives by Mounting Azure Data Lake and Blob to Databricks.

● Used Logic Apps to take decision-based actions within workflows.

● Implemented Azure Databricks clusters, Python and PySpark notebooks, jobs, and autoscaling.

● Performed data cleansing and applied transformations using Databricks and Spark data analysis.

● Used Azure Synapse to manage processing workloads and served data for BI and prediction needs.

● Developed Spark Scala scripts for mining data and performed transformations on large datasets to provide real-
time insights and reports.
● Designed and automated Custom-built input adapters using Spark, Sqoop, and Oozie to ingest and analyze data
from RDBMS to Azure Data Lake.
● Developed automated workflows for daily incremental loads, moved data from RDBMS to Data Lake.

● Configured Spark Streaming to receive real-time data from Apache Kafka and store the stream to HDFS using
Scala (a PySpark analogue is sketched after this list).

● Involved in building an Enterprise Data Lake using Data Factory and Blob storage, enabling other teams to
work with more complex scenarios and ML solutions.
● Used Azure Data Factory, SQL API, and Mongo API and integrated data from MongoDB, MS SQL, and cloud
(Blob, Azure SQL DB).
● Designed the distribution strategy for SAP tables.

● Extensive knowledge of data transformations, mapping, cleansing, monitoring, debugging, performance
tuning, and troubleshooting of Hadoop clusters.
● Managed resources and scheduling across the cluster using Azure Kubernetes Service.

● Facilitated data for interactive Power BI dashboards and reporting purposes.
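
A minimal PySpark sketch of the Kafka-to-HDFS streaming job described above; the production job was written in Scala, and the broker addresses, topic name, and paths here are hypothetical (the spark-sql-kafka connector is assumed to be available):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("kafka_to_hdfs").getOrCreate()

    # Subscribe to a hypothetical Kafka topic
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
              .option("subscribe", "events")
              .option("startingOffsets", "latest")
              .load())

    # Kafka delivers the payload as bytes; cast it to string before persisting
    decoded = events.select(F.col("value").cast("string").alias("payload"),
                            F.col("timestamp"))

    # Append the stream to HDFS as Parquet, with checkpointing for fault tolerance
    query = (decoded.writeStream
             .format("parquet")
             .option("path", "hdfs:///data/streams/events")
             .option("checkpointLocation", "hdfs:///checkpoints/events")
             .outputMode("append")
             .start())

    query.awaitTermination()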


Environment: Azure (HDInsight, Databricks, Data Lake, Blob Storage, Data Factory, SQL DB, SQL DWH, AD,
AKS), Scala, Python, PySpark, Hadoop 2.x, Spark v2.0.2, NLP, Airflow v1.8.2, Hive v2.0.1, Sqoop v1.4.6, HBase,
Oozie, Talend, Cosmos DB, MS SQL, MongoDB, Apache Kafka, AWS, Ambari, Power BI, Azure DevOps.

Bank of America, NC Apr 2021 – May 2023


Senior Big Data Engineer

Key Responsibilities:

● Conducted extensive data preprocessing on AWS, encompassing tasks such as feature scaling, normalization,
and handling missing values to prepare datasets for model training and assessment.
● Orchestrated end-to-end machine learning pipelines on AWS, involving data ingestion from sources like
Amazon S3, data processing using AWS Glue, model training with Amazon SageMaker, and model
deployment via AWS Lambda and Amazon ECS.
● Integrated PySpark with machine learning libraries such as scikit-learn and TensorFlow to execute sophisticated
data transformations and feature engineering tasks for predictive modeling purposes.
● Utilized PySpark's DataFrame API to execute intricate data transformations, including joins, aggregations, and
window functions, thereby ensuring the integrity and accuracy of the data.
● Implemented custom User Defined Functions (UDFs) in PySpark to address specific business logic
requirements, thereby enhancing the adaptability and scalability of ETL processes (a brief example follows this
list).
● Developed specialized data ingestion connectors for Snowflake, leveraging Snowpark and the Snowflake
Connector for Python to expand data integration capabilities and handle data from various sources and APIs.
● Automated data loading into Snowflake from AWS S3 by configuring Snowpipe auto-ingestion, eliminating
manual intervention and streamlining the process.
● Integrated Snowflake with data orchestration tools such as Apache Airflow and dbt to automate data workflows,
reducing manual intervention and improving operational efficiency.
● Executed Hadoop jobs on EMR clusters performing Spark, Hive, and MapReduce Jobs for tasks including
building recommendation engines, transactional fraud analytics, and behavioral insights.
● Migrated Hive and MapReduce jobs to EMR to automate workflows using Airflow, streamlining processes, and
improving efficiency.
● Utilized PySpark and Scala on AWS Databricks for data transformation and enhancement tasks, customizing
transformations and ensuring efficient resource utilization.
● Leveraged the Kafka Controller API for efficient resource utilization of Kafka brokers under changing workloads.

● Combined Kafka with Apache NiFi for data ingestion into Hadoop clusters, utilizing NiFi's capabilities for data
routing and transformation to efficiently handle and transmit data streams to Kafka topics.
● Managed file movements between HDFS, AWS S3, utilizing S3 buckets in AWS for data storage and retrieval.

● Automated data loading into the Hadoop Distributed File System using Oozie, enabling speedy reviews and
first-mover advantages, while leveraging Pig for data preprocessing.
● Employed Python libraries like Pandas and NumPy within PySpark workflows for data manipulation and
statistical analysis, resulting in improved data quality and generation of insights.
● Integrated AWS Glue with other AWS services such as Amazon Athena, Amazon Redshift, and Amazon EMR
to develop end-to-end data processing and analytics solutions, enabling timely insights and decision-making.
● Designed and optimized PySpark jobs for data ingestion, cleansing, transformation, and loading (ETL)
operations, ensuring high performance and scalability across large-scale distributed data processing
environments.
● Implemented Spark using Python and Scala along with Spark SQL for faster testing and processing of data,
improving overall efficiency and scalability of data processing tasks.
● Created and implemented feature engineering pipelines on AWS SageMaker and AWS Databricks,
preprocessing raw data, extracting pertinent features, and transforming data to facilitate machine learning model
training.
● Played a role in establishing and documenting feature-ops guidelines and best practices for AWS SageMaker
and AWS Databricks, ensuring uniformity and effectiveness in feature engineering across various teams and
projects.
● Used Git for version control and Jira for project management, tracking issues and bugs.
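
A minimal sketch of a custom PySpark UDF of the kind described above, assuming a hypothetical risk-tier business rule and S3 paths:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.appName("udf_example").getOrCreate()

    # Hypothetical business rule: bucket transaction amounts into risk tiers
    def risk_tier(amount):
        if amount is None:
            return "unknown"
        return "high" if amount >= 10000 else "standard"

    risk_tier_udf = F.udf(risk_tier, StringType())

    txns = spark.read.parquet("s3://example-bucket/transactions/")
    scored = txns.withColumn("risk_tier", risk_tier_udf(F.col("amount")))
    scored.write.mode("overwrite").parquet("s3://example-bucket/transactions_scored/")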

Environment: AWS, EC2, S3, Athena, Lambda, Glue, Elasticsearch, RDS, DynamoDB, Redshift, ECS, Hadoop 2.x,
Hive v2.3.1, Spark v2.1.3, Databricks, Python, PySpark, Java, Scala, SQL, Sqoop v1.4.6, Kafka, Airflow v1.9.0,
HBase, Oracle, Cassandra, MLlib, QuickSight, Tableau, Maven, Git, Jira, Azure DevOps.

United Airlines, IL Jun 2019 – Mar 2021


Snowflake Developer

Key Responsibilities:

● Analyzed data quality issues with SnowSQL, constructing an analytical warehouse on Snowflake for analysis.

● Created and implemented data processing functions and procedures using Snowpark within Snowflake,
enabling sophisticated data transformation and manipulation operations using Scala and Java.
● Used Snowpark to execute Spark code directly within Snowflake, facilitating the seamless integration of Spark
functionalities with Snowflake data warehouses to enhance data processing and analytical capabilities.
● Proficiently utilized Snowflake's SnowSQL and Azure Blob Storage SDKs for seamless data integration and
orchestration tasks, facilitating smooth data movement and transformation.
● Actively engaged in troubleshooting and resolving integration challenges between Azure Blob Storage and
Snowflake, ensuring uninterrupted data exchange and processing.
● Created Snowflake procedures to facilitate branching and looping during execution.

● Employed Azure Key Vault to securely manage Snowflake credentials and access keys, ensuring robust data
security and adherence to regulatory standards in Azure-Snowflake integrations.
● Orchestrated the transfer of data between Snowflake and Azure Cosmos DB using Azure Data Factory, enabling
bidirectional data synchronization, and facilitating real-time data analysis across both platforms.
● Integrated Snowflake with Azure Databricks to conduct data processing & analytics, leveraging Azure
Databricks' scalable computing capabilities and collaborative analytics environment for various data
engineering tasks.
● Utilized Azure Event Hubs to stream data in real-time into Snowflake, enabling continuous ingestion of
streaming data for immediate analytics and informed decision-making within Snowflake data warehouses.
● Linked Snowflake with Azure Kubernetes Service (AKS) to deploy and manage Snowflake workloads in
containers, ensuring scalable and efficient execution of data processing tasks in Azure environments.
● Implemented Azure Active Directory (Azure AD) integration with Snowflake to centralize identity and access
management, facilitating seamless authentication and authorization for users accessing Snowflake data
warehouses from Azure environments.
● Utilized Snowpark to develop custom user-defined functions (UDFs) in Snowflake, expanding the capabilities
of Snowflake data warehouses to address specific business needs and analytical requirements.
● Engineered and deployed scalable data processing solutions using Snowpark within Snowflake, enhancing
performance and efficiency to handle large datasets with optimal resource utilization.
● Developed and fine-tuned SnowSQL queries for intricate data retrieval and manipulation tasks in Snowflake,
ensuring the effective execution and leveraging of Snowflake's data processing capabilities.
● Designed and implemented robust data pipelines using Azure Data Factory to efficiently transfer data between
Snowflake and Azure Data Lake Storage, ensuring optimized performance and reliability.
● Proficiently utilized Snowflake's SnowSQL and Snowpark functionalities for data integration and orchestration
tasks, enabling seamless data movement and transformation within Snowflake data warehouses.
● Actively engaged in troubleshooting and resolving challenges related to Snowflake SQL queries and Snowpark
Spark code execution, ensuring smooth operation of data processing workflows.
● Fine-tuned Snowpipe configurations, adjusting parameters such as batch size, concurrency, and notification
polling intervals for optimal throughput and latency in data ingestion.
● Orchestrated data transformations using Snowflake's SnowSQL and Python libraries for data quality and consistency.

● Implemented Snowflake's Snowpark to execute Spark code directly within Snowflake, enabling advanced data
processing and transformation tasks using languages such as Scala and Java.
● Created data quality scripts using SQL and Hive to validate successful data loads and overall data quality, and
created various types of data visualizations using Python and Tableau (a short validation sketch follows this list).
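
A minimal sketch of such a data quality check, assuming the Snowflake Connector for Python and hypothetical connection parameters, table names, and checks:

    import snowflake.connector

    # Hypothetical connection parameters; in practice sourced from a secrets store
    conn = snowflake.connector.connect(
        account="example_account",
        user="etl_service",
        password="***",
        warehouse="ANALYTICS_WH",
        database="ANALYTICS",
        schema="STAGING",
    )

    # Simple validation queries run after a load completes
    checks = {
        "row_count": "SELECT COUNT(*) FROM ORDERS_STG",
        "null_order_ids": "SELECT COUNT(*) FROM ORDERS_STG WHERE ORDER_ID IS NULL",
    }

    try:
        cur = conn.cursor()
        for name, sql in checks.items():
            cur.execute(sql)
            print(name, cur.fetchone()[0])
    finally:
        conn.close()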

Environment: Hadoop, Azure, MapReduce, Spark, Spark MLlib, Java, Tableau, Azure DevOps, SQL, Excel,
VBA, SAS, MATLAB, SPSS, Cassandra, Oracle, MongoDB, DB2, T-SQL, PL/SQL, XML.

LPL Financial, San Diego, CA Dec 2017 – May 2019


Sr Data Engineer/ Hadoop Developer

Key Responsibilities:

● Engaged in collaborative efforts with various Hadoop ecosystem components such as HBase, Pig, Sqoop, and
Oozie, contributing to diverse data processing and workflow orchestration tasks.
● Implemented robust data ingestion pipelines leveraging Apache Flume and Apache Kafka, facilitating seamless
streaming data ingestion into Hadoop for real-time processing needs.
● Played a pivotal role in the design and fine-tuning of HiveQL queries and data warehouse schemas to support
ad-hoc querying and data analysis functionalities within the Hadoop environment.
● Improved the efficiency of PySpark jobs by optimizing Spark configurations and employing partitioning
strategies, leading to a reduction in both resource consumption and execution time (an example configuration
sketch follows this list).
● Utilized hands-on experience to write and optimize MapReduce jobs using Java, enabling the processing and
analysis of both structured and unstructured data stored within the Hadoop Distributed File System (HDFS).
● Led the design of fault-tolerant Hadoop cluster architectures, ensuring high availability and resilience in data
storage and processing operations.
● Collaborated closely with system administrators to monitor and maintain Hadoop clusters, addressing issues
related to HDFS storage capacity, performance, and data integrity.
● Actively contributed to capacity planning and scalability initiatives for Hadoop clusters, forecasting storage and
compute requirements to accommodate expanding data volumes and processing workloads.
● Developed and fine-tuned Scala-based MapReduce algorithms to execute intricate data transformations and
aggregations, optimizing computational efficiency and reducing processing durations.
● Leveraged AWS Lambda for executing serverless data processing and event-triggered migration tasks, enabling
seamless integration with other AWS services and reducing operational complexities during migration.
● Utilized Amazon EMR (Elastic MapReduce) for the processing and analysis of extensive datasets during
migration, leveraging its scalable and cost-efficient capabilities for data transformation needs.
● Used Amazon Athena for on-demand querying and analysis of data stored in AWS S3, allowing interactive and
economical exploration of migrated data for informed decision-making.
● Utilized AWS CloudFormation to automate the deployment and administration of AWS infrastructure resources
necessary for data migration, ensuring standardized and reproducible infrastructure setup.
● Implemented AWS Batch for the batch processing of data during migration, facilitating efficient and scalable
execution of data processing tasks within AWS environments.
● Performed performance evaluations of PySpark jobs using the Spark UI and profiling tools, pinpointing
bottlenecks and implementing enhancements to optimize job efficiency and resource utilization.
● Integrated PySpark with machine learning libraries such as scikit-learn and TensorFlow to execute sophisticated
data transformations and feature engineering tasks for predictive modeling purposes.
● Developed reusable objects such as PL/SQL program units and libraries, database procedures, functions, and
database triggers for use by the team, satisfying the business rules.
● Worked on bug tracking reports daily using Quality Center.

● Designed, developed, and tested a data mart prototype (SQL Server 2005), ETL process (SSIS), and OLAP cube (SSAS).
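
A minimal sketch of the configuration and partitioning tuning described above; the property values, paths, and partition counts are hypothetical examples, with real settings driven by profiling:

    from pyspark.sql import SparkSession

    # Example tuning knobs; real values were chosen from Spark UI profiling
    spark = (SparkSession.builder
             .appName("tuned_etl")
             .config("spark.sql.shuffle.partitions", "400")
             .config("spark.executor.memory", "8g")
             .config("spark.executor.cores", "4")
             .config("spark.sql.adaptive.enabled", "true")
             .getOrCreate())

    events = spark.read.parquet("hdfs:///data/raw/events")

    # Repartition on the aggregation key to reduce shuffle skew, then write
    # partitioned output so downstream jobs can prune by date
    (events.repartition(200, "event_date")
           .write.mode("overwrite")
           .partitionBy("event_date")
           .parquet("hdfs:///data/curated/events"))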

Environment: Hadoop, Kafka, Spark, Sqoop, Docker, Swamp, BigQuery, Spark SQL, TDD, Spark Streaming,
Hive, Scala, Pig, NoSQL, Impala, Oozie, Azure DevOps, HBase, Data Lake, ZooKeeper.

Loblaw, Canada Aug 2016 – Oct 2017


Data Engineer

Key Responsibilities:

● Constructed scalable distributed data solutions in Hadoop Cluster environments using Hortonworks distribution.

● Enhanced data processing time and network data transfer efficiency by converting raw data into efficient
serialized formats such as Avro and Parquet.
● Applied normalization and de-normalization to optimize performance in relational and dimensional database
settings.
● Developed, tested, and refined Extract Transform Load (ETL) applications handling various data sources.

● Optimized SQL queries in Hive and crafted files using Hue for improved efficiency.

● Utilized Spark to refine performance and enhance the efficiency of existing algorithms in Hadoop using
SparkContext, Spark SQL, DataFrames, and pair RDDs.
● Created custom PySpark functions to manage data cleansing activities like filling missing values, identifying
outliers, and removing duplicate data, resulting in enhanced data quality and uniformity (a minimal example
follows this list).
● Employed PySpark for data analysis, leveraging Spark libraries with Python scripting.

● Optimized data processing workflows to ensure seamless handling and processing of large volumes of
unstructured data stored in Amazon S3.
● Developed and deployed scalable PySpark ETL workflows utilizing Apache Airflow, streamlining data
pipelines, and automating processes for data ingestion, transformation, and loading.
● Incorporated data validation checks within PySpark transformations to ensure compliance with business rules
and regulatory standards, minimizing data inconsistencies and bolstering data integrity.
● Utilized AWS ecosystem, employing AWS S3 as central repository for storing and processing unstructured
data.
● Transformed HiveQL into Spark transformations using Spark RDD via Scala programming.

● Transformed unstructured data into structured formats suitable for analysis and storage in databases, employing
AWS Glue for efficient ETL processes.
● Created and utilized User Defined Functions (UDFs) and User Defined Aggregate Functions (UDAFs) in Pig and
Hive.
● Designed and implemented tailored ETL workflows in Spark/Hive to execute data cleaning and mapping tasks.
● Developed Kafka custom encoders to facilitate custom input formats for loading data into Kafka partitions.

● Aided with Kafka cluster topic management through Kafka Manager and automated resource management
using CloudFormation scripting.
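
A minimal sketch of a custom PySpark cleansing function of the kind described above, assuming hypothetical column names, fill values, and an outlier threshold:

    from pyspark.sql import SparkSession, DataFrame
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("cleansing").getOrCreate()

    def clean_orders(df: DataFrame) -> DataFrame:
        """Fill missing values, drop duplicate orders, and flag outliers."""
        return (df
                .fillna({"quantity": 0, "store_id": "unknown"})
                .dropDuplicates(["order_id"])
                .withColumn("is_outlier", F.col("amount") > 10000))

    raw = spark.read.parquet("s3://example-bucket/raw/orders/")
    clean_orders(raw).write.mode("overwrite").parquet("s3://example-bucket/clean/orders/")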

Environment: .NET MVC, MS Excel, Data Quality, MS Access, SQL, Data Maintenance, PL/SQL,
SQL Plus, Metadata, Tableau, Data Analysis, SSIS, SSRS, SSAS.

BluJay Solutions, India May 2014 – Jul 2016


Python Developer

Key Responsibilities:

● Utilized Python libraries like Pandas, NumPy, SciPy to enhance efficiency in managing and analyzing data
tasks.
● Proficiently crafted and executed ETL (Extract, Transform, Load) pipelines in Python to automate data
workflows and uphold data integrity.
● Expertise in seamlessly integrating Python scripts with diverse databases such as PostgreSQL, MySQL, and
MongoDB, facilitating data retrieval, storage, and manipulation.
● Demonstrated a robust grasp of data warehousing principles, coupled with hands-on experience in crafting data
models to optimize the storage and retrieval of structured and unstructured data.
● Developed RESTful APIs leveraging Python frameworks like Flask and Django, enabling smooth
communication between data sources and applications.
● Utilized data visualization tools like Matplotlib, Seaborn to generate visualizations from datasets.

● Extensively worked in automating shell scripting duties using Bash scripting to streamline various tasks
including system administration, data processing, and task automation.
● Experienced in writing automated shell scripts for tasks such as file management, data parsing, and system
monitoring to improve operational efficiency.
● Used GIT as version control for development and managing code repositories for both Python and shell
scripting.
● Generated graphical reports using the Python packages NumPy and Matplotlib.

● Worked extensively in Python to manage unstructured data files such as CSV, JSON, XML, and log files,
emphasizing effective parsing, extraction, and data transformation (a brief parsing sketch follows this list).
● Employed personalized Python scripts and parsers to extract pertinent data from unstructured sources, readying
them for inclusion in databases.
● Utilized Python libraries such as Beautiful Soup and xml for the extraction and organization of structured data
from HTML pages and various web sources.
● Crafted and executed data transformation procedures in Python to convert unstructured data into a format
compatible with relational databases, ensuring seamless insertion.
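
A minimal Python sketch of the log parsing and transformation work described above, assuming a hypothetical log format and file names:

    import csv
    import json
    import re

    # Hypothetical log line format: "2016-03-01 12:00:01 ERROR payment timeout"
    LOG_PATTERN = re.compile(r"^(?P<date>\S+) (?P<time>\S+) (?P<level>\w+) (?P<message>.+)$")

    def parse_log(path):
        """Parse raw log lines into dictionaries, skipping lines that do not match."""
        records = []
        with open(path) as fh:
            for line in fh:
                match = LOG_PATTERN.match(line.strip())
                if match:
                    records.append(match.groupdict())
        return records

    def to_csv(records, path):
        """Flatten parsed records into a CSV suitable for loading into a database."""
        with open(path, "w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=["date", "time", "level", "message"])
            writer.writeheader()
            writer.writerows(records)

    if __name__ == "__main__":
        rows = parse_log("app.log")
        to_csv(rows, "app_log.csv")
        print(json.dumps(rows[:3], indent=2))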
Environment: Python, Oracle DB, Apache server, Pandas, Django, MySQL, Linux, JavaScript, Teradata, SQL Server.

EDUCATION:

● BE in Computer Science
