SCHOOL OF COMPUTING SCIENCE AND ENGINEERING
GREATER NOIDA, UTTAR PRADESH
2024 – 2025
INDUSTRY INTERNSHIP
SUMMARY REPORT
AWS Data Engineering Virtual Internship Report
BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE AND ENGINEERING
Submitted by
Shamee K Sharma (22SCSE1012596)
V Semester, III Year
CERTIFICATE
I hereby certify that the work presented in the internship project report
entitled "AWS Data Engineering Virtual Internship Report", in partial
fulfillment of the requirements for the award of the degree of Bachelor of
Technology in the School of Computing Science and Engineering of Galgotias University,
Greater Noida, is an authentic record of my own work carried out in the industry.
To the best of my knowledge, the matter embodied in the project report has not been
submitted to any other University/Institute for the award of any Degree.
Shamee K Sharma (22SCSE1012596)
This is to certify that the above statement made by the candidate is correct and
true to the best of my knowledge.
Signature of Internship Reviewer Signature of Dean (SCSE)
TABLE OF CONTENTS
CHAPTER TITLE
        Abstract
        List of Figures
        List of Abbreviations
1       Introduction
        1.1 Objective of the Internship Project
        1.2 Problem Statement and Research Objectives
        1.3 Description of Internship Domain and Organization
2       Internship Activities
        2.1 Tasks and Responsibilities
        2.2 Daily/Weekly Progress
        2.3 Skills or Tools Used
3       Learning Outcomes
        3.1 Skills Acquired
        3.2 Knowledge Gained
        3.3 Challenges Faced and How They Were Addressed
4       Project/Work Deliverables
        4.1 Details of the Main Project(s) or Tasks Completed
        4.2 Outcomes or Results of the Work Done
        4.3 Links or Attachments to Work Products
5       Conclusion
        5.1 Reflections on the Overall Internship Experience
        5.2 Internship Certificate
ABSTRACT
This report details the experiences and outcomes of a two-month virtual internship focused
on data engineering using Amazon Web Services (AWS). The internship encompassed the
design and implementation of data pipelines, data modelling, and the utilization of various
AWS services to manage and process large datasets. Key deliverables included the
development of scalable data solutions and the application of best practices in data
engineering.
The primary goal of the internship was to design, implement, and optimize data pipelines
capable of handling large and complex datasets. This included tasks such as data ingestion,
transformation, and storage, which are essential for enabling data-driven decision-making
in modern organizations. Leveraging AWS services such as S3 for storage, Redshift for data
warehousing, Glue for ETL processes, and Lambda for automation, the internship
emphasized building scalable and efficient data solutions.
A key aspect of the program was understanding and applying data modelling techniques to
ensure data integrity and efficiency. Participants were introduced to industry-standard
practices, including schema design, data partitioning, and query optimization. These
practices were implemented to address real-world challenges such as performance
bottlenecks and data security concerns.
The internship also highlighted the importance of adopting best practices in data
engineering, such as using IAM roles for secure access, employing serverless computing for
cost-effectiveness, and optimizing Spark jobs for large-scale data processing. The
deliverables included functional data pipelines and documentation that showcased a deep
understanding of the AWS ecosystem and its applications in solving business challenges.
By the end of the internship, participants had gained not only technical proficiency in AWS
tools but also valuable insights into the broader domain of data engineering. This experience
equipped them with the skills to build reliable, scalable, and efficient data systems, making
significant contributions to the field of cloud-based data management. The report
summarizes this transformative journey, emphasizing the practical applications of AWS
technologies and the critical lessons learned during the program.
LIST OF FIGURES
S. No.  Fig. No.  Title
1       1         Tools and Technologies Used
2       2         Daily/Weekly Progress Summary
3       3         Skills Acquired During the Internship
4       4         Project Deliverables Overview
LIST OF ABBREVIATIONS
Abbreviation  Definition
AWS           Amazon Web Services
EMR           Elastic MapReduce
RDS           Relational Database Service
S3            Simple Storage Service
SQL           Structured Query Language
NoSQL         Not Only SQL
ETL           Extract, Transform, Load
BI            Business Intelligence
CHAPTER 1
INTRODUCTION
1.1 Objective of the Internship Project
The primary objective of this internship was to gain practical experience in data
engineering by designing and implementing data pipelines using AWS services. This
involved understanding data warehousing concepts, data modelling, and the
deployment of scalable data solutions in a cloud environment.
1.2 Problem Statement and Research Objectives
With the increasing volume of data generated by businesses, there is a pressing need
for efficient data processing and analysis tools. The internship aimed to address this
challenge by developing data pipelines capable of handling large datasets, ensuring
data integrity, and enabling data-driven decision-making.
1.3 Description of Internship Domain and Organization
The internship was conducted under the AWS Data Engineering Virtual Internship
program, facilitated by EduSkills Foundation in collaboration with AICTE. The
program focused on cloud-based data engineering, providing exposure to AWS tools
and services essential for building data infrastructure.
CHAPTER 2
INTERNSHIP ACTIVITIES
2.1 Tasks and Responsibilities
- Designed and implemented analytical data platform solutions to facilitate data-driven decisions and insights.
- Developed data schemas and managed internal data warehouses and SQL/NoSQL database systems.
- Collaborated with cross-functional teams to extract, transform, and load data from diverse sources using AWS big data technologies.
- Engaged in data model design, architecture discussions, and optimizations to enhance data processing efficiency.
- Explored and utilized AWS services such as S3, Redshift, Lambda, and Glue to build and maintain data pipelines.
- Participated in mentoring sessions conducted by industry experts to gain insights into real-world data engineering challenges.
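The extract-transform-load pattern behind these tasks can be sketched in plain Python. The records, field names, and aggregation below are hypothetical stand-ins for the S3 sources and Glue jobs used in the actual pipelines; this is a minimal illustration of the pattern, not the internship code.

```python
import csv
import io

# Hypothetical raw sales records, standing in for data extracted from S3.
RAW_CSV = """order_id,amount,country
1001,250.00,IN
1002,99.50,US
1003,,IN
"""

def extract(raw):
    """Extract: parse raw CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows):
    """Transform: drop rows with missing amounts and cast types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append({"order_id": int(row["order_id"]),
                        "amount": float(row["amount"]),
                        "country": row["country"]})
    return cleaned

def load(rows):
    """Load: aggregate per country, as a warehouse table might store it."""
    totals = {}
    for row in rows:
        totals[row["country"]] = totals.get(row["country"], 0.0) + row["amount"]
    return totals

print(load(transform(extract(RAW_CSV))))  # {'IN': 250.0, 'US': 99.5}
```

In a production pipeline each stage would be a separate, independently retryable step; keeping the stages as pure functions, as here, is what makes that separation possible.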
2.2 Daily/Weekly Progress
One module was completed each week so that the expected output was delivered on
time. Weekly progress was logged and reviewed so that gaps could be corrected
before the next module began.
2.3 Skills or Tools Used
- Programming languages: Python, SQL
- AWS services: S3, Redshift, EMR, RDS, Lambda, Glue
- Data processing frameworks: Apache Spark, Hive
- Data modelling tools: ERD tools
- Version control: Git
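As a small illustration of the schema design and SQL work listed above, the following sketch uses Python's built-in sqlite3 module in place of Redshift. The orders table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database standing in for a warehouse; the schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    order_id   INTEGER PRIMARY KEY,
    customer   TEXT NOT NULL,
    amount     REAL NOT NULL,
    order_date TEXT NOT NULL
)""")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, "acme", 120.0, "2024-07-01"),
     (2, "acme", 80.0, "2024-07-02"),
     (3, "globex", 45.5, "2024-07-02")],
)

# A typical analytical query: total spend per customer.
rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders"
    " GROUP BY customer ORDER BY customer"
).fetchall()
print(rows)  # [('acme', 200.0), ('globex', 45.5)]
```

The same GROUP BY query would run unchanged on a Redshift cluster; only the connection and the data volume differ.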
CHAPTER 3
LEARNING OUTCOMES
3.1 Skills Acquired
- Proficiency in designing and implementing data pipelines using AWS services.
- Enhanced understanding of data warehousing concepts and data modelling techniques.
- Improved programming skills in Python and SQL for data processing tasks.
- Experience with big data technologies and frameworks such as Apache Spark and Hive.
- Development of soft skills including teamwork, communication, and problem-solving.
3.2 Knowledge Gained
- In-depth understanding of AWS cloud services and their applications in data engineering, including data warehousing and data modelling.
- Stronger working knowledge of SQL and Python for data engineering tasks.
- Insight into data lifecycle management, including ingestion, transformation, and storage.
- Practical experience in optimizing cloud-based data solutions for scalability.
CHAPTER 4
PROJECT/WORK DELIVERABLES
4.1 Details of the Main Project(s) or Tasks Completed
- Developed an API extraction system to pull data from a website at regular intervals.
- Built a robust system to authenticate, send requests, and parse the API response into structured formats (e.g., JSON, CSV).
- Automated the data extraction process and scheduled periodic API calls to update the data.
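The parsing step of such a system can be sketched with the standard library alone. The JSON payload, the "items" key, and the field names below are hypothetical; the live system would obtain the payload through an authenticated, scheduled HTTP request, which is omitted here.

```python
import csv
import io
import json

# Sample payload standing in for a live API response.
SAMPLE_RESPONSE = json.dumps({
    "items": [
        {"id": 1, "name": "alpha", "value": 10},
        {"id": 2, "name": "beta", "value": 20},
    ]
})

def response_to_csv(payload):
    """Parse a JSON API response and render its records as CSV text."""
    records = json.loads(payload)["items"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "name", "value"])
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

print(response_to_csv(SAMPLE_RESPONSE))
```

Separating fetching from parsing in this way lets the parser be tested against saved payloads without touching the network, which also makes the scheduled runs easier to monitor.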
4.2 Outcomes or Results of the Work Done
- Improved data retrieval efficiency, reducing manual effort and increasing the frequency of data updates.
- Delivered real-time insights from the extracted data to support decision-making processes.
- Scalable and reliable solution: the API extraction process was designed for scalability, ensuring that it can accommodate growth in the data volume and in the complexity of the website's API over time.
4.3 Links or Attachments to Work Products
- Documentation outlining the architecture, setup process, and data extraction methodology.
- Presentation: a concise presentation summarizing the project's objectives, implementation strategy, results, and future scalability potential, shared with stakeholders to demonstrate the value of the automated API extraction solution.
- Repository with API extraction scripts and configuration files (https://github.com/shamee12312/porject_aicte/tree/main)
CHAPTER 5
CONCLUSION
5.1 Reflections on the overall internship experience.
The AWS Data Engineering Virtual Internship provided a comprehensive
learning experience in cloud-based data engineering. It not only enhanced
technical proficiency in AWS tools but also fostered problem-solving and
analytical skills. The opportunity to work on real-world challenges has been
instrumental in preparing for a career in data engineering.
Technical Growth
The internship allowed hands-on exposure to various AWS services like S3,
Redshift, Glue, Lambda, and EMR, which are foundational for modern data
engineering workflows. The ability to work with tools like Apache Spark and
Python further enhanced my capacity to manage, process, and analyze large
datasets efficiently. Designing and optimizing ETL processes, a core part of
the program, helped me understand the intricacies of data ingestion,
transformation, and storage.
Industry Insights
Through this internship, I gained valuable insights into the data engineering
domain and the best practices followed in the industry. I learned about the
significance of data-driven decision-making and the role of robust data
pipelines in achieving business objectives. Understanding how large
organizations use cloud platforms to scale and secure their data infrastructure
was an eye-opener.
Overall Reflection
The AWS Data Engineering Virtual Internship was more than just a learning
opportunity—it was an experience that bridged the gap between academic
concepts and industry practices. By tackling real-world problems and
delivering tangible results, I have grown both professionally and personally.
This journey has solidified my interest in data engineering and affirmed my
commitment to contributing to the field.