
DEVINDER GILL

(Data Engineer)
Phone: 7813549867 | Email: Gilldevin7879@gmail.com

PROFESSIONAL SUMMARY
 Around 9 years of experience in the software industry, including 5 years of experience in Azure cloud services and 4 years of experience in data warehousing.
 Experience in Azure Cloud, Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, Azure Analysis Services, Azure Cosmos DB (NoSQL), Azure big data technologies (Hadoop and Apache Spark), and Databricks.
 Experience in developing, supporting, and maintaining ETL (Extract, Transform, Load) processes using Talend Integration Suite.
 Experience migrating SQL databases to Azure Data Lake, Azure SQL Database, Databricks, and Azure SQL Data Warehouse; controlling and granting database access; and migrating on-premises databases to Azure Data Lake using Azure Data Factory.
 Experience in relational data modeling, dimensional data modeling, star and snowflake schemas, logical/physical design, ER diagrams, and OLTP/OLAP system study and analysis.
 Experience in developing very complex mappings, reusable transformations, sessions, and workflows using
Informatica ETL tool to extract data from various sources and load into targets.
 Proficiency in multiple databases like MongoDB, Cassandra, MySQL, ORACLE, and MS SQL Server.
 Used various file formats like Avro, Parquet, SequenceFile, JSON, ORC, and text for loading data, parsing, gathering, and performing transformations.
 Good experience with the Hortonworks and Cloudera Apache Hadoop distributions.
 Designed and created Hive external tables using a shared metastore with static and dynamic partitioning, bucketing, and indexing.
 Improved the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, and pair RDDs (see the PySpark sketch after this summary).
 Extensive hands-on experience tuning Spark jobs.
 Experienced in working with structured data using HiveQL and in optimizing Hive queries.
 Familiarity with Python libraries such as PySpark, NumPy, Pandas, starbase, and Matplotlib.
 Wrote complex SQL queries using joins, GROUP BY, and nested queries.
 Experience loading data into HBase using connectors and writing NoSQL queries.
 Solid capabilities in exploratory data analysis, statistical analysis, and visualization using R, Python, SQL, and Tableau.
 Ran and scheduled workflows using Oozie and Zookeeper, identifying failures and integrating, coordinating, and scheduling jobs.
 In-depth understanding of Snowflake cloud technology.
 Hands-on experience with Kafka and Flume to load log data from multiple sources directly into HDFS.
 Widely used different features of Teradata such as BTEQ, FastLoad, MultiLoad, SQL Assistant, and DDL and DML commands, with a very good understanding of Teradata UPI and NUPI, secondary indexes, and join indexes.
 Working experience building RESTful web services and RESTful APIs.
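
A minimal, generic PySpark sketch of the Spark SQL / DataFrame optimization pattern referenced above; the table, column, and path names are illustrative placeholders, not from any specific engagement.

```python
# Hedged sketch: a typical Spark SQL / DataFrame aggregation with placeholder names.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("dataframe-aggregation-sketch")
         .enableHiveSupport()            # assumes a Hive metastore is available
         .getOrCreate())

# Prune columns and filter early so less data is shuffled, then aggregate
# with built-in functions rather than Python UDFs.
orders = spark.table("sales.orders")     # placeholder database.table
daily_totals = (orders
                .select("order_date", "region", "amount")
                .where(F.col("amount") > 0)
                .groupBy("order_date", "region")
                .agg(F.sum("amount").alias("total_amount")))

daily_totals.write.mode("overwrite").partitionBy("order_date").parquet("/tmp/daily_totals")
```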

EDUCATION
 Bachelor's degree, India.
TECHNICAL SKILLS

Big Data Technologies: Hadoop, MapReduce, HDFS, Sqoop, Hive, HBase, Flume, Kafka, YARN, Apache Spark
Databases: Oracle, MySQL, SQL Server, MongoDB, DynamoDB, Cassandra, Snowflake
Programming Languages: Python, PySpark, Shell script, Perl script, SQL, Java
Tools: PyCharm, Eclipse, Visual Studio, SQL*Plus, SQL Developer, SQL Navigator, SQL Server Management Studio, Postman
Version Control: SVN, Git, GitHub, Maven
Operating Systems: Windows 10/7/XP/2000/NT/98/95, UNIX, Linux OS
Visualization/Reporting: Tableau, ggplot2, Matplotlib

PROFESSIONAL EXPERIENCE

Client: Stacknexus, California | Oct 2021 to Present


Role: Sr. Azure Data Engineer
Responsibilities:
 Architected and implemented ETL and data movement solutions using Azure Data Factory and SSIS.
 Understood business requirements, analyzed them, and translated them into application and operational requirements.
 Designed one-time load strategy for moving large databases to Azure SQL DWH.
 Extracted, transformed, and loaded data from source systems to Azure data storage services using Azure Data Factory and HDInsight.
 Created a framework for data profiling, cleansing, automatic restartability of batch pipelines, and handling of rollback strategy.
 Designed and implemented database solutions in Azure SQL Data Warehouse and Azure SQL Database.
 Lead a team of six developers to migrate the application.
 Implemented masking and encryption techniques to protect sensitive data.
 Implemented SSIS IR to run SSIS packages from ADF.
 Developed mapping document to map columns from source to target.
 Created Azure Data Factory (ADF) pipelines using Azure Blob storage.
 Performed ETL using Azure Databricks; migrated on-premises Oracle ETL processes to Azure Synapse Analytics.
 Worked on Python scripting to automate generation of scripts; performed data curation using Azure Databricks.
 Used Azure Databricks, PySpark, HDInsight, Azure SQL DW, and Hive to load and transform data.
 Implemented and developed Hive bucketing and partitioning.
 Implemented Kafka and Spark Structured Streaming for real-time data ingestion (see the streaming sketch at the end of this role).
 Used Azure Data Lake as a source and pulled data using Azure Blob storage.
 Used the Stored Procedure, Lookup, Execute Pipeline, Data Flow, Copy Data, and Azure Function activities in ADF.
 Worked on creating a star schema for drill-down analysis of data.
 Created PySpark procedures, functions, and packages to load data.
 Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics).
 Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks.
 Responsible for estimating the cluster size, monitoring, and troubleshooting of the Spark Databricks cluster.
 Creating Databricks notebooks using SQL, Python and automated notebooks using jobs.
 Creating Spark clusters and configuring high concurrency clusters using Azure Databricks to speed up the
preparation of high-quality data.
 Created and maintained optimal data pipeline architecture in Microsoft Azure using Data Factory and Azure Databricks.
Environment: Hadoop, Hive, Impala, Oracle, Spark, Pig, Sqoop, Oozie, MapReduce, Teradata, SQL, AWS (S3, Redshift, CFT, EMR), Kafka, Zookeeper, PySpark.
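
A hedged sketch of the Kafka plus Spark Structured Streaming ingestion mentioned in this role; the broker address, topic name, event schema, and ADLS paths are placeholders, not the client's actual configuration.

```python
# Hedged sketch: Kafka -> Spark Structured Streaming -> data lake, with placeholder values.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("kafka-ingest-sketch").getOrCreate()

# Placeholder event schema.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_ts", TimestampType()),
    StructField("payload", StringType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")   # placeholder broker
       .option("subscribe", "ingest-topic")                 # placeholder topic
       .option("startingOffsets", "latest")
       .load())

events = (raw.selectExpr("CAST(value AS STRING) AS json")
             .select(from_json(col("json"), schema).alias("e"))
             .select("e.*"))

# Append micro-batches to the lake; checkpointing makes the stream restartable.
query = (events.writeStream
         .format("parquet")
         .option("path", "abfss://curated@mystorageacct.dfs.core.windows.net/events/")        # placeholder
         .option("checkpointLocation", "abfss://curated@mystorageacct.dfs.core.windows.net/_chk/events/")
         .outputMode("append")
         .start())
query.awaitTermination()
```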

Client: Credit Suisse, Raleigh | Aug 2020 to Sep 2021


Role: Azure Data Engineer
Responsibilities:
 Used Agile Methodology of Data Warehouse development using Kanbanize.
 Developed data pipelines using Spark, Hive, and HBase to ingest customer behavioral data and financial histories into the Hadoop cluster for analysis.
 Performed ETL on data from different source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics).
 Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks.
 Worked on managing the Spark Databricks clusters through proper troubleshooting, estimation, and monitoring.
 Hands-on experience implementing Spark and Hive job performance tuning.
 Performed data aggregation and validation on Azure HDInsight using Spark scripts written in Python.
 Performed monitoring and management of the Hadoop cluster by using Azure HDInsight.
 Generated PL/SQL scripts for data manipulation, validation, and materialized views for remote instances.
 Created partitioned tables in Hive, designed a data warehouse using Hive external tables, and created Hive queries for analysis (see the Hive sketch at the end of this role).
 Created and modified several database objects such as Tables, Views, Indexes, Constraints, Stored
procedures, Packages, Functions and Triggers using SQL and PL/SQL.
 Extensively worked on Shell scripts for running SAS programs in batch mode on UNIX.
 Wrote Python scripts to parse XML documents and load the data into the database.
 Used Hive, Impala and Sqoop utilities and Oozie workflows for data extraction and data loading.
 Created HBase tables to store various data formats of data coming from different sources.
 Responsible for importing log files from various sources into HDFS using Flume.
 Responsible for translating business and data requirements into logical data models in support of enterprise data models, ODS, OLAP, OLTP, and operational data structures.
 Created SSIS packages to migrate data from heterogeneous sources such as MS Excel, flat files, and CSV files.
 Provided thought leadership for the architecture and design of big data analytics solutions for customers, actively driving Proof of Concept (POC) and Proof of Technology (POT) evaluations to implement big data solutions.

Environment: ADF, Databricks, ADL, Spark, Hive, HBase, Sqoop, Flume, Blob, Cosmos DB, MapReduce, HDFS, Cloudera, SQL, Apache Kafka, Azure, Python, Power BI, Unix, SQL Server.
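
A hedged sketch of the partitioned Hive external table pattern described in this role, run through PySpark; the database, table, columns, and storage location are placeholders.

```python
# Hedged sketch: partitioned external Hive table created and queried via Spark SQL.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-external-table-sketch")
         .enableHiveSupport()            # assumes a shared Hive metastore
         .getOrCreate())

spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS analytics.customer_events (
        customer_id STRING,
        event_type  STRING,
        amount      DOUBLE
    )
    PARTITIONED BY (event_date STRING)
    STORED AS PARQUET
    LOCATION '/data/curated/customer_events'   -- placeholder location
""")

# Register partitions that landed outside of Hive, then run an analysis query.
spark.sql("MSCK REPAIR TABLE analytics.customer_events")
spark.sql("""
    SELECT event_date, event_type, COUNT(*) AS events, SUM(amount) AS total_amount
    FROM analytics.customer_events
    GROUP BY event_date, event_type
""").show()
```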

Client: General Motors, Detroit, MI | Mar 2019 to Jul 2020


Role: Big Data Developer
Responsibilities:
 Involved in Requirement gathering, Business Analysis and translated business requirements into technical
design in Hadoop and Big Data.
 Involved in Sqoop implementation, which helps in loading data from various RDBMS sources into Hadoop systems and vice versa.
 Developed Python scripts to extract the data from the web server output files to load into HDFS.
 Wrote a Python script that automates launching the EMR cluster and configuring the Hadoop applications.
 Extensively worked with Avro and Parquet files, converting data between the two formats; parsed semi-structured JSON data and converted it to Parquet using DataFrames in PySpark (see the PySpark sketch at the end of this role).
 Involved in analyzing system failures, identifying root causes, and recommending courses of action; documented the system processes and procedures for future reference.
 Involved in Configuring Hadoop cluster and load balancing across the nodes.
 Involved in Hadoop installation, commissioning, decommissioning, balancing, troubleshooting, monitoring, and debugging configuration of multiple nodes using the Hortonworks platform.
 Involved in working with Spark on top of Yarn/MRv2 for interactive and Batch Analysis.
 Involved in managing and monitoring Hadoop cluster using Cloudera Manager.
 Used Python and Shell scripting to build pipelines.
 Developed data pipeline using Sqoop, HQL, Spark and Kafka to ingest Enterprise message delivery data into
HDFS.
 Developed workflows in Oozie and Airflow to automate the tasks of loading data into HDFS and pre-processing it with Pig and Hive.
 Assisted in creating and maintaining technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
 Integrated Hadoop into traditional ETL, accelerating the extraction, transformation, and loading of massive semi-structured and unstructured data.
 Loaded unstructured data into Hadoop distributed File System (HDFS).
 Created Hive tables with dynamic and static partitioning, including buckets for efficiency; also created external tables in Hive for staging purposes.
 Loaded Hive tables with data, wrote Hive queries that run on MapReduce, and created a customized BI tool for manager teams to perform query analytics using HiveQL.
 Aggregated RDDs based on the business requirements, converted RDDs into DataFrames saved as temporary Hive tables for intermediate processing, and stored the results in HBase/Cassandra and RDBMSs.

Environment: Hadoop 3.0, Hive 2.1, J2EE, JDBC, Pig 0.16, HBase 1.1, Sqoop, NoSQL, Impala, Java, Spring, MVC, XML,
Spark 1.9, PL/SQL, HDFS, JSON, Hibernate, Bootstrap, jQuery.
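
A hedged sketch of the semi-structured JSON to Parquet conversion described in this role; the input path, nested field names, and output path are placeholders.

```python
# Hedged sketch: semi-structured JSON parsed with PySpark DataFrames and written as Parquet.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("json-to-parquet-sketch").getOrCreate()

# Spark infers a schema from the JSON; nested objects become struct columns.
raw = spark.read.json("hdfs:///landing/web_logs/*.json")      # placeholder input

# Flatten the fields needed downstream and normalize the timestamp
# (assumes a nested 'user' struct and a string 'timestamp' field).
flat = raw.select(
    F.col("user.id").alias("user_id"),
    F.col("event"),
    F.to_timestamp("timestamp").alias("event_ts"),
)

# Columnar Parquet, partitioned by date for efficient pruning in Hive/Impala.
(flat.withColumn("event_date", F.to_date("event_ts"))
     .write.mode("overwrite")
     .partitionBy("event_date")
     .parquet("hdfs:///curated/web_logs_parquet"))             # placeholder output
```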

Client: Humana, Seattle, USA | Sep 2017 to Feb 2019


Role: Data Engineer
Responsibilities:
 Worked on creating tabular models on Azure analytic services for meeting business reporting requirements.
 Developed Python, PySpark, and Bash scripts to transform and load data across on-premises and cloud platforms.
 Worked on Apache Spark, utilizing the Spark SQL and Streaming components to support intraday and real-time data processing.
 Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks as part of the cloud migration.
 Created pipelines, data flows, and complex data transformations and manipulations using ADF and PySpark with Databricks.
 Good experience working with Azure Blob and Data Lake storage and loading data into Azure Synapse Analytics (SQL DW).
 Experience working with Azure SQL Database Import and Export Service.
 Created Snowpipe for continuous data loads from staged data residing on cloud gateway servers.
 Developed Spark code using Scala and Spark-SQL/Streaming for faster processing of data.
 Converted existing MapReduce jobs into Spark transformations and actions using Spark RDDs, DataFrames, and Spark SQL APIs.
 Used Spark Streaming to divide streaming data into batches as input to the Spark engine for batch processing.
 Developed transformation logic using Snowpipe; hands-on experience with Snowflake utilities, SnowSQL, Snowpipe, and big data modeling techniques using Python/Java.
 Built ETL pipelines in and out of the data warehouse using a combination of Python and Snowflake's SnowSQL, writing SQL queries against Snowflake (see the Snowflake sketch at the end of this role).

Environment:  Azure, ADF, Azure Data Lake Gen2, PySpark, Scala, Snowflake, Streaming, Agile methods.
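
A hedged sketch of the Python plus SnowSQL-style loading described in this role, using the snowflake-connector-python package as a stand-in; the account, credentials, stage, and table names are placeholders.

```python
# Hedged sketch: load staged files into Snowflake and validate the row count.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",        # placeholder account identifier
    user="etl_user",             # placeholder credentials
    password="********",
    warehouse="ETL_WH",
    database="ANALYTICS",
    schema="STAGING",
)

try:
    cur = conn.cursor()
    # COPY INTO mirrors the Snowpipe-style continuous load, run on demand here.
    cur.execute("""
        COPY INTO STAGING.CLAIMS_RAW
        FROM @CLAIMS_STAGE
        FILE_FORMAT = (TYPE = 'JSON')
        ON_ERROR = 'CONTINUE'
    """)
    # Simple validation query against the freshly loaded table.
    cur.execute("SELECT COUNT(*) FROM STAGING.CLAIMS_RAW")
    print("rows loaded:", cur.fetchone()[0])
finally:
    conn.close()
```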

Client: Charter Communications, NC | Mar 2015 to Oct 2017


Role: Data Warehouse Developer
Responsibilities:
 Created, manipulated, and supported SQL Server databases.
 Involved in the data modeling and the physical and logical design of the database.
 Helped in integration of the front end with the SQL Server backend.
 Created stored procedures, triggers, indexes, user-defined functions, constraints, etc., on various database objects to obtain the required results.
 Imported and exported data from one server to other servers using tools like Data Transformation Services (DTS).
 Wrote T-SQL statements for data retrieval and was involved in performance tuning of T-SQL queries.
 Transferred data from various data sources/business systems, including MS Excel, MS Access, and flat files, to SQL Server using SSIS/DTS with features like data conversion; also created derived columns from the existing columns for the given requirements.
 Supported the team in resolving SQL Reporting Services and T-SQL related issues; proficient in creating and formatting different types of reports such as cross-tab, conditional, drill-down, top N, summary, form, OLAP, and sub-reports.
 Provided application support over the phone; developed and tested Windows command files and SQL Server queries for production database monitoring in 24/7 support.
 Created logging for ETL loads at the package level and task level to record the number of records processed by each package and each task within a package using SSIS (see the logging sketch at the end of this role).
 Developed, monitored and deployed SSIS packages.
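
A hedged sketch of the package-level ETL load logging pattern described above, written here in Python with the pyodbc package purely for illustration (the role itself used SSIS); the connection string, log table, and column names are placeholders.

```python
# Hedged sketch: record per-package / per-task row counts in a SQL Server audit table.
import pyodbc
from datetime import datetime

# Placeholder connection string for a SQL Server instance.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=ETL_Audit;UID=etl_user;PWD=********"
)

def log_task(package_name, task_name, rows_processed):
    """Insert one audit row per task, mirroring SSIS package/task-level logging."""
    cur = conn.cursor()
    cur.execute(
        """
        INSERT INTO dbo.EtlLoadLog (PackageName, TaskName, RowsProcessed, LoadTime)
        VALUES (?, ?, ?, ?)
        """,
        package_name, task_name, rows_processed, datetime.now(),
    )
    conn.commit()
    cur.close()

log_task("LoadCustomers", "StageToDimLoad", 12500)
conn.close()
```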

Client: Acheron Software Consulting, Hyderabad | May 2012 to Oct 2014


Role: Data Warehouse Developer
Responsibilities:
 Assisted in gathering business requirements from end users.
 Coordinated tasks with onsite and offshore team members in India.
 Developed Shared Containers for reusability in all the jobs for several projects.
 Used stages like Transformer, Sequential File, Oracle, Hashed File (for lookups), Aggregator, and Folder.
 Worked with metadata definitions and the import and export of DataStage jobs using DataStage Manager.
 Involved in creating different project parameters using Administrator.
 Used DataStage Director for running, monitoring and scheduling the Jobs.
 Familiar with the import/export of DataStage components (jobs, DS routines, DS transforms, table definitions, etc.) between DataStage projects, with the Dataset Management (DSM) utility, and with the multiple-job compile utility in DataStage Manager.
 Used shared containers to simplify design and ease maintenance.
 Experienced in fine-tuning, troubleshooting, bug fixing, defect analysis, and testing of DataStage jobs.
 Used UNIX shell scripts to invoke DataStage jobs from command line.
 Involved in unit testing, functional testing, and integration testing; designed the target schema definition and the Extraction, Transformation and Loading (ETL) process using DataStage.
 Developed SQL queries to perform DDL, DML, and DCL operations.
