5 Hadoop Jobs nearby Thane

posted 2 months ago

Data Scientist

LTIMindtree Limited

6 to 11 Yrs

Pune, Bangalore

azure
artificial intelligence
data science
machine learning
deep learning
generative
ai

Location: LTIM Pan India. Experience: 5+ to 12+ yrs. Generic JD. Mandatory Skills: Data Science, Gen AI, Python, RAG and Azure/AWS/GCP, AI/ML, NLP. Secondary (any): Machine Learning, Deep Learning, ChatGPT, LangChain, Prompt, vector stores, RAG, Llama, Computer vision, OCR, Transformer, regression, forecasting, classification, hyperparameter tuning, MLOps, Inference, Model training, Model Deployment. JD: More than 6 years of experience in Data Engineering, Data Science and AI/ML domain. Excellent understanding of machine learning techniques and algorithms, such as GPTs, CNN, RNN, k-NN, Naive Bayes, SVM, Decision Forests, etc. Experience using business intelligence tools (e.g. Tableau, Power BI) and data frameworks (e.g. Hadoop). Experience in cloud-native skills. Knowledge of SQL and Python; familiarity with Scala, Java or C++ is an asset. Analytical mind, business acumen and strong math skills (e.g. statistics, algebra). Experience with common data science toolkits, such as TensorFlow, Keras, PyTorch, pandas, Microsoft CNTK, NumPy etc. Deep expertise in at least one of these is highly desirable. Experience with NLP, NLG and Large Language Models like BERT, LLaMA, LaMDA, GPT, BLOOM, PaLM, DALL-E, etc. Great communication and presentation skills. Should have experience working in a fast-paced team culture. Experience with AI/ML and Big Data technologies like AWS SageMaker, Azure Cognitive Services, Google Colab, Jupyter Notebook, Hadoop, PySpark, Hive, AWS EMR etc. Experience with NoSQL databases, such as MongoDB, Cassandra, HBase, and vector databases. Good understanding of applied statistics skills, such as distributions, statistical testing, regression, etc. Should be a data-oriented person with an analytical mind and business acumen.

INTERVIEW ASSURED IN 15 MINS

posted 2 months ago

GCP Big Data Engineer

Acme Services Private Limited

6 to 11 Yrs

16 - 28 LPA

Pune, Bangalore

spark
gcp
airflow
data
scala
big
query

Job Description: We are seeking a seasoned GCP Data Analytics professional with extensive experience in Big Data technologies and Google Cloud Platform services to design and implement scalable data solutions. Design, develop, and optimize data pipelines using GCP BigQuery, Dataflow, and Apache Airflow to support large-scale data analytics. Utilize the Big Data Hadoop ecosystem to manage and process vast datasets efficiently. Collaborate with cross-functional teams to gather requirements and deliver reliable data solutions. Ensure data quality, consistency, and integrity across multiple data sources. Monitor and troubleshoot data workflows to maintain high system availability and performance. Stay updated with emerging trends and best practices in GCP data analytics and big data technologies. Roles and Responsibilities: Implement and manage ETL processes leveraging GCP services such as BigQuery, Dataflow, and Airflow. Develop scalable, maintainable, and reusable data pipelines to support business intelligence and analytics needs. Optimize SQL queries and data models for performance and cost efficiency in BigQuery. Integrate Hadoop ecosystem components with GCP services to enhance data processing capabilities. Automate workflow orchestration using Apache Airflow for seamless data operations. Collaborate with data engineers, analysts, and stakeholders to ensure alignment of data solutions with organizational goals. Participate in code reviews, testing, and deployment activities, adhering to best practices. Mentor junior team members and contribute to continuous improvement initiatives within the data engineering team. Mandatory Skills: GCP Storage, GCP BigQuery, GCP DataProc, GCP Cloud Composer, GCP DMS, Apache Airflow, Java, Python, Scala, GCP Datastream, Google Analytics Hub, GCP Workflows, GCP Dataform, GCP Datafusion, GCP Pub/Sub, ANSI-SQL, GCP Dataflow, Big Data Hadoop Ecosystem

INTERVIEW ASSURED IN 15 MINS

posted 2 months ago

Data Engineer

CMA CGM Global Business Services (India)

4 to 8 Yrs

Thane, Maharashtra

Database Design
Data Integration
Apache Spark
Kafka
SQL
MongoDB
AWS
Azure
ETL Pipelines
NoSQL databases

Role Overview: As a Data Engineer, you will be responsible for handling the flow and storage of data efficiently, ensuring the model has access to the right datasets. Key Responsibilities: - Designing efficient database systems to store large volumes of transactional or behavioral data using SQL, NoSQL, Hadoop, and Spark. - Building ETL pipelines to collect, transform, and load data from various sources into a usable format for analysis. - Integrating different data sources, including transactional data, customer behavior, and external data sources. Qualifications Required: - Experience in database design and management. - Proficiency in ETL pipeline development. - Knowledge of data integration techniques. - Familiarity with tools and technologies such as Apache Spark, Kafka, SQL, NoSQL databases (e.g., MongoDB), and cloud data services (AWS, Azure). - Knowledge of UX/UI tools like Sketch, Adobe XD, Tableau, Power BI would be beneficial for building interfaces to visualize fraud prediction and alerts, as well as user-friendly dashboards.

ACTIVELY HIRING

posted 1 week ago

Python Developer

KAP SOFTECH

5 to 7 Yrs

4.0 - 6 LPA

Mumbai City

python
django
iot

Key Responsibilities: Design, develop, and maintain robust applications and APIs using Python. Work with cross-functional teams (Data Science, Analytics, DevOps) to implement AI/ML models into production. Build data pipelines and perform data wrangling, transformation, and analysis. Contribute to analytics dashboards and reporting systems for business insights. Optimize code for performance, scalability, and reliability. Ensure best practices in version control, testing, CI/CD, and documentation. Stay updated with emerging technologies (AI, ML, analytics, DevOps). Must-Have Skills: Strong programming skills in Python (mandatory). Hands-on experience with Python libraries/frameworks such as Pandas, NumPy, Flask/Django, FastAPI. Proficiency in data manipulation, scripting, and automation. Strong understanding of OOP, design patterns, and software engineering principles. Familiarity with databases (SQL/NoSQL). Good-to-Have Skills: Knowledge of AI/ML frameworks (TensorFlow, PyTorch, Scikit-learn, Hugging Face). Experience in data visualization and analytics tools (Matplotlib, Seaborn, Power BI, Tableau). Exposure to cloud platforms and services. Understanding of containerization & orchestration (Docker, Kubernetes). Knowledge of big data tools (Spark, Hadoop) and streaming platforms (Kafka). Experience integrating ML/AI models into production systems.

posted 2 months ago

Kafka Developer

Tata Technologies

3 to 7 Yrs

Thane, Maharashtra

Kafka
AWS
SQL
Python
Airflow
PySpark
Docker
Containers

As a Technical Support Engineer with 3-5 years of experience in Kafka implementation, your responsibilities will include: - Designing, implementing, and maintaining enterprise RedHat Kafka or other editions - Creating and managing Kafka Clusters on cloud environments and Containers - Integrating applications with Kafka - Proficiency in scripting languages like Python and Spark - Strong understanding of message queuing and stream processing architecture - Contributing to team design discussions with detailed technical information - Developing high-quality, scalable, and extensible code - Providing work breakdown and estimates for complex software development tasks - Identifying strategic/tactical solutions, conducting risk assessments, and making recommendations - Expertise in Linux and cloud-based operating systems - Collaborating with multiple teams to optimize Kafka usage and ensure data safety in event streaming Qualifications required for this role: - Hands-on experience in the end-to-end implementation of Kafka - Proficiency in Python, Spark, and Hadoop - Familiarity with ELK and Kibana - Experience in Docker and container-oriented infrastructure - Ability to multitask and prioritize in a fast-paced, team-oriented environment - Bachelor's degree in computer science or equivalent work experience In addition to the technical responsibilities, you are ideally expected to have 3+ years of experience and a proven track record of building enterprise-level Kafka solutions.

ACTIVELY HIRING

posted 1 month ago

Data Engineer

CAPGEMINI TECHNOLOGY SERVICES INDIA LIMITED

6 to 11 Yrs

Pune, Bangalore

sql
scala
cloud
django
hadoop
python
flask
devops
pyspark

Job Description - Data Engineer. Total years of experience should be 6+ years. Must have 3+ years in PySpark. Strong programming experience; Python, PySpark, Scala is preferred. Experience in designing and implementing CI/CD, Build Management, and Development strategy. Experience with SQL and SQL analytical functions; experience participating in key business, architectural and technical decisions. Scope to get trained on AWS cloud technology. Proficient in leveraging Spark for distributed data processing and transformation. Skilled in optimizing data pipelines for efficiency and scalability. Experience with real-time data processing and integration. Familiarity with Apache Hadoop ecosystem components. Strong problem-solving abilities in handling large-scale datasets. Ability to collaborate with cross-functional teams and communicate effectively with stakeholders. Primary Skills: PySpark SQL. Secondary Skill: Experience on AWS/Azure/GCP would be an added advantage.

INTERVIEW ASSURED IN 15 MINS

posted 3 weeks ago

Java IOT with Hadoop

Cognizant

1 to 5 Yrs

Pune, All India

Java
Spring boot
J2EE
Spring Framework
REST
SOAP
Tomcat
JETTY
JBOSS
Relational Databases
Oracle
Informix
MySQL
Kafka
WMQ
Active MQ
RabbitMQ
JUnit
TestNG
Selenium
XML
JSON
Cloud
Azure
AWS
Data structures
Micro services
Java Application Servers
TOMEE
Messaging Systems
Testing frameworks
Web Services APIs
CICD
Algorithm

As a Java Developer at our company, you will be responsible for the following: - Developing Web Service REST and SOAP and Micro services using Java and Spring framework. - Having a good understanding and working knowledge with any one or more of Java Application Servers such as Tomcat, TOMEE, JETTY, JBOSS, WAS. - Working with Relational Databases like Oracle, Informix, MySQL. - Using Messaging Systems like Kafka, WMQ, Active MQ, RabbitMQ. - Implementing Testing frameworks like JUnit, TestNG, Selenium. - Working with Web Services APIs including REST, SOAP, XML, JSON. Qualifications required for this role include: - Strong background in Java, Spring boot, J2EE, Spring Framework, Spring Boot development. - Minimum One year experience in Cloud platforms such as Azure or AWS. - Exposure to modern deployment and CI/CD processes. - Excellent programming skills including data structures and algorithms. If you join our team, you will have the opportunity to work with cutting-edge technologies and contribute to the development of innovative solutions.

ACTIVELY HIRING

posted 5 days ago

Big Data Engineer (Spark & Scala)

Golden Opportunities

6 to 10 Yrs

Maharashtra

Apache Spark
Scala
ETL
Hadoop
Hive
HDFS
AWS
Azure
GCP

Role Overview: As a Big Data Engineer specializing in Spark & Scala, you will be responsible for developing and optimizing big data processing pipelines using Apache Spark and Scala. Your role will involve designing and implementing ETL workflows for large-scale batch and real-time data processing. Additionally, you will be expected to optimize Spark performance through partitioning, caching, and memory/shuffle tuning. Collaboration with cross-functional teams and adherence to best practices in coding, testing, and deployment are essential aspects of this role. Key Responsibilities: - Develop and optimize big data processing pipelines using Apache Spark (Core, SQL, Streaming) and Scala. - Design and implement ETL workflows for large-scale batch and real-time data processing. - Optimize Spark performance through partitioning, caching, and memory/shuffle tuning. - Work with big data ecosystems like Hadoop, Hive, HDFS, and cloud platforms (AWS/Azure/GCP). - Collaborate with cross-functional teams and follow best practices in coding, testing, and deployment. Qualifications Required: - Bachelor's degree in a relevant field. - 6 to 10 years of experience in a similar role. - Strong expertise in Apache Spark and Scala. - Familiarity with big data ecosystems like Hadoop, Hive, HDFS, and cloud platforms. - Self-confidence and patience. (Note: The additional details of the company were not provided in the job description.)

ACTIVELY HIRING

posted 2 weeks ago

Data Engineers

SID Global Solutions

6 to 10 Yrs

Maharashtra

SQL
Python
Scala
Java
Airflow
Spark
AWS
GCP
Azure
Hadoop
Kafka
NiFi

You will be responsible for designing, building, and maintaining data pipelines (ETL / ELT) and ingesting, transforming, and integrating data from various sources. You will also optimize data storage in data lakes and warehouses, ensuring data quality, consistency, and governance. Additionally, you will collaborate with analytics and data science teams on datasets and set up monitoring, logging, and alerting for data infrastructure. Key Responsibilities: - Design, build, and maintain data pipelines (ETL / ELT) - Ingest, transform, and integrate data from various sources - Optimize data storage in data lakes and warehouses - Ensure data quality, consistency, and governance - Collaborate with analytics and data science teams on datasets - Set up monitoring, logging, and alerting for data infrastructure Qualifications Required: - 6+ years in data engineering or related roles - Proficiency in SQL, Python, Scala, or Java - Experience with ETL/ELT tools such as Airflow, Spark, NiFi, etc. - Familiarity with cloud data platforms like AWS, GCP, Azure - Knowledge of big data technologies like Hadoop, Kafka, Spark (a plus) - Experience in data modeling, partitioning, and performance tuning (Note: No additional company details were provided in the job description.)

ACTIVELY HIRING

posted 1 week ago

Data/Information Mgt Int Anlst

Citi

5 to 9 Yrs

Pune, Maharashtra

Hive
Hadoop
SQL
Excel
Tableau
Power BI
Google Analytics
Adobe Analytics
Data visualization
Stakeholder management
Project management
Mentoring
Data analysis
Communication
BigData systems
Spark Python

Role Overview: You will be a part of Citi Analytics Information Management, a global community that connects and analyzes information to create actionable intelligence for business leaders. As a member of this fast-growing organization, you will be responsible for developing and maintaining reporting systems, collaborating with cross-functional teams, interpreting data to provide insights, and managing end-to-end project communications. Key Responsibilities: - Develop and maintain reporting systems to track key performance metrics, collaborating with cross-functional teams for accurate and timely delivery of reports and dashboards. - Rationalize, enhance, transform, and automate reports as required, performing adhoc and root cause analysis to address specific challenges. - Interpret data to identify trends, patterns, and anomalies, providing insights to stakeholders for informed decision-making. - Translate data into customer behavioral insights for targeting and segmentation strategies, effectively communicating with business partners and senior leaders. - Collaborate and manage project communication with onsite business partners and team in India, leading projects and mentoring a team of analysts. - Ensure data accuracy and consistency by following standard control procedures and adhering to Citis Risk and Control guidelines. Qualifications Required: - 5+ years of experience in BigData systems, Hive, Hadoop, Spark (Python), and cloud-based data management technologies. - Proficiency in SQL, Excel, and data visualization tools like Tableau, Power BI, or similar software. - Knowledge of digital channels, marketing, and tools used for audience engagement. - Expertise in Google Analytics/Adobe Analytics for tracking and reporting website traffic and journey analytics. - Strong background in reporting and data analysis, excellent communication and stakeholder management skills. 
- Ability to create presentations, present reports, findings, and recommendations to diverse audiences. - Proven ability to manage projects, mentor teams, and contribute to organizational initiatives. - Bachelor's degree in Computer Science, Engineering, or related field. Additional Company Details: Citi Analytics Information Management was established in 2003 with locations across multiple cities in India including Bengaluru, Chennai, Gurgaon, Mumbai, and Pune. The function aims to balance customer needs, business strategy, and profit objectives using best-in-class analytic methodologies. (Note: Omitted the irrelevant sections such as EEO Policy Statement and Other Relevant Skills)

ACTIVELY HIRING

posted 1 week ago

Data/Information Mgt Int Anlst - C11

Early Career

5 to 9 Yrs

Pune, Maharashtra

Hive
Hadoop
SQL
Excel
Tableau
Power BI
Google Analytics
Adobe Analytics
Data visualization
Stakeholder management
Project management
Mentoring
Data analysis
Communication
BigData systems
Spark Python

As a member of Citi Analytics Information Management, you will play a crucial role in developing and maintaining reporting systems to track key performance metrics aligned with the organization's goals. Your responsibilities will include collaborating with cross-functional teams to ensure accurate and timely delivery of reports and dashboards. Additionally, you will rationalize, enhance, transform, and automate reports as required, while performing adhoc analysis and root cause analysis to address specific reporting challenges. You will be expected to interpret data to identify trends, patterns, and anomalies, providing valuable insights to stakeholders to support informed decision-making. Your role will involve translating data into customer behavioral insights to drive targeting and segmentation strategies. Effective communication skills will be essential as you will be required to clearly and effectively communicate with business partners and senior leaders. Furthermore, you will collaborate individually and manage end-to-end project communication with onsite business partners and the team in India. Your leadership skills will be put to the test as you lead projects and mentor a team of analysts to maintain a high standard of work. It will be crucial to ensure data accuracy and consistency by following standard control procedures and adhering to Citis Risk and Control guidelines. To excel in this role, you should possess 5+ years of experience in BigData systems, Hive, Hadoop, Spark (Python), and cloud-based data management technologies. Proficiency in SQL, Excel, and data visualization tools such as Tableau, Power BI, or similar software is required. Knowledge of digital channels, marketing methods, and tools used by businesses to engage with their audience is essential. Additionally, expertise in using Google Analytics or Adobe Analytics to track and report website traffic, funnel performance, and journey analytics is preferred. 
Your educational background should include a Bachelor's degree in computer science, engineering, or a related field. Strong communication and stakeholder management skills are a must, as well as the ability to create presentations and present reports, findings, and recommendations to diverse audiences. Your proven ability to manage projects and mentor a team will be valuable assets in this role. This job description provides an overview of the responsibilities and expertise required for the position. Other duties may be assigned as needed. Position: C11 Job Family Group: Decision Management Job Family: Data/Information Management Time Type: Full time Most Relevant Skills: - 5+ years of experience in BigData systems, Hive, Hadoop, Spark (Python), and cloud-based data management technologies - Proficiency in SQL, Excel, and data visualization tools such as Tableau, Power BI or similar software - Knowledge of digital channels, marketing methods, and tools used by businesses to engage with their audience - Expertise in using Google Analytics or Adobe Analytics - Strong communication and stakeholder management skills - Project management and team mentoring experience Preferred Qualifications: - Exposure to Digital Business and Expertise in Adobe Site Catalyst, Clickstream Data If you require a reasonable accommodation to use our search tools or apply for a career opportunity due to a disability, please review Accessibility at Citi.,

ACTIVELY HIRING

posted 1 day ago

Data Engineer (Python, PySpark, Iceberg) - Assistant Vice President

Early Career

8 to 14 Yrs

Pune, Maharashtra

Python
Apache Spark
Hadoop
AWS
Azure
GCP
SQL
Oracle
PostgreSQL
Docker
Kubernetes
Data Pipeline Development
Big Data Infrastructure
Apache Iceberg
Apache Hudi
Trino
Apache Airflow
Prefect

As an Applications Development Senior Programmer Analyst at our company, you will play a crucial role in establishing and implementing new or revised application systems and programs in coordination with the Technology team. Your main responsibilities will involve conducting feasibility studies, providing IT planning, developing new applications, and offering user support. You will need to leverage your specialty knowledge to analyze complex problems, recommend security measures, and consult with users on advanced programming solutions. Additionally, you will be responsible for ensuring operating standards are followed and acting as an advisor to junior analysts. **Key Responsibilities:** - Conduct feasibility studies, time and cost estimates, and risk analysis for applications development - Monitor and control all phases of the development process including analysis, design, testing, and implementation - Provide user support and operational assistance on applications - Analyze complex problems and provide evaluation of business and system processes - Recommend and develop security measures for successful system design - Consult with users on advanced programming solutions and assist in system installations - Define operating standards and processes and serve as an advisor to junior analysts **Qualifications:** - 8-14 years of relevant experience - Strong experience in systems analysis and programming - Proven track record in managing and implementing successful projects - Working knowledge of consulting and project management techniques - Ability to work under pressure and manage deadlines effectively - Proficiency in Python programming language - Expertise in data processing frameworks like Apache Spark, Hadoop - Experience with cloud data platforms such as AWS, Azure, or GCP - Strong knowledge of SQL and database technologies - Familiarity with data orchestration tools like Apache Airflow or Prefect - Experience with containerization technologies like Docker and 
Kubernetes would be a plus. In addition to the above responsibilities and qualifications, please note that this job description provides a high-level overview of the work performed. Additional duties may be assigned as required. If you are a person with a disability and require a reasonable accommodation to apply for this role, please review the Accessibility at Citi information.

ACTIVELY HIRING

posted 3 days ago

Mid/Sr. Software Engineer

ReliaQuest

3 to 15 Yrs

Maharashtra

Python
JS
Angular
Java
C
MySQL
Elastic Search
Elasticsearch
Kafka
Apache Spark
Logstash
Hadoop
Hive
Kibana
Athena
Presto
BigTable
AWS
GCP
Azure
unit testing
continuous integration
Agile Methodology
React
Tensorflow

Role Overview: As a Software Engineer at ReliaQuest, you will have the opportunity to work on cutting-edge technologies and drive the automation of threat detection and response for a rapidly growing industry. You will be responsible for researching and developing creative solutions, creating REST APIs, managing deployment processes, performing code reviews, and automating various stages of the software development lifecycle. Collaboration with internal and external stakeholders will be key to ensure seamless product utilization. Key Responsibilities: - Research and develop solutions using cutting-edge technologies to evolve the GreyMatter platform - Create REST APIs and integrations to enhance and automate threat detection for customers - Manage continuous integration and deployment processes for complex technologies - Conduct code reviews to ensure consistent improvement - Automate and enhance all stages of software development lifecycle - Collaborate closely with different parts of the business to facilitate easy product utilization - Provide support to team members and foster a culture of collaboration Qualifications Required: - 3-6 years of Software Development experience for mid-level roles and 7-15 years for Senior-level positions in Python, JS, React, Angular, Java, C#, MySQL, Elastic Search or equivalent - Proficiency in written and verbal English - Hands-on experience with technologies such as Elasticsearch, Kafka, Apache Spark, Logstash, Hadoop/hive, Tensorflow, Kibana, Athena/Presto/BigTable, Angular, React - Familiarity with cloud platforms like AWS, GCP, or Azure - Strong understanding of unit testing, continuous integration, and deployment practices - Experience with Agile Methodology - Higher education or relevant certifications This job at ReliaQuest offers you the chance to be part of a dynamic team working on groundbreaking security technology. 
Join us to contribute to the growth and success of the company while learning from some of the best in the industry.

ACTIVELY HIRING

posted 1 week ago

Senior Software Engineer - Analytics

LogiNext

4 to 8 Yrs

Maharashtra

Software Development
Big Data
Algorithms
Statistics
Machine Learning
Continuous Integration
Indexing
Clustering
SQL
Redis
Hadoop
Yarn
Spark
Kafka
PostGIS
Data Visualization
Cloud Environment
Agile Scrum Processes
NoSQL DBs
Time Series DBs
Geospatial DBs
Python Programming
Data Processing Analytics
Querying
Mongo
Cassandra
Redshift
Pig/Hive
Machine Learning Algorithms

Role Overview: As a Senior Software Engineer - Analytics at LogiNext, you will be responsible for building data products that extract valuable business insights for efficiency and customer experience. Your role will involve managing, processing, and analyzing large amounts of raw information in scalable databases. Additionally, you will be developing unique data structures and writing algorithms for new products. Critical thinking and problem-solving skills will be essential, along with experience in software development and advanced algorithms. Exposure to statistics and machine learning algorithms as well as familiarity with cloud environments, continuous integration, and agile scrum processes will be beneficial. Key Responsibilities: - Develop software that generates data-driven intelligence in products dealing with Big Data backends - Conduct exploratory analysis of data to design efficient data structures and algorithms - Manage data in large-scale data stores (e.g., NoSQL DBs, time series DBs, Geospatial DBs) - Create metrics and evaluate algorithms for improved accuracy and recall - Ensure efficient data access and usage through methods like indexing and clustering - Collaborate with engineering and product development teams Qualifications Required: - Master's or Bachelor's degree in Engineering (Computer Science, Information Technology, Information Systems, or related field) from a top-tier school, or a master's degree or higher in Statistics, Mathematics, with a background in software development - 4 to 7 years of experience in product development with algorithmic work - 3+ years of experience working with large data sets or conducting large-scale quantitative analysis - Understanding of SaaS-based products and services - Strong algorithmic problem-solving skills - Ability to mentor and manage a team, taking responsibility for team deadlines - Proficiency in Python programming language - Experience with data processing analytics and visualization tools in 
Python (e.g., pandas, matplotlib, SciPy) - Strong understanding of SQL and querying NoSQL databases (e.g., Mongo, Cassandra, Redis) - Understanding of working with and managing large databases, such as indexing, sharding, caching, etc. - Exposure to Big Data technologies like Hadoop, Yarn, Redshift, Spark, Kafka, Pig/Hive - Exposure to machine learning algorithms - Familiarity with geospatial data stores, with exposure to PostGIS being a plus - Desirable exposure to data visualization tools

ACTIVELY HIRING

posted 3 days ago

Cloud Data Engineer

Hitachi Careers

6 to 10 Yrs

Pune, Maharashtra

SQL
Python
Hadoop
Spark
AWS
Azure
GCP
Data governance
Data security
Compliance
ETL processes

As a Data Engineer at the company, you will be responsible for designing, implementing, and maintaining the data infrastructure and pipelines necessary for AI/ML model training and deployment. Working closely with data scientists and engineers, you will ensure that data is clean, accessible, and efficiently processed.

Key Responsibilities:
- Build and maintain scalable data pipelines for data collection, processing, and analysis.
- Ensure data quality and consistency for training and testing AI models.
- Collaborate with data scientists and AI engineers to provide the required data for model development.
- Optimize data storage and retrieval to support AI-driven applications.
- Implement data governance practices to ensure compliance and security.

Qualifications Required:
- 6-8 years of experience in data engineering, preferably in financial services.
- Strong proficiency in SQL, Python, and big data technologies (e.g., Hadoop, Spark).
- Experience with cloud platforms (e.g., AWS, Azure, GCP) and data warehousing solutions.
- Familiarity with ETL processes and tools, as well as knowledge of data governance, security, and compliance best practices.

At GlobalLogic, you will experience a culture of caring where people come first. You will be part of an inclusive culture of acceptance and belonging, building meaningful connections with collaborative teammates, supportive managers, and compassionate leaders. The company is committed to your continuous learning and development, offering opportunities to try new things, sharpen your skills, and advance your career. You will work on projects that matter, engage your curiosity and problem-solving skills, and contribute to cutting-edge solutions shaping the world today. GlobalLogic supports balance and flexibility in work and life, providing various career areas, roles, and work arrangements. Joining GlobalLogic means being part of a high-trust organization where integrity is key, and trust is fundamental to relationships with employees and clients.

ACTIVELY HIRING

posted 2 weeks ago

Senior Staff Engineer

Nagarro

8 to 12 Yrs

Pune, Maharashtra

Talend
SSIS
DataStage
Azure
GCP
NoSQL
Spark
Dynatrace
JIRA
Process automation
Performance tuning
Communication skills
Stakeholder management
Team leadership
Mentoring
Compliance management
Automation
Scripting
Metadata management
Master data management
ETL development
Technical solutioning
Leading largescale data initiatives
ETL tools Informatica
Data pipeline orchestration tools Apache Airflow
Azure Data Factory
Cloud platforms AWS
Databases SQL
Big data ecosystems Hadoop
Monitoring tools Splunk
Prometheus
Grafana
ITSM platforms ServiceNow
CICD
DevOps practices
RFP solutioning
Client proposals
Technical strategy alignment
Data management standards
Emerging technologies evaluation
Legacy systems modernization
Security standards
Audit standards
Regulatory standards
Selfhealing mechanisms
Workflow orchestration
Lin

As a candidate for the position at Nagarro, you should have over 7.5 years of experience in data operations, ETL development, and technical solutioning, and you will be responsible for leading large-scale data initiatives and RFP responses. The role calls for hands-on experience with ETL tools such as Informatica, Talend, and SSIS, and with data pipeline orchestration tools like Apache Airflow and Azure Data Factory. Exposure to cloud platforms (AWS, Azure, GCP), databases (SQL, NoSQL), and big data ecosystems (Hadoop, Spark) is essential. Familiarity with monitoring tools for data environments (e.g., Splunk, Dynatrace, Prometheus, Grafana), ITSM platforms (e.g., ServiceNow, JIRA), CI/CD, and DevOps practices is also required.

Key Responsibilities:
- Lead the technical and business solutioning of RFPs and client proposals related to data engineering, data operations, and platform modernization.
- Collaborate with architecture, governance, and business teams to align on technical strategy and data management standards.
- Evaluate emerging technologies and frameworks to modernize legacy systems and improve efficiency.
- Mentor and guide a team of data engineers and operations analysts, fostering a culture of technical excellence and continuous improvement.
- Act as a primary liaison with business, support, and governance teams for operational matters.
- Drive automation, self-healing mechanisms, and process automation across data operations to reduce manual interventions and improve system resilience.
- Implement and enforce best practices including metadata management, lineage tracking, data quality monitoring, and master data management.

Qualifications:
- Bachelor's or master's degree in Computer Science, Information Technology, or a related field.

Please note that Nagarro is a Digital Product Engineering company with a dynamic and non-hierarchical work culture, comprising over 17,500 experts across 39 countries. Join us in building products, services, and experiences that inspire, excite, and delight.

ACTIVELY HIRING

posted 2 weeks ago

Sr. Big Data Engineer

Facile Services

5 to 9 Yrs

Pune, Maharashtra

Hadoop
Apache Spark
EMR
Athena
Glue
Python
JSON
NoSQL
Databricks
Delta Tables
PySpark
AWS data analytics services
Parquet file format
RDBMS databases

As a Big Data Engineer, you will play a crucial role in developing and managing the Big Data solutions for our company. Your responsibilities will include designing and implementing Big Data tools and frameworks, implementing ELT processes, collaborating with development teams, building cloud platforms, and maintaining the production system.

Key Responsibilities:
- Meet with managers to determine the company's Big Data needs.
- Develop Big Data solutions on AWS using tools such as Apache Spark, Databricks, Delta Tables, EMR, Athena, Glue, and Hadoop.
- Load disparate data sets and conduct pre-processing services using Athena, Glue, Spark, etc.
- Collaborate with software research and development teams.
- Build cloud platforms for the development of company applications.
- Maintain production systems.

Qualifications Required:
- 5+ years of experience as a Big Data Engineer.
- Proficiency in Python and PySpark.
- In-depth knowledge of Hadoop, Apache Spark, Databricks, Delta Tables, and AWS data analytics services.
- Extensive experience with Delta Tables, JSON, and the Parquet file format.
- Experience with AWS data analytics services like Athena, Glue, Redshift, and EMR.
- Familiarity with data warehousing will be a plus.
- Knowledge of NoSQL and RDBMS databases.
- Good communication skills.
- Ability to solve complex data processing and transformation problems.

ACTIVELY HIRING

posted 2 months ago

Officer / Assistant Manager - Big Data / Hadoop ETL Developer

CDSL

2 to 6 Yrs

Maharashtra

Data warehousing
Data solutions
Data marts
SQL
Oracle
SQOOP
ETL development
Data storage
Data warehousing concepts
Logical data model
Physical database structure
Operational data stores
NIFI

As an Officer / Assistant Manager based in Mumbai, you should have a minimum of 2-3 years of ETL development experience with knowledge of ETL concepts, tools, and data structures.

Your responsibilities will include:
- Analyzing and troubleshooting complicated data sets
- Determining data storage needs
- Building a data warehouse for internal departments using data warehousing concepts
- Creating and enhancing data solutions for seamless data delivery
- Collecting, parsing, managing, and analyzing large sets of data
- Leading the design of logical data models and implementing physical database structures
- Designing, developing, automating, and supporting complex applications for data extraction, transformation, and loading
- Ensuring data quality during ETL processes
- Developing logical and physical data flow models for ETL applications
- Utilizing advanced knowledge of SQL, Oracle, SQOOP, and NIFI tools, commands, and queries

Qualifications required for this role include a B.E., MCA, B.Tech, or M.Sc (I.T.) degree, and an age limit of 25-30 years. If you are interested in this position, please email your resume to careers@cdslindia.com with the position applied for clearly mentioned in the subject line.

ACTIVELY HIRING

posted 2 weeks ago

Big Data Solution Architect

Emergys

8 to 12 Yrs

Pune, Maharashtra

Cost Optimization
Hadoop Ecosystem Expertise
DevOps Cloud Preferably Azure
Architectural Leadership
EndtoEnd Big Data Delivery
Project Discovery Due Diligence

Role Overview: As a Big Data Architect with over 8 years of experience, you will be responsible for demonstrating expertise in the Hadoop Ecosystem, DevOps & Cloud (Preferably Azure), Architectural Leadership, End-to-End Big Data Delivery, Cost Optimization, and Project Discovery / Due Diligence. Your role will involve leading end-to-end projects, optimizing cloud costs, and defining solution strategies.

Key Responsibilities:
- Strong hands-on experience with core Hadoop components and related big data technologies
- Utilize DevOps tools and CI/CD processes on cloud platforms, with a preference for Microsoft Azure
- Lead at least three end-to-end projects as a Big Data Architect
- Deliver complete big data solutions from requirement gathering to post-production support
- Optimize cloud and infrastructure costs for data platforms
- Participate in project discovery, assessment, or due diligence phases to define scope and solution strategy

Qualifications Required:
- 8+ years of experience in Big Data Architecture
- Expertise in the Hadoop Ecosystem
- Experience with DevOps tools and cloud platforms, preferably Microsoft Azure
- Proven track record of leading end-to-end projects
- Strong ability to optimize cloud and infrastructure costs
- Previous involvement in project discovery or due diligence processes

If you find that your skills match the requirements of this role, we encourage you to apply directly. Feel free to refer or share this opportunity with someone who you believe would be a strong fit.

ACTIVELY HIRING

posted 1 week ago

Data/Information Management Analyst - C11

Citi

5 to 9 Yrs

Pune, Maharashtra

Hive
Hadoop
SQL
Excel
Tableau
Power BI
Google Analytics
Adobe Analytics
Data visualization
Stakeholder management
Project management
Mentoring
Communication skills
BigData systems
Spark Python

You will be working with Citi Analytics Information Management, a global community that connects and analyzes information to create actionable intelligence for business leaders.

As part of your role, you will have the following responsibilities:
- Develop and maintain reporting systems to track key performance metrics, collaborating with cross-functional teams for accurate and timely delivery.
- Rationalize, enhance, transform, and automate reports as required, performing ad hoc and root cause analysis.
- Interpret data to identify trends, patterns, and anomalies, providing insights to stakeholders for informed decision-making.
- Translate data into customer behavioral insights to drive targeting and segmentation strategies, communicating effectively to business partners and senior leaders.
- Collaborate and manage end-to-end project communication with onsite business partners and the team in India.
- Lead projects and mentor a team of analysts, ensuring high-quality work.
- Ensure data accuracy and consistency by following standard control procedures and adhering to Citi's Risk and Control guidelines.

To excel in this role, you should have:
- 5+ years of experience in BigData systems, Hive, Hadoop, Spark (Python), and cloud-based data management technologies.
- Proficiency in SQL, Excel, and data visualization tools like Tableau, Power BI, or similar software.
- Knowledge of digital channels, marketing, and various methods/tools businesses use to engage with their audience.
- Expertise in using Google Analytics/Adobe Analytics for tracking and reporting website traffic and journey analytics.
- Strong background in reporting and data analysis, with excellent communication and stakeholder management skills.
- Ability to manage projects, mentor a team, and contribute to organizational initiatives.

Preferred qualifications include exposure to Digital Business and expertise in Adobe Site Catalyst and Clickstream Data.

Educational Requirement:
- Bachelor's degree in Computer Science, Engineering, or a related field.

This job description provides an overview of the work performed, and other job-related duties may be assigned as required.

ACTIVELY HIRING