6 Avro Jobs nearby Warangal

posted 2 months ago

Software Architect-Scala

Infosys

9 to 14 Yrs

Hyderabad, Telangana

Scala
EMR
JSON
Protocol Buffers
Build Tools
GIT
data structures
algorithms
communication skills
Spark Core
RDDs
Spark SQL
Spark Optimization Techniques
Scala Functional Programming
Scala OOPS principles
Hadoop Environment
AWS S3
Python programming
Workflow Orchestration tools
API calls in Scala
Apache AVRO
Parquet
Geospatial data analytics
Test cases using frameworks like scalatest
analytical abilities

Role Overview: As a seasoned Spark Developer with 5+ years of experience, you have a deep understanding of developing, testing, deploying, and debugging Spark jobs using Scala on the Hadoop platform. Your expertise includes proficiency in Spark Core, RDDs, and Spark SQL; knowledge of Spark optimization techniques and best practices; familiarity with Scala functional programming concepts like Try, Option, Future, and Collections; understanding of Scala OOP principles including Classes, Traits, Objects (Singleton and Companion), and Case Classes; a strong grasp of Scala language features such as the type system and implicits/givens; hands-on experience with the Hadoop environment (HDFS/Hive), AWS S3, and EMR; proficiency in Python programming and workflow orchestration tools like Airflow and Oozie; and experience making API calls in Scala and working with file formats like Apache Avro, Parquet, and JSON.

Key Responsibilities:
- Develop, test, deploy, and debug Spark jobs using Scala on the Hadoop platform
- Work proficiently with Spark Core, RDDs, and Spark SQL
- Implement Spark optimization techniques and best practices
- Utilize Scala functional programming concepts like Try, Option, Future, and Collections
- Apply Scala OOP principles including Classes, Traits, Objects (Singleton and Companion), and Case Classes
- Demonstrate a strong grasp of Scala language features such as the type system and implicits/givens
- Gain hands-on experience in the Hadoop environment (HDFS/Hive), AWS S3, and EMR
- Utilize proficiency in Python programming and workflow orchestration tools like Airflow and Oozie
- Make API calls in Scala and work with file formats like Apache Avro, Parquet, and JSON

Qualifications Required:
- 9-14 years of experience in a relevant field
- Job Location: Hyderabad

ACTIVELY HIRING

posted 1 month ago

Architect, Senior (Java AWS)

Infor

12 to 16 Yrs

Hyderabad, Telangana

Java
Spring Boot
Apache Kafka
Kafka
Spark
Kubernetes
Docker
MongoDB
Data Governance
Stream processing
Avro
Cloud-native technologies
Microservices architecture
RESTful services
API design
Distributed systems design
Event-driven architecture
Domain-driven design
AWS ecosystem
SQL databases
Data streaming ingestion pipelines
Multithreaded programming
Asynchronous communication
Defensive programming techniques
SLA-bound systems
Observability
Security principles
Agile practices
DevOps pipelines
CI/CD automation
C4 Model
Lucidchart
Data Mesh
Master Data Management (MDM)
Schema registry
Protobuf
AWS Certification
Kubernetes Certification
Software Architecture Certification

As a Senior Software Architect at our organization, you will play a crucial role as a key leader in the architecture team. Your main responsibility will be to define and evolve the architectural blueprint for complex distributed systems built using Java, Spring Boot, Apache Kafka, and cloud-native technologies.

Key responsibilities:
- Own and evolve the overall system architecture for Java-based microservices and data-intensive applications.
- Define and enforce architecture best practices, including clean code principles, DDD, event-driven design, and cloud-native patterns.
- Lead technical design sessions, architecture reviews, and design walkthroughs for high-impact features and integrations.
- Design solutions focusing on performance, scalability, security, and reliability in high-volume, multi-tenant environments.
- Collaborate with product and engineering teams to translate business requirements into scalable technical architectures.
- Drive the use of DevSecOps, automated testing, and CI/CD to enhance development velocity and code quality.
- Act as a mentor for senior developers and engage in a hands-on role when needed in prototyping or unblocking critical issues.
- Contribute to architecture documentation, including high-level design diagrams, flowcharts, and decision records.
- Lead architecture governance efforts and influence platform roadmaps.

Qualifications:
- 12-15 years of hands-on experience in Java-based enterprise application development, with at least 4-5 years in an architectural leadership role.
- Deep expertise in microservices architecture, Spring Boot, RESTful services, and API design.
- Strong understanding of distributed systems design, event-driven architecture, and domain-driven design.
- Proven experience with Kafka, Spark, Kubernetes, Docker, and the AWS ecosystem (S3, EC2, IAM, Lambda, etc.).
- Proficiency in multithreaded programming, asynchronous communication, and defensive programming techniques.
- Experience in designing SLA-bound, high-availability systems and observability (logs, metrics, tracing).
- Strong foundation in security principles, including data encryption, identity management, and secure APIs.
- Working knowledge of Agile practices, DevOps pipelines, and CI/CD automation.
- Exceptional communication, leadership, and cross-functional collaboration skills.

Preferred qualifications:
- Exposure to the C4 Model, Lucidchart, or similar tools for system architecture and diagramming.
- Experience leading architectural transformations.
- Knowledge of Data Mesh, Data Governance, or Master Data Management concepts.
- Certification in AWS, Kubernetes, or Software Architecture.

About Infor: Infor is a global leader in business cloud software products that cater to companies in industry-specific markets. With a focus on industry suites in the cloud, Infor prioritizes user experience, data science, and seamless integration into existing systems. Over 60,000 organizations worldwide rely on Infor for business-wide digital transformation. Join Infor and become part of a global community that values bold thinking and innovation. Your expertise will not only solve problems but also shape industries, unlock opportunities, and create real-world impact for billions of people. At Infor, you are not just building a career; you are helping to build the future. For more information, visit www.infor.com.

ACTIVELY HIRING

posted 3 weeks ago

Associate- BIM

Axtria - Ingenious Insights

4 to 8 Yrs

Hyderabad, Telangana

Python
Numpy
SQL
ETL
Data Integration
Data Processing
Data Transformation
Data Aggregation
Performance Optimization
AVRO
Distributed Systems
Snowflake
PySpark
Pandas
Spark Query Tuning
File Formats ORC
Parquet
Compression Techniques
Modular Programming
Robust Programming
ETL Development
Databricks

As a driven business analyst in the area of pharma/life sciences, your role will involve working on complex analytical problems to support better business decision making.

Key tasks:
- Write PySpark queries for data transformation needs.
- Participate in ETL design using any Python framework for new or changing mappings and workflows, and prepare technical specifications.
- Write complex SQL queries with a focus on performance tuning and optimization.
- Demonstrate the ability to handle tasks independently and lead the team when necessary.
- Coordinate with cross-functional teams to ensure project objectives are met.
- Collaborate with data architects and engineers to design and implement data models.

Qualifications required:
- BE/B.Tech or Master of Computer Application degree.
- Advanced knowledge of PySpark, Python, pandas, and NumPy.
- Minimum 4 years of extensive experience in design, build, and deployment of Spark/PySpark for data integration.
- Deep experience in developing data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target data destinations.
- Experience creating Spark jobs for data transformation and aggregation.
- Spark query tuning and performance optimization, with a good understanding of different file formats (ORC, Parquet, Avro) and compression techniques to optimize queries and processing.
- Deep understanding of distributed systems (e.g., CAP theorem, partitioning, replication, consistency, and consensus).
- Experience in modular and robust programming methodologies.
- ETL knowledge and experience in ETL development using any Python framework.
- Prior experience working with Databricks/Snowflake is preferred.

In addition to the technical competencies required for the role, key behavioral competencies sought after include Ownership, Teamwork & Leadership, Cultural Fit, and Motivation to Learn and Grow. This position also values problem-solving skills, life science knowledge, and effective communication.

ACTIVELY HIRING

posted 2 months ago

Azure Data Bricks + SQL

Neev Systems

4 to 8 Yrs

Telangana

SQL
Microsoft SQL Server
Python
Scala
Java
Hadoop
Spark
Airflow
Kafka
Hive
Neo4J
Elastic Search
Avro
JSON
Data modeling
Data transformation
Data governance
Azure Databricks
PySpark
Spark SQL
Azure Data Factory
ADLS
Azure SQL Database
Azure Synapse Analytics
Event Hub
Streaming Analytics
Cosmos DB
Purview
NiFi
Delta Lake
Parquet
CSV
REST APIs
Data Lake
Lakehouse projects

Role Overview: You are an experienced Azure Databricks + SQL Developer / Big Data Engineer responsible for designing, developing, and maintaining scalable data solutions on Azure. Your primary focus will be on building efficient ETL/ELT pipelines, optimizing SQL queries, and leveraging Databricks and other Azure services for advanced data processing, analytics, and data platform engineering. Your strong background in traditional SQL development and modern big data technologies on Azure will be crucial for this role.

Key Responsibilities:
- Develop, maintain, and optimize ETL/ELT pipelines using Azure Databricks (PySpark/Spark SQL).
- Write and optimize complex SQL queries, stored procedures, triggers, and functions in Microsoft SQL Server.
- Design and build scalable, metadata-driven ingestion pipelines for both batch and streaming datasets.
- Perform data integration and harmonization across multiple structured and unstructured data sources.
- Implement orchestration, scheduling, exception handling, and log monitoring for robust pipeline management.
- Collaborate with peers to evaluate and select appropriate tech stack and tools.
- Work closely with business, consulting, data science, and application development teams to deliver analytical solutions within timelines.
- Support performance tuning, troubleshooting, and debugging of Databricks jobs and SQL queries.
- Utilize other Azure services such as Azure Data Factory, Azure Data Lake, Synapse Analytics, Event Hub, Cosmos DB, Streaming Analytics, and Purview as needed.
- Support BI and Data Science teams in consuming data securely and in compliance with governance standards.

Qualifications Required:
- 5+ years of overall IT experience with at least 4+ years in Big Data Engineering on Microsoft Azure.
- Proficiency in Microsoft SQL Server (T-SQL) stored procedures, indexing, optimization, and performance tuning.
- Strong experience with Azure Data Factory (ADF), Databricks, ADLS, PySpark, and Azure SQL Database.
- Working knowledge of Azure Synapse Analytics, Event Hub, Streaming Analytics, Cosmos DB, and Purview.
- Proficiency in SQL, Python, and either Scala or Java with debugging and performance optimization skills.
- Hands-on experience with big data technologies such as Hadoop, Spark, Airflow, NiFi, Kafka, Hive, Neo4J, and Elastic Search.
- Strong understanding of file formats such as Delta Lake, Avro, Parquet, JSON, and CSV.
- Solid background in data modeling, data transformation, and data governance best practices.
- Experience designing and building REST APIs with practical exposure to Data Lake or Lakehouse projects.
- Ability to work with large and complex datasets, ensuring data quality, governance, and security standards.
- Certifications such as DP-203: Data Engineering on Microsoft Azure or Databricks Certified Developer (DE) are a plus.

ACTIVELY HIRING

posted 2 months ago

Senior Lead - ML Engineering

Talent Worx

7 to 14 Yrs

Hyderabad, Telangana

Machine Learning
Natural Language Processing
Python
Apache Spark
Docker
Kubernetes
SQL
Git
GitHub
Azure DevOps
Code Review
Debugging
Java
Apache Kafka
Scala
MLOps
NLP libraries
AWS/GCP Cloud
CI/CD pipelines
OOP Design patterns
Test-Driven Development
Linux OS
Version control system
Problem-solving
Agile principles
Apache Avro
Kotlin

As a Senior Lead Machine Learning Engineer at our client, a global leader in financial intelligence, data analytics, and AI-driven solutions, you will be at the forefront of building cutting-edge ML-powered products and capabilities for natural language understanding, information retrieval, and data sourcing solutions. Your role in the Document Platforms and AI team will involve spearheading the development and deployment of production-ready AI products and pipelines, mentoring a talented team, and playing a critical role in shaping the future of global markets.

**Responsibilities:**
- Build production-ready data acquisition and transformation pipelines from ideation to deployment
- Be a hands-on problem solver and developer helping to extend and manage the data platforms
- Apply best practices in data modeling and building ETL pipelines (streaming and batch) using cloud-native solutions
- Drive the technical vision and architecture for the extraction project, making key decisions about model selection, infrastructure, and deployment strategies
- Design, develop, and evaluate state-of-the-art machine learning models for information extraction, leveraging techniques from NLP, computer vision, and other relevant domains
- Develop robust pipelines for data cleaning, preprocessing, and feature engineering to prepare data for model training
- Train, tune, and evaluate machine learning models, ensuring high accuracy, efficiency, and scalability
- Deploy and maintain machine learning models in a production environment, monitoring their performance and ensuring their reliability
- Stay up-to-date with the latest advancements in machine learning and NLP, and explore new techniques and technologies to improve the extraction process
- Work closely with product managers, data scientists, and other engineers to understand project requirements and deliver effective solutions
- Ensure high code quality and adherence to best practices for software development
- Effectively communicate technical concepts and project updates to both technical and non-technical audiences

**Qualifications:**
- 7-14 years of professional software work experience, with a strong focus on Machine Learning, Natural Language Processing (NLP) for information extraction and MLOps
- Expertise in Python and related NLP libraries (e.g., spaCy, NLTK, Transformers, Hugging Face)
- Experience with Apache Spark or other distributed computing frameworks for large-scale data processing
- AWS/GCP Cloud expertise, particularly in deploying and scaling ML pipelines for NLP tasks
- Solid understanding of the Machine Learning model lifecycle, including data preprocessing, feature engineering, model training, evaluation, deployment, and monitoring, specifically for information extraction models
- Experience with CI/CD pipelines for ML models, including automated testing and deployment
- Docker & Kubernetes experience for containerization and orchestration
- OOP Design patterns, Test-Driven Development and Enterprise System design
- SQL (any variant, bonus if this is a big data variant)
- Linux OS (e.g. bash toolset and other utilities)
- Version control system experience with Git, GitHub, or Azure DevOps
- Excellent Problem-solving, Code Review and Debugging skills
- Software craftsmanship, adherence to Agile principles and taking pride in writing good code
- Techniques to communicate change to non-technical people

**Nice to have:**
- Core Java 17+, preferably Java 21+, and associated toolchain
- Apache Avro
- Apache Kafka
- Other JVM-based languages, e.g. Kotlin, Scala

Join our client's team to be a part of a global company, collaborate with a highly skilled team, and contribute to solving high complexity, high impact problems!

ACTIVELY HIRING

posted 1 month ago

Technical Architect

Transnational AI Private Limited

2 to 7 Yrs

Telangana

Apache Kafka
Python
Flask
MySQL
PostgreSQL
MongoDB
FastAPI
AWS Lambda
SageMaker
MLflow
ONNX
AWS Certifications
DVC

As a senior Technical Architect at Transnational AI Private Limited, your primary role will be to design and lead the backend development and system design for real-time, event-driven microservices integrating AI/ML capabilities. You will be working with cutting-edge frameworks such as FastAPI, Kafka, AWS Lambda, and collaborating with data scientists to embed ML models into production-grade systems.

Key Responsibilities:
- Design and implement event-driven architectures using Apache Kafka to orchestrate distributed microservices and streaming pipelines.
- Define scalable message schemas (e.g., JSON/Avro), data contracts, and versioning strategies to support AI-powered services.
- Architect hybrid event + request-response systems to balance real-time streaming and synchronous business logic.
- Develop Python-based microservices using FastAPI, enabling standard business logic and AI/ML model inference endpoints.
- Collaborate with AI/ML teams to operationalize ML models via REST APIs, batch processors, or event consumers.
- Integrate model-serving platforms such as SageMaker, MLflow, or custom Flask/ONNX-based services.
- Design and deploy cloud-native applications using AWS Lambda, API Gateway, S3, CloudWatch, and optionally SageMaker or Fargate.
- Build AI/ML-aware pipelines for automating retraining, inference triggers, or model selection based on data events.
- Implement autoscaling, monitoring, and alerting for high-throughput AI services in production.
- Ingest and manage high-volume structured and unstructured data across MySQL, PostgreSQL, and MongoDB.
- Enable AI/ML feedback loops by capturing usage signals, predictions, and outcomes via event streaming.
- Support data versioning, feature store integration, and caching strategies for efficient ML model input handling.
- Write unit, integration, and end-to-end tests for standard services and AI/ML pipelines.
- Implement tracing and observability for AI/ML inference latency, success/failure rates, and data drift.
- Document ML integration patterns, input/output schema, service contracts, and fallback logic for AI systems.

Preferred Qualifications:
- 7+ years of backend software development experience with 2+ years in AI/ML integration or MLOps.
- Strong experience in productionizing ML models for classification, regression, or NLP use cases.
- Experience with streaming data pipelines and real-time decision systems.
- AWS Certifications (Developer Associate, Machine Learning Specialty) are a plus.
- Exposure to data versioning tools (e.g., DVC), feature stores, or vector databases is advantageous.

If you join Transnational AI Private Limited, you will work in a mission-driven team delivering real-world impact through AI and automation. You will collaborate with architects, ML engineers, data scientists, and product managers in a flat hierarchy, innovation-friendly environment, and experience rapid career advancement.

ACTIVELY HIRING

posted 2 weeks ago

Sr. Data Engineer/Architect

Barclays

5 to 9 Yrs

All India, Pune

ETL
APIs
JSON
Avro
Glue
Snowflake
data modeling
data quality
data integration
data governance
application design
architecture modeling
data analysis
distributed systems
DBT
PCI DSS
tokenization
encryption
Parquet
AWS services
S3
Databricks
Hadoop ecosystem
Delta Lake
Medallion architecture
data design patterns
database technologies
RDBMS
NoSQL databases
PA DSS

As a Sr. Data Engineer/Architect at Barclays, you will play a vital role in driving innovation and excellence in the digital landscape. You will utilize cutting-edge technology to enhance digital offerings, ensuring exceptional customer experiences. Working alongside a team of engineers, business analysts, and stakeholders, you will tackle complex technical challenges that require strong analytical skills and problem-solving abilities.

**Key Responsibilities:**
- Experience and understanding in ETL, APIs, various data formats (JSON, Avro, Parquet) and experience in documenting/maintaining interface inventories.
- Deep understanding of AWS services (e.g., Glue, S3, Databricks, Snowflake) and Hadoop ecosystem for data processing and storage.
- Familiarity with Databricks, Delta Lake, and Medallion architecture for advanced analytics and fraud detection use cases.
- Build logical and physical data models, enforce data quality, and integrate data across multiple systems.
- Data Design and Requirements Analysis: Able to apply data design patterns and frameworks, working knowledge of schemas and normalization.
- Experience in preparing architecture vision documents and data flow diagrams, and in maintaining auditable governance documentation.
- Understands user requirement gathering to define data flow, model and design.
- Knowledge of basic activities and deliverables of application design; ability to utilize application design methodologies, tools and techniques to convert business requirements and logical models into a technical application design.
- Knowledge of Architecture Modelling; ability to develop and modify enterprise architecture through conceptual, logical and physical approaches.
- Knowledge of data, process and events; ability to use tools and techniques for analyzing and documenting logical relationships among data, processes or events.
- Knows the tools and techniques used for data governance. Understands the relevance of following, creating and improving policies to ensure data is secure including data privacy (e.g. token generation).
- Knowledge of the right platform for data transmission, ensuring that cloud and on-prem servers are appropriately used and that cost is considered when choosing between cloud and on-prem platforms.
- Knowledge of the database and latest updates to help provide the right tools and design.
- Proficient in communicating data standards and demonstrating their value to the wider audience.

**Qualifications Required:**
- Educated to degree or MBA level to be able to meet the intellectual demands of the job, or can demonstrate equivalent experience.
- Good understanding of distributed systems and databases.
- Good understanding of DBT (Data Build Tool).
- Good understanding of AWS database technologies e.g. Databricks, Snowflake.
- Knowledge of PCI DSS and PA DSS tokenization and encryption.
- Understands basic features of RDBMS and NoSQL databases.

The role is based in Pune. In this role, you will build and maintain data architecture pipelines, design and implement data warehouses and data lakes, develop processing and analysis algorithms, and collaborate with data scientists to deploy machine learning models. Your responsibilities also include advising on decision making, contributing to policy development, and ensuring operational effectiveness. All colleagues at Barclays are expected to demonstrate the Barclays Values of Respect, Integrity, Service, Excellence, and Stewardship. Additionally, adherence to the Barclays Mindset to Empower, Challenge and Drive is crucial for creating a culture of excellence and integrity within the organization.

ACTIVELY HIRING

posted 2 months ago

Data Engineer - ETL/Python

Techno-Comp Computer Services Pvt. Ltd.

2 to 9 Yrs

All India

Python
MongoDB
Snowflake
Glue
Kafka
SQL
JSON
ORC
Avro
scheduling
APIs
Data streaming
Spark ecosystem
Scala programming
AWS EMR
S3
BigData pipelines
NoSQL databases
Parquet
CSV

As a Data Engineer with 6-9 years of experience, your role will involve the following key responsibilities:
- Design, develop, and maintain scalable and robust data pipelines for collecting, processing, and transforming large datasets.
- Implement ETL (Extract, Transform, Load) processes to ensure efficient movement of data across multiple systems.
- Design, implement, and optimize relational and non-relational databases to support business needs.
- Ensure data quality, consistency, and accuracy by building validation checks, monitoring systems, and automating data reconciliation processes.
- Work with cloud platforms (AWS, Azure, GCP) to deploy and manage data storage, compute, and processing resources.
- Monitor and optimize the performance of data pipelines, queries, and data processing workflows.

Qualifications and Skills required for this role include:
- 5.5 years of experience in Spark ecosystem, Python/Scala programming, MongoDB data loads, Snowflake and AWS platform (EMR, Glue, S3), Kafka.
- Hands-on experience in writing advanced SQL queries and familiarity with a variety of databases.
- Experience in coding solutions using Python/Spark and performing performance tuning/optimization.
- Experience in building and optimizing Big-Data pipelines in the Cloud.
- Experience in handling different file formats like JSON, ORC, Avro, Parquet, CSV.
- Hands-on experience in data processing with NoSQL databases like MongoDB.
- Familiarity and understanding of job scheduling.
- Hands-on experience working with APIs to process data.
- Understanding of data streaming, such as Kafka services.
- 2+ years of experience in Healthcare IT projects.
- Certification on Snowflake (Snow PRO certification) and AWS (Cloud Practitioner/Solution Architect).
- Hands-on experience on Kafka streaming pipelines implementation.

You will be a valuable asset to the team with your expertise in data engineering and cloud infrastructure management.

ACTIVELY HIRING

posted 2 months ago

Java Kafka Developer

Capco

5 to 9 Yrs

Maharashtra, Pune

Java
Spring Boot
Distributed Systems
Microservices
JSON
Avro
Docker
Kubernetes
Splunk
Confluent Kafka
Protobuf

As a highly skilled Java Developer with expertise in Spring Boot, Confluent Kafka, and distributed systems, your main responsibility will be designing, developing, and optimizing event-driven applications using Confluent Kafka while leveraging Spring Boot/Spring Cloud for microservices-based architectures.

Your key responsibilities will include:
- Developing, deploying, and maintaining scalable and high-performance applications using Java (Core Java, Collections, Multithreading, Executor Services, CompletableFuture, etc.).
- Working extensively with Confluent Kafka, including producer-consumer frameworks, offset management, and optimization of consumer instances based on message volume.
- Ensuring efficient message serialization and deserialization using JSON, Avro, and Protobuf with Kafka Schema Registry.
- Designing and implementing event-driven architectures with real-time processing capabilities.
- Optimizing Kafka consumers for high-throughput and low-latency scenarios.
- Collaborating with cross-functional teams to ensure seamless integration and deployment of services.
- Troubleshooting and resolving performance bottlenecks and scalability issues in distributed environments.
- Familiarity with containerization (Docker, Kubernetes) and cloud platforms is a plus.
- Experience with monitoring and logging tools such as Splunk is a plus.

Qualifications required for this role include:
- 5+ years of experience in Java development.
- Strong expertise in Spring Boot, Confluent Kafka, and distributed systems.
- Proficiency in designing and optimizing event-driven applications.
- Experience with microservices-based architectures using Spring Boot/Spring Cloud.
- Knowledge of JSON, Avro, and Protobuf for message serialization and deserialization.
- Familiarity with Docker, Kubernetes, and cloud platforms.
- Experience with Splunk for monitoring and logging is a plus.

ACTIVELY HIRING

posted 2 months ago

Senior Talend Developer

Inxite Out

5 to 9 Yrs

Karnataka

Talend
Data Integration
SQL
PL/SQL
Data Modeling
Relational Databases
JSON
XML
Git
Jenkins
DevOps
Data Governance
Data Quality
Metadata Management
ETL/ELT
Cloud Platforms
REST APIs
File Formats
CI/CD

Role Overview: As a skilled and detail-oriented Senior Talend Developer, your role will involve designing, developing, and optimizing ETL/ELT processes using Talend Data Integration to support the enterprise data ecosystem. You will collaborate with various teams to build robust data pipelines, facilitate data migration, and enhance data-driven decision-making within the organization.

Key Responsibilities:
- Design and develop scalable ETL/ELT pipelines utilizing Talend (Big Data, Data Fabric, or Open Studio).
- Collaborate with data architects, business analysts, and stakeholders to address data integration requirements.
- Integrate data from diverse on-premise and cloud-based sources into a centralized data warehouse or data lake.
- Create and manage Talend jobs, routes, and workflows; schedule and monitor job execution via Talend Administration Center (TAC).
- Ensure data quality, integrity, and consistency across all systems.
- Optimize existing jobs for enhanced performance, reusability, and maintainability.
- Conduct code reviews, establish best practices, and mentor junior developers on Talend standards and design.
- Troubleshoot data integration issues and provide support for ongoing data operations.
- Document technical designs, data flows, and operational procedures.

Qualifications Required:
- Bachelor's degree in Computer Science, Information Systems, or a related field.
- 5+ years of hands-on experience with Talend ETL tools (Talend Data Integration / Talend Big Data Platform).
- Strong knowledge of SQL, PL/SQL, and data modeling concepts (star/snowflake schemas).
- Proficiency in working with relational databases (e.g., Oracle, SQL Server, MySQL) and cloud platforms (AWS, Azure, or GCP).
- Familiarity with REST APIs, JSON/XML, and file formats (CSV, Parquet, Avro).
- Experience with Git, Jenkins, or other DevOps/CI-CD tools.
- Knowledge of data governance, data quality, and metadata management best practices.

ACTIVELY HIRING

posted 2 weeks ago

Pyspark Data Engineer

People Prime Worldwide

8 to 12 Yrs

Maharashtra, Pune

Python
Distributed Computing
Cloud Services
Software Development
Kafka
Numpy
API development
PySpark
Big Data Ecosystem
Database
SQL
Apache Airflow
Pandas
RESTful services
Data file formats
Agile development methodologies

You will be responsible for designing, developing, and maintaining scalable and efficient data processing pipelines using PySpark and Python. Your key responsibilities will include: - Building and implementing ETL (Extract, Transform, Load) processes to ingest data from various sources and load it into target destinations. - Optimizing PySpark applications for performance and troubleshooting existing code. - Ensuring data integrity and quality throughout the data lifecycle. - Collaborating with cross-functional teams, including data engineers and data scientists, to understand and fulfill data needs. - Providing technical leadership, conducting code reviews, and mentoring junior team members. - Translating business requirements into technical solutions and contributing to architectural discussions. - Staying current with the latest industry trends in big data and distributed computing. You must possess the following mandatory skills and experience: - Advanced proficiency in PySpark and Python with extensive experience in building data processing applications. - In-depth understanding of distributed computing principles, including performance tuning. - Experience with big data ecosystem technologies such as Hadoop, Hive, Sqoop, and Spark. - Hands-on experience with cloud platforms like AWS (e.g., Glue, Lambda, Kinesis). - Strong knowledge of SQL, experience with relational databases, and data warehousing. - Experience with software development best practices, including version control (Git), unit testing, and code reviews. The following desired or "nice-to-have" skills would be advantageous: - Familiarity with orchestration tools like Apache Airflow. - Experience with other data processing tools like Kafka or Pandas/Numpy. - Knowledge of API development and creating RESTful services. - Experience with data file formats like Parquet, ORC, and Avro. - Experience with Agile development methodologies.,

ACTIVELY HIRING

posted 2 months ago

Senior Java Developer

Ashnik

5 to 9 Yrs

Maharashtra

Java
Kafka
Data Integration
Leadership
Java Programming
Data Validation
JSON
Avro
REST
SOAP
FTP
Distributed Systems
Git
Maven
Gradle
Docker
AWS
Azure
GCP
Protobuf
Event-Driven Architecture
CI/CD

Role Overview: You will be joining as a Senior Java Developer at the team to lead the development of custom connectors for integrating source systems with a Kafka cluster. Your expertise in Java programming, data integration, and Kafka architecture will be essential for this role. You will have the responsibility of managing a small team of developers and collaborating effectively with Kafka administrators. This position will involve hands-on development, leadership responsibilities, and direct engagement with client systems and data pipelines. Key Responsibilities: - Design, develop, and maintain Java-based connectors for integrating diverse source systems with Kafka clusters. - Incorporate business logic, data validation, and transformation rules within the connector code. - Work closely with client teams to understand source systems, data formats, and integration requirements. - Guide and mentor a team of developers, review code, and ensure adherence to best practices. - Collaborate with Kafka admins to optimize connector deployment, resolve operational issues, and ensure high availability. - Develop monitoring solutions to track data flow and resolve issues in real-time. - Create comprehensive technical documentation for the custom connectors, including configuration guides and troubleshooting procedures. - Evaluate and implement improvements in performance, scalability, and reliability of the connectors and associated data pipelines. Qualification Required: - Strong proficiency in Java, including multi-threading, performance tuning, and memory management. - Hands-on experience with Confluent Kafka and Kafka Connect. - Familiarity with JSON, Avro, or Protobuf data formats. - Experience in integrating data from various sources such as relational databases, APIs, and flat files. - Understanding of data exchange protocols like REST, SOAP, and FTP. - Knowledge of event-driven architecture and distributed systems. 
- Ability to lead a small team of developers, delegate tasks, and manage project timelines. - Strong mentoring and communication skills to guide team members and collaborate with non-technical stakeholders. - Proven ability to troubleshoot complex data and system integration issues. - Analytical mindset for designing scalable, reliable, and efficient data pipelines. - Experience with CI/CD pipelines, Git, and build tools like Maven and Gradle. - Familiarity with containerization (Docker) and cloud environments (AWS, Azure, or GCP) is a plus.

ACTIVELY HIRING

posted 2 months ago

Senior Software Engineer (Data Engineering)

Bigthinkcode Technologies Private Limited

5 to 9 Yrs

Chennai, Tamil Nadu

Python
SQL
dbt
RDBMS
JSON
Avro
Snowflake
Git
Kafka
AWS
GCP
Azure
ETL/ELT frameworks
cloud data warehouses
Apache Airflow
AWS Glue
Parquet
Redshift
BigQuery
CI/CD
Kinesis
Spark Streaming
dimensional
star schema

As a skilled Data Engineer at BigThinkCode Technologies, your role involves designing, building, and maintaining robust data pipelines and infrastructure to optimize data flow and ensure scalability. Your technical expertise in Python, SQL, ETL/ELT frameworks, and cloud data warehouses, along with strong collaboration skills, will be crucial in partnering with cross-functional teams to enable seamless access to structured and unstructured data across the organization. **Key Responsibilities:** - Design, develop, and maintain scalable ETL/ELT pipelines for structured and unstructured data processing. - Optimize and manage SQL queries for performance and efficiency in handling large-scale datasets. - Collaborate with data scientists, analysts, and business stakeholders to translate requirements into technical solutions. - Ensure data quality, governance, and security across pipelines and storage systems. - Document architectures, processes, and workflows for clarity and reproducibility. **Required Technical Skills:** - Proficiency in Python for scripting, automation, and pipeline development. - Expertise in SQL for complex queries, optimization, and database design. - Hands-on experience with ETL/ELT tools such as Apache Airflow, dbt, and AWS Glue. - Experience working with both structured (RDBMS) and unstructured data (JSON, Parquet, Avro). - Familiarity with cloud-based data warehouses like Redshift, BigQuery, and Snowflake. - Knowledge of version control systems like Git and CI/CD practices. **Preferred Qualifications:** - Experience with streaming data technologies such as Kafka, Kinesis, and Spark Streaming. - Exposure to cloud platforms like AWS, GCP, Azure, and their data services. - Understanding of data modeling techniques like dimensional and star schema and optimization. The company offers benefits like a flexible schedule, health insurance, paid time off, and a performance bonus. Thank you for considering this opportunity at BigThinkCode Technologies.,

ACTIVELY HIRING

posted 1 week ago

Snowflake Developer with PL/SQL

PibyThree

2 to 6 Yrs

Maharashtra, Navi Mumbai

cloud
snowflake
plsql
dbt

As a Snowflake Data Engineer at PibyThree Consulting Pvt Ltd., you will be responsible for leveraging your expertise in Snowflake Data Cloud and cloud platforms such as AWS, Azure, or Google Cloud Platform to develop and maintain data solutions. Your main responsibilities will include: - Having 2+ years of experience in Snowflake Data Cloud and a total of 4+ years of experience in the field. - Demonstrating proficiency in PL/SQL, Oracle, and Snowflake Internal External Staging and Loading options. - Utilizing your deep exposure to Snowflake features to write SnowSQL and Stored Procedures. - Developing ETL routines for Snowflake using Python, Scala, or ETL tools. - Applying your knowledge of AWS or Azure platforms for data ingestion to Snowflake from various formats like CSV, JSON, Parquet, and Avro. - Conducting SQL performance tuning, identifying technical issues, and resolving failures effectively. - Evaluating existing data structures and creating advanced SQL and PL/SQL programs. - Demonstrating proficiency in at least one programming language such as Python, Scala, or Pyspark, and familiarity with DBT. Qualifications required for this role include a minimum of 4 years of experience, strong skills in cloud platforms, Snowflake, PL/SQL, and DBT. Join us at PibyThree Consulting Pvt Ltd. and contribute to cutting-edge data solutions using your expertise in Snowflake and related technologies.,

ACTIVELY HIRING

posted 2 weeks ago

Senior Associate - Data Engineering

PwC Acceleration Center India

4 to 8 Yrs

All India

Python
Apache Spark
Kafka
ETL
AWS
Azure
GCP
JSON
Avro
RDBMS
NoSQL
Docker
Kubernetes
GitHub
Snowflake
Data Governance
Data Quality
Data Security
Data Integration
Agile Methodology
PySpark
CSV
Parquet
CI/CD
Databricks
Azure Data Factory
Data Orchestration
Generative AI
Large Language Models (LLMs)

Role Overview: At PwC, you will be part of the data and analytics engineering team, focusing on utilizing advanced technologies to create robust data solutions for clients. Your role will involve transforming raw data into actionable insights to drive informed decision-making and business growth. As a data engineer at PwC, your main responsibilities will include designing and constructing data infrastructure and systems to enable efficient data processing and analysis. You will also be involved in developing and implementing data pipelines, data integration, and data transformation solutions. Key Responsibilities: - Design, develop, and maintain robust, scalable ETL pipelines using tools such as Apache Spark, Kafka, and other big data technologies. - Create scalable and reliable data architectures, including Lakehouse, hybrid batch/streaming systems, Lambda, and Kappa architectures. - Demonstrate proficiency in Python, PySpark, Spark, and solid understanding of design patterns (e.g., SOLID). - Ingest, process, and store structured, semi-structured, and unstructured data from various sources. - Utilize cloud platforms like AWS, Azure, and GCP to set up data pipelines. - Optimize ETL processes for scalability and efficiency. - Work with JSON, CSV, Parquet, and Avro file formats. - Possess deep knowledge of RDBMS, NoSQL databases, and CAP theorem principles. - Collaborate with data scientists, analysts, and stakeholders to optimize data models for performance and scalability. - Document data processes, architectures, and models comprehensively for cross-team understanding. - Implement and maintain CI/CD pipelines using tools like Docker, Kubernetes, and GitHub. - Ensure data quality, integrity, and security across all systems and processes. - Implement and monitor data governance best practices. - Stay updated with emerging data technologies and trends for innovation and improvement. 
- Familiarity with Cloud Data/Integration/Orchestration Platforms like Snowflake, Databricks, and Azure Data Factory is beneficial. Qualifications Required: - BE / masters in design / B Design / B.Tech / HCI Certification (Preferred) - 4-7 years of experience in Programming Language (Python, Scala, Java), Apache Spark, ADF, Azure Databricks, Postgres, ETL (Batch/Streaming), Git - Familiarity with Agile methodology.

ACTIVELY HIRING

posted 2 days ago

Sr Data Engineer

Uplers

3 to 7 Yrs

Karnataka

Consulting
Snowflake
AWS
Python
Spark
Glue
EMR
DynamoDB
JSON
Avro
ORC
Industry experience in Retail/CPG/Media
client-facing experience
Certifications AWS
Databricks
Data Engineering
Data Modeling skills
Experience in Extract Transform Load ETL processes
Data Warehousing
Data Analytics skills
Proficiency in relevant programming languages like SQL
Python
Experience with cloud services like AWS
Databricks
Strong analytical
problem-solving skills
Programming Python
AWS Services S3
Lambda
Step Functions
Databricks Delta Lake
MLflow
Unity Catalog experience
Databases SQL databases PostgreSQL
MySQL
NoSQL MongoDB
Data Formats

You are applying for the role of Senior Data Engineer at Beige Bananas, a rapidly growing AI consulting firm specializing in creating custom AI products for Fortune 500 Retail, CPG, and Media companies with an outcome-driven mindset to accelerate clients' value realization from their analytics investments. **Role Overview:** As a Senior Data Engineer at Beige Bananas, you will be responsible for data engineering, data modeling, ETL processes, data warehousing, and data analytics. You will work independently to build end-to-end pipelines in AWS or Databricks. **Key Responsibilities:** - Design and implement scalable data architectures - Build and maintain real-time and batch processing pipelines - Optimize data pipeline performance and costs - Ensure data quality, governance, and security - Collaborate with ML teams on feature stores and model pipelines **Qualifications Required:** - Data Engineering and Data Modeling skills - Experience in Extract Transform Load (ETL) processes - Data Warehousing and Data Analytics skills - Proficiency in programming languages like SQL and Python - Experience with cloud services like AWS and Databricks - Strong analytical and problem-solving skills - Bachelor's or Master's degree in Computer Science, Engineering, or related field **Additional Details of the Company:** Beige Bananas is a pure play AI consulting firm that focuses on creating hyper-custom AI products for Fortune 500 Retail, CPG, and Media companies. They have a fast-paced environment with a focus on accelerating clients' value realization from their analytics investments. If you are ready for a new challenge, a great work environment, and an opportunity to take your career to the next level, apply today and be a part of Beige Bananas' exciting journey!

ACTIVELY HIRING

posted 2 months ago

Core Java Developer - Contractual Role

Rapsys Technologies

4 to 8 Yrs

Karnataka

Java development
Spring Boot
Spring
Microservices
API development
AVRO
Kafka
Collections
Garbage Collection
Multithreading
Hibernate
Maven
Mockito
relational databases
SQL
MongoDB
Oracle
SQL Server
Cassandra
Java 8
REST/JSON
Core Java Skills
Design pattern
JUnits
JMock
NoSQL databases
RESTful web services
API design

As a Java Developer with 6+ years of experience, you will be responsible for: - Leading Java development using Spring Boot, Java 8+, Spring, and Microservices - Developing and integrating APIs using REST/JSON, AVRO, and Kafka - Designing, building, and maintaining Rest APIs and microservices following best practices, including security and performance tuning - Demonstrating strong Core Java Skills, including knowledge of Design patterns, Collections, Garbage Collection, and Multithreading - Utilizing Java frameworks and technologies such as Spring Boot, Hibernate, and Maven - Writing and managing JUnits and Mock frameworks like Mockito, JMock or equivalent - Working with relational databases and SQL, as well as NoSQL databases like MongoDB, Oracle, SQL Server, or Cassandra - Implementing RESTful web services and API design best practices Qualifications required for this role include: - 6+ years of experience in Core Java (Preferred) - 4+ years of experience in Spring (Preferred) - 4+ years of experience in Microservices (Preferred) Please note that this is a Contractual/Temporary position based in Bangalore/Chennai and requires in-person work. If you are open to a 6-month Contractual Role and can join immediately, we encourage you to apply for this opportunity.,

ACTIVELY HIRING

posted 2 months ago

Solutions Data Architect

Radial Inc.

5 to 15 Yrs

Chennai, Tamil Nadu

AWS
Enterprise design patterns
Kafka
Java
Data Architecture
JSON
Avro
XML
Confluent Kafka
Microservice architecture
Domain Driven Design
12-factor app
Kinesis
AWS services
Springboot
Oracle Databases
NoSQL DBs
Data Modelling
Data Lake
Data Mesh
API gateways
AWS architecture

As a Solutions Architect, your primary focus is to ensure the technical integrity of the Event Driven Architecture platform and to formulate the technical direction for the strategic cloud investments. You will drive the estimation, analysis, and design, as well as support implementation and operations of a slew of microservices owned by the team. Working closely with the senior tech and business leadership, engineering, and ops teams, you will drive the vision and the technical strategic objectives throughout the SDLC. It is essential for you to remain current in all Parcel and Logistics related technologies in support of enterprise applications and infrastructure. Your passion for technology and thirst for innovation will play a crucial role in shaping the future of digital transformation. Responsibilities: - Analyze, design, and lead technical solutions fulfilling core business requirements in the migration journey from legacy messaging solution to Confluent Kafka in AWS platform. Consider solution scalability, availability, security, extensibility, maintainability, risk assumptions, and cost considerations. - Actively participate in proof-of-concept implementation of new applications and services. - Research, evaluate, and recommend third-party software packages and services to enhance digital transformation capabilities. - Promote technical vision and sound engineering principles among technology department staff members and across the global team. - Occasionally assist in production escalations, systems operations, and problem resolutions. - Assist team members in adopting new Kafka and real-time data streaming solutions. Mentor the team to remain current with the latest tech trends in the global marketplace. - Perform and mentor conceptual, logical, and physical data modeling. - Drive the team to maintain semantic models. - Guide teams in adopting data warehousing, data lakes, and data mesh architectures. 
- Drive process, policy, and standard improvements related to architecture, design, and development principles. - Assist business leadership in prioritizing business capabilities and go-to-market decisions. - Collaborate with cross-functional teams and business teams as required to drive the strategy and initiatives forward. - Lead architecture teams in digital capabilities and competency building, mentoring junior team members. Qualifications: - 15+ years of software engineering experience with 5+ years in hands-on architecture roles. - Ability to define platform strategy, target state architecture, and implementation roadmaps for enterprise-scale applications to migrate to Kafka. - Proven hands-on architecture and design experience in Microservice architecture and Domain Driven Design concepts and principles including 12-factor app and other enterprise design patterns. - Establish enterprise architectural blueprints and cloud deployment topologies. - Highly experienced in designing high traffic services serving 1k+ transactions per second or similar high transaction volume distributed systems with resilient high-availability and fault-tolerance. - Experience in developing event-driven, message-driven asynchronous systems such as Kafka, Kinesis, etc. - Experience in AWS services such as Lambdas, ECS, EKS, EC2, S3, DynamoDB, RDS, VPCs, Route 53, ELB. - Experience in Enterprise Java, Springboot ecosystem. - Experience with Oracle Databases, NoSQL DBs, and distributed caching. - Experience with Data Architecture, Data Modeling, Data Lake, and Data Mesh implementations. - Extensive experience implementing system integrations utilizing API gateways, JSON, Avro, and XML libraries. - Excellent written, verbal, presentation, and process facilitation skills. - AWS architecture certification or equivalent working expertise with the AWS platform. - B.S. or M.S. in computer science or a related technical area preferred. 
Please note that this position is based out of GDC in Chennai, India, and occasional travel to Belgium might be required for business interactions and training.

ACTIVELY HIRING

posted 2 weeks ago

Data Engineer

Enterprise Minds, Inc

4 to 8 Yrs

Maharashtra

Java
Splunk
Tableau
SQL
Spark
JSON
XML
Avro
Docker
Kubernetes
Azure
Kafka
Databricks
Grafana
Prometheus
PowerBI
Pyspark

Role Overview: As a data engineer in Pune, you will be responsible for delivering data intelligence solutions to customers globally. Your primary tasks will include implementing and deploying a product that provides insights into material handling systems' performance. You will collaborate with a dynamic team to build end-to-end data ingestion pipelines and deploy dashboards. Key Responsibilities: - Design and implement data & dashboarding solutions to maximize customer value. - Deploy and automate data pipelines and dashboards to facilitate further project implementation. - Work in an international, diverse team with an open and respectful atmosphere. - Make data available for other teams within the department to support the platform vision. - Communicate and collaborate with various groups within the company and project team. - Work independently and proactively with effective communication to provide optimal solutions. - Participate in an agile team, contribute ideas for improvements, and address concerns. - Collect feedback and identify opportunities to enhance the existing product. - Lead communication with stakeholders involved in the deployed projects. - Execute projects from conception to client handover, ensuring technical performance and organizational contribution. Qualifications Required: - Bachelor's or master's degree in computer science, IT, or equivalent. - Minimum of 4 to 8 years of experience in building and deploying complex data pipelines and solutions. - Hands-on experience with Java and Databricks. - Familiarity with visualization software such as Splunk, Grafana, Prometheus, PowerBI, or Tableau. - Strong expertise in SQL, Java, data modeling, and data schemas (JSON/XML/Avro). - Experience with Pyspark or Spark for distributed data processing. - Knowledge of deploying services as containers (e.g., Docker, Kubernetes) and working with cloud services (preferably Azure). 
- Familiarity with streaming and/or batch storage technologies (e.g., Kafka) is advantageous. - Experience in data quality management, monitoring, and Splunk (SPL) is a plus. - Excellent communication skills in English.

ACTIVELY HIRING

posted 2 weeks ago

CT&I - Software and Product Innovation - Data Engineering - Senior Associate

PwC Acceleration Center India

4 to 8 Yrs

Karnataka

Apache Spark
Kafka
Python
ETL
AWS
Azure
GCP
JSON
Avro
RDBMS
NoSQL
Docker
Kubernetes
GitHub
Snowflake
Data Governance
Data Quality
Data Security
Data Integration
PySpark
CSV
Parquet
CI/CD
Databricks
Azure Data Factory
Data Orchestration
Generative AI

Role Overview: At PwC, as a data and analytics engineering professional, your main focus will be on utilizing advanced technologies and techniques to create robust data solutions for clients. Your role is crucial in converting raw data into actionable insights, which facilitate informed decision-making and drive business growth. In the field of data engineering at PwC, your responsibilities will include designing and constructing data infrastructure and systems to enable efficient data processing and analysis. You will be tasked with developing and implementing data pipelines, data integration, and data transformation solutions. Key Responsibilities: - Design, develop, and maintain robust, scalable ETL pipelines using technologies such as Apache Spark, Kafka, and other big data tools. - Create scalable and reliable data architectures, including Lakehouse, hybrid batch/streaming systems, Lambda, and Kappa architectures. - Demonstrate proficiency in Python, PySpark, Spark, and possess a solid understanding of design patterns like SOLID. - Ingest, process, and store structured, semi-structured, and unstructured data from various sources. - Utilize cloud platforms such as AWS, Azure, and GCP to set up data pipelines. - Optimize ETL processes to ensure scalability and efficiency. - Handle various file formats like JSON, CSV, Parquet, and Avro. - Deep knowledge of RDBMS, NoSQL databases, and CAP theorem principles. - Collaborate with data scientists, analysts, and stakeholders to understand data requirements and optimize data models. - Document data processes, architectures, and models comprehensively to facilitate cross-team understanding. - Implement and maintain CI/CD pipelines using tools like Docker, Kubernetes, and GitHub. - Ensure data quality, integrity, and security across all systems and processes. - Implement and monitor data governance best practices. 
- Stay updated with emerging data technologies and trends, identifying opportunities for innovation and improvement. - Knowledge of other Cloud Data/Integration/Orchestration Platforms like Snowflake, Databricks, and Azure Data Factory is advantageous. Qualification Required: - Minimum 4-7 years of experience in Programming Language (Python, Scala, Java), Apache Spark, ADF, Azure Databricks, Postgres, with familiarity in NoSQL. - BE / masters in design / B Design / B.Tech / HCI Certification (Preferred)

ACTIVELY HIRING
