6 Avro Jobs nearby Warangal

posted 2 months ago
experience
9 to 14 Yrs
location
Hyderabad, Telangana
skills
  • Scala
  • EMR
  • JSON
  • Protocol Buffers
  • Build Tools
  • GIT
  • data structures
  • algorithms
  • communication skills
  • Spark Core
  • RDDs
  • Spark SQL
  • Spark Optimization Techniques
  • Scala Functional Programming
  • Scala OOPS principles
  • Hadoop Environment
  • AWS S3
  • Python programming
  • Workflow Orchestration tools
  • API calls in Scala
  • Apache AVRO
  • Parquet
  • Geospatial data analytics
  • Test cases using frameworks like scalatest
  • analytical abilities
Job Description
Role Overview: As a seasoned Spark Developer with 5+ years of experience, you have a deep understanding of developing, testing, deploying, and debugging Spark jobs using Scala on the Hadoop platform. Your expertise includes proficiency in Spark Core and working with RDDs and Spark SQL, knowledge of Spark optimization techniques and best practices, familiarity with Scala functional programming concepts like Try, Option, Future, and Collections, understanding of Scala OOPS principles including Classes, Traits, Objects (Singleton and Companion), and Case Classes, a strong grasp of Scala language features such as the Type System and Implicits/Givens, hands-on experience in the Hadoop environment (HDFS/Hive), AWS S3, and EMR, proficiency in Python programming and workflow orchestration tools like Airflow and Oozie, experience in making API calls in Scala, and exposure to file formats like Apache AVRO, Parquet, and JSON.

Key Responsibilities:
- Develop, test, deploy, and debug Spark jobs using Scala on the Hadoop platform
- Work proficiently with Spark Core, RDDs, and Spark SQL
- Implement Spark optimization techniques and best practices
- Utilize Scala functional programming concepts like Try, Option, Future, and Collections
- Apply Scala OOPS principles including Classes, Traits, Objects (Singleton and Companion), and Case Classes
- Demonstrate a strong grasp of Scala language features such as the Type System and Implicits/Givens
- Gain hands-on experience in the Hadoop environment (HDFS/Hive), AWS S3, and EMR
- Utilize proficiency in Python programming and workflow orchestration tools like Airflow and Oozie
- Make API calls in Scala and work with file formats like Apache AVRO, Parquet, and JSON

Qualifications Required:
- 9-14 years of experience in a relevant field
- Job Location: Hyderabad
ACTIVELY HIRING

posted 1 month ago
experience
12 to 16 Yrs
location
Hyderabad, Telangana
skills
  • Java
  • Spring Boot
  • Apache Kafka
  • Kafka
  • Spark
  • Kubernetes
  • Docker
  • MongoDB
  • Data Governance
  • Stream processing
  • Avro
  • Cloud-native technologies
  • Microservices architecture
  • RESTful services
  • API design
  • Distributed systems design
  • Event-driven architecture
  • Domain-driven design
  • AWS ecosystem
  • SQL databases
  • Data streaming ingestion pipelines
  • Multithreaded programming
  • Asynchronous communication
  • Defensive programming techniques
  • SLA-bound systems
  • Observability
  • Security principles
  • Agile practices
  • DevOps pipelines
  • CI/CD automation
  • C4 Model
  • Lucidchart
  • Data Mesh
  • Master Data Management (MDM)
  • Schema registry
  • Protobuf
  • AWS Certification
  • Kubernetes Certification
  • Software Architecture Certification
Job Description
As a Senior Software Architect at our organization, you will play a crucial role as a key leader in the architecture team. Your main responsibility will be to define and evolve the architectural blueprint for complex distributed systems built using Java, Spring Boot, Apache Kafka, and cloud-native technologies.

Here are some key responsibilities you will be expected to fulfill:
- Own and evolve the overall system architecture for Java-based microservices and data-intensive applications.
- Define and enforce architecture best practices, including clean code principles, DDD, event-driven design, and cloud-native patterns.
- Lead technical design sessions, architecture reviews, and design walkthroughs for high-impact features and integrations.
- Design solutions focusing on performance, scalability, security, and reliability in high-volume, multi-tenant environments.
- Collaborate with product and engineering teams to translate business requirements into scalable technical architectures.
- Drive the use of DevSecOps, automated testing, and CI/CD to enhance development velocity and code quality.
- Act as a mentor for senior developers and engage in a hands-on role when needed in prototyping or unblocking critical issues.
- Contribute to architecture documentation, including high-level design diagrams, flowcharts, and decision records.
- Lead architecture governance efforts and influence platform roadmaps.

To be considered for this role, you should meet the following qualifications:
- 12-15 years of hands-on experience in Java-based enterprise application development, with at least 4-5 years in an architectural leadership role.
- Deep expertise in microservices architecture, Spring Boot, RESTful services, and API design.
- Strong understanding of distributed systems design, event-driven architecture, and domain-driven design.
- Proven experience with Kafka, Spark, Kubernetes, Docker, and the AWS ecosystem (S3, EC2, IAM, Lambda, etc.).
- Proficiency in multithreaded programming, asynchronous communication, and defensive programming techniques.
- Experience in designing SLA-bound, high-availability systems and observability (logs, metrics, tracing).
- Strong foundation in security principles, including data encryption, identity management, and secure APIs.
- Working knowledge of Agile practices, DevOps pipelines, and CI/CD automation.
- Exceptional communication, leadership, and cross-functional collaboration skills.

Preferred qualifications:
- Exposure to the C4 Model, Lucidchart, or similar tools for system architecture and diagramming.
- Experience leading architectural transformations.
- Knowledge of Data Mesh, Data Governance, or Master Data Management concepts.
- Certification in AWS, Kubernetes, or Software Architecture.

About Infor: Infor is a global leader in business cloud software products that cater to companies in industry-specific markets. With a focus on industry suites in the cloud, Infor prioritizes user experience, data science, and seamless integration into existing systems. Over 60,000 organizations worldwide rely on Infor for business-wide digital transformation. Join Infor and become part of a global community that values bold thinking and innovation. Your expertise will not only solve problems but also shape industries, unlock opportunities, and create real-world impact for billions of people. At Infor, you are not just building a career; you are helping to build the future. For more information, visit www.infor.com.
ACTIVELY HIRING
posted 3 weeks ago

Associate- BIM

Axtria - Ingenious Insights
experience
4 to 8 Yrs
location
Hyderabad, Telangana
skills
  • Python
  • Numpy
  • SQL
  • ETL
  • Data Integration
  • Data Processing
  • Data Transformation
  • Data Aggregation
  • Performance Optimization
  • AVRO
  • Distributed Systems
  • Snowflake
  • PySpark
  • Pandas
  • Spark Query Tuning
  • File Formats ORC
  • Parquet
  • Compression Techniques
  • Modular Programming
  • Robust Programming
  • ETL Development
  • Databricks
Job Description
As a driven business analyst in the area of pharma/life sciences, your role will involve working on complex analytical problems to support better business decision making.

You will be responsible for the following key tasks:
- Write PySpark queries for data transformation needs.
- Participate in ETL design using any Python framework for new or changing mappings and workflows, and prepare technical specifications.
- Write complex SQL queries with a focus on performance tuning and optimization.
- Demonstrate the ability to handle tasks independently and lead the team when necessary.
- Coordinate with cross-functional teams to ensure project objectives are met.
- Collaborate with data architects and engineers to design and implement data models.

For this role, the qualifications required include:
- BE/B.Tech or Master of Computer Application degree.
- Advanced knowledge of the PySpark, Python, pandas, and NumPy frameworks.
- Minimum 4 years of extensive experience in the design, build, and deployment of Spark/PySpark for data integration.
- Deep experience in developing data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target data destinations.
- Experience creating Spark jobs for data transformation and aggregation.
- Spark query tuning and performance optimization, with a good understanding of different file formats (ORC, Parquet, AVRO) and compression techniques to optimize queries and processing.
- Deep understanding of distributed systems (e.g., CAP theorem, partitioning, replication, consistency, and consensus).
- Experience in modular programming and robust programming methodologies.
- ETL knowledge and experience in ETL development using any Python framework.
- Preference for past experience working with Databricks/Snowflake.

In addition to the technical competencies required for the role, key behavioral competencies sought after include Ownership, Teamwork & Leadership, Cultural Fit, and Motivation to Learn and Grow. This position also values problem-solving skills, life science knowledge, and effective communication.
ACTIVELY HIRING
posted 2 months ago
experience
4 to 8 Yrs
location
Telangana
skills
  • SQL
  • Microsoft SQL Server
  • Python
  • Scala
  • Java
  • Hadoop
  • Spark
  • Airflow
  • Kafka
  • Hive
  • Neo4J
  • Elastic Search
  • Avro
  • JSON
  • Data modeling
  • Data transformation
  • Data governance
  • Azure Databricks
  • PySpark
  • Spark SQL
  • Azure Data Factory
  • ADLS
  • Azure SQL Database
  • Azure Synapse Analytics
  • Event Hub
  • Streaming Analytics
  • Cosmos DB
  • Purview
  • NiFi
  • Delta Lake
  • Parquet
  • CSV
  • REST APIs
  • Data Lake
  • Lakehouse projects
Job Description
Role Overview: You are an experienced Azure Databricks + SQL Developer / Big Data Engineer responsible for designing, developing, and maintaining scalable data solutions on Azure. Your primary focus will be on building efficient ETL/ELT pipelines, optimizing SQL queries, and leveraging Databricks and other Azure services for advanced data processing, analytics, and data platform engineering. Your strong background in traditional SQL development and modern big data technologies on Azure will be crucial for this role.

Key Responsibilities:
- Develop, maintain, and optimize ETL/ELT pipelines using Azure Databricks (PySpark/Spark SQL).
- Write and optimize complex SQL queries, stored procedures, triggers, and functions in Microsoft SQL Server.
- Design and build scalable, metadata-driven ingestion pipelines for both batch and streaming datasets.
- Perform data integration and harmonization across multiple structured and unstructured data sources.
- Implement orchestration, scheduling, exception handling, and log monitoring for robust pipeline management.
- Collaborate with peers to evaluate and select appropriate tech stack and tools.
- Work closely with business, consulting, data science, and application development teams to deliver analytical solutions within timelines.
- Support performance tuning, troubleshooting, and debugging of Databricks jobs and SQL queries.
- Utilize other Azure services such as Azure Data Factory, Azure Data Lake, Synapse Analytics, Event Hub, Cosmos DB, Streaming Analytics, and Purview as needed.
- Support BI and Data Science teams in consuming data securely and in compliance with governance standards.

Qualification Required:
- 5+ years of overall IT experience with at least 4+ years in Big Data Engineering on Microsoft Azure.
- Proficiency in Microsoft SQL Server (T-SQL) stored procedures, indexing, optimization, and performance tuning.
- Strong experience with Azure Data Factory (ADF), Databricks, ADLS, PySpark, and Azure SQL Database.
- Working knowledge of Azure Synapse Analytics, Event Hub, Streaming Analytics, Cosmos DB, and Purview.
- Proficiency in SQL, Python, and either Scala or Java with debugging and performance optimization skills.
- Hands-on experience with big data technologies such as Hadoop, Spark, Airflow, NiFi, Kafka, Hive, Neo4J, and Elastic Search.
- Strong understanding of file formats such as Delta Lake, Avro, Parquet, JSON, and CSV.
- Solid background in data modeling, data transformation, and data governance best practices.
- Experience designing and building REST APIs with practical exposure to Data Lake or Lakehouse projects.
- Ability to work with large and complex datasets, ensuring data quality, governance, and security standards.
- Certifications such as DP-203: Data Engineering on Microsoft Azure or Databricks Certified Developer (DE) are a plus.
ACTIVELY HIRING
posted 2 months ago
experience
7 to 14 Yrs
location
Hyderabad, Telangana
skills
  • Machine Learning
  • Natural Language Processing
  • Python
  • Apache Spark
  • Docker
  • Kubernetes
  • SQL
  • Git
  • GitHub
  • Azure DevOps
  • Code Review
  • Debugging
  • Java
  • Apache Kafka
  • Scala
  • MLOps
  • NLP libraries
  • AWS/GCP Cloud
  • CI/CD pipelines
  • OOP Design patterns
  • Test-Driven Development
  • Linux OS
  • Version control system
  • Problem-solving
  • Agile principles
  • Apache Avro
  • Kotlin
Job Description
As a Senior Lead Machine Learning Engineer at our client, a global leader in financial intelligence, data analytics, and AI-driven solutions, you will be at the forefront of building cutting-edge ML-powered products and capabilities for natural language understanding, information retrieval, and data sourcing solutions. Your role in the Document Platforms and AI team will involve spearheading the development and deployment of production-ready AI products and pipelines, mentoring a talented team, and playing a critical role in shaping the future of global markets.

**Responsibilities:**
- Build production-ready data acquisition and transformation pipelines from ideation to deployment
- Be a hands-on problem solver and developer helping to extend and manage the data platforms
- Apply best practices in data modeling and building ETL pipelines (streaming and batch) using cloud-native solutions
- Drive the technical vision and architecture for the extraction project, making key decisions about model selection, infrastructure, and deployment strategies
- Design, develop, and evaluate state-of-the-art machine learning models for information extraction, leveraging techniques from NLP, computer vision, and other relevant domains
- Develop robust pipelines for data cleaning, preprocessing, and feature engineering to prepare data for model training
- Train, tune, and evaluate machine learning models, ensuring high accuracy, efficiency, and scalability
- Deploy and maintain machine learning models in a production environment, monitoring their performance and ensuring their reliability
- Stay up-to-date with the latest advancements in machine learning and NLP, and explore new techniques and technologies to improve the extraction process
- Work closely with product managers, data scientists, and other engineers to understand project requirements and deliver effective solutions
- Ensure high code quality and adherence to best practices for software development
- Effectively communicate technical concepts and project updates to both technical and non-technical audiences

**Qualifications:**
- 7-14 years of professional software work experience, with a strong focus on Machine Learning, Natural Language Processing (NLP) for information extraction and MLOps
- Expertise in Python and related NLP libraries (e.g., spaCy, NLTK, Transformers, Hugging Face)
- Experience with Apache Spark or other distributed computing frameworks for large-scale data processing
- AWS/GCP Cloud expertise, particularly in deploying and scaling ML pipelines for NLP tasks
- Solid understanding of the Machine Learning model lifecycle, including data preprocessing, feature engineering, model training, evaluation, deployment, and monitoring, specifically for information extraction models
- Experience with CI/CD pipelines for ML models, including automated testing and deployment
- Docker & Kubernetes experience for containerization and orchestration
- OOP Design patterns, Test-Driven Development and Enterprise System design
- SQL (any variant, bonus if this is a big data variant)
- Linux OS (e.g. bash toolset and other utilities)
- Version control system experience with Git, GitHub, or Azure DevOps
- Excellent Problem-solving, Code Review and Debugging skills
- Software craftsmanship, adherence to Agile principles and taking pride in writing good code
- Techniques to communicate change to non-technical people

**Nice to have:**
- Core Java 17+, preferably Java 21+, and associated toolchain
- Apache Avro
- Apache Kafka
- Other JVM based languages, e.g. Kotlin, Scala

Join our client's team to be a part of a global company, collaborate with a highly skilled team, and contribute to solving high complexity, high impact problems!
ACTIVELY HIRING
posted 1 month ago

Technical Architect

Transnational AI Private Limited
experience
2 to 7 Yrs
location
Telangana
skills
  • Apache Kafka
  • Python
  • Flask
  • MySQL
  • PostgreSQL
  • MongoDB
  • FastAPI
  • AWS Lambda
  • SageMaker
  • MLflow
  • ONNX
  • AWS Certifications
  • DVC
Job Description
As a senior Technical Architect at Transnational AI Private Limited, your primary role will be to design and lead the backend development and system design for real-time, event-driven microservices integrating AI/ML capabilities. You will be working with cutting-edge frameworks such as FastAPI, Kafka, AWS Lambda, and collaborating with data scientists to embed ML models into production-grade systems.

Key Responsibilities:
- Design and implement event-driven architectures using Apache Kafka to orchestrate distributed microservices and streaming pipelines.
- Define scalable message schemas (e.g., JSON/Avro), data contracts, and versioning strategies to support AI-powered services.
- Architect hybrid event + request-response systems to balance real-time streaming and synchronous business logic.
- Develop Python-based microservices using FastAPI, enabling standard business logic and AI/ML model inference endpoints.
- Collaborate with AI/ML teams to operationalize ML models via REST APIs, batch processors, or event consumers.
- Integrate model-serving platforms such as SageMaker, MLflow, or custom Flask/ONNX-based services.
- Design and deploy cloud-native applications using AWS Lambda, API Gateway, S3, CloudWatch, and optionally SageMaker or Fargate.
- Build AI/ML-aware pipelines for automating retraining, inference triggers, or model selection based on data events.
- Implement autoscaling, monitoring, and alerting for high-throughput AI services in production.
- Ingest and manage high-volume structured and unstructured data across MySQL, PostgreSQL, and MongoDB.
- Enable AI/ML feedback loops by capturing usage signals, predictions, and outcomes via event streaming.
- Support data versioning, feature store integration, and caching strategies for efficient ML model input handling.
- Write unit, integration, and end-to-end tests for standard services and AI/ML pipelines.
- Implement tracing and observability for AI/ML inference latency, success/failure rates, and data drift.
- Document ML integration patterns, input/output schema, service contracts, and fallback logic for AI systems.

Preferred Qualifications:
- 7+ years of backend software development experience with 2+ years in AI/ML integration or MLOps.
- Strong experience in productionizing ML models for classification, regression, or NLP use cases.
- Experience with streaming data pipelines and real-time decision systems.
- AWS Certifications (Developer Associate, Machine Learning Specialty) are a plus.
- Exposure to data versioning tools (e.g., DVC), feature stores, or vector databases is advantageous.

If you join Transnational AI Private Limited, you will work in a mission-driven team delivering real-world impact through AI and automation. You will collaborate with architects, ML engineers, data scientists, and product managers in a flat hierarchy, innovation-friendly environment, and experience rapid career advancement.
ACTIVELY HIRING
posted 2 weeks ago
experience
5 to 9 Yrs
location
All India, Pune
skills
  • ETL
  • APIs
  • JSON
  • Avro
  • Glue
  • Snowflake
  • data modeling
  • data quality
  • data integration
  • data governance
  • application design
  • architecture modeling
  • data analysis
  • distributed systems
  • DBT
  • PCI DSS
  • tokenization
  • encryption
  • Parquet
  • AWS services
  • S3
  • Databricks
  • Hadoop ecosystem
  • Delta Lake
  • Medallion architecture
  • data design patterns
  • database technologies
  • RDBMS
  • NoSQL databases
  • PA DSS
Job Description
As a Sr. Data Engineer/Architect at Barclays, you will play a vital role in driving innovation and excellence in the digital landscape. You will utilize cutting-edge technology to enhance digital offerings, ensuring exceptional customer experiences. Working alongside a team of engineers, business analysts, and stakeholders, you will tackle complex technical challenges that require strong analytical skills and problem-solving abilities.

**Key Responsibilities:**
- Experience and understanding in ETL, APIs, various data formats (JSON, Avro, Parquet) and experience in documenting/maintaining interface inventories.
- Deep understanding of AWS services (e.g., Glue, S3, Databricks, Snowflake) and Hadoop ecosystem for data processing and storage.
- Familiarity with Databricks, Delta Lake, and Medallion architecture for advanced analytics and fraud detection use cases.
- Build logical and physical data models, enforce data quality, and integrate data across multiple systems.
- Data Design and Requirements Analysis: Able to apply data design patterns and frameworks, working knowledge of schemas and normalization.
- Experience in preparing architecture vision documents and data flow diagrams, and in maintaining auditable governance documentation.
- Understands user requirement gathering to define data flow, model and design.
- Knowledge of basic activities and deliverables of application design; ability to utilize application design methodologies, tools and techniques to convert business requirements and logical models into a technical application design.
- Knowledge of Architecture Modelling; ability to develop and modify enterprise architecture through conceptual, logical and physical approaches.
- Knowledge of data, process and events; ability to use tools and techniques for analyzing and documenting logical relationships among data, processes or events.
- Knows the tools and techniques used for data governance. Understands the relevance of following, creating and improving policies to ensure data is secure, including data privacy (e.g. token generation).
- Knowledge of the right platform for data transmission, ensuring that cloud and on-prem servers are used appropriately and that cost is considered when choosing between cloud and on-prem platforms.
- Knowledge of databases and their latest updates to help provide the right tools and design.
- Proficient in communicating data standards and demonstrating their value to the wider audience.

**Qualifications Required:**
- Educated to degree or MBA level to be able to meet the intellectual demands of the job, or can demonstrate equivalent experience.
- Good understanding of distributed systems and databases.
- Good understanding of DBT (Data Build Tool).
- Good understanding of AWS database technologies e.g. Databricks, Snowflake.
- Knowledge of PCI DSS and PA DSS tokenization and encryption.
- Understands basic features of RDBMS and NoSQL databases.

The role is based in Pune.

In this role, you will build and maintain data architecture pipelines, design and implement data warehouses and data lakes, develop processing and analysis algorithms, and collaborate with data scientists to deploy machine learning models. Your responsibilities also include advising on decision making, contributing to policy development, and ensuring operational effectiveness.

All colleagues at Barclays are expected to demonstrate the Barclays Values of Respect, Integrity, Service, Excellence, and Stewardship. Additionally, adherence to the Barclays Mindset to Empower, Challenge and Drive is crucial for creating a culture of excellence and integrity within the organization.
ACTIVELY HIRING
posted 2 months ago

Data Engineer - ETL/Python

Techno-Comp Computer Services Pvt. Ltd.
experience
2 to 9 Yrs
location
All India
skills
  • Python
  • MongoDB
  • Snowflake
  • Glue
  • Kafka
  • SQL
  • JSON
  • ORC
  • Avro
  • scheduling
  • APIs
  • Data streaming
  • Spark ecosystem
  • Scala programming
  • AWS EMR
  • S3
  • BigData pipelines
  • NoSQL databases
  • Parquet
  • CSV
Job Description
As a Data Engineer with 6-9 years of experience, your role will involve the following key responsibilities:
- Design, develop, and maintain scalable and robust data pipelines for collecting, processing, and transforming large datasets.
- Implement ETL (Extract, Transform, Load) processes to ensure efficient movement of data across multiple systems.
- Design, implement, and optimize relational and non-relational databases to support business needs.
- Ensure data quality, consistency, and accuracy by building validation checks, monitoring systems, and automating data reconciliation processes.
- Work with cloud platforms (AWS, Azure, GCP) to deploy and manage data storage, compute, and processing resources.
- Monitor and optimize the performance of data pipelines, queries, and data processing workflows.

Qualifications and Skills required for this role include:
- 5.5 years of experience in Spark ecosystem, Python/Scala programming, MongoDB data loads, Snowflake and AWS platform (EMR, Glue, S3), Kafka.
- Hands-on experience in writing advanced SQL queries and familiarity with a variety of databases.
- Experience in coding solutions using Python/Spark and performing performance tuning/optimization.
- Experience in building and optimizing Big-Data pipelines in the Cloud.
- Experience in handling different file formats like JSON, ORC, Avro, Parquet, CSV.
- Hands-on experience in data processing with NoSQL databases like MongoDB.
- Familiarity and understanding of job scheduling.
- Hands-on experience working with APIs to process data.
- Understanding of data streaming, such as Kafka services.
- 2+ years of experience in Healthcare IT projects.
- Certification on Snowflake (Snow PRO certification) and AWS (Cloud Practitioner/Solution Architect).
- Hands-on experience on Kafka streaming pipelines implementation.

You will be a valuable asset to the team with your expertise in data engineering and cloud infrastructure management.
ACTIVELY HIRING
posted 2 months ago
experience
5 to 9 Yrs
location
Maharashtra, Pune
skills
  • Java
  • Spring Boot
  • Distributed Systems
  • Microservices
  • JSON
  • Avro
  • Docker
  • Kubernetes
  • Splunk
  • Confluent Kafka
  • Protobuf
Job Description
As a highly skilled Java Developer with expertise in Spring Boot, Confluent Kafka, and distributed systems, your main responsibility will be designing, developing, and optimizing event-driven applications using Confluent Kafka while leveraging Spring Boot/Spring Cloud for microservices-based architectures.

Your key responsibilities will include:
- Developing, deploying, and maintaining scalable and high-performance applications using Java (Core Java, Collections, Multithreading, Executor Services, CompletableFuture, etc.).
- Working extensively with Confluent Kafka, including producer-consumer frameworks, offset management, and optimization of consumer instances based on message volume.
- Ensuring efficient message serialization and deserialization using JSON, Avro, and Protobuf with Kafka Schema Registry.
- Designing and implementing event-driven architectures with real-time processing capabilities.
- Optimizing Kafka consumers for high-throughput and low-latency scenarios.
- Collaborating with cross-functional teams to ensure seamless integration and deployment of services.
- Troubleshooting and resolving performance bottlenecks and scalability issues in distributed environments.
- Familiarity with containerization (Docker, Kubernetes) and cloud platforms is a plus.
- Experience with Splunk as a monitoring and logging tool is a plus.

Qualifications required for this role include:
- 5+ years of experience in Java development.
- Strong expertise in Spring Boot, Confluent Kafka, and distributed systems.
- Proficiency in designing and optimizing event-driven applications.
- Experience with microservices-based architectures using Spring Boot/Spring Cloud.
- Knowledge of JSON, Avro, and Protobuf for message serialization and deserialization.
- Familiarity with Docker, Kubernetes, and cloud platforms.
- Experience with Splunk for monitoring and logging is a plus.
ACTIVELY HIRING
posted 2 months ago
experience
5 to 9 Yrs
location
Karnataka
skills
  • Talend
  • Data Integration
  • SQL
  • PL/SQL
  • Data Modeling
  • Relational Databases
  • JSON
  • XML
  • Git
  • Jenkins
  • DevOps
  • Data Governance
  • Data Quality
  • Metadata Management
  • ETL/ELT
  • Cloud Platforms
  • REST APIs
  • File Formats
  • CI/CD
Job Description
Role Overview: As a skilled and detail-oriented Senior Talend Developer, your role will involve designing, developing, and optimizing ETL/ELT processes using Talend Data Integration to support the enterprise data ecosystem. You will collaborate with various teams to build robust data pipelines, facilitate data migration, and enhance data-driven decision-making within the organization.

Key Responsibilities:
- Design and develop scalable ETL/ELT pipelines utilizing Talend (Big Data, Data Fabric, or Open Studio).
- Collaborate with data architects, business analysts, and stakeholders to address data integration requirements.
- Integrate data from diverse on-premise and cloud-based sources into a centralized data warehouse or data lake.
- Create and manage Talend jobs, routes, and workflows; schedule and monitor job execution via Talend Administration Center (TAC).
- Ensure data quality, integrity, and consistency across all systems.
- Optimize existing jobs for enhanced performance, reusability, and maintainability.
- Conduct code reviews, establish best practices, and mentor junior developers on Talend standards and design.
- Troubleshoot data integration issues and provide support for ongoing data operations.
- Document technical designs, data flows, and operational procedures.

Qualification Required:
- Bachelor's degree in Computer Science, Information Systems, or a related field.
- 5+ years of hands-on experience with Talend ETL tools (Talend Data Integration / Talend Big Data Platform).
- Strong knowledge of SQL, PL/SQL, and data modeling concepts (star/snowflake schemas).
- Proficiency in working with relational databases (e.g., Oracle, SQL Server, MySQL) and cloud platforms (AWS, Azure, or GCP).
- Familiarity with REST APIs, JSON/XML, and file formats (CSV, Parquet, Avro).
- Experience with Git, Jenkins, or other DevOps/CI-CD tools.
- Knowledge of data governance, data quality, and metadata management best practices.
ACTIVELY HIRING
posted 2 weeks ago

Pyspark Data Engineer

People Prime Worldwide
experience
8 to 12 Yrs
location
Maharashtra, Pune
skills
  • Python
  • Distributed Computing
  • Cloud Services
  • Software Development
  • Kafka
  • Numpy
  • API development
  • PySpark
  • Big Data Ecosystem
  • Database
  • SQL
  • Apache Airflow
  • Pandas
  • RESTful services
  • Data file formats
  • Agile development methodologies
Job Description
You will be responsible for designing, developing, and maintaining scalable and efficient data processing pipelines using PySpark and Python.

Your key responsibilities will include:
- Building and implementing ETL (Extract, Transform, Load) processes to ingest data from various sources and load it into target destinations.
- Optimizing PySpark applications for performance and troubleshooting existing code.
- Ensuring data integrity and quality throughout the data lifecycle.
- Collaborating with cross-functional teams, including data engineers and data scientists, to understand and fulfill data needs.
- Providing technical leadership, conducting code reviews, and mentoring junior team members.
- Translating business requirements into technical solutions and contributing to architectural discussions.
- Staying current with the latest industry trends in big data and distributed computing.

You must possess the following mandatory skills and experience:
- Advanced proficiency in PySpark and Python with extensive experience in building data processing applications.
- In-depth understanding of distributed computing principles, including performance tuning.
- Experience with big data ecosystem technologies such as Hadoop, Hive, Sqoop, and Spark.
- Hands-on experience with cloud platforms like AWS (e.g., Glue, Lambda, Kinesis).
- Strong knowledge of SQL, experience with relational databases, and data warehousing.
- Experience with software development best practices, including version control (Git), unit testing, and code reviews.

The following desired or "nice-to-have" skills would be advantageous:
- Familiarity with orchestration tools like Apache Airflow.
- Experience with other data processing tools like Kafka or Pandas/Numpy.
- Knowledge of API development and creating RESTful services.
- Experience with data file formats like Parquet, ORC, and Avro.
- Experience with Agile development methodologies.
ACTIVELY HIRING
posted 2 months ago
experience
5 to 9 Yrs
location
Maharashtra
skills
  • Java
  • Kafka
  • Data Integration
  • Leadership
  • Java Programming
  • Data Validation
  • JSON
  • Avro
  • REST
  • SOAP
  • FTP
  • Distributed Systems
  • Git
  • Maven
  • Gradle
  • Docker
  • AWS
  • Azure
  • GCP
  • Protobuf
  • Event-Driven Architecture
  • CI/CD
Job Description
Role Overview: You will join the team as a Senior Java Developer to lead the development of custom connectors for integrating source systems with a Kafka cluster. Your expertise in Java programming, data integration, and Kafka architecture will be essential for this role. You will be responsible for managing a small team of developers and collaborating effectively with Kafka administrators. This position will involve hands-on development, leadership responsibilities, and direct engagement with client systems and data pipelines.

Key Responsibilities:
- Design, develop, and maintain Java-based connectors for integrating diverse source systems with Kafka clusters.
- Incorporate business logic, data validation, and transformation rules within the connector code.
- Work closely with client teams to understand source systems, data formats, and integration requirements.
- Guide and mentor a team of developers, review code, and ensure adherence to best practices.
- Collaborate with Kafka admins to optimize connector deployment, resolve operational issues, and ensure high availability.
- Develop monitoring solutions to track data flow and resolve issues in real-time.
- Create comprehensive technical documentation for the custom connectors, including configuration guides and troubleshooting procedures.
- Evaluate and implement improvements in performance, scalability, and reliability of the connectors and associated data pipelines.

Qualification Required:
- Strong proficiency in Java, including multi-threading, performance tuning, and memory management.
- Hands-on experience with Confluent Kafka and Kafka Connect.
- Familiarity with JSON, Avro, or Protobuf data formats.
- Experience in integrating data from various sources such as relational databases, APIs, and flat files.
- Understanding of data exchange protocols like REST, SOAP, and FTP.
- Knowledge of event-driven architecture and distributed systems.
- Ability to lead a small team of developers, delegate tasks, and manage project timelines.
- Strong mentoring and communication skills to guide team members and collaborate with non-technical stakeholders.
- Proven ability to troubleshoot complex data and system integration issues.
- Analytical mindset for designing scalable, reliable, and efficient data pipelines.
- Experience with CI/CD pipelines, Git, and build tools like Maven and Gradle.
- Familiarity with containerization (Docker) and cloud environments (AWS, Azure, or GCP) is a plus.
ACTIVELY HIRING
posted 2 months ago

Senior Software Engineer (Data Engineering)

Bigthinkcode Technologies Private Limited
experience
5 to 9 Yrs
location
Chennai, Tamil Nadu
skills
  • Python
  • SQL
  • dbt
  • RDBMS
  • JSON
  • Avro
  • Snowflake
  • Git
  • Kafka
  • AWS
  • GCP
  • Azure
  • ETL/ELT frameworks
  • cloud data warehouses
  • Apache Airflow
  • AWS Glue
  • Parquet
  • Redshift
  • BigQuery
  • CI/CD
  • Kinesis
  • Spark Streaming
  • dimensional
  • star schema
Job Description
As a skilled Data Engineer at BigThinkCode Technologies, your role involves designing, building, and maintaining robust data pipelines and infrastructure to optimize data flow and ensure scalability. Your technical expertise in Python, SQL, ETL/ELT frameworks, and cloud data warehouses, along with strong collaboration skills, will be crucial in partnering with cross-functional teams to enable seamless access to structured and unstructured data across the organization.

**Key Responsibilities:**
- Design, develop, and maintain scalable ETL/ELT pipelines for structured and unstructured data processing.
- Optimize and manage SQL queries for performance and efficiency in handling large-scale datasets.
- Collaborate with data scientists, analysts, and business stakeholders to translate requirements into technical solutions.
- Ensure data quality, governance, and security across pipelines and storage systems.
- Document architectures, processes, and workflows for clarity and reproducibility.

**Required Technical Skills:**
- Proficiency in Python for scripting, automation, and pipeline development.
- Expertise in SQL for complex queries, optimization, and database design.
- Hands-on experience with ETL/ELT tools such as Apache Airflow, dbt, and AWS Glue.
- Experience working with both structured (RDBMS) and unstructured data (JSON, Parquet, Avro).
- Familiarity with cloud-based data warehouses like Redshift, BigQuery, and Snowflake.
- Knowledge of version control systems like Git and CI/CD practices.

**Preferred Qualifications:**
- Experience with streaming data technologies such as Kafka, Kinesis, and Spark Streaming.
- Exposure to cloud platforms like AWS, GCP, Azure, and their data services.
- Understanding of data modeling techniques like dimensional and star schema and optimization.

The company offers benefits like a flexible schedule, health insurance, paid time off, and a performance bonus. Thank you for considering this opportunity at BigThinkCode Technologies.
ACTIVELY HIRING
posted 1 week ago
experience
2 to 6 Yrs
location
Maharashtra, Navi Mumbai
skills
  • cloud
  • snowflake
  • plsql
  • dbt
Job Description
As a Snowflake Data Engineer at PibyThree Consulting Pvt Ltd., you will be responsible for leveraging your expertise in Snowflake Data Cloud and cloud platforms such as AWS, Azure, or Google Cloud Platform to develop and maintain data solutions.

Your main responsibilities will include:
- Having 2+ years of experience in Snowflake Data Cloud and a total of 4+ years of experience in the field.
- Demonstrating proficiency in PL/SQL, Oracle, and Snowflake internal and external staging and loading options.
- Utilizing your deep exposure to Snowflake features to write SnowSQL and Stored Procedures.
- Developing ETL routines for Snowflake using Python, Scala, or ETL tools.
- Applying your knowledge of AWS or Azure platforms for data ingestion to Snowflake from various formats like CSV, JSON, Parquet, and Avro.
- Conducting SQL performance tuning, identifying technical issues, and resolving failures effectively.
- Evaluating existing data structures and creating advanced SQL and PL/SQL programs.
- Demonstrating proficiency in at least one programming language such as Python, Scala, or PySpark, and familiarity with DBT.

Qualifications required for this role include a minimum of 4 years of experience and strong skills in cloud platforms, Snowflake, PL/SQL, and DBT. Join us at PibyThree Consulting Pvt Ltd. and contribute to cutting-edge data solutions using your expertise in Snowflake and related technologies.
ACTIVELY HIRING
posted 2 weeks ago

Senior Associate - Data Engineering

PwC Acceleration Center India
experience
4 to 8 Yrs
location
All India
skills
  • Python
  • Apache Spark
  • Kafka
  • ETL
  • AWS
  • Azure
  • GCP
  • JSON
  • Avro
  • RDBMS
  • NoSQL
  • Docker
  • Kubernetes
  • GitHub
  • Snowflake
  • Data Governance
  • Data Quality
  • Data Security
  • Data Integration
  • Agile Methodology
  • PySpark
  • CSV
  • Parquet
  • CI/CD
  • Databricks
  • Azure Data Factory
  • Data Orchestration
  • Generative AI
  • Large Language Models (LLMs)
Job Description
Role Overview: At PwC, you will be part of the data and analytics engineering team, focusing on utilizing advanced technologies to create robust data solutions for clients. Your role will involve transforming raw data into actionable insights to drive informed decision-making and business growth. As a data engineer at PwC, your main responsibilities will include designing and constructing data infrastructure and systems to enable efficient data processing and analysis. You will also be involved in developing and implementing data pipelines, data integration, and data transformation solutions.

Key Responsibilities:
- Design, develop, and maintain robust, scalable ETL pipelines using tools such as Apache Spark, Kafka, and other big data technologies.
- Create scalable and reliable data architectures, including Lakehouse, hybrid batch/streaming systems, Lambda, and Kappa architectures.
- Demonstrate proficiency in Python, PySpark, Spark, and solid understanding of design patterns (e.g., SOLID).
- Ingest, process, and store structured, semi-structured, and unstructured data from various sources.
- Utilize cloud platforms like AWS, Azure, and GCP to set up data pipelines.
- Optimize ETL processes for scalability and efficiency.
- Work with JSON, CSV, Parquet, and Avro file formats.
- Possess deep knowledge of RDBMS, NoSQL databases, and CAP theorem principles.
- Collaborate with data scientists, analysts, and stakeholders to optimize data models for performance and scalability.
- Document data processes, architectures, and models comprehensively for cross-team understanding.
- Implement and maintain CI/CD pipelines using tools like Docker, Kubernetes, and GitHub.
- Ensure data quality, integrity, and security across all systems and processes.
- Implement and monitor data governance best practices.
- Stay updated with emerging data technologies and trends for innovation and improvement.
- Familiarity with Cloud Data/Integration/Orchestration Platforms like Snowflake, Databricks, and Azure Data Factory is beneficial.

Qualifications Required:
- BE / masters in design / B Design / B.Tech / HCI Certification (Preferred)
- 4-7 years of experience in Programming Language (Python, Scala, Java), Apache Spark, ADF, Azure Databricks, Postgres, ETL (Batch/Streaming), Git
- Familiarity with Agile methodology.

Additional Company Details: No additional details provided in the job description.
ACTIVELY HIRING
posted 2 days ago
experience
3 to 7 Yrs
location
Karnataka
skills
  • Consulting
  • Snowflake
  • AWS
  • Python
  • Spark
  • Glue
  • EMR
  • DynamoDB
  • JSON
  • Avro
  • ORC
  • Industry experience in Retail/CPG/Media
  • client-facing experience
  • Certifications AWS
  • Databricks
  • Data Engineering
  • Data Modeling skills
  • Experience in Extract Transform Load ETL processes
  • Data Warehousing
  • Data Analytics skills
  • Proficiency in relevant programming languages like SQL
  • Python
  • Experience with cloud services like AWS
  • Databricks
  • Strong analytical
  • problem-solving skills
  • Programming Python
  • AWS Services S3
  • Lambda
  • Step Functions
  • Databricks Delta Lake
  • MLflow
  • Unity Catalog experience
  • Databases SQL databases PostgreSQL
  • MySQL
  • NoSQL MongoDB
  • Data Formats
Job Description
You are applying for the role of Senior Data Engineer at Beige Bananas, a rapidly growing AI consulting firm specializing in creating custom AI products for Fortune 500 Retail, CPG, and Media companies with an outcome-driven mindset to accelerate clients' value realization from their analytics investments.

**Role Overview:**
As a Senior Data Engineer at Beige Bananas, you will be responsible for data engineering, data modeling, ETL processes, data warehousing, and data analytics. You will work independently to build end-to-end pipelines in AWS or Databricks.

**Key Responsibilities:**
- Design and implement scalable data architectures
- Build and maintain real-time and batch processing pipelines
- Optimize data pipeline performance and costs
- Ensure data quality, governance, and security
- Collaborate with ML teams on feature stores and model pipelines

**Qualifications Required:**
- Data Engineering and Data Modeling skills
- Experience in Extract Transform Load (ETL) processes
- Data Warehousing and Data Analytics skills
- Proficiency in programming languages like SQL and Python
- Experience with cloud services like AWS and Databricks
- Strong analytical and problem-solving skills
- Bachelor's or Master's degree in Computer Science, Engineering, or related field

**Additional Details of the Company:**
Beige Bananas is a pure play AI consulting firm that focuses on creating hyper-custom AI products for Fortune 500 Retail, CPG, and Media companies. They have a fast-paced environment with a focus on accelerating clients' value realization from their analytics investments.

If you are ready for a new challenge, a great work environment, and an opportunity to take your career to the next level, apply today and be a part of Beige Bananas' exciting journey!
ACTIVELY HIRING
posted 2 months ago
experience
4 to 8 Yrs
location
Karnataka
skills
  • Java development
  • Spring Boot
  • Spring
  • Microservices
  • API development
  • AVRO
  • Kafka
  • Collections
  • Garbage Collection
  • Multithreading
  • Hibernate
  • Maven
  • Mockito
  • relational databases
  • SQL
  • MongoDB
  • Oracle
  • SQL Server
  • Cassandra
  • Java 8
  • REST/JSON
  • Core Java Skills
  • Design pattern
  • JUnits
  • JMock
  • NoSQL databases
  • RESTful web services
  • API design
Job Description
As a Java Developer with 6+ years of experience, you will be responsible for:
- Leading Java development using Spring Boot, Java 8+, Spring, and Microservices
- Developing and integrating APIs using REST/JSON, AVRO, and Kafka
- Designing, building, and maintaining REST APIs and microservices following best practices, including security and performance tuning
- Demonstrating strong Core Java Skills, including knowledge of Design patterns, Collections, Garbage Collection, and Multithreading
- Utilizing Java frameworks and technologies such as Spring Boot, Hibernate, and Maven
- Writing and managing JUnit tests and using mocking frameworks like Mockito, JMock, or equivalent
- Working with relational databases and SQL (such as Oracle or SQL Server), as well as NoSQL databases like MongoDB or Cassandra
- Implementing RESTful web services and API design best practices

Qualifications required for this role include:
- 6+ years of experience in Core Java (Preferred)
- 4+ years of experience in Spring (Preferred)
- 4+ years of experience in Microservices (Preferred)

Please note that this is a Contractual/Temporary position based in Bangalore/Chennai and requires in-person work. If you are open to a 6-month Contractual Role and can join immediately, we encourage you to apply for this opportunity.
ACTIVELY HIRING
posted 2 months ago
experience
5 to 15 Yrs
location
Chennai, Tamil Nadu
skills
  • AWS
  • Enterprise design patterns
  • Kafka
  • Java
  • Data Architecture
  • JSON
  • Avro
  • XML
  • Confluent Kafka
  • Microservice architecture
  • Domain Driven Design
  • 12-factor app
  • Kinesis
  • AWS services
  • Springboot
  • Oracle Databases
  • NoSQL DBs
  • Data Modelling
  • Data Lake
  • Data Mesh
  • API gateways
  • AWS architecture
Job Description
As a Solutions Architect, your primary focus is to ensure the technical integrity of the Event Driven Architecture platform and to formulate the technical direction for the strategic cloud investments. You will drive the estimation, analysis, and design, as well as support implementation and operations of a slew of microservices owned by the team. Working closely with the senior tech and business leadership, engineering, and ops teams, you will drive the vision and the technical strategic objectives throughout the SDLC. It is essential for you to remain current in all Parcel and Logistics related technologies in support of enterprise applications and infrastructure. Your passion for technology and thirst for innovation will play a crucial role in shaping the future of digital transformation.

Responsibilities:
- Analyze, design, and lead technical solutions fulfilling core business requirements in the migration journey from legacy messaging solution to Confluent Kafka in AWS platform. Consider solution scalability, availability, security, extensibility, maintainability, risk assumptions, and cost considerations.
- Actively participate in proof-of-concept implementation of new applications and services.
- Research, evaluate, and recommend third-party software packages and services to enhance digital transformation capabilities.
- Promote technical vision and sound engineering principles among technology department staff members and across the global team.
- Occasionally assist in production escalations, systems operations, and problem resolutions.
- Assist team members in adopting new Kafka and real-time data streaming solutions. Mentor the team to remain current with the latest tech trends in the global marketplace.
- Perform and mentor conceptual, logical, and physical data modeling.
- Drive the team to maintain semantic models.
- Guide teams in adopting data warehousing, data lakes, and data mesh architectures.
- Drive process, policy, and standard improvements related to architecture, design, and development principles.
- Assist business leadership in prioritizing business capabilities and go-to-market decisions.
- Collaborate with cross-functional teams and business teams as required to drive the strategy and initiatives forward.
- Lead architecture teams in digital capabilities and competency building, mentoring junior team members.

Qualifications:
- 15+ years of software engineering experience with 5+ years in hands-on architecture roles.
- Ability to define platform strategy, target state architecture, and implementation roadmaps for enterprise-scale applications to migrate to Kafka.
- Proven hands-on architecture and design experience in Microservice architecture and Domain Driven Design concepts and principles including 12-factor app and other enterprise design patterns.
- Establish enterprise architectural blueprints and cloud deployment topologies.
- Highly experienced in designing high traffic services serving 1k+ transactions per second or similar high transaction volume distributed systems with resilient high-availability and fault-tolerance.
- Experience in developing event-driven, message-driven asynchronous systems such as Kafka, Kinesis, etc.
- Experience in AWS services such as Lambdas, ECS, EKS, EC2, S3, DynamoDB, RDS, VPCs, Route 53, ELB.
- Experience in Enterprise Java, Springboot ecosystem.
- Experience with Oracle Databases, NoSQL DBs, and distributed caching.
- Experience with Data Architecture, Data Modeling, Data Lake, and Data Mesh implementations.
- Extensive experience implementing system integrations utilizing API gateways, JSON, Avro, and XML libraries.
- Excellent written, verbal, presentation, and process facilitation skills.
- AWS architecture certification or equivalent working expertise with the AWS platform.
- B.S. or M.S. in computer science or a related technical area preferred.

Please note that this position is based out of GDC in Chennai, India, and occasional travel to Belgium might be required for business interactions and training.
ACTIVELY HIRING
posted 2 weeks ago

Data Engineer

Enterprise Minds, Inc
experience
4 to 8 Yrs
location
Maharashtra
skills
  • Java
  • Splunk
  • Tableau
  • SQL
  • Spark
  • JSON
  • XML
  • Avro
  • Docker
  • Kubernetes
  • Azure
  • Kafka
  • Databricks
  • Grafana
  • Prometheus
  • PowerBI
  • Pyspark
Job Description
Role Overview: As a data engineer in Pune, you will be responsible for delivering data intelligence solutions to customers globally. Your primary tasks will include implementing and deploying a product that provides insights into material handling systems' performance. You will collaborate with a dynamic team to build end-to-end data ingestion pipelines and deploy dashboards.

Key Responsibilities:
- Design and implement data & dashboarding solutions to maximize customer value.
- Deploy and automate data pipelines and dashboards to facilitate further project implementation.
- Work in an international, diverse team with an open and respectful atmosphere.
- Make data available for other teams within the department to support the platform vision.
- Communicate and collaborate with various groups within the company and project team.
- Work independently and proactively with effective communication to provide optimal solutions.
- Participate in an agile team, contribute ideas for improvements, and address concerns.
- Collect feedback and identify opportunities to enhance the existing product.
- Lead communication with stakeholders involved in the deployed projects.
- Execute projects from conception to client handover, ensuring technical performance and organizational contribution.

Qualifications Required:
- Bachelor's or master's degree in computer science, IT, or equivalent.
- Minimum of 4 to 8 years of experience in building and deploying complex data pipelines and solutions.
- Hands-on experience with Java and Databricks.
- Familiarity with visualization software such as Splunk, Grafana, Prometheus, PowerBI, or Tableau.
- Strong expertise in SQL, Java, data modeling, and data schemas (JSON/XML/Avro).
- Experience with PySpark or Spark for distributed data processing.
- Knowledge of deploying services as containers (e.g., Docker, Kubernetes) and working with cloud services (preferably Azure).
- Familiarity with streaming and/or batch storage technologies (e.g., Kafka) is advantageous.
- Experience in data quality management, monitoring, and Splunk (SPL) is a plus.
- Excellent communication skills in English.

(Note: No additional details of the company were provided in the job description.)
ACTIVELY HIRING
posted 2 weeks ago
experience
4 to 8 Yrs
location
Karnataka
skills
  • Apache Spark
  • Kafka
  • Python
  • ETL
  • AWS
  • Azure
  • GCP
  • JSON
  • Avro
  • RDBMS
  • NoSQL
  • Docker
  • Kubernetes
  • GitHub
  • Snowflake
  • Data Governance
  • Data Quality
  • Data Security
  • Data Integration
  • PySpark
  • CSV
  • Parquet
  • CI/CD
  • Databricks
  • Azure Data Factory
  • Data Orchestration
  • Generative AI
Job Description
Role Overview: At PwC, as a data and analytics engineering professional, your main focus will be on utilizing advanced technologies and techniques to create robust data solutions for clients. Your role is crucial in converting raw data into actionable insights, which facilitate informed decision-making and drive business growth. In the field of data engineering at PwC, your responsibilities will include designing and constructing data infrastructure and systems to enable efficient data processing and analysis. You will be tasked with developing and implementing data pipelines, data integration, and data transformation solutions.

Key Responsibilities:
- Design, develop, and maintain robust, scalable ETL pipelines using technologies such as Apache Spark, Kafka, and other big data tools.
- Create scalable and reliable data architectures, including Lakehouse, hybrid batch/streaming systems, Lambda, and Kappa architectures.
- Demonstrate proficiency in Python, PySpark, Spark, and possess a solid understanding of design patterns like SOLID.
- Ingest, process, and store structured, semi-structured, and unstructured data from various sources.
- Utilize cloud platforms such as AWS, Azure, and GCP to set up data pipelines.
- Optimize ETL processes to ensure scalability and efficiency.
- Handle various file formats like JSON, CSV, Parquet, and Avro.
- Deep knowledge of RDBMS, NoSQL databases, and CAP theorem principles.
- Collaborate with data scientists, analysts, and stakeholders to understand data requirements and optimize data models.
- Document data processes, architectures, and models comprehensively to facilitate cross-team understanding.
- Implement and maintain CI/CD pipelines using tools like Docker, Kubernetes, and GitHub.
- Ensure data quality, integrity, and security across all systems and processes.
- Implement and monitor data governance best practices.
- Stay updated with emerging data technologies and trends, identifying opportunities for innovation and improvement.
- Knowledge of other Cloud Data/Integration/Orchestration Platforms like Snowflake, Databricks, and Azure Data Factory is advantageous.

Qualification Required:
- Minimum 4-7 years of experience in Programming Language (Python, Scala, Java), Apache Spark, ADF, Azure Databricks, Postgres, with familiarity in NoSQL.
- BE / masters in design / B Design / B.Tech / HCI Certification (Preferred)
ACTIVELY HIRING
@ 2025 Shine.com | All Right Reserved
