Job Details
Primary Responsibilities:
- Develop and maintain data processing pipelines leveraging Apache Spark and either Java or Scala.
- Lead and mentor a skilled team to enhance data engineering capabilities.
- Deploy and manage workflow orchestration using Apache Airflow to optimize data processing workflows.
- Utilize SQL proficiently for data querying and manipulation.
- Collaborate with diverse teams to gather requirements and ensure alignment with data engineering solutions.
Key Requirements:
- Bachelor's degree in Computer Science or a related field, along with at least five years of relevant data engineering experience.
- Demonstrated proficiency in Apache Spark for handling large-scale data processing.
- Strong command of either Java or Scala programming languages.
- In-depth understanding of Airflow for effective workflow orchestration.
- Proficiency in SQL for data querying and manipulation.
Preferred Skills:
- Knowledge of Avro and Schema Registry to enhance data serialization efficiency.
- Experience with containerization tools such as Docker and orchestration platforms such as Kubernetes (K8s).
- Practical exposure to cloud-based data engineering services, particularly Azure Data Factory or AWS Glue.
- Familiarity with release management processes.
- Experience with version control systems such as Git and with release pipelines in Azure DevOps or similar tools.
- Ability to collaborate effectively with teams across multiple geographic regions, including Ireland, the United States (Eastern Time Zone), and India.