Job Details
Core skills:
- Data engineering, data analysis, data science, machine learning, and knowledge of hypothesis testing and statistical inference.
- Experience managing data pipelines and large-scale data.
- Expertise with Kubernetes, Python, SQL, PySpark, GCP, and Azure.
- Ability to speak fluently about ML and data science topics; a background in statistics is expected.
Qualifications:
- Experience building robust, large-scale data pipelines and storage solutions using Python, PySpark, Databricks, Google Cloud, Azure, WCNP, Apache Airflow, and other relevant tools.
- Experience with batch-processing and scheduling frameworks such as Apache Airflow, and with event-processing frameworks such as Apache Kafka and Kafka Streams.
- Experience designing and managing data warehouses and data lakes such as BigQuery, GCS, and Delta Lake, and with reporting/dashboarding tools such as Google Looker, Tableau, and Microsoft Power BI. Familiarity with automated ML frameworks such as Vertex AI or Element.
- Experience with data-quality and pipeline-health monitoring and alerting solutions such as Prometheus, Grafana, and Splunk. Good understanding of DevOps and MLOps practices.
- Solid understanding of data engineering, data analysis, and data visualization disciplines.
Notes:
- A bachelor's degree in Computer Science or equivalent is required, along with strong expertise in data analysis, data visualization, and machine learning algorithms.
- This position requires the candidate to work from our Bentonville office at least two days per week. Strong communication skills are needed to convey complex technical concepts to non-technical stakeholders effectively.
- Must be able to deliver with little to no supervision or support, as this will be a new team we are building in Bentonville.