Our Data team focuses on acquiring, synthesizing, annotating, and ensuring the quality of ML data, driving numerous features in collaboration with R&D teams in Apple’s SWE organization. As a data engineering tech lead, you will be responsible for establishing and executing the strategy for our organization’s ML data engine.
In this position, you will:
- Collaborate with a variety of partners, from infrastructure and ML research teams to our data functions, including data engineers, to assess their needs
- Identify state-of-the-art data components used to store, expose, and track ML data
- Identify opportunities for improvement of internal infrastructure offerings, and influence roadmaps of partner teams to build or improve key components we rely on
- Design and execute the roadmap for adoption of new components, and build the pipelines necessary to connect data systems and teams
- Improve automation workflows, data visualization tools, ML enrichment, asset lineage tracking (including for data coming from sophisticated synthetic workflows), data governance and compliance components, and the storage and tracking of synthetic data
- Be hands-on: actively contribute to the stack and implement high-quality code
Bachelor's, Master's, or PhD in Computer Science, Mathematics, Physics, or a related field; or equivalent practical experience.
7+ years of industry experience as a software engineer, including 2+ years as a tech lead/architect specialized in data infrastructure
Proven experience designing, automating, and scaling large data pipelines (petabyte scale desired) using state-of-the-art technologies
Expertise in Python or another modern programming language
Strong ability to design and lead a technical roadmap and to work with cross-functional teams and a diversity of profiles; proven capacity to influence and build alignment
Demonstrated prior experience with foundation models, such as building data infrastructure supporting generative technologies
Self-starter, able to handle ambiguity, identify and mitigate risks, and autonomously find the right people and tools to get the job done
Proven track record of mentoring and growing engineers, and of raising the level of software engineering excellence in an ML data ops team typically focused on short-term deliveries