As a growing deep-tech startup in the semiconductor industry, we are seeking a Data Platform Engineer to develop and maintain an in-house data ingestion and analytics platform.
Key Responsibilities
Custom SOP-driven Data Ingestion (Upstream):
Custom S3 Data Lake Management (Infra/Platform):
Analytics / ML (Downstream):
Some expected tasks:
- Support the organization-wide onboarding of new SOPs that require data contracts.
- Develop workflows and test strategies to ensure end-to-end data quality.
- Support data producers with troubleshooting and upload guidance.
- Develop custom UI and data tools.
- Monitor and debug ingestion failures, ensuring high data quality and consistency.
- Perform exploratory data analysis, statistical reporting, and machine learning modeling on curated datasets.
- Document SOP onboarding processes, data validation rules, and platform workflows.
- Collaborate with scientific and engineering teams to plan future platform improvements.
Desired Skills & Qualifications
- Strong proficiency in Python and data validation workflows.
- S3-like Object Storage: Experience managing and using object storage solutions for data management.
- Cloud Computing: Knowledge of setting up and configuring cloud compute instances, ensuring efficient resource allocation and deployment.
- Knowledge of schema validation (e.g., JSON Schema) and ETL/data ingestion patterns.
- Ability to design, validate, and evolve ETL and data pipelines.
- Experience with data cleaning, wrangling, and quality control processes.
- Comfortable communicating and presenting in English.
Preferred Qualifications
- Background in engineering, physics, or a similar technical domain.
- Experience with machine learning tools (e.g., scikit-learn).
- Familiarity with laboratory workflows or process data environments.
- Experience with visualization libraries (e.g., Matplotlib, Seaborn, Plotly).
- Understanding of modern tooling (CI/CD, version control, testing frameworks).
Personal Attributes
- Structured and detail-oriented.
- Strong communicator who enjoys supporting colleagues.
- Thrives in a cross-functional, highly dynamic environment.