Job Description
The Data Engineer role exists to ensure that all test‑lab data is trustworthy, accessible, scalable, and usable for engineering, validation, analytics, and decision‑making.
The function provides the technical backbone that enables reliable testing, advanced analysis, automation, and long‑term product improvement.
The Data Engineer is responsible for designing and operating the data infrastructure that supports all laboratory test activities.
This includes connecting test equipment, building pipelines for high‑frequency and large‑volume datasets, ensuring data quality, and enabling engineers, analysts, and scientists to work efficiently with accurate data.
As the laboratory and organization expand, the role will evolve into a leadership function that guides the long‑term data strategy and leads a multidisciplinary data team.
Key Responsibilities
Responsibilities include, but are not limited to:
- Design, build, and maintain data pipelines for high‑frequency lab data
- Integrate test equipment such as battery cyclers (Chroma, Keysight, PEC, PNE), chambers, DAQ systems, and PLCs
- Develop ETL/ELT processes to transform raw → validated → curated datasets
- Build scalable data storage solutions (data lakes, time-series DBs, structured metadata stores)
- Implement data validation, anomaly detection, and quality monitoring
- Automate data processing for reporting, dashboards, and analysis
- Ensure data traceability, version control, and audit compliance
- Work closely with test and validation engineers to understand test profiles, metadata, and measurement methods
- Support lab technicians with tools that simplify workflows and reduce manual data tasks
- Integrate with MES, LIMS, PLM, and other enterprise systems
- Troubleshoot data-related issues in test execution or equipment communication
- Take increasing ownership of the data architecture and the long‑term data roadmap
- Contribute to documentation standards, data governance, and best practices
Requirements
Qualifications and Experience
- Engineering experience in a technical data role (Data Engineering, Data Science, Machine Learning), including data processing, storage, quality, and management on GCP or AWS
- 4+ years of relevant experience
- Project management experience
- Experience in large manufacturing or industrial enterprises with heterogeneous, distributed data sources, demonstrating ability to navigate complexity at scale
Specific Skills and Knowledge
- Proven experience scaling and re‑architecting data platforms and infrastructure to handle rapid growth and increasing data volumes
- Hands-on experience designing and building highly scalable and reliable data architectures using modern cloud and data tooling (e.g., AWS Kinesis, Lambda, Redshift, GCP equivalents; Airflow, dbt; Parquet, Protobuf, Avro)
- Strong programming skills in Python, SQL, and general-purpose scripting for automation, data processing, and integration
- Deep understanding of ETL/ELT frameworks (Airflow, dbt, Spark, etc.) and experience building production-grade data pipelines
- Familiarity with time-series and high‑frequency measurement data, particularly from industrial or test environments
- Cloud engineering experience in AWS, GCP, or Azure, including serverless architectures, distributed storage, and stream processing
- Experience with CI/CD, Git-based workflows, Docker, and robust software engineering practices
- Knowledge of data serialization formats (Parquet, Avro, Protobuf, JSON) and best practices for efficient storage and retrieval
- Experience integrating systems via APIs (e.g., REST); familiarity with industrial communication protocols such as OPC UA and Modbus is a strong plus
- Understanding of machine learning concepts and experience supporting data scientists with structured, high-quality datasets
Domain Knowledge (Preferred)
- Solid engineering foundation (electrical, mechanical, chemical, physical), preferably within the energy, electrical testing, or battery domain
- Understanding of sensor calibration, noise, drift, and data validation