Are you excited by the challenge of building scalable data infrastructure that powers search, browse, and recommendation systems used by millions of users? We are setting up a brand-new Data Engineering team within a leading European online fashion and beauty retailer to take ownership of the key pipelines that fuel the company's ML and data-driven services.
Our initial focus will be on migrating and improving data pipelines currently maintained by our ML Platform team (Databridge). We’ll start by resolving technical debt, decoupling dependencies, and establishing engineering excellence within our own domain. This is a great opportunity to build foundational systems from the ground up — the right way.
Responsibilities:
Take ownership of existing data pipelines, migrating them from Databridge to our new team
Address and resolve high-priority technical debt in Spark-based ETL systems
Build and maintain scalable, production-grade data pipelines using PySpark, SQL, and Airflow
Work within Databricks to optimize workflows and data processing
Collaborate with ML and product teams to ensure reliable delivery of high-quality data
Contribute to establishing best practices, documentation, and engineering processes within the team
Support data modeling efforts to improve downstream data usability
Requirements:
Strong hands-on experience with PySpark and Apache Spark
Proficiency with Airflow for orchestrating data pipelines
Solid SQL skills for data transformation and validation
Experience working with Databricks in a production environment
Familiarity with data modeling concepts (desirable but not required)
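To give a flavor of the SQL validation work mentioned above, here is a minimal, illustrative sketch of the kind of data-quality check a pipeline might run before publishing a table downstream. It uses the Python standard library's sqlite3 as a stand-in for the actual warehouse; the table and column names are hypothetical, not taken from our real systems.

```python
import sqlite3

# Hypothetical product table; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        category   TEXT,
        price      REAL
    )
""")
conn.executemany(
    "INSERT INTO products (product_id, category, price) VALUES (?, ?, ?)",
    [
        (1, "beauty", 19.99),
        (2, "fashion", 49.50),
        (3, None, 12.00),       # missing category -> should fail validation
        (4, "fashion", -5.00),  # negative price   -> should fail validation
    ],
)

# Validation query: count rows violating basic quality rules.
bad_rows = conn.execute("""
    SELECT COUNT(*) FROM products
    WHERE category IS NULL OR price < 0
""").fetchone()[0]

print(f"rows failing validation: {bad_rows}")
```

In a production pipeline a check like this would typically gate the load step, failing the Airflow task (and alerting the team) rather than silently publishing bad data.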