Site Reliability Engineer
Other, Eastern Europe
Devops, HTML / CSS, JS, Salesforce, Security, Site Reliability

Our client is a Danish jewelry brand, one of the most famous in the world. We are building a team to help improve the reliability of its systems and to develop a long-term partnership.

As a Site Reliability Engineer specializing in e-commerce, with proficiency in Salesforce Commerce Cloud (SFCC) and Storefront Reference Architecture (SFRA), you will work with product teams to mature engineering practices and processes within one of Europe's most prominent jewelry e-commerce initiatives.

You will work closely with development teams to refine architecture and development practices and to establish industry-leading practices for testing and deploying e-commerce solutions to production. You will advise on improving system observability, increasing performance, and streamlining integration with Order Management Systems (OMS), Enterprise Resource Planning (ERP), and fulfillment services. Your work will be pivotal in minimizing production disruptions and improving the customer experience.

If you have expertise in e-commerce systems, particularly SFCC/SFRA, this is an opportunity to make a significant impact on one of the world's largest online and omni-channel retail platforms built on this cloud-based e-commerce solution.

Responsibilities:

  • Implementing observability, setting up metric/log based monitoring and alerting

  • Implement monitoring and logging solutions to proactively identify and address performance bottlenecks

  • Defining SLOs and measuring SLIs and error budgets for production applications/services

  • Improving operational KPIs like MTTD/MTTR, service availability & reliability

  • Collaborating with product teams to optimise processes throughout the product life cycle

  • Verifying system performance and scalability by participating in performance, load, and stress testing

  • Evangelizing the SRE mission within the company, including cloud engineering best practices and operational readiness

  • Work with engineering teams to refine deployment and release processes

  • Monitor and stress test systems to collect metrics for tuning and capacity planning

  • Work to automate detection and resolution of recurring issues (problem management)

  • Ensure safety, predictability, repeatability, and suitability of all build and deploy processes

  • Eliminate toil by automating repetitive tasks

  • Implement self-healing automation to handle known errors and prevent incident reoccurrence

Requirements:

  • Experience with e-commerce projects

  • Understanding of Infrastructure as Code using tools like Terraform or ARM templates to automate the deployment of AKS clusters

  • Strong hands-on technical experience in software deployment and operations on public Cloud platforms, CI/CD, deployment automation, and Pipelines

  • Experience scaling and securing microservices based architectures

  • Experience with Node.js (or other JavaScript runtime environments) and REST APIs

  • Understanding of event streaming (Kafka or alternatives such as RabbitMQ, Apache ActiveMQ Artemis, IBM MQ, Apache Pulsar)

  • Knowledge of best practices across the product lifecycle, from solution design, development, and testing to operating large-scale real-time systems

  • Fluent in scalability analysis and root cause analysis exercises (blameless RCA, postmortems)

  • Comfortable scripting and debugging distributed web-based applications

Nice to have:

  • Experience with Salesforce Commerce Cloud (SFCC)

  • Expertise in e-commerce systems (Storefront Reference Architecture (SFRA))

  • Experience implementing feature flagging and integrating with APM tools (New Relic, Datadog, Dynatrace, AppDynamics)

  • Experience within Microsoft Azure, containerisation and Azure Kubernetes Service (AKS)

  • A strong understanding of release automation/continuous integration and trunk-based development, with experience coaching engineers to adopt automation and full-stack feature flagging to achieve measurable efficiency gains

  • Experience with CDNs: caching, security features

  • Experience with e-commerce integrations (PIM, CMS, OMS, etc.)

  • Experience with incident command/management (ServiceNow), ITSM and ITIL processes

  • Experience in training and coaching engineering teams on cloud services, tooling and best practices 

  • Proactiveness and persistence in driving the team’s tasks to completion with stakeholders inside the company as well as with third-party vendors
