SRE Middle engineer

Location:

🇷🇴Romania

Partner:

Pandora

Technologies:

Security, ServiceNow, Site Reliability, Java, Python, .NET, System Administration

Seniority:

Middle
  • Overview

    Our client is a Danish jewelry brand and one of the best-known jewelry brands in the world. On our side, we are building a team to improve automation processes and develop a long-term partnership.

    As a Site Reliability Engineer, you’ll work with the team to build an SRE process from scratch for one of the biggest jewelry e-commerce projects in Europe. This includes assessing process maturity across several development teams and implementing observability with tools like NewRelic and OpsGenie for a range of existing on-premise and cloud (Azure) applications: e-commerce (SFCC/SFRA), IBM Sterling OMS, Data & Analytics, and ERP.

  • Responsibilities

    • Creating metric- and log-based monitors and dashboards (NewRelic) and alerting capabilities (OpsGenie)
    • Defining SLOs and measuring SLIs and error budgets for production applications/services
    • Improving operational KPIs like MTTD/MTTR, service availability, and reliability
    • Onboarding production applications/services to the SRE process
    • Ensuring site performance and capabilities by participating in performance, load, and stress testing
    • Evangelizing SRE’s mission across the company, including cloud engineering best practices and operational readiness
    • Working with engineering teams to refine deployment and release processes
    • Monitoring and stress testing systems to collect metrics for tuning and capacity planning
    • Automating detection and resolution of recurring issues (problem management)
    • Ensuring safety, predictability, repeatability, and suitability of all build and deploy processes
    • Automating repetitive tasks and preventing incident recurrence
  • We require

    • Experience with Azure Cloud
    • Experience with one or more of: Salesforce Commerce Cloud (SFCC, Demandware), IBM Sterling OMS, MS Dynamics ERP, web applications, REST APIs, event-driven architecture (Kafka)
    • Experience with either .NET or Java software and systems
    • Expert knowledge in all aspects of designing, developing, and managing large real-time systems
    • Comfortable scripting and debugging distributed web-based applications
    • Natural collaboration skills and an eye for continuous improvement
    • Fluent in scalability and root cause analysis exercises (blameless RCA, postmortems)
    • Dedicated to continuous integration and improving processes (creating and improving ADO Pipelines)
    • Strong hands-on technical experience in software deployment and operations on public cloud platforms, CI/CD, deployment automation, and pipelines

    Will be a plus:

     

    • Experience with incident command/management (ServiceNow), ITSM and ITIL frameworks
    • Experience training and educating the wider engineering organization on infrastructure and internal tooling (Azure cloud, NewRelic, Azure DevOps Pipelines), and writing runbooks/SOPs and ‘solution design articles’ for SRE/Support cases
    • Proactiveness and persistence in driving the team’s tasks to completion with stakeholders inside the company as well as with third-party vendors
    • Extreme ownership and knowledge sharing within the organization
    • Ability to explain complex technical problems in simple words

Join our team!

Send us your CV and we will contact you as soon as possible.
