Site Reliability Engineer
AIMS is a Norwegian technology start-up with global ambitions: we are on a mission to automate IT Operations by providing a state-of-the-art platform for Artificial Intelligence in IT Operations.
About the Role:
Are you obsessed with making large systems reliable and scalable? Do you think keeping a system running seamlessly 24/7 with no maintenance outages is a fun intellectual challenge? If so, we would like to talk to you!
Responsibilities:
Define and manage Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets for the AIMS product suite.
Build monitoring that proactively alerts on leading indicators rather than on outages.
Document everything, so your findings turn into repeatable actions and then into automation. Relentlessly automate repetitive tasks.
Use the AIMS product to monitor AIMS itself, to improve the product as much as possible
Improve operational processes (such as deployments and upgrades) to make them as boring and repeatable as possible.
Design, build and maintain core infrastructure that enables AIMS to scale to support hypergrowth
Drive the capacity model for AIMS services.
Debug production issues across services and levels of the stack.
Requirements:
Minimum 5 years of relevant experience
Strong programming skills, including shell scripting and at least one additional scripting language
Ability to collaborate and communicate asynchronously.
Have an urge to document all the things so you don't need to learn the same thing twice.
Have an urge to automate all manual processes, particularly boring or error-prone tasks
Have an enthusiastic, go-for-it attitude. When you see something broken, you can't help but fix it.
Have an urge to deliver quickly and effectively, and to iterate fast.
Have experience with Docker, Kubernetes, Terraform, or similar technologies
Planning: familiarity with agile methodologies; use epics and issues to drive projects.
About AIMS:
AIMS is a Norwegian technology start-up with global ambitions. Our core software, AIMS AIOps, enhances IT Operations teams' decision-making by contextualizing large volumes of varied and volatile data. The complexity of IT systems has grown rapidly over the past several years. As a result, IT Operations teams that lack modern skill sets, processes, and tooling struggle to achieve complete visibility into the digital services they provide to customers. AIMS has invested $10 million in the AIOps platform and is on a mission to automate IT Operations by providing a state-of-the-art platform for Artificial Intelligence in IT Operations.
Department: Engineering
Locations: Oslo
Remote status: Hybrid Remote
Workplace & culture
Become part of a rapidly growing, groundbreaking AIOps company. We promote a healthy work-life balance, personal development, and career growth. We strive to push innovation in the performance monitoring space and to help our customers succeed with their digital businesses.