Maxar is currently seeking a Site Reliability Engineer (SRE) to join our team in our Longmont and Westminster, CO locations. This position's primary location is Westminster with some days in the Longmont office.
Life with Us
Your Project: Site Reliability Engineering is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. In this position, you will ensure that Maxar's services have reliability and uptime appropriate to customers' needs and a fast rate of improvement, while keeping an ever-watchful eye on capacity and performance. You will bring the mindset and engineering approaches needed to run better production systems, ideally building your own creative engineering solutions to operations problems. Much of our software development focuses on optimizing existing systems, building infrastructure, and eliminating work through automation. You will focus on operational procedures, code fixes, and increasing the automation, repeatability, and consistency of operational tasks. The successful candidate has the breadth of knowledge to solve complex problems across the entire technology stack.
Your Career: We are very serious about professional development and continuing education at Maxar. We offer our team members the opportunity to define their own career trajectory. Our group has amazing resources to support learning and development. You will work with your manager, or a mentor, to set goals and design a development plan to advance your career.
What We Offer: Time for dedicated professional development, conference attendance, corporate partner and industry training, peer group collaboration, hackathons, as well as paid certifications, education reimbursement and student loan forgiveness.
What you'll do day-to-day (with your colleagues):
Provide service acceptance by adopting new processes into operations; developing new monitoring or exposure of events; training operations teams on the actions to take in response to those events; and creating automated fixes for repeatable corrective actions
Monitor and report on service level objectives for system-wide application and infrastructure services. Work with service and product owners to establish KPIs that identify trends and quantify whether we are improving at the site/system level
Define standards for configuration, monitoring, reliability, and performance
Participate actively and critically in retrospectives for incidents that had broad impact and/or are leading indicators of potential site issues
Provide deep troubleshooting for production issues
Engage with service owners on root cause analysis for service interruption recovery and create preventive measures
Serve as an on-call resource, providing support to the operations team when the need arises
Design and architect operational solutions for managing applications and infrastructure
Minimum requirements for this position:
Must be a U.S. Citizen who is willing and able to obtain U.S. Government Secret security clearance
Associate's degree in computer science, information systems, or a related field. Two years of experience may be substituted for a degree
Minimum of 5 years of experience troubleshooting software engineering issues
Experience with Unix/Linux systems, with a high comfort level at the command line
Proficient with at least one programming language (e.g., Python, Ruby, Java)
These skills would be amazing:
Understanding of K8s/Docker and automated deployment via pipeline (Concourse or Jenkins)
Familiarity with infrastructure as code and the AWS cloud platform
Familiarity with distributed version control systems such as Git
A knack for troubleshooting tough problems, backed by a high level of ownership and curiosity
Ability to effectively prioritize work and encourage best practices in others
Meticulous and cautious with an ability to identify and consider all risks and balance those with performing the task efficiently
Positive, flexible, and personable; adaptive to change
Good understanding of networking fundamentals
"Make it happen" attitude
Organized with an ability to document and communicate ongoing work tasks and projects
Comfortable giving, receiving, and implementing feedback in a highly collaborative environment
Ability to learn rapidly in a fast-paced environment while being extremely curious about how things work
The compensation range for this position is $89,250 to $148,750 annually, dependent on skills and experience.
As a federal contractor, Maxar is required to comply with the vaccination and other COVID-19 requirements in President Biden's September 9 Executive Order on Ensuring Adequate COVID Safety Protocols for Federal Contractors. Accordingly, all Maxar team members must be fully vaccinated for COVID-19 no later than January 4, 2022, except in cases where a team member is legally entitled to an accommodation.
Maxar Technologies values diversity in the workplace and is an equal opportunity/affirmative action employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected veteran status, age, or any other characteristic protected by law.