Reliability Engineer

Mujeres en Negocios & Finanzas
Department: Data Engineer
Type: Remote
Region: Australia
Location: Australia
Experience: Mid-Senior level
Estimated Salary: A$110,000 - A$150,000
Skills: System Architecture, Cloud Computing, Monitoring Tools, Scripting, Automation, Containerization, Infrastructure as Code, Databases, Incident Response, Performance Optimization, AWS, Google Cloud, Azure

Job Description

Posted on: January 31, 2026

🗂 We’re Hiring: Reliability Engineer

🕒 Employment Type: Full-Time

💼 Level: Mid-Level / Senior

We are seeking a skilled and detail-oriented Reliability Engineer to join our team. As a Reliability Engineer, you will be responsible for ensuring the continuous performance and reliability of our systems, applications, and infrastructure. You will work closely with cross-functional teams to identify areas for improvement, automate processes, and resolve issues before they impact customers. Your role will be critical in driving long-term system stability, improving uptime, and enhancing operational efficiency. If you have a strong technical background, a proactive mindset, and a passion for optimizing system performance, we’d love to hear from you!

🎯 Key Responsibilities:

  • Design, implement, and maintain monitoring systems to ensure the reliability and performance of infrastructure, applications, and services.
  • Develop and deploy automated solutions to detect, troubleshoot, and resolve system and application issues.
  • Identify and address potential failure points in systems to proactively improve system reliability and uptime.
  • Collaborate with DevOps, engineering, and product teams to build scalable, highly available, and fault-tolerant systems.
  • Conduct root cause analysis of incidents and problems, developing long-term solutions to prevent recurrence.
  • Develop and maintain disaster recovery plans, ensuring systems can recover quickly and efficiently from failures.
  • Perform capacity planning, ensuring systems have the necessary resources to scale with user demand.
  • Analyze system logs, metrics, and trends to identify opportunities for optimization and performance tuning.
  • Participate in on-call rotations to provide timely responses to system outages or performance degradation.
  • Drive continuous improvement initiatives through data-driven analysis and performance feedback.
  • Create and maintain documentation for reliability-related processes, systems, and incidents.
  • Contribute to creating a culture of reliability within the engineering and operations teams, promoting best practices and a focus on system health.

✅ Requirements:

  • Proven experience as a Reliability Engineer, Site Reliability Engineer (SRE), DevOps Engineer, or in a similar role.
  • Strong knowledge of system architecture, cloud computing, and high-availability systems.
  • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, New Relic).
  • Proficiency with scripting and automation tools (e.g., Python, Bash, Terraform, Ansible).
  • Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Familiarity with infrastructure-as-code (IaC) practices and tools (e.g., AWS CloudFormation, Terraform).
  • Solid understanding of databases, caching systems, and load balancing.
  • Experience with incident response, root cause analysis, and post-mortem processes.
  • Ability to analyze and optimize performance at both the system and application levels.
  • Knowledge of cloud platforms (AWS, Google Cloud, Azure) and distributed systems.
  • Strong problem-solving skills and the ability to handle complex technical issues.
  • Excellent communication skills, both written and verbal, with the ability to collaborate across teams.
  • Ability to manage multiple priorities and meet deadlines in a fast-paced environment.
  • A degree in Computer Science, Engineering, or a related field is preferred.
  • Relevant certifications (e.g., AWS Certified Solutions Architect, Google Professional Cloud Architect) are a plus.

🌟 What We Offer:

  • A dynamic, collaborative, and innovative work environment.
  • Opportunities for career growth and development in system reliability and DevOps practices.
  • Access to cutting-edge technologies and tools in a growing tech company.
  • Competitive compensation and benefits package.
  • A team culture that values continuous learning, reliability, and automation.
  • Flexible work arrangements, including remote work options.
  • Ongoing training opportunities to enhance your technical expertise and certifications.
Originally posted on LinkedIn
