7 Reliability Engineer jobs in Bahrain
Site Reliability Engineer
Posted 4 days ago
Job Description
Key Responsibilities:
- Design, build, and maintain scalable and highly available systems.
- Implement and manage infrastructure as code (IaC) using tools like Terraform and Ansible.
- Develop and maintain CI/CD pipelines for automated deployments.
- Implement and manage robust monitoring, logging, and alerting systems.
- Conduct performance analysis and capacity planning for production environments.
- Automate operational tasks and processes to improve efficiency.
- Participate in incident response and conduct post-mortems to prevent recurrence.
- Collaborate with development teams to ensure system reliability and performance.
- Troubleshoot and resolve complex infrastructure and application issues.
- Contribute to security best practices for infrastructure management.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related technical field.
- Minimum of 5 years of experience in DevOps, SRE, or a related systems engineering role.
- Proficiency with cloud computing platforms (AWS, Azure, GCP).
- Strong experience with containerization technologies (Docker, Kubernetes).
- Expertise in scripting and programming languages (Python, Go, Bash).
- Hands-on experience with IaC tools (Terraform, Ansible, Chef, Puppet).
- Solid understanding of networking protocols and concepts.
- Experience with monitoring tools (Prometheus, Grafana, Datadog).
- Excellent problem-solving and troubleshooting skills.
- Ability to work effectively in a remote, collaborative team environment.
Lead Site Reliability Engineer
Posted 1 day ago
Job Description
Responsibilities:
- Lead the design, implementation, and management of scalable and reliable infrastructure.
- Develop and enforce site reliability engineering best practices, including monitoring, alerting, and incident response.
- Architect solutions for high availability, disaster recovery, and business continuity.
- Automate operational tasks, deployments, and scaling processes.
- Conduct root cause analysis for production incidents and implement preventive measures.
- Collaborate with development teams to ensure the reliability and operability of new features and services.
- Mentor and guide junior SRE team members, fostering a culture of continuous learning and improvement.
- Define and track key reliability metrics, including service level indicators (SLIs) and service level objectives (SLOs).
- Manage and optimize cloud infrastructure (e.g., AWS, Azure, GCP).
- Develop and maintain infrastructure-as-code (IaC) solutions.
- Participate in on-call rotation and respond to critical incidents.
- Contribute to security best practices and ensure system compliance.
- Evaluate and integrate new technologies to enhance reliability and efficiency.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field; Master's degree preferred.
- Minimum of 8 years of experience in Site Reliability Engineering, Systems Engineering, or DevOps.
- Proven experience leading and mentoring engineering teams.
- Strong proficiency in at least one major cloud platform (AWS, Azure, GCP).
- Extensive experience with containerization technologies (Docker, Kubernetes).
- Deep understanding of scripting and programming languages (e.g., Python, Go, Bash).
- Expertise in monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Experience with CI/CD pipelines and tools.
- Solid understanding of networking protocols and security principles.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration abilities.
- Experience with large-scale distributed systems is highly desirable.
Remote Site Reliability Engineer
Posted 1 day ago
Job Description
Key Responsibilities:
- Design, build, and maintain scalable and reliable infrastructure.
- Automate deployment, monitoring, and operational tasks.
- Implement and manage CI/CD pipelines.
- Develop and maintain robust monitoring, logging, and alerting systems.
- Respond to and resolve production incidents, ensuring minimal downtime.
- Conduct root cause analysis for system failures.
- Perform capacity planning and performance tuning.
- Collaborate with development teams to improve system design and operability.
- Contribute to security best practices and implementation.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Minimum of 4 years of experience in Site Reliability Engineering, DevOps, or Systems Engineering.
- Strong experience with cloud platforms (AWS, Azure, or GCP).
- Proficiency in containerization technologies like Docker and Kubernetes.
- Skilled in scripting and programming languages such as Python, Go, or Bash.
- Experience with monitoring tools (e.g., Prometheus, Grafana, Datadog).
- Solid understanding of networking principles and protocols.
- Experience with CI/CD tools and practices.
- Excellent problem-solving and troubleshooting skills.
- Strong communication and collaboration abilities in a remote environment.
Senior Site Reliability Engineer (SRE)
Posted today
Job Description
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related technical field.
- Minimum of 6 years of experience in system administration, operations, or site reliability engineering.
- Proven experience with cloud platforms (AWS, Azure, GCP) and associated services.
- Strong proficiency in scripting languages such as Python, Go, or Bash.
- Experience with containerization technologies (Docker, Kubernetes) and orchestration.
- Knowledge of monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK stack).
- Understanding of networking concepts (TCP/IP, DNS, HTTP/S, load balancing).
- Experience with Infrastructure as Code (IaC) tools (e.g., Terraform, Ansible).
- Excellent problem-solving, analytical, and troubleshooting skills.
- Strong communication and collaboration skills, ability to work effectively in a remote team.
Senior Site Reliability Engineer (Construction Technology)
Posted 2 days ago
Job Description
Key Responsibilities:
- Design, build, and maintain scalable and reliable infrastructure supporting construction management software.
- Develop and implement automated deployment pipelines (CI/CD) for infrastructure and applications.
- Monitor system performance, identify potential issues, and implement proactive solutions to prevent outages.
- Manage cloud infrastructure (e.g., AWS, Azure, GCP) focusing on cost-optimization and security.
- Develop and maintain robust disaster recovery and business continuity plans.
- Collaborate with development and operations teams to troubleshoot and resolve complex technical issues.
- Implement and manage infrastructure-as-code solutions (e.g., Terraform, Ansible).
- Conduct root cause analysis for incidents and implement preventative measures.
- Optimize application and system performance through tuning and capacity planning.
- Document system architecture, operational procedures, and incident reports.
- Champion best practices in site reliability engineering within the construction technology domain.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in Site Reliability Engineering, DevOps, or Systems Administration with a focus on reliability.
- Proven experience with cloud platforms (AWS, Azure, or GCP).
- Strong proficiency in scripting languages such as Python, Bash, or Go.
- Experience with containerization technologies like Docker and Kubernetes.
- In-depth knowledge of CI/CD tools and practices.
- Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Understanding of networking concepts and protocols.
- Experience with database administration and performance tuning.
- Excellent problem-solving and debugging skills.
- Ability to thrive in a fast-paced, remote work environment.
- Knowledge of construction industry workflows is a plus.
Senior Site Reliability / Gitops Engineer
Posted 7 days ago
Job Description
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers, and industry leaders in many sectors. The company is a pioneer of global distributed collaboration, with 1200+ colleagues in 75+ countries and very few office-based roles. Teams meet two to four times yearly in person, in interesting locations around the world, to align on strategy and execution.
The company is founder-led, profitable, and growing.
We are hiring a Senior Site Reliability / Gitops Engineer for our Information Systems (IS) team. This role is an opportunity for an "automation-first" senior technologist with a passion for Linux to build a career with Canonical and drive the success of those using Ubuntu and open source products. If you have experience with IT operations automation and Infrastructure as Code, and a passion for technology, then you will enjoy working with some of the best people in the industry at Canonical.
Job Summary
The IS team at Canonical supports and maintains all of Canonical's IT production services. The team is in charge of running services used by over 60 million Ubuntu users.
As a Senior SRE & GitOps engineer, you'll be in a unique position to drive operations automation to the next level, both in our own private clouds and in public clouds. We do this by using the best open source infrastructure-as-code software, software development practices such as CI/CD pipelines, and Canonical's leading products for software operation automation.
In addition to defining the infrastructure as code, you will improve Canonical products and the open-source technologies they're based on by providing critical feedback to developers on how their products operate at scale. This is done by submitting bugs (and sometimes writing pull requests) and collaborating on design and implementation with other teams within the company.
You'll be part of a global team of SREs that work together and support each other to provide the best possible services to our company, Canonical's customers and the Ubuntu Community.
As a Senior Site Reliability / Gitops Engineer you will
- Drive the development of automation and GitOps in your team as an embedded tech lead
- Closely collaborate with the IS architect to align your solutions with the IS architecture vision
- Design and architect services that IS can offer to the organization as products
- Apply your experience of IaC to develop infrastructure as code practice within IS by constantly increasing automation and improving IaC processes
- Automate software operations for re-usability and consistency across private and public clouds, taking into consideration the complexities of distributed systems
- Maintain operational responsibility for all of Canonical's core services, networks, and infrastructure
- Develop skills in troubleshooting, capacity planning, and performance investigation
- Set up, maintain, and use observability tools such as Prometheus, Grafana, and Elasticsearch; design, implement, and maintain monitoring and alerting for various systems and services
- Provide assistance and work with globally distributed engineering, operations, and support peers
- Be given uninterrupted development time to focus on larger projects and automation of manual tasks
- Share your experience, know-how and best practices with other team members in design sessions, mentorship and 'doing work together'
- Carry final responsibility for time-critical escalations
- A modern view on hosting architecture, driven by infrastructure as code across both private and public clouds.
- A product mindset, striving to develop products rather than solutions.
- Python software development experience, with large projects
- Experience working with Kubernetes or other container orchestration systems.
- Proven experience managing and deploying cloud infrastructure with code.
- Practical knowledge of Linux networking, routing, and firewalls
- Affinity with various forms of Linux storage, from Ceph to databases
- Hands-on experience administering enterprise Linux servers
- Extensive knowledge of cloud computing concepts and technologies
- Bachelor's degree or greater, preferably in computer science or related engineering field
- Able to communicate clearly and effectively in English over email, chat, video or voice calls and in-person
- Motivated and able to troubleshoot from kernel to web, and willing to ask others when appropriate
- A willingness to be flexible and able to learn new things quickly
- Be inspired by the needs of fast-changing environments
- Happy to work within distributed teams
- Be passionate about and familiar with open source, especially Ubuntu or Debian
We consider geographical location, experience, and performance in shaping compensation worldwide. We revisit compensation annually (and more often for graduates and associates) to ensure we recognize outstanding performance. In addition to base pay, we offer a performance-driven annual bonus or commission. We provide all team members with additional benefits which reflect our values and ideals. We balance our programs to meet local needs and ensure fairness globally.
- Distributed work environment with twice-yearly team sprints in person
- Personal learning and development budget of USD 2,000 per year
- Annual compensation review
- Recognition rewards
- Annual holiday leave
- Maternity and paternity leave
- Team Member Assistance Program & Wellness Platform
- Opportunity to travel to new locations to meet colleagues
- Priority Pass and travel upgrades for long-haul company events
Canonical is a pioneering tech firm at the forefront of the global move to open source. As the company that publishes Ubuntu, one of the most important open-source projects and the platform for AI, IoT, and the cloud, we are changing the world of software. We recruit on a global basis and set a very high standard for people joining the company. We expect excellence; in order to succeed, we need to be the best at what we do. Most colleagues at Canonical have worked from home since our inception in 2004. Working here is a step into the future and will challenge you to think differently, work smarter, learn new skills, and raise your game.
We are proud to foster a workplace free from discrimination. Diversity of experience, perspectives, and background creates a better work environment and better products. Whatever your identity, we will give your application fair consideration.
- Seniority level: Mid-Senior level
- Employment type: Full-time
- Job function: Engineering and Information Technology
- Industries: Software Development