38 Maintenance Reliability jobs in Bahrain
Industrial Equipment Maintenance Technician
Posted 13 days ago
Job Description
Industrial Equipment Maintenance Technician
Posted 14 days ago
Job Description
Senior Industrial Equipment Maintenance Engineer
Posted 7 days ago
Job Description
Responsibilities:
- Oversee the planning, scheduling, and execution of preventive, predictive, and corrective maintenance for all industrial equipment.
- Lead and mentor a team of maintenance technicians, assigning tasks and ensuring work quality.
- Diagnose and troubleshoot complex mechanical, electrical, and hydraulic system failures.
- Perform high-level repairs and overhauls of production machinery and plant equipment.
- Develop and implement maintenance strategies to minimize downtime and optimize equipment performance.
- Ensure compliance with all plant safety regulations, policies, and procedures.
- Manage spare parts inventory and ensure availability of critical components.
- Maintain detailed maintenance logs and records using CMMS (Computerized Maintenance Management System).
- Analyze equipment performance data to identify trends and areas for improvement.
- Participate in equipment installation, commissioning, and upgrades.
- Contribute to capital expenditure planning and equipment modernization projects.
- Train technicians on new equipment and maintenance procedures.
Qualifications:
- Bachelor's degree in Mechanical Engineering, Electrical Engineering, or a related field.
- Minimum of 7 years of progressive experience in industrial equipment maintenance within a manufacturing or heavy industry setting.
- Proven expertise in troubleshooting and repairing a wide range of industrial machinery (e.g., CNC machines, conveyors, automation systems, heavy presses).
- Strong knowledge of mechanical, electrical, hydraulic, and pneumatic systems.
- Experience with CMMS software and inventory management.
- Proficiency in reading and interpreting technical drawings, schematics, and manuals.
- Excellent leadership, team management, and communication skills.
- Strong analytical and problem-solving abilities.
- Commitment to workplace safety and environmental regulations.
- Ability to work flexible hours and respond to urgent maintenance needs.
Site Reliability Engineer
Posted today
Job Description
An SRE is responsible for keeping all user-facing and internal services running smoothly. An SRE blends the roles of software engineer and systems administrator, applying infrastructure knowledge to improve both the team and the quality of the product.
A person in this position will know and specialize in the systems that keep the company afloat, making sure that their availability, reliability and scalability are in peak condition.
Job Expectations
- Triage and handle node health issues during working hours
- Participate in firefighting alongside development engineers
- Own the design, execution, and support of the product's deployment topology through infrastructure as code
- Own and maintain the distribution, scaling, metrics collection, and monitoring of multiple clusters
- Act as a stakeholder in supporting engineers as they define resourcing for the services they are building
- Own the running of our CI/CD systems and work with the Testing Engineers to create a well-tested product
- Improve and own operational processes
- Know and focus on the security of the topologies running in production
- Plan the growth of the infrastructure based on business needs and inputs
Required Skills
- Kubernetes, Docker, and Helm
- Very comfortable operating in Linux, including knowledge of Bash
- Experience with a cloud hosting platform (ideally GCP, but AWS or Azure is acceptable)
- Able to write code in Python
- Experience deploying and maintaining modern CI/CD systems (Zuul, CircleCI, Concourse, etc.)
- Knowledge of and passion for infrastructure as code
Job Type: Full-time
Senior Reliability Engineer
Posted 14 days ago
Job Description
Responsibilities include maintaining reliability databases, developing and updating technical documentation, and conducting training for maintenance personnel. You will also be involved in the evaluation of new technologies and equipment to ensure their reliability and maintainability. A strong understanding of mechanical and electrical systems, coupled with expertise in reliability engineering principles and tools, is essential. Proficiency in data analysis software and experience with CMMS (Computerized Maintenance Management Systems) are highly desirable. The ideal candidate will possess a Bachelor's degree in Mechanical Engineering, Electrical Engineering, or a related discipline, with at least 5 years of experience in reliability engineering within an industrial setting. Excellent problem-solving skills, strong analytical abilities, and effective communication skills are crucial for this role. This is a challenging and rewarding opportunity to significantly contribute to operational excellence and asset management.
Site Reliability Engineer (SRE)
Posted 1 day ago
Job Description
Responsibilities:
- Design, build, and maintain highly reliable and scalable production systems.
- Implement and manage monitoring, alerting, and logging solutions.
- Define and track Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
- Automate operational tasks through scripting and development.
- Lead incident response and post-mortem analysis to prevent recurrence.
- Conduct capacity planning and performance tuning of systems.
- Collaborate with development teams to ensure the operability of new features.
- Implement and maintain CI/CD pipelines for reliable deployments.
- Develop and execute disaster recovery plans.
- Contribute to infrastructure security and compliance efforts.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience.
- 5+ years of experience in SRE, Systems Engineering, or Software Engineering with a focus on reliability.
- Proficiency in at least one programming language (e.g., Python, Go, Java).
- Experience with cloud platforms (AWS, Azure, GCP).
- Familiarity with containerization and orchestration technologies (Docker, Kubernetes).
- Strong understanding of Linux/Unix systems.
- Experience with monitoring tools (e.g., Prometheus, Grafana, Datadog).
- Knowledge of networking, databases, and distributed systems.
- Excellent problem-solving and debugging skills.
- Ability to work effectively in a team environment.
Senior Site Reliability Engineer
Posted 8 days ago
Job Description
The ideal candidate will possess a deep understanding of distributed systems, cloud computing platforms (e.g., AWS, Azure, GCP), and containerization technologies (e.g., Docker, Kubernetes). You should have a strong background in scripting and automation (e.g., Python, Go, Bash) and a proven ability to troubleshoot complex production issues. Experience with CI/CD pipelines, infrastructure as code (e.g., Terraform, Ansible), and performance tuning is highly valued. You will work closely with development and operations teams to embed reliability best practices into the software development lifecycle. Excellent communication and problem-solving skills are essential, as is the ability to work effectively in both remote and on-site settings. Your contributions will be vital in maintaining high standards of service uptime and performance for our client's users.
Key Responsibilities:
- Design, implement, and maintain scalable and reliable cloud infrastructure.
- Develop automation tools and scripts to streamline operations and deployments.
- Build and manage robust monitoring, alerting, and logging systems.
- Lead incident response efforts, conduct post-mortems, and implement preventative measures.
- Collaborate with development teams to improve system design and performance.
- Manage and optimize container orchestration platforms like Kubernetes.
- Implement and maintain Infrastructure as Code (IaC) solutions.
- Perform performance tuning and capacity planning.
- Ensure security best practices are integrated into all aspects of infrastructure management.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- 5+ years of experience in Site Reliability Engineering, DevOps, or a similar role.
- Strong proficiency in at least one scripting language (Python, Go, Bash).
- Extensive experience with cloud platforms (AWS, Azure, GCP).
- Deep understanding of containerization technologies (Docker, Kubernetes).
- Experience with CI/CD tools and practices.
- Familiarity with Infrastructure as Code tools (Terraform, Ansible).
- Excellent troubleshooting and problem-solving skills.
- Strong communication and collaboration abilities, suitable for a hybrid work environment.
Senior Site Reliability Engineer
Posted 14 days ago
Job Description
The ideal candidate will have extensive experience with cloud platforms (AWS, Azure, GCP), containerization technologies (Docker, Kubernetes), and infrastructure-as-code tools (Terraform, Ansible). You should be proficient in scripting languages such as Python, Go, or Bash, and have a strong background in system administration and networking. Responsibilities include designing and implementing robust monitoring and alerting systems, developing automation tools to reduce manual operational effort, participating in on-call rotations, and leading incident post-mortems to identify root causes and implement preventative measures. Collaboration with development teams to ensure production readiness of new features and services is a key aspect of this role. A Bachelor's degree in Computer Science, Engineering, or a related field is required, along with a minimum of 5 years of experience in SRE, DevOps, or a similar role. Strong problem-solving skills, excellent communication abilities, and a proactive approach to system resilience are essential. Experience with CI/CD pipelines and application performance monitoring (APM) tools is highly desirable.
Key Responsibilities:
- Design, build, and maintain scalable and reliable production systems.
- Implement and manage monitoring, alerting, and logging solutions.
- Automate infrastructure provisioning and configuration management using IaC tools.
- Develop and maintain CI/CD pipelines for efficient software deployment.
- Respond to and resolve production incidents, leading post-mortems.
- Collaborate with development teams to ensure system reliability and performance.
- Perform capacity planning and performance tuning of distributed systems.
- Manage and scale container orchestration platforms (e.g., Kubernetes).
- Develop and maintain system documentation and runbooks.
- Participate in an on-call rotation schedule.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in Site Reliability Engineering, DevOps, or System Administration.
- Proficiency with cloud platforms (AWS, Azure, GCP).
- Strong experience with containerization (Docker) and orchestration (Kubernetes).
- Expertise in infrastructure-as-code tools (Terraform, Ansible).
- Proficient scripting skills (Python, Go, Bash).
- Solid understanding of networking concepts and protocols.
- Experience with monitoring tools (Prometheus, Grafana, Datadog).
- Excellent troubleshooting and problem-solving abilities.
- Strong communication and collaboration skills.
Remote Site Reliability Engineer
Posted 15 days ago
Job Description
Key Responsibilities:
- Design, build, and maintain scalable and reliable infrastructure on cloud platforms.
- Develop and implement automation for deployment, scaling, and operational tasks.
- Monitor system performance, availability, and capacity, and respond to incidents.
- Diagnose and resolve complex production issues across distributed systems.
- Implement and manage CI/CD pipelines and infrastructure-as-code solutions.
- Conduct root cause analysis for incidents and implement preventative measures.
- Contribute to disaster recovery planning and testing.
- Collaborate with development teams to ensure the reliability and operability of new features.
- Document system architecture, operational procedures, and best practices.
Qualifications:
- Proven experience in Site Reliability Engineering or a similar role.
- Strong proficiency with cloud platforms (AWS, Azure, or GCP).
- Expertise in containerization (Docker, Kubernetes).
- Proficiency in infrastructure-as-code tools (Terraform, Ansible).
- Strong scripting skills (Python, Bash, Go).
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack).
- Deep understanding of networking protocols and distributed systems.
- Excellent troubleshooting and problem-solving abilities.
- Ability to work effectively in a remote, collaborative environment.
Senior Site Reliability Engineer
Posted 16 days ago
Job Description
Responsibilities:
- Design, build, and maintain highly available and scalable systems using infrastructure as code principles.
- Develop and implement automation for deployment, monitoring, and operational tasks.
- Proactively identify and resolve performance bottlenecks and system issues.
- Lead incident response efforts, conduct root cause analyses, and implement preventative measures.
- Define and track Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
- Collaborate with software engineering teams to improve system design for reliability and operability.
- Contribute to the development and maintenance of CI/CD pipelines.
- Mentor junior engineers and share expertise on SRE best practices.
- Participate in on-call rotation for production incident management.
- Stay current with emerging technologies and industry trends in site reliability and cloud infrastructure.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
- Minimum of 5 years of experience in Site Reliability Engineering, DevOps, or Systems Engineering.
- Strong proficiency in at least one programming or scripting language (e.g., Python, Go, Bash).
- Extensive experience with cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).
- Deep understanding of distributed systems, microservices architectures, and network protocols.
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Proven experience with infrastructure as code tools (e.g., Terraform, Ansible).
- Excellent problem-solving and debugging skills.
- Strong communication and collaboration skills, essential for remote team dynamics.
- Experience working in a remote-first or distributed team environment is highly preferred.