At Roblox, we are looking for a highly experienced Site Reliability Engineer to join our engineering team: someone with the technical expertise and creative problem-solving skills to help us maintain and improve the reliability and scalability of our platform. The ideal candidate will have a strong background in software engineering and operations, and should be experienced in developing, deploying, and managing large-scale distributed systems. Additionally, the candidate should have excellent communication skills, be comfortable working in a fast-paced and collaborative environment, and possess strong attention to detail. If you have a passion for technology, a drive to innovate, and a commitment to excellence, then you may be the perfect fit for this role.
Responsibilities:
- Develop, deploy, and manage large-scale distributed systems.
- Monitor system performance and improve scalability and reliability of the platform.
- Create and maintain technical documentation.
- Troubleshoot system problems and provide solutions.
- Collaborate with other engineering teams to ensure system stability and security.
- Implement best practices to ensure maximum system uptime and availability.
- Identify and address system performance bottlenecks.
- Analyze system requirements and propose appropriate solutions.
- Research new technologies and develop prototype systems.
- Assist with the training and onboarding of new engineers.
Requirements:
- Strong understanding of Linux/Unix systems
- Experience with cloud-based infrastructure
- Experience with automation, scripting, and configuration management tools (e.g. Ansible, Puppet, Chef)
- Knowledge of network protocols and monitoring systems (e.g. SNMP, Nagios, Zabbix)
- Experience with container technologies (e.g. Docker, Kubernetes)
- Knowledge of web-based technologies (e.g. Apache, Nginx)
- Experience with database systems (E
Security
Networking
Troubleshooting
Data Analysis
Debugging
Distributed systems
Scripting
Automation
Cloud Computing
Deployment
Capacity planning
Performance tuning
Configuration management
Incident response
Monitoring
Communication
Leadership
Negotiation
Problem Solving
Time management
Interpersonal Skills
Self-motivation
Organizational skills
Teamwork
Adaptability
According to JobzMall, the average salary range for a Site Reliability Engineer in San Mateo, CA, USA is $125,000 - $170,000. This salary range may vary depending on the experience of the engineer and the company.
Roblox is a global platform that brings people together through play.
