Site Reliability Engineer (SRE) - Distributed Systems

101 X Corp.

Full-time

Remote friendly (Palo Alto, CA, United States of America)

Are you prepared to join the X team and help build the ultimate real-time information-sharing app, revolutionizing how people connect? At X, we’re on a mission to become the trusted global digital public square, committed to protecting freedom of speech and building the future of unlimited interactivity. Our goal is to empower every user to freely create and share ideas, fostering open public discourse without barriers. Join us in shaping this thrilling journey where your contribution will be invaluable to our success!

Location:

US: Palo Alto, New York

London (UK), Dublin (IE)

Salary Range:

$120,000 to $297,000

Who We Are:

X serves our community of users and customers by preserving free expression and choice, fostering limitless interactivity, and creating a marketplace for economic success.

What You'll Do:

Join X's dynamic Site Reliability Engineering teams across various domains and locations. As an SRE, you will play a crucial role in ensuring the high performance, reliability, and security of our systems. Each team focuses on a different aspect of our infrastructure.

Team:

As a Staff SRE on the Coordination team, you will:

Oversee the operation of software and services owned by the Coordination team.
Leverage deep expertise in networking, routing, and traffic patterns.
Leverage deep expertise in service discovery, distributed services, and cloud infrastructure.
Develop monitoring, alerting, and incident response solutions.
Contribute to ongoing enhancement efforts and champion reliability engineering best practices.

Who You Are:

Highly motivated team player with initiative.
Strong debugging, documentation, and communication skills.
Ability to work collaboratively in a dynamic environment.
Availability for occasional travel (up to 20%).

Qualifications:

Bachelor's degree or above in Computer Science, Engineering, or a related field.
5 to 10+ years of experience in site reliability engineering or related roles.
Expertise in relevant technologies, such as CDN operations, containerization, incident management, traffic routing, and distributed systems.
Proficiency in scripting and automation (Python, Perl, Go).
Strong knowledge of Unix/Linux system administration at scale.

This job is closed.