Site Reliability Engineer (SRE), Senior Windows & Web Infrastructure (Remote) at Stack Overflow

Come join the SRE team at Stack Overflow!  As one of the top 50 websites by traffic volume worldwide, we hit some unique challenges.

We’re looking for a senior Windows SRE for our on-prem (data center) website infrastructure. Candidates should have 3+ years of experience with Windows, plus at least some Linux server experience.
You’ll join our team of SREs who maintain and grow the on-prem infrastructure that supports Stack Overflow and other web properties. The infrastructure is Windows-centric (IIS, SQL Server), held together with Linux for load balancing, security, caching, and search. We have systems in two data centers plus a growing Azure presence.

As an SRE, you’ll bring a developer mindset to system administration, always looking for ways to automate manual work and create repeatable, scalable systems and processes. We are wiki-centric and prefer to document as we work, not at the end of the project.

We are a remote-first team with team members all over the world.  There may be rare travel to our data centers in Denver, CO and Jersey City, NJ.

Problems we’re trying to solve:
  • Up our monitoring game by implementing meaningful SLIs/SLOs (we use SignalFx)
  • Maintain 20+ IIS servers in a more automated fashion
  • Upgrade our Active Directory infrastructure from Windows Server 2012 R2 to Windows Server 2019
  • Institute a Chaos Engineering program
  • Improve our Redis cache infrastructure
  • Use Infrastructure-as-Code (Terraform) in our data centers, similar to how we use it in Azure
What you’ll do:
  • Maintain the services and infrastructure platform used by the Stack Overflow websites
  • Collaborate with developers to maintain and improve uptime and performance
  • Be part of our on-call rotation
  • Act as a subject matter expert around our Windows Server and Active Directory infrastructure
  • Integrate with our Linux infrastructure, VMware, and Dell storage
Technologies you’ll work with:
  • Windows Server 2016 and 2019, PowerShell, C#, and Go
  • The Windows developer ecosystem (Visual Studio)
  • Puppet
  • Linux (CentOS 7 and Alpine)
  • Fastly CDN
  • HAProxy, Redis, Elasticsearch
  • Dell servers and EMC storage
  • IIS, Active Directory, SQL Server

Skills & Requirements

We’re looking for:
  • In-depth experience with Windows Server (and comfortable working with Linux)
  • Experience with web services on IIS (HTTP, TLS; load balancers like HAProxy, Fastly/Varnish)
  • Some experience with a configuration management system or Infrastructure as Code (we use Puppet and Terraform)
  • A track record of taking on challenges and delivering thorough, stable, and maintainable systems
  • Strong written communication skills and a strong inclination to “document as you go”
  • The ability to work iteratively to scope and deliver large projects

What you’ll get in return:

  • Flexible hours
  • 20 days paid vacation + holidays
  • Completely free health insurance - no copay, no premiums (US residents)
  • Generous parental leave (10-16 weeks at 100% pay), family care leave, and unlimited sick days
  • Employees will never be poked with a sharp stick

About Stack Overflow

Stack Overflow is the largest, most trusted online community for developers to learn, share their knowledge, and build their careers. More than 50 million professional and aspiring programmers visit Stack Overflow each month to help solve coding problems, develop new skills, and find job opportunities.
We partner with businesses to help them understand, hire, engage, and enable the world's developers. Our products and services are focused on developer marketing, technical recruiting, market research, and enterprise knowledge sharing.
We believe in hiring smart people and getting out of their way. We have an office in New York with some of the best amenities of any New York startup, and we have people who work remotely all over the world.  We keep meetings and ceremony to an absolute minimum.
Employment is conditioned upon successful completion of a background check and upon having the appropriate legal right to work.

Diverse teams build better products.

Legally, we need you to know this: 

Stack Overflow does not discriminate in employment matters on the basis of race, color, religion, gender, national origin, age, military service eligibility, veteran status, sexual orientation, marital status, disability, or any other protected class. We support workplace diversity. 

But we want to add this:

We strongly believe that diversity of experience contributes to a broader collective perspective that will consistently lead to a better company and better products. We are working hard to increase the diversity of our team wherever we can and we actively encourage everyone to consider becoming a part of it.