Staff Systems Reliability Engineer
Glendale, California, United States
Job ID: 787640BR
Location: Glendale, California, United States
Business: The Walt Disney Company (Corporate)
Date posted: Apr. 29, 2021
Job Summary:
At Disney, we’re storytellers. We make the impossible, possible. We do this through utilizing and developing cutting-edge technology and pushing the envelope to bring stories to life through our movies, products, interactive games, parks and resorts, and media networks. Now is your chance to join our talented team that delivers unparalleled creative content to audiences around the world.
The Systems Reliability Engineering (SRE) Tech Evangelism team helps elevate SRE practices at TWDC, promoting and onboarding new technologies, solving complex problems and integrating with next generation digital platforms.
Systems Reliability Engineers use a software engineering approach to architect, design, automate, monitor, and build applications at scale. This includes operating and engineering software with close business segment alignment to deliver platforms through efficient, effective and resilient architectures. SREs are talented engineers who focus on improving quality through a data-driven approach: instrumentation, automation, and functional/unit testing.
This position is for an experienced systems reliability engineer (SRE) eager to play an integral role on the SRE Tech Evangelism team for The Walt Disney Company to help elevate SRE practices, onboard new technologies, solve complex problems and integrate next generation digital platforms.
The Staff SRE will help create, build and deliver amazing experiences for our guests, fans and businesses. Primary responsibilities include helping existing, new and emerging business teams onboard new technologies or platforms to accelerate their businesses. This will include consultation, designing, building, and supporting development pipelines, automating infrastructure and operations, creating telemetry for monitoring, engineering high reliability and reinforcing best practices to secure our company and guest data.
The Staff SRE is expected to have expert-level systems administration skills on Linux and Windows platforms, and must have experience with software development (e.g. Python, Go, Java, Node), CI pipeline tools (e.g. GitLab CI, Jenkins), Git source management, cloud hosting (AWS, GCP & Azure), container computing (e.g. Docker, serverless technologies), web technologies and the DevOps team culture. This position will also bring expertise on systems, network, operational excellence and application stability, security, performance, and capacity management, as well as documentation.
The Staff SRE must be prepared to work with engineering, creative and production teams in an extremely collaborative and high-energy environment to brainstorm, architect, gather requirements, troubleshoot, and provide stellar customer support. The ideal Staff SRE is passionate about constantly learning and taking technology to the next level to solve complex problems, and is a highly motivated, optimistic, proactive, creative thought leader and project manager who works closely with our Business Units & Segments.
Responsibilities:
Translate ideas into tangible products that shape experiences by focusing on a systematic approach to automation, resiliency, efficiency, stability, security, performance, and capacity management, as well as documentation, and serve as a subject matter expert through internal and external tech talks and conferences.
Make an impact on a transformative team and culture by designing, building, and supporting systems for a large-scale enterprise production environment that hosts a variety of digital workloads and experiences for The Walt Disney Company.
Collaborate with various Engineering and Production teams as a thought partner to gather requirements, troubleshoot issues, apply a scientific approach to continuous improvement, challenge the status quo, promote a culture of high accountability and trust, and provide stellar customer support.
Inspire and lead initial discovery, architecture, design, automation, implementation and operationalization, including:
- Business Engagement and Requirements Gathering
- Architectural Review, Proof of Concept Work, and Onboarding
- Project: Build and Operationalize New Systems/Sites/Services/Products
- Systematic Load Testing, Troubleshooting, Optimization and Tuning
- Create System and Application Monitors, Trending Metrics and Reports
- Development: Tools and Automation Frameworks
- Hosting Platforms and Infrastructure Design and Support
Basic Qualifications:
Technical Requirements
- Expertise in multiple scripting languages and advanced skills in programming languages (e.g. Go, Python, Ruby, Dart, Node, Java, and similar), with the ability to build test coverage for all software being developed.
- Knowledge of software development Continuous Integration (CI) pipelines (e.g. Jenkins, GitLab CI)
- Expertise with Distributed Systems and Container Platforms (e.g. Kubernetes/GKE, ECS, Mesos, Fargate, Nomad)
- Experience with Source Control Management systems (e.g. Git)
- Expertise in public and private cloud hosting services (AWS, Google Cloud, Azure)
- Recognized as a subject matter expert on at least one OS and proficient in multiple operating systems, including OS performance monitoring, setup, configuration, tuning, and troubleshooting.
- Proficient in web server technologies (e.g. Apache, Node.js, Nginx, Tomcat, IIS, Caddy Server) including setup, configuration, performance monitoring, tuning, clustering, and debugging (e.g. JConsole).
- Proficient with data technologies (e.g. NoSQL, MySQL, MongoDB, Redis, Elastic) including being able to perform basic setup, configuration, and troubleshooting.
- Able to implement existing base standards for new systems and/or applications for all of the following:
  - Site/Systems monitoring and instrumentation
  - Application monitoring and instrumentation
  - System monitoring and instrumentation
  - Resilience, performance & telemetry data
- Able to diagnose simple to complex systems and process problems.
- Able to perform and provide in-depth analysis of load test runs against a moderately complex system.
- Demonstrate exceptional troubleshooting methodology, including the ability to author and instruct new methodologies to the SRE team.
- Independently resolve moderately to highly complex system and application incidents.
- Able to identify and propose system and application fixes for performance bottlenecks.
- Able to evaluate new application requirements for capacity and run-time best practices.
- Able to evaluate new system and/or infrastructure solutions for technical feasibility against known requirements and standards.
- Effective at dealing with change: Able to transition in role or handle a significant modification or technology with minimal ramp-up time and with very little guidance.
Required Education:
Bachelor of Science degree in computer science or a related field, or equivalent experience in technical operations and software engineering
About The Walt Disney Company (Corporate):
At Disney Corporate you can see how the businesses behind the Company’s powerful brands come together to create the most innovative, far-reaching and admired entertainment company in the world. As a member of a corporate team, you’ll work with world-class leaders driving the strategies that keep The Walt Disney Company at the leading edge of entertainment. See and be seen by other innovative thinkers as you enable the greatest storytellers in the world to create memories for millions of families around the globe.
About The Walt Disney Company:
The Walt Disney Company, together with its subsidiaries and affiliates, is a leading diversified international family entertainment and media enterprise with the following business segments: media networks, parks and resorts, studio entertainment, consumer products and interactive media. From humble beginnings as a cartoon studio in the 1920s to its preeminent name in the entertainment industry today, Disney proudly continues its legacy of creating world-class stories and experiences for every member of the family. Disney’s stories, characters and experiences reach consumers and guests from every corner of the globe. With operations in more than 40 countries, our employees and cast members work together to create entertainment experiences that are both universally and locally cherished.
This position is with Disney Worldwide Services, Inc., which is part of a business segment we call The Walt Disney Company (Corporate).
Disney Worldwide Services, Inc. is an equal opportunity employer. Applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, or protected veteran status or any other basis prohibited by federal, state or local law. Disney fosters a business culture where ideas and decisions from all people help us grow, innovate, create the best stories and be relevant in a rapidly changing world.