Observability Engineer, Automation & Reliability Engineering - SolarWinds


Job Details

Every great story has a new beginning, and yours starts here.

Welcome to Warner Bros. Discovery... the stuff dreams are made of.

Who We Are...

When we say, "the stuff dreams are made of," we're not just referring to the world of wizards, dragons and superheroes, or even to the wonders of Planet Earth. Behind WBD's vast portfolio of iconic content and beloved brands are the storytellers bringing our characters to life, the creators bringing them to your living rooms and the dreamers creating what's next...

From brilliant creatives to technology trailblazers, across the globe, WBD offers career-defining opportunities, thoughtfully curated benefits, and the tools to explore and grow into your best selves. Here you are supported, here you are celebrated, here you can thrive.

Your New Job:

Automation & Reliability Engineering (ARE) combines software and systems engineering to build and run large-scale, distributed, fault-tolerant systems. ARE ensures that Discovery's services (both our internally critical and our externally visible systems) have reliability and uptime appropriate to users' needs, and a fast rate of improvement. Additionally, Observability Engineers (OEs) keep a watchful eye on our systems' capacity and performance. Much of our engineering focuses on optimizing existing systems, building infrastructure and eliminating work through automation. An OE is a practitioner and advocate of good monitoring practices and configuration management within GT&O, and so should be a great communicator and enthusiastic champion of Technology Operations.
The core purpose of the role is to ensure that our applications, platforms, and infrastructure are effectively monitored for availability, performance, and functionality, and that alerts driven by our monitoring systems are accurate and actionable. On the ARE team, you'll have the opportunity to manage the complex challenges of scale that are unique to Discovery, while using your expertise in observability, monitoring, and system design.

Your Role Accountabilities:

Design, roadmap, and administer tools used in discovering and monitoring Discovery's applications, services, platforms, and infrastructure.
Build monitoring systems that assist in infrastructure and application event detection and alert remediation.
Ensure all relevant infrastructure and services are properly covered within our monitoring and alerting systems in a manner consistent with our standards; collect the right metrics at the right frequency and ensure the data is readily available for effective alerting, reporting, and analysis.
Define business and operations success metrics, and establish a departmental process model for benchmarking, standardization, and process improvements.
Collaborate with a cross-functional team of Dev, Ops, engineers, and architects to understand complex application architectures and implement an effective top-down monitoring strategy for holistic service visibility.
Participate in strategy and future implementation discussions for the redesign and implementation of monitoring environments to modernize with the latest technology trends.
Leverage performance counters to diagnose and troubleshoot infrastructure problems.
Create and maintain documentation for monitoring requirements, processes, and implementation.
Assist in the deployment, organization, and management of standard operating procedures.
Perform other duties as needed.

Qualifications & Experiences:

Bachelor's degree in Computer Science, Information Technology or related technical field, or equivalent practical experience.
5+ years of experience in systems engineering and/or administration in an enterprise production environment.
Experience installing, configuring, and maintaining monitoring tools with SolarWinds.
Experience with large-scale distributed systems and architecture knowledge (Linux/UNIX and Windows operating systems, networking, storage) in a cloud computing or traditional IT infrastructure environment.
Experience in the use of network management protocols (e.g. SNMP, WMI, Syslog, ICMP, NetFlow).
Experience managing a SolarWinds environment including main application server, additional polling engines, additional web servers, and database required.
Experience working with SolarWinds Network Performance Monitor, Network Traffic Analyzer, Network Configuration Monitor, Server & Application Monitor, Storage Resource Monitor, Web Performance Monitor, and dashboards required.
Experience with scripting and automation using one or more of the following: Python, PowerShell, Bash.
Experience with configuration management and Infrastructure as Code tools such as Terraform, Ansible, Puppet, or Chef.

*This is a hybrid position, in which reporting onsite 2-3 times per week is required.

How We Get Things Done...

This last bit is probably the most important! Here at WBD, our guiding principles are the core values by which we operate and are central to how we get things done. You can find them at www.wbd.com/guiding-principles/ along with some insights from the team on what they mean and how they show up in their day to day. We hope they resonate with you and look forward to discussing them during your interview.

The Legal Bits...

Warner Bros. Discovery embraces the opportunity to build a workforce that reflects the diversity of our society and the world around us.
Being an equal opportunity employer means that we take seriously our responsibility to consider qualified candidates on the basis of merit, without regard to race, color, religion, national origin, gender, sexual orientation, gender identity or expression, age, mental or physical disability, genetic information, marital status, citizenship status, military status, protected veteran status or any other category protected by law.

If you're a qualified candidate and you require adjustments or accommodations to search for a job opening or apply for a position, please contact us at ...@wbd.com.

 Warner Bros. Discovery

 04/21/2024

 Silver Spring, MD