AllenRecruiter Since 2001
the smart solution for Allen jobs

Senior Site Reliability Engineer

Company: First Horizon
Location: Plano
Posted on: March 30, 2026

Job Description:

Weekly Schedule: Monday-Friday, 9am-5pm

We are seeking a Senior Site Reliability Engineer who will be the guardian of our Azure infrastructure reliability. This role focuses on building comprehensive observability platforms, implementing intelligent monitoring systems, and proactively identifying issues before they impact production. You will create the tools and automation that predict, detect, and prevent problems rather than simply reacting to them. Your primary mission is ensuring our Azure infrastructure and applications never surprise us with failures.

The ideal candidate has deep expertise in Azure Monitor, Application Insights, Log Analytics, and KQL, combined with strong scripting skills in Python or PowerShell. You should have 5-7 years of experience implementing observability platforms and a proven track record of preventing incidents through proactive monitoring and automation. You'll work with technologies like Prometheus, Grafana, OpenTelemetry, and Azure services (AKS, App Services, Azure SQL, Cosmos DB) while building self-healing automation and predictive analytics tools that keep our systems healthy.

Key Responsibilities:
• Design and implement a comprehensive observability stack across all Azure resources and applications
• Build intelligent alerting systems with anomaly detection and predictive capabilities to prevent incidents
• Create self-healing automation and auto-remediation tools that resolve issues without human intervention
• Develop internal monitoring platforms, dashboards, and CLI tools for engineering teams
• Write KQL queries and analyze metrics/logs to identify optimization opportunities and predict failures
• Implement continuous resource monitoring for Azure quotas, costs, security posture, and service health
• Build capacity forecasting and trend analysis tools to prevent resource exhaustion
• Reduce alert noise while improving coverage and actionability of monitoring systems
• Participate in light on-call rotation (prevention-focused approach reduces reactive incidents)

About Us

First Horizon Corporation is a leading regional financial services company, dedicated to helping our clients, communities and associates unlock their full potential with capital and counsel. Headquartered in Memphis, TN, the banking subsidiary First Horizon Bank operates in 12 states across the southern U.S. The Company and its subsidiaries offer commercial, private banking, consumer, small business, wealth and trust management, retail brokerage, capital markets, fixed income, and mortgage banking services. First Horizon has been recognized as one of the nation's best employers by Fortune and Forbes magazines and a Top 10 Most Reputable U.S. Bank.

Benefit Highlights
• Medical with wellness incentives, dental, and vision
• HSA with company match
• Maternity and parental leave
• Tuition reimbursement
• Mentor program
• 401(k) with 6% match
• More: FirstHorizon.com/First-Horizon-National-Corporation/Careers/Our-Benefits

Keywords: First Horizon, Allen, Senior Site Reliability Engineer, IT / Software / Systems, Plano, Texas

