AllenRecruiter Since 2001
the smart solution for Allen jobs

Site Reliability Engineer

Company: S3
Location: Irving
Posted on: April 2, 2026

Job Description:

Strategic Staffing Solutions is currently looking for a Site Reliability Engineer for a W2 contract opportunity with one of our largest clients. Candidates must be willing to work on our W2 only; no C2C.

Site Reliability Engineer
Location: Irving, TX / Charlotte, NC / Phoenix, AZ
Type: W2 contract, 12 months
Work Schedule: Hybrid, 3 days in office

Top Skills:
- Strong Kubernetes (K8s) experience (OCP/OpenShift preferred)
- Hands-on Harness (CD tool) experience
- DevOps / SRE background (5 years overall)
- CI/CD platform support (NOT just usage)
- Cloud exposure (OCP primary; Azure/GCP acceptable)

Platform Ownership & Reliability (SRE):
- Support end-to-end reliability, availability, and performance of the Harness CD platform across non-prod, prod, and BCP environments
- Maintain and report on SLIs, SLOs, error budgets, deployment success rates, and platform health metrics
- Lead incident response, troubleshooting, and RCA for deployment failures, delegate outages, or platform performance issues
- Identify and remediate scaling, performance, and capacity constraints across delegates, pipelines, Kubernetes clusters, and cloud integrations

Automation & Engineering Excellence:
- Develop automation for provisioning, configuration, scaling, upgrades, and maintenance of Harness components
- Build Infrastructure as Code (IaC) using Terraform, Ansible, Helm, or equivalent tools
- Automate common operational tasks including delegate lifecycle, cluster onboarding, secret rotation, and pipeline validation
- Reduce manual work by implementing resilient, repeatable, and self-service automation workflows

DevOps & CI/CD Integration:
- Maintain and enhance Harness integrations with GitHub, Jenkins, Azure DevOps, Kubernetes/OpenShift clusters, and cloud providers
- Ensure an efficient developer experience through well-optimized pipelines and reliable deployment mechanisms
- Partner with DevOps teams to optimize orchestration strategies (blue/green, canary, rolling)
- Work with Security teams to embed DevSecOps controls such as policy enforcement, governance pipelines, and security checks

Observability & Monitoring:
- Implement and maintain monitoring, logging, dashboards, and alerting for all Harness components
- Use Splunk, Prometheus, Grafana, AppDynamics, or similar tools to build actionable alerts
- Detect and escalate issues such as delegate saturation, pipeline slowdowns, API failures, and Kubernetes resource constraints
- Support proactive monitoring to reduce mean time to detection and resolution

Modernization & Continuous Improvement:
- Assist with Harness upgrades, hotfixes, patching, and vendor-recommended lifecycle activities
- Contribute to modernization efforts including containerization, cloud-native deployments, and multi-cloud expansion
- Support resiliency improvements such as BCP validation, backup verification, and BCP readiness
- Evaluate new Harness features, modules, and platform capabilities for enterprise usage

Technical Leadership:
- Act as a technical SME for Harness platform operations and enhancements
- Provide platform guidance, documentation, architecture details, and runbook development
- Partner with senior engineers to improve standards, automation patterns, and operational excellence

Required Qualifications:

Core Technical Skills:
- 5-7 years of experience in DevOps, SRE, Platform Engineering, or Cloud Engineering roles
- Hands-on experience with Harness CD
- Strong experience with Kubernetes/OpenShift, Linux, cloud services, and deployment best practices
- Solid understanding of CI/CD workflows and software release automation

SRE & Automation:
- Experience applying SRE concepts such as SLIs/SLOs, error budgets, and operational maturity improvements
- Strong automation/scripting skills using Python, Bash, or PowerShell
- Infrastructure as Code experience with Terraform, Ansible, Helm, or equivalent tooling

Observability & Troubleshooting:
- Experience with observability tools (Prometheus, Grafana, Splunk, ELK, AppDynamics, etc.)
- Strong troubleshooting skills across container, OS, networking, platform, and cloud technology layers

Preferred Qualifications:
- Experience supporting CD platforms at enterprise scale (hundreds of teams, multi-region deployments)
- Experience in cloud-native and hybrid cloud environments (Azure, GCP)
- Familiarity with DevSecOps practices, policy automation frameworks, and governance models
- Experience supporting complex upgrades, platform migrations, or modernization projects

“Beware of scams. S3 never asks for money during its onboarding process.”

Keywords: S3, Allen, Site Reliability Engineer, IT / Software / Systems, Irving, Texas

