Manager of Infrastructure & Operations
Lead IT infrastructure and operations for a regional government agency supporting engineers, scientists, and planners who run production-critical GIS, traffic, and environmental modeling systems. Own availability, resiliency, and operational discipline across hybrid on-prem and cloud infrastructure and multiple data centers; serve as senior technical authority for architecture, escalations, and recovery events. Manage a hybrid team model: senior systems/network engineers, internal help desk leadership, and a third-party MSP (Tier 1/2).
- Led SCAG's first full-production disaster recovery test (~70 VMs across two data centers, Las Vegas and Oregon), expanding beyond prior financial-system-only exercises and validating a sub-4-hour RTO against a 24-hour organizational requirement.
- Executed a live, phased DR failover of core production infrastructure (directory/authentication, certificate authority, messaging, file, endpoint, security, and backup systems), identifying and resolving issues in real time without rollback.
- Maintain 99.9% uptime SLAs through proactive monitoring, disciplined incident response, and controlled change execution.
- Operate and enforce formal change management through an established ServiceNow-based CAB, adding mandatory cross-team peer review, rollback planning, blackout windows, and structured communication.
- Implemented Privileged Identity Management (PIM) where none existed, eliminating standing administrative privileges with time-bound, justified, audited elevation, and drove organization-wide MFA to 100% coverage.
- Introduced Conditional Access and Microsoft Intune endpoint management as greenfield deployments, integrating the two for device-based Zero Trust access.
- Built an in-house serverless AWS monitoring dashboard in ~2 weeks, replacing a ~$60K vendor quote; migrated modeling compute (Z1D→R6A), reducing cost while improving performance.
- During the July 2024 CrowdStrike outage, held mass communication for ~90 minutes to verify scope and accuracy, preventing a panic or security-breach narrative across a constituency of 6 counties and 191 cities.
- Maintain a defense-in-depth posture: network segmentation, least-privilege access, and continuous monitoring across identity, network, endpoint, and email.