Ampcus Inc. is a certified global provider of a broad range of Technology and Business consulting services. We are in search of a highly motivated candidate to join our talented Team.
Job Title: Kubernetes Engineer
Location(s): Sandy Springs, GA (Hybrid: Onsite 2 days/week at Atlanta office)
About the Role
We're seeking a hands-on Kubernetes Engineer to build, operate, and evolve container platforms that power business-critical applications. You'll work closely with application teams and operations to deliver secure, reliable, and cost-efficient Kubernetes (EKS/AKS/GKE or upstream) environments, with a strong focus on automation, observability, and platform resilience.
What You'll Do
- Design & Operate Clusters: Plan, deploy, and manage highly available Kubernetes clusters (cloud-managed or self-managed), including multi-cluster architectures and multi-tenant isolation.
- Platform Engineering: Implement GitOps workflows (Argo CD/Flux), manage Helm charts, and curate golden/base images and templates for repeatable, secure app onboarding.
- Networking & Security: Configure CNI (Calico/Cilium), Ingress/Service mesh (Istio/Linkerd), TLS, RBAC, and policy-as-code (OPA/Gatekeeper/Kyverno). Integrate secrets management (e.g., Vault/KMS).
- Scalability & Performance: Tune autoscaling (HPA/VPA/KEDA, Cluster Autoscaler), right-size workloads, and optimize node pools and runtime parameters for throughput and cost.
- Storage & Data: Manage persistent volumes and CSI drivers; support stateful workloads, backup/restore (Velero), and disaster recovery strategies.
- CI/CD & Automation: Build and maintain pipelines (GitHub Actions/Jenkins/Azure DevOps), IaC (Terraform), and reusable modules for cluster lifecycle (provisioning, upgrades, patching).
- Observability: Establish end-to-end monitoring and tracing (Prometheus/Grafana/Alertmanager, OpenTelemetry), centralized logging (ELK/OpenSearch), SLO/SLI dashboards, and actionable alerts.
- Reliability & Operations: Lead incident response and postmortems; drive reliability improvements, performance baselines, capacity planning, and upgrade playbooks.
- Security & Compliance: Enforce container and image security (Trivy, Aqua, Prisma Cloud), vulnerability remediation, admission controls, and compliance reporting.
- Developer Enablement: Partner with teams to containerize applications, troubleshoot deployments, and champion Kubernetes best practices and guardrails.
- Documentation & Knowledge Sharing: Author runbooks, standards, and architecture diagrams; mentor peers and evangelize platform engineering practices.
Required Qualifications
- 5-7 years overall in platform/DevOps/SRE roles with 3+ years operating Kubernetes in production.
- Proficiency with Docker/OCI, Helm, GitOps (Argo CD or Flux), and one or more service meshes (Istio/Linkerd).
- Hands-on with at least one cloud: AWS (EKS), Azure (AKS), or Google Cloud (GKE); solid understanding of managed control planes and cloud networking.
- Strong Infrastructure as Code skills (Terraform preferred) and CI/CD (GitHub Actions, Jenkins, or Azure DevOps).
- Solid Linux fundamentals, container runtime internals, networking (L3/L4/L7), and security (RBAC, OPA/Gatekeeper/Kyverno, secrets mgmt).
- Experience with observability stacks (Prometheus/Grafana/Alertmanager, OpenTelemetry) and centralized logging (ELK/OpenSearch).
- Scripting in Python, Go, or Bash for automation and tooling.
- Demonstrated ownership of cluster upgrades, break/fix, performance tuning, and cost optimization.
- Clear, concise communication skills and the ability to collaborate with cross-functional teams in a customer-facing environment.
- Ability to be onsite 2 days/week at the Customer's Atlanta office.
Preferred/Nice to Have
- Certifications: CKA, CKAD, CKS.
- Experience with Kubernetes security tooling (e.g., Trivy, Aqua, Prisma Cloud), backup/restore (Velero), and policy-as-code frameworks.
- Knowledge of KEDA, Cilium eBPF, NVIDIA GPU Operator, Windows containers, or air-gapped/regulated environments.
- Exposure to OpenShift or hybrid/on-prem Kubernetes (kOps, Rancher).
Work Arrangement & Hours
- Hybrid: Onsite 2 days/week at the Customer's Atlanta office; remote on other days.
Ampcus is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, protected veteran status, or disability.