The resource will be part of a team responsible for the reliability and support of container platforms (OpenShift) on-premises and in external clouds (MS Azure/AWS/Google). This includes monitoring and troubleshooting alerts and incidents related to the platforms, any required Incident and Problem Management, and application onboarding, troubleshooting, and support throughout the lifecycle. The role requires weekend on-call coverage and shift coverage as part of a 24x7 Global Ops team. The resource will liaise regularly with teammates and shift leads and, as part of support, will routinely interact with platform clients and vendors. BS/MS degree in Computer Science or a related technical field involving systems, or equivalent practical experience.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com.
Skills and Requirements:
• 5+ years of hands-on experience supporting Kubernetes/OpenShift/RKE/EKS container platforms.
• Experience with Python, Ansible, Golang, and shell scripting.
• Kubernetes/OpenShift/Terraform certifications are a plus.
• Strong experience in major services related to Compute, Storage, Network and Security.
• Experience with monitoring tools like Prometheus and Dynatrace, as well as cloud native tools like Azure Monitor and Log Analytics.
• Strong understanding of, and background working with, complex IAM infrastructure, including Active Directory, Azure AD Connect, Azure AD, and Ping Identity or other SSO solutions.
• Advanced knowledge of Linux OS, DNS, DHCP, Kerberos, and Windows Authentication.
• Experience with CI/CD tools (Git, Jenkins) and the GitOps model.
• Excellent understanding of Linux/Windows operating systems administration.
• Experience in Container security and vulnerability remediation.
• Systematic problem-solving approach, sense of ownership and drive.
• Ability to juggle competing priorities and adapt to changes in project scope.
• Excellent interpersonal, organizational and communication (written, verbal, and presentation) skills are a must.
• Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities.
• Experience with OpenShift, RKE, and CSP Kubernetes services such as AKS and EKS.
• Experience with Terraform, ArgoCD, Tekton, and Knative technologies.
• Experience in agile deployment methodologies (GitOps).
• Knowledge of various container runtimes.
• Familiarity with the operator deployment pattern.
• Experience working in a highly available, multi-datacenter environment.
• Experience working with monitoring tools such as Prometheus, Splunk, Dynatrace, Sysdig, or similar tools.
• Understanding of cost management, inventory management, and the FinOps model.