This role will join our Observability team, specifically working on our Splunk infrastructure.
The role is responsible for handling internal user support tickets and client escalations, and for helping us revamp our deployment mechanisms.
The ideal candidate will have experience with Splunk, Infrastructure as Code, AWS, Configuration Management (Salt) and Deployment tools (GitHub Actions, Jenkins, etc.).
Required Skills
BS degree and 3-5 years of experience
Experience with internally hosted logging systems such as Splunk, Loki, Elastic, or ClickHouse, including assisting clients and improving environment performance and stability
Programming experience with languages like Python; experience building integrations and applications for large-scale observability environments.
Experience designing and implementing systems for fault tolerance, scalability and stability.
Experience developing, deploying and running distributed applications on cloud platforms.
Ability to ensure the highest level of uptime and Quality of Service (QoS) for the client's customers through operational excellence
Knowledge of cloud platforms (public and/or private)
Experience in designing and maintaining production monitoring systems
Experience in solving performance and stability issues using a wide variety of tools
Exceptional communicator within and across teams who drives projects to completion
Ability to influence the organization by contributing to technical direction and strategic decisions.