Location: Providence, RI, US, 02903
Requisition ID: 17859
Brightstar is an innovative, forward-thinking global leader in lottery that builds on our renowned expertise in delivering secure technology and producing reliable, comprehensive solutions for our customers. As a premier pure-play global lottery company, our best-in-class lottery operations, retail and digital solutions, and award-winning lottery games enable our customers to achieve their goals, fulfill player needs, and distribute meaningful benefits to communities. Brightstar has a well-established local presence and is a trusted partner to governments and regulators around the world, creating value by adhering to the highest standards of service, integrity, and responsibility. Brightstar has approximately 6,000 employees.
We are seeking a Cloud/Site Reliability Engineer to join our Cloud Infrastructure Engineering, Operations & Automation team. This role is designed for engineers who are passionate about building resilient systems, preventing incidents before they occur, and driving operational excellence through intelligent monitoring, AI-driven automation, and continuous improvement. You'll play a pivotal role in evolving our cloud-hosted environments to be more self-aware, self-healing, and scalable, and in ensuring the high availability and performance of our applications and services. During production incidents, your investigation of issues will facilitate the engagement of L3 product engineers.
As a Cloud/Site Reliability Engineer, you will focus on Level 2 (L2) operational ownership with a strong emphasis on proactive monitoring, root cause analysis, and automation-driven remediation.
Hands-on experience in cloud operations or site reliability engineering. Practical experience managing public cloud infrastructure and services (Azure or AWS knowledge preferred). Proficiency in scripting and automation (Terraform, PowerShell, Python, Bash). Experience with Infrastructure as Code (IaC) and GitOps principles. Hands-on experience with Kubernetes (K8s) and container orchestration. Expertise in monitoring tools (Dynatrace, Datadog, Prometheus, ELK). Strong analytical, troubleshooting, and communication skills.
Ability to apply agentic AI techniques to drive intelligent automation, optimize cloud services, accelerate troubleshooting and root-cause analysis, and enhance system resilience and recoverability. Familiarity with AI/ML Ops or AI-assisted observability tools. Thorough understanding of Java application workloads and Java performance topics. Deep knowledge of one programming language (Java, Python, or Go). Strong Linux and networking skills. Understanding of software architecture patterns and application development principles. Public cloud certifications are a plus. Experience in a 24/7 operations environment.
You'll be part of a forward-thinking Cloud Infrastructure Engineering, Operations & Automation team that values prevention over reaction, automation over repetition, and collaboration over silos. Your work will directly contribute to building a more resilient, scalable, and intelligent cloud ecosystem.
Building collaborative relationships. Decision making. Drive results. Foster innovation. Personal energy. Self-leadership.