Data Engineer

Build and deploy high-throughput data pipelines for cybersecurity use cases
Santa Clara, California, United States
Mid-Level
yesterday
Veterans Staffing

A platform dedicated to connecting U.S. military veterans with employment opportunities and career development resources.

Data Engineer

We're seeking a highly skilled Data Engineer to design, build, and maintain production-grade data pipelines that process and transform terabytes of data. This role involves close collaboration with data scientists and software engineers to ensure data infrastructure is scalable, reliable, and cost-effective.

Pipeline Development & Deployment - Architect, develop, and deploy batch and streaming pipelines using Airflow and containerized workflows for cybersecurity use cases. Containerize data-processing jobs with Docker, orchestrate with Kubernetes, and manage releases using Helm charts.

Distributed Computing - Build high-throughput data transformations using Dask or Apache Spark. Maintain training data clusters across hybrid environments (on-prem and cloud). Optimize training jobs for performance, resiliency, and cost efficiency.

Monitoring & Reliability - Implement observability solutions (logging, metrics, alerting) to maintain pipeline health and SLA adherence. Troubleshoot, debug, and resolve data-processing failures in production environments.

Collaboration & Best Practices - Work with cross-functional teams to define data contracts, schemas, and quality checks. Enforce software engineering best practices: CI/CD, code reviews, automated testing, and documentation.

Data Modeling & Storage - Design and maintain data models and schemas for AI/ML continuous training use cases. Load and manage data in cloud storage and data lakes, ensuring performance and accessibility.

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com.

Skills and Requirements - 3–5 years of professional experience designing and operating production data pipelines at scale.
Containerization & Orchestration: Expertise with Docker, Kubernetes, and Helm.
Workflow Management: Hands-on experience building DAG-based pipelines in Apache Airflow.
Programming: Strong proficiency in Python for data engineering tasks.
Distributed Frameworks: Practical experience with Dask or Apache Spark for large-scale data processing.
Cloud Fundamentals: Familiarity with deploying and managing services in a cloud environment.
Compiled Languages: Experience writing data services in Go or Rust.
GCP Proficiency: Hands-on experience with Google Cloud services (e.g., Pub/Sub, BigQuery, Cloud Storage, GKE). Equivalent experience with other public cloud providers is acceptable.
ML Pipelines: Exposure to deploying cross-cluster model-training workflows using Ray or similar frameworks.
Infrastructure as Code: Familiarity with Terraform for deployment.
Security & Compliance: Knowledge of data governance, encryption, and role-based access control.

Engineering