Senior Database Engineer - Platform Engineering
Join our DevOps Engineering team as a Senior Database Engineer to design, build, and operate cloud-native database platforms across a modern, multi-engine data stack. This is an engineering role, not a DBA role: the focus is on building scalable systems, writing infrastructure-as-code, and embedding databases into software delivery pipelines.
You'll work closely with DevOps and Product Engineering to build high-performing data infrastructure for critical applications and analytics. You will own and evolve a diverse ecosystem spanning AWS RDS, Aurora, DynamoDB, Redshift, Azure SQL, PostgreSQL, Snowflake, and NoSQL engines, integrating AI-driven automation and MLOps-ready data foundations that support machine learning workflows.
Key Responsibilities
Multi-Engine Cloud Data Architecture & Platform Engineering
- Design and build hybrid data solutions spanning relational (PostgreSQL, Aurora, RDS, Azure SQL), columnar (Redshift, Snowflake), and NoSQL (DynamoDB, DocumentDB, OpenSearch) engines, selecting the right engine per workload.
- Architect cloud-native data lakehouse platforms on AWS using S3, Lake Formation, Glue, and open formats (Apache Iceberg, Delta Lake, Parquet), with Azure Data Lake as a secondary target.
- Implement and manage Medallion Architecture (Bronze / Silver / Gold) patterns to support raw ingestion, curated analytics, and business-ready datasets.
- Build and optimize hybrid data platforms spanning operational databases (PostgreSQL / RDS / Aurora / DynamoDB) and analytical systems (Snowflake / Redshift).
- Develop and maintain semantic layers and analytics models to enable consistent, reusable metrics across BI, analytics, and AI use cases.
- Engineer efficient data models, ETL/ELT pipelines, and query performance tuning for analytical and transactional workloads.
- Engineer replication topologies, partitioning strategies, and data lifecycle automation as code — not manual DBA operations.
- Build automated schema migration pipelines (Flyway/Liquibase) and data versioning workflows integrated into CI/CD, replacing manual schema change management.
- Design and implement API-first data access patterns, enabling engineering teams to interact with databases through well-defined, versioned interfaces rather than direct connection strings.
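To illustrate the Medallion pattern named above, here is a minimal Python sketch of Bronze-to-Silver-to-Gold promotion. The record shapes and field names (`order_id`, `amount`, `region`) are illustrative assumptions, not part of this role's actual schemas; in practice these transforms would run in dbt, Glue, or Snowflake rather than plain Python.

```python
# Minimal sketch of Medallion (Bronze/Silver/Gold) promotion logic.
# Record shapes and field names are illustrative assumptions.

def to_silver(bronze_records):
    """Bronze -> Silver: drop malformed raw rows and normalize fields."""
    silver = []
    for rec in bronze_records:
        if not rec.get("order_id") or rec.get("amount") is None:
            continue  # quarantine malformed raw rows
        silver.append({
            "order_id": str(rec["order_id"]).strip(),
            "amount": round(float(rec["amount"]), 2),
            "region": (rec.get("region") or "unknown").lower(),
        })
    return silver

def to_gold(silver_records):
    """Silver -> Gold: aggregate into a business-ready metric."""
    revenue_by_region = {}
    for rec in silver_records:
        revenue_by_region[rec["region"]] = (
            revenue_by_region.get(rec["region"], 0.0) + rec["amount"]
        )
    return revenue_by_region
```

The same layering applies regardless of engine: raw ingestion lands untouched in Bronze, validation and normalization produce Silver, and aggregation produces Gold datasets ready for BI and AI consumers.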
Advanced Data Pipelines, Streaming & Orchestration
- Engineer ELT/ETL pipelines using AWS-native services (Glue, Kinesis, MSK, Step Functions, EventBridge) and modern tooling (dbt, Airflow) for batch, micro-batch, and near-real-time workloads.
- Build streaming data pipelines using AWS Kinesis Data Streams, Kinesis Data Firehose, and MSK (Managed Kafka) for event-driven, low-latency ingestion across multiple database targets.
- Implement data quality checks, schema enforcement, lineage, and observability across pipelines.
- Optimize performance, cost, and scalability across ingestion, transformation, and consumption layers.
- Implement change data capture (CDC) using AWS DMS, Debezium, or native engine features to synchronize data across SQL, NoSQL, and analytical systems.
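The CDC responsibility above boils down to replaying ordered change events against a target store. A minimal sketch, with an event shape that loosely mirrors Debezium's `op` codes ("c" = create, "u" = update, "d" = delete); the exact field names here are assumptions for illustration:

```python
# Minimal sketch of applying change-data-capture (CDC) events to a
# target keyed store. Event field names are illustrative assumptions
# loosely modeled on Debezium's op codes.

def apply_cdc_event(target, event):
    """Apply one CDC event to `target` (dict keyed by primary key)."""
    op, key = event["op"], event["key"]
    if op in ("c", "u"):
        target[key] = event["after"]  # upsert the new row image
    elif op == "d":
        target.pop(key, None)         # tolerate deletes for missing keys
    return target
```

In production the target would be DynamoDB, OpenSearch, or a warehouse table, and ordering/idempotency guarantees would come from the stream (Kinesis shard ordering, Kafka partition keys) rather than from this apply function.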
NoSQL & Document Store Engineering
- Design and optimize DynamoDB schemas using single-table design patterns, GSIs, LSIs, and DynamoDB Streams for event-driven architectures.
- Architect DocumentDB (MongoDB-compatible) clusters for document workloads requiring flexible schema and hierarchical data models.
- Build and manage OpenSearch / Elasticsearch clusters for full-text search, log analytics, and observability use cases.
- Evaluate and recommend the right NoSQL engine (DynamoDB vs DocumentDB vs OpenSearch vs ElastiCache) based on access patterns, latency, and cost profile.
- Implement TTL policies, DynamoDB Accelerator (DAX), and ElastiCache (Redis/Memcached) for high-throughput caching layers.
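Single-table design, mentioned above, hinges on composing partition and sort keys so that related entities co-locate and range queries fall out of key-condition prefixes. A minimal sketch; the `CUST#`/`ORDER#` prefixes and attribute names are illustrative conventions, not a fixed standard:

```python
# Minimal sketch of single-table key composition for DynamoDB.
# Entity prefixes and attribute names are illustrative assumptions.

def customer_keys(customer_id):
    """Customer profile item: one item per customer partition."""
    return {"PK": f"CUST#{customer_id}", "SK": "PROFILE"}

def order_keys(customer_id, order_id, order_date):
    """Orders sort under the customer partition by date (ISO-8601 sorts
    lexicographically), enabling range queries without a scan."""
    return {"PK": f"CUST#{customer_id}",
            "SK": f"ORDER#{order_date}#{order_id}"}

def order_query_prefix(customer_id, year_month):
    """Key-condition prefix for 'orders for customer X in a given month'."""
    return {"PK": f"CUST#{customer_id}",
            "SK_begins_with": f"ORDER#{year_month}"}
```

Because all of a customer's items share one partition key, a single `Query` with a `begins_with` sort-key condition retrieves the profile plus a date range of orders in one round trip.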
AI-Enabled Data Engineering & MLOps Foundations
- Apply AI and ML techniques to data architecture and operations, including intelligent data quality validation, anomaly detection, schema drift detection, and query workload pattern analysis — using AWS SageMaker and Amazon Bedrock.
- Design and build ML-ready data foundations: SageMaker Feature Store, training dataset pipelines, experiment tracking, and inference data pipelines using AWS-native MLOps services.
- Integrate LLM capabilities via Amazon Bedrock for AI-assisted data documentation, query generation, lineage summarization, and automated data cataloging.
- Implement vector database solutions (pgvector on Aurora/RDS, OpenSearch k-NN) to support AI similarity search and retrieval-augmented generation (RAG) use cases.
- Build AI-powered observability using ML-driven anomaly detection on pipeline metrics, query performance trends, and data quality SLAs.
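The vector-search responsibility above ultimately reduces to nearest-neighbor ranking over embeddings. Engines like pgvector and OpenSearch k-NN do this server-side; a plain-Python sketch of the cosine-similarity ranking they perform, with hypothetical document IDs and vectors, makes the mechanism concrete:

```python
import math

# Minimal sketch of cosine-similarity ranking, the core operation behind
# pgvector / OpenSearch k-NN similarity search. Doc IDs and vectors are
# illustrative assumptions.

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if degenerate)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (math.sqrt(sum(x * x for x in a))
            * math.sqrt(sum(y * y for y in b)))
    return dot / norm if norm else 0.0

def top_k(query_vec, docs, k=2):
    """Rank (doc_id, embedding) pairs by similarity to the query vector."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in docs]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]
```

In a RAG pipeline, `docs` would be chunk embeddings stored in Aurora/pgvector or an OpenSearch k-NN index, and the top-k results would be passed as context to a Bedrock-hosted model.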
Software Engineering, DevOps & Infrastructure as Code
- Build and manage all data infrastructure as code using Terraform and AWS CDK — covering RDS, Aurora, DynamoDB, Redshift, Glue, MSK, Kinesis, Snowflake, and supporting IAM/networking components.
- Integrate database changes into CI/CD pipelines (GitHub Actions, AWS CodePipeline) with automated schema testing, data contract validation, deployment, and rollback.
- Develop internal platform tooling using Python, SQL, and AWS SDK (boto3) — building self-service capabilities that allow engineers to provision governed database environments on demand.
- Implement database-as-code practices: automated schema migrations, snapshot/restore testing pipelines, and environment clone automation — eliminating manual DBA provisioning tasks.
- Build and publish internal data platform APIs and SDKs that abstract database complexity from application teams.
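One piece of the CI/CD integration above is data-contract validation: rejecting schema changes that would break downstream consumers before they deploy. A minimal sketch; the contract format here is an assumption for illustration, not a Flyway or Liquibase feature:

```python
# Minimal sketch of a data-contract check a CI pipeline could run before
# a schema migration deploys. The contract format is an illustrative
# assumption, not an actual Flyway/Liquibase feature.

def breaking_changes(contract, proposed_columns):
    """Return contract violations introduced by a proposed schema.

    `contract` maps required column names to types; `proposed_columns`
    maps the post-migration column names to types.
    """
    violations = []
    for col, col_type in contract["columns"].items():
        if col not in proposed_columns:
            violations.append(f"dropped required column: {col}")
        elif proposed_columns[col] != col_type:
            violations.append(
                f"type change on {col}: {col_type} -> {proposed_columns[col]}"
            )
    return violations
```

A pipeline step would fail the build when this returns a non-empty list, forcing an additive (expand/contract) migration instead of an in-place breaking change.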
Security, Governance & Compliance Engineering
- Engineer enterprise-grade data governance across all engines: RBAC, column/row-level security, field-level encryption, dynamic data masking, and comprehensive audit logging, implemented as code, not manual configuration.
- Define and enforce data contracts and ownership using AWS Lake Formation, Glue Data Catalog, and Snowflake governance — versioned and managed in source control.
- Partner with Security and Compliance teams to ensure audit readiness and regulatory alignment (SOC 2, HIPAA, GDPR where applicable).
- Manage AWS IAM policies, KMS encryption, VPC security groups, and private endpoints (PrivateLink, VPC Endpoints) for least-privilege access and network isolation.
- Implement secrets management using AWS Secrets Manager and Parameter Store with automated credential rotation for all database engines.
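The dynamic data masking called out above is typically enforced by the engine (Snowflake masking policies, SQL Server dynamic data masking), but the logic is simple to sketch. Column classifications, mask rules, and the `pii_reader` role below are illustrative assumptions:

```python
# Minimal sketch of role-aware field-level masking at the access layer.
# Column classifications, mask rules, and role names are illustrative
# assumptions; real enforcement lives in engine-native masking policies.

MASK_RULES = {
    "email": lambda v: v[0] + "***@" + v.split("@", 1)[1] if "@" in v else "***",
    "ssn": lambda v: "***-**-" + v[-4:],
}

def mask_row(row, classifications, caller_roles):
    """Mask classified fields unless the caller holds an unmasking role."""
    if "pii_reader" in caller_roles:
        return dict(row)  # privileged role sees cleartext
    masked = {}
    for col, value in row.items():
        rule = MASK_RULES.get(classifications.get(col))
        masked[col] = rule(value) if rule and isinstance(value, str) else value
    return masked
```

Keeping the classification map in source control alongside the schema is what makes this "governance as code": the same file can generate Snowflake masking policies and Lake Formation tags.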
Qualifications
Experience
- 7+ years of experience in database platform engineering, data engineering, or cloud infrastructure engineering in production environments.
- Proven experience as a lead or senior engineer on multi-engine database platforms spanning both SQL and NoSQL workloads — with a software engineering, not administration, mindset.
- Strong track record of designing and operating data platforms at scale in AWS environments, with databases managed as code from day one.
AWS & Cloud Databases
- Deep hands-on expertise with AWS RDS (PostgreSQL, MySQL, Oracle), Aurora (Serverless v2, Global Database), and RDS Proxy.
- Production experience with DynamoDB: single-table design, GSI/LSI strategy, Streams, DAX, and capacity planning.
- Working knowledge of Redshift, Glue, Lake Formation, Kinesis, MSK, and EventBridge for pipeline and lakehouse architectures.
- Familiarity with Azure SQL, Azure Data Factory, or Azure Synapse is a plus.
Snowflake
- Strong hands-on Snowflake experience: performance tuning (clustering, materialized views, query profiling), cost optimization (warehouse sizing, auto-suspend, credits), security (RBAC, dynamic masking, network policies), and data sharing.
SQL, NoSQL & Data Modeling
- Deep SQL expertise across multiple engines (PostgreSQL, T-SQL, Snowflake SQL, DynamoDB PartiQL).
- Strong understanding of Medallion Architecture, semantic layers, and analytics engineering best practices.
- Proven NoSQL data modeling: DynamoDB single-table design, document store schema design, and search index architecture.
Pipelines & Orchestration