Senior Data Engineer
Technology
Atlanta, GA (Hybrid)
$150,000 - $170,000 + Equity
The Company
Our client is transforming safety standards for infrastructure projects through advanced AI and Machine Learning. Their technology provides precise strength assessments, greatly improving safety and reliability.
Role Description
As a Senior Data Engineer, you will be instrumental in designing and optimizing scalable data pipelines to support machine learning and analytics applications. Your role will involve architecting a robust data infrastructure, working closely with ML engineers, and ensuring the efficient storage and processing of large-scale industrial and vision data.
Key Responsibilities
- Design and develop scalable data pipelines with Dagster to facilitate data transformation and movement for analytics and machine learning.
- Optimize storage solutions for large-scale industrial and vision data, ensuring efficient access and processing.
- Build data ingestion frameworks to support real-time and batch processing of images, video, and metadata.
- Collaborate with ML engineers to structure data for effective model training and experimentation.
- Deploy and maintain data pipelines within Kubernetes-based environments.
- Automate CI/CD workflows for data infrastructure using GitLab CI/CD.
- Manage AWS-based data infrastructure, utilizing Terraform for Infrastructure as Code.
- Enhance batch and real-time processing frameworks for improved scalability and performance.
- Provide technical leadership by defining best practices and guiding future data scaling strategies.
Requirements
- Experience in data engineering with a focus on scalable, production-grade data infrastructure.
- Strong expertise in Python (Pandas, PyArrow, Dask, etc.).
- Hands-on experience with data orchestration tools (Dagster, Prefect, or Airflow).
- Proficiency in Kubernetes for data pipeline orchestration.
- Experience deploying infrastructure via Terraform (or similar IaC tools).
- Cloud expertise, preferably AWS, including S3, EKS, Lambda, Glue, and RDS.
- Experience with streaming and event-driven architectures such as Ray Core, Kafka, Kinesis, Pulsar, or Storm.
- Strong database skills, including SQL, NoSQL, and columnar storage (Postgres, BigQuery, ClickHouse).
- CI/CD expertise with GitLab for managing automated data pipeline deployments.
The Benefits
As a Senior Data Engineer, you will receive a base salary of $150,000 to $170,000, plus equity and benefits including health, dental, and vision coverage.
How to Apply
Please register your interest by sending your resume to Neha Nidamarti via the Apply link on this page.
Key Words
Python, Dagster, Kubernetes, Terraform, AWS, Data Pipelines, CI/CD, Streaming Data, Ray, Kafka, Computer Vision, Machine Learning, SQL, Big Data, Data Engineering