Robotics Data Engineer
Summary
Atlanta, United States
Full-time
About this Job
About the role
This role is designed for a robotics data engineer who can build and own the data foundation behind our autonomy, perception, GNC, flight sciences, and test efforts.
Data is the lifeblood of our engineering effort. We are a deeply data-driven team: we evaluate algorithms, characterize hardware, troubleshoot in the field, and train the models behind our autonomy by leaning hard on the data our robotic systems generate.
We generate an enormous amount of it. Our field-testing tempo is high and our vehicles evolve quickly, so the stream of new data grows constantly. Our engineers already extract real value from this data every day. What we need now is to improve the systems that take a fast-growing volume of mission data across many sources and formats and turn it into a single, synchronized, queryable asset the whole company can build on.
You will build and own that foundation: a unified data backbone that serves as the company’s source of truth across both high-rate time-series and imagery data. You will be responsible for the entire data lifecycle, from logging onboard the vehicle, through transfer, storage, and organization, to the analysis, tooling, and model-training pipelines that let every engineering team self-serve the cross-section of data they need.
This is a high-ownership role for someone who can bring order and cohesion to a demanding data landscape, then mature it into robust, automated infrastructure the whole company depends on.
The problem has wrinkles particular to our domain. A single mission can involve multiple vehicles, interceptor and target tracks, and increasingly swarms, producing parallel, heterogeneous data streams that must be synchronized and made coherent before they mean anything. Our hardware and software also evolve rapidly, so data must be versioned by configuration to keep comparisons across tests trustworthy.
The fundamentals are shared with any high-performance UAS program, but doing this well against multi-vehicle, multi-entity flight data is where the real challenge lives.
What you'll do
You will own the data lifecycle end to end, from what gets logged onboard the vehicle to the tooling engineers use to make sense of it.
You will focus on problems in all of the following areas:
- Pipeline & Infrastructure: Design and build the unified data backbone that ingests, parses, validates, and synchronizes data across multiple vehicles, sources, and modalities, aligning time-series telemetry and imagery into a single coherent timeline.
- Mission Data Warehousing: Develop a warehousing strategy that converts raw logs into efficient, queryable formats serving as the company’s source of truth for all mission data.
- Storage Management: Manage and scale storage for the large datasets generated by flight testing.
- Data Provenance & Versioning: Track data provenance and version it against the software and hardware configuration it came from, so analyses can filter by flight type, vehicle configuration, sensor payload, and software version without mixing incompatible data.
- Onboard Logging: Partner with autonomy on, or take full ownership of, hardening the onboard C++ logging framework for efficiency, reliability, and completeness.
- Analysis Tooling: Develop and maintain core libraries for programmatic, offline-first data analysis: the primary way engineers load mission data, run analysis, and generate consistent, high-quality visualizations for reports, debugging, and development.
- Validation & Regression Automation: Build automated validation and regression pipelines integrated with CI/CD, flagging performance deviations automatically once mission data lands.
- Reusable Engineering Workflows: Mature existing tooling into a fully featured suite so engineers can compose analysis tasks without writing ad hoc code.
- Retrieval & Query Systems: Build retrieval and query systems on top of the backbone, letting teams pull the exact cross-section of data they need by flight context, software/hardware version, or content.
- Cross-Functional Analysis Support: Work directly with GNC, perception, flight sciences, and state estimation engineers to understand their data needs and build the routines that characterize system performance and troubleshoot issues.
- ML Data Enablement: Build the data infrastructure that feeds model training and evaluation, including versioning, labeling, and curating datasets so they are ready for ML pipelines.
- Complex Data Analysis: Take on complex, data-heavy analysis tasks directly, including the backlog the team has already identified.
Basic Qualifications
- Education: Degree in Computer Science, Robotics, or a related technical field, or equivalent practical experience.
- Experience: A few years of professional experience in a data engineering, robotics, or backend software role, building systems that others depend on.
- Python: Strong proficiency in Python and its data ecosystem, including Pandas and NumPy, with the software discipline to build maintainable libraries rather than one-off scripts.
- Large, Complex Datasets: Proven experience structuring and manipulating large, complex datasets, especially time-series data drawn from multiple unsynchronized sources.
- Travel: Ability to travel as needed to support field and flight-test operations.
Preferred Qualifications
Experience in any of the following areas is a plus:
- C++ & Onboard Logging: Working knowledge of C++, sufficient to develop and maintain onboard logging software.
- Robotics Data Formats: Hands-on experience with data formats common in robotics and data engineering, such as ULog, ROS 2 bags, MCAP, Parquet, Protobuf, and HDF5.
- Data Visualization: Strong skills in data visualization for analysis and reporting, including high-quality programmatic plots using Matplotlib, Plotly, Seaborn, or related tools.
- Interactive Analysis Tools: Experience building interactive, exploratory frontends that help engineers navigate and make sense of complex mission data, including Plotly Dash, Bokeh, Streamlit, Rerun, Foxglove, PlotJuggler, or related tools.
- Databases & Query Systems: Experience with relational databases, such as PostgreSQL, and SQL for building queryable data stores.
- Cloud Storage: Experience with cloud storage and its access patterns, including GCS buckets, object metadata, IAM, or related systems.
- CI/CD & Automated Testing: Experience building and maintaining CI/CD pipelines for data processing and automated testing.
- Communication Protocols: Familiarity with communication protocols used to move data between systems, including ZMQ, WebSocket, or related protocols.
- Low-Level Storage & I/O: Familiarity with low-level Linux storage and I/O internals, including ext4, memory-mapped I/O, page cache, io_uring, NVMe, zero-copy, and fsync/durability, as well as the ROS 2 stack used for onboard logging.
- MLOps: Familiarity with MLOps principles and tools for dataset versioning, experiment tracking, and training-pipeline automation.
- VLM-Assisted Data Workflows: Comfort scripting against off-the-shelf VLMs, including Gemini, Llama, or related models, for tasks such as data labeling and curation.
This position may involve access to technology, material, technical data, defense articles, or information subject to U.S. export-control laws, including the International Traffic in Arms Regulations (ITAR), the Export Administration Regulations (EAR), and applicable contract requirements. Assignment to covered work is contingent upon the company’s ability to verify that the candidate is authorized to receive access to such items or information, including by qualifying as a “U.S. person” as defined in 22 C.F.R. § 120.62, or through any required export-control authorization, notice, approval, or access-control process.
