
Data Engineer (Forward Deployed)
Job Description
Posted on: March 26, 2026
About Applied Computing
Founded in 2024, Applied Computing is on a mission to deliver sustainable abundance for a growing planet through AI built for the energy industry.
Energy is an enduring necessity: it powers our planet. Yet its complexity has kept the industry tethered to legacy systems, with critical decisions made on less than 10% of available data.
We built Orbital to change that. Orbital is a Multi-Foundation AI system that enables energy companies to finally trust AI in the control room, harnessing 100% of their data and optimising in real time for any metric. The result: faster decisions, safer operations, and higher performance.
In 2025, we raised $10.7 million in seed funding, one of the largest seed rounds for an AI company in the UK, and we are just getting started.
We’re building the data backbone for Orbital, an industrial AI system that ingests and learns from complex refinery and process data in real time. As our Data Engineer, you’ll architect and maintain pipelines that move high-frequency time-series, lab, and historian data into a scalable Lakehouse architecture. You’ll work across AWS (EKS, S3, EBS, KMS, CloudWatch) and Databricks/PySpark, ensuring data is contextualised, synchronised, and optimised for both deep learning models and real-time LLM workloads.
This isn’t a traditional ETL role: you’ll be solving problems at the intersection of control systems, industrial data engineering, and AI enablement.
Technical Requirements
- Deep expertise in PostgreSQL (partitioning, indexing, query optimisation, storage design); see the sketch after this list.
- Strong proficiency in Python for data processing, scripting, and pipeline orchestration.
- Hands-on experience with **AWS (EKS, S3, EBS, IAM, KMS, CloudWatch, etc.)** for secure and scalable data pipelines.
- Proven ability to work with Databricks and PySpark for large-scale distributed data processing.
- Familiarity with time-series industrial data (control systems, DCS/SCADA logs, process historians).
- Experience in unstructured data sync and management within hybrid cloud/on-prem environments.
- Bonus: Experience working as a data engineer in oil and gas or energy environments.
- Bonus: Knowledge of streaming frameworks (Kafka, Flink, Spark Streaming) or MLOps stacks for data versioning and lineage.
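To make the PostgreSQL expectation concrete, here is a minimal sketch of the kind of time-series design we mean: a range-partitioned table with a BRIN index on the timestamp. All identifiers (`sensor_readings`, `tag_id`, `ts`) are illustrative, not our actual schema.

```python
import psycopg2

# Illustrative DDL for a range-partitioned time-series table.
# All identifiers here are hypothetical examples.
DDL = """
CREATE TABLE IF NOT EXISTS sensor_readings (
    tag_id  TEXT             NOT NULL,
    ts      TIMESTAMPTZ      NOT NULL,
    value   DOUBLE PRECISION NOT NULL
) PARTITION BY RANGE (ts);

-- One partition per month keeps pruning effective for time-bounded queries.
CREATE TABLE IF NOT EXISTS sensor_readings_2026_03
    PARTITION OF sensor_readings
    FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');

-- BRIN indexes are compact and fast for append-only, time-ordered data.
CREATE INDEX IF NOT EXISTS sensor_readings_ts_brin
    ON sensor_readings USING BRIN (ts);
"""

with psycopg2.connect("dbname=orbital") as conn:
    with conn.cursor() as cur:
        cur.execute(DDL)
```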
Core Responsibilities
1. Ingest & Contextualise Data
- Ingest from OPC UA servers, process historians, IoT sensors, LIMS systems, alarms/events, and P&IDs.
- Map signals to their physical processes (tags, units, hierarchies) for interpretability in AI pipelines.
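To give a flavour of what contextualisation means in practice, here is a minimal Python sketch that attaches unit and hierarchy metadata to raw historian tags; the tag names and asset paths are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TagContext:
    """Physical context attached to a raw historian signal."""
    tag: str         # raw DCS/historian tag name
    unit: str        # engineering unit of the measurement
    asset_path: str  # position in the plant hierarchy

# Hypothetical mapping; a real deployment would load this from
# asset registers, P&IDs, or an instrument database.
TAG_REGISTRY = {
    "FI-1042.PV": TagContext("FI-1042.PV", "m3/h", "refinery/cdu/feed/flow"),
    "TI-2210.PV": TagContext("TI-2210.PV", "degC", "refinery/cdu/furnace/outlet_temp"),
}

def contextualise(tag: str, value: float) -> dict:
    """Attach unit and hierarchy metadata so AI pipelines can interpret the signal."""
    ctx = TAG_REGISTRY[tag]
    return {"tag": ctx.tag, "value": value, "unit": ctx.unit, "asset": ctx.asset_path}
```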
2. Data Movement & Accessibility
- Build pipelines that handle real-time streaming and batch ingestion into the Lakehouse (see the sketch after this list).
- Manage synchronisation between historian archives, unstructured files, and AWS storage (S3/EBS).
- Orchestrate Databricks Lakeflow/Connectors for integrating data into Lakebase/Lakehouse.
- Handle secure, high-throughput transfers between historian archives and sandbox/live environments.
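As a hedged sketch of the streaming half of this work: a PySpark Structured Streaming job reading sensor events from Kafka and appending them to a Delta table on S3. The broker, topic, schema, and paths are placeholders, not our real infrastructure.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("historian-ingest").getOrCreate()

# Placeholder schema for a JSON sensor event.
event_schema = StructType([
    StructField("tag", StringType()),
    StructField("ts", TimestampType()),
    StructField("value", DoubleType()),
])

# Read the raw stream (hypothetical broker and topic names).
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "historian-events")
    .load()
    .select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Append into a Delta table on S3; the checkpoint enables exactly-once sinks.
query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/historian")
    .outputMode("append")
    .start("s3://example-bucket/lakehouse/bronze/historian")
)
```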
3. Change Tracking & Integrity
- Detect and manage schema changes, signal drift, and inconsistencies across time (see the sketch after this list).
- Implement lineage and audit trails across Spark/Databricks and AWS pipelines.
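One simple way to catch breaking schema changes before they hit downstream workflows is to diff each incoming batch against a registered baseline. A minimal sketch, with the baseline hard-coded here for brevity:

```python
from pyspark.sql import DataFrame

# Baseline schema we expect from the source (illustrative).
EXPECTED_FIELDS = {"tag": "string", "ts": "timestamp", "value": "double"}

def check_schema(batch: DataFrame) -> None:
    """Raise on removed/retyped columns; warn on additions so jobs keep running."""
    actual = {f.name: f.dataType.simpleString() for f in batch.schema.fields}
    missing = EXPECTED_FIELDS.keys() - actual.keys()
    retyped = {name for name in EXPECTED_FIELDS.keys() & actual.keys()
               if actual[name] != EXPECTED_FIELDS[name]}
    added = actual.keys() - EXPECTED_FIELDS.keys()
    if missing or retyped:
        raise ValueError(f"breaking schema change: missing={missing}, retyped={retyped}")
    if added:
        print(f"non-breaking schema change, new columns: {added}")
```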
4. Data Preparation for AI
- Build and maintain dual pipelines (see the sketch after this list):
  - Training → large-scale historical data prep for time-series + LLM training.
  - Inference → low-latency, real-time pipelines for anomaly detection, optimisation, and LLM search.
- Support heterogeneous AI workloads (time-series forecasting and retrieval-augmented LLMs).
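One common pattern for keeping the two paths consistent is a single shared transformation applied to both a batch read (training) and a streaming read (inference); a sketch assuming placeholder Delta paths on S3.

```python
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql.functions import window, avg

spark = SparkSession.builder.appName("dual-pipelines").getOrCreate()
SOURCE = "s3://example-bucket/lakehouse/bronze/historian"  # placeholder path

def featurise(df: DataFrame) -> DataFrame:
    """Shared feature logic: 1-minute mean per tag, used by both paths."""
    return (df.withWatermark("ts", "2 minutes")  # no-op in batch, bounds state in streaming
              .groupBy("tag", window("ts", "1 minute"))
              .agg(avg("value").alias("value_1m_avg")))

# Training path: one featurisation pass over the full history.
training_df = featurise(spark.read.format("delta").load(SOURCE))

# Inference path: the same logic on the live stream, so features never skew.
inference_q = (featurise(spark.readStream.format("delta").load(SOURCE))
               .writeStream.format("delta")
               .option("checkpointLocation", "s3://example-bucket/checkpoints/features")
               .outputMode("append")
               .start("s3://example-bucket/lakehouse/silver/features"))
```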
5. Database Performance & Optimisation
- Tune PostgreSQL and Spark for high-throughput time-series workloads (partitioning, indexing, query optimisation); see the sketch after this list.
- Optimise pipelines for both fast analytical queries and high-efficiency model training.
- Deploy and manage data pipelines in **AWS EKS (Kubernetes)** with persistent EBS-backed storage.
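On the Spark side, tuning often starts with session-level settings. The values below are illustrative starting points only, not recommendations, and would need benchmarking against real workloads.

```python
from pyspark.sql import SparkSession

# Example starting points only; the right values depend on cluster size
# and data volume, and should be validated against real workloads.
spark = (
    SparkSession.builder.appName("timeseries-tuning")
    # Let Spark re-plan shuffles from runtime statistics.
    .config("spark.sql.adaptive.enabled", "true")
    # Coalesce small shuffle partitions produced by skewed time windows.
    .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
    # Baseline shuffle parallelism before AQE adjusts it.
    .config("spark.sql.shuffle.partitions", "400")
    .getOrCreate()
)
```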
What Success Looks Like
- Live data streams are contextualised, queryable, and AI-ready.
- Schema changes and signal drift are detected and handled without breaking downstream workflows.
- Training and inference pipelines run smoothly in parallel, optimised for scale and latency.
What we offer
- Competitive compensation plus equity
- Work from home setup allowance
- Private Medical
- Learning and conferencing allowances
- More to come
Apply now