⚙️ Composer — Data Pipelines · Drag-and-drop operators · The AI Data Engineer's tool

Every pipeline, drag-and-dropped.

Drag. Drop. Deploy. No Python. No lock-in. No waiting.

Composer turns ETL operators into visual blocks. Build a production-grade workflow on a canvas — wire a Dataset, blend with Data Blend, transform, then map and load with MIL. Powered by xAQUA's in-memory query engine. Version it in Git. Deploy with one click. First pipeline on Day 1.

Built for data engineers who've had enough of boilerplate, and for analysts who shouldn't need Python to move data. Composer composes the workflow. Your team owns the outcome.

⚙️
Composer
Visual workflow editor · live
Building
[Animation — drag-and-drop pipeline build: four operators pulled from the operator library (xAQUA · UDP) onto the canvas for the RiskAnalytics pipeline. 01 · DATASET — Extract Risk Data (postgres · last_30d). 02 · BLEND — Join Customer Master (INNER · on customer_id). 03 · TRANSFORM — Cleanse · Aggregate (in-memory query engine). 04 · MIL — Map · Identify · Load (snowflake · SCD-2). Library also shows Data Load, Salesforce, Python, MySQL → SQS, SQS → Email. End state: Pipeline ready · auto-generated · Git versioned · CI/CD deployed · K8s healthy.]
🚀 Drag from the operator library. Drop on the canvas. Wire it up. Composer generates the workflow — runs on the in-memory query engine, deploys to K8s.
Composer in production
Drag · Drop — Visual Editor
Day 1 — First Pipeline
0 — Python Required
9 — Pipeline Templates
In-Memory — Query Engine
Why Composer Exists

Pipelines shouldn't be a full-time job.

Most data teams spend more time writing and maintaining pipeline code than getting value from the data that moves through it. The scripts sprawl. Schema breaks on Friday at 5pm. Nobody remembers why the cron exists. By the time a new source lands, you're three sprints behind.

Composer replaces the code with a canvas. xAQUA's in-memory query engine runs the workflow underneath — pipelines auto-generate, version in Git, and deploy through CI/CD to Kubernetes. Schema contracts catch breaks at design time. Quality operators catch bad data before it leaves the pipeline. Observability tells you what broke, and why, in minutes.

Pipelines aren't code. They're an operating model.

The Composer Foundation

More than ETL.
Built on a foundation.

Composer isn't another drag-and-drop ETL canvas. It's the tool of xAQUA's AI Data Engineer, sitting on a foundation engineered to solve every critical data pipeline challenge — semantics, lineage, observability, master data, migration testing — at the root.

PILLAR 02 · UNDERSTANDING

Semantic Layer Foundation

Powered by SemantIQ. Composer understands both sides of every pipeline — source and target — in business terms. Source-to-target field mapping is auto-generated, not hand-documented.

SemantIQ Auto schema inference Auto mapping
PILLAR 03 · LINEAGE

End-to-End Column-Level Lineage

Active Metadata from SemantIQ tracks every transform at the column level. Forward impact: "if I change this, what breaks?" Backward root-cause: "this dashboard is wrong — where did the data come from?"

Column-level Forward impact Root cause
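
Under the hood, forward impact and root cause are just traversals over a column-level dependency graph. A minimal sketch in Python — the edges and column names are hypothetical, not SemantIQ's actual model:

from collections import deque

# Hypothetical column-level lineage edges: upstream column → columns
# derived from it. Names are illustrative only.
DOWNSTREAM = {
    "salesforce.contact.region": ["warehouse.members.region"],
    "warehouse.members.region": [
        "dash.member360.region_widget",
        "ml.churn.cohort_split",
    ],
}
# Reverse the edges once for backward (root-cause) traversal.
UPSTREAM: dict[str, list[str]] = {}
for src, targets in DOWNSTREAM.items():
    for tgt in targets:
        UPSTREAM.setdefault(tgt, []).append(src)

def reachable(start: str, edges: dict[str, list[str]]) -> set[str]:
    """Breadth-first walk; works for either direction of the graph."""
    seen, queue = set(), deque([start])
    while queue:
        for nxt in edges.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# "If I change this, what breaks?" — forward impact
print(reachable("salesforce.contact.region", DOWNSTREAM))
# "This dashboard is wrong — where did the data come from?" — root cause
print(reachable("dash.member360.region_widget", UPSTREAM))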
PILLAR 04 · MIGRATION TESTING

Natural Language Migration Testing

The killer capability. xAQUA Analytics Data Lake lets you reconcile source and target migrations in plain English — no SQL, no scripts. Ask "Do Q3 totals match?" and get back row counts, sums, deltas, and the rows that don't reconcile. Migration testing that used to take weeks, in minutes.

Analytics Data Lake ConverseSQL No SQL required
PILLAR 05 · TRUST

Built-in Observability & Trust Score

SLA tracking, anomaly detection, schema drift alerts, and dataset-level Trust Scores are not bolted on — they're built into every operator. Quality gates fire before bad data leaves the pipeline.

SLA · Drift · Anomaly Trust Score Quality gates
PILLAR 06 · MASTER DATA

Master Data Built-In

Automated MDM, Probabilistic Entity Resolution, and SCD-0/1/2/3 strategies — all built into Composer's MIL operator. No separate MDM tool. Customer 360, Patient 360, Member 360 — by configuration.

Automated MDM Probabilistic ER SCD-0/1/2/3
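
For a feel of what an SCD-2 load does mechanically — expire the current row, append the new version — here's a hand-rolled Python/pandas sketch. Column names and the single-attribute compare are illustrative; the MIL operator configures this declaratively rather than in code:

import pandas as pd

# Illustrative SCD-2 merge. 'tier' is a hypothetical tracked attribute.
dim = pd.DataFrame([
    {"customer_id": 1, "tier": "gold", "valid_from": "2024-01-01",
     "valid_to": None, "is_current": True},
])
incoming = pd.DataFrame([{"customer_id": 1, "tier": "platinum"}])
load_date = "2024-07-01"

merged = incoming.merge(dim[dim["is_current"]], on="customer_id",
                        how="left", suffixes=("", "_old"))
changed = merged[merged["tier"] != merged["tier_old"]]

# Expire the superseded current rows...
mask = dim["customer_id"].isin(changed["customer_id"]) & dim["is_current"]
dim.loc[mask, ["valid_to", "is_current"]] = [load_date, False]
# ...and append the new versions with open-ended validity.
new_rows = changed[["customer_id", "tier"]].assign(
    valid_from=load_date, valid_to=None, is_current=True)
dim = pd.concat([dim, new_rows], ignore_index=True)
print(dim)   # full history preserved: gold (closed) + platinum (current)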
Active Pipeline Health

Data engineers spend 60% of their time fixing broken pipelines.
Composer fixes that.

Schema drift. Silent failures. Cascading downstream errors. The firefighting tax. Composer collapses it with a four-part defense — prevent at design time, detect in real time, trace through end-to-end lineage, alert before bad data leaves the gate.

60%
The firefighting tax
Industry research

The data engineer's biggest line item isn't building. It's fixing.

Most data teams report spending roughly 60% of their working time investigating, diagnosing, and repairing pipelines that broke overnight — schemas that drifted, sources that changed, queries that silently returned the wrong rows. That's three days a week, per engineer, lost to firefighting. Composer reclaims those days. Pipelines built on Composer don't break the same way — and when something does shift upstream, you know it minutes after deploy, not the morning the dashboard is wrong.

Why 60% — three reasons
CAUSE 01 · BUILD
Every new source means a custom Python script, a code review, a deploy, and a hope-for-the-best Monday.
CAUSE 02 · DEPLOY
Schema breaks in prod because nobody validated the contract at design time. Three downstream reports are already wrong.
CAUSE 03 · OPERATE
An orchestrator runs the pipeline. Someone else watches quality. A third tool does lineage. Nothing talks.
Shift Left · Prevent breaks before they happen
Battle-tested operators from Athyna
Composer's operators come from Athyna — the same transformations data analysts have already tested interactively on real data in the studio. By the time they land in a production pipeline, they've been proven. Fewer novel transformations mean fewer novel breaks.
Pre-validated · Reusable
SemantIQ schema contracts
Schema contracts validate at design time and runtime. SemantIQ tracks every source schema; when a column is added, removed, or retyped, contracts catch it — before the next run executes, not after the dashboard is wrong. Broken contracts surface in both editors and pipelines.
SemantIQ · Active Metadata
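
A minimal sketch of what design-time contract validation checks, assuming a contract is just a column-to-type map. SemantIQ's real contracts (like the members_daily/v2.3.yaml referenced below) carry far more metadata; this only shows the drift detection:

contract = {"id": "uuid", "email": "string", "phone": "string",
            "name": "string"}
observed = {"id": "uuid", "email": "string", "phone": "string",
            "name": "string", "region": "string"}   # drifted source

def diff_schema(contract: dict, observed: dict) -> dict:
    return {
        "added":   sorted(observed.keys() - contract.keys()),
        "removed": sorted(contract.keys() - observed.keys()),
        "retyped": sorted(c for c in contract.keys() & observed.keys()
                          if contract[c] != observed[c]),
    }

drift = diff_schema(contract, observed)
if any(drift.values()):
    # In Composer this fails the build and alerts the channel;
    # here we just surface the drift.
    raise SystemExit(f"schema drift detected: {drift}")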
Shift Right · Detect and diagnose in seconds
Real-time observability
SLA tracking, throughput, latency, row-count anomaly detection, distribution drift. Slack, Teams, and PagerDuty routing. Know in minutes — not the next morning — that something is off, and exactly which run, which operator, which row count is suspect.
Live · Per-step
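
At its core, the row-count check is simple statistics — flag a run that sits far outside recent history. A toy version with invented numbers; Composer's detectors add distribution drift, latency, and SLA windows on top:

import statistics

# Hypothetical run history of row counts, plus one suspicious run.
history = [12_840_113, 12_847_089, 12_851_904, 12_846_330, 12_849_512]
latest = 9_112_004

mean, stdev = statistics.mean(history), statistics.stdev(history)
zscore = (latest - mean) / stdev
if abs(zscore) > 3:
    print(f"row-count anomaly: z={zscore:.1f} — route to Slack/PagerDuty")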
End-to-end column-level lineage
SemantIQ tracks every transform with column-level lineage. Forward: "if I change this column, what breaks?" Backward: "this dashboard is wrong — where did the data come from?" One graph. Every dependency. Root cause in seconds, not days.
SemantIQ · Column-level
SEMANTIQ ACTIVE METADATA — Schema change → instant impact analysis · LIVE ALERT · 2 min ago
[Diagram — three-column lineage alert. SOURCE: Salesforce · Contact schema (id uuid · email string · phone string · name string · region string ⚠ NEW COLUMN, added 2 min ago). COMPOSER PIPELINE: members_daily.pipeline — 01 DATASET SalesforceCDC · 02 TRANSFORM EntityResolve · 03 SEMANTIQ CONTRACT ContractValidator (⚠ schema drift detected: column 'region' added) · 04 QUALITY QualityGate · 05 LOAD SnowflakeLoad. DOWNSTREAM — forward impact, 3 dependencies: Member 360 Dashboard (uses region · 14 widgets) · Churn Prediction Model (retrain required · cohort split) · Compliance DaaS API (contract update v2.3 → v2.4).]
✓ Caught at design time. SemantIQ's column-level lineage flagged the change before the next scheduled run, with a forward-impact list ready for review.
contract: members_daily/v2.3.yaml · alerted: #data-engineering
Legacy Migration & Modernization

From legacy systems to a modern stack.
In weeks, not years.

Government agencies are stuck on mainframes. Commercial firms are stuck on systems someone wrote in 1998. Both face the same trap: undocumented business rules, opaque schemas, and migration projects that overrun every estimate. Composer breaks the trap. Built on a semantic-layer foundation that understands both sides of the migration — your legacy schema and your target system — Composer auto-generates the mapping, enforces master-data quality, and lets you reconcile source and target in plain English.

01 · UNDERSTAND

Both sides of the migration, understood.

SemantIQ models the semantics of your legacy source and your target system — Salesforce, Snowflake, Databricks, BigQuery, whatever you're migrating to. With both sides understood, source-to-target field mapping is auto-generated, not hand-documented.

  • Semantic layer for source & target
  • Auto-inferred schemas at design & runtime
  • Auto-generated source-to-target mapping
  • Documented lineage from day one
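
As a purely illustrative rendering of what an auto-generated mapping might look like — field names, casts, and confidence scores are invented, borrowing the BENEFITS_HIST example from the reconciliation demo below; this is not SemantIQ's actual artifact format:

# Hypothetical source-to-target mapping, as data.
mapping = [
    {"source": "BENEFITS_HIST.PAY_AMT", "target": "payments.payment_amt",
     "cast": "DECIMAL(18,2)", "confidence": 0.98},
    {"source": "BENEFITS_HIST.MBR_NUM", "target": "payments.member_id",
     "cast": "STRING",        "confidence": 0.95},
    {"source": "BENEFITS_HIST.PAY_DT",  "target": "payments.paid_on",
     "cast": "DATE",          "confidence": 0.97},
]

def apply_mapping(row: dict, mapping: list[dict]) -> dict:
    """Rename source fields to their target names; casts elided."""
    src_to_tgt = {m["source"].split(".")[-1]: m["target"].split(".")[-1]
                  for m in mapping}
    return {src_to_tgt.get(k, k): v for k, v in row.items()}

print(apply_mapping({"PAY_AMT": "1023.50", "MBR_NUM": "A-4471"}, mapping))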
02 · ENSURE QUALITY

The highest quality migrated data.

Migration that loses or corrupts master data isn't migration — it's data debt with a new database. Composer's quality engineering is built into every operator: profile, cleanse, deduplicate, resolve, and history-track on the way through.

  • Auto data profiling — structure · pattern · value · integrity
  • Automated Master Data Management
  • Probabilistic Entity Resolution (golden records)
  • SCD-0, SCD-1, SCD-2, SCD-3 — all built in
03 · RECONCILE

Reconcile source & target in plain English.

The killer feature: xAQUA Analytics Data Lake lets you virtually reconcile source and target — without writing a single line of SQL. Ask in English: "Do Q3 totals match?" The data lake responds with row counts, sums, deltas, and the rows that don't reconcile.

  • Plain-English reconciliation queries
  • NL → ConverseSQL → in-memory query engine
  • Row count · sum · checksum · field-level diff
  • Discrepancy detail report, instantly
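
Mechanically, a reconciliation like the one below boils down to paired aggregates plus an anti-join for the rows that don't match. A pandas sketch over toy frames — the real run plans this through NL → ConverseSQL on the in-memory engine:

import pandas as pd

# Toy stand-ins for the legacy BENEFITS_HIST table and the target.
source = pd.DataFrame({"payment_id": [1, 2, 3],
                       "amt": [100.0, 250.0, -40.0]})
target = pd.DataFrame({"payment_id": [1, 2],
                       "amt": [100.0, 250.0]})

report = {
    "row_count": {"source": len(source), "target": len(target),
                  "delta": len(target) - len(source)},
    "sum_amt": {"source": source["amt"].sum(),
                "target": target["amt"].sum(),
                "delta": target["amt"].sum() - source["amt"].sum()},
}
# Field-level diff: the rows that don't reconcile.
missing = source[~source["payment_id"].isin(target["payment_id"])]
print(report)
print(missing)   # here, the reversed payment that didn't replay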
XAQUA ANALYTICS DATA LAKE — Migration testing & reconciliation, in plain English · LIVE · NO SQL REQUIRED
👤
Reconcile Q3 2024 benefit payments between the legacy mainframe (BENEFITS_HIST) and the Snowflake target (warehouse.payments). Are totals and row counts identical?
Running reconciliation across both data sources. Results below.
RECONCILIATION · Q3 2024 PAYMENTS ⚠ DISCREPANCY DETECTED
SOURCE → TARGET
Source Legacy.BENEFITS_HIST
Target snowflake.warehouse.payments
ROW COUNT
source 12,847,103
target 12,847,089
delta ⚠ -14 records
SUM (PAYMENT_AMT)
source $4,287,341,022.18
target $4,287,338,994.61
delta ⚠ -$2,027.57 (0.00005%)
CHECKSUMS BY CATEGORY
Standard ✓ matched
Reversed ⚠ 14 records · 9/28–9/30
Adjusted ✓ matched
Discrepancy isolated to the cutover window (Sep 28–30) — likely reversed-payment edge cases that didn't replay. Want me to auto-generate a remediation pipeline in Composer to backfill these 14 records?
CASE STUDY · CALIFORNIA STATE AGENCY — Salesforce Migration · Production
6 — datasets migrated
6 — weeks · DEV → TEST → PROD
1 — fractional analyst

From legacy chaos to a clean Salesforce CRM — without an army of consultants.

A California state agency ran a tangle of legacy datasets in diverse formats, with severe data quality problems and no reliable master or reference data. Compliance reporting depended on manual reconciliations. They needed to migrate to Salesforce — fast, with audit-grade quality.

Using xAQUA Athyna with natural-language transformations (NL → ConverseSQL → in-memory query engine) for prep, and xAQUA Composer for no-code ETL into Salesforce, the team profiled, cleansed, deduplicated, and loaded six datasets through DEV → TEST → PROD with one fractional analyst. Master-data uniqueness was enforced with SCD-0 and SCD-1 strategies built directly into Composer's MIL operator.

Migrated: Compliance Tracking V1 · Compliance V2 · ISR (Farm Monitoring) · NASS (Agricultural Statistics) · plus reference datasets
Master & Reference Data
Account · Contact · Location · Address · Account Contact Associated Location · Commodity · Commodity Category · Regulatory Code
Nine Pipeline Templates

Every pipeline pattern —
templated and ready.

Drag a template. Configure the sources. Deploy. Nine production-grade starting points for the patterns teams build every quarter.

🏢
Cloud DW Integration
Snowflake · Databricks · BigQuery · Redshift. Full schema inference.
🔄
ETL / ELT Pipeline
Batch or streaming. Any source, any target. Contract-enforced.
🤝
Data Sharing
Partner DaaS and API gateway. Governed, masked, metered.
🧠
ML Training Prep
Acquire, profile, split — feed features into model training.
🤖
ML Data Pipeline
Train · Evaluate · Test · Package. End to end, no notebooks.
🧬
Multi-Domain MDM Hub
Patient · Member · Customer 360. Probabilistic entity resolution built in.
🏛️
Legacy Migration
Extract · cleanse · modernize. 18-month projects in 6 weeks.
🧹
Data Wrangling
Merge · dedupe · aggregate · filter. Reusable across pipelines.
🔍
Data Profiling
Structure, value, integrity. Profile the source before you move it.
What Composer Does

Pipelines as a first-class operating model.

🧩
Visual Workflow Composer
Drag operators onto a canvas. Wire them up. Composer generates the workflow definition, validates it, and runs it on xAQUA's in-memory query engine. No boilerplate — ever.
  • Visual task configuration with JSON schema
  • Sub-workflow reuse and parameterized templates
  • Auto-generated workflow definition file
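
For a sense of shape — this is not xAQUA's actual file format — a generated definition for the RiskAnalytics pipeline shown later on this page might render like this:

import json

# Hypothetical rendering of a Composer-generated workflow definition.
pipeline = {
    "name": "RiskAnalytics",
    "schedule": "0 */6 * * *",
    "operators": [
        {"id": "extract", "type": "UDPDataset",
         "config": {"source": "postgres://risk.transactions",
                    "asset": "last_30d"}},
        {"id": "blend", "type": "UDPDataBlend",
         "config": {"join": "INNER", "on": "customer_id"},
         "depends_on": ["extract"]},
        {"id": "transform", "type": "UDPTransformation",
         "config": {"tasks": ["filter", "group_by", "aggregate"]},
         "depends_on": ["blend"]},
        {"id": "load", "type": "UDPMIL",
         "config": {"target": "snowflake.risk.scores", "scd": 2},
         "depends_on": ["transform"]},
    ],
}
print(json.dumps(pipeline, indent=2))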
📦
Purpose-Built Operator Library
Dataset, Data Blend, Transformation, MIL, Data Load, Python, Salesforce, MySQL→SQS, SQS→Email, and more. All in a searchable library. Drag into any pipeline. Configure visually.
  • Database · SaaS · file · API connectors
  • Custom operator onboarding in minutes
  • Versioned, governed operator library
🛡️
Schema Contracts & Quality Gates
Enforce schema contracts at design time and runtime. Quality operators gate every step — bad data doesn't make it past the fence.
  • Design-time and runtime schema validation
  • Built-in DQ operators — validate, dedupe, resolve
  • Probabilistic Entity Resolution operator
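
To make "probabilistic" concrete: a toy two-field matcher that scores fuzzy name similarity plus exact email agreement, then thresholds into match / review / distinct. Production probabilistic ER (Fellegi-Sunter style) weighs many more fields and learns its weights; the names, weights, and thresholds here are invented:

from difflib import SequenceMatcher

def match_score(a: dict, b: dict) -> float:
    name_sim = SequenceMatcher(None, a["name"].lower(),
                               b["name"].lower()).ratio()
    email_eq = 1.0 if a["email"] == b["email"] else 0.0
    return 0.6 * name_sim + 0.4 * email_eq   # illustrative weights

pair = ({"name": "Jonathan Q. Smith", "email": "jq@acme.com"},
        {"name": "Jon Smith",         "email": "jq@acme.com"})
score = match_score(*pair)
verdict = ("match" if score > 0.85 else
           "review" if score > 0.6 else "distinct")
print(f"{score:.2f} → {verdict}")   # 'match' pairs merge into golden records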
⚡
CDC & Streaming
Real-time and near-real-time sync via Apache Kafka. Pull change data from Salesforce, ServiceNow, SAP, or any operational database. Ship it in seconds.
  • Kafka-native CDC streams
  • API polling for SaaS systems
  • Batch, micro-batch, and true streaming
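
For orientation, this is roughly what consuming a CDC stream looks like by hand, using the open-source kafka-python client and Debezium-style op codes. Topic, brokers, and event shape are hypothetical; Composer's operators do this wiring visually:

import json
from kafka import KafkaConsumer   # pip install kafka-python

# Hypothetical CDC topic carrying change events as JSON.
consumer = KafkaConsumer(
    "salesforce.contact.cdc",
    bootstrap_servers=["broker-1:9092"],
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="earliest",
)

for event in consumer:
    change = event.value                  # e.g. {"op": "u", "after": {...}}
    if change.get("op") in ("c", "u"):    # create / update → upsert
        print("upsert", change["after"])
    elif change.get("op") == "d":         # delete → tombstone downstream
        print("delete", change["before"])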
🔀
Git-Versioned · CI/CD-Deployed
Every pipeline is automatically versioned in Git. Every deploy runs through CI/CD. Every promotion is reviewable. Engineering process, built in.
  • Integrated GitHub repository
  • One-click deploy to Kubernetes
  • Environment promotion · dev → staging → prod
📡
Observability Built In
SLA tracking, anomaly detection, and alerting across every pipeline. Know in minutes — not the next morning — why a pipeline is off.
  • SLA, throughput, and latency tracking
  • Anomaly detection on row counts and distributions
  • Slack, Teams, PagerDuty routing
The Pipeline Lifecycle

One platform. Five steps. Zero handoffs.

Create. Modify. Deploy. Run. Monitor. The whole loop, on one canvas — no second tools, no copy-paste between systems.

[Diagram — Composer pipeline lifecycle: five stages around the central Composer hub (Q COMPOSER · UDP).
1 · Create — visually configure ETL/ELT operators on a canvas; no code, no boilerplate.
2 · Modify & Version — every change auto-versioned in Git; branch, review, merge, full history.
3 · Deploy — one-click promote to K8s; CI/CD-driven; dev → staging → prod, governed.
4 · Schedule & Run — cron, event triggers, manual runs; retries, backoff, SLAs built in.
5 · Monitor — SLA tracking, anomaly detection, alerts to Slack · Teams · PagerDuty.]
Powered by integrated platform components: Operator Catalog · Pipeline Repository · Metadata Repository · UDP 360 Database · GitHub Integration · CI/CD Pipeline · Docker Registry
Universal Connectivity

Connect anything to anything.

Any source. Any target. Any format. Composer's operators handle the integration tax — extract, transform, resolve, and load across every system you run.

[Diagram — Composer pipeline between source and target connectors. SOURCES: Oracle · Snowflake · MySQL · Salesforce · Excel · CSV · Parquet · Redshift. COMPOSER PIPELINE stages: 01 Extract · 02 Transform · 03 Entity Resolution · 04 Map & Load. TARGETS: Oracle · Snowflake · MySQL · Databricks · BigQuery · Amazon S3 · Redshift.]
01 · EXTRACT
Pull from any source
Real-time, batch, or streaming. Out-of-the-box connectors for relational, NoSQL, SaaS, cloud storage, and file formats.
02 · TRANSFORM
Cleanse, blend, aggregate
Filter, sort, pivot, merge, group, impute, and reshape — visually configured, run on the in-memory query engine.
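
The same work written out by hand in pandas, for comparison — impute, filter, group, aggregate over invented data. In Composer each line is a visually configured task, not code:

import pandas as pd

tx = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3],
    "amount":      [120.0, None, 75.0, 310.0, 42.0],
    "region":      ["west", "west", "east", "east", "west"],
})
result = (
    tx.assign(amount=tx["amount"].fillna(tx["amount"].median()))  # impute
      .query("amount > 50")                                       # filter
      .groupby("region", as_index=False)                          # group
      .agg(total=("amount", "sum"), avg=("amount", "mean"))       # aggregate
)
print(result)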
03 · ENTITY RESOLUTION
Match across systems
Probabilistic Entity Resolution operator unifies records — Customer 360, Patient 360, Member 360 — without custom code.
04 · MAP & LOAD
Deliver to any target
SCD-0/1/2/3 strategies, append or upsert, schema mapping. Land into warehouses, lakes, databases, or DaaS APIs.
Four Operators · One Pipeline

Compose the canvas. Composer runs the workflow.

Drag a Dataset Operator to extract. Add a Data Blend Operator to integrate sources. Cleanse and aggregate with the Transformation Operator. Map and load to your target with the MIL Operator. Wire them up — Composer generates the workflow and runs it on xAQUA's in-memory query engine. Version it in Git. Deploy to your K8s cluster.

  • In-memory query engine · zero infrastructure tax
  • Schema contracts enforced at design and run time
  • Quality gate operators on every step
  • Generated workflow is yours — inspect, edit, export
composer · core operators
── composer pipeline · risk_analytics ──

[1] UDP Dataset Operator
  source: "postgres://risk.transactions"
  asset:  "Query Asset · last_30d"           ✓ extracted

[2] UDP Data Blend Operator
  join:   "INNER · on customer_id"
  with:   "File Asset · customer_master.csv"  ✓ blended

[3] UDP Transformation Operator
  tasks:  "filter, group_by, aggregate"
  engine: "in-memory query engine"           ✓ transformed

[4] UDP MIL Operator
  target: "snowflake.risk.scores"
  scd:    "SCD-2 · history preserved"          ✓ loaded

── deploy · main@a3f9c2 ──
  workflow  RiskAnalytics.pipeline
  schedule  0 */6 * * *          ✓ active
  k8s pod   healthy             ✓ green

STATUS: GREEN · next run in 47m
Use Cases

Where Composer earns its keep.

🏛️
Legacy System Modernization
State agencies · regulatory bodies
Extract, cleanse, transform, and load from legacy mainframes and siloed operational systems. One regulator finished a migration planned for 18 months in 6 weeks — with a single analyst on Composer.
✓ Migrations that ship, not stall
🧬
MDM Hub & 360 Views
Healthcare · Financial Services
Build a Master Member / Patient / Customer Index using probabilistic entity resolution. Ingest from every operational system. Share governed 360s via DaaS API. No custom ER code.
✓ Single source of identity, across every system
❄️
Cloud Data Warehouse Pipelines
Snowflake · Databricks · BigQuery
Land data into Snowflake or Databricks without writing a line of Python. CDC from SaaS, validate the schema, quality-gate the load, and watch every run. Schema drift gets caught at the door.
✓ Warehouses you can trust on Monday morning
🧠
ML & Predictive Pipelines
Healthcare · Insurance · Public Health
Blend hospital and emergency discharge data, impute missing fields, deduplicate incidents, filter by cohort, and feed a fatal-incidence prediction model — one workflow, end to end. Train, evaluate, package, deploy.
✓ Reproducible training, retrained on schedule
Why Composer

Not another ETL tool.

Composer is a module of a unified platform — not a standalone pipeline product that needs its own catalog, its own quality engine, and its own lineage.

In-memory query engine · zero lock-in
Composer runs on xAQUA's own in-memory query engine. No proprietary runtime to license. No JVM cluster to babysit. The pipeline definition is portable, inspectable, and yours — Composer is a faster way to build it, not a cage around it.
Contracts before code runs
Schema contracts validate at design time, not just at runtime. A breaking source change fails the build — before it breaks your 3am pipeline.
Quality gates, every step
Every task has an optional quality gate — validate, dedupe, resolve, enforce. Bad data is stopped at the fence, not chased through downstream dashboards.
Same semantic layer · every pipeline
Composer reads and writes against the same business vocabulary your analysts, BI, and governance tools use. No two versions of "customer." Ever.
Operated by AI · augmented by humans
Most ETL tools are operated by humans, with an AI assistant glued on. Composer inverts that — the AI Data Engineer agent operates the canvas; humans review, approve, and steer. Augmentation, not replacement.
Deploys where your data lives
Private VPC, air-gapped, on-prem, or cloud. No data leaves your boundary. Pipelines run next to the data — including FedRAMP-aligned environments.
Meet the AI Data Engineer

Composer is the tool of xAQUA's AI Data Engineer.

The AI Data Engineer is an xAQUA agent that lives inside Composer. Powered by Active Metadata from the semantic layer, the catalog, and the lineage graph, the agent understands your sources, your business definitions, and your governance rules. Ask in English; the agent composes the pipeline, configures every operator, validates contracts, and wires the workflow.

Promote ad-hoc work from Athyna — xAQUA's interactive data studio — into Composer with one prompt. Same semantic layer. Same catalog. Same governance. The agent wraps the recipe into a scheduled, monitored, Git-versioned production pipeline. You review, approve, and steer.

See the full AI Data Team →
AI Data Engineer
xAQUA AI Data Team · always on
Operates Composer · Active Metadata
👤
Turn Priya's Athyna member prep into a daily pipeline.
Promoted Athyna recipe → Composer pipeline:

· members_daily.pipeline — 5 operators
· Schedule: 0 2 * * * · SLA 10min
· Quality gate: 99.2% threshold
· Git: main/pipelines/members_daily
· Deployed to K8s · observability wired

Running tonight. First output ready by 02:10.

Ready to stop writing boilerplate?

See Composer build a CDC pipeline from Salesforce to Snowflake — with entity resolution, quality gates, Git versioning, and K8s deployment — in under fifteen minutes.