Designing Idempotent Payment Ingestion Workflows

In the first BillingTracker post, I focused on the relational foundation: tenants, source systems, external identifiers, payments, ingestion runs, errors, and sync state.

The next step was proving that the schema could support real ingestion behavior.

For payment ingestion, the goal is not just to insert rows into billing.payments. The goal is to make the import safe to retry, resistant to duplicate source payloads, and protected against stale updates from an external system.

This post walks through the SQL prototype I used to test that behavior before wrapping it in an API.

From Schema to Behavior

The first schema pass created the pieces needed for ingestion:

clients
source systems
external ID mappings
locations
guarantors
dependents
payments
ingestion runs
ingestion errors
sync state

Those tables define the structure of the system. But ingestion is not only a modeling problem. It is also a behavior problem.

The next question was:

Can the schema support a payment import that is safe to retry?

That led to the first ingestion proof: an idempotent payment upsert.

Why Idempotency Matters

External ingestion workflows are messy.

A source system might resend the same payment after a timeout. A batch job might be retried. A payload might be processed twice. In some cases, an older version of a record might arrive after a newer version has already been imported.

For this project, idempotency means the same payment import can be safely repeated without creating duplicate canonical payments.

For payments, the natural idempotency key is:

client_id,
source_system_id,
external_payment_id

That combination means:

For a given tenant and source system, this external payment ID should map to one canonical payment record.

A unique constraint can prevent duplicate rows, but that is only part of the problem. The ingestion workflow also needs to decide whether the incoming record should insert, update, fail, or be skipped because it is stale.

Resolving External IDs Before Insert

Incoming payment records do not arrive with internal platform UUIDs.

They arrive with source-system identifiers, such as:

external_client_id
external_guarantor_id
external_dependent_id
external_location_id
external_payment_id

Before the system can insert or update a canonical payment, those external identifiers have to be resolved to internal IDs.

The SQL prototype starts with an incoming_payment CTE that simulates one external payment payload:

with incoming_payment as (
    select
        'dentrix-client-100'::text as external_client_id,
        'Dentrix'::text as source_system_name,
        'DX-PAY-INGEST-0001'::text as external_payment_id,
        'G-DX-1001'::text as external_guarantor_id,
        'D-DX-1001'::text as external_dependent_id,
        'DX-LOC-MIDTOWN'::text as external_location_id,
        99.99::numeric(12,2) as payment_amount,
        '2026-05-24T12:30:00Z'::timestamptz as payment_date,
        'posted'::text as payment_status,
        '2026-05-24T11:45:00Z'::timestamptz as source_updated_at
),

This is not a permanent table. It is a small stand-in for the kind of payload that will eventually come through an API or ingestion worker.

The next CTE, resolved_payment, performs the important mapping step:

resolved_payment as (
    select
        c.client_id,
        ss.source_system_id,
        l.location_id,
        g.guarantor_id,
        d.dependent_id,
        ip.external_payment_id,
        ip.payment_amount,
        ip.payment_date,
        ip.payment_status,
        ip.source_updated_at
    from incoming_payment ip
    join billing.source_systems ss
        on ss.source_system_name = ip.source_system_name
    join billing.client_source_mappings csm
        on csm.source_system_id = ss.source_system_id
       and csm.external_client_id = ip.external_client_id
    join billing.clients c
        on c.client_id = csm.client_id
    join billing.guarantors g
        on g.client_id = c.client_id
       and g.source_system_id = ss.source_system_id
       and g.external_guarantor_id = ip.external_guarantor_id
    join billing.dependents d
        on d.client_id = c.client_id
       and d.source_system_id = ss.source_system_id
       and d.external_dependent_id = ip.external_dependent_id
    join billing.location_source_mappings lsm
        on lsm.client_id = c.client_id
       and lsm.source_system_id = ss.source_system_id
       and lsm.external_location_id = ip.external_location_id
    join billing.locations l
        on l.location_id = lsm.location_id
)

The purpose of resolved_payment is to transform a source-system payment into something the platform can write as canonical data.

At that point, the ingestion flow has moved from this:

Dentrix client ID
Dentrix guarantor ID
Dentrix dependent ID
Dentrix location ID
Dentrix payment ID

to this:

client_id
source_system_id
guarantor_id
dependent_id
location_id
external_payment_id

That resolved shape is the bridge between the external source payload and the canonical payment row. The incoming record still preserves the external payment ID, but the relationships needed by the platform have been resolved to internal UUIDs.

This is the point where the workflow is ready to write to billing.payments.

Upserting the Payment with ON CONFLICT

Once the payment has been resolved, the insert can use the natural idempotency key:

client_id,
source_system_id,
external_payment_id

If no matching payment exists, we insert a new row. If a matching payment already exists, we route the operation through the conflict handler.

insert into billing.payments (
    client_id,
    location_id,
    source_system_id,
    guarantor_id,
    dependent_id,
    external_payment_id,
    payment_amount,
    payment_date,
    payment_status,
    source_updated_at
)
select
    rp.client_id,
    rp.location_id,
    rp.source_system_id,
    rp.guarantor_id,
    rp.dependent_id,
    rp.external_payment_id,
    rp.payment_amount,
    rp.payment_date,
    rp.payment_status,
    rp.source_updated_at
from resolved_payment rp
on conflict (client_id, source_system_id, external_payment_id)
do update set
    location_id = excluded.location_id,
    guarantor_id = excluded.guarantor_id,
    dependent_id = excluded.dependent_id,
    payment_amount = excluded.payment_amount,
    payment_date = excluded.payment_date,
    payment_status = excluded.payment_status,
    source_updated_at = excluded.source_updated_at,
    updated_at = now()
where billing.payments.source_updated_at <= excluded.source_updated_at
returning
    payment_id,
    external_payment_id,
    payment_amount,
    payment_status,
    source_updated_at;

The ON CONFLICT clause is what makes retrying the same payment safe. The same external payment does not become two canonical payment rows.

But the conflict handler does more than avoid duplicates. It also gives the workflow a place to decide whether the incoming record should update the existing payment.

Protecting Against Stale Updates

The most important part of the update is the where clause at the end of the conflict handler:

where billing.payments.source_updated_at <= excluded.source_updated_at

This prevents an older source record from overwriting newer payment state already stored by the platform.

That matters because ingestion systems do not always receive records in a perfect order. A retry, delayed batch, or source-system timing issue could send an older version of a payment after a newer one has already been processed.

The rule is:

If the incoming source_updated_at is newer than or equal to the stored value, the payment can update.
If the incoming source_updated_at is older than the stored value, the update is skipped.

That turns the upsert from a simple duplicate-prevention mechanism into a safer ingestion workflow.

Verifying the Workflow

I tested the prototype with four scenarios.

The first run used a new external payment ID:

'DX-PAY-INGEST-0001'::text as external_payment_id

That created a new row in billing.payments.

The second run used the same payload again. Because the payment already existed for the same tenant, source system, and external payment ID, PostgreSQL hit the conflict rule instead of inserting a duplicate. With the current source_updated_at guard, an equal timestamp is allowed to flow through the update path, but the important behavior for this first proof is that the repeated payload still resolves to one canonical payment row.

The third run kept the same external payment ID but used a newer source_updated_at.

175.25::numeric(12,2) as payment_amount,
'posted'::text as payment_status,
'2026-05-24T13:45:00Z'::timestamptz as source_updated_at

That updated the existing payment row.

The fourth run kept the same external payment ID but used an older source_updated_at:

99.00::numeric(12,2) as payment_amount,
'2026-05-24T11:00:00Z'::timestamptz as source_updated_at

That did not overwrite the newer payment row.

The verification query checked that the external payment still resolved to one canonical payment:

select
    client_id,
    source_system_id,
    external_payment_id,
    count(*) as payment_count
from billing.payments
where external_payment_id = 'DX-PAY-INGEST-0001'
group by client_id, source_system_id, external_payment_id;

The expected count is 1.

That is the key behavior I wanted to prove before building the API: the same source payment can be retried without creating duplicates, newer source data can update the canonical record, and stale source data cannot overwrite newer state.

Why This Matters Before Building the API

I intentionally proved this behavior in SQL before wrapping it in application code.

The API should not invent the ingestion rules. It should orchestrate them.

Once this SQL prototype worked, the API responsibility became clearer:

accept the external payment payload
resolve source-system identifiers
start an ingestion run
upsert payments safely
record failed records in ingestion_errors
return per-record outcomes such as inserted, updated, skipped, or failed

That gives the application layer a concrete behavior to wrap instead of a vague idea of “import payments.”

Next Steps

The next step is to wrap this behavior in the .NET API as the primary transactional backend for the project.

The SQL prototype proved the core ingestion rule. The API layer will add request validation, service boundaries, ingestion run tracking, error capture, and a response model that reports what happened for each payment in a batch.

I am also keeping a Python/FastAPI implementation in the repo as a separate exploration of ingestion and data-workflow patterns. The goal is not to copy the .NET API line by line, but to use the same billing ingestion domain to compare implementation tradeoffs across different backend stacks.

The important milestone in this phase is that BillingTracker has moved from schema design into repeatable ingestion behavior.

This does not solve every ingestion problem yet. It does not handle batch orchestration, partial failure recovery, retry scheduling, or event history end-to-end. But it proves the core write behavior the API can now build around.

Designing Idempotent Payment Ingestion Workflows

From Schema to Behavior

Why Idempotency Matters

Resolving External IDs Before Insert

Upserting the Payment with ON CONFLICT

Protecting Against Stale Updates

Verifying the Workflow

Why This Matters Before Building the API

Next Steps

Comments

Building BillingTracker

Designing the Core Schema for a Billing Ingestion Platform

More from this blog

Designing the Core Schema for a Billing Ingestion Platform

Command Palette

From Schema to Behavior

Why Idempotency Matters

Resolving External IDs Before Insert

Upserting the Payment with ON CONFLICT

Protecting Against Stale Updates

Verifying the Workflow

Why This Matters Before Building the API

Next Steps

Comments

Building BillingTracker

Designing the Core Schema for a Billing Ingestion Platform

More from this blog