Platform Foundation

Async Job Engine

BullMQ-backed queue/worker pattern with idempotency keys, retry policies, dead-letter queues, and surfaced job status — built for operations to inspect.

Why this matters for enterprise procurement

Long-running operations (bulk imports, schedule generation, QA cycle close-outs, exports) need to be reliable, observable, and idempotent. FrontLine's job engine is built on BullMQ with explicit retry policies, dead-letter queues for failures that exhausted retries, and an operator-facing job inspector so your ops team can see what's running, what failed, and what's pending.

How it's implemented

The queue and worker mechanics behind that operator visibility

Jobs are enqueued via a typed contract per job kind, with a required idempotency key. The worker service consumes from named queues at configured concurrency. Each job kind declares its retry parameters (maximum attempts, backoff strategy) and a timeout. Failures that exhaust their retries land in a dead-letter queue (DLQ) for inspection. Job state, progress percentage, and result are queryable via the API. Redis runs in noeviction mode so queued jobs cannot be silently dropped under memory pressure.
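As a rough sketch of the enqueue path described above (the `JobQueue` interface stands in for BullMQ's `Queue`, and names like `BulkImportPayload` and `enqueueBulkImport` are illustrative, not FrontLine's actual identifiers), the idempotency key can double as the BullMQ `jobId`, which deduplicates repeated submissions:

```typescript
import { createHash } from "node:crypto";

// Minimal structural stand-in for BullMQ's Queue.add(name, data, opts);
// in the real service this would be a bullmq Queue instance.
interface JobQueue {
  add(name: string, data: unknown, opts: {
    jobId: string;
    attempts: number;
    backoff: { type: "exponential"; delay: number };
  }): Promise<unknown>;
}

// Hypothetical payload for one job kind.
interface BulkImportPayload {
  tenantId: string;
  fileUrl: string;
}

// Derive a deterministic idempotency key from job kind + payload, so
// re-submitting identical work maps to the same jobId and is deduplicated.
// (Assumes a stable key order in the serialized payload.)
export function idempotencyKey(kind: string, payload: unknown): string {
  const digest = createHash("sha256")
    .update(kind)
    .update(JSON.stringify(payload))
    .digest("hex");
  return `${kind}:${digest.slice(0, 16)}`;
}

// Illustrative enqueue: the per-kind retry policy (max attempts,
// exponential backoff) travels with the submission.
export async function enqueueBulkImport(queue: JobQueue, payload: BulkImportPayload) {
  return queue.add("bulk-import", payload, {
    jobId: idempotencyKey("bulk-import", payload),
    attempts: 5,
    backoff: { type: "exponential", delay: 1_000 },
  });
}
```

Keying the `jobId` to the payload hash is what makes "submit twice, run once" hold even across client retries.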

Capabilities

What's covered out of the box

  • Typed job contracts with idempotency keys
  • Per-queue concurrency and rate limiting
  • Exponential backoff with configurable max attempts
  • Dead-letter queue with operator inspection UI
  • Job state machine surfaced to the API consumer
  • Webhook on completion / failure
  • Redis noeviction policy — no silent data loss under memory pressure
  • Audit event emitted on job submission and completion

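The retry and dead-letter behaviour above can be sketched as follows. The backoff formula mirrors BullMQ's built-in exponential strategy (base delay times 2^(attempt − 1)); BullMQ itself keeps exhausted jobs in its failed set, so the hand-off to a dedicated DLQ queue shown here is the common pattern, with illustrative names:

```typescript
// Exponential backoff: attempt n waits base * 2^(n-1) ms, capped so a
// misconfigured max-attempts value cannot produce unbounded waits.
export function backoffDelayMs(baseMs: number, attempt: number, capMs = 60_000): number {
  return Math.min(baseMs * 2 ** (attempt - 1), capMs);
}

// Minimal structural stand-ins for a failed job and the DLQ queue.
interface FailedJob {
  name: string;
  attemptsMade: number;
  opts: { attempts: number };
  data: unknown;
}

interface DeadLetterQueue {
  add(name: string, data: unknown): Promise<unknown>;
}

// On failure, retry until attempts are exhausted, then park the job in
// the DLQ for operator inspection instead of dropping it.
export async function onJobFailed(
  job: FailedJob,
  dlq: DeadLetterQueue,
): Promise<"retry" | "dead-letter"> {
  if (job.attemptsMade < job.opts.attempts) return "retry";
  await dlq.add(job.name, job.data);
  return "dead-letter";
}
```

With a 1 s base delay and five attempts, the waits run 1 s, 2 s, 4 s, 8 s before the job is parked in the DLQ.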
Standards & compliance

Audit-ready artifacts your reviewers can lean on

  • SOC 2 Type II — availability and reliability controls
  • Job retry policies documented per job kind
  • DLQ retention defaults to 30 days
  • Operator runbooks for common failure modes

Procurement FAQ

What security and compliance reviewers actually ask

What happens if a worker crashes mid-job?
The job is returned to the queue and reattempted by another worker. Idempotency keys ensure repeated execution doesn't double-write.
Can we see what jobs are running?
Yes. The platform admin UI exposes per-queue depth, in-flight jobs with progress, and the dead-letter queue. API endpoints expose the same data.
Are job submissions audited?
Yes — every job submission emits an audit event with the actor, payload hash, and idempotency key. Job completion emits a follow-up event with status.
How do you handle stuck jobs?
Each job kind has a declared timeout. Exceeding it marks the job as failed and triggers retry per the configured policy. Operators can manually requeue from the DLQ.
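A sketch of the submission audit event described in the FAQ above (field names are illustrative; the point is that only a SHA-256 hash of the payload reaches the audit log, never the payload itself):

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of the audit event emitted on job submission:
// actor, idempotency key, and a payload hash rather than the raw payload.
export interface JobAuditEvent {
  type: "job.submitted" | "job.completed";
  actor: string;
  idempotencyKey: string;
  payloadSha256: string;
  at: string;
}

export function jobSubmittedEvent(
  actor: string,
  idempotencyKey: string,
  payload: unknown,
): JobAuditEvent {
  return {
    type: "job.submitted",
    actor,
    idempotencyKey,
    // Hashing keeps potentially sensitive job data out of the audit trail
    // while still letting reviewers correlate a submission with its payload.
    payloadSha256: createHash("sha256").update(JSON.stringify(payload)).digest("hex"),
    at: new Date().toISOString(),
  };
}
```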

Run this past your security team

We share security overviews, RLS policy DDL, audit-event schemas, and SOC 2 progress on request. Book a 30-minute security review with the founders.
