Platform Foundation

Async Job Engine

BullMQ-backed queue/worker pattern with idempotency keys, retry policies, dead-letter queues, and surfaced job status — built for operations to inspect.

Why this matters for enterprise procurement

Long-running operations (bulk imports, schedule generation, QA cycle close-outs, exports) need to be reliable, observable, and idempotent. FrontLine's job engine is built on BullMQ with explicit retry policies, dead-letter queues for failures that exhausted retries, and an operator-facing job inspector so your ops team can see what's running, what failed, and what's pending.

How it's implemented

The queue and worker mechanics behind that operator visibility

Jobs are enqueued via a typed contract per job kind, with a required idempotency key. The worker service consumes from named queues at configured concurrency. Each job kind declares its retry parameters (maximum attempts, backoff strategy) and a timeout. Failures that exhaust their retries land in a dead-letter queue (DLQ) for inspection. Job state, progress percentage, and result are queryable via the API. Redis runs in noeviction mode so queued jobs cannot be silently dropped under memory pressure.
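As a rough sketch of the enqueue path described above (the `JobQueue` interface stands in for BullMQ's `Queue`, and names like `BulkImportPayload` and `enqueueBulkImport` are illustrative, not FrontLine's actual identifiers), the idempotency key can double as the BullMQ `jobId`, which deduplicates repeated submissions:

```typescript
import { createHash } from "node:crypto";

// Minimal structural stand-in for BullMQ's Queue.add(name, data, opts);
// in the real service this would be a bullmq Queue instance.
interface JobQueue {
  add(name: string, data: unknown, opts: {
    jobId: string;
    attempts: number;
    backoff: { type: "exponential"; delay: number };
  }): Promise<unknown>;
}

// Hypothetical payload for one job kind.
interface BulkImportPayload {
  tenantId: string;
  fileUrl: string;
}

// Derive a deterministic idempotency key from job kind + payload, so
// re-submitting identical work maps to the same jobId and is deduplicated.
// (Assumes a stable key order in the serialized payload.)
export function idempotencyKey(kind: string, payload: unknown): string {
  const digest = createHash("sha256")
    .update(kind)
    .update(JSON.stringify(payload))
    .digest("hex");
  return `${kind}:${digest.slice(0, 16)}`;
}

// Illustrative enqueue: the per-kind retry policy (max attempts,
// exponential backoff) travels with the submission.
export async function enqueueBulkImport(queue: JobQueue, payload: BulkImportPayload) {
  return queue.add("bulk-import", payload, {
    jobId: idempotencyKey("bulk-import", payload),
    attempts: 5,
    backoff: { type: "exponential", delay: 1_000 },
  });
}
```

Keying the `jobId` to the payload hash is what makes "submit twice, run once" hold even across client retries.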

Capabilities

What's covered out of the box

  • Typed job contracts with idempotency keys
  • Per-queue concurrency and rate limiting
  • Exponential backoff with configurable max attempts
  • Dead-letter queue with operator inspection UI
  • Job state machine surfaced to the API consumer
  • Webhook on completion / failure
  • Redis noeviction policy — no silent data loss under memory pressure
  • Audit event emitted on job submission and completion

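The retry and dead-letter behaviour above can be sketched as follows. The backoff formula mirrors BullMQ's built-in exponential strategy (base delay times 2^(attempt − 1)); BullMQ itself keeps exhausted jobs in its failed set, so the hand-off to a dedicated DLQ queue shown here is the common pattern, with illustrative names:

```typescript
// Exponential backoff: attempt n waits base * 2^(n-1) ms, capped so a
// misconfigured max-attempts value cannot produce unbounded waits.
export function backoffDelayMs(baseMs: number, attempt: number, capMs = 60_000): number {
  return Math.min(baseMs * 2 ** (attempt - 1), capMs);
}

// Minimal structural stand-ins for a failed job and the DLQ queue.
interface FailedJob {
  name: string;
  attemptsMade: number;
  opts: { attempts: number };
  data: unknown;
}

interface DeadLetterQueue {
  add(name: string, data: unknown): Promise<unknown>;
}

// On failure, retry until attempts are exhausted, then park the job in
// the DLQ for operator inspection instead of dropping it.
export async function onJobFailed(
  job: FailedJob,
  dlq: DeadLetterQueue,
): Promise<"retry" | "dead-letter"> {
  if (job.attemptsMade < job.opts.attempts) return "retry";
  await dlq.add(job.name, job.data);
  return "dead-letter";
}
```

With a 1 s base delay and five attempts, the waits run 1 s, 2 s, 4 s, 8 s before the job is parked in the DLQ.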
Standards & compliance

Audit-ready artifacts your reviewers can lean on

  • SOC 2 Type II — availability and reliability controls
  • Job retry policies documented per job kind
  • DLQ retention defaults to 30 days
  • Operator runbooks for common failure modes

Procurement FAQ

What security and compliance reviewers actually ask

What happens if a worker crashes mid-job?
The job is returned to the queue and reattempted by another worker. Idempotency keys ensure repeated execution doesn't double-write.
Can we see what jobs are running?
Yes. The platform admin UI exposes per-queue depth, in-flight jobs with progress, and the dead-letter queue. API endpoints expose the same data.
Are job submissions audited?
Yes — every job submission emits an audit event with the actor, payload hash, and idempotency key. Job completion emits a follow-up event with status.
How do you handle stuck jobs?
Each job kind has a declared timeout. Exceeding it marks the job as failed and triggers retry per the configured policy. Operators can manually requeue from the DLQ.
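A sketch of the submission audit event described in the FAQ above (field names are illustrative; the point is that only a SHA-256 hash of the payload reaches the audit log, never the payload itself):

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of the audit event emitted on job submission:
// actor, idempotency key, and a payload hash rather than the raw payload.
export interface JobAuditEvent {
  type: "job.submitted" | "job.completed";
  actor: string;
  idempotencyKey: string;
  payloadSha256: string;
  at: string;
}

export function jobSubmittedEvent(
  actor: string,
  idempotencyKey: string,
  payload: unknown,
): JobAuditEvent {
  return {
    type: "job.submitted",
    actor,
    idempotencyKey,
    // Hashing keeps potentially sensitive job data out of the audit trail
    // while still letting reviewers correlate a submission with its payload.
    payloadSha256: createHash("sha256").update(JSON.stringify(payload)).digest("hex"),
    at: new Date().toISOString(),
  };
}
```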

Run this past your security team

We share security overviews, RLS policy DDL, audit-event schemas, and SOC 2 progress on request. Book a 30-minute security review with the founders.
