DNA //evolutions

TourOptimizer Production Guide

How to use TourOptimizer in production: job lifecycle, sync runs, tenant isolation, encryption at rest, and job control.


Overview


Choosing the right mode

TourOptimizer offers two integration modes. Choose based on expected run duration and your deployment type.

Job mode (/api/v1/jobs) vs. sync run mode (/api/v1/runs):

HTTP model
  Job mode: fire-and-forget; returns 202 immediately.
  Sync run mode: returns 202 with a runId, then blocks on the result endpoint.
Result delivery
  Job mode: poll GET /api/v1/jobs/{jobId}/result after completion.
  Sync run mode: GET /api/v1/runs/{runId}/result blocks until done.
Suitable for
  Job mode: any run duration, especially long runs (minutes to hours).
  Sync run mode: short runs only (up to a few minutes).
Persistence
  Job mode: results stored in MongoDB, retrievable at any time.
  Sync run mode: in-memory only, no persistence.
Concurrency
  Job mode: multiple jobs run concurrently, each isolated by jobId.
  Sync run mode: multiple runs supported, each isolated by runId.
Live events
  Job mode: poll progress/status from the database, or receive a completion webhook.
  Sync run mode: subscribe to SSE streams while the run is active.
Requires
  Job mode: database (DNA_DATABASE_ACTIVE=true).
  Sync run mode: sync controllers (DNA_SHOW_SYNCH_CONTROLLERS=true).

For any production deployment or any optimization that might run longer than a minute, use job mode.


Job mode

Submit

Send a POST to /api/v1/jobs with the standard RestOptimization body and the X-Tenant-Id header. The server generates a jobId, starts the optimization asynchronously, and immediately returns HTTP 202.

POST /api/v1/jobs
Content-Type: application/json
X-Tenant-Id: my-tenant-123

Response (HTTP 202):

{
  "jobId": "648d2724-3a77-47f4-b937-d3ab6abf2341",
  "creatorHash": "11aa65b13c2a6d34f8727e82e403ce869e3bba1d35c45c595e8cc5ce5e74e57a",
  "ident": "JOpt-Run-1774127074120",
  "submittedAt": 1774131229940,
  "status": "ACCEPTED"
}
jobId: UUID v4, generated server-side. The primary handle for all subsequent operations.
creatorHash: SHA-256 of the creator name from the request's creatorSetting.
ident: User-defined label echoed back from the input.
submittedAt: Epoch-millisecond timestamp of submission.
status: Always ACCEPTED at submission. The actual status is tracked in the database.

Save the jobId. You will need it for all subsequent requests.
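A minimal submission helper can look like the following sketch, using only the Python standard library. The base URL, the helper name, and the injectable opener are illustrative assumptions, not part of the API:

```python
# Illustrative job submission helper. Only the endpoint path and headers
# come from the guide; everything else (names, opener injection) is a sketch.
import json
import urllib.request


def submit_job(base_url: str, tenant_id: str, optimization_body: dict,
               opener=urllib.request.urlopen):
    """POST a RestOptimization body to /api/v1/jobs and return the 202 payload."""
    req = urllib.request.Request(
        url=f"{base_url}/api/v1/jobs",
        data=json.dumps(optimization_body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Tenant-Id": tenant_id},
        method="POST",
    )
    with opener(req) as resp:
        payload = json.loads(resp.read())
    # Save the jobId: it is the handle for all subsequent requests.
    return payload["jobId"], payload
```

Injecting the opener keeps the helper testable without a live server; in production you would simply rely on the default.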

Poll for status

While the optimization is running, poll for status:

GET /api/v1/jobs/{jobId}/status
X-Tenant-Id: my-tenant-123

Wait for a status of SUCCESS_WITH_SOLUTION, SUCCESS_WITHOUT_SOLUTION, or ERROR.
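A simple polling loop over the status endpoint might look like this sketch; the function name and the injectable fetch_status callable are illustrative:

```python
import time

# Terminal states as listed in the guide.
TERMINAL_STATES = {"SUCCESS_WITH_SOLUTION", "SUCCESS_WITHOUT_SOLUTION", "ERROR"}


def wait_for_completion(fetch_status, interval_seconds=5.0, max_polls=1000):
    """Poll fetch_status() until a terminal status is returned.

    fetch_status is any callable wrapping
    GET /api/v1/jobs/{jobId}/status (sent with the X-Tenant-Id header).
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        time.sleep(interval_seconds)
    raise TimeoutError("job did not reach a terminal status in time")
```

For long runs, consider a larger interval or the completion webhook described later in this guide instead of tight polling.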

Poll for progress (optional)

GET /api/v1/jobs/{jobId}/progress
X-Tenant-Id: my-tenant-123

Returns the persisted JOptOptimizationProgress documents for the job. Use ?limit=1&sortDirection=DESC to get only the most recent snapshot.

Retrieve the result

Once the status indicates completion:

GET /api/v1/jobs/{jobId}/result
X-Tenant-Id: my-tenant-123

Returns the full RestOptimization object including all routes, node assignments, scheduling details, and violation reports.

To retrieve the solution payload only (smaller response):

GET /api/v1/jobs/{jobId}/solution
X-Tenant-Id: my-tenant-123

Download as ZIP

GET /api/v1/jobs/{jobId}/export
X-Tenant-Id: my-tenant-123
X-Encryption-Secret: YourStr0ng!Secret_Here  (only for CLIENT-mode encrypted results)

Returns a ZIP archive of the persisted result. For KMS-encrypted or unencrypted results, omit the X-Encryption-Secret header — decryption is handled transparently by the server.
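The header rule can be captured in a small helper. This is an illustrative sketch (the function name is not part of the API): the secret header is attached only when a CLIENT-mode passphrase is in play.

```python
from typing import Optional


def export_headers(tenant_id: str, client_secret: Optional[str] = None) -> dict:
    """Build headers for GET /api/v1/jobs/{jobId}/export.

    X-Encryption-Secret is only needed for CLIENT-mode encrypted results;
    for KMS-encrypted or unencrypted results the server decrypts transparently.
    """
    headers = {"X-Tenant-Id": tenant_id}
    if client_secret:  # CLIENT mode only
        headers["X-Encryption-Secret"] = client_secret
    return headers
```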


Sync run mode

Sync runs are suitable for short optimizations where holding an HTTP connection open for a few minutes is acceptable. Each run is isolated by a runId and multiple runs can execute concurrently without interfering with each other.

Start a run

POST /api/v1/runs
Content-Type: application/json

Response (HTTP 202):

{
  "runId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "submittedAt": 1775130148792,
  "ident": "my-short-run"
}

Wait for the result

GET /api/v1/runs/{runId}/result

This call blocks until the optimization finishes, then returns the full RestOptimization. For the solution only:

GET /api/v1/runs/{runId}/solution

Subscribe to live events

While the run is in progress, subscribe to SSE streams in parallel:

GET /api/v1/runs/{runId}/stream/progress
GET /api/v1/runs/{runId}/stream/status
GET /api/v1/runs/{runId}/stream/warnings
GET /api/v1/runs/{runId}/stream/errors

Each stream is isolated to its runId. Concurrent runs each have their own independent subscriptions.
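SSE events arrive as data: lines separated by blank lines. A minimal parser for such a stream might look like the following sketch (an illustration of the wire format, not an official client; production code would typically use an SSE library):

```python
import json


def iter_sse_events(lines):
    """Yield JSON payloads from an SSE text-line stream.

    Each event is one or more 'data:' lines followed by a blank line.
    This minimal parser ignores comments, ids, and event names.
    """
    buffer = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("data:"):
            buffer.append(line[len("data:"):].strip())
        elif line == "" and buffer:
            yield json.loads("\n".join(buffer))
            buffer = []
```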

One-shot start signal

If you want to begin subscribing to streams only after the optimizer has actually transitioned to the running state:

GET /api/v1/runs/{runId}/started

Returns true once the optimizer starts. Because streams use ReplaySubject internally, subscribing before this signal is also safe — events are buffered and replayed to late subscribers.


The jobId and runId

Both identifiers are UUID v4 strings (122 bits of randomness), generated server-side before any async work begins. They are opaque handles — clients treat them as strings and never parse their contents.

The jobId is registered in the in-memory JobRegistry at submission time, before the 202 response is returned. This ensures that a stop signal sent immediately after the 202 will find the running optimizer.

Similarly, the runId is registered in the RunRegistry before the 202 is returned.

The jobId is not a security credential. Security is enforced by the tenantId from the verified API gateway header. The jobId's randomness is defense-in-depth, not the primary boundary.


Tenant isolation

Every persisted document (GridFS result, progress, status, warnings, errors) is tagged with both jobId and tenantId. All read queries are automatically scoped by both fields — a request with a valid jobId but a mismatched tenantId returns no data.

How the tenantId flows

  1. The client authenticates with the API gateway (for example Azure API Management) via an API key or OAuth token.
  2. The gateway resolves the subscription to a tenant and injects the X-Tenant-Id header.
  3. TourOptimizer reads the header server-side. The client cannot forge it.
  4. All persistence writes tag the data with this tenantId.
  5. All persistence reads filter by tenantId.

In local Docker and on-premise single-tenant setups, any fixed string is acceptable as the tenant identifier (for example local or my-tenant). The isolation guarantee only matters in multi-tenant deployments behind an API gateway.


Persistence settings

The persistenceSetting block controls what gets saved, whether encryption is active, and how long data is retained. It sits inside the extension object of the RestOptimization body, at the same level as keySetting.

"extension": {
  "timeOut": "PT2H",
  "persistenceSetting": {
    "mongoSettings": {
      "enablePersistence": true,
      "secret": "",
      "expiry": "PT48H",
      "requireUniqueIdentCreatorCombination": false,
      "optimizationPersistenceStrategySetting": {
        "saveConnections": false
      },
      "streamPersistenceStrategySetting": {
        "saveProgress": true,
        "cycleProgress": true,
        "saveStatus": true,
        "cycleStatus": true,
        "saveWarning": true,
        "saveError": true
      }
    }
  }
}

Top-level extension fields

timeOut (default PT8760H, i.e. 365 days): Maximum duration the optimization is allowed to run, in ISO 8601 duration format. The optimizer stops and persists the best result found if this limit is reached. For the sync run mode (/api/v1/runs), a more conservative value like PT5M is appropriate.

MongoDB settings (mongoSettings)

enablePersistence (default true): Must be true for any data to be saved. If false, the optimizer runs but nothing is written to the database.
secret (default ""): Passphrase for CLIENT-mode encryption. Leave empty for KMS or unencrypted storage. See Encryption at rest.
expiry (default PT48H): How long all data for this job is retained before automatic deletion. ISO 8601 duration; PT48H means 48 hours. See Data expiry and auto-cleanup.
requireUniqueIdentCreatorCombination (default false): When true, the server rejects a job submission if an existing result with the same ident and creator combination is already in the database. Useful for preventing duplicate runs in automated pipelines.

Optimization persistence strategy (optimizationPersistenceStrategySetting)

Controls what is included in the stored result snapshot. The full optimization config is always persisted.

saveConnections (default false): When true, connection matrices are included in the stored result. Connection data can be large; leave this false to keep the stored document small unless you specifically need to reconstruct or inspect the connection graph after the fact.

Stream persistence strategy (streamPersistenceStrategySetting)

Controls which event stream collections are populated during the optimization run.

saveProgress (default true): Save progress update events.
cycleProgress (default true): When true, only the most recent progress document is kept per job, overwritten in place on each update. When false, every progress event creates a new document. Use true for long runs to prevent unbounded collection growth.
saveStatus (default true): Save lifecycle status events (STARTED, RUNNING, SUCCESS, ERROR).
cycleStatus (default true): Same cycling behavior as cycleProgress, applied to status documents.
saveWarning (default true): Save warning events.
saveError (default true): Save error events.

Data expiry and auto-cleanup

Every persisted document (GridFS result, progress, status, warnings, errors) carries an expireAt timestamp, calculated at write time as submission time + expiry duration. MongoDB's TTL index mechanism automatically deletes documents once this timestamp passes. The server also runs its own cleanup scan on a configurable schedule (touroptimizer.security.database-clean-rate-seconds, default 7200 seconds).

The expiry field in mongoSettings controls the TTL for all documents belonging to that job. Setting it to a negative value or zero disables automatic expiry. If no expiry is provided, the default is PT48H (48 hours).
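The expireAt calculation can be reproduced client-side to predict when data disappears. The following sketch supports only the PT…H/M/S subset of ISO 8601 durations used in this guide (a full parser would also handle date components):

```python
import re
from datetime import datetime, timedelta, timezone


def parse_simple_duration(iso: str) -> timedelta:
    """Parse the PT<n>H / PT<n>M / PT<n>S duration subset, e.g. 'PT48H'."""
    m = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", iso)
    if not m:
        raise ValueError(f"unsupported duration: {iso}")
    hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def expire_at(submitted_at_ms: int, expiry: str = "PT48H") -> datetime:
    """expireAt = submission time + expiry, as stored on every document."""
    submitted = datetime.fromtimestamp(submitted_at_ms / 1000, tz=timezone.utc)
    return submitted + parse_simple_duration(expiry)
```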

SaaS deployments enforce a maximum retention period. On the DNA Evolutions SaaS platform, optimizations are retained for a maximum of 24 hours regardless of the value requested in expiry. If your integration needs results available for longer, retrieve and store them on your side within that window.

On-premise deployments can set any expiry duration. For long-term archiving, set a large expiry value or disable it entirely, and rely on your own backup strategy.

The recommended workflow for any client is:

  1. Submit the job and save the jobId.
  2. Wait for completion: either poll GET /api/v1/jobs/{jobId}/status at a reasonable interval, or configure completionWebhookUrl to receive a push notification.
  3. Retrieve the result as soon as the job completes.
  4. Store the result on your side if you need it beyond the expiry window.
  5. Optionally call DELETE /api/v1/jobs/{jobId} once you have retrieved the result, to release database space before the TTL fires.

Completion webhooks

Instead of polling GET /api/v1/jobs/{jobId}/status repeatedly, you can configure a webhook URL and let the server notify you when a job reaches a terminal state.

Configuration

Add two fields inside mongoSettings:

"mongoSettings": {
  "enablePersistence": true,
  "completionWebhookUrl": "https://your-service.example.com/jopt/callback",
  "completionWebhookSecret": "YourStr0ng!Secret_Here",
  ...
}
completionWebhookUrl (default ""): URL the server POSTs to when the job finishes or fails. Leave empty to disable.
completionWebhookSecret (default ""): HMAC-SHA256 signing secret. When set, the server includes an X-JOpt-Signature: sha256=<hex> header so the receiver can verify authenticity. Leave empty for unsigned delivery.

Payload

The server sends a minimal JSON payload — no optimization data, no routes:

{
  "jobId":       "648d2724-3a77-47f4-b937-d3ab6abf2341",
  "tenantId":    "my-tenant-123",
  "status":      "SUCCESS_WITH_SOLUTION",
  "completedAt": 1775130148792
}

status is one of SUCCESS_WITH_SOLUTION, SUCCESS_WITHOUT_SOLUTION, or ERROR. On error an additional errorMessage field is present. Use the jobId to fetch the full result from GET /api/v1/jobs/{jobId}/result.

Delivery semantics

Single attempt only, no retries. The webhook fires after the result has been persisted and the job has been deregistered from the active-jobs registry, so a receiver that calls GET /api/v1/jobs/{jobId}/result immediately upon receiving the webhook will find the result already available. The server waits at most 10 seconds for the receiving endpoint to respond. A failed delivery is logged at WARN level and never affects the job result.

Signature verification

When completionWebhookSecret is set, verify the signature on every delivery before trusting the payload:

// Java (requires Java 17+ for HexFormat)
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public boolean verify(byte[] rawBody, String signatureHeader, String secret) throws Exception {
    Mac mac = Mac.getInstance("HmacSHA256");
    mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
    String expected = "sha256=" + HexFormat.of().formatHex(mac.doFinal(rawBody));
    // MessageDigest.isEqual is a constant-time comparison, preventing timing attacks
    return MessageDigest.isEqual(
        expected.getBytes(StandardCharsets.UTF_8),
        signatureHeader.getBytes(StandardCharsets.UTF_8));
}

# Python
import hmac, hashlib

def verify(raw_body: bytes, signature_header: str, secret: str) -> bool:
    expected = "sha256=" + hmac.new(
        secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # hmac.compare_digest is a constant-time comparison
    return hmac.compare_digest(expected, signature_header)

URL validation policy

The server validates the webhook URL at job submission time — not when the webhook fires. This prevents delayed SSRF attacks. The active policy is controlled by touroptimizer.security.webhook-validation (environment variable DNA_WEBHOOK_VALIDATION):

STRICT (value strict, default): Public HTTPS only. Private ranges, link-local, and cloud metadata addresses are rejected.
RELAXED (value relaxed): HTTP and HTTPS. Private network addresses (10.x, 172.16-31.x, 192.168.x) are allowed. Loopback is still rejected.
NONE (value none): Any URL is accepted. A prominent startup warning is logged. Local development only.

Loopback addresses (127.x, ::1) are always rejected in strict and relaxed. Within a Docker network, use the container name or host.docker.internal instead.
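An approximation of the STRICT policy can be expressed in a few lines. This is illustrative only; the server's actual validation may differ in detail:

```python
# Sketch of STRICT-style validation: public HTTPS only. Cloud metadata
# addresses such as 169.254.169.254 fall inside the link-local range.
import ipaddress
import socket
from urllib.parse import urlparse


def check_webhook_url_strict(url: str) -> bool:
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        return False  # HTTPS only
    try:
        addr = ipaddress.ip_address(socket.gethostbyname(parsed.hostname))
    except (socket.gaierror, ValueError):
        return False  # unresolvable host
    return not (addr.is_private or addr.is_loopback or addr.is_link_local)
```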


Stopping and deleting jobs

Stop and delete are intentionally separate operations for jobs. The optimizer may take several seconds to finish gracefully after receiving a stop signal, and the final result is persisted during that window. Deleting before termination would race against that write.

Stop a running job

POST /api/v1/jobs/{jobId}/stop
X-Tenant-Id: my-tenant-123

This sends a graceful stop signal to the running optimizer and returns immediately. The optimizer finishes its current iteration and persists the best result found so far. The actual termination may take several seconds. Poll GET /api/v1/jobs/{jobId}/status to confirm the job has stopped before deleting.

Returns 404 if no active job is found for that jobId (the job has already completed or never started).

Delete persisted data

DELETE /api/v1/jobs/{jobId}
X-Tenant-Id: my-tenant-123

Permanently removes the GridFS result document and all stream documents (progress, status, warnings, errors) for the given jobId. This operation is idempotent — calling it on an already-deleted job returns 200.

This endpoint does not send a stop signal. Only call it after the job has terminated.

Stop a sync run

DELETE /api/v1/runs/{runId}

Sends a graceful stop signal to the running sync optimizer. The result is then returned by the blocking GET /api/v1/runs/{runId}/result call, which completes with the best solution found at stop time. Sync runs have no persistence — there is no separate delete step.


Encryption at rest

Job mode supports three encryption modes. The mode is determined automatically at submission time.

Encryption mode priority

  1. Client secret is non-empty — CLIENT mode. Client manages the key. KMS is ignored even if enabled.
  2. No client secret and KMS enabled — KMS mode. Server manages the key transparently.
  3. No client secret and no KMS — No encryption. Data is compressed with bzip2 only.

The secret field in persistenceSetting.mongoSettings is the switch.
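The priority rules above reduce to a few lines. An illustrative sketch (the function name is not part of the API):

```python
def resolve_encryption_mode(client_secret: str, kms_enabled: bool) -> str:
    """Mode selection per the documented priority."""
    if client_secret:
        return "CLIENT"  # client-managed passphrase wins; KMS ignored even if enabled
    if kms_enabled:
        return "KMS"     # server-managed envelope encryption
    return "NONE"        # no encryption; bzip2 compression only
```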

CLIENT mode

The client provides and manages the passphrase.

During submission:

  1. Client includes a non-empty secret in persistenceSetting.mongoSettings.secret.
  2. The server validates the secret strength. A weak secret causes immediate HTTP 400 rejection.
  3. A random 16-byte salt and 12-byte IV are generated.
  4. A 256-bit AES key is derived from the secret using PBKDF2-HMAC-SHA256 (310,000 iterations).
  5. The result is serialized, compressed with bzip2, and encrypted with AES-256-GCM.
  6. The IV, salt, algorithm details, and encMode: "CLIENT" are stored in GridFS metadata.
  7. Privacy-sensitive metadata fields (creator, ident, createdTimeStamp, status) are stripped from the GridFS document.
  8. The secret is discarded from server memory. It is never persisted.
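The key-derivation step (step 4) can be reproduced with the parameters listed in the steps above. This sketch covers only the PBKDF2 derivation; the AES-256-GCM encryption step itself needs a third-party library such as cryptography and is omitted:

```python
import hashlib
import os
from typing import Optional, Tuple


def derive_client_key(secret: str,
                      salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive the 256-bit AES key as described: PBKDF2-HMAC-SHA256,
    310,000 iterations, 16-byte random salt."""
    salt = salt or os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt,
                              310_000, dklen=32)
    return key, salt
```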

During retrieval:

Pass the original secret in the X-Encryption-Secret header (for the export endpoint) or in the request body's secret field (for JSON read endpoints). The server re-derives the AES key from the stored salt and decrypts. If the secret is wrong or missing, the server returns an error — there is no administrative bypass.

If you lose the secret: The data cannot be recovered. Store secrets in a secrets manager or password vault.

Algorithm details:

Cipher: AES-256-GCM (AES/GCM/NoPadding)
Key derivation: PBKDF2 with HMAC-SHA256 (PBKDF2WithHmacSHA256)
Iterations: 310,000 (OWASP 2024 recommendation)
Key length: 256 bits
IV: 12 bytes, randomly generated per job
Salt: 16 bytes, randomly generated per job
Auth tag: 128 bits (built into GCM mode)

Secret strength requirements

When a non-empty secret is provided, it must meet minimum requirements for length and character class diversity (uppercase, lowercase, digits, and special characters). Weak secrets are rejected immediately with HTTP 400 before the optimization starts.
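As an illustration, a client-side pre-check along these lines can catch weak secrets before submission. The server's exact thresholds are not documented here, so the minimum length below is an assumption:

```python
import string


def is_strong_secret(secret: str, min_length: int = 12) -> bool:
    """Length plus all four character classes (assumed thresholds)."""
    return (len(secret) >= min_length
            and any(c in string.ascii_uppercase for c in secret)
            and any(c in string.ascii_lowercase for c in secret)
            and any(c in string.digits for c in secret)
            and any(c in string.punctuation for c in secret))
```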


KMS envelope encryption

When the server has a KMS configured and the client does not provide a secret, the server applies envelope encryption automatically. This is transparent to the client.

How it works

The server generates a random 256-bit AES key (the data encryption key, or DEK) per job. The result is encrypted with this DEK. The DEK is then encrypted (wrapped) by a key encryption key (KEK) managed in an external KMS. The wrapped DEK is stored in GridFS metadata. The plaintext DEK is discarded.

On retrieval, the server reads the wrapped DEK, unwraps it via the KMS, and decrypts the result. The client never sees the DEK.

Algorithm details:

Cipher: AES-256-GCM (same as CLIENT mode)
DEK generation: KeyGenerator with a 256-bit key and SecureRandom
DEK wrapping: RSA-OAEP with SHA-256 via the external KMS
IV: 12 bytes, randomly generated per job

Server configuration

KMS mode is configured server-side only. The client makes no changes.

The provider is selected with touroptimizer.security.kms-provider (environment variable DNA_KMS_PROVIDER):

none: KMS disabled (default).
local: In-memory RSA key pairs, for development only. Keys are lost on restart.
azure: Azure Key Vault via DefaultAzureCredential. Required RBAC role: Key Vault Crypto User.

The tradeoff

With KMS mode, the server operator can decrypt the data because the server has access to the KMS. If a client requires that even the server operator cannot read their data, use CLIENT mode.


Encryption mode summary

No encryption: secret is "" (empty). Nobody manages a key. Data is not encrypted at rest (bzip2 compression only). No secret is needed to retrieve. Full metadata is stored in GridFS (no sec block). The server can always decrypt, and the data is always recoverable.

CLIENT mode: secret is a non-empty passphrase. The client manages the key. Data is encrypted at rest (AES-256-GCM). The same passphrase is required to retrieve. GridFS metadata is stripped (sec.encMode: CLIENT). The server can decrypt only with the client secret; if the secret is lost, the data is not recoverable.

KMS mode: secret is "" (empty). The server manages the key via the KMS. Data is encrypted at rest (AES-256-GCM). No secret is needed to retrieve (transparent). GridFS metadata is stripped (sec.encMode: KMS). The server can always decrypt (it has KMS access); data is recoverable while the KEK exists.

When encryption is active, the server strips creator, ident, createdTimeStamp, and status from the GridFS metadata to prevent personal information leakage. These fields remain inside the encrypted payload and are returned in the decrypted response. To check job status before downloading, use GET /api/v1/jobs/{jobId}/status.

Stream data (progress, status, warnings, errors) is always stored as unencrypted plain text regardless of the encryption mode on the result.


What gets stored in MongoDB

Each job produces the following documents, all tagged with jobId and tenantId.

GridFS — result snapshot

The full optimization result (compressed, optionally encrypted) stored in GridFS.

Unencrypted example:

{
  "filename": "696c31c3-f419-4918-8a65-faf1f769d460",
  "metadata": {
    "_contentType": "application/x-bzip2",
    "creator": "PUBLIC_CREATOR",
    "createdTimeStamp": 1774210149597,
    "ident": "JOpt-Run-1774127074120",
    "type": "OptimizationConfig<JSONConfig>",
    "expireAt": "2026-03-24T20:09:09.626Z",
    "status": {
      "statusDescription": "SUCCESS_WITH_SOLUTION",
      "error": "NO_ERROR"
    },
    "jobId": "696c31c3-f419-4918-8a65-faf1f769d460",
    "tenantId": "my-tenant-123",
    "compression": "bzip2",
    "encrypted": false
  }
}

CLIENT mode example (metadata stripped):

{
  "filename": "34a94ca2-7d6b-421a-ae00-f733917bb36b",
  "metadata": {
    "_contentType": "application/octet-stream",
    "jobId": "34a94ca2-7d6b-421a-ae00-f733917bb36b",
    "tenantId": "my-tenant-123",
    "type": "OptimizationConfig<JSONConfig>",
    "expireAt": "2026-03-26T09:55:23.649Z",
    "sec": {
      "encMode": "CLIENT",
      "iv": "FdI2vU0ji8blCcKs",
      "salt": "GKZnvJALbjCm6YBoERZylg==",
      "encAlgo": "AES/GCM/NoPadding",
      "secretKeyFacAlgo": "PBKDF2WithHmacSHA256",
      "iterationCount": 310000,
      "keyLength": 256
    },
    "compression": "bzip2",
    "encrypted": true
  }
}

KMS mode example (metadata stripped):

{
  "filename": "a299d547-7db9-40aa-856e-e338c9d08593",
  "metadata": {
    "_contentType": "application/octet-stream",
    "jobId": "a299d547-7db9-40aa-856e-e338c9d08593",
    "tenantId": "my-tenant-123",
    "type": "OptimizationConfig<JSONConfig>",
    "expireAt": "2026-03-25T23:52:06.831Z",
    "sec": {
      "encMode": "KMS",
      "iv": "lJYEnGQoDJnG1TEE",
      "encAlgo": "AES/GCM/NoPadding",
      "keyLength": 256,
      "wrappedDek": "u1ndsoDeGw3P494Jn3XU405Mr+g9...",
      "kekId": "local-kms://keys/my-tenant-123"
    },
    "compression": "bzip2",
    "encrypted": true
  }
}

The filename in GridFS is always set to the jobId, providing a consistent identifier across all encryption modes.

Stream collections

Four MongoDB collections hold stream documents: progress, status, warning, error. Each document carries jobId and tenantId. Which collections are populated is controlled by the streamPersistenceStrategySetting in the request body. With cycleProgress=true and cycleStatus=true, only the most recent document is kept per job (overwritten in place), preventing unbounded collection growth for long runs. See Stream persistence strategy for all available options.
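The cycling behavior can be illustrated with a toy in-memory model (not the server's actual MongoDB code):

```python
def write_stream_doc(collection: dict, job_id: str, doc: dict, cycle: bool):
    """Toy model: cycle=True keeps only the latest document per job
    (overwritten in place); cycle=False appends every event, so the
    collection grows with the run."""
    if cycle:
        collection[job_id] = [doc]
    else:
        collection.setdefault(job_id, []).append(doc)
```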


Error handling

202 Accepted: job accepted, optimization running asynchronously.
400 Bad Request: invalid input or a weak encryption secret.
401 Unauthorized: license not valid, element limit exceeded, X-Tenant-Id missing, or jobId/tenantId mismatch on read.
404 Not Found: no active job or run found for this identifier.
500 Internal Server Error: optimization failure, database error, or KMS communication failure.
504 Gateway Timeout: optimization exceeded the configured timeout.

When an optimization fails asynchronously (after the 202 was returned), the error is persisted to the database. Discover it by polling GET /api/v1/jobs/{jobId}/status (which shows ERROR) or GET /api/v1/jobs/{jobId}/errors (which contains the error message and details).


For reading our license agreement and for further information about license plans, please visit www.dna-evolutions.com.

A product by DNA Evolutions GmbH ©