TourOptimizer Production Guide
How to use TourOptimizer in production: job lifecycle, sync runs, tenant isolation, encryption at rest, and job control.
Overview
- Choosing the right mode
- Job mode — async long runs
- Sync run mode — short runs
- The jobId and runId
- Tenant isolation
- Persistence settings
- Data expiry and auto-cleanup
- Completion webhooks
- Stopping and deleting jobs
- Encryption at rest
- KMS envelope encryption
- Encryption mode summary
- What gets stored in MongoDB
- Error handling
Choosing the right mode
TourOptimizer offers two integration modes. Choose based on expected run duration and your deployment type.
| | Job mode (/api/v1/jobs) | Sync run mode (/api/v1/runs) |
|---|---|---|
| HTTP model | Fire-and-forget: returns 202 immediately | Returns 202 with runId, then blocks on result endpoint |
| Result delivery | Poll GET /api/v1/jobs/{jobId}/result after completion | GET /api/v1/runs/{runId}/result blocks until done |
| Suitable for | Any run duration, especially long runs (minutes to hours) | Short runs only (up to a few minutes) |
| Persistence | Results stored in MongoDB, retrievable at any time | In-memory only, no persistence |
| Concurrency | Multiple jobs run concurrently, each isolated by jobId | Multiple runs supported, each isolated by runId |
| Live events | Poll progress/status from database, or receive completion webhook | Subscribe to SSE streams while the run is active |
| Requires | Database (DNA_DATABASE_ACTIVE=true) | Sync controllers (DNA_SHOW_SYNCH_CONTROLLERS=true) |
For any production deployment or any optimization that might run longer than a minute, use job mode.
Job mode
Submit
Send a POST to /api/v1/jobs with the standard RestOptimization body and the X-Tenant-Id header. The server generates a jobId, starts the optimization asynchronously, and immediately returns HTTP 202.
POST /api/v1/jobs
Content-Type: application/json
X-Tenant-Id: my-tenant-123
Response (HTTP 202):
{
  "jobId": "648d2724-3a77-47f4-b937-d3ab6abf2341",
  "creatorHash": "11aa65b13c2a6d34f8727e82e403ce869e3bba1d35c45c595e8cc5ce5e74e57a",
  "ident": "JOpt-Run-1774127074120",
  "submittedAt": 1774131229940,
  "status": "ACCEPTED"
}
| Field | Description |
|---|---|
| jobId | UUID v4, generated server-side. The primary handle for all subsequent operations. |
| creatorHash | SHA-256 of the creator name from the request's creatorSetting. |
| ident | User-defined label echoed back from the input. |
| submittedAt | Epoch-millisecond timestamp of submission. |
| status | Always ACCEPTED at submission. The actual status is tracked in the database. |
Save the jobId. You will need it for all subsequent requests.
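The submit step can be sketched in Python with only the standard library. The base URL and helper names are illustrative; only the path, headers, and response field names come from this guide, and actually opening the connection (for example with urllib.request.urlopen) is left to the caller:

```python
import json
import urllib.request

def build_submit_request(base_url: str, tenant_id: str, optimization: dict) -> urllib.request.Request:
    """Build the POST /api/v1/jobs request with the required headers."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/jobs",
        data=json.dumps(optimization).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Tenant-Id": tenant_id},
        method="POST",
    )

def extract_job_id(response_body: str) -> str:
    """Pull the jobId out of the 202 response body; it is the handle for all later calls."""
    return json.loads(response_body)["jobId"]

# Parsing the sample 202 response shown above:
sample = '{"jobId": "648d2724-3a77-47f4-b937-d3ab6abf2341", "status": "ACCEPTED"}'
print(extract_job_id(sample))  # 648d2724-3a77-47f4-b937-d3ab6abf2341
```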
Poll for status
While the optimization is running, poll for status:
GET /api/v1/jobs/{jobId}/status
X-Tenant-Id: my-tenant-123
Wait for a status of SUCCESS_WITH_SOLUTION, SUCCESS_WITHOUT_SOLUTION, or ERROR.
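The wait loop can be sketched around the terminal statuses listed above. The fetch_status callable is a stand-in for your HTTP client; the poll interval and retry cap are illustrative choices, not server requirements:

```python
import time

# Terminal statuses from this guide: stop polling once one of these appears.
TERMINAL = {"SUCCESS_WITH_SOLUTION", "SUCCESS_WITHOUT_SOLUTION", "ERROR"}

def wait_for_job(fetch_status, poll_seconds: float = 5.0, max_polls: int = 1000) -> str:
    """Poll until the job reaches a terminal status.

    fetch_status is any callable returning the current status string,
    e.g. a wrapper around GET /api/v1/jobs/{jobId}/status.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("job did not reach a terminal status in time")

# Simulated status sequence as a stand-in for real HTTP calls:
states = iter(["ACCEPTED", "RUNNING", "RUNNING", "SUCCESS_WITH_SOLUTION"])
print(wait_for_job(lambda: next(states), poll_seconds=0))  # SUCCESS_WITH_SOLUTION
```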
Poll for progress (optional)
GET /api/v1/jobs/{jobId}/progress
X-Tenant-Id: my-tenant-123
Returns a stream of JOptOptimizationProgress objects. Use ?limit=1&sortDirection=DESC to get only the most recent snapshot.
Retrieve the result
Once the status indicates completion:
GET /api/v1/jobs/{jobId}/result
X-Tenant-Id: my-tenant-123
Returns the full RestOptimization object including all routes, node assignments, scheduling details, and violation reports.
To retrieve the solution payload only (smaller response):
GET /api/v1/jobs/{jobId}/solution
X-Tenant-Id: my-tenant-123
Download as ZIP
GET /api/v1/jobs/{jobId}/export
X-Tenant-Id: my-tenant-123
X-Encryption-Secret: YourStr0ng!Secret_Here (only for CLIENT-mode encrypted results)
Returns a ZIP archive of the persisted result. For KMS-encrypted or unencrypted results, omit the X-Encryption-Secret header — decryption is handled transparently by the server.
Sync run mode
Sync runs are suitable for short optimizations where holding an HTTP connection open for a few minutes is acceptable. Each run is isolated by a runId and multiple runs can execute concurrently without interfering with each other.
Start a run
POST /api/v1/runs
Content-Type: application/json
Response (HTTP 202):
{
"runId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"submittedAt": 1775130148792,
"ident": "my-short-run"
}
Wait for the result
GET /api/v1/runs/{runId}/result
This call blocks until the optimization finishes, then returns the full RestOptimization. For the solution only:
GET /api/v1/runs/{runId}/solution
Subscribe to live events
While the run is in progress, subscribe to SSE streams in parallel:
GET /api/v1/runs/{runId}/stream/progress
GET /api/v1/runs/{runId}/stream/status
GET /api/v1/runs/{runId}/stream/warnings
GET /api/v1/runs/{runId}/stream/errors
Each stream is isolated to its runId. Concurrent runs each have their own independent subscriptions.
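A minimal Server-Sent Events parser for consuming these streams might look like the following. It handles only `data:` lines and blank-line event separators; the event payloads shown are invented placeholders, not the server's actual progress schema:

```python
import json

def parse_sse(stream_text: str) -> list:
    """Parse SSE text into a list of decoded JSON payloads.

    Minimal parser: collects `data:` lines, emits an event on each blank
    line, and ignores comments and other SSE fields.
    """
    events, data_lines = [], []
    for line in stream_text.splitlines():
        if line.startswith("data:"):
            data_lines.append(line[5:].strip())
        elif line == "" and data_lines:
            events.append(json.loads("\n".join(data_lines)))
            data_lines = []
    if data_lines:  # final event without a trailing blank line
        events.append(json.loads("\n".join(data_lines)))
    return events

raw = 'data: {"progress": 50}\n\ndata: {"progress": 100}\n\n'
print(parse_sse(raw))  # [{'progress': 50}, {'progress': 100}]
```

In practice a streaming HTTP client would feed chunks into such a parser incrementally rather than parsing the whole body at once.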
One-shot start signal
If you want to begin subscribing to streams only after the optimizer has actually transitioned to the running state:
GET /api/v1/runs/{runId}/started
Returns true once the optimizer starts. Because streams use ReplaySubject internally, subscribing before this signal is also safe — events are buffered and replayed to late subscribers.
The jobId and runId
Both identifiers are UUID v4 strings (122 bits of randomness), generated server-side before any async work begins. They are opaque handles — clients treat them as strings and never parse their contents.
The jobId is registered in the in-memory JobRegistry at submission time, before the 202 response is returned. This ensures that a stop signal sent immediately after the 202 will find the running optimizer.
Similarly, the runId is registered in the RunRegistry before the 202 is returned.
The jobId is not a security credential. Security is enforced by the tenantId from the verified API gateway header. The jobId's randomness is defense-in-depth, not the primary boundary.
Tenant isolation
Every persisted document (GridFS result, progress, status, warnings, errors) is tagged with both jobId and tenantId. All read queries are automatically scoped by both fields — a request with a valid jobId but a mismatched tenantId returns no data.
How the tenantId flows
- The client authenticates with the API gateway (for example Azure API Management) via an API key or OAuth token.
- The gateway resolves the subscription to a tenant and injects the X-Tenant-Id header.
- TourOptimizer reads the header server-side. The client cannot forge it.
- All persistence writes tag the data with this tenantId.
- All persistence reads filter by tenantId.
In local Docker and on-premise single-tenant setups, any fixed string is acceptable as the tenant identifier (for example local or my-tenant). The isolation guarantee only matters in multi-tenant deployments behind an API gateway.
Persistence settings
The persistenceSetting block controls what gets saved, whether encryption is active, and how long data is retained. It sits inside the extension object of the RestOptimization body, at the same level as keySetting.
"extension": {
"timeOut": "PT2H",
"persistenceSetting": {
"mongoSettings": {
"enablePersistence": true,
"secret": "",
"expiry": "PT48H",
"requireUniqueIdentCreatorCombination": false,
"optimizationPersistenceStrategySetting": {
"saveConnections": false
},
"streamPersistenceStrategySetting": {
"saveProgress": true,
"cycleProgress": true,
"saveStatus": true,
"cycleStatus": true,
"saveWarning": true,
"saveError": true
}
}
}
}
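The same block can be assembled programmatically. A sketch — the helper name and parameter defaults are ours; the field names and their default values mirror the tables in this section:

```python
import json

def make_persistence_setting(expiry: str = "PT48H", secret: str = "",
                             save_connections: bool = False) -> dict:
    """Assemble the persistenceSetting block with sensible defaults."""
    return {
        "mongoSettings": {
            "enablePersistence": True,
            "secret": secret,
            "expiry": expiry,
            "requireUniqueIdentCreatorCombination": False,
            "optimizationPersistenceStrategySetting": {
                "saveConnections": save_connections,
            },
            "streamPersistenceStrategySetting": {
                "saveProgress": True, "cycleProgress": True,
                "saveStatus": True, "cycleStatus": True,
                "saveWarning": True, "saveError": True,
            },
        }
    }

# Drop it into the extension object next to timeOut:
extension = {"timeOut": "PT2H", "persistenceSetting": make_persistence_setting()}
print(json.dumps(extension, indent=2)[:60])
```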
Top-level extension fields
| Field | Default | Description |
|---|---|---|
| timeOut | PT8760H (365 days) | Maximum duration the optimization is allowed to run. ISO 8601 duration format. The optimizer stops and persists the best result found if this limit is reached. For the sync run mode (/api/v1/runs), a more conservative value like PT5M is appropriate. |
MongoDB settings (mongoSettings)
| Field | Default | Description |
|---|---|---|
| enablePersistence | true | Must be true for any data to be saved. If false, the optimizer runs but nothing is written to the database. |
| secret | "" | Passphrase for CLIENT-mode encryption. Leave empty for KMS or unencrypted storage. See Encryption at rest. |
| expiry | PT48H | How long all data for this job is retained before automatic deletion. ISO 8601 duration. PT48H means 48 hours. See Data expiry and auto-cleanup. |
| requireUniqueIdentCreatorCombination | false | When true, the server rejects a job submission if an existing result with the same ident and creator combination is already in the database. Useful for preventing duplicate runs in automated pipelines. |
Optimization persistence strategy (optimizationPersistenceStrategySetting)
Controls what is included in the stored result snapshot. The full optimization config is always persisted.
| Field | Default | Description |
|---|---|---|
| saveConnections | false | When true, connection matrices are included in the stored result. Connection data can be large. Set to false to keep the stored document small unless you specifically need to reconstruct or inspect the connection graph after the fact. |
Stream persistence strategy (streamPersistenceStrategySetting)
Controls which event stream collections are populated during the optimization run.
| Field | Default | Description |
|---|---|---|
| saveProgress | true | Save progress update events. |
| cycleProgress | true | When true, only the most recent progress document is kept per job, overwritten in place on each update. When false, every progress event creates a new document. Use true for long runs to prevent unbounded collection growth. |
| saveStatus | true | Save lifecycle status events (STARTED, RUNNING, SUCCESS, ERROR). |
| cycleStatus | true | Same cycling behavior as cycleProgress, applied to status documents. |
| saveWarning | true | Save warning events. |
| saveError | true | Save error events. |
Data expiry and auto-cleanup
Every persisted document (GridFS result, progress, status, warnings, errors) carries an expireAt timestamp, calculated at write time as submission time + expiry duration. MongoDB's TTL index mechanism automatically deletes documents once this timestamp passes. The server also runs its own cleanup scan on a configurable schedule (touroptimizer.security.database-clean-rate-seconds, default 7200 seconds).
The expiry field in mongoSettings controls the TTL for all documents belonging to that job. Setting it to a negative value or zero disables automatic expiry. If no expiry is provided, the default is PT48H (48 hours).
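The server-side computation described above (expireAt = submission time + expiry duration) can be reproduced client-side to predict when data disappears. This sketch parses only the PT…H/M/S subset of ISO 8601 durations used in this guide:

```python
import re
from datetime import datetime, timedelta, timezone

def parse_simple_duration(value: str) -> timedelta:
    """Parse the subset of ISO 8601 durations used here (e.g. PT48H, PT5M)."""
    m = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", value)
    if not m:
        raise ValueError(f"unsupported duration: {value}")
    h, mi, s = (int(g) if g else 0 for g in m.groups())
    return timedelta(hours=h, minutes=mi, seconds=s)

def expire_at(submitted_at_ms: int, expiry: str) -> datetime:
    """expireAt = submission time + expiry duration, as computed at write time."""
    submitted = datetime.fromtimestamp(submitted_at_ms / 1000, tz=timezone.utc)
    return submitted + parse_simple_duration(expiry)

# submittedAt from the submit response, plus the default PT48H:
print(expire_at(1774131229940, "PT48H").isoformat())
```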
SaaS deployments enforce a maximum retention period. On the DNA Evolutions SaaS platform, optimizations are retained for a maximum of 24 hours regardless of the value requested in expiry. If your integration needs results available for longer, retrieve and store them on your side within that window.
On-premise deployments can set any expiry duration. For long-term archiving, set a large expiry value or disable it entirely, and rely on your own backup strategy.
The recommended workflow for any client is:
- Submit the job and save the jobId.
- Wait for completion: either poll GET /api/v1/jobs/{jobId}/status at a reasonable interval, or configure completionWebhookUrl to receive a push notification.
- Retrieve the result as soon as the job completes.
- Store the result on your side if you need it beyond the expiry window.
- Optionally call DELETE /api/v1/jobs/{jobId} once you have retrieved the result, to release database space before the TTL fires.
Completion webhooks
Instead of polling GET /api/v1/jobs/{jobId}/status repeatedly, you can configure a webhook URL and let the server notify you when a job reaches a terminal state.
Configuration
Add two fields inside mongoSettings:
"mongoSettings": {
"enablePersistence": true,
"completionWebhookUrl": "https://your-service.example.com/jopt/callback",
"completionWebhookSecret": "YourStr0ng!Secret_Here",
...
}
| Field | Default | Description |
|---|---|---|
| completionWebhookUrl | "" | URL the server POSTs to when the job finishes or fails. Leave empty to disable. |
| completionWebhookSecret | "" | HMAC-SHA256 signing secret. When set, the server includes an X-JOpt-Signature: sha256=<hex> header so the receiver can verify authenticity. Leave empty for unsigned delivery. |
Payload
The server sends a minimal JSON payload — no optimization data, no routes:
{
  "jobId": "648d2724-3a77-47f4-b937-d3ab6abf2341",
  "tenantId": "my-tenant-123",
  "status": "SUCCESS_WITH_SOLUTION",
  "completedAt": 1775130148792
}
status is one of SUCCESS_WITH_SOLUTION, SUCCESS_WITHOUT_SOLUTION, or ERROR. On error an additional errorMessage field is present. Use the jobId to fetch the full result from GET /api/v1/jobs/{jobId}/result.
Delivery semantics
Delivery is attempted once; there are no retries. The webhook fires after the result has been persisted and the job has been deregistered from the active-jobs registry, so a receiver that calls GET /api/v1/jobs/{jobId}/result immediately upon receiving the webhook will find the result already available. The server enforces a 10-second timeout on the receiver's response. A failed delivery is logged at WARN level and never affects the job result.
Signature verification
When completionWebhookSecret is set, verify the signature on every delivery before trusting the payload:
// Java (requires Java 17+ for HexFormat)
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public boolean verify(byte[] rawBody, String signatureHeader, String secret) throws Exception {
    Mac mac = Mac.getInstance("HmacSHA256");
    mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
    String expected = "sha256=" + HexFormat.of().formatHex(mac.doFinal(rawBody));
    // Always use constant-time comparison to prevent timing attacks
    return MessageDigest.isEqual(
        expected.getBytes(StandardCharsets.UTF_8),
        signatureHeader.getBytes(StandardCharsets.UTF_8));
}
# Python
import hmac, hashlib

def verify(raw_body: bytes, signature_header: str, secret: str) -> bool:
    expected = "sha256=" + hmac.new(
        secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)
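For testing a receiver locally, a matching signature can be generated the same way the server computes it: HMAC-SHA256 over the raw body, hex-encoded, prefixed with sha256=. The body and secret below are illustrative:

```python
import hmac, hashlib

def sign(raw_body: bytes, secret: str) -> str:
    """Compute the X-JOpt-Signature header value for a payload."""
    return "sha256=" + hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()

body = b'{"jobId": "abc", "status": "SUCCESS_WITH_SOLUTION"}'
header = sign(body, "YourStr0ng!Secret_Here")

# A tampered body no longer matches the original signature:
assert hmac.compare_digest(sign(body, "YourStr0ng!Secret_Here"), header)
assert not hmac.compare_digest(sign(b"tampered", "YourStr0ng!Secret_Here"), header)
```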
URL validation policy
The server validates the webhook URL at job submission time — not when the webhook fires. This prevents delayed SSRF attacks. The active policy is controlled by touroptimizer.security.webhook-validation (environment variable DNA_WEBHOOK_VALIDATION):
| Policy | Environment value | Allowed URLs |
|---|---|---|
STRICT | strict (default) | Public HTTPS only. Private ranges, link-local, and cloud metadata addresses rejected. |
RELAXED | relaxed | HTTP and HTTPS. Private network addresses (10.x, 172.16-31.x, 192.168.x) allowed. Loopback still rejected. |
NONE | none | Any URL accepted. A prominent startup warning is logged. Local development only. |
Loopback addresses (127.x, ::1) are always rejected in strict and relaxed. Within a Docker network, use the container name or host.docker.internal instead.
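The STRICT rules can be approximated client-side to catch rejections before submission. This is an illustrative pre-check only — the server's validator also resolves hostnames and checks every resolved address, which is omitted here:

```python
import ipaddress
from urllib.parse import urlparse

METADATA_IP = ipaddress.ip_address("169.254.169.254")  # cloud metadata endpoint

def strict_allows(url: str) -> bool:
    """Sketch of the STRICT policy for URLs whose host is a literal IP.

    Hostnames pass this check unexamined; real validation resolves them
    and applies the same address rules to every resolved IP.
    """
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False  # STRICT: public HTTPS only
    try:
        ip = ipaddress.ip_address(parsed.hostname or "")
    except ValueError:
        return True  # hostname, not a literal IP: resolution-based checks omitted
    return not (ip.is_private or ip.is_loopback or ip.is_link_local or ip == METADATA_IP)

print(strict_allows("https://your-service.example.com/jopt/callback"))  # True
print(strict_allows("http://example.com/cb"))                           # False (not HTTPS)
print(strict_allows("https://169.254.169.254/latest/meta-data"))        # False (link-local/metadata)
print(strict_allows("https://10.0.0.5/cb"))                             # False (private range)
```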
Stopping and deleting jobs
Stop and delete are intentionally separate operations for jobs. The optimizer may take several seconds to finish gracefully after receiving a stop signal, and the final result is persisted during that window. Deleting before termination would race against that write.
Stop a running job
POST /api/v1/jobs/{jobId}/stop
X-Tenant-Id: my-tenant-123
This sends a graceful stop signal to the running optimizer and returns immediately. The optimizer finishes its current iteration and persists the best result found so far. The actual termination may take several seconds. Poll GET /api/v1/jobs/{jobId}/status to confirm the job has stopped before deleting.
Returns 404 if no active job is found for that jobId (the job has already completed or never started).
Delete persisted data
DELETE /api/v1/jobs/{jobId}
X-Tenant-Id: my-tenant-123
Permanently removes the GridFS result document and all stream documents (progress, status, warnings, errors) for the given jobId. This operation is idempotent — calling it on an already-deleted job returns 200.
This endpoint does not send a stop signal. Only call it after the job has terminated.
Stop a sync run
DELETE /api/v1/runs/{runId}
Sends a graceful stop signal to the running sync optimizer. The result is then returned by the blocking GET /api/v1/runs/{runId}/result call, which completes with the best solution found at stop time. Sync runs have no persistence — there is no separate delete step.
Encryption at rest
Job mode supports three encryption modes. The mode is determined automatically at submission time.
Encryption mode priority
- Client secret is non-empty — CLIENT mode. Client manages the key. KMS is ignored even if enabled.
- No client secret and KMS enabled — KMS mode. Server manages the key transparently.
- No client secret and no KMS — No encryption. Data is compressed with bzip2 only.
The secret field in persistenceSetting.mongoSettings is the switch.
CLIENT mode
The client provides and manages the passphrase.
During submission:
- Client includes a non-empty secret in persistenceSetting.mongoSettings.secret.
- The server validates the secret strength. A weak secret causes immediate HTTP 400 rejection.
- A random 16-byte salt and 12-byte IV are generated.
- A 256-bit AES key is derived from the secret using PBKDF2-HMAC-SHA256 (310,000 iterations).
- The result is serialized, compressed with bzip2, and encrypted with AES-256-GCM.
- The IV, salt, algorithm details, and encMode: "CLIENT" are stored in GridFS metadata.
- Privacy-sensitive metadata fields (creator, ident, createdTimeStamp, status) are stripped from the GridFS document.
- The secret is discarded from server memory. It is never persisted.
During retrieval:
Pass the original secret in the X-Encryption-Secret header (for the export endpoint) or in the request body's secret field (for JSON read endpoints). The server re-derives the AES key from the stored salt and decrypts. If the secret is wrong or missing, the server returns an error — there is no administrative bypass.
If you lose the secret: The data cannot be recovered. Store secrets in a secrets manager or password vault.
Algorithm details:
| Component | Value |
|---|---|
| Cipher | AES-256-GCM (AES/GCM/NoPadding) |
| Key derivation | PBKDF2 with HMAC-SHA256 (PBKDF2WithHmacSHA256) |
| Iterations | 310,000 (OWASP 2024 recommendation) |
| Key length | 256 bits |
| IV | 12 bytes, randomly generated per job |
| Salt | 16 bytes, randomly generated per job |
| Auth tag | 128 bits (built into GCM mode) |
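The key-derivation step can be reproduced with Python's standard library using exactly the parameters from the table. The AES-GCM encryption itself requires a third-party crypto library and is not shown; the secret and salt below are illustrative:

```python
import hashlib, os

def derive_key(secret: str, salt: bytes) -> bytes:
    """Derive the 256-bit AES key per the table: PBKDF2-HMAC-SHA256,
    310,000 iterations, 32-byte (256-bit) output."""
    return hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, 310_000, dklen=32)

salt = os.urandom(16)  # the server generates a fresh 16-byte salt per job
key = derive_key("YourStr0ng!Secret_Here", salt)
assert len(key) == 32

# The same secret and salt always yield the same key; this is how retrieval
# re-derives the key from the salt stored in GridFS metadata:
assert derive_key("YourStr0ng!Secret_Here", salt) == key
```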
Secret strength requirements
When a non-empty secret is provided, it must meet minimum requirements for length and character class diversity (uppercase, lowercase, digits, and special characters). Weak secrets are rejected immediately with HTTP 400 before the optimization starts.
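A client-side pre-check can catch weak secrets before submission instead of waiting for the HTTP 400. The server's exact thresholds are not documented here, so the minimum length below (12) is an assumption; the four character classes match the requirement stated above:

```python
import string

def looks_strong(secret: str, min_length: int = 12) -> bool:
    """Illustrative pre-check mirroring the stated requirements: minimum
    length (threshold assumed) plus all four character classes."""
    classes = [
        any(c.isupper() for c in secret),
        any(c.islower() for c in secret),
        any(c.isdigit() for c in secret),
        any(c in string.punctuation for c in secret),
    ]
    return len(secret) >= min_length and all(classes)

print(looks_strong("YourStr0ng!Secret_Here"))  # True
print(looks_strong("password"))                # False
```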
KMS envelope encryption
When the server has a KMS configured and the client does not provide a secret, the server applies envelope encryption automatically. This is transparent to the client.
How it works
The server generates a random 256-bit AES key (the data encryption key, or DEK) per job. The result is encrypted with this DEK. The DEK is then encrypted (wrapped) by a key encryption key (KEK) managed in an external KMS. The wrapped DEK is stored in GridFS metadata. The plaintext DEK is discarded.
On retrieval, the server reads the wrapped DEK, unwraps it via the KMS, and decrypts the result. The client never sees the DEK.
Algorithm details:
| Component | Value |
|---|---|
| Cipher | AES-256-GCM (same as CLIENT mode) |
| DEK generation | KeyGenerator with 256-bit key and SecureRandom |
| DEK wrapping | RSA-OAEP with SHA-256 via external KMS |
| IV | 12 bytes, randomly generated per job |
Server configuration
KMS mode is configured server-side only. The client makes no changes.
| Property | Environment variable | Value | Description |
|---|---|---|---|
| touroptimizer.security.kms-provider | DNA_KMS_PROVIDER | none | KMS disabled (default) |
| | | local | In-memory RSA key pairs, for development only. Keys are lost on restart. |
| | | azure | Azure Key Vault via DefaultAzureCredential. Required RBAC role: Key Vault Crypto User. |
The tradeoff
With KMS mode, the server operator can decrypt the data because the server has access to the KMS. If a client requires that even the server operator cannot read their data, use CLIENT mode.
Encryption mode summary
| | No encryption | CLIENT mode | KMS mode |
|---|---|---|---|
| secret in request | "" (empty) | Non-empty passphrase | "" (empty) |
| Who manages the key | Nobody | Client | Server via KMS |
| Encrypted at rest | No (bzip2 only) | Yes (AES-256-GCM) | Yes (AES-256-GCM) |
| Secret needed to retrieve | No | Yes (same passphrase) | No (transparent) |
| Metadata in GridFS | Full | Stripped | Stripped |
| sec.encMode | (no sec block) | CLIENT | KMS |
| Server can decrypt | Always | Only with client secret | Always (has KMS access) |
| Data recoverable if key lost | Always | No | Yes (while KEK exists) |
When encryption is active, the server strips creator, ident, createdTimeStamp, and status from the GridFS metadata to prevent personal information leakage. These fields remain inside the encrypted payload and are returned in the decrypted response. To check job status before downloading, use GET /api/v1/jobs/{jobId}/status.
Stream data (progress, status, warnings, errors) is always stored as unencrypted plain text regardless of the encryption mode on the result.
What gets stored in MongoDB
Each job produces the following documents, all tagged with jobId and tenantId.
GridFS — result snapshot
The full optimization result (compressed, optionally encrypted) stored in GridFS.
Unencrypted example:
{
  "filename": "696c31c3-f419-4918-8a65-faf1f769d460",
  "metadata": {
    "_contentType": "application/x-bzip2",
    "creator": "PUBLIC_CREATOR",
    "createdTimeStamp": 1774210149597,
    "ident": "JOpt-Run-1774127074120",
    "type": "OptimizationConfig<JSONConfig>",
    "expireAt": "2026-03-24T20:09:09.626Z",
    "status": {
      "statusDescription": "SUCCESS_WITH_SOLUTION",
      "error": "NO_ERROR"
    },
    "jobId": "696c31c3-f419-4918-8a65-faf1f769d460",
    "tenantId": "my-tenant-123",
    "compression": "bzip2",
    "encrypted": false
  }
}
CLIENT mode example (metadata stripped):
{
  "filename": "34a94ca2-7d6b-421a-ae00-f733917bb36b",
  "metadata": {
    "_contentType": "application/octet-stream",
    "jobId": "34a94ca2-7d6b-421a-ae00-f733917bb36b",
    "tenantId": "my-tenant-123",
    "type": "OptimizationConfig<JSONConfig>",
    "expireAt": "2026-03-26T09:55:23.649Z",
    "sec": {
      "encMode": "CLIENT",
      "iv": "FdI2vU0ji8blCcKs",
      "salt": "GKZnvJALbjCm6YBoERZylg==",
      "encAlgo": "AES/GCM/NoPadding",
      "secretKeyFacAlgo": "PBKDF2WithHmacSHA256",
      "iterationCount": 310000,
      "keyLength": 256
    },
    "compression": "bzip2",
    "encrypted": true
  }
}
KMS mode example (metadata stripped):
{
  "filename": "a299d547-7db9-40aa-856e-e338c9d08593",
  "metadata": {
    "_contentType": "application/octet-stream",
    "jobId": "a299d547-7db9-40aa-856e-e338c9d08593",
    "tenantId": "my-tenant-123",
    "type": "OptimizationConfig<JSONConfig>",
    "expireAt": "2026-03-25T23:52:06.831Z",
    "sec": {
      "encMode": "KMS",
      "iv": "lJYEnGQoDJnG1TEE",
      "encAlgo": "AES/GCM/NoPadding",
      "keyLength": 256,
      "wrappedDek": "u1ndsoDeGw3P494Jn3XU405Mr+g9...",
      "kekId": "local-kms://keys/my-tenant-123"
    },
    "compression": "bzip2",
    "encrypted": true
  }
}
The filename in GridFS is always set to the jobId, providing a consistent identifier across all encryption modes.
Stream collections
Four MongoDB collections hold stream documents: progress, status, warning, error. Each document carries jobId and tenantId. Which collections are populated is controlled by the streamPersistenceStrategySetting in the request body. With cycleProgress=true and cycleStatus=true, only the most recent document is kept per job (overwritten in place), preventing unbounded collection growth for long runs. See Stream persistence strategy for all available options.
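The cycling behavior can be illustrated with a plain dict standing in for a MongoDB collection. This is a conceptual sketch of the overwrite-vs-append semantics, not the server's storage code:

```python
def write_stream_doc(collection: dict, job_id: str, doc: dict, cycle: bool) -> None:
    """Illustrate cycleProgress/cycleStatus semantics: a dict keyed by
    jobId stands in for a MongoDB collection."""
    if cycle:
        collection[job_id] = [doc]                     # overwrite in place: one doc per job
    else:
        collection.setdefault(job_id, []).append(doc)  # append: one doc per event

cycled, appended = {}, {}
for pct in (10, 50, 100):
    write_stream_doc(cycled, "job-1", {"progress": pct}, cycle=True)
    write_stream_doc(appended, "job-1", {"progress": pct}, cycle=False)

print(len(cycled["job-1"]))    # 1
print(len(appended["job-1"]))  # 3
print(cycled["job-1"][0])      # {'progress': 100}
```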
Error handling
| HTTP Status | Meaning |
|---|---|
202 Accepted | Job accepted, optimization running asynchronously |
400 Bad Request | Invalid input or weak encryption secret |
401 Unauthorized | License not valid, element limit exceeded, X-Tenant-Id missing, or jobId/tenantId mismatch on read |
404 Not Found | No active job or run found for this identifier |
500 Internal Server Error | Optimization failure, database error, or KMS communication failure |
504 Gateway Timeout | Optimization exceeded the configured timeout |
When an optimization fails asynchronously (after the 202 was returned), the error is persisted to the database. Discover it by polling GET /api/v1/jobs/{jobId}/status (which shows ERROR) or GET /api/v1/jobs/{jobId}/errors (which contains the error message and details).
For reading our license agreement and for further information about license plans, please visit www.dna-evolutions.com.
A product by DNA Evolutions GmbH ©