Operations

Frank is built to be operated as a set of clear runtime surfaces: API, UI, source worker, transform worker, Temporal workflows, Dagster assets, Iceberg tables, logs, traces, and run records.

Services

| Service | Purpose |
| --- | --- |
| api | FastAPI application, route registration, pattern sync, schema library access, AI endpoints, and admin APIs. |
| ui | SvelteKit application for source, transform, pipeline, model, ontology, and settings workflows. |
| source-worker | Temporal worker for source discovery and extraction. |
| transform-worker | Temporal worker for transform lifecycle and reconciliation work. |
| worker | General Temporal worker for AI and platform workflows. |
| Dagster | Asset materialization, schedules, sensors, and pipeline execution visibility. |
| Temporal | Durable workflow execution for async source, transform, and orchestration jobs. |
| Iceberg REST + MinIO/S3 | Lakehouse catalog and object storage. |
| Trino | Query engine for transform execution and previews. |
| Loki | Persistent log querying for run details. |
| OpenTelemetry collector | Trace export for API and worker paths. |

Local stack

bash
cd ../common-infra
docker-compose up -d

cd ../frank-low-code-pipeline
make up
make status

Common access points:

| Surface | URL |
| --- | --- |
| API docs | http://localhost:8002/docs |
| UI | http://localhost:5175 |
| Health | http://localhost:8002/health |
| Dagster | http://localhost:3000 or configured Dagster URL |

Startup work

API startup performs the platform initialization that should happen once per deploy:

  • Runs through FastAPI lifespan setup.
  • Initializes Iceberg client.
  • Ensures raw namespace.
  • Initializes AI transformer services.
  • Initializes FIWARE SDM registry and schema libraries.
  • Syncs source patterns from JSON files into the database.
  • Syncs SQL transforms and transform patterns from filesystem config.
  • Registers transform event listeners for Dagster code location reloads.
  • Initializes OpenTelemetry and Langfuse instrumentation.
  • Registers the complete API router set.
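
As a rough illustration of the once-per-deploy ordering above, the sketch below runs a list of startup steps inside an async context manager, which is the shape FastAPI's lifespan hook expects. The step functions and the placeholder app object are hypothetical stand-ins so the block runs without FastAPI installed; they are not Frank's actual initialization code.

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace


# Hypothetical stand-ins for three of the startup steps listed above.
async def init_iceberg_client(app) -> None:
    app.started.append("iceberg_client")


async def ensure_raw_namespace(app) -> None:
    app.started.append("raw_namespace")


async def sync_source_patterns(app) -> None:
    app.started.append("source_patterns")


STARTUP_STEPS = [init_iceberg_client, ensure_raw_namespace, sync_source_patterns]


@asynccontextmanager
async def lifespan(app):
    # Run every step once, in order, before the app starts serving requests.
    for step in STARTUP_STEPS:
        await step(app)
    yield
    # Shutdown work (closing clients, flushing traces) would go here.


async def run_once() -> list:
    # Placeholder app object; a real FastAPI app would be passed in by the framework.
    app = SimpleNamespace(started=[])
    async with lifespan(app):
        return app.started


startup_order = asyncio.run(run_once())
print(startup_order)
```

With FastAPI itself, a function of this shape is passed as FastAPI(lifespan=lifespan), so the steps run once when the process starts rather than on every request.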

Run lifecycle

Frank stores lightweight run summaries in Postgres and sends detailed work to the relevant runtime.

| Work | Runtime | User-facing records |
| --- | --- | --- |
| Source discovery | Temporal source worker | Discovery workflow status. |
| Source sync | Temporal source worker + Iceberg | Sync run history, logs, source status. |
| Transform materialization | Dagster + API callback | TransformRun records, Dagster run ID, logs, lineage. |
| Pipeline sandbox | API + worker orchestration | Sandbox workflow status and step results. |
| Ontology sync | Temporal / Dagster sensor path | OntologySyncRun records and backing dataset history. |
| AI assistance | Martha workflow execution | AI trace/execution IDs and structured response payloads. |

Logs

Useful CLI commands:

bash
frankctl sources logs <source-id> <run-id> -f
frankctl transforms logs <transform-id> <run-id> -f
frankctl runs get <workflow-id>
frankctl runs wait <workflow-id>

Useful Compose commands:

bash
make logs
make logs-api
make logs-ui
docker-compose logs -f source-worker
docker-compose logs -f transform-worker

The API and workers use structured JSON logging so Loki queries can filter by fields such as workflow ID, Dagster run ID, transform ID, source ID, and trace ID.
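
To make the idea of filterable fields concrete, here is a minimal sketch of a JSON log formatter built on Python's standard logging module. The field names (workflow_id, dagster_run_id, transform_id, source_id, trace_id) are assumptions modelled on the list above; Frank's actual log schema may differ.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    # Assumed context field names; adjust to the real log schema.
    CONTEXT_FIELDS = (
        "workflow_id",
        "dagster_run_id",
        "transform_id",
        "source_id",
        "trace_id",
    )

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy only the context fields that were attached to this record.
        for field in self.CONTEXT_FIELDS:
            value = getattr(record, field, None)
            if value is not None:
                payload[field] = value
        return json.dumps(payload)


def format_example() -> str:
    # Build a record by hand so the example does not depend on handler setup.
    record = logging.LogRecord(
        name="transform-worker",
        level=logging.INFO,
        pathname="example.py",
        lineno=0,
        msg="materialization started",
        args=None,
        exc_info=None,
    )
    record.transform_id = "t-123"  # hypothetical ID
    return JsonFormatter().format(record)


line = format_example()
print(line)
```

With lines in this shape, a LogQL query along the lines of {service="transform-worker"} | json | transform_id="t-123" can filter on the parsed field. The service label name is an assumption about how the stack labels its streams.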

Traces

OpenTelemetry is initialized in the API and workers. Dagster-triggered transform paths, source worker paths, and AI paths include trace context where available.

Relevant environment variable:

bash
OTEL_EXPORTER_OTLP_ENDPOINT=alloy:4317

Source operations

Operational checklist:

  1. Source status is ready or active.
  2. Discovery schema is current.
  3. Streams are enabled and configured.
  4. Incremental streams have cursor fields.
  5. Merge streams have primary keys.
  6. Target config matches the desired Bronze namespace/table convention.
  7. Sync history shows successful runs.
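
Items 4 and 5 of this checklist lend themselves to an automated check. The sketch below assumes each stream is a dictionary with name, enabled, sync_mode, write_mode, cursor_field, and primary_key keys; those key names and the "incremental" and "merge" values are illustrative guesses, not the documented stream schema.

```python
def stream_problems(streams: list) -> list:
    """Return human-readable problems for enabled streams.

    All dictionary keys used here are assumed for illustration.
    """
    problems = []
    for stream in streams:
        if not stream.get("enabled"):
            continue  # disabled streams are not synced, so skip them
        if stream.get("sync_mode") == "incremental" and not stream.get("cursor_field"):
            problems.append(f"{stream['name']}: incremental stream has no cursor field")
        if stream.get("write_mode") == "merge" and not stream.get("primary_key"):
            problems.append(f"{stream['name']}: merge stream has no primary key")
    return problems


example_streams = [
    {"name": "orders", "enabled": True, "sync_mode": "incremental", "write_mode": "append"},
    {"name": "users", "enabled": True, "sync_mode": "full", "write_mode": "merge", "primary_key": ["id"]},
    {"name": "legacy", "enabled": False, "sync_mode": "incremental"},
]
print(stream_problems(example_streams))
```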

CLI:

bash
frankctl sources get <source-id>
frankctl sources streams list <source-id>
frankctl sources history <source-id>
frankctl sources sync <source-id>

Transform operations

Operational checklist:

  1. Transform is hydrated.
  2. can_run_now is true in the API/UI.
  3. Current artifact runtime matches the expected execution engine.
  4. Dagster code location has loaded the asset.
  5. The last run is not still in progress.
  6. Logs are available for the run.
  7. Output table and lineage edges match expectations.
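
The first few checks can be expressed as a small gate that reports why a trigger should be held back. The can_run_now field comes from the checklist above; the hydrated and last_run_status keys, and the "running" value, are hypothetical names chosen for this sketch.

```python
from typing import Optional


def blocking_reason(transform: dict) -> Optional[str]:
    """Return the first reason not to trigger the transform, or None if clear."""
    if not transform.get("hydrated"):  # assumed field name
        return "transform is not hydrated"
    if not transform.get("can_run_now"):
        return "can_run_now is false"
    if transform.get("last_run_status") == "running":  # assumed field and value
        return "a run is already in progress"
    return None


print(blocking_reason({"hydrated": True, "can_run_now": True, "last_run_status": "running"}))
```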

CLI:

bash
frankctl transforms get <transform-id>
frankctl transforms trigger <transform-id>
frankctl transforms runs <transform-id>
frankctl transforms logs <transform-id> <run-id>

Pipeline operations

Pipeline deployment path:

  1. Draft or update pipeline.
  2. Validate DAG.
  3. Run sandbox.
  4. Review step results.
  5. Activate.
  6. Monitor runs.
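
Step 2, validating the DAG, usually comes down to confirming that the step dependencies contain no cycle. The sketch below shows one way to do that with Python's standard graphlib module; it is an illustration of the technique, not Frank's validator, and the step names are made up.

```python
from graphlib import CycleError, TopologicalSorter


def validate_dag(dependencies: dict) -> list:
    """Return the steps in a runnable order, or raise ValueError on a cycle.

    `dependencies` maps each step name to the set of steps it depends on.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # The second element of args holds the detected cycle.
        raise ValueError(f"pipeline has a cycle: {exc.args[1]}") from exc


# Hypothetical three-step pipeline: bronze -> silver -> gold.
order = validate_dag({"bronze": set(), "silver": {"bronze"}, "gold": {"silver"}})
print(order)
```

A pipeline whose steps depend on each other in a loop, such as {"a": {"b"}, "b": {"a"}}, raises ValueError instead of returning an order.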

CLI:

bash
frankctl pipelines get <pipeline-id> --include-version
frankctl pipelines validate <pipeline-id> --timeout 600

Ontology operations

Before syncing a backing dataset:

  1. Entity type exists and is the intended version.
  2. Iceberg table exists and has expected columns.
  3. Property mappings include the primary key column.
  4. Relationship mappings include target type and target key.
  5. Health check passes.
  6. Sync history is reviewed after trigger.

API:

http
GET  /api/v1/backing-datasets/{id}/health
POST /api/v1/backing-datasets/{id}/sync
GET  /api/v1/backing-datasets/{id}/sync-history
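
A minimal sketch of calling these endpoints in the order the checklist suggests, health first and sync only if it passes. The base URL is taken from the local stack access points. The shape of the health response (a JSON object with a boolean healthy field) is an assumption, and authentication is omitted entirely, so a real deployment would need to add its auth headers.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8002"  # local API address from the access points table


def http_json(method: str, path: str) -> dict:
    """Send a request to the API and decode the JSON response body."""
    request = urllib.request.Request(BASE_URL + path, method=method)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def sync_if_healthy(dataset_id: str, call=http_json) -> bool:
    """Trigger a sync only when the health check passes.

    Assumes the health endpoint returns {"healthy": true} on success.
    """
    health = call("GET", f"/api/v1/backing-datasets/{dataset_id}/health")
    if not health.get("healthy"):
        return False
    call("POST", f"/api/v1/backing-datasets/{dataset_id}/sync")
    return True
```

The call parameter exists so the HTTP layer can be swapped for a stub in tests or for a client that carries credentials.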

Key environment variables

| Area | Variables |
| --- | --- |
| API and auth | KEYCLOAK_URL, KEYCLOAK_REALM, KEYCLOAK_CLIENT_ID, KEYCLOAK_ISSUER, CORS_ALLOWED_ORIGINS |
| Database | POSTGRES_HOST, POSTGRES_PORT, POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD |
| Iceberg/S3 | ICEBERG_CATALOG_URI, ICEBERG_CATALOG, AWS_ENDPOINT_URL, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION |
| Temporal | TEMPORAL_HOST, TEMPORAL_PORT, TEMPORAL_NAMESPACE, task queue variables |
| Dagster | DAGSTER_URL |
| Logs/traces | LOKI_URL, LOKI_AUTH_TOKEN, OTEL_EXPORTER_OTLP_ENDPOINT |
| AI | MARTHA_API_URL, MARTHA_KEYCLOAK_URL, MARTHA_CLIENT_ID, MARTHA_CLIENT_SECRET |
| Ontology | ONTOLOGY_ENABLED, ONTOLOGY_SERVICE_URL, ONTOLOGY_API_KEY, ONTOLOGY_TENANT_ID |
| Pattern registry | PATTERN_WEBHOOK_SECRET, PATTERN_ADMIN_SECRET |
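
A small pre-deploy check can report which of these variables are unset, grouped by area. The sketch below covers three areas only, and treating every listed variable as required is an assumption; some may be optional or have defaults in practice.

```python
import os

# Subset of the table above; which variables are truly required is assumed.
REQUIRED_BY_AREA = {
    "Database": ["POSTGRES_HOST", "POSTGRES_PORT", "POSTGRES_DB", "POSTGRES_USER", "POSTGRES_PASSWORD"],
    "Temporal": ["TEMPORAL_HOST", "TEMPORAL_PORT", "TEMPORAL_NAMESPACE"],
    "Dagster": ["DAGSTER_URL"],
}


def missing_vars(env=None) -> dict:
    """Return {area: [unset or empty variable names]} for areas with gaps."""
    env = os.environ if env is None else env
    report = {}
    for area, names in REQUIRED_BY_AREA.items():
        missing = [name for name in names if not env.get(name)]
        if missing:
            report[area] = missing
    return report


if __name__ == "__main__":
    for area, names in missing_vars().items():
        print(f"{area}: missing {', '.join(names)}")
```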

Maintenance commands

bash
make up
make down
make status
make logs
make build
make build-no-cache
make init-iceberg
make init-db
make init-sdm
make test-iceberg

For API route-level checks:

bash
curl http://localhost:8002/health
curl http://localhost:8002/api/v1/status
curl http://localhost:8002/api/v1/services/health
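
The same three checks can be scripted, for example to gate a deploy step. This sketch treats an HTTP 200 as healthy and anything else, including a refused connection, as unhealthy; the response bodies are not inspected because their format is not described here.

```python
import urllib.request

ENDPOINTS = ["/health", "/api/v1/status", "/api/v1/services/health"]


def check_endpoints(base_url: str, paths=ENDPOINTS, opener=urllib.request.urlopen) -> dict:
    """Return {path: True/False} depending on whether each path answers 200."""
    results = {}
    for path in paths:
        try:
            with opener(base_url + path, timeout=5) as response:
                results[path] = response.status == 200
        except OSError:
            # URLError and HTTPError are OSError subclasses, as are socket errors.
            results[path] = False
    return results


if __name__ == "__main__":
    for path, ok in check_endpoints("http://localhost:8002").items():
        print(f"{'ok  ' if ok else 'FAIL'} {path}")
```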

Frank is built by aiaiai-pt.