Deployment Guide

Target platform

exodus-gw is an ASGI application which may be deployed using any ASGI-compliant web server. The development team’s recommended setup is summarized as:

  • Use OpenShift >= 4.x to deploy the service.

  • Use the exodus-gw images at https://quay.io/repository/exodus/exodus-gw to run the service. These images run the service using gunicorn & uvicorn on RHEL8.

    In general, uvicorn's deployment advice applies to this setup.

  • Deploy the service’s primary container behind a reverse-proxy implementing authentication according to your organization’s needs (see next section).

Authentication & Authorization

The exodus-gw service does not implement any authentication mechanism. It is instead designed to integrate with a reverse-proxy implementing any desired mechanism.

Warning

If exodus-gw is deployed without an authenticating reverse-proxy, the service must be considered completely unsecured - all users will be able to perform all operations.

This reverse-proxy must add an X-RhApiPlatform-CallContext header onto all incoming requests. This header must contain a base64-encoded form of the following JSON object:

{
  "client": {
    "roles": ["someRole", "anotherRole"],
    "authenticated": true,
    "serviceAccountId": "clientappname"
  },
  "user": {
    "roles": ["viewer"],
    "authenticated": true,
    "internalUsername": "someuser"
  }
}

The roles and authenticated fields influence whether an exodus-gw request will be permitted - the necessary roles are documented on relevant exodus-gw API endpoints. Other fields are unused or used only for logging.

The separate client and user fields can be used to separate service accounts (machine users) from human users, but this does not affect exodus-gw.

Within Red Hat, a container known as “platform-sidecar” is used as the reverse proxy - consult internal documentation for information on this component. In other contexts, any reverse proxy may be used as long as it produces headers according to the scheme documented above.
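As a sketch, a reverse proxy (or a test client) could construct the header value as follows. This is illustrative only; the role names and user names are placeholders, and the only facts assumed are those stated above (base64-encoded JSON in the X-RhApiPlatform-CallContext header):

```python
import base64
import json


def make_call_context(roles: list[str], username: str) -> str:
    """Encode a call-context object as expected by exodus-gw:
    a JSON document, base64-encoded.

    This example marks only the human user as authenticated and
    leaves the service-account (client) fields empty.
    """
    context = {
        "client": {"roles": [], "authenticated": False, "serviceAccountId": ""},
        "user": {
            "roles": roles,
            "authenticated": True,
            "internalUsername": username,
        },
    }
    return base64.b64encode(json.dumps(context).encode("utf-8")).decode("ascii")


# The resulting value would be sent as the X-RhApiPlatform-CallContext header.
header_value = make_call_context(["viewer"], "someuser")
```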

Database Migrations

The exodus-gw service uses a postgres database.

On startup, the service will run database migrations to ensure the DB implements the required schema.

It is a goal that migrations can be performed online with minimal disruption to the service, even with old and new versions of the service running simultaneously (for example, during an OpenShift rolling deployment).

Downgrading to an earlier version of the schema is not directly supported by the service. However, as exodus-gw is designed not to store any permanent state, dropping and recreating the exodus-gw database is a viable option if needed.

Settings

class exodus_gw.settings.Settings

exodus-gw may be configured by the following settings.

Each setting may be overridden using an environment variable of the same name, prefixed with EXODUS_GW_ (example: EXODUS_GW_CALL_CONTEXT_HEADER).
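For example, settings could be overridden in the service's environment like this (the values shown are hypothetical):

```shell
# Override the call_context_header and db_service_host settings.
# Setting name, upper-cased, with the EXODUS_GW_ prefix.
export EXODUS_GW_CALL_CONTEXT_HEADER="X-RhApiPlatform-CallContext"
export EXODUS_GW_DB_SERVICE_HOST="db.example.com"
```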

call_context_header: str

Name of the header from which to extract call context (for authentication and authorization).

upload_meta_fields: dict[str, str]

Permitted metadata field names for s3 uploads and their regex for validation. E.g., "exodus-migration-md5": "^[0-9a-f]{32}$"
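As an illustrative sketch (not the service's actual implementation), validating upload metadata against such a setting might look like:

```python
import re

# Example setting value; the field name and pattern mirror the example above.
UPLOAD_META_FIELDS = {"exodus-migration-md5": r"^[0-9a-f]{32}$"}


def validate_meta(metadata: dict[str, str]) -> bool:
    """Return True only if every metadata field is permitted and its
    value matches the configured pattern."""
    for name, value in metadata.items():
        pattern = UPLOAD_META_FIELDS.get(name)
        if pattern is None or not re.match(pattern, value):
            return False
    return True


ok = validate_meta({"exodus-migration-md5": "d41d8cd98f00b204e9800998ecf8427e"})
bad = validate_meta({"unknown-field": "x"})
```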

publish_paths: dict[str, dict[str, list[str]]]

A set of user or service accounts which are authorized to publish only to a particular set of paths in a given CDN environment, together with the regex(es) describing the paths to which each account is authorized to publish. A listed account will be prevented from publishing to any path that does not match its defined regular expression(s). E.g., '{"pre": {"fake-user": ["^(/content)?/origin/files/sha256/[0-f]{2}/[0-f]{64}/[^/]{1,300}$"]}}'

Any user or service account not included in this configuration is considered to have unrestricted publish access (i.e., can publish to any path).
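A sketch of how such a restriction could be evaluated, using the example value above (this is not the service's actual code; names are illustrative):

```python
import re

# Example publish_paths value, as in the setting's documentation.
PUBLISH_PATHS = {
    "pre": {
        "fake-user": [
            r"^(/content)?/origin/files/sha256/[0-f]{2}/[0-f]{64}/[^/]{1,300}$"
        ]
    }
}


def can_publish(env: str, account: str, path: str) -> bool:
    """Accounts absent from the config are unrestricted; listed accounts
    may only publish to paths matching at least one of their patterns."""
    patterns = PUBLISH_PATHS.get(env, {}).get(account)
    if patterns is None:
        return True  # unrestricted
    return any(re.match(p, path) for p in patterns)


allowed = can_publish(
    "pre", "fake-user", "/origin/files/sha256/ab/" + "a" * 64 + "/file.rpm"
)
denied = can_publish("pre", "fake-user", "/some/other/path")
unrestricted = can_publish("pre", "other-user", "/some/other/path")
```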

log_config: dict[str, Any]

Logging configuration in dictConfig schema.

ini_path: str | None

Path to an exodus-gw.ini config file with additional settings.

db_service_user: str

Database service username.

db_service_pass: str

Database service password.

db_service_host: str

Database service host.

db_service_port: str

Database service port.

db_url: str | None

Connection string for database. If set, overrides the db_service_* settings.

db_reset: bool

If set to True, drop all DB tables during startup.

This setting is intended for use during development.

db_migration_mode: MigrationMode

Adjusts the DB migration behavior when the exodus-gw service starts.

Valid values are:

upgrade (default)

Migrate the DB to db_migration_revision (default latest) when the service starts up.

This is the default setting and should be left enabled for typical production use.

model

Don’t use migrations. Instead, attempt to initialize the database from the current version of the internal sqlalchemy model.

This is intended for use during development while prototyping schema changes.

none

Don’t perform any DB initialization at all.

db_migration_revision: str

If db_migration_mode is upgrade, this setting can be used to override the target revision when migrating the DB.

db_session_max_tries: int

The maximum number of attempts to recreate a DB session within a request.

item_yield_size: int

Number of publish items to load from the service DB at one time.

write_batch_size: int

Maximum number of items to write to the DynamoDB table at one time.

write_max_tries: int

Maximum write attempts to the DynamoDB table.

write_max_workers: int

Maximum number of worker threads used in the DynamoDB batch writes.

write_queue_size: int

Maximum number of items the queue can hold at one time.

write_queue_timeout: int

Maximum amount of time (in seconds) to wait for queue items. Defaults to 10 minutes.

publish_timeout: int

Maximum amount of time (in hours) between updates to a pending publish before it will be considered abandoned. Defaults to one day.

history_timeout: int

Maximum amount of time (in hours) to retain historical data for publishes and tasks. Publishes and tasks in a terminal state will be erased after this time has passed. Defaults to two weeks.

path_history_timeout: int

Maximum amount of time (in days) to retain data on published paths for the purpose of cache flushing.

task_deadline: int

Maximum amount of time (in hours) a task should remain viable. Defaults to two hours.

actor_time_limit: int

Maximum amount of time (in milliseconds) actors may run. Defaults to 30 minutes.

actor_max_backoff: int

Maximum amount of time (in milliseconds) actors may use while backing off retries. Defaults to five minutes.

entry_point_files: list[str]

List of file names that should be saved for last when publishing.

autoindex_filename: str

Filename for indexes automatically generated during publish.

Can be set to an empty string to disable generation of indexes.

autoindex_partial_excludes: list[str]

Background processing of autoindexes will be disabled for paths matching any of these values.

config_cache_ttl: int

Time (in minutes) config is expected to live in components that consume it.

Determines the delay for deployment task completion to allow for existing caches to expire and the newly deployed config to take effect.

worker_health_filepath: str

The path to a file used to verify the healthiness of a worker. Intended to be used by OCP (OpenShift Container Platform).

worker_keepalive_timeout: int

Background worker keepalive timeout, in seconds. If a worker fails to update its status within this time period, it is assumed dead.

This setting affects how quickly the system can recover from issues such as a worker process being killed unexpectedly.

worker_keepalive_interval: int

How often, in seconds, background workers should update their status.

cron_cleanup: str

Cron-style schedule for the cleanup task.

exodus-gw will run a cleanup task approximately according to this schedule, removing old data from the system.
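For example, a daily cleanup at 02:00 could be configured via the environment (the schedule value here is hypothetical):

```shell
# Standard five-field cron syntax:
# minute hour day-of-month month day-of-week
export EXODUS_GW_CRON_CLEANUP="0 2 * * *"
```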

scheduler_interval: int

How often, in minutes, exodus-gw should check if a scheduled task is ready to run.

Note that the cron rules applied to each scheduled task are only as accurate as this interval allows, i.e. each rule may be triggered up to scheduler_interval minutes late.

scheduler_delay: int

Delay, in minutes, after exodus-gw workers start up before any scheduled tasks should run.

cdn_flush_on_commit: bool

Whether ‘commit’ tasks should automatically flush CDN cache for affected URLs.

Only takes effect for environments where cache flush credentials/settings have been configured.

cdn_listing_flush: bool

Whether listing paths in the config should be flushed while deploying the config.

cdn_cookie_ttl: int

Time (in seconds) cookies generated by cdn-redirect remain valid.

cdn_signature_timeout: int

Time (in seconds) signed URLs remain valid.

cdn_max_expire_days: int

Maximum permitted value for expire_days option on cdn-access endpoint.

Clients obtaining signed cookies for CDN using cdn-access will be forced to renew their cookies at least this frequently.

s3_pool_size: int

Number of S3 clients to cache.

To enable per-environment configuration of exodus-gw, an exodus-gw.ini file may be used to point the application at specific AWS resources and to declare the AWS profile to use when interacting with those resources. Each environment must appear in its own section named with the prefix "env.".

[env.prod]
aws_profile = production
bucket = cdn-prod-s3
table = cdn-prod-db

Logger levels may also be configured via exodus-gw.ini. Under a section named “loglevels”, users may specify a logger name and the level at which to set said logger.

[loglevels]
root = NOTSET
exodus-gw = INFO
s3 = DEBUG
...

CDN cache flush

exodus-gw supports flushing the cache of an Akamai CDN edge via the Fast Purge API.

This feature is optional. If configuration is not provided, related APIs in exodus-gw will continue to function but will skip cache flush operations.

Enabling the feature requires the deployment of two sets of configuration.

Firstly, in exodus-gw.ini, define some cache flush rules under sections named [cache_flush.{rule_name}].

Each rule must define a list of URL/ARL templates for calculating the cache keys to flush. Rules may optionally define includes and excludes to select specific paths where the rule should be applied.

Once rules are defined, enable them for a specific environment by listing them in cache_flush_rules under that environment’s configuration. See the following example:

[env.live]
# Rule(s) to activate for this environment.
#
# This example supposes that there are two CDN hostnames in use,
# one of which exposes all content *except* a certain subtree
# and one which exposes *only* that subtree.
cache_flush_rules =
  cdn1
  cdn2

[cache_flush.cdn1]
# URL or ARL template(s) for which to flush cache.
#
# Templates can use placeholders:
# - path: path of a file under CDN root
# - ttl: a TTL value will be substituted
templates =
  https://cdn1.example.com
  S/=/123/22334455/{ttl}/cdn1.example.com/{path}

# Suppose that "/files" is restricted to cdn2, then the
# exclusion pattern here will avoid unnecessarily flushing
# cdn1 cache for paths underneath that subtree.
excludes =
  ^/files/

[cache_flush.cdn2]
templates =
  https://cdn2.example.com
  S/=/123/22334455/{ttl}/cdn2.example.com/{path}

# This rule only applies to this subtree, which was excluded
# from the other rule.
includes =
  ^/files/

Secondly, use environment variables to deploy credentials for the Fast Purge API, according to the table below. The fields here correspond to those used by the .edgerc file as found in Akamai's documentation.

Note that “<env>” should be replaced with the specific corresponding environment name, e.g. EXODUS_GW_FASTPURGE_HOST_LIVE for a live environment.

Fast Purge credentials

Variable                                 | .edgerc field | Example
-----------------------------------------|---------------|--------
EXODUS_GW_FASTPURGE_CLIENT_SECRET_<env>  | client_secret | abcdEcSnaAt123FNkBxy456z25qx9Yp5CPUxlEfQeTDkfh4QA=I
EXODUS_GW_FASTPURGE_HOST_<env>           | host          | akab-lmn789n2k53w7qrs10cxy-nfkxaa4lfk3kd6ym.luna.akamaiapis.net
EXODUS_GW_FASTPURGE_ACCESS_TOKEN_<env>   | access_token  | akab-zyx987xa6osbli4k-e7jf5ikib5jknes3
EXODUS_GW_FASTPURGE_CLIENT_TOKEN_<env>   | client_token  | akab-nomoflavjuc4422-fa2xznerxrm3teg7
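Putting it together for a hypothetical "live" environment, the deployment might export the following (using the placeholder values from the table above, which are not real credentials):

```shell
# Fast Purge credentials for the "live" environment; values are the
# placeholders from the table above, not real credentials.
export EXODUS_GW_FASTPURGE_CLIENT_SECRET_LIVE='abcdEcSnaAt123FNkBxy456z25qx9Yp5CPUxlEfQeTDkfh4QA=I'
export EXODUS_GW_FASTPURGE_HOST_LIVE='akab-lmn789n2k53w7qrs10cxy-nfkxaa4lfk3kd6ym.luna.akamaiapis.net'
export EXODUS_GW_FASTPURGE_ACCESS_TOKEN_LIVE='akab-zyx987xa6osbli4k-e7jf5ikib5jknes3'
export EXODUS_GW_FASTPURGE_CLIENT_TOKEN_LIVE='akab-nomoflavjuc4422-fa2xznerxrm3teg7'
```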