Deployment Guide
Target platform
exodus-gw is an ASGI application which may be deployed using any ASGI-compliant web server. The development team’s recommended setup is summarized as:
- Use OpenShift >= 4.x to deploy the service.
- Use the exodus-gw images at https://quay.io/repository/exodus/exodus-gw to run the service. These images run the service using gunicorn & uvicorn on RHEL8.
- In general, the uvicorn deployment advice applies.
- Deploy the service’s primary container behind a reverse-proxy implementing authentication according to your organization’s needs (see next section).
Database Migrations
The exodus-gw service uses a PostgreSQL database.
On startup, the service will run database migrations to ensure the DB implements the required schema.
It is a goal that migrations can be performed online with minimal disruption to the service, even with old and new versions of the service running simultaneously (for example, during an OpenShift rolling deployment).
Downgrading to an earlier version of the schema is not directly supported by the service. However, as exodus-gw is designed not to store any permanent state, dropping and recreating the exodus-gw database is a viable option if needed.
Settings
- class exodus_gw.settings.Settings¶
exodus-gw may be configured by the following settings.
Each settings value may be overridden using an environment variable of the same name, prefixed with EXODUS_GW_ (example: EXODUS_GW_CALL_CONTEXT_HEADER).
- call_context_header: str¶
Name of the header from which to extract call context (for authentication and authorization).
- upload_meta_fields: dict[str, str]¶
Permitted metadata field names for s3 uploads and their regex for validation. E.g., "exodus-migration-md5": "^[0-9a-f]{32}$"
- publish_paths: dict[str, dict[str, list[str]]]¶
Restricts named user or service accounts to publishing under specific paths in a given CDN environment. For each environment, each listed account maps to one or more regexes describing the paths to which that account may publish; publishing to any path that does not match at least one of these regexes is prevented. E.g., '{"pre": {"fake-user": ["^(/content)?/origin/files/sha256/[0-f]{2}/[0-f]{64}/[^/]{1,300}$"]}}'
Any user or service account not included in this configuration is considered to have unrestricted publish access (i.e., can publish to any path).
- db_reset: bool¶
If set to True, drop all DB tables during startup.
This setting is intended for use during development.
- db_migration_mode: MigrationMode¶
Adjusts the DB migration behavior when the exodus-gw service starts.
Valid values are:
- upgrade (default)
Migrate the DB to db_migration_revision (default latest) when the service starts up.
This is the default setting and should be left enabled for typical production use.
- model
Don’t use migrations. Instead, attempt to initialize the database from the current version of the internal sqlalchemy model.
This is intended for use during development while prototyping schema changes.
- none
Don’t perform any DB initialization at all.
- db_migration_revision: str¶
If db_migration_mode is upgrade, this setting can be used to override the target revision when migrating the DB.
- db_session_max_tries: int¶
The maximum number of attempts to recreate a DB session within a request.
- write_queue_timeout: int¶
Maximum amount of time (in seconds) to wait for queue items. Defaults to 10 minutes.
- publish_timeout: int¶
Maximum amount of time (in hours) between updates to a pending publish before it will be considered abandoned. Defaults to one day.
- history_timeout: int¶
Maximum amount of time (in hours) to retain historical data for publishes and tasks. Publishes and tasks in a terminal state will be erased after this time has passed. Defaults to two weeks.
- path_history_timeout: int¶
Maximum amount of time (in days) to retain data on published paths for the purpose of cache flushing.
- task_deadline: int¶
Maximum amount of time (in hours) a task should remain viable. Defaults to two hours.
- actor_time_limit: int¶
Maximum amount of time (in milliseconds) actors may run. Defaults to 30 minutes.
- actor_max_backoff: int¶
Maximum amount of time (in milliseconds) actors may use while backing off retries. Defaults to five (5) minutes.
- phase2_patterns: list[Pattern[str]]¶
List of patterns. A path matching any of these patterns is forced to be handled during phase 2 of commit.
These patterns are intended for use with repositories not cleanly separated between mutable entry points and immutable content.
For example, in-place updates to kickstart repositories may not only modify entry points such as extra_files.json but also arbitrary files referenced by that entry point, all of which should be processed during phase 2 of commit in order for updates to appear atomic.
- autoindex_filename: str¶
Filename for indexes automatically generated during publish.
Can be set to an empty string to disable generation of indexes.
- autoindex_partial_excludes: list[str]¶
Background processing of autoindexes will be disabled for paths matching any of these values.
- config_cache_ttl: int¶
Time (in minutes) config is expected to live in components that consume it.
Determines the delay for deployment task completion to allow for existing caches to expire and the newly deployed config to take effect.
- worker_health_filepath: str¶
The path to a file used to verify healthiness of a worker. Intended to be used by OpenShift (OCP).
- worker_keepalive_timeout: int¶
Background worker keepalive timeout, in seconds. If a worker fails to update its status within this time period, it is assumed dead.
This setting affects how quickly the system can recover from issues such as a worker process being killed unexpectedly.
- worker_keepalive_interval: int¶
How often, in seconds, background workers should update their status.
- cron_cleanup: str¶
Cron-style schedule for the cleanup task.
exodus-gw will run a cleanup task approximately according to this schedule, removing old data from the system.
- scheduler_interval: int¶
How often, in minutes, exodus-gw should check if a scheduled task is ready to run.
Note that the cron rules applied to each scheduled task are only as accurate as this interval allows, i.e. each rule may be triggered up to scheduler_interval minutes late.
- scheduler_delay: int¶
Delay, in minutes, after exodus-gw workers start up before any scheduled tasks should run.
- cdn_flush_on_commit: bool¶
Whether ‘commit’ tasks should automatically flush CDN cache for affected URLs.
Only takes effect for environments where cache flush credentials/settings have been configured.
- cdn_listing_flush: bool¶
Whether listing paths in the config should be flushed while deploying the config.
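As an illustration of the environment variable override and the publish_paths setting described above, the following is a minimal sketch and not exodus-gw's own code. It assumes that the value of EXODUS_GW_PUBLISH_PATHS is supplied as a JSON string (as the quoted example suggests); the helper names load_publish_paths and may_publish are hypothetical.

```python
import json
import re


def load_publish_paths(environ):
    # Assumption: the dict-valued setting is passed as a JSON string.
    return json.loads(environ.get("EXODUS_GW_PUBLISH_PATHS", "{}"))


def may_publish(publish_paths, env, user, path):
    # Hypothetical check mirroring the documented behaviour.
    patterns = publish_paths.get(env, {}).get(user)
    if patterns is None:
        # Accounts not listed in the configuration are unrestricted.
        return True
    # Listed accounts may publish only to paths matching at least one regex.
    return any(re.match(pattern, path) for pattern in patterns)


# A stand-in for os.environ, using the example value from the setting above.
environ = {
    "EXODUS_GW_PUBLISH_PATHS": json.dumps(
        {
            "pre": {
                "fake-user": [
                    r"^(/content)?/origin/files/sha256/[0-f]{2}/[0-f]{64}/[^/]{1,300}$"
                ]
            }
        }
    )
}

publish_paths = load_publish_paths(environ)
allowed_path = "/content/origin/files/sha256/ab/" + "ab" * 32 + "/file.rpm"
denied_path = "/content/dist/rhel/file.rpm"

print(may_publish(publish_paths, "pre", "fake-user", allowed_path))
print(may_publish(publish_paths, "pre", "fake-user", denied_path))
print(may_publish(publish_paths, "pre", "other-user", denied_path))
```

In this sketch, an account that is listed with an empty list of regexes cannot publish anywhere, while an account that is absent from the configuration remains unrestricted.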
To enable per-environment configuration of exodus-gw, exodus-gw.ini is available to point the application at specific AWS resources and declare the AWS profile to use when interacting with those resources. Each environment must appear in its own section with the prefix “env.”.
[env.prod]
aws_profile = production
bucket = cdn-prod-s3
table = cdn-prod-db
Logger levels may also be configured via exodus-gw.ini. Under a section named “loglevels”, users may specify a logger name and the level at which to set said logger.
[loglevels]
root = NOTSET
exodus-gw = INFO
s3 = DEBUG
...
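To make the layout of exodus-gw.ini concrete, here is a minimal sketch of how such a file could be read with Python's standard configparser module. It is not the service's actual loader; the function names load_environments and apply_loglevels are hypothetical, and the INI text is inlined only to keep the example self-contained.

```python
import configparser
import logging

EXAMPLE_INI = """
[env.prod]
aws_profile = production
bucket = cdn-prod-s3
table = cdn-prod-db

[loglevels]
root = NOTSET
exodus-gw = INFO
s3 = DEBUG
"""


def load_environments(config):
    # Collect every section named "env.<name>" into a dict keyed by <name>.
    prefix = "env."
    return {
        section[len(prefix):]: dict(config[section])
        for section in config.sections()
        if section.startswith(prefix)
    }


def apply_loglevels(config):
    # Set each named logger to the configured level and report the result.
    applied = {}
    if config.has_section("loglevels"):
        for name, level in config["loglevels"].items():
            logger = logging.getLogger() if name == "root" else logging.getLogger(name)
            logger.setLevel(level)
            applied[name] = logging.getLevelName(logger.level)
    return applied


config = configparser.ConfigParser()
config.read_string(EXAMPLE_INI)

environments = load_environments(config)
loglevels = apply_loglevels(config)
print(environments)
print(loglevels)
```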
CDN cache flush
exodus-gw supports flushing the cache of an Akamai CDN edge via the Fast Purge API.
This feature is optional. If configuration is not provided, related APIs in exodus-gw will continue to function but will skip cache flush operations.
Enabling the feature requires the deployment of two sets of configuration.
Firstly, in exodus-gw.ini, define some cache flush rules under sections named [cache_flush.{rule_name}].
Each rule must define a list of URL/ARL templates for calculating the cache keys to flush. Rules may optionally define includes and excludes to select specific paths where the rule should be applied.
Once rules are defined, enable them for a specific environment by listing them in cache_flush_rules under that environment’s configuration.
See the following example:
[env.live]
# Rule(s) to activate for this environment.
#
# This example supposes that there are two CDN hostnames in use,
# one of which exposes all content *except* a certain subtree
# and one which exposes *only* that subtree.
cache_flush_rules =
  cdn1
  cdn2

[cache_flush.cdn1]
# URL or ARL template(s) for which to flush cache.
#
# Templates can use placeholders:
# - path: path of a file under CDN root
# - ttl: a TTL value will be substituted
templates =
  https://cdn1.example.com
  S/=/123/22334455/{ttl}/cdn1.example.com/{path}
# Suppose that "/files" is restricted to cdn2, then the
# exclusion pattern here will avoid unnecessarily flushing
# cdn1 cache for paths underneath that subtree.
excludes =
  ^/files/

[cache_flush.cdn2]
templates =
  https://cdn2.example.com
  S/=/123/22334455/{ttl}/cdn2.example.com/{path}
# This rule only applies to this subtree, which was excluded
# from the other rule.
includes =
  ^/files/
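The way includes and excludes select rules for a path can be sketched as follows. This is an illustrative reading of the rule description above, not exodus-gw's implementation: it assumes that a rule with includes applies only to paths matching at least one include, and that a path matching any exclude is skipped. The helper names multiline and rules_for_path are hypothetical, and only the URL templates from the example are reproduced.

```python
import configparser
import re

EXAMPLE_INI = """
[env.live]
cache_flush_rules =
    cdn1
    cdn2

[cache_flush.cdn1]
templates =
    https://cdn1.example.com
excludes =
    ^/files/

[cache_flush.cdn2]
templates =
    https://cdn2.example.com
includes =
    ^/files/
"""


def multiline(value):
    # Split a multi-line INI value into a list of non-empty entries.
    return [line.strip() for line in value.splitlines() if line.strip()]


def rules_for_path(config, env, path):
    # Return the names of the rules enabled for `env` that apply to `path`.
    selected = []
    for name in multiline(config[f"env.{env}"].get("cache_flush_rules", "")):
        rule = config[f"cache_flush.{name}"]
        includes = multiline(rule.get("includes", ""))
        excludes = multiline(rule.get("excludes", ""))
        # Assumption: when includes are defined, at least one must match.
        if includes and not any(re.search(p, path) for p in includes):
            continue
        # Assumption: any matching exclude disables the rule for this path.
        if any(re.search(p, path) for p in excludes):
            continue
        selected.append(name)
    return selected


config = configparser.ConfigParser()
config.read_string(EXAMPLE_INI)

print(rules_for_path(config, "live", "/files/image.iso"))
print(rules_for_path(config, "live", "/content/repo/repodata/repomd.xml"))
```

With the two-hostname layout from the example, each path is flushed on exactly one CDN hostname, which is the purpose of pairing an excludes pattern on one rule with the same includes pattern on the other.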
Secondly, use environment variables to deploy credentials for the Fast Purge API, according to the below table. The fields here correspond to those used by the .edgerc file as found in Akamai’s documentation.
Note that “<env>” should be replaced with the specific corresponding environment name, e.g. EXODUS_GW_FASTPURGE_HOST_LIVE for a live environment.
[Table of Fast Purge credential environment variables (columns include Variable and Example); cell contents not preserved.]