repo-autoindex

Minimal generator for HTML indexes of content repositories.

Overview

repo-autoindex is a minimal Python library and CLI for generating static HTML indexes for content repositories of various types. It supports:

  • yum repositories (repodata/repomd.xml)

  • pulp file repositories (PULP_MANIFEST)

  • kickstart tree repositories* (treeinfo, repodata/repomd.xml, extra_files.json)

repo-autoindex provides similar functionality to traditional server-generated directory indexes such as httpd’s mod_autoindex, with a few key differences:

  • The generated indexes are intentionally limited to show only the content present in repository metadata, rather than all content within a directory.

  • The method of obtaining the content for indexing can be customized, allowing the library to handle unusual scenarios such as repositories generated on demand or not stored within a traditional filesystem.

* repo-autoindex supports only those kickstart tree repositories that satisfy both of the following conditions:

  • The kickstart repo contains exactly one yum repo

  • The yum repo is located in the root of the kickstart tree repo, at exactly .

Reference: CLI

Generate indexes for a repository accessed via HTTP(S)

usage: repo-autoindex [-h] [--index-filename FILENAME] [--debug] url

Positional Arguments

url

Base URL of repository to be indexed

Named Arguments

--index-filename

Basename of output file(s)

Default: “index.html”

--debug

Enable verbose logging

Default: False

Example

In the following example we generate indexes for a single Fedora yum repository. Note that the command generates multiple HTML files, reproducing the directory structure found in the repo.

REPO_URL=$(curl -s 'https://mirrors.fedoraproject.org/mirrorlist?repo=updates-released-f36&arch=x86_64' | grep -E '^http' | head -n1)
repo-autoindex "$REPO_URL"
Fetching: https://fedora.mirror.digitalpacific.com.au/linux/updates/36/Everything/x86_64/repodata/repomd.xml
Fetching: https://fedora.mirror.digitalpacific.com.au/linux/updates/36/Everything/x86_64/repodata/32cf6191e4ef86045c9f34589d98f6378069359746b50def80a66e15fe5a906f-primary.xml.gz
Wrote ./index.html
Wrote repodata/index.html
Wrote Packages/index.html
Wrote Packages/z/index.html
Wrote Packages/y/index.html
Wrote Packages/x/index.html
Wrote Packages/w/index.html
Wrote Packages/v/index.html
Wrote Packages/u/index.html
Wrote Packages/t/index.html
Wrote Packages/s/index.html
Wrote Packages/r/index.html
Wrote Packages/q/index.html
Wrote Packages/p/index.html
Wrote Packages/o/index.html
Wrote Packages/n/index.html
Wrote Packages/m/index.html
Wrote Packages/l/index.html
Wrote Packages/k/index.html
Wrote Packages/j/index.html
Wrote Packages/i/index.html
Wrote Packages/h/index.html
Wrote Packages/g/index.html
Wrote Packages/f/index.html
Wrote Packages/e/index.html
Wrote Packages/d/index.html
Wrote Packages/c/index.html
Wrote Packages/b/index.html
Wrote Packages/a/index.html
Wrote Packages/3/index.html

Reference: API

exception repo_autoindex.ContentError

An error raised when indexed content appears to be invalid.

Errors of this type are raised when repo-autoindex is able to successfully retrieve content and determine a repository type but fails to parse repository metadata. For example, a corrupt yum repository may cause this error to be raised.

class repo_autoindex.GeneratedIndex(content: str, relative_dir: str = '.')

A single HTML index page generated by repo-autoindex.

content: str

The content of this index page (an HTML document).

relative_dir: str = '.'

The directory of this index page, relative to the root of the indexed repository.
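As an illustration of how the two fields fit together, the following minimal sketch saves one page to disk. The helper name write_index and its out_dir and filename parameters are hypothetical and not part of repo-autoindex; the helper only assumes an object with content and relative_dir attributes, as described above.

```python
from pathlib import Path


def write_index(index, out_dir, filename="index.html"):
    """Write one generated index page beneath out_dir and return its path.

    `index` is any object exposing `content` and `relative_dir`,
    such as a GeneratedIndex. This helper is hypothetical.
    """
    # relative_dir is relative to the repository root; '.' maps to out_dir itself.
    target_dir = Path(out_dir) / index.relative_dir
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_text(index.content, encoding="utf-8")
    return target
```

Calling this once per page would reproduce the repository's directory layout under out_dir, much as the CLI example does in the current directory.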

async repo_autoindex.autoindex(url: str, *, fetcher: Optional[Callable[[str], Awaitable[Optional[Union[str, BinaryIO]]]]] = None, index_href_suffix: str = '') → AsyncGenerator[GeneratedIndex, None]

Generate HTML indexes for a repository.

Parameters:
  • url – Base URL of repository to be indexed. The function will probe this URL for all supported repository types.

  • fetcher

    An optional callable to customize the retrieval method for content in the repository. Can be omitted to use a basic HTTP(S) fetcher.

    A valid implementation must satisfy this contract:

    • it will be called with the absolute URL of content which may or may not exist within the repository (e.g. “https://example.com/some-yum-repo/repodata/repomd.xml” when probing a yum repository)

    • if the fetcher can determine, without error, that the requested content does not exist, it must return None.

    • if the fetcher can retrieve the requested content, it must return the content at the given URL as a file-like object.

      Returning a str is also possible, but not recommended since it requires loading an entire file into memory at once, and some repositories contain very large files.

      Note that decompressing compressed files (such as bzipped XML in yum repositories) is the responsibility of the fetcher.

    • if the fetcher encounters an exception, it may allow the exception to propagate.

  • index_href_suffix

    Suffix added onto any links between one generated index and another.

    For example, if the caller intends to save each generated index page as autoindex.html, then index_href_suffix="autoindex.html" should be passed so that any links between one index and another will use a correct URL.

    On the other hand, if the caller intends to save each generated index page as index.html and serve them via a web server which automatically serves files named index.html within each directory, the suffix can be left blank.
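The fetcher contract described above can be sketched as follows, for the case of a repository mirrored in a local directory. The function name make_directory_fetcher and its parameters are hypothetical. The sketch also assumes that compressed files can be recognized by a .gz suffix and that the URLs passed in are trusted; other compression formats and path sanitization are left out.

```python
import gzip
from pathlib import Path
from typing import BinaryIO, Optional


def make_directory_fetcher(base_url: str, root: str):
    """Build a fetcher serving URLs under base_url from the directory root.

    Hypothetical helper; not part of repo-autoindex.
    """
    prefix = base_url.rstrip("/") + "/"
    root_path = Path(root)

    async def fetcher(url: str) -> Optional[BinaryIO]:
        if not url.startswith(prefix):
            # Outside this repository: report the content as nonexistent.
            return None
        path = root_path / url[len(prefix):]
        if not path.is_file():
            # Content is known not to exist, so the contract asks for None.
            return None
        if path.suffix == ".gz":
            # Decompression is the fetcher's job (assumption: gzip only).
            return gzip.open(path, "rb")
        # Other failures (e.g. permission errors) simply propagate.
        return open(path, "rb")

    return fetcher
```

A fetcher built this way would be passed as autoindex(url, fetcher=fetcher). If the resulting pages are to be saved as autoindex.html, index_href_suffix="autoindex.html" would be passed alongside it.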

Returns:

An async generator producing zero or more instances of GeneratedIndex.

Zero indexes may be produced if the given URL doesn’t represent a repository of any supported type.
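A minimal sketch of consuming the generator, including the zero-index case, might look like this. The names collect_indexes and main are hypothetical. The sketch assumes at most one page per relative_dir and holds every page in memory, which may not suit very large repositories.

```python
import asyncio


async def collect_indexes(indexes):
    """Drain an async iterable of index pages into {relative_dir: content}."""
    pages = {}
    async for index in indexes:
        pages[index.relative_dir] = index.content
    return pages


async def main(url):
    # Imported here so the helper above stays usable without the library.
    from repo_autoindex import autoindex

    pages = await collect_indexes(autoindex(url))
    if not pages:
        # Zero indexes: the URL is not a repository of any supported type.
        print(f"No supported repository found at {url}")
    for relative_dir in sorted(pages):
        print(relative_dir)
```

Running asyncio.run(main(url)) with a repository URL would list the directories for which an index was generated.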

Raises:
  • ContentError – Raised if indexed content appears to be invalid (for example, a yum repository has invalid repodata).

  • Exception – Any exception raised by fetcher will propagate (for example, I/O errors or HTTP request failures).