Include a Default Database for the Container Registry
Release notes
Include a Default Database for the Container Registry
The metadata database for the Container Registry is GA as of 17.3. This feature provides a default database that the Container Registry connects to out of the box. With this, the metadata database can be enabled by default without requiring users to provision and configure a database from scratch.
Problem to solve
As a Systems Administrator, I want to spin up a new small to moderately sized GitLab instance using the Container Registry without needing to research the metadata database feature and spin up a new database for the registry, so I can take advantage of the new features of the registry and avoid a required future migration to the database.
Intended users
User experience goal
A GitLab admin should be able to enable a new instance of the Container Registry with the metadata database enabled and preconfigured to connect to an existing database.
Proposal
This proposal seeks to address the following concerns:
- Can the registry use a separate logical database on an existing database instance by default?
- What user recommendations can we provide as registry load scales?
- What impact does the default registry database have on HA/DR and Geo setups?
This proposal also seeks to consolidate the efforts from the following issues into a single space for discussion:
- Research and discussion: Provisioning Container... (container-registry#1102 - closed)
- Provision a Container Registry database for Omn... (container-registry#1074 - closed)
- Spike: Add support for the new container regist... (gitlab-org/distribution/team-tasks#1107 - closed)
- Container Registry: Using a database to store m... (gitlab-org/distribution/team-tasks#606 - closed)
Further details
Geo implementation details
View #480742 (comment 2228622136) for more details about options that were considered.
Non-Geo setup migration to Geo setup
- Both primary and secondary sites must have a separate instance of PG for the metadata DB.
- Separate the metadata DB from the main Rails DB into its own instance on the primary site.
  - Use pg_dump and pg_restore for this. We must document the process.
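The pg_dump and pg_restore step above could look roughly like the following sketch. It only builds the two command lines and does not run them. The host names, user, database name, and dump path are hypothetical placeholders, and the sketch assumes that the target database already exists on the new instance and that the registry is stopped or read-only while the dump is taken. Neither assumption comes from this issue.

```python
import shlex
from typing import List


def build_dump_command(host: str, port: int, user: str,
                       dbname: str, dump_file: str) -> List[str]:
    """Build a pg_dump call that writes a custom-format archive of one database."""
    return [
        "pg_dump",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--format", "custom",  # custom-format archives are restored with pg_restore
        "--file", dump_file,
        dbname,
    ]


def build_restore_command(host: str, port: int, user: str,
                          dbname: str, dump_file: str) -> List[str]:
    """Build a pg_restore call that loads the archive into an existing database."""
    return [
        "pg_restore",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--dbname", dbname,
        "--no-owner",  # do not try to recreate the source instance's object ownership
        dump_file,
    ]


if __name__ == "__main__":
    # All values below are hypothetical and only illustrate the shape of the commands.
    dump = build_dump_command(
        "rails-db.example.internal", 5432, "registry", "registry", "/tmp/registry.dump")
    restore = build_restore_command(
        "registry-db.example.internal", 5432, "registry", "registry", "/tmp/registry.dump")
    print(shlex.join(dump))
    print(shlex.join(restore))
```

Keeping the commands as argument lists makes them straightforward to pass to subprocess.run later and to check in a documented runbook before anything touches a live database.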
Limitations
- No HA for the metadata DB if the Linux package version is used. If managed DBs such as RDS are used, their HA capabilities can be leveraged.
Availability & Testing
What risks does this change pose to our availability?
We are already using the metadata database on GitLab.com, so the risks are essentially zero.
How might it affect the quality of the product?
With the metadata database enabled, new features such as continuous online garbage collection become available. Additionally, the metadata database unifies metadata features implemented across five separate storage drivers into a single implementation, greatly reducing the complexity of the registry.
What additional test coverage or changes to tests will be needed?
The registry itself already has excellent test coverage for this feature.
It's possible that the Test Platform Team may need to create smoke and performance tests, similar to the ones used to validate the reference architectures.
Available Tier
- Free
Feature Usage Metrics
We track the number of users who have enabled the metadata database via usage ping.
What does success look like, and how can we measure that?
Is this a cross-stage feature?
Yes: group::geo, group::database, and group::distribution are affected.
What is the competitive advantage or differentiation for this feature?
Running a container registry can be prohibitively expensive because container images consume object storage space and operations. Most container registry implementations, including ours when not using the database, require the registry to be taken offline in order to safely remove unused registry data. As registry usage increases, offline garbage collection takes prohibitively long to complete. The process is also, for the most part, not interruptible, meaning that users generally cannot terminate it early and still reclaim part of the storage.
Online garbage collection, enabled by the metadata database, solves this problem. Garbage collection runs continuously in the background, enabling storage to be recouped without downtime.