Solution implementation for "users can ask documentation questions also on SM Chat" based on retrieving snippets from Vertex AI Search

Release notes

A popular capability of GitLab Duo Chat is answering questions about how to use GitLab. While Duo Chat offers various other capabilities, this particular functionality was previously only available on GitLab.com. With this release, we're making it accessible to self-managed deployments as well, ensuring a consistent experience across all GitLab deployments.

Whether you're a newcomer or an expert, you can now ask Duo Chat for help with queries like "How do I change my password in GitLab?" or "How do I connect a Kubernetes cluster to GitLab?". Duo Chat aims to provide helpful information to solve your problems more efficiently.

By bringing this intuitive support to self-managed users, we're aligning with our commitment to delivering a delightful experience across all deployment methods.

https://docs.gitlab.com/ee/user/gitlab_duo_chat.html#ask-about-gitlab

Problem to solve

Main problem:

This is very urgent, as we do not want our SM customers to feel that Duo Chat primarily works on .com. We want Duo Chat to be on par between .com and SM, and this is currently our only gap.

Side problems:

  • This solution can effectively resolve GitLab Duo Chat: Expanded GitLab Documentation ... (#440428), since the data store can be attached to multiple repositories.
  • The existing solution on .com uses a basic embeddings chunking method. We do not yet have tooling to measure how well that works. Picking a solution that does more elaborate, and likely better, chunking would be a plus.
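To illustrate what "basic" versus "more elaborate" chunking can mean, here is a minimal sketch in Python. It is not the chunking used on .com or by Vertex AI Search. Both functions and the sample document are hypothetical and only contrast fixed-size cuts with cuts aligned to Markdown headings.

```python
import re


def chunk_fixed(text: str, size: int) -> list[str]:
    """Naive chunking: cut every `size` characters, ignoring document structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def chunk_by_heading(markdown: str) -> list[str]:
    """Structure-aware chunking: start a new chunk at each Markdown heading."""
    chunks: list[str] = []
    current: list[str] = []
    for line in markdown.splitlines():
        # A heading closes the chunk collected so far (if there is one).
        if re.match(r"#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


# Hypothetical sample document, for illustration only.
doc = "# Passwords\nChange it in your profile.\n## Reset\nUse the reset link."

if __name__ == "__main__":
    print(chunk_fixed(doc, 20))
    print(chunk_by_heading(doc))
```

Fixed-size cuts can split a sentence, or separate a heading from its body, while heading-aligned chunks keep each topic together. That is one reason a retrieval service with built-in chunking is attractive here.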

Architectural overview

Architecture Doc

Proposal

Tasks:

Notes:

  • This approach allows us to minimize code that we can't update on a customer's behalf, which means avoiding hard-coding AI-related logic in the GitLab monolith codebase. We can retain the flexibility to make changes in our product without asking customers to upgrade their GitLab version.
  • We can filter out irrelevant content from the GitLab documentation, e.g. contribution and development docs. See this comment for more information.
  • This solution doesn't meet the requirement that the "Technical solution should be extensible to cover future use cases where SM users can ask questions about private / dynamic data (i.e. issues, MRs, etc.)".
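The filtering mentioned in the notes above could look roughly like the following Python sketch. The path prefixes and function names are assumptions made for illustration. The actual exclusion rules are described in the linked comment, not here.

```python
# Hypothetical prefixes for docs that are not useful to end users asking
# "how do I ..." questions. The real list may differ.
EXCLUDED_PREFIXES = ("doc/development/", "doc/contributing/")


def is_user_facing(path: str) -> bool:
    """Return True if a documentation file should be indexed for Duo Chat."""
    # str.startswith accepts a tuple and matches any of its entries.
    return not path.startswith(EXCLUDED_PREFIXES)


def select_for_indexing(paths: list[str]) -> list[str]:
    """Keep only the documentation files that pass the filter, in order."""
    return [p for p in paths if is_user_facing(p)]


if __name__ == "__main__":
    candidates = [
        "doc/user/gitlab_duo_chat.md",
        "doc/development/ai_features.md",
        "doc/contributing/style_guide.md",
    ]
    print(select_for_indexing(candidates))
```

Because this selection would happen when the data store is populated rather than inside the GitLab monolith, the rules could be adjusted without asking customers to upgrade.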

Alternatives that were considered

  • The runner-up for a quick solution was based on operating Elasticsearch to host the docs vectors. The effort is about the same. The downside is that we would still need to do the chunking ourselves, and we are not yet experts on this.
  • The other options considered were to operate Elasticsearch or pgvector in the SM instances. One of these two is definitely the long-term solution, and it will also cover many other use cases, such as embedding repos and projects for searching in them. But this would be a multi-month project, and we don't want to wait that long to bring SM on par with .com regarding docs-related questions.

Customers waiting for this (not complete list)

Documentation

Links / references

This page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.

Edited by Mark Chao