Draft: PoC for unified Ai::UsageEvent approach

What does this MR do and why?

It builds a PoC for generalized storage of Ai::UsageEvent records. Intended usage:

Event definition

Sample event definition

Framework users are expected to define their events, and any additional data transformation logic, in their own files under the ee/lib/gitlab/analytics/ai_usage_events/ directory.

In these files they must define a unique ID for each event and, optionally, a payload transformation. The transformation is evaluated in the scope of each event to retrieve the additional context data to be stored.
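The shape of such a definition file can be illustrated with a minimal, self-contained sketch. This is not the DSL introduced by this MR: the `AiUsageEventsSketch` module, the `Registry` class, its `register` and `build` methods, the event name `:example_event_without_payload`, and both numeric IDs are hypothetical and exist only to show the two required pieces, a unique ID per event and an optional payload transformation.

```ruby
# Hypothetical stand-in for an event registry; not the MR's actual DSL.
module AiUsageEventsSketch
  Definition = Struct.new(:name, :id, :transformation)

  class Registry
    def initialize
      @by_name = {}
    end

    # Registers an event under a unique integer ID. The optional block
    # derives extra payload data from the tracking context.
    def register(name, id:, &transformation)
      raise ArgumentError, "duplicate event name: #{name}" if @by_name.key?(name)
      raise ArgumentError, "duplicate event id: #{id}" if @by_name.values.any? { |d| d.id == id }

      @by_name[name] = Definition.new(name, id, transformation)
    end

    # Builds the attributes that would be stored for one tracked event.
    def build(name, **context)
      definition = @by_name.fetch(name)
      payload = definition.transformation ? definition.transformation.call(context) : {}
      { event: definition.id, payload: payload }
    end
  end
end

registry = AiUsageEventsSketch::Registry.new

# Event with a payload transformation (the ID 7 is an assumption).
registry.register(:troubleshoot_job, id: 7) do |context|
  job = context.fetch(:job)
  { job_id: job[:id], project_id: job[:project_id] }
end

# Event without a transformation: only the basic attributes are stored.
registry.register(:example_event_without_payload, id: 1001)

p registry.build(:troubleshoot_job, job: { id: 1, project_id: 4 })
```

Rejecting duplicate IDs at registration time is one way to enforce the uniqueness requirement when several teams keep their definitions in separate files.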

Event trigger

# directly
Gitlab::Tracking::AiTracking.track_event(:troubleshoot_job, user: User.first, job: Ci::Build.first)
# through internal_events
Gitlab::InternalEvents.track_event(:code_suggestion_shown_in_ide, 
  user: User.first, 
  project: Project.first, 
  additional_properties: {unique_tracking_id: 'foo', timestamp: '2025-06-11', language: 'ruby', suggestion_size: 18})

Event Data retrieval

After the background workers have processed the queued events, the data appears in PostgreSQL (PG) and, if enabled, ClickHouse (CH). In ClickHouse you can query it with filters on the payload field:

[1] pry(main)> ClickHouse::Client.select("SELECT * FROM ai_usage_events WHERE payload.project_id > 1", :main)
=> [{"user_id"=>1, "event"=>7, "timestamp"=>Wed, 11 Jun 2025 15:00:59.936000000 UTC +00:00, "namespace_path"=>"27/28/", "payload"=>{"foo"=>"bar", "job_id"=>"1", "pipeline_id"=>"1", "project_id"=>"4"}},
 {"user_id"=>1, "event"=>7, "timestamp"=>Wed, 11 Jun 2025 15:49:15.941000000 UTC +00:00, "namespace_path"=>"27/28/", "payload"=>{"foo"=>"bar", "job_id"=>"1", "pipeline_id"=>"1", "project_id"=>"4"}}]
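In the sample output above, the payload values come back as strings (for example `"project_id"=>"4"`). Whether that holds for every payload key is an assumption. If it does, callers may want to convert selected keys before comparing or aggregating in Ruby. The sketch below works on hand-copied rows shaped like that output, with timestamps omitted; the helper `with_integer_payload` is hypothetical.

```ruby
# Rows shaped like the ClickHouse output above (timestamps omitted).
rows = [
  { "user_id" => 1, "event" => 7, "namespace_path" => "27/28/",
    "payload" => { "foo" => "bar", "job_id" => "1", "pipeline_id" => "1", "project_id" => "4" } },
  { "user_id" => 1, "event" => 7, "namespace_path" => "27/28/",
    "payload" => { "foo" => "bar", "job_id" => "1", "pipeline_id" => "1", "project_id" => "4" } }
]

# Returns a copy of the row whose named payload keys are converted to
# integers; all other payload keys are left untouched.
def with_integer_payload(row, keys)
  payload = row.fetch("payload").to_h do |key, value|
    [key, keys.include?(key) ? Integer(value) : value]
  end
  row.merge("payload" => payload)
end

typed_rows = rows.map { |row| with_integer_payload(row, %w[job_id pipeline_id project_id]) }
project_ids = typed_rows.map { |row| row["payload"]["project_id"] }.uniq

p project_ids
```

`Integer(value)` raises on non-numeric input, so a malformed payload fails loudly rather than silently becoming 0 as it would with `to_i`.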

Automatic GraphQL exposure

Basic attributes of new events are automatically exposed through the AiUsageData.all field. Developers can create their own endpoints or fields in AiUsageData for subgroups of their events, e.g. AiUsageData.codeSuggestionEvents.

Features

  1. Data is stored in PG with a 3-month retention period. If a customer enables CH, the data is backfilled to CH automatically.
  2. CH offers deep filtering and deep select on JSON columns as described above.
  3. Each team can create their own event definition files, creating separation from other events.
  4. Events can be triggered directly or integrated as part of Gitlab::InternalEvents.
  5. Raw events can be automatically exposed to AiUsageData GraphQL endpoint.

References

Screenshots or screen recordings

Before After

How to set up and validate locally

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #547686 (closed)

Edited by Pavel Shutsin
