diff --git a/doc/development/sec/vulnerability_management/adding_elasticsearch_filters.md b/doc/development/sec/vulnerability_management/adding_elasticsearch_filters.md new file mode 100644 index 0000000000000000000000000000000000000000..73dd053dbfaa8617e9893e69fc8213e85e417e63 --- /dev/null +++ b/doc/development/sec/vulnerability_management/adding_elasticsearch_filters.md @@ -0,0 +1,665 @@ +# Adding New Elasticsearch Filters for Vulnerability Management + +This guide covers how to add new filters using Elasticsearch for advanced vulnerability management features in GitLab. These features are part of the advanced vulnerability management capabilities and require Elasticsearch to be configured. + +## Overview + +Advanced vulnerability management features in GitLab use Elasticsearch to provide enhanced filtering capabilities that go beyond what's possible with PostgreSQL alone. This includes features like: + +- Advanced OWASP Top 10 filtering +- Identifier name searching +- Reachability filtering +- Complex aggregations and analytics + +## Prerequisites + +Before adding a new Elasticsearch filter, ensure you understand: + +- GitLab's Elasticsearch integration architecture +- Vulnerability data model and relationships +- GraphQL resolver patterns +- Database migration patterns +- Feature flag usage + +## Step-by-Step Implementation Guide + +### 1. GraphQL Resolver Updates + +First, add the new filter argument to the appropriate GraphQL resolver. Most vulnerability filters are added to `VulnerabilitiesResolver`. + +**File:** `ee/app/graphql/resolvers/vulnerabilities_resolver.rb` + +```ruby +argument :your_new_filter, GraphQL::Types::String, + required: false, + experiment: { milestone: '18.x' }, + description: 'Filter vulnerabilities by your new criteria. ' \ + 'To use this argument, you must have Elasticsearch configured and the ' \ + '`advanced_vulnerability_management` feature flag enabled. ' \ + 'Not supported on Instance Security Dashboard queries.' 
+``` + +**Key considerations:** +- Mark experimental features with `experiment: { milestone: 'X.Y' }` +- Include clear documentation about Elasticsearch requirements +- Note any limitations (e.g., Instance Security Dashboard support) + +### 2. Update VulnerabilityFilterable Module + +Add validation logic for your new filter in the `VulnerabilityFilterable` module. + +**File:** `ee/app/graphql/resolvers/vulnerability_filterable.rb` + +```ruby +# Add your filter to the ADVANCED_FILTERS constant if it requires ES +ADVANCED_FILTERS = [:owasp_top_10_2021, :identifier_name, :reachability, :your_new_filter].freeze + +# Add validation logic in the validate_filters method +def validate_filters(filters) + # ... existing validations ... + + validate_your_new_filter!(vulnerable) if filters[:your_new_filter].present? +end + +private + +def validate_your_new_filter!(vulnerable) + # Add any specific validation logic for your filter + # Example: check if required migrations are complete + return if Feature.enabled?(:your_new_filter_feature_flag, vulnerable) && + ::Elastic::DataMigrationService.migration_has_finished?(:add_your_field_to_vulnerability) + + raise ::Gitlab::Graphql::Errors::ArgumentError, + 'The \'your_new_filter\' argument is not currently supported. ' \ + 'This feature requires Elasticsearch and specific migrations to be complete.' +end +``` + +### 3. Use Elasticsearch Finder + +The `VulnerabilityElasticSearchFinder` automatically uses the Elasticsearch query builder. No changes are typically needed here unless you're adding a completely new finder pattern. + +**File:** `ee/app/finders/security/vulnerability_elastic_search_finder.rb` + +The finder delegates to `VulnerabilityQueryBuilder` which handles all filter logic. + +### 4. Elasticsearch Query Builder + +Add your filter logic to the vulnerability query builder. + +**File:** `ee/lib/search/elastic/vulnerability_query_builder.rb` + +```ruby +def build + # ... existing filters ... 
+ + query_hash = ::Search::Elastic::VulnerabilityFilters.by_your_new_filter( + query_hash: query_hash, options: options) + + # ... rest of the method ... +end +``` + +### 5. Elasticsearch Vulnerability Filters + +Implement the actual filter logic in the VulnerabilityFilters module. + +**File:** `ee/lib/search/elastic/vulnerability_filters.rb` + +```ruby +def by_your_new_filter(query_hash:, options:) + your_filter_value = options[:your_new_filter] + return query_hash if your_filter_value.blank? + + # Add validation if needed + return query_hash unless valid_filter_value?(your_filter_value) + + context.name(:filters) do + add_filter(query_hash, :query, :bool, :filter) do + { + terms: { + _name: context.name(:your_new_filter), + your_field_name: your_filter_value + } + } + end + end +end + +private + +def valid_filter_value?(value) + # Add validation logic specific to your filter + # Return true if valid, false otherwise +end +``` + +**Common filter patterns:** + +- **Term filter** (exact match): `{ term: { field: { value: value } } }` +- **Terms filter** (multiple values): `{ terms: { field: values } }` +- **Range filter**: `{ range: { field: { gte: min, lte: max } } }` +- **Boolean filter** (complex logic): `{ bool: { must: [...], should: [...] } }` + +### 6. Elasticsearch Schema and Migrations + +#### Define the Elasticsearch Schema + +Update the vulnerability Elasticsearch type to include your new field. + +**File:** `ee/lib/search/elastic/types/vulnerability.rb` + +```ruby +def base_mappings + { + # ... existing mappings ... + your_field_name: { type: 'keyword' }, # or appropriate type + # ... rest of mappings ... 
+ } +end +``` + +**Common Elasticsearch field types:** +- `keyword`: For exact matching (IDs, enums, short strings) +- `text`: For full-text search +- `short`: For small integers (enums) +- `long`: For large integers +- `boolean`: For true/false values +- `date`: For timestamps + +#### Create Elasticsearch Migration + +Create a new Elasticsearch migration to add the field to existing indices. + +**File:** `ee/elastic/migrate/YYYYMMDDHHMMSS_add_your_field_to_vulnerability.rb` + +```ruby +# frozen_string_literal: true + +class AddYourFieldToVulnerability < Elastic::Migration + include ::Search::Elastic::MigrationUpdateMappingsHelper + + DOCUMENT_TYPE = Vulnerability + + private + + def index_name + ::Search::Elastic::Types::Vulnerability.index_name + end + + def new_mappings + { + your_field_name: { + type: 'keyword' # or appropriate type + } + } + end +end +``` + +#### Create Migration Documentation + +**File:** `ee/elastic/docs/YYYYMMDDHHMMSS_add_your_field_to_vulnerability.yml` + +```yaml +--- +name: AddYourFieldToVulnerability +version: 'YYYYMMDDHHMMSS' +description: Adds your_field_name field to the Vulnerability index for enhanced filtering capabilities. +group: group::security infrastructure +milestone: '18.x' +introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/XXXXX +obsolete: false +marked_obsolete_by_url: +marked_obsolete_in_milestone: +``` + +### 7. Keep Elasticsearch in Sync + +#### Update Vulnerability Reads Model + +Ensure your new field is tracked for Elasticsearch updates. 
+ +**File:** `ee/app/models/vulnerabilities/read.rb` + +```ruby +# Add your field to the tracked fields constant +ELASTICSEARCH_TRACKED_FIELDS = ::Search::Elastic::References::Vulnerability::DIRECT_FIELDS + + ::Search::Elastic::References::Vulnerability::DIRECT_TYPECAST_FIELDS + + %w[traversal_ids your_field_name] +``` + +#### Manual Bookkeeping for ActiveRecord Models + +For fields that require complex calculations or come from related models, implement manual bookkeeping: + +```ruby +# In the model that changes and affects the vulnerability field +after_commit :update_vulnerability_elasticsearch_field, on: [:create, :update, :destroy] + +private + +def update_vulnerability_elasticsearch_field + return unless vulnerability_read&.use_elasticsearch? + + # Update the field value + vulnerability_read.update_column(:your_field_name, calculate_new_value) + + # Trigger ES update + vulnerability_read.maintain_elasticsearch_update(updated_attributes: ['your_field_name']) +end +``` + +#### Bulk Operations with BulkEsOperationService + +For operations that affect multiple vulnerability records outside of ActiveRecord models, use the `BulkEsOperationService` helper: + +**File:** `ee/app/services/vulnerabilities/bulk_es_operation_service.rb` + +```ruby +# Example usage in a service that updates multiple vulnerabilities +def update_multiple_vulnerabilities(vulnerability_relation) + Vulnerabilities::BulkEsOperationService.new(vulnerability_relation).execute do |relation| + # Perform your database updates here + relation.update_all(your_field_name: new_value) + end +end +``` + +**Key benefits of BulkEsOperationService:** +- Handles preloading of necessary associations +- Filters to only eligible vulnerabilities (those that should be indexed) +- Batches Elasticsearch updates efficiently +- Works with both `Vulnerability` and `Vulnerabilities::Read` relations + +#### Sync During Vulnerability Ingestion Framework + +For bulk operations that don't use ActiveRecord models (like 
vulnerability ingestion), perform manual sync: + +**Example from Security Ingestion:** + +```ruby +# ee/app/services/security/ingestion/tasks/ingest_remediations.rb +def update_vulnerability_reads + unfound_reads = Vulnerabilities::Read.by_vulnerabilities( + vulnerability_ids_for_remediations(unfound_remediations)) + ::Vulnerabilities::BulkEsOperationService.new(unfound_reads).execute do |relation| + relation.update_all(has_remediations: false) + end + + found_reads = Vulnerabilities::Read.by_vulnerabilities( + vulnerability_ids_for_remediations(found_remediations)) + ::Vulnerabilities::BulkEsOperationService.new(found_reads).execute do |relation| + relation.update_all(has_remediations: true) + end +end +``` + +**Example from SBOM Ingestion:** + +```ruby +# ee/app/services/sbom/ingestion/tasks/ingest_occurrences_vulnerabilities.rb +def after_ingest + return unless return_data.present? + + vulnerabilities_relation = Vulnerability.id_in(return_data.flatten) + sync_elasticsearch(vulnerabilities_relation) if vulnerabilities_relation.present? 
+end + +def sync_elasticsearch(vulnerabilities) + ::Vulnerabilities::BulkEsOperationService.new(vulnerabilities).execute(&:itself) +end +``` + +#### Sync During Project/Group Operations + +**Project Deletion:** + +```ruby +# ee/app/services/vulnerabilities/removal/remove_from_project_service.rb +def sync_elasticsearch + vulnerabilities_to_delete = Vulnerability.id_in(vulnerability_ids) + Vulnerabilities::BulkEsOperationService.new(vulnerabilities_to_delete).execute(&:itself) +end +``` + +**Project/Group Transfer:** + +When projects or groups are transferred, vulnerability traversal IDs need to be updated: + +```ruby +# Update traversal_ids for vulnerabilities when group structure changes +Vulnerabilities::UpdateTraversalIdsOfVulnerabilityReadsService.new( + group_id: new_group.id +).execute +``` + +#### Future: Change Data Capture (CDC) + +GitLab is planning to move synchronization outside of application logic using Change Data Capture from PostgreSQL. This work is tracked in [Epic 18520](https://gitlab.com/groups/gitlab-org/-/epics/18520). + +**Benefits of CDC approach:** +- Removes sync logic from application code +- Reduces application complexity +- Improves reliability and consistency +- Enables real-time synchronization +- Reduces performance impact on application operations + +**Current Status:** +- In planning/research phase +- Will gradually replace manual sync mechanisms +- Timeline and implementation details are being defined + +### 8. Elasticsearch Backfill Migration + +Create a background migration to populate the new field for existing records. 
+ +**File:** `lib/gitlab/background_migration/backfill_your_field_to_vulnerability_reads.rb` + +```ruby +# frozen_string_literal: true + +module Gitlab + module BackgroundMigration + class BackfillYourFieldToVulnerabilityReads < BatchedMigrationJob + operation_name :backfill_your_field_to_vulnerability_reads + feature_category :vulnerability_management + + def perform + each_sub_batch do |sub_batch| + sub_batch.update_all(your_field_name: calculate_field_value) + end + end + + private + + def calculate_field_value + # Implement logic to calculate the field value + # This might involve joins or complex calculations + end + end + end +end +``` + +**Queue the migration:** + +**File:** `db/post_migrate/YYYYMMDDHHMMSS_queue_backfill_your_field_to_vulnerability_reads.rb` + +```ruby +# frozen_string_literal: true + +class QueueBackfillYourFieldToVulnerabilityReads < Gitlab::Database::Migration[2.2] + milestone '18.x' + + restrict_gitlab_migration gitlab_schema: :gitlab_main + + # The first argument to the helpers below is the job class name + MIGRATION = 'BackfillYourFieldToVulnerabilityReads' + DELAY_INTERVAL = 2.minutes + + def up + queue_batched_background_migration( + MIGRATION, + :vulnerability_reads, + :vulnerability_id, + job_interval: DELAY_INTERVAL + ) + end + + def down + # The last argument is the list of job arguments (none for this migration) + delete_batched_background_migration(MIGRATION, :vulnerability_reads, :vulnerability_id, []) + end +end +``` + +### 9. 
Feature Flags + +#### Create Feature Flag + +**File:** `ee/config/feature_flags/beta/your_new_filter.yml` + +```yaml +--- +name: your_new_filter +description: Enable the new filter for vulnerability management +feature_issue_url: https://gitlab.com/groups/gitlab-org/-/epics/XXXXX +introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/XXXXX +rollout_issue_url: https://gitlab.com/gitlab-org/gitlab/-/issues/XXXXX +milestone: '18.x' +group: group::security infrastructure +type: beta +default_enabled: false +``` + +#### Use Feature Flag in Code + +```ruby +# In validation logic +return unless Feature.enabled?(:your_new_filter, vulnerable) + +# In filter logic +return query_hash unless Feature.enabled?(:your_new_filter, options[:current_user]) +``` + +### 10. UI Access Control + +The UI automatically respects the `access_advanced_vulnerability_management` ability. This ability is granted when: + +1. User has `:read_security_resource` permission +2. Vulnerability indexing is enabled (Elasticsearch configured) +3. `advanced_vulnerability_management` feature flag is enabled +4. Required Elasticsearch migrations are complete + +**File:** `ee/app/policies/vulnerabilities/advanced_vulnerability_management_policy.rb` + +The policy automatically handles access control. No changes needed unless you have specific requirements. + +**Frontend Integration:** + +In Vue components, check the ability: + +```javascript +// The ability is automatically pushed to frontend in vulnerability controllers +computed: { + canUseAdvancedFilters() { + return this.glFeatures.accessAdvancedVulnerabilityManagement; + } +} +``` + +## Testing + +### Unit Tests + +1. **GraphQL Resolver Tests**: Test argument validation and filter application +2. **Finder Tests**: Test Elasticsearch query generation +3. **Filter Tests**: Test individual filter logic +4. **Migration Tests**: Test background migration logic + +### Integration Tests + +1. 
**End-to-end GraphQL Tests**: Test complete filter functionality +2. **Elasticsearch Integration Tests**: Test actual ES queries +3. **Feature Flag Tests**: Test behavior with flags enabled/disabled + +## Best Practices + +### Performance Considerations + +1. **Index Optimization**: Choose appropriate field types for your use case +2. **Query Efficiency**: Use term filters for exact matches, avoid expensive operations +3. **Pagination**: Always implement proper pagination for large result sets +4. **Caching**: Consider caching expensive calculations + +### Security Considerations + +1. **Input Validation**: Always validate filter inputs +2. **Access Control**: Respect existing permission models +3. **Data Exposure**: Ensure filters don't expose sensitive information + +### Maintainability + +1. **Documentation**: Document complex filter logic +2. **Testing**: Comprehensive test coverage for all scenarios +3. **Monitoring**: Add metrics for filter usage and performance +4. **Backwards Compatibility**: Consider migration paths for existing data + +## Common Patterns + +### Enum-based Filters (Straightforward) + +Enum and boolean field filters are straightforward to implement since they map directly to database values: + +```ruby +def by_enum_field(query_hash:, options:) + enum_values = options[:enum_field] + return query_hash if enum_values.blank? + + # Validate enum values - this is straightforward since enums have fixed values + valid_values = YourModel.enum_field.values + return query_hash unless (enum_values - valid_values).empty? + + context.name(:filters) do + add_filter(query_hash, :query, :bool, :filter) do + { terms: { enum_field: enum_values } } + end + end +end +``` + +### Boolean Filters (Straightforward) + +Boolean filters are also straightforward since they only accept true/false values: + +```ruby +def by_boolean_field(query_hash:, options:) + boolean_value = options[:boolean_field] + return query_hash if boolean_value.nil? 
+ + # Simple validation for boolean values + return query_hash unless boolean_value.in?([true, false]) + + context.name(:filters) do + add_filter(query_hash, :query, :bool, :filter) do + { term: { boolean_field: { value: boolean_value } } } + end + end +end +``` + +### Text Search Filters + +```ruby +def by_text_search(query_hash:, options:) + search_term = options[:text_search] + return query_hash if search_term.blank? + + context.name(:filters) do + add_filter(query_hash, :query, :bool, :filter) do + { + simple_query_string: { + fields: ["field_name", "field_name.ngram"], + query: search_term, + lenient: true, + default_operator: :and + } + } + end + end +end +``` + +## Performance Monitoring + +### Using Performance Bar to Monitor Elasticsearch Queries + +GitLab provides a performance bar that shows Elasticsearch queries executed during GraphQL requests. This is invaluable for debugging and optimizing your filters. + +**Enabling the Performance Bar:** + +1. **For Administrators**: The performance bar is automatically available +2. **For Non-administrators**: Must be enabled in Admin settings + - Go to **Admin Area > Settings > Metrics and profiling** + - Select **Allow non-administrators access to the performance bar** + +**Monitoring Elasticsearch Queries:** + +1. **Enable the performance bar** by pressing `p` + `b` on any GitLab page +2. **Execute your GraphQL query** that uses the new filter +3. **Click on the `es` section** in the performance bar to see: + - Number of Elasticsearch queries executed + - Total duration of all queries + - Individual query details including: + - Query method (GET, POST, etc.) 
+ - Query path and parameters + - Execution time for each query + +**Performance Bar Elasticsearch View:** + +The performance bar shows Elasticsearch metrics with color-coded warnings: +- **Green**: Normal performance +- **Yellow**: Approaching thresholds (5+ calls or 1000ms+ duration) +- **Red**: Exceeding performance thresholds + +**Example Performance Bar Output:** +``` +es: 3 (1.2s) +``` +This indicates 3 Elasticsearch queries taking 1.2 seconds total. + +**Detailed Query Information:** + +Click on the `es` count to see detailed information: +```json +{ + "method": "POST", + "path": "/gitlab-vulnerabilities/_search", + "params": { + "routing": "group_123", + "timeout": "30s" + }, + "duration": 450.2 +} +``` + +**Performance Thresholds:** + +The performance bar uses these default thresholds: +- **Calls**: 5 queries (warning threshold) +- **Duration**: 1000ms total (warning threshold) +- **Individual Call**: 1000ms per query (warning threshold) + +**Best Practices for Performance:** +- Keep the number of ES queries low (ideally 1-2 per request) +- Optimize query complexity to stay under duration thresholds +- Use appropriate field types (keyword vs text) for your use case +- Consider caching for frequently accessed data + +## Troubleshooting + +### Common Issues + +1. **Migration Not Complete**: Ensure Elasticsearch migrations are finished before enabling filters +2. **Permission Denied**: Verify user has required permissions and feature flags are enabled +3. **Query Performance**: Monitor Elasticsearch query performance and optimize as needed +4. **Data Inconsistency**: Ensure proper synchronization between PostgreSQL and Elasticsearch + +### Debugging + +1. **Enable Query Logging**: Use Elasticsearch query logging to debug issues +2. **Check Migration Status**: Verify migration completion status +3. **Test Feature Flags**: Ensure feature flags are properly configured +4. 
**Validate Data**: Check that data is properly indexed in Elasticsearch + +## Related Documentation + +- [Elasticsearch Integration](../../integration/advanced_search/elasticsearch.md) +- [GraphQL Development](../api_graphql_styleguide.md) +- [Background Migrations](../database/batched_background_migrations.md) +- [Feature Flags](../feature_flags/index.md) +- [Performance Bar](../../administration/monitoring/performance/performance_bar.md) +- [Vulnerability Management](./index.md) +- [SBOM Dependency Graph Ingestion](./sbom_dependency_graph_ingestion_overview.md) +- [Change Data Capture Epic](https://gitlab.com/groups/gitlab-org/-/epics/18520) \ No newline at end of file diff --git a/doc/development/sec/vulnerability_management/elastic_search_integration_architechture.md b/doc/development/sec/vulnerability_management/elastic_search_integration_architechture.md new file mode 100644 index 0000000000000000000000000000000000000000..132cbb74dbd4793fb9e8da4dc1b0ed8784875295 --- /dev/null +++ b/doc/development/sec/vulnerability_management/elastic_search_integration_architechture.md @@ -0,0 +1,159 @@ +--- +stage: SRM +group: Security Infrastructure +info: Any user with at least the Maintainer role can merge updates to this content. For details, see https://docs.gitlab.com/development/development_processes/#development-guidelines-review. +title: Elasticsearch Integration Architecture for Vulnerability Management +--- + +This document describes the architecture and implementation details of GitLab's Elasticsearch integration for advanced vulnerability management features. 
+ +## Overview + +The Elasticsearch integration for vulnerability management provides advanced querying capabilities including: + +- Complex filtering (OWASP Top 10, identifier names, reachability) +- Full-text search across vulnerability identifiers + +## Architecture Diagrams + +### High-Level Ingestion Architecture + +![High-Level Ingestion Architecture diagram](img/es_ingestion_high_level_arch.png) + +```mermaid +graph TB + subgraph "GitLab Application" + A[Security Scanner] --> B[Security Report Artifact] + B --> C[Security Report Ingestion] + C --> D[PostgreSQL Database] + C --> E[Vulnerabilities::Read Model] + end + + subgraph "Elasticsearch Integration" + E --> F[ES Indexing Service] + F --> G[Elasticsearch Cluster] + D --> H[Background Jobs] + H --> F + end + + subgraph "Query Layer" + I[GraphQL API] --> J[VulnerabilityElasticSearchFinder] + J --> K[VulnerabilityQueryBuilder] + K --> G + G --> L[Search Results] + L --> I + end + + subgraph "Data Synchronization" + M[ActiveRecord Callbacks] --> F + N[Background Migrations] --> F + O[Bulk Operations] --> P[BulkEsOperationService] + P --> F + end +``` + +### Detailed Ingestion and Querying Flow + +![Ingestion & Querying](img/es_ingestion_querying.png) + +```mermaid +flowchart TD + A[📥 Data Ingestion] --> B[💾 Backfill] + A --> C[🔄 Continuous Updates] + + B --> D[⬆️ Ingest Existing Data into ES] + + C --> E[⚡ Model Callback CUD Changes] + C --> F[💾 Bulk Updates without ActiveRecord] + + E --> G[⬆️ Update ES with CUD Changes] + F --> H[⬆️ Perform Bulk Updates in ES] + + I[🔍 Querying] --> J[🔍 Filtering/Searching Operations] + J --> K[📋 Fetch Vulnerability IDs from ES] + K --> L[🎯 Presentation Logic from PG with the fetched IDs] +``` + +```mermaid +sequenceDiagram + participant Scanner as Security Scanner + participant Pipeline as CI Pipeline + participant Ingestion as Report Ingestion + participant PG as PostgreSQL + participant ES as Elasticsearch + participant API as GraphQL API + participant UI as Frontend UI 
+ + Note over Scanner, UI: Data Ingestion Flow + Scanner->>Pipeline: Generate Security Report + Pipeline->>Ingestion: Upload Report Artifact + Ingestion->>PG: Store Vulnerability Data + Ingestion->>ES: Index Vulnerability (via callbacks) + + Note over Scanner, UI: Background Synchronization + PG->>ES: Sync via Background Jobs + PG->>ES: Bulk Updates via BulkEsOperationService + + Note over Scanner, UI: Query Flow + UI->>API: GraphQL Query with Filters + API->>ES: Execute Elasticsearch Query + ES->>API: Return Search Results + API->>UI: Formatted Response + + Note over Scanner, UI: Fallback Mechanism + API->>PG: Fallback to PostgreSQL (if ES unavailable) + PG->>API: Basic Query Results +``` + +## Integration Stages + +The integration consists of data ingestion followed by synchronization to Elasticsearch (ES). + +#### 1.1 Security Report Processing +- Security scanners generate vulnerability reports in JSON format +- Reports are uploaded as CI/CD job artifacts +- The security report ingestion framework processes these artifacts +- Vulnerability data is stored in PostgreSQL tables + +#### 1.2 Elasticsearch Bookkeeping +- Bookkeeping happens during the vulnerability ingestion flow. + - Refer code here. +- The `Vulnerabilities::Read` and `Vulnerability` models also act as indexing sources. + - ActiveRecord callbacks trigger Elasticsearch updates on data changes. +- Service classes that process vulnerabilities use + - `bulk_es_operation_service`. + +#### 1.3 Elasticsearch Indexing +- The bookkeeping steps above add references to Redis. +- A bulk indexer Sidekiq cron job processes these references and indexes the corresponding documents into ES. 
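The two-step flow described in 1.2 and 1.3 (record a reference first, index later in batches) can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not GitLab's implementation: `SketchBookkeeping`, `SketchIndexer`, and all of their methods are hypothetical names, and an in-memory `Set` stands in for the Redis-backed queue.

```ruby
require 'set'

# Hypothetical stand-in for the bookkeeping queue. In GitLab the references
# live in Redis; a Set is used here only to show deduplication.
class SketchBookkeeping
  def initialize
    @refs = Set.new
  end

  # Called from callbacks or bulk services: records a reference, nothing more.
  def track(ref)
    @refs << ref
  end

  # Called periodically (a cron worker in the real system): takes up to
  # `limit` references and hands them to the indexer as one batch.
  def process(indexer, limit: 1000)
    batch = @refs.first(limit)
    return 0 if batch.empty?

    indexer.bulk_index(batch)
    batch.each { |ref| @refs.delete(ref) }
    batch.size
  end

  def pending
    @refs.size
  end
end

# Hypothetical stand-in for the bulk indexer; it only remembers what it saw.
class SketchIndexer
  attr_reader :indexed

  def initialize
    @indexed = []
  end

  def bulk_index(refs)
    @indexed.concat(refs)
  end
end

queue = SketchBookkeeping.new
indexer = SketchIndexer.new

queue.track('vulnerability_1')
queue.track('vulnerability_1') # repeated update to the same record
queue.track('vulnerability_2')

processed = queue.process(indexer)
```

Separating tracking from indexing keeps write paths cheap: a callback only enqueues a reference, and repeated updates to the same record collapse into a single indexing operation when the batch is processed.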
+ +**Key Components:** +- `ee/lib/gitlab/elastic/bulk_indexer.rb` - ES bulk_indexer + + + +## Troubleshooting + +### Common Issues +- **Sync Lag**: Check background job queues and ES cluster health +- **Query Failures**: Verify feature flags and migration status +- **Permission Errors**: Validate user permissions and access policies +- **Performance Issues**: Review query patterns and ES cluster resources + +### Debugging Tools +- Performance bar ES query monitoring +- ES query logging and profiling +- Background job monitoring +- Feature flag status checking + +## Related Documentation + +- [Adding New Elasticsearch Filters](adding_elasticsearch_filters.md) +- [Elasticsearch Vulnerability Data Model](elastic_search_vulnerability_data_model.md) +- [Security Report Ingestion Overview](../security_report_ingestion_overview.md) + +## References + +- [POC Issue #514697](https://gitlab.com/gitlab-org/gitlab/-/issues/514697) +- [Vulnerability Management Epic #13510](https://gitlab.com/groups/gitlab-org/-/epics/13510) +- [Change Data Capture Epic #18520](https://gitlab.com/groups/gitlab-org/-/epics/18520) \ No newline at end of file diff --git a/doc/development/sec/vulnerability_management/img/es_ingestion_high_level_arch.png b/doc/development/sec/vulnerability_management/img/es_ingestion_high_level_arch.png new file mode 100644 index 0000000000000000000000000000000000000000..79531b1a37d0add3f5cb0b18fed1a43d0edaa136 Binary files /dev/null and b/doc/development/sec/vulnerability_management/img/es_ingestion_high_level_arch.png differ diff --git a/doc/development/sec/vulnerability_management/img/es_ingestion_querying.png b/doc/development/sec/vulnerability_management/img/es_ingestion_querying.png new file mode 100644 index 0000000000000000000000000000000000000000..e17cad233274b12486fcaa3b27b948a5a71261b1 Binary files /dev/null and b/doc/development/sec/vulnerability_management/img/es_ingestion_querying.png differ diff --git 
a/doc/development/sec/vulnerability_management/local_setup.md b/doc/development/sec/vulnerability_management/local_setup.md new file mode 100644 index 0000000000000000000000000000000000000000..616dab600141f1b71dc3ebf7c9114cf4f9917250 --- /dev/null +++ b/doc/development/sec/vulnerability_management/local_setup.md @@ -0,0 +1,99 @@ +--- +stage: SRM +group: Security Infrastructure +info: Any user with at least the Maintainer role can merge updates to this content. For details, see https://docs.gitlab.com/development/development_processes/#development-guidelines-review. +title: Local setup and useful debugging commands +--- + +This document contains local setup notes for [advanced vulnerability management](https://docs.gitlab.com/user/application_security/vulnerability_report/#advanced-vulnerability-management) in the vulnerability report. + +### Local setup notes: + +1. Follow the steps in https://gitlab.com/gitlab-org/gitlab-development-kit/blob/main/doc/howto/elasticsearch.md#installation to enable ES in GDK. + +1. Cross-check that the UI option is enabled, as described in the setup instructions in https://gitlab.com/gitlab-org/gitlab-development-kit/blob/main/doc/howto/elasticsearch.md#setup + +1. In the Rails console, run the following ES migrations: + + ```ruby + > Elastic::DataMigrationService[20250408180015].migrate + + > Elastic::DataMigrationService[20250423184327].migrate + + > Elastic::DataMigrationService[20250611135718].migrate + ``` + +1. To test the vulnerabilities MVC features, configure GDK to run in SaaS mode. The easiest way is to add `export GITLAB_SIMULATE_SAAS=1` to the `env.runit` file in the GDK home directory. Other ways to enable SaaS mode are listed in https://docs.gitlab.com/development/ee_features/#simulate-a-saas-instance + +1. On the master branch or the corresponding feature branch, restart the GDK services with `> gdk restart` + +1. 
`> gdk status elasticsearch` should return the status with a PID like `run: elasticsearch: (pid 79011) 274294s, normally down; (pid 78972) 274296s` + + +--- + +### Useful commands to debug + +1. The following are example commands to use for debugging in Rails console. + +```ruby +# To create index +::Gitlab::Elastic::Helper.default.create_standalone_indices(target_classes: [Vulnerability]) + +# To backfill records into Redis +Vulnerabilities::Read.all.each { |v| ::Elastic::ProcessBookkeepingService.track!(Search::Elastic::References::Vulnerability.new(v.vulnerability_id, "group_#{v.project.namespace.root_ancestor.id}")) } + +# To process Redis refs into ES +Elastic::ProcessBookkeepingService.new.execute + + +# Example search query +Gitlab::Elastic::Helper.default.client.search( + index: "gitlab-development-vulnerabilities", + body: { + query: { + match_all: {} # This returns all documents + }, + _source: true # This returns all fields + } +) + +# Another example to filter `identifier_names` +Gitlab::Elastic::Helper.default.client.search( + index: "gitlab-development-vulnerabilities", + body: { + query: { + wildcard: { identifier_names: "**" } + }, + _source: ["identifier_names"] + } +) + +# To get all records for a given project +query_body = { + "query": { + "bool": { + "filter": { + "term": { + "project_id": 71 + } + } + } + }, + "size": 100, + _source: [] +} +Gitlab::Elastic::Helper.default.client.search( + index: "gitlab-development-vulnerabilities", + body: query_body +) +``` + +### Debugging Tools +- [Performance bar](https://docs.gitlab.com/development/advanced_search/#performance-bar) to analyze ES calls +- [Elasticsearch Logs](https://docs.gitlab.com/development/advanced_search/#logs) +- [Debugging](https://docs.gitlab.com/development/advanced_search/#setting-up-your-development-environment) in rails console + + +### Related links +- https://docs.gitlab.com/development/advanced_search/#setting-up-your-development-environment \ No newline at end of file 
diff --git a/doc/development/vulnerability_management/adding_elasticsearch_filters.md b/doc/development/vulnerability_management/adding_elasticsearch_filters.md new file mode 100644 index 0000000000000000000000000000000000000000..73dd053dbfaa8617e9893e69fc8213e85e417e63 --- /dev/null +++ b/doc/development/vulnerability_management/adding_elasticsearch_filters.md @@ -0,0 +1,665 @@ +# Adding New Elasticsearch Filters for Vulnerability Management + +This guide covers how to add new filters using Elasticsearch for advanced vulnerability management features in GitLab. These features are part of the advanced vulnerability management capabilities and require Elasticsearch to be configured. + +## Overview + +Advanced vulnerability management features in GitLab use Elasticsearch to provide enhanced filtering capabilities that go beyond what's possible with PostgreSQL alone. This includes features like: + +- Advanced OWASP Top 10 filtering +- Identifier name searching +- Reachability filtering +- Complex aggregations and analytics + +## Prerequisites + +Before adding a new Elasticsearch filter, ensure you understand: + +- GitLab's Elasticsearch integration architecture +- Vulnerability data model and relationships +- GraphQL resolver patterns +- Database migration patterns +- Feature flag usage + +## Step-by-Step Implementation Guide + +### 1. GraphQL Resolver Updates + +First, add the new filter argument to the appropriate GraphQL resolver. Most vulnerability filters are added to `VulnerabilitiesResolver`. + +**File:** `ee/app/graphql/resolvers/vulnerabilities_resolver.rb` + +```ruby +argument :your_new_filter, GraphQL::Types::String, + required: false, + experiment: { milestone: '18.x' }, + description: 'Filter vulnerabilities by your new criteria. ' \ + 'To use this argument, you must have Elasticsearch configured and the ' \ + '`advanced_vulnerability_management` feature flag enabled. ' \ + 'Not supported on Instance Security Dashboard queries.' 
+``` + +**Key considerations:** +- Mark experimental features with `experiment: { milestone: 'X.Y' }` +- Include clear documentation about Elasticsearch requirements +- Note any limitations (e.g., Instance Security Dashboard support) + +### 2. Update VulnerabilityFilterable Module + +Add validation logic for your new filter in the `VulnerabilityFilterable` module. + +**File:** `ee/app/graphql/resolvers/vulnerability_filterable.rb` + +```ruby +# Add your filter to the ADVANCED_FILTERS constant if it requires ES +ADVANCED_FILTERS = [:owasp_top_10_2021, :identifier_name, :reachability, :your_new_filter].freeze + +# Add validation logic in the validate_filters method +def validate_filters(filters) + # ... existing validations ... + + validate_your_new_filter!(vulnerable) if filters[:your_new_filter].present? +end + +private + +def validate_your_new_filter!(vulnerable) + # Add any specific validation logic for your filter + # Example: check if required migrations are complete + return if Feature.enabled?(:your_new_filter_feature_flag, vulnerable) && + ::Elastic::DataMigrationService.migration_has_finished?(:add_your_field_to_vulnerability) + + raise ::Gitlab::Graphql::Errors::ArgumentError, + 'The \'your_new_filter\' argument is not currently supported. ' \ + 'This feature requires Elasticsearch and specific migrations to be complete.' +end +``` + +### 3. Use Elasticsearch Finder + +The `VulnerabilityElasticSearchFinder` automatically uses the Elasticsearch query builder. No changes are typically needed here unless you're adding a completely new finder pattern. + +**File:** `ee/app/finders/security/vulnerability_elastic_search_finder.rb` + +The finder delegates to `VulnerabilityQueryBuilder` which handles all filter logic. + +### 4. Elasticsearch Query Builder + +Add your filter logic to the vulnerability query builder. + +**File:** `ee/lib/search/elastic/vulnerability_query_builder.rb` + +```ruby +def build + # ... existing filters ... 
+ + query_hash = ::Search::Elastic::VulnerabilityFilters.by_your_new_filter( + query_hash: query_hash, options: options) + + # ... rest of the method ... +end +``` + +### 5. Elasticsearch Vulnerability Filters + +Implement the actual filter logic in the VulnerabilityFilters module. + +**File:** `ee/lib/search/elastic/vulnerability_filters.rb` + +```ruby +def by_your_new_filter(query_hash:, options:) + your_filter_value = options[:your_new_filter] + return query_hash if your_filter_value.blank? + + # Add validation if needed + return query_hash unless valid_filter_value?(your_filter_value) + + context.name(:filters) do + add_filter(query_hash, :query, :bool, :filter) do + { + terms: { + _name: context.name(:your_new_filter), + your_field_name: your_filter_value + } + } + end + end +end + +private + +def valid_filter_value?(value) + # Add validation logic specific to your filter + # Return true if valid, false otherwise +end +``` + +**Common filter patterns:** + +- **Term filter** (exact match): `{ term: { field: { value: value } } }` +- **Terms filter** (multiple values): `{ terms: { field: values } }` +- **Range filter**: `{ range: { field: { gte: min, lte: max } } }` +- **Boolean filter** (complex logic): `{ bool: { must: [...], should: [...] } }` + +### 6. Elasticsearch Schema and Migrations + +#### Define the Elasticsearch Schema + +Update the vulnerability Elasticsearch type to include your new field. + +**File:** `ee/lib/search/elastic/types/vulnerability.rb` + +```ruby +def base_mappings + { + # ... existing mappings ... + your_field_name: { type: 'keyword' }, # or appropriate type + # ... rest of mappings ... 
+ } +end +``` + +**Common Elasticsearch field types:** +- `keyword`: For exact matching (IDs, enums, short strings) +- `text`: For full-text search +- `short`: For small integers (enums) +- `long`: For large integers +- `boolean`: For true/false values +- `date`: For timestamps + +#### Create Elasticsearch Migration + +Create a new Elasticsearch migration to add the field to existing indices. + +**File:** `ee/elastic/migrate/YYYYMMDDHHMMSS_add_your_field_to_vulnerability.rb` + +```ruby +# frozen_string_literal: true + +class AddYourFieldToVulnerability < Elastic::Migration + include ::Search::Elastic::MigrationUpdateMappingsHelper + + DOCUMENT_TYPE = Vulnerability + + private + + def index_name + ::Search::Elastic::Types::Vulnerability.index_name + end + + def new_mappings + { + your_field_name: { + type: 'keyword' # or appropriate type + } + } + end +end +``` + +#### Create Migration Documentation + +**File:** `ee/elastic/docs/YYYYMMDDHHMMSS_add_your_field_to_vulnerability.yml` + +```yaml +--- +name: AddYourFieldToVulnerability +version: 'YYYYMMDDHHMMSS' +description: Adds your_field_name field to the Vulnerability index for enhanced filtering capabilities. +group: group::security infrastructure +milestone: '18.x' +introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/XXXXX +obsolete: false +marked_obsolete_by_url: +marked_obsolete_in_milestone: +``` + +### 7. Keep Elasticsearch in Sync + +#### Update Vulnerability Reads Model + +Ensure your new field is tracked for Elasticsearch updates. 
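As a rough illustration of what "tracked" means here, the sketch below assumes a simplified model in which a change triggers reindexing only when it touches a field in the tracked list. `DIRECT_FIELDS`, `TRACKED_FIELDS`, and `reindex_needed?` are hypothetical stand-ins for this example, not GitLab's actual constants or methods.

```ruby
require 'set'

# Hypothetical stand-in for the reference class's field list (assumption).
DIRECT_FIELDS = %w[severity state report_type].freeze

# Tracked fields: the direct fields plus the extra columns named in this guide.
TRACKED_FIELDS = (DIRECT_FIELDS + %w[traversal_ids your_field_name]).to_set.freeze

# Returns true when at least one changed attribute is a tracked field,
# which is when a document update would need to be queued.
def reindex_needed?(changed_attributes)
  changed_attributes.any? { |attribute| TRACKED_FIELDS.include?(attribute.to_s) }
end

puts reindex_needed?(%w[your_field_name updated_at]) # tracked field changed
puts reindex_needed?(%w[updated_at])                 # only untracked fields changed
```

If a new field is left out of the tracked list in a model like this, updates to it never reach the index, which is why the constant has to be extended whenever a new filter field is added.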
+ +**File:** `ee/app/models/vulnerabilities/read.rb` + +```ruby +# Add your field to the tracked fields constant +ELASTICSEARCH_TRACKED_FIELDS = ::Search::Elastic::References::Vulnerability::DIRECT_FIELDS + + ::Search::Elastic::References::Vulnerability::DIRECT_TYPECAST_FIELDS + + %w[traversal_ids your_field_name] +``` + +#### Manual Bookkeeping for ActiveRecord Models + +For fields that require complex calculations or come from related models, implement manual bookkeeping: + +```ruby +# In the model that changes and affects the vulnerability field +after_commit :update_vulnerability_elasticsearch_field, on: [:create, :update, :destroy] + +private + +def update_vulnerability_elasticsearch_field + return unless vulnerability_read&.use_elasticsearch? + + # Update the field value + vulnerability_read.update_column(:your_field_name, calculate_new_value) + + # Trigger ES update + vulnerability_read.maintain_elasticsearch_update(updated_attributes: ['your_field_name']) +end +``` + +#### Bulk Operations with BulkEsOperationService + +For operations that affect multiple vulnerability records outside of ActiveRecord models, use the `BulkEsOperationService` helper: + +**File:** `ee/app/services/vulnerabilities/bulk_es_operation_service.rb` + +```ruby +# Example usage in a service that updates multiple vulnerabilities +def update_multiple_vulnerabilities(vulnerability_relation) + Vulnerabilities::BulkEsOperationService.new(vulnerability_relation).execute do |relation| + # Perform your database updates here + relation.update_all(your_field_name: new_value) + end +end +``` + +**Key benefits of BulkEsOperationService:** +- Handles preloading of necessary associations +- Filters to only eligible vulnerabilities (those that should be indexed) +- Batches Elasticsearch updates efficiently +- Works with both `Vulnerability` and `Vulnerabilities::Read` relations + +#### Sync During Vulnerability Ingestion Framework + +For bulk operations that don't use ActiveRecord models (like 
vulnerability ingestion), perform manual sync: + +**Example from Security Ingestion:** + +```ruby +# ee/app/services/security/ingestion/tasks/ingest_remediations.rb +def update_vulnerability_reads + unfound_reads = Vulnerabilities::Read.by_vulnerabilities( + vulnerability_ids_for_remediations(unfound_remediations)) + ::Vulnerabilities::BulkEsOperationService.new(unfound_reads).execute do |relation| + relation.update_all(has_remediations: false) + end + + found_reads = Vulnerabilities::Read.by_vulnerabilities( + vulnerability_ids_for_remediations(found_remediations)) + ::Vulnerabilities::BulkEsOperationService.new(found_reads).execute do |relation| + relation.update_all(has_remediations: true) + end +end +``` + +**Example from SBOM Ingestion:** + +```ruby +# ee/app/services/sbom/ingestion/tasks/ingest_occurrences_vulnerabilities.rb +def after_ingest + return unless return_data.present? + + vulnerabilities_relation = Vulnerability.id_in(return_data.flatten) + sync_elasticsearch(vulnerabilities_relation) if vulnerabilities_relation.present? 
+end + +def sync_elasticsearch(vulnerabilities) + ::Vulnerabilities::BulkEsOperationService.new(vulnerabilities).execute(&:itself) +end +``` + +#### Sync During Project/Group Operations + +**Project Deletion:** + +```ruby +# ee/app/services/vulnerabilities/removal/remove_from_project_service.rb +def sync_elasticsearch + vulnerabilities_to_delete = Vulnerability.id_in(vulnerability_ids) + Vulnerabilities::BulkEsOperationService.new(vulnerabilities_to_delete).execute(&:itself) +end +``` + +**Project/Group Transfer:** + +When projects or groups are transferred, vulnerability traversal IDs need to be updated: + +```ruby +# Update traversal_ids for vulnerabilities when group structure changes +Vulnerabilities::UpdateTraversalIdsOfVulnerabilityReadsService.new( + group_id: new_group.id +).execute +``` + +#### Future: Change Data Capture (CDC) + +GitLab is planning to move synchronization outside of application logic using Change Data Capture from PostgreSQL. This work is tracked in [Epic 18520](https://gitlab.com/groups/gitlab-org/-/epics/18520). + +**Benefits of CDC approach:** +- Removes sync logic from application code +- Reduces application complexity +- Improves reliability and consistency +- Enables real-time synchronization +- Reduces performance impact on application operations + +**Current Status:** +- In planning/research phase +- Will gradually replace manual sync mechanisms +- Timeline and implementation details are being defined + +### 8. Elasticsearch Backfill Migration + +Create a background migration to populate the new field for existing records. 
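Batched background migrations work through a table in batches, and each batch is further split into sub-batches so that a single `update_all` touches a bounded number of rows. The sketch below shows only that splitting idea in plain Ruby. `sub_batches` is a hypothetical helper written for this example, and the sizes are arbitrary; it is not how GitLab's `each_sub_batch` is implemented.

```ruby
# Hypothetical helper: splits a list of ids into batches of `batch_size`,
# then splits each batch into sub-batches of `sub_batch_size`.
def sub_batches(ids, batch_size:, sub_batch_size:)
  ids.each_slice(batch_size).flat_map do |batch|
    batch.each_slice(sub_batch_size).to_a
  end
end

# Each sub-batch is the unit that one bulk update statement would cover.
sub_batches((1..7).to_a, batch_size: 4, sub_batch_size: 2).each do |ids|
  puts "would update vulnerability_reads rows #{ids.inspect}"
end
```

Smaller sub-batches keep each statement short at the cost of more round trips, so the sizes are usually tuned per table.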
+ +**File:** `lib/gitlab/background_migration/backfill_your_field_to_vulnerability_reads.rb` + +```ruby +# frozen_string_literal: true + +module Gitlab + module BackgroundMigration + class BackfillYourFieldToVulnerabilityReads < BatchedMigrationJob + operation_name :backfill_your_field_to_vulnerability_reads + feature_category :vulnerability_management + + def perform + each_sub_batch do |sub_batch| + sub_batch.update_all(your_field_name: calculate_field_value) + end + end + + private + + def calculate_field_value + # Implement logic to calculate the field value + # This might involve joins or complex calculations + end + end + end +end +``` + +**Queue the migration:** + +**File:** `db/post_migrate/YYYYMMDDHHMMSS_queue_backfill_your_field_to_vulnerability_reads.rb` + +```ruby +# frozen_string_literal: true + +class QueueBackfillYourFieldToVulnerabilityReads < Gitlab::Database::Migration[2.2] + milestone '18.x' + + restrict_gitlab_migration gitlab_schema: :gitlab_main + + # The job class name is passed as the first positional argument + MIGRATION = 'BackfillYourFieldToVulnerabilityReads' + # Illustrative values; tune the interval and batch sizes for your table + DELAY_INTERVAL = 2.minutes + BATCH_SIZE = 1000 + SUB_BATCH_SIZE = 100 + + def up + queue_batched_background_migration( + MIGRATION, + :vulnerability_reads, + :vulnerability_id, + job_interval: DELAY_INTERVAL, + batch_size: BATCH_SIZE, + sub_batch_size: SUB_BATCH_SIZE + ) + end + + def down + # The last argument is the list of job arguments (none here) + delete_batched_background_migration(MIGRATION, :vulnerability_reads, :vulnerability_id, []) + end +end +``` + +### 9. 
Feature Flags + +#### Create Feature Flag + +**File:** `ee/config/feature_flags/beta/your_new_filter.yml` + +```yaml +--- +name: your_new_filter +description: Enable the new filter for vulnerability management +feature_issue_url: https://gitlab.com/groups/gitlab-org/-/epics/XXXXX +introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/XXXXX +rollout_issue_url: https://gitlab.com/gitlab-org/gitlab/-/issues/XXXXX +milestone: '18.x' +group: group::security infrastructure +type: beta +default_enabled: false +``` + +#### Use Feature Flag in Code + +```ruby +# In validation logic +return unless Feature.enabled?(:your_new_filter, vulnerable) + +# In filter logic +return query_hash unless Feature.enabled?(:your_new_filter, options[:current_user]) +``` + +### 10. UI Access Control + +The UI automatically respects the `access_advanced_vulnerability_management` ability. This ability is granted when: + +1. User has `:read_security_resource` permission +2. Vulnerability indexing is enabled (Elasticsearch configured) +3. `advanced_vulnerability_management` feature flag is enabled +4. Required Elasticsearch migrations are complete + +**File:** `ee/app/policies/vulnerabilities/advanced_vulnerability_management_policy.rb` + +The policy automatically handles access control. No changes needed unless you have specific requirements. + +**Frontend Integration:** + +In Vue components, check the ability: + +```javascript +// The ability is automatically pushed to frontend in vulnerability controllers +computed: { + canUseAdvancedFilters() { + return this.glFeatures.accessAdvancedVulnerabilityManagement; + } +} +``` + +## Testing + +### Unit Tests + +1. **GraphQL Resolver Tests**: Test argument validation and filter application +2. **Finder Tests**: Test Elasticsearch query generation +3. **Filter Tests**: Test individual filter logic +4. **Migration Tests**: Test background migration logic + +### Integration Tests + +1. 
**End-to-end GraphQL Tests**: Test complete filter functionality +2. **Elasticsearch Integration Tests**: Test actual ES queries +3. **Feature Flag Tests**: Test behavior with flags enabled/disabled + +## Best Practices + +### Performance Considerations + +1. **Index Optimization**: Choose appropriate field types for your use case +2. **Query Efficiency**: Use term filters for exact matches and avoid expensive operations +3. **Pagination**: Always implement proper pagination for large result sets +4. **Caching**: Consider caching expensive calculations + +### Security Considerations + +1. **Input Validation**: Always validate filter inputs +2. **Access Control**: Respect existing permission models +3. **Data Exposure**: Ensure filters don't expose sensitive information + +### Maintainability + +1. **Documentation**: Document complex filter logic +2. **Testing**: Comprehensive test coverage for all scenarios +3. **Monitoring**: Add metrics for filter usage and performance +4. **Backwards Compatibility**: Consider migration paths for existing data + +## Common Patterns + +### Enum-based Filters (Straightforward) + +Enum and boolean field filters are straightforward to implement since they map directly to database values: + +```ruby +def by_enum_field(query_hash:, options:) + enum_values = options[:enum_field] + return query_hash if enum_values.blank? + + # Validate enum values - this is straightforward since enums have fixed values. + # Rails exposes an enum's name => integer mapping through the pluralized class method. + valid_values = YourModel.enum_fields.values + return query_hash unless (enum_values - valid_values).empty? + + context.name(:filters) do + add_filter(query_hash, :query, :bool, :filter) do + { terms: { enum_field: enum_values } } + end + end +end +``` + +### Boolean Filters (Straightforward) + +Boolean filters are also straightforward since they only accept true/false values: + +```ruby +def by_boolean_field(query_hash:, options:) + boolean_value = options[:boolean_field] + return query_hash if boolean_value.nil? 
+ + # Simple validation for boolean values + return query_hash unless boolean_value.in?([true, false]) + + context.name(:filters) do + add_filter(query_hash, :query, :bool, :filter) do + { term: { boolean_field: { value: boolean_value } } } + end + end +end +``` + +### Text Search Filters + +```ruby +def by_text_search(query_hash:, options:) + search_term = options[:text_search] + return query_hash if search_term.blank? + + context.name(:filters) do + add_filter(query_hash, :query, :bool, :filter) do + { + simple_query_string: { + fields: ["field_name", "field_name.ngram"], + query: search_term, + lenient: true, + default_operator: :and + } + } + end + end +end +``` + +## Performance Monitoring + +### Using Performance Bar to Monitor Elasticsearch Queries + +GitLab provides a performance bar that shows Elasticsearch queries executed during GraphQL requests. This is invaluable for debugging and optimizing your filters. + +**Enabling the Performance Bar:** + +1. **For Administrators**: The performance bar is automatically available +2. **For Non-administrators**: Must be enabled in Admin settings + - Go to **Admin Area > Settings > Metrics and profiling** + - Select **Allow non-administrators access to the performance bar** + +**Monitoring Elasticsearch Queries:** + +1. **Enable the performance bar** by pressing `p` + `b` on any GitLab page +2. **Execute your GraphQL query** that uses the new filter +3. **Click on the `es` section** in the performance bar to see: + - Number of Elasticsearch queries executed + - Total duration of all queries + - Individual query details including: + - Query method (GET, POST, etc.) 
+ - Query path and parameters + - Execution time for each query + +**Performance Bar Elasticsearch View:** + +The performance bar shows Elasticsearch metrics with color-coded warnings: +- **Green**: Normal performance +- **Yellow**: Approaching thresholds (5+ calls or 1000ms+ duration) +- **Red**: Exceeding performance thresholds + +**Example Performance Bar Output:** +``` +es: 3 (1.2s) +``` +This indicates 3 Elasticsearch queries taking 1.2 seconds total. + +**Detailed Query Information:** + +Click on the `es` count to see detailed information: +```json +{ + "method": "POST", + "path": "/gitlab-vulnerabilities/_search", + "params": { + "routing": "group_123", + "timeout": "30s" + }, + "duration": 450.2 +} +``` + +**Performance Thresholds:** + +The performance bar uses these default thresholds: +- **Calls**: 5 queries (warning threshold) +- **Duration**: 1000ms total (warning threshold) +- **Individual Call**: 1000ms per query (warning threshold) + +**Best Practices for Performance:** +- Keep the number of ES queries low (ideally 1-2 per request) +- Optimize query complexity to stay under duration thresholds +- Use appropriate field types (keyword vs text) for your use case +- Consider caching for frequently accessed data + +## Troubleshooting + +### Common Issues + +1. **Migration Not Complete**: Ensure Elasticsearch migrations are finished before enabling filters +2. **Permission Denied**: Verify user has required permissions and feature flags are enabled +3. **Query Performance**: Monitor Elasticsearch query performance and optimize as needed +4. **Data Inconsistency**: Ensure proper synchronization between PostgreSQL and Elasticsearch + +### Debugging + +1. **Enable Query Logging**: Use Elasticsearch query logging to debug issues +2. **Check Migration Status**: Verify migration completion status +3. **Test Feature Flags**: Ensure feature flags are properly configured +4. 
**Validate Data**: Check that data is properly indexed in Elasticsearch + +## Related Documentation + +- [Elasticsearch Integration](../../integration/advanced_search/elasticsearch.md) +- [GraphQL Development](../api_graphql_styleguide.md) +- [Background Migrations](../database/batched_background_migrations.md) +- [Feature Flags](../feature_flags/index.md) +- [Performance Bar](../../administration/monitoring/performance/performance_bar.md) +- [Vulnerability Management](./index.md) +- [SBOM Dependency Graph Ingestion](./sbom_dependency_graph_ingestion_overview.md) +- [Change Data Capture Epic](https://gitlab.com/groups/gitlab-org/-/epics/18520) \ No newline at end of file diff --git a/doc/development/vulnerability_management/img/es_ingestion_high_level_arch.png b/doc/development/vulnerability_management/img/es_ingestion_high_level_arch.png new file mode 100644 index 0000000000000000000000000000000000000000..79531b1a37d0add3f5cb0b18fed1a43d0edaa136 Binary files /dev/null and b/doc/development/vulnerability_management/img/es_ingestion_high_level_arch.png differ diff --git a/doc/development/vulnerability_management/img/es_ingestion_querying.png b/doc/development/vulnerability_management/img/es_ingestion_querying.png new file mode 100644 index 0000000000000000000000000000000000000000..e17cad233274b12486fcaa3b27b948a5a71261b1 Binary files /dev/null and b/doc/development/vulnerability_management/img/es_ingestion_querying.png differ