Table of Contents

[[TOC]]

Global Code Search Service

Logging

Summary

Quick start

Currently we use Elasticsearch for code search within GitLab, but it has turned out to be a poor fit for that purpose. To address this, we're rolling out a new Code Search based on Zoekt to a select number of customers as part of this epic.

How-to guides

Enabling/Disabling Zoekt search

You can prevent GitLab from using the Zoekt integration for searching by unchecking the Enable exact code search checkbox under Exact code search configuration in the admin settings (Settings->Search, accessible to admins only), while leaving the indexing integration itself enabled. This is useful, for example, during an incident where users are experiencing slow searches or Zoekt is unresponsive.

Enabling/Disabling Zoekt search for specific namespaces

When we roll out Zoekt search for SaaS customers, it is enabled by default. If a customer wishes to have it disabled, we can run the following chatops command to disable Zoekt search for a specific namespace:

  /chatops run feature set --group=root-group-path disable_zoekt_search_for_saas true --production

To re-enable it, run the following chatops command:

  /chatops run feature set --group=root-group-path disable_zoekt_search_for_saas false --production

Evicting namespaces from a Zoekt node

Zoekt has an eviction task that runs on a defined schedule for GitLab.com. It detects nodes whose disk utilization is over the watermark limit and removes namespaces from each such node until it is back under the lower watermark limit. The eviction task is only responsible for removing namespaces; the dot_com_rollout handles adding namespaces to nodes with capacity.

Note: The eviction task is currently behind a default enabled feature flag named zoekt_reallocation_task
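The watermark behaviour described above can be sketched in plain Ruby. This is only an illustration, not the actual task: the two watermark values and the largest-first eviction order are assumptions, since neither is stated here, and `namespaces_to_evict` is a hypothetical helper name.

```ruby
# Hypothetical watermark values; the real limits are not stated in this document.
HIGH_WATERMARK = 0.80 # eviction starts above this fraction of disk used
LOW_WATERMARK  = 0.70 # eviction stops once usage is at or below this

# Returns the namespace ids to evict from a single node.
# namespaces: hash of namespace_id => bytes used on the node.
def namespaces_to_evict(used_bytes, total_bytes, namespaces)
  return [] if used_bytes.to_f / total_bytes <= HIGH_WATERMARK

  evicted = []
  # Assumption: evict the largest namespaces first.
  namespaces.sort_by { |_id, size| -size }.each do |id, size|
    break if used_bytes.to_f / total_bytes <= LOW_WATERMARK

    evicted << id
    used_bytes -= size
  end
  evicted
end
```

A node below the high watermark yields an empty list, which matches the idea that the task only acts on nodes that are over the limit.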

If the Zoekt search feature flag is disabled but some nodes still misbehave (for example, OOM or excessive disk usage), you can run the eviction task manually to evict some of the namespaces from the node:

  1. Execute the script in the Rails console

    ::Search::Zoekt::SchedulingService.execute(:eviction)

Removing a namespace from the Zoekt node manually

If the eviction task returns false or does not relieve pressure on the node, you can remove a namespace from Zoekt manually.

  1. Execute the script in the Rails console

    # Find the offending node (gitlab-gitlab-zoekt-1 in this example)
    node = Search::Zoekt::Node.where("metadata @> ?", { name: 'gitlab-gitlab-zoekt-1' }.to_json).order(:last_seen_at).last
    
    # Find the namespaces and repository sizes on the node
    sizes = {}
    node.indices.each_batch do |batch|
      scope = Namespace.includes(:root_storage_statistics).by_parent(nil).id_in(batch.select(:namespace_id))

      scope.each do |group|
        sizes[group.id] = group.root_storage_statistics&.repository_size || 0
      end
    end
    sorted = sizes.to_a.sort_by { |_k, v| v }
    
    # Find the largest namespace
    namespace_id = sorted.last[0]
    namespace = Namespace.find(namespace_id)
    
    # Destroy all `::Search::Zoekt::Replica` records for the namespace
    zoekt_replicas = ::Search::Zoekt::Replica.for_namespace(namespace_id)
    zoekt_replicas.destroy_all
  2. Post the removed namespace IDs on the incident issue as a private comment so there is a record. The Zoekt architecture will handle reallocating the namespaces and projects to a new node.

Marking a Zoekt node as lost

When a Zoekt node's PVC is over 80% usage and evicting or removing namespaces doesn't reduce it, you can quickly remove all namespaces from the node by manually marking it as lost. This is a safe operation because the lost node will re-register itself as a new node and the Zoekt architecture will handle reallocating all namespaces and projects.

Warning: The new UUID must not exist in the table.

node_name = 'gitlab-gitlab-zoekt-29'
uuid = SecureRandom.uuid

Search::Zoekt::Node.by_name(node_name).update_all(uuid: uuid, last_seen_at: 24.hours.ago)
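A random UUID is extremely unlikely to collide with an existing one, but the warning above can be honoured with a cheap check before the update. The sketch below is self-contained: `fresh_uuid` is a hypothetical helper, and `existing_uuids` stands in for the UUIDs already stored in the nodes table (how you load them in the console is left out here).

```ruby
require 'securerandom'
require 'set'

# Returns a random UUID that is not present in existing_uuids.
# existing_uuids: any collection responding to include?, standing in for
# the UUIDs already stored in the nodes table (an assumption for this sketch).
def fresh_uuid(existing_uuids)
  loop do
    candidate = SecureRandom.uuid
    return candidate unless existing_uuids.include?(candidate)
  end
end
```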

When to add a Zoekt node

Increase the number of Zoekt replicas (nodes) by 20% of total capacity when all Zoekt nodes are above 65% disk utilization. For example, with 22 nodes, 20% is 4.4, so add 4 nodes.
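The sizing rule can be written out as a small calculation. This is a sketch only: `nodes_to_add` is a hypothetical helper, and rounding to the nearest whole node is an assumption inferred from the 4.4 to 4 example.

```ruby
UTILIZATION_THRESHOLD = 0.65 # all nodes must be above this to trigger growth
GROWTH_FACTOR = 0.20         # add 20% of the current node count

# disk_utilizations: one fraction (0.0..1.0) per Zoekt node.
def nodes_to_add(disk_utilizations)
  return 0 unless disk_utilizations.all? { |u| u > UTILIZATION_THRESHOLD }

  # Assumption: round to the nearest whole node, as in the 22-node example.
  (disk_utilizations.size * GROWTH_FACTOR).round
end
```

If even one node is at or below the threshold, the helper returns 0, since the rule only applies when every node is above 65%.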

Pausing Zoekt indexing

Zoekt indexing can be paused by checking the Pause indexing for exact code search checkbox under Exact code search configuration in the admin settings (Settings->Search, accessible to admins only). While indexing is paused, the jobs are stored in a separate ZSET and re-enqueued when indexing is unpaused. This is useful, for example, during an incident when a large number of indexing Sidekiq jobs are failing.

Disabling Zoekt indexing

Zoekt indexing can be completely disabled by unchecking the Enable indexing for exact code search checkbox under Exact code search configuration in the admin settings (Settings->Search, accessible to admins only). Pausing indexing is the preferred method to halt Zoekt indexing.

WARNING: Indexed data will be stale after indexing is re-enabled. Reindexing from scratch may be necessary to ensure up to date search results.

Limitations

  1. Multiple shards and replication are not supported yet. You can follow the progress in https://gitlab.com/groups/gitlab-org/-/epics/11382.

Architecture

How Zoekt is used

To index repositories and provide search functionality, we use one binary from the Zoekt repository (zoekt-webserver) and one binary from the gitlab-org repository (gitlab-zoekt-indexer).

Zoekt API

Indexing

For gitlab-zoekt-indexer we use /indexer/index requests with repository URL and project ID:

curl -s -XPOST -d '{"CloneUrl":"https://gitlab.com/gitlab-org/gitlab.git","RepoId":278964, "FileSizeLimit": 2097152, "Timeout": "1h", "GitalyConnectionInfo": {"Address": "gitaly.address", "Storage": "default", "Path": "path/gitlab-org/gitlab.git"} }' -H 'Content-Type: application/json' https://zoekt-indexer.url/indexer/index

Delete

For gitlab-zoekt-indexer we use /indexer/index/:repoId requests with the project ID:

curl -s -XDELETE https://zoekt-indexer.url/indexer/index/278964

Searching

For zoekt-webserver we use the /api/search endpoint:

curl -s -XPOST -d '{"Q":"query","RepoIds":[278964],"Opts":{"TotalMaxMatchCount":20,"NumContextLines":1}}' 'https://zoekt-webserver.url/api/search'
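For scripting, the three curl calls in this section can be mirrored with Ruby's standard `Net::HTTP` request classes. The sketch below only builds the requests and sends nothing; the helper names are hypothetical, the URLs are the same placeholders as in the curl examples, and sending a JSON `Content-Type` header on the search request is an assumption (the curl example does not set it).

```ruby
require 'json'
require 'net/http'
require 'uri'

JSON_HEADERS = { 'Content-Type' => 'application/json' }.freeze

# POST /indexer/index on gitlab-zoekt-indexer
def index_request(indexer_url, clone_url:, repo_id:, gitaly:)
  request = Net::HTTP::Post.new(URI.join(indexer_url, '/indexer/index'), JSON_HEADERS)
  request.body = JSON.generate(
    'CloneUrl' => clone_url,
    'RepoId' => repo_id,
    'FileSizeLimit' => 2_097_152, # bytes (2 MiB), as in the curl example
    'Timeout' => '1h',
    'GitalyConnectionInfo' => gitaly
  )
  request
end

# DELETE /indexer/index/:repoId on gitlab-zoekt-indexer
def delete_request(indexer_url, repo_id)
  Net::HTTP::Delete.new(URI.join(indexer_url, "/indexer/index/#{repo_id}"))
end

# POST /api/search on zoekt-webserver
def search_request(webserver_url, query, repo_ids, max_matches: 20, context_lines: 1)
  # Assumption: the JSON Content-Type header is sent here although the curl example omits it.
  request = Net::HTTP::Post.new(URI.join(webserver_url, '/api/search'), JSON_HEADERS)
  request.body = JSON.generate(
    'Q' => query,
    'RepoIds' => repo_ids,
    'Opts' => { 'TotalMaxMatchCount' => max_matches, 'NumContextLines' => context_lines }
  )
  request
end
```

A built request can then be sent with `Net::HTTP.start(host, port, use_ssl: true) { |http| http.request(request) }` against the real service address.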

Indexer

Overview

Indexing happens in two scenarios:

  • initial indexing - triggered by adding namespaces
  • new events (e.g. git push) - webserver schedules Sidekiq jobs that run indexers

Triggering indexing

The GitLab application schedules Sidekiq jobs for indexing. Once a namespace is enabled, Sidekiq jobs are scheduled for it. You can always manually retrigger indexing of a project from the Rails console with Zoekt::IndexerWorker.perform_async(<project id>).

Sidekiq jobs

Examples of indexer jobs:

  • ee/app/workers/zoekt/indexer_worker.rb

Logs are available in centralised logging; see Logging.

Scalability

How much Zoekt storage do we need

A Zoekt index takes about 2.8 times the size of the source code in the indexed branch (excluding binary files). We also store bare repos as an intermediate step for generating the index files. This is a significant storage overhead, so we plan to optimize it in https://gitlab.com/gitlab-org/gitlab/-/issues/384722
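As a rough worked example of the 2.8x figure, the index size can be estimated from the source size. The helper name is hypothetical, and the estimate covers the index files only; the bare repos add further overhead whose size is not quantified here.

```ruby
INDEX_TO_SOURCE_RATIO = 2.8 # approximate ratio quoted above

# source_bytes: size of the source code in the indexed branch, excluding binary files.
# Returns the estimated size of the Zoekt index files in bytes (bare repos not included).
def estimated_index_bytes(source_bytes)
  (source_bytes * INDEX_TO_SOURCE_RATIO).round
end
```

For instance, 1 GB of indexed source would need roughly 2.8 GB for index files alone.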

Monitoring

Dashboards

There are a few dashboards to monitor Zoekt health:

Kibana logs

The GitLab application has a dedicated zoekt.log file for Zoekt-related log entries, which is handled by the standard logging infrastructure. You may also find indexing-related errors in sidekiq.log and search-related errors in production_json.log.

As for gitlab-zoekt-indexer and zoekt-webserver, they write logs to stdout.

Alerts

kube_persistent_volume_claim_disk_space

The Zoekt architecture has logic that detects when a node's disk usage is over the limit. Projects are removed from each such node until its disk usage is back under the limit. If disk usage is not coming down quickly enough, remove namespaces using the eviction task, remove namespaces manually, or, as a last resort, mark the node as lost.

WARNING: The PVC disk size must not be increased manually. Zoekt nodes are sized with a specific PVC size and it must remain consistent across all nodes.