gitpacking(7) — Linux manual page

NAME | SYNOPSIS | DESCRIPTION | PSEUDO-MERGE BITMAPS | SEE ALSO | GIT | COLOPHON

GITPACKING(7)                   Git Manual                  GITPACKING(7)

NAME         top

       gitpacking - Advanced concepts related to packing in Git

SYNOPSIS         top

       gitpacking

DESCRIPTION         top

       This document aims to describe some advanced concepts related to
       packing in Git.

       Many concepts are currently described scattered between manual
       pages of various Git commands, including git-pack-objects(1),
       git-repack(1), and others, as well as gitformat-pack(5), and parts
       of the Documentation/technical tree.

       There are many aspects of packing in Git that are not covered in
       this document that instead live in the aforementioned areas. Over
       time, those scattered bits may coalesce into this document.

PSEUDO-MERGE BITMAPS         top

           Note

           Pseudo-merge bitmaps are considered an experimental feature,
           so the configuration and many of the ideas are subject to
           change.

   Background
       Reachability bitmaps are most efficient when we have on-disk
       stored bitmaps for one or more of the starting points of a
       traversal. For this reason, Git prefers storing bitmaps for
       commits at the tips of refs, because traversals tend to start with
       those points.

       But if you have a large number of refs, it’s not feasible to store
       a bitmap for every ref tip. It takes up space, and just OR-ing all
       of those bitmaps together is expensive.

       One way we can deal with that is to create bitmaps that represent
       groups of refs. When a traversal asks about the entire group, then
       we can use this single bitmap instead of considering each ref
       individually. Because these bitmaps represent the set of objects
       which would be reachable in a hypothetical merge of all of the
       commits, we call them pseudo-merge bitmaps.

   Overview
       A "pseudo-merge bitmap" is used to refer to a pair of bitmaps, as
       follows:

       Commit bitmap
           A bitmap whose set bits describe the set of commits included
           in the pseudo-merge’s "merge" bitmap (as below).

       Merge bitmap
           A bitmap whose set bits describe the reachability closure over
           the set of commits in the pseudo-merge’s "commits" bitmap (as
           above). An identical bitmap would be generated for an octopus
           merge with the same set of parents as described in the commits
           bitmap.

       Pseudo-merge bitmaps can accelerate bitmap traversals when all
       commits for a given pseudo-merge are listed on either side of the
       traversal, either directly (by explicitly asking for them as part
       of the HAVES or WANTS) or indirectly (by encountering them during
       a fill-in traversal).

   Use-cases
       For example, suppose there exists a pseudo-merge bitmap with a
       large number of commits, all of which are listed in the WANTS
       section of some bitmap traversal query. When pseudo-merge bitmaps
       are enabled, the bitmap machinery can quickly determine there is a
       pseudo-merge which satisfies some subset of the wanted objects on
       either side of the query. Then, we can inflate the EWAH-compressed
       bitmap, and OR it in to the resulting bitmap. By contrast, without
       pseudo-merge bitmaps, we would have to repeat the decompression
       and OR-ing step over a potentially large number of individual
       bitmaps, which can take proportionally more time.

       Another benefit of pseudo-merges arises when there is some
       combination of (a) a large number of references, with (b) poor
       bitmap coverage, and (c) deep, nested trees, making fill-in
       traversal relatively expensive. For example, suppose that there
       are a large enough number of tags where bitmapping each of the
       tags individually is infeasible. Without pseudo-merge bitmaps,
       computing the result of, say, git rev-list --use-bitmap-index
       --count --objects --tags would likely require a large amount of
       fill-in traversal. But when a large quantity of those tags are
       stored together in a pseudo-merge bitmap, the bitmap machinery can
       take advantage of the fact that we only care about the union of
       objects reachable from all of those tags, and answer the query
       much faster.

   Configuration
       Reference tips are grouped into different pseudo-merge groups
       according to two criteria. A reference name matches one or more of
       the defined pseudo-merge patterns, and optionally one or more
       capture groups within that pattern which further partition the
       group.

       Within a group, commits may be considered "stable", or "unstable"
       depending on their age. These are adjusted by setting the
       bitmapPseudoMerge.<name>.stableThreshold and
       bitmapPseudoMerge.<name>.threshold configuration values,
       respectively.

       All stable commits are grouped into pseudo-merges of equal size
       (bitmapPseudoMerge.<name>.stableSize). If the stableSize
       configuration is set to, say, 100, then the first 100 commits
       (ordered by committer date) which are older than the
       stableThreshold value will form one group, the next 100 commits
       will form another group, and so on.

       Among unstable commits, the pseudo-merge machinery will attempt to
       combine older commits into large groups as opposed to newer
       commits which will appear in smaller groups. This is based on the
       heuristic that references whose tip commit is older are less
       likely to be modified to point at a different commit than a
       reference whose tip commit is newer.

       The size of groups is determined by a power-law decay function,
       and the decay parameter roughly corresponds to "k" in f(n) =
       C*n^(-k/100), where f(n) describes the size of the n-th
       pseudo-merge group. The sample rate controls what percentage of
       eligible commits are considered as candidates. The threshold
       parameter indicates the minimum age (so as to avoid including
       too-recent commits in a pseudo-merge group, making it less likely
       to be valid). The "maxMerges" parameter sets an upper-bound on the
       number of pseudo-merge commits an individual group

       The "stable"-related parameters control "stable" pseudo-merge
       groups, comprised of a fixed number of commits which are older
       than the configured "stable threshold" value and may be grouped
       together in chunks of "stableSize" in order of age.

       The exact configuration for pseudo-merges is as follows:

           Note

           The configuration options in bitmapPseudoMerge.* are
           considered EXPERIMENTAL and may be subject to change or be
           removed entirely in the future. For more information about the
           pseudo-merge bitmap feature, see the "Pseudo-merge bitmaps"
           section of gitpacking(7).

       bitmapPseudoMerge.<name>.pattern
           Regular expression used to match reference names. Commits
           pointed to by references matching this pattern (and meeting
           the below criteria, like bitmapPseudoMerge.<name>.sampleRate
           and bitmapPseudoMerge.<name>.threshold) will be considered for
           inclusion in a pseudo-merge bitmap.

           Commits are grouped into pseudo-merge groups based on whether
           or not any reference(s) that point at a given commit match the
           pattern, which is an extended regular expression.

           Within a pseudo-merge group, commits may be further grouped
           into sub-groups based on the capture groups in the pattern.
           These sub-groupings are formed from the regular expressions by
           concatenating any capture groups from the regular expression,
           with a - dash in between.

           For example, if the pattern is refs/tags/, then all tags
           (provided they meet the below criteria) will be considered
           candidates for the same pseudo-merge group. However, if the
           pattern is instead refs/remotes/([0-9])+/tags/, then tags from
           different remotes will be grouped into separate pseudo-merge
           groups, based on the remote number.

       bitmapPseudoMerge.<name>.decay
           Determines the rate at which consecutive pseudo-merge bitmap
           groups decrease in size. Must be non-negative. This parameter
           can be thought of as k in the function f(n) = C * n^-k, where
           f(n) is the size of the `n`th group.

           Setting the decay rate equal to 0 will cause all groups to be
           the same size. Setting the decay rate equal to 1 will cause
           the n`th group to be `1/n the size of the initial group.
           Higher values of the decay rate cause consecutive groups to
           shrink at an increasing rate. The default is 1.

           If all groups are the same size, it is possible that groups
           containing newer commits will be able to be used less often
           than earlier groups, since it is more likely that the
           references pointing at newer commits will be updated more
           often than a reference pointing at an old commit.

       bitmapPseudoMerge.<name>.sampleRate
           Determines the proportion of non-bitmapped commits (among
           reference tips) which are selected for inclusion in an
           unstable pseudo-merge bitmap. Must be between 0 and 1
           (inclusive). The default is 1.

       bitmapPseudoMerge.<name>.threshold
           Determines the minimum age of non-bitmapped commits (among
           reference tips, as above) which are candidates for inclusion
           in an unstable pseudo-merge bitmap. The default is 1.week.ago.

       bitmapPseudoMerge.<name>.maxMerges
           Determines the maximum number of pseudo-merge commits among
           which commits may be distributed.

           For pseudo-merge groups whose pattern does not contain any
           capture groups, this setting is applied for all commits
           matching the regular expression. For patterns that have one or
           more capture groups, this setting is applied for each distinct
           capture group.

           For example, if your capture group is refs/tags/, then this
           setting will distribute all tags into a maximum of maxMerges
           pseudo-merge commits. However, if your capture group is, say,
           refs/remotes/([0-9]+)/tags/, then this setting will be applied
           to each remote’s set of tags individually.

           Must be non-negative. The default value is 64.

       bitmapPseudoMerge.<name>.stableThreshold
           Determines the minimum age of commits (among reference tips,
           as above, however stable commits are still considered
           candidates even when they have been covered by a bitmap) which
           are candidates for a stable a pseudo-merge bitmap. The default
           is 1.month.ago.

           Setting this threshold to a smaller value (e.g., 1.week.ago)
           will cause more stable groups to be generated (which impose a
           one-time generation cost) but those groups will likely become
           stale over time. Using a larger value incurs the opposite
           penalty (fewer stable groups which are more useful).

       bitmapPseudoMerge.<name>.stableSize
           Determines the size (in number of commits) of a stable
           psuedo-merge bitmap. The default is 512.

   Examples
       Suppose that you have a repository with a large number of
       references, and you want a bare-bones configuration of
       pseudo-merge bitmaps that will enhance bitmap coverage of the
       refs/ namespace. You may start with a configuration like so:

           [bitmapPseudoMerge "all"]
                   pattern = "refs/"
                   threshold = now
                   stableThreshold = never
                   sampleRate = 100
                   maxMerges = 64

       This will create pseudo-merge bitmaps for all references,
       regardless of their age, and group them into 64 pseudo-merge
       commits.

       If you wanted to separate tags from branches when generating
       pseudo-merge commits, you would instead define the pattern with a
       capture group, like so:

           [bitmapPseudoMerge "all"]
                   pattern = "refs/(heads/tags)/"

       Suppose instead that you are working in a fork-network repository,
       with each fork specified by some numeric ID, and whose refs reside
       in refs/virtual/NNN/ (where NNN is the numeric ID corresponding to
       some fork) in the network. In this instance, you may instead write
       something like:

           [bitmapPseudoMerge "all"]
                   pattern = "refs/virtual/([0-9]+)/(heads|tags)/"
                   threshold = now
                   stableThreshold = never
                   sampleRate = 100
                   maxMerges = 64

       Which would generate pseudo-merge group identifiers like
       "1234-heads", and "5678-tags" (for branches in fork "1234", and
       tags in remote "5678", respectively).

SEE ALSO         top

       git-pack-objects(1) git-repack(1)

GIT         top

       Part of the git(1) suite

COLOPHON         top

       This page is part of the git (Git distributed version control
       system) project.  Information about the project can be found at 
       ⟨http://git-scm.com/⟩.  If you have a bug report for this manual
       page, see ⟨http://git-scm.com/community⟩.  This page was obtained
       from the project's upstream Git repository
       ⟨https://github.com/git/git.git⟩ on 2025-02-02.  (At that time,
       the date of the most recent commit that was found in the
       repository was 2025-01-31.)  If you discover any rendering
       problems in this HTML version of the page, or you believe there is
       a better or more up-to-date source for the page, or you have
       corrections or improvements to the information in this COLOPHON
       (which is not part of the original manual page), send a mail to
       man-pages@man7.org

Git 2.48.1.166.g58b580          2025-01-31                  GITPACKING(7)

Pages that refer to this page: git-config(1)gitpacking(7)