Class TieredMergePolicy
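Merges segments of approximately equal size, subject to an allowed number of segments per tier. This is similar to LogByteSizeMergePolicy, except this merge
policy is able to merge non-adjacent segments, and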
separates how many segments are merged at once (setMaxMergeAtOnce(int)) from how many segments are allowed
per tier (setSegmentsPerTier(double)). This merge
policy also does not over-merge (i.e. cascade merges).
For normal merging, this policy first computes a "budget" of how many segments are allowed to be in the index. If the index is over-budget, then the policy sorts segments by decreasing size (pro-rating by percent deletes), and then finds the least-cost merge. Merge cost is measured by a combination of the "skew" of the merge (size of largest segment divided by smallest segment), total merge size and percent deletes reclaimed, so that merges with lower skew, smaller size and those reclaiming more deletes, are favored.
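To make the cost description concrete, here is a minimal sketch of a cost function built from the three ingredients named above (skew, total size, deletes reclaimed). It is not Lucene's scoring code: the class MergeCostSketch, its method names, and the way the factors are multiplied together (including the exponents 0.05 and 2) are assumptions chosen for illustration. A lower cost means a more attractive merge.

```java
/** Illustrative only: a toy merge-cost function, not Lucene's implementation. */
final class MergeCostSketch {

  /** Skew as described above: size of the largest segment divided by the smallest. */
  static double skew(long[] segmentBytes) {
    long largest = Long.MIN_VALUE;
    long smallest = Long.MAX_VALUE;
    for (long b : segmentBytes) {
      largest = Math.max(largest, b);
      smallest = Math.min(smallest, b);
    }
    return (double) largest / (double) smallest;
  }

  /**
   * Combines skew, total size and reclaimed deletes into a single cost.
   * The weighting (exponents 0.05 and 2) is a hypothetical choice.
   *
   * @param segmentBytes size of each candidate segment; must be non-empty, all > 0
   * @param deletedFraction fraction of deleted docs per segment, each in [0, 1]
   */
  static double cost(long[] segmentBytes, double[] deletedFraction) {
    long totalBytes = 0;
    double liveBytes = 0;
    for (int i = 0; i < segmentBytes.length; i++) {
      totalBytes += segmentBytes[i];
      liveBytes += segmentBytes[i] * (1.0 - deletedFraction[i]);
    }
    // A lower live ratio means the merge reclaims more deleted documents.
    double liveRatio = liveBytes / totalBytes;
    return skew(segmentBytes) * Math.pow(totalBytes, 0.05) * Math.pow(liveRatio, 2);
  }
}
```

With these assumed weights, two equally sized segments cost less to merge than a large segment paired with a tiny one of the same total size, and the same pair costs less again when half of its documents are deleted.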
If a merge will produce a segment that's larger than
setMaxMergedSegmentMB(double), then the policy will
merge fewer segments (down to 1 at once, if that one has
deletions) to keep the segment size under budget.
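The size cap can be pictured with the sketch below, which walks segments in decreasing size order and skips any segment that would push the merged size over the limit. The class SizeBudgetSketch and its pick method are hypothetical names; the sketch ignores deletes and scoring, so it shows only the "merge fewer segments to stay under budget" idea rather than the policy's actual selection logic.

```java
/** Illustrative only: choosing segments for one merge under a size cap. */
final class SizeBudgetSketch {

  /**
   * Picks segments for one merge from sizes sorted in decreasing order.
   * Returns the indices chosen; segments that would overflow the cap are skipped.
   * Assumes maxMergeAtOnce >= 0.
   */
  static int[] pick(long[] sortedBytesDesc, int maxMergeAtOnce, long maxMergedSegmentBytes) {
    int[] chosen = new int[Math.min(maxMergeAtOnce, sortedBytesDesc.length)];
    int count = 0;
    long total = 0;
    for (int i = 0; i < sortedBytesDesc.length && count < maxMergeAtOnce; i++) {
      if (total + sortedBytesDesc[i] > maxMergedSegmentBytes) {
        continue; // too large for the remaining budget; try a smaller segment
      }
      chosen[count++] = i;
      total += sortedBytesDesc[i];
    }
    return java.util.Arrays.copyOf(chosen, count);
  }
}
```

For sizes {600, 500, 300, 100} and a cap of 1000, the 500-byte segment is skipped and the merge contains the segments at indices 0, 2 and 3.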
NOTE: this policy freely merges non-adjacent
segments; if this is a problem, use LogMergePolicy.
NOTE: This policy always merges by byte size of the segments and always pro-rates by percent deletes.
NOTE: Starting with Lucene 7.5, there are several changes:
- findForcedMerges and findForcedDeletesMerges respect the max segment size by default.
- When findForcedMerges is called with maxSegmentCount other than 1, the resulting index is not guaranteed to have <= maxSegmentCount segments. Rather, it is on a "best effort" basis. Specifically, the theoretical ideal segment size is calculated and a "fudge factor" of 25% is added as the new maxSegmentSize, which is respected.
- findForcedDeletesMerges will not produce segments greater than maxSegmentSize.
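The "ideal size plus 25%" arithmetic for forced merges can be sketched as follows. The class ForcedMergeSizeSketch and its method are hypothetical, and the sketch deliberately leaves out how the result interacts with the configured maximum merged segment size, which this page does not spell out.

```java
/** Illustrative only: the "best effort" size target for a forced merge. */
final class ForcedMergeSizeSketch {

  /** Theoretical ideal size per segment, plus the 25% fudge factor. */
  static long targetMaxSegmentBytes(long totalIndexBytes, int maxSegmentCount) {
    if (maxSegmentCount < 1) {
      throw new IllegalArgumentException("maxSegmentCount must be >= 1");
    }
    double ideal = (double) totalIndexBytes / maxSegmentCount;
    return (long) (ideal * 1.25);
  }
}
```

For a 1000-byte index and maxSegmentCount of 5, the ideal size is 200 bytes and the sketch allows segments of up to 250 bytes, which is why the final segment count may exceed the requested count.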
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
private static enum | TieredMergePolicy.MERGE_TYPE |
protected static class | TieredMergePolicy.MergeScore | Holds score and explanation for a single candidate merge.
private static class | TieredMergePolicy.SegmentSizeAndDocs |
Nested classes/interfaces inherited from class org.apache.lucene.index.MergePolicy
MergePolicy.MergeAbortedException, MergePolicy.MergeContext, MergePolicy.MergeException, MergePolicy.MergeReader, MergePolicy.MergeSpecification, MergePolicy.OneMerge, MergePolicy.OneMergeProgress
-
Field Summary
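Fields
Modifier and Type | Field | Description
static final double | DEFAULT_NO_CFS_RATIO | Default noCFSRatio.
private double | deletesPctAllowed |
private long | floorSegmentBytes |
private double | forceMergeDeletesPctAllowed |
private int | maxMergeAtOnce |
private int | maxMergeAtOnceExplicit |
private long | maxMergedSegmentBytes |
private double | segsPerTier |
Fields inherited from class org.apache.lucene.index.MergePolicy
DEFAULT_MAX_CFS_SEGMENT_SIZE, maxCFSSegmentSize, noCFSRatio
-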
Constructor Summary
Constructors
Constructor | Description
TieredMergePolicy() | Sole constructor, setting all settings to their defaults.
-
Method Summary
Modifier and Type | Method | Description
private MergePolicy.MergeSpecification | doFindMerges(List<TieredMergePolicy.SegmentSizeAndDocs> sortedEligibleInfos, long maxMergedSegmentBytes, int mergeFactor, int allowedSegCount, int allowedDelCount, TieredMergePolicy.MERGE_TYPE mergeType, MergePolicy.MergeContext mergeContext, boolean maxMergeIsRunning) |
MergePolicy.MergeSpecification | findForcedDeletesMerges(SegmentInfos infos, MergePolicy.MergeContext mergeContext) | Determine what set of merge operations is necessary in order to expunge all deletes from the index.
MergePolicy.MergeSpecification | findForcedMerges(SegmentInfos infos, int maxSegmentCount, Map<SegmentCommitInfo, Boolean> segmentsToMerge, MergePolicy.MergeContext mergeContext) | Determine what set of merge operations is necessary in order to merge to <= the specified segment count.
MergePolicy.MergeSpecification | findMerges(MergeTrigger mergeTrigger, SegmentInfos infos, MergePolicy.MergeContext mergeContext) | Determine what set of merge operations are now necessary on the index.
private long | floorSize(long bytes) |
double | getDeletesPctAllowed() | Returns the current deletesPctAllowed setting.
double | getFloorSegmentMB() | Returns the current floorSegmentMB.
double | getForceMergeDeletesPctAllowed() | Returns the current forceMergeDeletesPctAllowed setting.
int | getMaxMergeAtOnce() | Returns the current maxMergeAtOnce setting.
int | getMaxMergeAtOnceExplicit() | Returns the current maxMergeAtOnceExplicit setting.
double | getMaxMergedSegmentMB() | Returns the current maxMergedSegmentMB setting.
double | getSegmentsPerTier() | Returns the current segmentsPerTier setting.
private List<TieredMergePolicy.SegmentSizeAndDocs> | getSortedBySegmentSize(SegmentInfos infos, MergePolicy.MergeContext mergeContext) |
protected TieredMergePolicy.MergeScore | score(List<SegmentCommitInfo> candidate, boolean hitTooLarge, Map<SegmentCommitInfo, TieredMergePolicy.SegmentSizeAndDocs> segmentsSizes) | Expert: scores one merge; subclasses can override.
TieredMergePolicy | setDeletesPctAllowed(double v) | Controls the maximum percentage of deleted documents that is tolerated in the index.
TieredMergePolicy | setFloorSegmentMB(double v) | Segments smaller than this are "rounded up" to this size, i.e. treated as equal (floor) size for merge selection.
TieredMergePolicy | setForceMergeDeletesPctAllowed(double v) | When forceMergeDeletes is called, we only merge away a segment if its delete percentage is over this threshold.
TieredMergePolicy | setMaxMergeAtOnce(int v) | Maximum number of segments to be merged at a time during "normal" merging.
TieredMergePolicy | setMaxMergeAtOnceExplicit(int v) | Maximum number of segments to be merged at a time, during forceMerge or forceMergeDeletes.
TieredMergePolicy | setMaxMergedSegmentMB(double v) | Maximum sized segment to produce during normal merging.
TieredMergePolicy | setSegmentsPerTier(double v) | Sets the allowed number of segments per tier.
String | toString() |
Methods inherited from class org.apache.lucene.index.MergePolicy
assertDelCount, findFullFlushMerges, getMaxCFSSegmentSizeMB, getNoCFSRatio, isMerged, keepFullyDeletedSegment, message, numDeletesToMerge, segString, setMaxCFSSegmentSizeMB, setNoCFSRatio, size, useCompoundFile, verbose
-
Field Details
-
DEFAULT_NO_CFS_RATIO
public static final double DEFAULT_NO_CFS_RATIO
Default noCFSRatio. If a merge's size is >= 10% of the index, then we disable compound file for it.
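As a rough illustration of the 10% rule, the sketch below decides whether a merged segment would use the compound file format. The class CompoundFileSketch, the constant value 0.1 (taken from the "10%" wording above), and the exact comparison are assumptions; the inherited MergePolicy logic also considers other settings, such as maxCFSSegmentSize, which are not modeled here.

```java
/** Illustrative only: the 10% compound-file rule described above. */
final class CompoundFileSketch {

  /** Assumed numeric form of the 10% default ratio. */
  static final double NO_CFS_RATIO = 0.1;

  /** Returns true if the merged segment is small enough to keep the compound file format. */
  static boolean useCompoundFile(long mergedSegmentBytes, long totalIndexBytes) {
    return mergedSegmentBytes < NO_CFS_RATIO * totalIndexBytes;
  }
}
```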
-
maxMergeAtOnce
private int maxMergeAtOnce -
maxMergedSegmentBytes
private long maxMergedSegmentBytes -
maxMergeAtOnceExplicit
private int maxMergeAtOnceExplicit -
floorSegmentBytes
private long floorSegmentBytes -
segsPerTier
private double segsPerTier -
forceMergeDeletesPctAllowed
private double forceMergeDeletesPctAllowed -
deletesPctAllowed
private double deletesPctAllowed
-
-
Constructor Details
-
TieredMergePolicy
public TieredMergePolicy()
Sole constructor, setting all settings to their defaults.
-
-
Method Details
-
setMaxMergeAtOnce
Maximum number of segments to be merged at a time during "normal" merging. For explicit merging (e.g., forceMerge or forceMergeDeletes was called), see setMaxMergeAtOnceExplicit(int). Default is 10. -
getMaxMergeAtOnce
public int getMaxMergeAtOnce()
Returns the current maxMergeAtOnce setting.
See Also: setMaxMergeAtOnce(int)
-
setMaxMergeAtOnceExplicit
Maximum number of segments to be merged at a time, during forceMerge or forceMergeDeletes. Default is 30. -
getMaxMergeAtOnceExplicit
public int getMaxMergeAtOnceExplicit()
Returns the current maxMergeAtOnceExplicit setting.
See Also: setMaxMergeAtOnceExplicit(int)
-
setMaxMergedSegmentMB
Maximum sized segment to produce during normal merging. This setting is approximate: the estimate of the merged segment size is made by summing sizes of to-be-merged segments (compensating for percent deleted docs). Default is 5 GB. -
getMaxMergedSegmentMB
public double getMaxMergedSegmentMB()
Returns the current maxMergedSegmentMB setting.
See Also: setMaxMergedSegmentMB(double)
-
setDeletesPctAllowed
Controls the maximum percentage of deleted documents that is tolerated in the index. Lower values make the index more space efficient at the expense of increased CPU and I/O activity. Values must be between 20 and 50. Default value is 33. -
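The range restriction and the resulting delete budget can be sketched as follows. DeletesBudgetSketch and its methods are hypothetical helpers; treating the bounds 20 and 50 as inclusive and expressing the budget as a document count are assumptions made for the example.

```java
/** Illustrative only: validating deletesPctAllowed and deriving a delete budget. */
final class DeletesBudgetSketch {

  /** Rejects values outside the documented 20 to 50 range (bounds assumed inclusive). */
  static double checkDeletesPct(double v) {
    if (v < 20 || v > 50) {
      throw new IllegalArgumentException(
          "deletesPctAllowed must be between 20 and 50 inclusive (got " + v + ")");
    }
    return v;
  }

  /** Number of deleted documents tolerated for an index holding totalDocs documents. */
  static long allowedDeletedDocs(double deletesPctAllowed, long totalDocs) {
    return (long) (checkDeletesPct(deletesPctAllowed) * totalDocs / 100.0);
  }
}
```

With the default of 33, an index of 1000 documents would tolerate 330 deleted documents under this sketch before merging is needed to reclaim them.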
getDeletesPctAllowed
public double getDeletesPctAllowed()
Returns the current deletesPctAllowed setting.
See Also: setDeletesPctAllowed(double)
-
setFloorSegmentMB
Segments smaller than this are "rounded up" to this size, i.e. treated as equal (floor) size for merge selection. This prevents frequent flushing of tiny segments from creating a long tail in the index. Default is 2 MB. -
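The rounding-up behaviour amounts to taking the larger of the segment size and the floor. The sketch below uses hypothetical names (FloorSizeSketch, mbToBytes) and assumes 1 MB means 1024 * 1024 bytes.

```java
/** Illustrative only: treating small segments as if they were floor-sized. */
final class FloorSizeSketch {

  /** Converts megabytes to bytes, assuming 1 MB = 1024 * 1024 bytes. */
  static long mbToBytes(double mb) {
    return (long) (mb * 1024 * 1024);
  }

  /** Returns the size used for merge selection: never less than the floor. */
  static long floorSize(long bytes, long floorSegmentBytes) {
    return Math.max(floorSegmentBytes, bytes);
  }
}
```

With the default 2 MB floor, a 100-byte segment and a 1 MB segment are both treated as 2097152 bytes, so they look equally sized to the merge selection, while a 5 MB segment keeps its real size.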
getFloorSegmentMB
public double getFloorSegmentMB()
Returns the current floorSegmentMB.
See Also: setFloorSegmentMB(double)
-
setForceMergeDeletesPctAllowed
When forceMergeDeletes is called, we only merge away a segment if its delete percentage is over this threshold. Default is 10%. -
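A minimal sketch of the threshold test, assuming the delete percentage is computed as deleted documents over total documents in the segment. The class and method names are hypothetical.

```java
/** Illustrative only: deciding whether forceMergeDeletes would touch a segment. */
final class ForceMergeDeletesSketch {

  /** Returns true if the segment's delete percentage is strictly over the threshold. */
  static boolean eligible(int maxDoc, int delCount, double forceMergeDeletesPctAllowed) {
    double pctDeletes = 100.0 * delCount / maxDoc;
    return pctDeletes > forceMergeDeletesPctAllowed;
  }
}
```

With the default threshold of 10%, a segment with 150 deleted documents out of 1000 would be merged away, while one with 50 deleted documents would be left alone.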
getForceMergeDeletesPctAllowed
public double getForceMergeDeletesPctAllowed()
Returns the current forceMergeDeletesPctAllowed setting.
See Also: setForceMergeDeletesPctAllowed(double)
-
setSegmentsPerTier
Sets the allowed number of segments per tier. Smaller values mean more merging but fewer segments. Default is 10.0.
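One way to picture how segmentsPerTier turns into a segment budget is the tier walk sketched below: each tier holds up to segmentsPerTier segments, and each successive tier's segments are maxMergeAtOnce times larger, capped at the maximum merged segment size. This is a simplified model under those stated assumptions, with hypothetical names throughout; it is not presented as the policy's exact computation.

```java
/** Illustrative only: a tiered segment-count budget. */
final class TierBudgetSketch {

  /**
   * Estimates how many segments an index of the given size may hold.
   * Assumes smallestSegmentBytes > 0, segmentsPerTier > 0 and maxMergeAtOnce >= 2.
   */
  static int allowedSegmentCount(
      long totalIndexBytes,
      long smallestSegmentBytes,
      long maxMergedSegmentBytes,
      double segmentsPerTier,
      int maxMergeAtOnce) {
    if (smallestSegmentBytes <= 0 || segmentsPerTier <= 0 || maxMergeAtOnce < 2) {
      throw new IllegalArgumentException("invalid tier parameters");
    }
    long levelSize = smallestSegmentBytes;
    double bytesLeft = totalIndexBytes;
    double allowed = 0;
    while (true) {
      double segCountAtLevel = bytesLeft / levelSize;
      // Last tier: either it is not full, or segments cannot grow any further.
      if (segCountAtLevel < segmentsPerTier || levelSize >= maxMergedSegmentBytes) {
        allowed += Math.ceil(segCountAtLevel);
        break;
      }
      allowed += segmentsPerTier;
      bytesLeft -= segmentsPerTier * levelSize;
      levelSize = Math.min(maxMergedSegmentBytes, levelSize * maxMergeAtOnce);
    }
    return (int) allowed;
  }
}
```

In this model, lowering segmentsPerTier from 10 to 5 for the same index lowers the budget, which matches the statement that smaller values mean more merging but fewer segments.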
-
getSegmentsPerTier
public double getSegmentsPerTier()
Returns the current segmentsPerTier setting.
See Also: setSegmentsPerTier(double)
-
getSortedBySegmentSize
private List<TieredMergePolicy.SegmentSizeAndDocs> getSortedBySegmentSize(SegmentInfos infos, MergePolicy.MergeContext mergeContext) throws IOException
Throws: IOException
-
findMerges
public MergePolicy.MergeSpecification findMerges(MergeTrigger mergeTrigger, SegmentInfos infos, MergePolicy.MergeContext mergeContext) throws IOException
Description copied from class: MergePolicy
Determine what set of merge operations are now necessary on the index. IndexWriter calls this whenever there is a change to the segments. This call is always synchronized on the IndexWriter instance so only one thread at a time will call this method.
Specified by: findMerges in class MergePolicy
Parameters:
mergeTrigger - the event that triggered the merge
infos - the total set of segments in the index
mergeContext - the IndexWriter to find the merges on
Throws: IOException
-
doFindMerges
private MergePolicy.MergeSpecification doFindMerges(List<TieredMergePolicy.SegmentSizeAndDocs> sortedEligibleInfos, long maxMergedSegmentBytes, int mergeFactor, int allowedSegCount, int allowedDelCount, TieredMergePolicy.MERGE_TYPE mergeType, MergePolicy.MergeContext mergeContext, boolean maxMergeIsRunning) throws IOException
Throws: IOException
-
score
protected TieredMergePolicy.MergeScore score(List<SegmentCommitInfo> candidate, boolean hitTooLarge, Map<SegmentCommitInfo, TieredMergePolicy.SegmentSizeAndDocs> segmentsSizes) throws IOException
Expert: scores one merge; subclasses can override.
Throws: IOException
-
findForcedMerges
public MergePolicy.MergeSpecification findForcedMerges(SegmentInfos infos, int maxSegmentCount, Map<SegmentCommitInfo, Boolean> segmentsToMerge, MergePolicy.MergeContext mergeContext) throws IOException
Description copied from class: MergePolicy
Determine what set of merge operations is necessary in order to merge to <= the specified segment count. IndexWriter calls this when its IndexWriter.forceMerge(int) method is called. This call is always synchronized on the IndexWriter instance so only one thread at a time will call this method.
Specified by: findForcedMerges in class MergePolicy
Parameters:
infos - the total set of segments in the index
maxSegmentCount - requested maximum number of segments in the index (currently this is always 1)
segmentsToMerge - contains the specific SegmentInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is True for a given SegmentInfo, that means this segment was an original segment present in the to-be-merged index; else, it was a segment produced by a cascaded merge.
mergeContext - the MergeContext to find the merges on
Throws: IOException
-
findForcedDeletesMerges
public MergePolicy.MergeSpecification findForcedDeletesMerges(SegmentInfos infos, MergePolicy.MergeContext mergeContext) throws IOException
Description copied from class: MergePolicy
Determine what set of merge operations is necessary in order to expunge all deletes from the index.
Specified by: findForcedDeletesMerges in class MergePolicy
Parameters:
infos - the total set of segments in the index
mergeContext - the MergeContext to find the merges on
Throws: IOException
-
floorSize
private long floorSize(long bytes) -
toString
-