Class DocValuesTermsQuery
- All Implemented Interfaces:
Accountable
Query that only accepts documents whose
term value in the specified field is contained in the
provided set of allowed terms.
This is the same functionality as TermsQuery (from queries/), but because of drastically different implementations, they also have different performance characteristics, as described below.
NOTE: be very careful using this query: it is
typically much slower than using TermsQuery,
but in certain specialized cases may be faster.
With each search, this query translates the specified
set of Terms into a private LongBitSet keyed by
term number per unique IndexReader (normally one
reader per segment). Then, during matching, the term
number for each docID is retrieved from the cache and
then checked for inclusion using the LongBitSet.
Since all testing is done using RAM resident data
structures, performance should be very fast, most likely
fast enough to not require further caching of the
DocIdSet for each possible combination of terms.
However, because docIDs are simply scanned linearly, an
index with a great many small documents may find this
linear scan too costly.
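The matching scheme described above can be modeled with plain JDK types. This is a minimal sketch under stated assumptions, not Lucene's implementation: `java.util.BitSet` stands in for LongBitSet, a sorted `String[]` stands in for one segment's term dictionary, and an `int[]` of per-document term numbers stands in for the doc values. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

public class OrdinalScanSketch {
    /**
     * Returns the docIDs whose term is in the allowed set.
     * sortedTerms: the segment's unique terms in sorted order (index = term number).
     * docOrds: docOrds[docID] is that document's term number, or -1 if it has no value.
     */
    public static BitSet match(String[] sortedTerms, int[] docOrds, List<String> allowed) {
        // Step 1: translate the allowed terms into a bit set keyed by term number.
        BitSet allowedOrds = new BitSet(sortedTerms.length);
        for (String term : allowed) {
            int ord = Arrays.binarySearch(sortedTerms, term);
            if (ord >= 0) { // terms absent from this segment are skipped
                allowedOrds.set(ord);
            }
        }
        // Step 2: scan every docID linearly and test its term number in RAM.
        BitSet hits = new BitSet(docOrds.length);
        for (int docID = 0; docID < docOrds.length; docID++) {
            int ord = docOrds[docID];
            if (ord >= 0 && allowedOrds.get(ord)) {
                hits.set(docID);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] sortedTerms = {"blue", "green", "red"};
        int[] docOrds = {2, 0, -1, 1, 2}; // doc 2 has no value
        System.out.println(match(sortedTerms, docOrds, List.of("red", "green", "pink")));
        // prints {0, 3, 4}
    }
}
```

In this model, the cost of step 2 grows with the number of documents regardless of how many of them match, which is the linear scan the paragraph above warns about.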
In contrast, TermsQuery builds up a FixedBitSet,
keyed by docID, every time it's created, by enumerating
through all matching docs using PostingsEnum to seek
and scan through each term's docID list. While there is
no linear scan of all docIDs, besides the allocation of
the underlying array in the FixedBitSet, this
approach requires a number of "disk seeks" in proportion
to the number of terms, which can be exceptionally costly
when there are cache misses in the OS's IO cache.
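For comparison, the postings-driven approach can be sketched the same way. This is a simplified model, not the TermsQuery implementation: a `Map` from term to an ascending docID array stands in for seeking each term and reading its docID list with PostingsEnum, and `java.util.BitSet` stands in for FixedBitSet. The class and method names are hypothetical.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;

public class PostingsUnionSketch {
    /**
     * Returns the docIDs that contain at least one allowed term.
     * postings: maps each term to its ascending docID list.
     * maxDoc: number of documents; sizes the bit set keyed by docID.
     */
    public static BitSet match(Map<String, int[]> postings, int maxDoc, List<String> allowed) {
        BitSet hits = new BitSet(maxDoc); // the up-front allocation mentioned above
        for (String term : allowed) {
            int[] docs = postings.get(term); // one "seek" per term
            if (docs == null) {
                continue; // term does not occur in the index
            }
            for (int docID : docs) {
                hits.set(docID); // only matching docs are visited
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = Map.of(
            "red", new int[] {0, 4},
            "green", new int[] {3},
            "blue", new int[] {1});
        System.out.println(match(postings, 5, List.of("red", "green", "pink")));
        // prints {0, 3, 4}
    }
}
```

Here the work is proportional to the number of terms plus the number of matching documents, so the per-term lookup is what dominates when each one is expensive.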
Generally, this query will be slower on the first invocation for a given field. Subsequent invocations, even with a different set of allowed Terms, should be faster than TermsQuery, especially as the number of Terms being matched increases. If you are matching only a very small number of terms, and those terms in turn match a very small number of documents, TermsQuery may perform faster.
Which query is best is very application dependent.
-
Field Summary
Fields
Modifier and Type | Field | Description
private static final long | BASE_RAM_BYTES
private final String | field
private final PrefixCodedTerms | termData
private final int | termDataHashCode
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
Constructor Summary
Constructors
Constructor | Description
DocValuesTermsQuery(String field, String... terms)
DocValuesTermsQuery(String field, Collection<BytesRef> terms)
DocValuesTermsQuery(String field, BytesRef... terms)
-
Method Summary
Modifier and Type | Method | Description
Weight | createWeight(IndexSearcher searcher, ScoreMode scoreMode, float boost) | Expert: Constructs an appropriate Weight implementation for this query.
boolean | equals(Object) | Override and implement query instance equivalence properly in a subclass.
private boolean | equalsTo(DocValuesTermsQuery other)
String | getField()
PrefixCodedTerms | getTerms()
int | hashCode() | Override and implement query hash code properly in a subclass.
long | ramBytesUsed() | Return the memory usage of this object in bytes.
String | toString(String field) | Prints a query to a string, with field assumed to be the default field and omitted.
void | visit(QueryVisitor visitor) | Recurse through the query tree, visiting any child queries.
Methods inherited from class org.apache.lucene.search.Query
classHash, rewrite, sameClassAs, toString
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
Field Details
-
BASE_RAM_BYTES
private static final long BASE_RAM_BYTES -
field
-
termData
-
termDataHashCode
private final int termDataHashCode
-
-
Constructor Details
-
DocValuesTermsQuery
-
DocValuesTermsQuery
-
DocValuesTermsQuery
-
-
Method Details
-
equals
Description copied from class: Query
Override and implement query instance equivalence properly in a subclass. This is required so that QueryCache works properly. Typically a query will be equal to another only if it's an instance of the same class and its document-filtering properties are identical to those of the other instance. Utility methods are provided for certain repetitive code.
-
equalsTo
-
hashCode
public int hashCode()
Description copied from class: Query
Override and implement query hash code properly in a subclass. This is required so that QueryCache works properly.
-
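The equals, equalsTo, and hashCode contract described above can be illustrated with a small stand-alone class. This is a sketch of the general pattern only, assuming a hypothetical `TermSetKeySketch` class that holds a field name, a sorted term array, and a hash of the terms computed once at construction. It does not extend Query and is not the code of this class.

```java
import java.util.Arrays;
import java.util.Objects;

public final class TermSetKeySketch {
    private final String field;
    private final String[] terms;    // sorted copy, so argument order does not matter
    private final int termsHashCode; // computed once, since the term list may be large

    public TermSetKeySketch(String field, String... terms) {
        this.field = Objects.requireNonNull(field);
        this.terms = terms.clone();
        Arrays.sort(this.terms);
        this.termsHashCode = Arrays.hashCode(this.terms);
    }

    @Override
    public boolean equals(Object other) {
        // Same class first, then the properties that decide which documents match.
        return other != null
            && getClass() == other.getClass()
            && equalsTo((TermSetKeySketch) other);
    }

    private boolean equalsTo(TermSetKeySketch other) {
        // Cheap checks (field, cached hash) run before the full term comparison.
        return field.equals(other.field)
            && termsHashCode == other.termsHashCode
            && Arrays.equals(terms, other.terms);
    }

    @Override
    public int hashCode() {
        // Built from the same properties that equals compares.
        return Objects.hash(getClass(), field, termsHashCode);
    }

    public static void main(String[] args) {
        TermSetKeySketch a = new TermSetKeySketch("color", "red", "green");
        TermSetKeySketch b = new TermSetKeySketch("color", "green", "red");
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
    }
}
```

Two instances built from the same field and the same terms compare equal and share a hash code, which is what a cache keyed by the object needs in order to find an earlier entry.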
toString
Description copied from class: Query
Prints a query to a string, with field assumed to be the default field and omitted.
-
getField
- Returns:
- the name of the field searched by this query.
-
getTerms
- Returns:
- the terms looked up by this query, prefix-encoded.
-
ramBytesUsed
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes. Negative values are illegal.
- Specified by:
ramBytesUsed in interface Accountable
-
visit
Description copied from class: Query
Recurse through the query tree, visiting any child queries.
-
createWeight
public Weight createWeight(IndexSearcher searcher, ScoreMode scoreMode, float boost) throws IOException
Description copied from class: Query
Expert: Constructs an appropriate Weight implementation for this query.
Only implemented by primitive queries, which re-write to themselves.
- Overrides:
createWeight in class Query
- Parameters:
scoreMode - How the produced scorers will be consumed.
boost - The boost that is propagated by the parent queries.
- Throws:
IOException
-