Package org.apache.lucene.codecs
The Codec API allows you to customise the way the following pieces of index information are stored:
- Postings lists - see PostingsFormat
- DocValues - see DocValuesFormat
- Stored fields - see StoredFieldsFormat
- Term vectors - see TermVectorsFormat
- Points - see PointsFormat
- FieldInfos - see FieldInfosFormat
- SegmentInfo - see SegmentInfoFormat
- Norms - see NormsFormat
- Live documents - see LiveDocsFormat
Codecs are identified by name through the Java Service Provider Interface. To create your own codec, extend
Codec and pass the new codec's name to the super() constructor:
public class MyCodec extends Codec {
  public MyCodec() {
    super("MyCodecName");
  }
  ...
}
You will need to register the Codec class so that the ServiceLoader can find it, by including a
META-INF/services/org.apache.lucene.codecs.Codec file on your classpath that contains the package-qualified
name of your codec.
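As a rough illustration of the registration step, the provider-configuration file could look like the fragment below. The package name com.example.codecs is a hypothetical placeholder, not something defined by Lucene; use the actual package of your codec class. The file lives at META-INF/services/org.apache.lucene.codecs.Codec and lists one implementation class per line.

```
# META-INF/services/org.apache.lucene.codecs.Codec
# com.example.codecs is a hypothetical package name used only for illustration.
com.example.codecs.MyCodec
```

ServiceLoader ignores lines starting with '#', so the comments above are safe to keep in the file.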
If you just want to customise the PostingsFormat, or use different postings
formats for different fields, you can register your custom postings format in the same way (in
META-INF/services/org.apache.lucene.codecs.PostingsFormat). Then extend the default
codec and override
org.apache.lucene.codecs.luceneMN.LuceneMNCodec.getPostingsFormatForField(String) to return your custom
postings format.
Similarly, if you just want to customise the DocValuesFormat per-field, have
a look at LuceneMNCodec.getDocValuesFormatForField(String).
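The per-field override pattern described above can be sketched without any Lucene dependency. This is only a shape illustration under loud assumptions: SketchDefaultCodec and SketchPerFieldCodec are hypothetical stand-ins, not Lucene classes, and formats are represented by their names as plain strings. In Lucene itself the corresponding hooks return PostingsFormat and DocValuesFormat instances, and the field names "id" and "price" as well as the format names are made up for the example.

```java
import java.util.Map;

// Demo entry point: shows which format name each field resolves to.
class PerFieldSketch {
    public static void main(String[] args) {
        SketchDefaultCodec codec = new SketchPerFieldCodec();
        System.out.println(codec.getPostingsFormatForField("id"));
        System.out.println(codec.getPostingsFormatForField("body"));
        System.out.println(codec.getDocValuesFormatForField("price"));
    }
}

// Hypothetical stand-in for the default codec; NOT a Lucene class.
// It only mirrors the shape of the two per-field hooks.
class SketchDefaultCodec {
    public String getPostingsFormatForField(String field) {
        return "DefaultPostings";
    }

    public String getDocValuesFormatForField(String field) {
        return "DefaultDocValues";
    }
}

// Subclass that overrides the hooks for selected fields and
// falls back to the default behaviour for everything else.
class SketchPerFieldCodec extends SketchDefaultCodec {
    // Hypothetical field-to-format-name mappings.
    private final Map<String, String> postingsOverrides =
        Map.of("id", "MyPostingsFormatName");
    private final Map<String, String> docValuesOverrides =
        Map.of("price", "MyDocValuesFormatName");

    @Override
    public String getPostingsFormatForField(String field) {
        String custom = postingsOverrides.get(field);
        return custom != null ? custom : super.getPostingsFormatForField(field);
    }

    @Override
    public String getDocValuesFormatForField(String field) {
        String custom = docValuesOverrides.get(field);
        return custom != null ? custom : super.getDocValuesFormatForField(field);
    }
}
```

Falling back to the superclass for unlisted fields keeps the default behaviour for every field you have not explicitly customised, which is usually what you want when only one or two fields need a special format.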
Classes
- Holds all state required for PostingsReaderBase to produce a PostingsEnum without re-seeking the terms dict.
- Encodes/decodes an inverted index segment.
- This static holder class prevents classloading deadlock by delaying init of default codecs and available codecs until needed.
- Utility class for reading and writing versioned headers.
- This class accumulates the (freq, norm) pairs that may produce competitive scores.
- A read-only Directory that consists of a view over a compound file.
- Encodes/decodes compound files
- Abstract API that consumes numeric, binary and sorted docvalues.
- Tracks state of one binary sub-reader that we are merging
- A merged TermsEnum.
- Tracks state of one numeric sub-reader that we are merging
- Tracks state of one sorted sub-reader that we are merging
- Tracks state of one sorted numeric sub-reader that we are merging
- Tracks state of one sorted set sub-reader that we are merging
- Encodes/decodes per-document values.
- This static holder class prevents classloading deadlock by delaying init of doc values formats until needed.
- Abstract API that produces numeric, binary, sorted, sortedset, and sortednumeric docvalues.
- Encodes/decodes FieldInfos
- Abstract API that consumes terms, doc, freq, prox, offset and payloads postings.
- Abstract API that produces terms, doc, freq, prox, offset and payloads postings.
- A codec that forwards all its method calls to another codec.
- Format for live/deleted documents
- This abstract class reads skip lists with multiple levels.
- used to buffer the top skip levels
- This abstract class writes skip lists with multiple levels.
- PointValues whose order of points can be changed.
- Abstract API that consumes normalization values.
- Tracks state of one numeric sub-reader that we are merging
- Encodes/decodes per-document score normalization values.
- Abstract API that produces field normalization values
- Remove this file when adding back compat codecs
- Encodes/decodes indexed points.
- Abstract API to visit point values.
- Abstract API to write points
- Encodes/decodes terms, postings, and proximity data.
- This static holder class prevents classloading deadlock by delaying init of postings formats until needed.
- The core terms dictionaries (BlockTermsReader, BlockTreeTermsReader) interact with a single instance of this class to manage creation of PostingsEnum and PostingsEnum instances.
- Class that plugs into term dictionaries, such as BlockTreeTermsWriter, and handles writing postings.
- Extension of PostingsWriterBase, adding a push API for writing each element of the postings.
- Expert: Controls the format of the SegmentInfo (segment metadata file).
- Controls the format of stored fields
- Codec API for reading stored fields.
- Codec API for writing stored fields: For every document, StoredFieldsWriter.startDocument() is called, informing the Codec that a new document has started.
- Holder for per-term statistics.
- Controls the format of term vectors
- Codec API for reading term vectors:
- Codec API for writing term vectors: For every document, TermVectorsWriter.startDocument(int) is called, informing the Codec how many fields will be written.