Package org.apache.lucene.document
Document for indexing and searching.
The document package provides the user level logical representation of content to be indexed and searched. The
package also provides utilities for working with Documents and IndexableFields.
Document and IndexableField
A Document is a collection of IndexableFields. An
IndexableField is a logical representation of a user's content that needs to be indexed or stored.
IndexableFields have a number of properties that tell Lucene how to treat the content (such as indexed, tokenized,
and stored). See the Field implementation of IndexableField
for specifics on these properties.
Note: it is common to refer to Documents having Fields, even though technically they have
IndexableFields.
Working with Documents
First and foremost, a Document is something created by the user application. It is your job
to create Documents based on the content of the files you are working with in your application (Word, txt, PDF, Excel, or any other format).
How this is done is completely up to you. That being said, there are many tools available in other projects that can make
the process of taking a file and converting it into a Lucene Document easier.
DateTools is a utility class that makes dates and times searchable. IntPoint, LongPoint,
FloatPoint and DoublePoint enable indexing
of numeric values (and also dates) for fast range queries using PointRangeQuery.
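To illustrate why converting dates to strings can make them searchable, here is a minimal sketch using only the JDK. It is not the DateTools implementation; the class name SortableDateSketch and its methods are hypothetical. It only shows the general idea that a fixed-width, most-significant-first string rendered in a single time zone sorts lexicographically in chronological order, and that a coarser pattern acts as a coarser time granularity.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration class; not part of Lucene.
class SortableDateSketch {
    // Fixed-width patterns, always rendered in UTC so strings are comparable.
    private static final DateTimeFormatter DAY =
        DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);
    private static final DateTimeFormatter SECOND =
        DateTimeFormatter.ofPattern("yyyyMMddHHmmss").withZone(ZoneOffset.UTC);

    // Coarse granularity: everything below the day is dropped.
    static String toDayString(Instant t) {
        return DAY.format(t);
    }

    // Finer granularity: keeps hours, minutes and seconds.
    static String toSecondString(Instant t) {
        return SECOND.format(t);
    }

    public static void main(String[] args) {
        Instant earlier = Instant.parse("2021-03-04T05:06:07Z");
        Instant later = Instant.parse("2021-11-20T18:30:00Z");

        System.out.println(toDayString(earlier));    // 20210304
        System.out.println(toSecondString(later));   // 20211120183000

        // String order matches chronological order.
        System.out.println(toSecondString(earlier).compareTo(toSecondString(later)) < 0); // true
    }
}
```

A string produced this way can be indexed as a single untokenized term. For numeric range queries on dates, the point fields mentioned above are the alternative the package describes.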
Class descriptions
An indexed 128-bit BigInteger field.
Field that stores a per-document BytesRef value.
An indexed binary field for fast range filters.
Provides support for converting dates to strings and vice-versa.
Specifies the time granularity.
Documents are the unit of indexing and search.
A StoredFieldVisitor that creates a Document from stored fields.
Syntactic sugar for encoding doubles as NumericDocValues via Double.doubleToRawLongBits(double).
An indexed double field for fast range filters.
Builder for multi range queries for DoublePoints.
An indexed Double Range field.
DocValues field for DoubleRange.
A DoubleValuesSource instance which can be used to read the values of a feature from a FeatureField for documents.
Field that can be used to store static scoring factors into documents.
Sorts using the value of a specified feature name from a FeatureField.
Expert: directly create a field for a document.
Specifies whether and how a field should be stored.
Describes the properties of a field.
Syntactic sugar for encoding floats as NumericDocValues via Float.floatToRawIntBits(float).
An indexed float field for fast range filters.
Builder for multi range queries for FloatPoints.
KNN search on top of N dimensional indexed float points.
An indexed Float Range field.
DocValues field for FloatRange.
An indexed half-float field for fast range filters.
An indexed 128-bit InetAddress field.
An indexed InetAddress Range field.
An indexed int field for fast range filters.
Builder for multi range queries for IntPoints.
An indexed Integer Range field.
DocValues field for IntRange.
An indexed 2-Dimension Bounding Box field for the Geospatial Lat/Lon Coordinate system.
Distance query for LatLonDocValuesField.
A per-document location field.
Finds all previously indexed geo points that comply with the given ShapeField.QueryRelation with the specified array of LatLonGeometry.
An indexed location field.
Compares documents by distance from an origin point.
Distance query for LatLonPoint.
Finds all previously indexed geo points that comply with the given ShapeField.QueryRelation with the specified array of LatLonGeometry.
Sorts by distance from an origin location.
A geo shape utility class for indexing and searching GIS geometries whose vertices are latitude, longitude values (in decimal degrees).
Finds all previously indexed geo shapes that intersect the specified bounding box.
Holds spatial logic for a bounding box that works in the encoded space.
Finds all previously indexed geo shapes that comply with the given ShapeField.QueryRelation with the specified array of LatLonGeometry.
Defers actually loading a field's value until you ask for it.
An indexed long field for fast range filters.
Builder for multi range queries for LongPoints.
An indexed Long Range field.
DocValues field for LongRange.
Field that stores a per-document long value for scoring, sorting or value retrieval.
Query class for searching RangeField types by a defined PointValues.Relation.
Used by RangeFieldQuery to check how each internal or leaf node relates to the query.
A base shape utility class used for both LatLon (spherical) and XY (cartesian) shape fields.
Represents an encoded triangle using ShapeField.decodeTriangle(byte[], DecodedTriangle).
Type of triangle.
Query Relation Types.
Polygons are decomposed into tessellated triangles using Tessellator; these triangles are encoded and inserted as separate indexed POINT fields.
Field that stores a per-document BytesRef value, indexed for sorting.
Field that stores per-document long values for scoring, sorting or value retrieval.
Field that stores a set of per-document BytesRef values, indexed for faceting, grouping, joining.
Utility class for implementing constant score logic specific to INTERSECT, WITHIN, and DISJOINT.
Visitor used for walking the BKD tree.
A field whose value is stored so that IndexSearcher.doc(int) and IndexReader.document() will return the field and its value.
A field that is indexed but not tokenized: the entire String value is indexed as a single token.
A field that is indexed and tokenized, without term vectors.
A per-document location field.
XYGeometry query for XYDocValuesField.
Compares documents by distance from an origin point.
An indexed XY position field.
Finds all previously indexed points that fall within the specified XY geometries.
Sorts by distance from an origin location.
A cartesian shape utility class for indexing and searching geometries whose vertices are unitless x, y values.
Finds all previously indexed cartesian shapes that comply with the given ShapeField.QueryRelation with the specified array of XYGeometry.
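
Two of the descriptions above mention encoding doubles via Double.doubleToRawLongBits(double) and floats via Float.floatToRawIntBits(float). The following JDK-only sketch shows what that encoding amounts to: the floating-point value is reinterpreted bit for bit as an integer and can be recovered exactly. The class name DoubleBitsSketch and its helper methods are hypothetical, and the sketch makes no claim about how the Lucene field classes use these calls internally.

```java
// Hypothetical illustration class; not part of Lucene.
class DoubleBitsSketch {
    // Reinterpret the 64 bits of a double as a long (no numeric conversion).
    static long encode(double value) {
        return Double.doubleToRawLongBits(value);
    }

    // Recover the original double from its bit pattern.
    static double decode(long bits) {
        return Double.longBitsToDouble(bits);
    }

    // Same idea for 32-bit floats and ints.
    static int encodeFloat(float value) {
        return Float.floatToRawIntBits(value);
    }

    static float decodeFloat(int bits) {
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        long bits = encode(3.25);
        System.out.println(Long.toHexString(bits));   // 400a000000000000
        System.out.println(decode(bits));             // 3.25

        int floatBits = encodeFloat(3.25f);
        System.out.println(Integer.toHexString(floatBits)); // 40500000
        System.out.println(decodeFloat(floatBits));         // 3.25
    }
}
```

The round trip is lossless because no rounding takes place: the long or int is just a different view of the same bits.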