Module org.apache.lucene.sandbox
Class IDVersionPostingsFormat
java.lang.Object
org.apache.lucene.codecs.PostingsFormat
org.apache.lucene.sandbox.codecs.idversion.IDVersionPostingsFormat
- All Implemented Interfaces:
NamedSPILoader.NamedSPI
A PostingsFormat optimized for primary-key (ID) fields that also record a version (long) for each
ID, delivered as a payload created by
longToBytes(long, org.apache.lucene.util.BytesRef) during indexing. At search time, the
TermsEnum implementation IDVersionSegmentTermsEnum enables fast lookup (using only the terms
index when possible) of whether a given ID was previously indexed with version > N
(see IDVersionSegmentTermsEnum.seekExact(BytesRef,long)).
This is most effective if the app assigns a monotonically increasing global version to each
indexed doc. Then, during indexing, use IDVersionSegmentTermsEnum.seekExact(BytesRef,long) (along with LiveFieldValues) to
decide whether the document you are about to index was already indexed with a higher version, and
skip it if so.
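The decision rule described above can be sketched without Lucene. The class below is a hypothetical stand-in, not the Lucene API: a plain in-memory map plays the role that IDVersionSegmentTermsEnum.seekExact(BytesRef,long) and LiveFieldValues play in a real index, and the names VersionGateSketch and offer are invented for illustration. It also assumes that an equal version is not treated as "higher", so the document is accepted again.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the "skip if a higher version exists" rule; not part of Lucene. */
final class VersionGateSketch {
    // Stand-in for the index lookup: the newest version accepted so far for each ID.
    private final Map<String, Long> latestVersionById = new HashMap<>();

    /**
     * Returns true if the document should be indexed, or false if the same ID
     * was already accepted with a strictly higher version.
     */
    boolean offer(String id, long version) {
        Long existing = latestVersionById.get(id);
        if (existing != null && existing > version) {
            return false; // stale update: a higher version is already present
        }
        latestVersionById.put(id, version);
        return true;
    }

    public static void main(String[] args) {
        VersionGateSketch gate = new VersionGateSketch();
        System.out.println(gate.offer("doc-1", 5L)); // true: first time this ID is seen
        System.out.println(gate.offer("doc-1", 3L)); // false: version 5 is already present
        System.out.println(gate.offer("doc-1", 7L)); // true: 7 is newer than 5
    }
}
```

In a real application the map lookup would be replaced by the terms-index lookup, which is what makes the check cheap enough to run for every incoming document.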
The field is effectively indexed as DOCS_ONLY and the docID is pulsed into the terms dictionary, but the user must feed in the version as a payload on the first token.
NOTE: term vectors cannot be indexed with this field (not that you should really ever want to do this).
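To make the version payload concrete, here is a minimal sketch of packing a long into a fixed 8-byte array and reading it back. It is an illustration only: the class name VersionPayloadSketch and its methods are hypothetical, the big-endian byte order is an assumption rather than something this page states, and real code should call longToBytes(long, BytesRef) and bytesToLong(BytesRef) instead.

```java
import java.util.Arrays;

/** Hypothetical helper, not the Lucene implementation: a long as an 8-byte payload. */
final class VersionPayloadSketch {

    /** Packs the version into 8 bytes, most significant byte first (assumed layout). */
    static byte[] encode(long version) {
        byte[] out = new byte[8];
        for (int i = 0; i < 8; i++) {
            out[i] = (byte) (version >>> (56 - 8 * i));
        }
        return out;
    }

    /** Reverses encode(long); rejects arrays that are not exactly 8 bytes long. */
    static long decode(byte[] bytes) {
        if (bytes.length != 8) {
            throw new IllegalArgumentException("expected 8 bytes, got " + bytes.length);
        }
        long v = 0;
        for (byte b : bytes) {
            v = (v << 8) | (b & 0xFFL); // mask to avoid sign extension of the byte
        }
        return v;
    }

    public static void main(String[] args) {
        byte[] payload = encode(258L); // 258 == 0x0102
        System.out.println(Arrays.toString(payload)); // [0, 0, 0, 0, 0, 0, 1, 2]
        System.out.println(decode(payload)); // 258
    }
}
```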
-
Field Summary
Fields (Modifier and Type, Field, Description):
- static final long MAX_VERSION: version must be <= this, because we encode with ZigZag.
- private final int maxTermsInBlock
- static final long MIN_VERSION: version must be >= this.
- private final int minTermsInBlock
Fields inherited from class org.apache.lucene.codecs.PostingsFormat
EMPTY
-
Constructor Summary
Constructors:
- IDVersionPostingsFormat()
- IDVersionPostingsFormat(int minTermsInBlock, int maxTermsInBlock)
-
Method Summary
Methods (Modifier and Type, Method, Description):
- static long bytesToLong(BytesRef bytes)
- FieldsConsumer fieldsConsumer(SegmentWriteState state): Writes a new segment.
- FieldsProducer fieldsProducer(SegmentReadState state): Reads a segment.
- static void longToBytes(long v, BytesRef bytes)
Methods inherited from class org.apache.lucene.codecs.PostingsFormat
availablePostingsFormats, forName, getName, reloadPostingsFormats, toString
-
Field Details
-
MIN_VERSION
public static final long MIN_VERSION
version must be >= this.
-
MAX_VERSION
public static final long MAX_VERSION
version must be <= this, because we encode with ZigZag.
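For readers unfamiliar with the term, ZigZag encoding is a general technique that maps signed integers to unsigned ones so that values of small magnitude, positive or negative, get small encodings. The sketch below shows the standard 64-bit mapping as background only. The class ZigZagSketch is hypothetical, and this page does not say which exact expression Lucene uses internally or how it determines the value of MAX_VERSION.

```java
/** Hypothetical illustration of the standard 64-bit ZigZag mapping; not taken from Lucene. */
final class ZigZagSketch {

    /** Maps 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ... */
    static long encode(long n) {
        // The arithmetic shift yields all ones for negative n and all zeros otherwise.
        return (n << 1) ^ (n >> 63);
    }

    /** Inverse of encode(long). */
    static long decode(long z) {
        // The lowest bit carries the sign; the remaining bits carry the magnitude.
        return (z >>> 1) ^ -(z & 1);
    }

    public static void main(String[] args) {
        for (long n : new long[] {0L, -1L, 1L, -2L, 2L}) {
            System.out.println(n + " -> " + encode(n));
        }
    }
}
```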
-
minTermsInBlock
private final int minTermsInBlock -
maxTermsInBlock
private final int maxTermsInBlock
-
-
Constructor Details
-
IDVersionPostingsFormat
public IDVersionPostingsFormat() -
IDVersionPostingsFormat
public IDVersionPostingsFormat(int minTermsInBlock, int maxTermsInBlock)
-
-
Method Details
-
fieldsConsumer
public FieldsConsumer fieldsConsumer(SegmentWriteState state) throws IOException
Description copied from class: PostingsFormat
Writes a new segment.
- Specified by:
fieldsConsumer in class PostingsFormat
- Throws:
IOException
-
fieldsProducer
public FieldsProducer fieldsProducer(SegmentReadState state) throws IOException
Description copied from class: PostingsFormat
Reads a segment. NOTE: by the time this call returns, it must hold open any files it will need to use; otherwise, those files may be deleted. Additionally, required files may be deleted during the execution of this call, before there is a chance to open them. In that case the implementation should throw an IOException. IOExceptions are expected and will automatically cause a retry of the segment-opening logic with the newly revised segments.
- Specified by:
fieldsProducer in class PostingsFormat
- Throws:
IOException
-
bytesToLong
public static long bytesToLong(BytesRef bytes)
-
longToBytes
public static void longToBytes(long v, BytesRef bytes)
-