Class FSTCompletion

java.lang.Object
org.apache.lucene.search.suggest.fst.FSTCompletion

public class FSTCompletion extends Object
Finite state automata based implementation of "autocomplete" functionality.
See Also:
  • Field Details

    • DEFAULT_BUCKETS

      public static final int DEFAULT_BUCKETS
      Default number of buckets.
      See Also:
    • EMPTY_RESULT

      private static final ArrayList<FSTCompletion.Completion> EMPTY_RESULT
      An empty result. Keep this an ArrayList to keep all the returned lists of single type (monomorphic calls).
    • automaton

      private final FST<Object> automaton
      Finite state automaton encoding all the lookup terms. See class notes for details.
    • rootArcs

      private final FST.Arc<Object>[] rootArcs
      An array of arcs leaving the root automaton state and encoding weights of all completions in their sub-trees.
    • exactFirst

      private boolean exactFirst
      See Also:
    • higherWeightsFirst

      private boolean higherWeightsFirst
      See Also:
  • Constructor Details

    • FSTCompletion

      public FSTCompletion(FST<Object> automaton, boolean higherWeightsFirst, boolean exactFirst)
      Constructs an FSTCompletion, specifying higherWeightsFirst and exactFirst.
      Parameters:
      automaton - Automaton with completions. See FSTCompletionBuilder.
      higherWeightsFirst - Return most popular suggestions first. This is the default behavior for this implementation. Setting it to false has no effect (use constant term weights to sort alphabetically only).
      exactFirst - Find and push an exact match to the first position of the result list if found.
    • FSTCompletion

      public FSTCompletion(FST<Object> automaton)
      Defaults to higher weights first and exact first.
      See Also:
  • Method Details

    • cacheRootArcs

      private static FST.Arc<Object>[] cacheRootArcs(FST<Object> automaton)
      Cache the root node's output arcs starting with completions with the highest weights.
    • getExactMatchStartingFromRootArc

      private int getExactMatchStartingFromRootArc(int rootArcIndex, BytesRef utf8)
      Returns the first exact match by traversing root arcs, starting from the arc rootArcIndex.
      Parameters:
      rootArcIndex - The first root arc index in rootArcs to consider when matching.
      utf8 - The sequence of utf8 bytes to follow.
      Returns:
      Returns the bucket number of the match or -1 if no match was found.
    • lookup

      public List<FSTCompletion.Completion> lookup(CharSequence key, int num)
      Lookup suggestions to key.
      Parameters:
      key - The prefix to which suggestions should be sought.
      num - At most this number of suggestions will be returned.
      Returns:
      Returns the suggestions, sorted by their approximated weight first (decreasing) and then alphabetically (UTF-8 codepoint order).
    • lookupSortedAlphabetically

      private List<FSTCompletion.Completion> lookupSortedAlphabetically(BytesRef key, int num) throws IOException
      Lookup suggestions sorted alphabetically if weights are not constant. This is a workaround: in general, use constant weights for alphabetically sorted result.
      Throws:
      IOException
    • lookupSortedByWeight

      private ArrayList<FSTCompletion.Completion> lookupSortedByWeight(BytesRef key, int num, boolean collectAll) throws IOException
      Lookup suggestions sorted by weight (descending order).
      Parameters:
      collectAll - If true, the routine terminates immediately when num suggestions have been collected. If false, it will collect suggestions from all weight arcs (needed for lookupSortedAlphabetically(org.apache.lucene.util.BytesRef, int).
      Throws:
      IOException
    • checkExistingAndReorder

      private boolean checkExistingAndReorder(ArrayList<FSTCompletion.Completion> list, BytesRef key)
      Checks if the list of Lookup.LookupResults already has a key. If so, reorders that Lookup.LookupResult to the first position.
      Returns:
      Returns true if and only if list contained key.
    • descendWithPrefix

      private boolean descendWithPrefix(FST.Arc<Object> arc, BytesRef utf8) throws IOException
      Descend along the path starting at arc and going through bytes in the argument.
      Parameters:
      arc - The starting arc. This argument is modified in-place.
      utf8 - The term to descend along.
      Returns:
      If true, arc will be set to the arc matching last byte of term. false is returned if no such prefix exists.
      Throws:
      IOException
    • collect

      private boolean collect(List<FSTCompletion.Completion> res, int num, int bucket, BytesRef output, FST.Arc<Object> arc) throws IOException
      Recursive collect lookup results from the automaton subgraph starting at arc.
      Parameters:
      num - Maximum number of results needed (early termination).
      Throws:
      IOException
    • getBucketCount

      public int getBucketCount()
      Returns the bucket count (discretization thresholds).
    • getBucket

      public int getBucket(CharSequence key)
      Returns the bucket assigned to a given key (if found) or -1 if no exact match exists.
    • getFST

      public FST<Object> getFST()
      Returns the internal automaton.