Class DoubleMetaphone

java.lang.Object
org.apache.commons.codec.language.DoubleMetaphone
All Implemented Interfaces:
Encoder, StringEncoder

public class DoubleMetaphone extends Object implements StringEncoder
Encodes a string into a Double Metaphone value. This implementation is based on the algorithm by Lawrence Philips.

This class is conditionally thread-safe. The instance field for the maximum code length is mutable (see setMaxCodeLen(int)) but is not volatile, and accesses to it are not synchronized. If an instance of the class is shared between threads, the caller must use suitable synchronization to ensure safe publication of the value between threads, and must not invoke setMaxCodeLen(int) after initial setup.
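One way to satisfy this requirement is to configure the instance once and publish it through a final field. The sketch below illustrates the pattern with a hypothetical stand-in class, ConfigurableEncoder, that has the same kind of plain mutable field; it is not the library class, and SharedEncoder and readFromWorker are names assumed for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SafePublicationSketch {

    /** Hypothetical stand-in: a plain, non-volatile, unsynchronized mutable field. */
    static final class ConfigurableEncoder {
        private int maxCodeLen = 4;

        int getMaxCodeLen() {
            return maxCodeLen;
        }

        void setMaxCodeLen(int maxCodeLen) {
            this.maxCodeLen = maxCodeLen;
        }
    }

    /** Configures the encoder once in the constructor and exposes it read-only. */
    static final class SharedEncoder {
        // A final field assigned in the constructor is safely published
        // together with the state written before the constructor finished.
        private final ConfigurableEncoder encoder;

        SharedEncoder(int maxCodeLen) {
            ConfigurableEncoder configured = new ConfigurableEncoder();
            configured.setMaxCodeLen(maxCodeLen); // the only setter call, during setup
            this.encoder = configured;
        }

        int maxCodeLen() {
            return encoder.getMaxCodeLen();
        }
    }

    /** Reads the configured value from a second thread and returns what it saw. */
    static int readFromWorker(SharedEncoder shared) {
        AtomicInteger seen = new AtomicInteger(-1);
        Thread worker = new Thread(() -> seen.set(shared.maxCodeLen()));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(readFromWorker(new SharedEncoder(6)));
    }
}
```

The key design choice is that no code path calls the setter after the holder is constructed, so readers never race with a write.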

  • Field Details

    • VOWELS

      private static final String VOWELS
      "Vowels" to test for.
    • SILENT_START

      private static final String[] SILENT_START
      Prefixes that, when present, are not pronounced.
    • L_R_N_M_B_H_F_V_W_SPACE

      private static final String[] L_R_N_M_B_H_F_V_W_SPACE
    • ES_EP_EB_EL_EY_IB_IL_IN_IE_EI_ER

      private static final String[] ES_EP_EB_EL_EY_IB_IL_IN_IE_EI_ER
    • L_T_K_S_N_M_B_Z

      private static final String[] L_T_K_S_N_M_B_Z
    • maxCodeLen

      private int maxCodeLen
      Maximum length of an encoding, default is 4.
  • Constructor Details

    • DoubleMetaphone

      public DoubleMetaphone()
      Constructs a new instance.
  • Method Details

    • contains

      protected static boolean contains(String value, int start, int length, String... criteria)
      Tests whether value matches any of the criteria, starting at index start and comparing length characters.
      Parameters:
      value - The value to test.
      start - Where in value to start testing.
      length - How many characters to test.
      criteria - The search criteria.
      Returns:
      Whether there was a match.
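As an illustration of this kind of windowed matching, the sketch below compares the length characters of value beginning at start against each criterion. It is one plausible reading of the description, not the library's source; in particular, treating an out-of-range window as "no match" is an assumption.

```java
final class ContainsSketch {

    /**
     * Returns true if the window of {@code length} characters starting at
     * {@code start} equals any of the criteria. Out-of-range windows
     * (an assumption of this sketch) simply do not match.
     */
    static boolean contains(String value, int start, int length, String... criteria) {
        if (start < 0 || length < 0 || start + length > value.length()) {
            return false;
        }
        String window = value.substring(start, start + length);
        for (String candidate : criteria) {
            if (window.equals(candidate)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(contains("SCHMIDT", 0, 3, "SCH"));
        System.out.println(contains("SCHMIDT", 1, 2, "CZ", "CH"));
        System.out.println(contains("SCHMIDT", 6, 2, "DT"));
    }
}
```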
    • charAt

      protected char charAt(String value, int index)
      Gets the character at index index if available, or Character.MIN_VALUE if out of bounds.
      Parameters:
      value - The String to query.
      index - A string index.
      Returns:
      The character at the index or Character.MIN_VALUE if out of bounds.
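A bounds-tolerant lookup of this kind can be sketched as follows. This is a minimal illustration of the documented contract, not the library's source; the class name CharAtSketch is hypothetical.

```java
final class CharAtSketch {

    /** Returns the character at index, or Character.MIN_VALUE when index is out of bounds. */
    static char charAt(String value, int index) {
        if (index < 0 || index >= value.length()) {
            return Character.MIN_VALUE; // '\u0000' acts as the "no character" sentinel
        }
        return value.charAt(index);
    }

    public static void main(String[] args) {
        System.out.println(charAt("ABC", 1));
        System.out.println(charAt("ABC", 3) == Character.MIN_VALUE);
    }
}
```

Returning a sentinel instead of throwing lets callers look ahead or behind the current position without guarding every access.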
    • cleanInput

      private String cleanInput(String input)
      Cleans the input.
    • conditionC0

      private boolean conditionC0(String value, int index)
      Complex condition 0 for 'C'.
    • conditionCH0

      private boolean conditionCH0(String value, int index)
      Complex condition 0 for 'CH'.
    • conditionCH1

      private boolean conditionCH1(String value, int index)
      Complex condition 1 for 'CH'.
    • conditionL0

      private boolean conditionL0(String value, int index)
      Complex condition 0 for 'L'.
    • conditionM0

      private boolean conditionM0(String value, int index)
      Complex condition 0 for 'M'.
    • doubleMetaphone

      public String doubleMetaphone(String value)
      Encodes a value with Double Metaphone.
      Parameters:
      value - String to encode.
      Returns:
      An encoded string.
    • doubleMetaphone

      public String doubleMetaphone(String value, boolean alternate)
      Encodes a value with Double Metaphone, optionally using the alternate encoding.
      Parameters:
      value - String to encode.
      alternate - Whether to use the alternate encoding.
      Returns:
      An encoded string.
    • encode

      public Object encode(Object obj) throws EncoderException
      Encodes the value using DoubleMetaphone. This works only if obj is a String (as with Metaphone).
      Specified by:
      encode in interface Encoder
      Parameters:
      obj - Object to encode (should be of type String).
      Returns:
      An encoded Object (will be of type String).
      Throws:
      EncoderException - If the encode parameter is not of type String.
    • encode

      public String encode(String value)
      Encodes the value using DoubleMetaphone.
      Specified by:
      encode in interface StringEncoder
      Parameters:
      value - String to encode.
      Returns:
      An encoded String.
    • getMaxCodeLen

      public int getMaxCodeLen()
      Gets the maxCodeLen.
      Returns:
      the maxCodeLen.
    • handleAEIOUY

      private int handleAEIOUY(DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'A', 'E', 'I', 'O', 'U', and 'Y' cases.
    • handleC

      private int handleC(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'C' cases.
    • handleCC

      private int handleCC(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'CC' cases.
    • handleCH

      private int handleCH(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'CH' cases.
    • handleD

      private int handleD(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'D' cases.
    • handleG

      private int handleG(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
      Handles 'G' cases.
    • handleGH

      private int handleGH(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'GH' cases.
    • handleH

      private int handleH(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'H' cases.
    • handleJ

      private int handleJ(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
      Handles 'J' cases.
    • handleL

      private int handleL(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'L' cases.
    • handleP

      private int handleP(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'P' cases.
    • handleR

      private int handleR(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
      Handles 'R' cases.
    • handleS

      private int handleS(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
      Handles 'S' cases.
    • handleSC

      private int handleSC(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'SC' cases.
    • handleT

      private int handleT(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'T' cases.
    • handleW

      private int handleW(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'W' cases.
    • handleX

      private int handleX(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index)
      Handles 'X' cases.
    • handleZ

      private int handleZ(String value, DoubleMetaphone.DoubleMetaphoneResult result, int index, boolean slavoGermanic)
      Handles 'Z' cases.
    • isDoubleMetaphoneEqual

      public boolean isDoubleMetaphoneEqual(String value1, String value2)
      Tests whether the Double Metaphone values of two String values are equal.
      Parameters:
      value1 - The String whose encoding is the left-hand side of the String.equals(Object) comparison.
      value2 - The String whose encoding is the right-hand side of the String.equals(Object) comparison.
      Returns:
      true if the encoded Strings are equal; false otherwise.
    • isDoubleMetaphoneEqual

      public boolean isDoubleMetaphoneEqual(String value1, String value2, boolean alternate)
      Tests whether the Double Metaphone values of two String values are equal, optionally using the alternate value.
      Parameters:
      value1 - The String whose encoding is the left-hand side of the String.equals(Object) comparison.
      value2 - The String whose encoding is the right-hand side of the String.equals(Object) comparison.
      alternate - Whether to use the alternate value.
      Returns:
      true if the encoded Strings are equal; false otherwise.
    • isSilentStart

      private boolean isSilentStart(String value)
      Tests whether the value starts with a silent letter. Returns true if the value starts with any of 'GN', 'KN', 'PN', 'WR', or 'PS'.
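The prefix test described here can be sketched as below. The prefix list is taken from the description; that the input is already upper-cased is an assumption of this sketch, and SilentStartSketch is a hypothetical name.

```java
final class SilentStartSketch {

    // Prefixes listed in the description above.
    private static final String[] SILENT_START = {"GN", "KN", "PN", "WR", "PS"};

    /** Returns true if value begins with one of the silent-start prefixes. */
    static boolean isSilentStart(String value) {
        for (String prefix : SILENT_START) {
            if (value.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isSilentStart("KNIGHT"));
        System.out.println(isSilentStart("NIGHT"));
    }
}
```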
    • isSlavoGermanic

      private boolean isSlavoGermanic(String value)
      Tests whether a value is of Slavo-Germanic origin. A value is of Slavo-Germanic origin if it contains any of 'W', 'K', 'CZ', or 'WITZ'.
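The rule stated above can be sketched as a simple substring test. This is an illustration of the documented rule rather than the library's source, and it assumes the input has already been upper-cased; SlavoGermanicSketch is a hypothetical name.

```java
final class SlavoGermanicSketch {

    /** Returns true if value contains 'W', 'K', "CZ", or "WITZ". */
    static boolean isSlavoGermanic(String value) {
        return value.indexOf('W') > -1
                || value.indexOf('K') > -1
                || value.contains("CZ")
                || value.contains("WITZ");
    }

    public static void main(String[] args) {
        System.out.println(isSlavoGermanic("KOWALSKI"));
        System.out.println(isSlavoGermanic("SMITH"));
    }
}
```

Note that under this rule any value containing "WITZ" already matches through the 'W' test, so the last check never changes the result.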
    • isVowel

      private boolean isVowel(char ch)
      Tests whether a character is a vowel.
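A vowel test against a fixed lookup string might look like the sketch below. The vowel set "AEIOUY" is an assumption inferred from the letters named for handleAEIOUY; the actual value of the VOWELS field is not shown in this page. Upper-case input is also assumed.

```java
final class VowelSketch {

    // Assumed vowel set, mirroring the letters named for handleAEIOUY.
    private static final String VOWELS = "AEIOUY";

    /** Returns true if ch is one of the assumed vowel characters. */
    static boolean isVowel(char ch) {
        return VOWELS.indexOf(ch) != -1;
    }

    public static void main(String[] args) {
        System.out.println(isVowel('A'));
        System.out.println(isVowel('B'));
    }
}
```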
    • setMaxCodeLen

      public void setMaxCodeLen(int maxCodeLen)
      Sets the maxCodeLen.
      Parameters:
      maxCodeLen - The maxCodeLen to set.