Class PorterStemmer
java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.external.services.nlp.PorterStemmer
PorterStemmer, implementing the Porter Stemming Algorithm
The PorterStemmer class transforms a word into its root form. The input
word can be provided a character at a time (by calling add(char)) or
several characters at once (by calling add(char[], int)); calling stem()
then stems the buffered word.
-
Field Summary
-
Constructor Summary
-
Method Summary
Modifier and Type, Method: Description
void add(char ch): Add a character to the word being stemmed.
void add(char[] w, int wLen): Adds wLen characters, taken from a portion of a char[] array, to the word being stemmed.
private final boolean cons(int i)
private final boolean cvc(int i)
private final boolean doublec(int j)
private final boolean ends
char[] getResultBuffer(): Returns a reference to a character buffer containing the results of the stemming process.
int getResultLength(): Returns the length of the word resulting from the stemming process.
private final int m()
private final void r
private final void setto
void stem(): Stem the word placed into the PorterStemmer buffer through calls to add().
private final void step1()
private final void step2()
private final void step3()
private final void step4()
private final void step5()
private final void step6()
String toString(): After a word has been stemmed, it can be retrieved by toString(), or a reference to the internal buffer can be retrieved by getResultBuffer and getResultLength (which is generally more efficient).
private final boolean vowelinstem()
-
Field Details
-
b
private char[] b -
i
private int i -
i_end
private int i_end -
j
private int j -
k
private int k -
INC
private static final int INC
-
-
Constructor Details
-
PorterStemmer
public PorterStemmer()
-
-
Method Details
-
add
public void add(char ch)
Add a character to the word being stemmed. When you are finished adding characters, call stem() to stem the word.
Parameters:
ch - the character to add
-
add
public void add(char[] w, int wLen)
Adds wLen characters, taken from a portion of a char[] array, to the word being stemmed. This is like repeated calls of add(char ch), but faster.
Parameters:
w - the array containing the characters to add
wLen - the number of characters to add
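The difference between the two add variants can be illustrated with a minimal sketch. AddSketch is a hypothetical stand-in, not the MELT class: it models only an input buffer, assumes a growth increment of 50 for INC (the real value is not documented here), and assumes that the "portion" of w is its first wLen characters.

```java
import java.util.Arrays;

/** Hypothetical stand-in, not the MELT class: models only the input buffer. */
final class AddSketch {
    // Assumed growth increment; the real value of INC is not documented here.
    private static final int INC = 50;
    private char[] b = new char[INC];
    private int i = 0; // number of characters added so far

    /** Appends one character, growing the buffer when it is full. */
    void add(char ch) {
        if (i == b.length) {
            b = Arrays.copyOf(b, i + INC);
        }
        b[i++] = ch;
    }

    /**
     * Appends the first wLen characters of w (an assumption about which
     * portion is used) with one capacity check and one bulk copy.
     */
    void add(char[] w, int wLen) {
        if (i + wLen > b.length) {
            b = Arrays.copyOf(b, i + wLen + INC);
        }
        System.arraycopy(w, 0, b, i, wLen);
        i += wLen;
    }

    /** Returns the characters added so far as a String. */
    String contents() {
        return new String(b, 0, i);
    }
}
```

Both paths leave the same characters in the buffer; the bulk variant simply replaces a per-character capacity check with a single check and a single array copy, which is one plausible reason for the "but faster" remark.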
-
toString
public String toString()
After a word has been stemmed, it can be retrieved by toString(), or a reference to the internal buffer can be retrieved by getResultBuffer and getResultLength (which is generally more efficient).
-
getResultLength
public int getResultLength()
Returns the length of the word resulting from the stemming process.
Returns:
int
-
getResultBuffer
public char[] getResultBuffer()
Returns a reference to a character buffer containing the results of the stemming process. You also need to consult getResultLength() to determine the length of the result.
Returns:
b (char[])
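Because the returned buffer may be longer than the stemmed word, only the first getResultLength() characters should be read. The following is a minimal sketch of how a caller might consume such a (buffer, length) pair; ResultBufferSketch and its methods toWord and matches are hypothetical helpers, not part of the documented class.

```java
/** Hypothetical helpers for consuming a (buffer, length) pair. */
final class ResultBufferSketch {
    private ResultBufferSketch() { }

    /** Copies only the valid prefix of the buffer into a new String. */
    static String toWord(char[] buffer, int length) {
        return new String(buffer, 0, length);
    }

    /** Compares the valid prefix with an expected word without allocating. */
    static boolean matches(char[] buffer, int length, CharSequence expected) {
        if (expected.length() != length) {
            return false;
        }
        for (int idx = 0; idx < length; idx++) {
            if (buffer[idx] != expected.charAt(idx)) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would pass the values returned by getResultBuffer() and getResultLength() after stem() has run. The matches variant avoids creating a String per word, which is presumably the kind of saving the "generally more efficient" remark refers to.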
-
cons
private final boolean cons(int i) -
m
private final int m() -
vowelinstem
private final boolean vowelinstem() -
doublec
private final boolean doublec(int j) -
cvc
private final boolean cvc(int i) -
ends
-
setto
-
r
-
step1
private final void step1() -
step2
private final void step2() -
step3
private final void step3() -
step4
private final void step4() -
step5
private final void step5() -
step6
private final void step6() -
stem
public void stem()
Stem the word placed into the PorterStemmer buffer through calls to add(). This method returns no value; retrieve the result with getResultLength()/getResultBuffer() or toString().
-