Package smile.nlp.collocation
Class Bigram
java.lang.Object
smile.nlp.Bigram
smile.nlp.collocation.Bigram
- All Implemented Interfaces:
 Comparable<Bigram>
Collocations are expressions of multiple words which commonly co-occur.
 A bigram collocation is a pair of words w1 w2 that appear together with
 statistical significance.
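As a rough illustration of what a chi-square collocation score measures, the sketch below computes Pearson's chi-square statistic for a single bigram from a 2x2 contingency table of counts. The class name `ChiSquareSketch`, its method, and its parameters are hypothetical and are not part of Smile. This page does not say how the library derives its score, so the formula is an assumption used for illustration only.

```java
/** Illustrative only: Pearson's chi-square score for one bigram from raw counts. */
final class ChiSquareSketch {
    /**
     * @param c12 number of times w1 is immediately followed by w2
     * @param c1  number of bigrams whose first word is w1
     * @param c2  number of bigrams whose second word is w2
     * @param n   total number of bigrams in the corpus
     */
    static double chiSquare(long c12, long c1, long c2, long n) {
        // Observed 2x2 contingency table.
        double o11 = c12;                // w1 followed by w2
        double o12 = c1 - c12;           // w1 followed by some other word
        double o21 = c2 - c12;           // some other word followed by w2
        double o22 = n - c1 - c2 + c12;  // neither w1 first nor w2 second
        double d = o11 * o22 - o12 * o21;
        double denom = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22);
        // A degenerate table (an empty row or column) carries no evidence.
        return denom == 0.0 ? 0.0 : n * d * d / denom;
    }

    public static void main(String[] args) {
        // w1 and w2 always occur together: the score equals n.
        System.out.println(chiSquare(10, 10, 10, 100));
        // w1 and w2 co-occur exactly as often as chance predicts: the score is 0.
        System.out.println(chiSquare(10, 20, 50, 100));
    }
}
```

A larger score means the observed co-occurrence count departs further from what independence of the two words would predict.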
- 
Field Summary
Fields
 Modifier and Type    Field    Description
 final int            count    The frequency of bigram in the corpus.
 final double         score    The chi-square statistical score of the collocation.
- 
Constructor Summary
Constructors - 
Method Summary
 
- 
Field Details
- 
count
public final int count
The frequency of bigram in the corpus.
- 
score
public final double score
The chi-square statistical score of the collocation.
 - 
 - 
Constructor Details
- 
Bigram
Constructor.
- Parameters:
 w1 - the first word of the bigram.
 w2 - the second word of the bigram.
 count - the frequency of the bigram in the corpus.
 score - the chi-square statistical score of the collocation in the corpus.
 
 - 
 - 
Method Details
- 
toString
 - 
compareTo
- Specified by:
 compareTo in interface Comparable<Bigram>
 - 
of
Finds the top k bigram collocations in the given corpus.
- Parameters:
 corpus - the corpus.
 k - the number of top bigrams to compute.
 minFrequency - the minimum frequency of a bigram in the corpus.
- Returns:
 the significant bigram collocations in descending order of likelihood ratio.
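The selection described above (drop rare bigrams, then keep the k highest-scoring ones in descending order) can be sketched without the library. `TopKSketch` and `ScoredPair` are hypothetical stand-ins, not Smile classes. Treating `minFrequency` as an inclusive lower bound and assuming k is non-negative are both assumptions, since this page does not specify either.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: top-k selection over already-scored word pairs. */
final class TopKSketch {
    /** Hypothetical stand-in for a scored bigram; not the Smile class. */
    static final class ScoredPair {
        final String w1;
        final String w2;
        final int count;
        final double score;

        ScoredPair(String w1, String w2, int count, double score) {
            this.w1 = w1;
            this.w2 = w2;
            this.count = count;
            this.score = score;
        }
    }

    static List<ScoredPair> topK(List<ScoredPair> candidates, int k, int minFrequency) {
        List<ScoredPair> kept = new ArrayList<>();
        for (ScoredPair p : candidates) {
            // Assumption: the frequency bound is inclusive.
            if (p.count >= minFrequency) {
                kept.add(p);
            }
        }
        // Highest score first.
        kept.sort(Comparator.comparingDouble((ScoredPair p) -> p.score).reversed());
        return new ArrayList<>(kept.subList(0, Math.min(k, kept.size())));
    }

    public static void main(String[] args) {
        List<ScoredPair> candidates = Arrays.asList(
            new ScoredPair("new", "york", 12, 95.0),
            new ScoredPair("of", "the", 40, 3.2),
            new ScoredPair("hong", "kong", 2, 120.0),   // too rare for minFrequency = 5
            new ScoredPair("machine", "learning", 9, 60.5));
        for (ScoredPair p : topK(candidates, 2, 5)) {
            System.out.println(p.w1 + " " + p.w2 + " " + p.score);
        }
    }
}
```

The counts and scores in `main` are made-up values chosen only to show the filtering and ordering.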
 
 - 
of
Finds bigram collocations in the given corpus whose p-value is less than the given threshold.
- Parameters:
 corpus - the corpus.
 p - the p-value threshold.
 minFrequency - the minimum frequency of a bigram in the corpus.
- Returns:
 the significant bigram collocations in descending order of likelihood ratio.
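To make the p-value threshold concrete, the sketch below converts a test statistic into a p-value and compares it with the threshold. It assumes the statistic is approximately chi-square distributed with one degree of freedom, which this page does not state. `PValueSketch` and its methods are hypothetical, and `erfc` is a standard rational approximation written out here because the Java standard library does not provide one.

```java
/** Illustrative only: p-value test for a chi-square statistic with 1 degree of freedom. */
final class PValueSketch {
    /** Complementary error function for z >= 0 (Chebyshev-fitted approximation). */
    static double erfc(double z) {
        double t = 1.0 / (1.0 + 0.5 * z);
        return t * Math.exp(-z * z - 1.26551223 + t * (1.00002368 + t * (0.37409196
            + t * (0.09678418 + t * (-0.18628806 + t * (0.27886807 + t * (-1.13520398
            + t * (1.48851587 + t * (-0.82215223 + t * 0.17087277)))))))));
    }

    /** Upper-tail probability of a chi-square variable with 1 degree of freedom. */
    static double pValue(double statistic) {
        if (statistic <= 0.0) {
            return 1.0;
        }
        // For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
        return erfc(Math.sqrt(statistic / 2.0));
    }

    /** True when the statistic's p-value is below the threshold p. */
    static boolean isSignificant(double statistic, double p) {
        return pValue(statistic) < p;
    }

    public static void main(String[] args) {
        System.out.println(pValue(3.841));               // close to 0.05
        System.out.println(isSignificant(10.83, 0.01));  // strong evidence of association
        System.out.println(isSignificant(1.0, 0.05));    // weak evidence
    }
}
```

A smaller threshold p keeps fewer bigrams, because only pairs with a larger statistic pass the test.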
 
 
 -