Class Word2VecConfiguration
java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_ml.python.Word2VecConfiguration
The configuration for the word2vec calculation.
Field Summary
Modifier and Type | Field | Description
private int | epochs | The epochs to be used for the training.
static final int | EPOCHS_DEFAULT | Default for epochs parameter.
private int | iterations | Iterations during the word2vec training.
static final int | ITERATIONS_DEFAULT | Default value for parameter iterations.
private static org.slf4j.Logger | LOGGER | Default logger.
static final int | MIN_COUNT_DEFAULT | Default for parameter minCount.
private int | minCount | The minimum count for the word2vec training.
private int | negatives | The number of negatives during the word2vec training.
static final int | NEGATIVES_DEFAULT | Default value for parameter negatives.
private int | numberOfThreads | The number of threads to be used for the computation.
private double | sample | Documentation of parameter from the gensim documentation: "The threshold for configuring which higher-frequency words are randomly downsampled, useful range is (0, 1e-5)."
static final double | SAMPLE_DEFAULT | Default for sample parameter.
private Word2VecType | type | Model type.
static final int | VECTOR_DIMENSION_DEFAULT | Default value for parameter vectorDimension.
private int | vectorDimension | Size of the vector.
static final int | WINDOW_SIZE_DEFAULT | Default value for parameter windowSize.
private int | windowSize | The size of the window during the word2vec training.
Constructor Summary
Constructor | Description
Word2VecConfiguration() | Default Constructor.
Word2VecConfiguration(Word2VecType type) | Constructor
Word2VecConfiguration(Word2VecType type, int vectorDimension) | Constructor
Word2VecConfiguration(Word2VecType type, int vectorDimension, int iterations) | Constructor
Method Summary
Modifier and Type | Method
int | getEpochs()
int | getIterations()
int | getMinCount()
int | getNegatives()
int | getNumberOfThreads()
double | getSample()
Word2VecType | getType()
int | getVectorDimension()
int | getWindowSize()
void | setEpochs(int epochs)
void | setIterations(int iterations)
void | setMinCount(int minCount)
void | setNegatives(int negatives)
void | setNumberOfThreads(int numberOfThreads)
void | setSample(double sample)
void | setType(Word2VecType type)
void | setVectorDimension(int vectorDimension)
void | setWindowSize(int windowSize)
Field Details

LOGGER
private static org.slf4j.Logger LOGGER
Default logger.

type
private Word2VecType type
Model type. Default type: SG.

vectorDimension
private int vectorDimension
Size of the vector. Default: 200.

VECTOR_DIMENSION_DEFAULT
public static final int VECTOR_DIMENSION_DEFAULT
Default value for parameter vectorDimension.

windowSize
private int windowSize
The size of the window during the word2vec training. Default: 5.

WINDOW_SIZE_DEFAULT
public static final int WINDOW_SIZE_DEFAULT
Default value for parameter windowSize.

iterations
private int iterations
Iterations during the word2vec training.

ITERATIONS_DEFAULT
public static final int ITERATIONS_DEFAULT
Default value for parameter iterations.

negatives
private int negatives
The number of negatives during the word2vec training. Default: 5.

NEGATIVES_DEFAULT
public static final int NEGATIVES_DEFAULT
Default value for parameter negatives.

minCount
private int minCount
The minimum count for the word2vec training.

MIN_COUNT_DEFAULT
public static final int MIN_COUNT_DEFAULT
Default for parameter minCount.

numberOfThreads
private int numberOfThreads
The number of threads to be used for the computation.

sample
private double sample
Documentation of parameter from the gensim documentation: "The threshold for configuring which higher-frequency words are randomly downsampled, useful range is (0, 1e-5)."

SAMPLE_DEFAULT
public static final double SAMPLE_DEFAULT
Default for sample parameter.

epochs
private int epochs
The epochs to be used for the training.

EPOCHS_DEFAULT
public static final int EPOCHS_DEFAULT
Default for epochs parameter.
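The description of the sample field quotes a "useful range" of (0, 1e-5). A minimal sketch of an advisory check against that range is shown below. The class SampleRangeSketch and the method inUsefulSampleRange are hypothetical names introduced for illustration and are not part of MELT or gensim. A value outside the range is not necessarily invalid, so the check only reports and does not reject.

```java
class SampleRangeSketch {
    // Hypothetical helper (not a MELT or gensim API): returns true if the value
    // lies strictly inside the range (0, 1e-5) quoted from the gensim documentation.
    static boolean inUsefulSampleRange(double sample) {
        return sample > 0.0 && sample < 1e-5;
    }

    public static void main(String[] args) {
        // Print the advisory result for two candidate thresholds.
        System.out.println("1e-6 in useful range: " + inUsefulSampleRange(1e-6));
        System.out.println("1e-3 in useful range: " + inUsefulSampleRange(1e-3));
    }
}
```

Such a check could be run on a value before it is handed to setSample(double), for example to log a warning.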
Constructor Details

Word2VecConfiguration
public Word2VecConfiguration()
Default constructor. All parameters take their default values, for example training type SG.

Word2VecConfiguration
public Word2VecConfiguration(Word2VecType type)
Constructor
Parameters:
type - Training type (SG/CBOW).

Word2VecConfiguration
public Word2VecConfiguration(Word2VecType type, int vectorDimension)
Constructor
Parameters:
type - Training type (SG/CBOW).
vectorDimension - Size of the vectors (number of elements).

Word2VecConfiguration
public Word2VecConfiguration(Word2VecType type, int vectorDimension, int iterations)
Constructor
Parameters:
type - Training type (SG/CBOW).
vectorDimension - Size of the vectors (number of elements).
iterations - Number of training iterations (also known as epochs).
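The constructors above could be used roughly as in the following sketch. So that the snippet compiles without MELT on the classpath, it declares minimal stand-ins for Word2VecType and Word2VecConfiguration that mirror only the signatures and defaults documented on this page (type SG, vector dimension 200). The stand-in bodies, the constructor chaining and the initial value of iterations are assumptions, not the library's implementation.

```java
class ConstructorSketch {
    public static void main(String[] args) {
        // Default constructor: documented defaults (SG, vector dimension 200).
        Word2VecConfiguration defaults = new Word2VecConfiguration();
        // Fully specified: CBOW, 100-dimensional vectors, 10 iterations.
        Word2VecConfiguration cbow = new Word2VecConfiguration(Word2VecType.CBOW, 100, 10);

        System.out.println(defaults.getType() + " " + defaults.getVectorDimension());
        System.out.println(cbow.getType() + " " + cbow.getVectorDimension()
                + " " + cbow.getIterations());
    }
}

// Stand-in for the MELT enum; assumed to offer the two training types named on this page.
enum Word2VecType { SG, CBOW }

// Stand-in for the MELT class; only the members used above are mirrored.
class Word2VecConfiguration {
    static final int VECTOR_DIMENSION_DEFAULT = 200; // default stated on this page

    private Word2VecType type = Word2VecType.SG;     // default stated on this page
    private int vectorDimension = VECTOR_DIMENSION_DEFAULT;
    private int iterations; // real default is not stated on this page; left at 0 here

    Word2VecConfiguration() { }

    Word2VecConfiguration(Word2VecType type) {
        this.type = type;
    }

    Word2VecConfiguration(Word2VecType type, int vectorDimension) {
        this(type);
        this.vectorDimension = vectorDimension;
    }

    Word2VecConfiguration(Word2VecType type, int vectorDimension, int iterations) {
        this(type, vectorDimension);
        this.iterations = iterations;
    }

    public Word2VecType getType() { return type; }
    public int getVectorDimension() { return vectorDimension; }
    public int getIterations() { return iterations; }
}
```

In a real project the two stand-ins would be deleted and the classes imported from de.uni_mannheim.informatik.dws.melt.matching_ml.python instead.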
Method Details

getNumberOfThreads
public int getNumberOfThreads()

setNumberOfThreads
public void setNumberOfThreads(int numberOfThreads)

getNegatives
public int getNegatives()

setNegatives
public void setNegatives(int negatives)

getIterations
public int getIterations()

setIterations
public void setIterations(int iterations)

getWindowSize
public int getWindowSize()

setWindowSize
public void setWindowSize(int windowSize)

getVectorDimension
public int getVectorDimension()

setVectorDimension
public void setVectorDimension(int vectorDimension)

getMinCount
public int getMinCount()

setMinCount
public void setMinCount(int minCount)

getType
public Word2VecType getType()

setType
public void setType(Word2VecType type)

getSample
public double getSample()

setSample
public void setSample(double sample)

getEpochs
public int getEpochs()

setEpochs
public void setEpochs(int epochs)
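The accessors listed above follow the usual getter/setter pattern. The sketch below adjusts a configuration after construction. It declares a reduced stand-in for Word2VecConfiguration so that it compiles on its own. Only the window size and negatives defaults (both 5) come from this page. The stand-in body and the initial thread count are assumptions, and choosing one thread per available core is an illustrative choice, not a documented MELT default.

```java
class SetterSketch {
    public static void main(String[] args) {
        Word2VecConfiguration config = new Word2VecConfiguration();
        // Illustrative choice: one worker thread per available core.
        config.setNumberOfThreads(Runtime.getRuntime().availableProcessors());
        // Widen the context window from the documented default of 5.
        config.setWindowSize(10);

        System.out.println("window=" + config.getWindowSize()
                + " negatives=" + config.getNegatives()
                + " threads=" + config.getNumberOfThreads());
    }
}

// Reduced stand-in for the MELT class; mirrors only the accessors used above.
class Word2VecConfiguration {
    private int windowSize = 5;  // default stated on this page
    private int negatives = 5;   // default stated on this page
    private int numberOfThreads; // default not stated on this page; 0 here

    public int getWindowSize() { return windowSize; }
    public void setWindowSize(int windowSize) { this.windowSize = windowSize; }

    public int getNegatives() { return negatives; }
    public void setNegatives(int negatives) { this.negatives = negatives; }

    public int getNumberOfThreads() { return numberOfThreads; }
    public void setNumberOfThreads(int numberOfThreads) { this.numberOfThreads = numberOfThreads; }
}
```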