All Implemented Interfaces:
Filter, IMatcher<org.apache.jena.ontology.OntModel,Alignment,Properties>, eu.sealsproject.platform.res.domain.omt.IOntologyMatchingToolBridge, eu.sealsproject.platform.res.tool.api.IPlugin, eu.sealsproject.platform.res.tool.api.IToolBridge
Direct Known Subclasses:
TransformersFineTunerHpSearch

public class TransformersFineTuner extends TransformersBaseFineTuner implements Filter
This class is used to fine-tune a transformer model based on a generated dataset. On every call to the match method, training data is generated and appended to a temporary file. When the TransformersBaseFineTuner.finetuneModel() method is called, a model is fine-tuned and the training file is deleted.
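The lifecycle described above can be sketched in plain Java with a temporary file, independent of the MELT API. All class and method names below are illustrative stand-ins, not the real `TransformersFineTuner` signatures:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class FineTuneLifecycle {
    private final Path trainingFile;

    public FineTuneLifecycle() {
        try {
            // temporary training file, filled incrementally by match calls
            this.trainingFile = Files.createTempFile("training", ".csv");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stand-in for match(...): appends one generated training example. */
    public void match(String textLeft, String textRight, int label) {
        try {
            Files.writeString(trainingFile,
                    textLeft + "," + textRight + "," + label + System.lineSeparator(),
                    StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stand-in for finetuneModel(): "trains", then deletes the training file. */
    public Path finetuneModel() {
        try {
            int examples = Files.readAllLines(trainingFile).size();
            System.out.println("fine-tuning on " + examples + " examples");
            Files.delete(trainingFile); // the training file is deleted afterwards
            return Path.of("models", "finetuned"); // the final model directory
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public Path getTrainingFile() {
        return trainingFile;
    }
}
```

The point of the sketch is the ordering constraint: all match calls accumulate into the same file, and fine-tuning consumes and removes it, so further match calls start a fresh training set.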
  • Field Details

    • LOGGER

      private static final org.slf4j.Logger LOGGER
    • batchSizeOptimization

      protected BatchSizeOptimization batchSizeOptimization
  • Constructor Details

  • Method Details

    • finetuneModel

      public File finetuneModel(File trainingFile) throws Exception
      Description copied from class: TransformersBaseFineTuner
Fine-tunes a given model with the provided text in the CSV file (three columns: first text, second text, label (0/1)).
      Specified by:
      finetuneModel in class TransformersBaseFineTuner
      Parameters:
      trainingFile - csv file with three columns: first text, second text, label(0/1) (can be generated with TransformersBaseFineTuner.createTrainingFile(OntModel, OntModel, Alignment) )
      Returns:
      the final location (directory) of the finetuned model (which is also given in the constructor)
      Throws:
      Exception - in case of any error
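A minimal sketch of the CSV layout finetuneModel expects: three columns per row (first text, second text, 0/1 label). In MELT this file is normally produced by TransformersBaseFineTuner.createTrainingFile(...); the writer below is only illustrative and not part of the library:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class TrainingFileFormat {

    /** Writes rows of (first text, second text, label) as a three-column CSV. */
    public static Path writeTrainingFile(List<String[]> rows) {
        try {
            Path file = Files.createTempFile("training", ".csv");
            StringBuilder sb = new StringBuilder();
            for (String[] row : rows) {
                sb.append(row[0]).append(',')   // first text
                  .append(row[1]).append(',')   // second text
                  .append(row[2])               // label: 0 (no match) or 1 (match)
                  .append(System.lineSeparator());
            }
            Files.writeString(file, sb.toString());
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Helper to read the file back without checked exceptions. */
    public static List<String> readLines(Path file) {
        try {
            return Files.readAllLines(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that this naive writer does not escape commas or quotes inside the texts; a real training file generator would need proper CSV quoting.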
    • getMaximumPerDeviceTrainBatchSize

      public int getMaximumPerDeviceTrainBatchSize()
This function tries to execute the training with one step to determine the maximum possible per_device_train_batch_size. It starts with 2 and checks only powers of 2. It uses the data collected by running this fine-tuner on test cases. If you have a file (comma separated), you can use getMaximumPerDeviceTrainBatchSize(java.io.File).
      Returns:
      the maximum per_device_train_batch_size with the current configuration
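The powers-of-2 search described above can be illustrated with a self-contained sketch. The class and the `oneStepSucceeds` predicate are assumptions standing in for the real one-step training run, not MELT API:

```java
import java.util.function.IntPredicate;

public class BatchSizeSearch {

    /**
     * Doubles the batch size starting at 2 and returns the largest
     * power of 2 for which one training step still succeeds
     * (0 if even a batch size of 2 fails).
     */
    public static int maxBatchSize(IntPredicate oneStepSucceeds) {
        int lastGood = 0;
        for (int batch = 2; batch > 0; batch *= 2) { // powers of 2: 2, 4, 8, ...
            if (!oneStepSucceeds.test(batch)) {
                break; // e.g. a simulated out-of-memory error
            }
            lastGood = batch;
        }
        return lastGood;
    }

    public static void main(String[] args) {
        // Pretend the GPU runs out of memory above a batch size of 16.
        int max = maxBatchSize(batch -> batch <= 16);
        System.out.println("maximum per_device_train_batch_size: " + max);
    }
}
```

Because only powers of 2 are probed, the result is a lower bound: a GPU that could handle a batch size of 24 would still report 16.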
    • getMaximumPerDeviceTrainBatchSize

      public int getMaximumPerDeviceTrainBatchSize(File trainingFile)
This function tries to execute the training with one step to determine the maximum possible per_device_train_batch_size. It starts with 2 and checks only powers of 2. It uses the data collected by running this fine-tuner on test cases.
      Parameters:
      trainingFile - the training file to use
      Returns:
      the maximum per_device_train_batch_size with the current configuration
    • addTrainingParameterToMakeTrainingFaster

      public void addTrainingParameterToMakeTrainingFaster()
This will add (potentially multiple) training parameters to the current trainingArguments to make the training faster. Therefore, do not change the trainingArguments object afterwards (with setTrainingArguments ). What you can do is add more training arguments via getTrainingArguments.addParameter (as long as you do not modify any parameters added by this method). The following parameters are set:
      • fp16
      See Also:
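The ordering constraint above can be sketched with a plain map standing in for the real trainingArguments object (the map, the overwrite check, and all names here are assumptions, not MELT API): first let the tuner set its speed-up parameters, then only add further parameters without touching them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TrainingArgumentsSketch {
    private final Map<String, Object> trainingArguments = new LinkedHashMap<>();

    /** Stand-in for addTrainingParameterToMakeTrainingFaster(). */
    public void addTrainingParameterToMakeTrainingFaster() {
        trainingArguments.put("fp16", true); // mixed precision speeds up training
    }

    /** Stand-in for getTrainingArguments().addParameter(...): refuses overwrites. */
    public void addParameter(String key, Object value) {
        if (trainingArguments.containsKey(key)) {
            throw new IllegalStateException("would overwrite a speed-up parameter: " + key);
        }
        trainingArguments.put(key, value);
    }

    public Map<String, Object> getTrainingArguments() {
        return trainingArguments;
    }
}
```

Usage: call addTrainingParameterToMakeTrainingFaster() once, then add your own parameters; replacing the whole arguments object (the analogue of setTrainingArguments) would silently drop fp16.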
    • isAdjustMaxBatchSize

      public boolean isAdjustMaxBatchSize()
Returns whether the maximum batch size is adjusted or not.
      Returns:
      true if the batch size is modified.
    • setAdjustMaxBatchSize

      public void setAdjustMaxBatchSize(boolean adjustMaxBatchSize)
If set to true, the upper bound of the search space for the training batch size is set to the maximum that is possible with the current GPU/CPU.
      Parameters:
      adjustMaxBatchSize - true to enable the adjustment
    • getBatchSizeOptimization

      public BatchSizeOptimization getBatchSizeOptimization()
    • setBatchSizeOptimization

      public void setBatchSizeOptimization(BatchSizeOptimization batchSizeOptimization)