mwptoolkit.model.PreTrain.robertagen
- class mwptoolkit.model.PreTrain.robertagen.RobertaGen(config, dataset)
Bases: Module
- Reference:
Liu et al. “RoBERTa: A Robustly Optimized BERT Pretraining Approach”.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- calculate_loss(batch_data: dict) → float
Run the forward pass, compute the loss, and perform back-propagation.
- Parameters
batch_data (dict) – one batch of data.
- Returns
the loss value for the batch.
- Return type
float
- forward(seq, target=None, output_all_layers=False) → Tuple[Tensor, Tensor, Dict[str, Any]]
- Parameters
seq (torch.Tensor) – input sequence, shape: [batch_size, seq_length].
target (torch.Tensor | None) – target sequence, shape: [batch_size, target_length].
output_all_layers (bool) – if True, also return the outputs of all layers. Default: False.
- Returns
token_logits, shape: [batch_size, output_length, output_size]; symbol_outputs, shape: [batch_size, output_length]; model_all_outputs.
- Return type
tuple(torch.Tensor, torch.Tensor, dict)
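The two returned tensors differ only in the last axis: token_logits carries one score per vocabulary entry, while symbol_outputs carries one symbol index per output step. The sketch below illustrates that shape relationship with plain nested lists, under the assumption that symbols are chosen greedily (highest score at each step). The source does not state how RobertaGen actually selects symbols, and `greedy_symbols` is a hypothetical helper, not part of the toolkit.

```python
from typing import List


def greedy_symbols(token_logits: List[List[List[float]]]) -> List[List[int]]:
    """Pick the highest-scoring vocabulary index at every output step.

    Input shape:  [batch_size, output_length, output_size]
    Output shape: [batch_size, output_length]
    """
    return [
        [max(range(len(step)), key=step.__getitem__) for step in sample]
        for sample in token_logits
    ]


# Toy scores: batch_size=2, output_length=2, output_size=3.
token_logits = [
    [[0.1, 2.0, -1.0], [0.5, 0.2, 3.0]],
    [[1.5, 0.0, 0.3], [0.0, 0.9, 0.1]],
]
symbol_outputs = greedy_symbols(token_logits)
print(symbol_outputs)  # [[1, 2], [0, 1]]
```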
- model_test(batch_data: dict) → tuple
Test the model on one batch.
- Parameters
batch_data (dict) – one batch of data.
- Returns
the predicted equation and the target equation.
- Return type
tuple(list, list)
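Since model_test returns predictions alongside their targets, the pair can be scored directly. The sketch below assumes the two lists are parallel, with one token list per problem, and computes a simple exact-match rate. `exact_match_accuracy` is a hypothetical helper, and exact token matching is only one possible metric; it is not claimed to be the toolkit's own evaluation method.

```python
from typing import List, Sequence


def exact_match_accuracy(
    predicted: Sequence[List[str]], target: Sequence[List[str]]
) -> float:
    """Fraction of problems whose predicted equation equals the target token for token."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if not predicted:
        return 0.0
    hits = sum(1 for p, t in zip(predicted, target) if p == t)
    return hits / len(predicted)


# Toy equations in prefix notation, standing in for a model_test result.
predicted = [["+", "1", "2"], ["*", "3", "4"]]
target = [["+", "1", "2"], ["-", "3", "4"]]
print(exact_match_accuracy(predicted, target))  # 0.5
```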
- predict(batch_data: dict, output_all_layers=False)
Predict samples without targets.
- Parameters
batch_data (dict) – one batch of data.
output_all_layers (bool) – whether to return the outputs of all model layers.
- Returns
token_logits, symbol_outputs, all_layer_outputs.
- training: bool