mwptoolkit.model.PreTrain.bertgen

class mwptoolkit.model.PreTrain.bertgen.BERTGen(config, dataset)[source]

Bases: Module

Reference:

Devlin et al. “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

calculate_loss(batch_data: dict) → float[source]

Runs the forward pass, computes the loss, and performs back-propagation.

Parameters

batch_data (dict) – one batch of data.

Returns

loss value.

Return type

float
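
A minimal training-step sketch. The dataloader and optimizer names are hypothetical, not part of this API; note that calculate_loss already performs back-propagation internally, so only the optimizer step follows it:

    import torch

    # Hypothetical loop: `dataloader` yields batch dicts in the format
    # the toolkit expects, `model` is a constructed BERTGen instance.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    for batch_data in dataloader:
        optimizer.zero_grad()
        # calculate_loss runs the forward pass, computes the loss,
        # and calls backward() internally (per the docstring above).
        loss = model.calculate_loss(batch_data)
        optimizer.step()
        print(f"loss: {loss:.4f}")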

convert_idx2symbol(outputs, num_lists)[source]
decode(output)[source]
decode_(outputs)[source]
decoder_forward(encoder_outputs, source_padding_mask, target=None, output_all_layers=None)[source]
encoder_forward(seq, output_all_layers=False)[source]
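
As a rough sketch of how forward() decomposes into the two stages above. The padding convention and the return structures of encoder_forward and decoder_forward are assumptions, not documented on this page:

    import torch

    seq = torch.randint(1, 100, (4, 32))   # [batch_size, seq_length], dummy token ids
    pad_idx = 0                            # assumed padding token id
    source_padding_mask = seq.eq(pad_idx)  # assumed: True marks padding positions

    # encoder_forward is assumed to return the encoder hidden states
    # (plus per-layer outputs when output_all_layers=True).
    encoder_outputs = model.encoder_forward(seq)

    # decoder_forward is assumed to return the same triple as forward().
    token_logits, symbol_outputs, _ = model.decoder_forward(
        encoder_outputs, source_padding_mask, target=None
    )
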
forward(seq, target=None, output_all_layers=False) → Tuple[Tensor, Tensor, Dict[str, Any]][source]
Parameters
  • seq (torch.Tensor) – input sequence, shape: [batch_size, seq_length].

  • target (torch.Tensor | None) – target, shape: [batch_size, target_length].

  • output_all_layers (bool) – return outputs of all layers if True; default False.

Returns

token_logits: [batch_size, output_length, output_size], symbol_outputs: [batch_size, output_length], model_all_outputs.

Return type

tuple(torch.Tensor, torch.Tensor, dict)
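
For example, a teacher-forced forward call might look like this (dummy shapes and token ids; the vocabulary size is a placeholder):

    import torch

    batch_size, seq_length, target_length = 4, 32, 16
    seq = torch.randint(0, 100, (batch_size, seq_length))
    target = torch.randint(0, 100, (batch_size, target_length))

    token_logits, symbol_outputs, all_outputs = model(
        seq, target=target, output_all_layers=True
    )
    # token_logits:   [batch_size, output_length, output_size]
    # symbol_outputs: [batch_size, output_length]
    # all_outputs:    dict of per-layer outputs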

model_test(batch_data: dict) → tuple[source]

Tests the model on one batch of data.

Parameters

batch_data (dict) – one batch of data.

Returns

predicted equation, target equation.

Return type

tuple(list, list)
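
A sketch of evaluating one batch. The batch source is hypothetical, and the exact equation formats depend on the dataset:

    # `batch_data` is one batch dict from the toolkit's test dataloader.
    predicted, target = model.model_test(batch_data)
    for pred_eq, tgt_eq in zip(predicted, target):
        print("predicted:", pred_eq)
        print("target:   ", tgt_eq)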

predict(batch_data: dict, output_all_layers=False)[source]

Predicts samples without target equations.

Parameters
  • batch_data (dict) – one batch data.

  • output_all_layers (bool) – return all layer outputs of the model if True; default False.

Returns

token_logits, symbol_outputs, all_layer_outputs
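
A sketch of inference without targets (hypothetical batch source):

    token_logits, symbol_outputs, all_layer_outputs = model.predict(
        batch_data, output_all_layers=True
    )
    # symbol_outputs holds the predicted symbol ids; convert_idx2symbol
    # (above) maps them back to tokens, given the batch's number lists.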

training: bool