mwptoolkit.model.Seq2Seq.ept

class mwptoolkit.model.Seq2Seq.ept.EPT(config, dataset)[source]

Bases: Module

Reference:

Kim et al. “Point to the Expression: Solving Algebraic Word Problems using the Expression-Pointer Transformer Model” in EMNLP 2020.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

calculate_loss(batch_data: dict) → float[source]

Performs the forward pass, computes the loss, and back-propagates.

Parameters

batch_data – one batch data.

Returns

loss value.

batch_data should include the keywords ‘question’, ‘ques len’, ‘equation’, ‘ques mask’, ‘num pos’, ‘num size’ and ‘max numbers’.
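
A minimal training-loop sketch (hypothetical: the optimizer choice, learning rate and train_dataloader are assumptions; config and dataset are assumed to be prepared by mwptoolkit). Since calculate_loss already back-propagates internally, no explicit backward() call appears:

    import torch

    model = EPT(config, dataset)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # assumed setup

    for batch_data in train_dataloader:
        # batch_data: dict with keys 'question', 'ques len', 'equation',
        # 'ques mask', 'num pos', 'num size' and 'max numbers'
        optimizer.zero_grad()
        loss = model.calculate_loss(batch_data)  # forward + loss + backward
        optimizer.step()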

convert_idx2symbol(output, num_list)[source]

Convert output indices to symbols, filling in number tokens from num_list. Assumes batch_size = 1.

decode(output)[source]

decoder_forward(encoder_output, text_num, text_numpad, src_mask, target=None, output_all_layers=False)[source]

encoder_forward(src, src_mask, output_all_layers=False)[source]

forward(src, src_mask, num_pos, num_size, target=None, output_all_layers=False)[source]

Parameters
  • src (torch.Tensor) – input sequence.

  • src_mask (list) – mask of input sequence.

  • num_pos (list) – positions of the numbers in the input sequence.

  • num_size (list) – the count of numbers in each input sequence.

  • target (torch.Tensor) – target sequence, default None.

  • output_all_layers (bool) – return outputs of all layers if True; default False.

Returns

token_logits: [batch_size, output_length, output_size], symbol_outputs: [batch_size, output_length], model_all_outputs.
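
A usage sketch for forward (assumes a model constructed as in the calculate_loss example; the input tensors below are placeholders built by the dataloader):

    # src: [batch_size, seq_length] token ids; src_mask, num_pos, num_size
    # describe the masks and number positions of each problem
    token_logits, symbol_outputs, _ = model(src, src_mask, num_pos, num_size)
    # token_logits:   [batch_size, output_length, output_size]
    # symbol_outputs: [batch_size, output_length]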

gather_vectors(hidden: Tensor, mask: Tensor, max_len: int = 1)[source]

Gather hidden states of indicated positions.

Parameters
  • hidden (torch.Tensor) – Float Tensor of hidden states. Shape [B, S, H], where B = batch size, S = length of sequence, and H = hidden dimension

  • mask (torch.Tensor) – Long Tensor indicating the number positions we are interested in. Shape [B, S].

  • max_len (int) – Expected maximum length of vectors per batch. 1 by default.

Return type

Tuple[torch.Tensor, torch.Tensor]

Returns

Tuple of Tensors:

  • [0]: Float Tensor of the indicated hidden states. Shape [B, N, H], where N = max(number of indicated positions, max_len).

  • [1]: Bool Tensor of padded positions. Shape [B, N].
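
An illustrative reimplementation of the documented behavior (a sketch, not the library's actual code; it assumes that nonzero entries of mask mark the positions of interest):

    import torch
    from typing import Tuple

    def gather_vectors_sketch(hidden: torch.Tensor, mask: torch.Tensor,
                              max_len: int = 1) -> Tuple[torch.Tensor, torch.Tensor]:
        batch, _, hdim = hidden.shape
        counts = [(mask[b] != 0).sum().item() for b in range(batch)]
        n = max(max(counts), max_len)
        gathered = hidden.new_zeros(batch, n, hdim)
        padded = torch.ones(batch, n, dtype=torch.bool, device=hidden.device)
        for b in range(batch):
            idx = (mask[b] != 0).nonzero(as_tuple=True)[0]
            gathered[b, :len(idx)] = hidden[b, idx]  # copy the indicated states
            padded[b, :len(idx)] = False             # these slots hold real data
        return gathered, padded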

model_test(batch_data: dict) → tuple[source]

Model test.

Parameters

batch_data – one batch data.

Returns

predicted equation, target equation.

batch_data should include the keywords ‘question’, ‘equation’, ‘ques mask’, ‘num pos’ and ‘num size’.
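
A hedged evaluation sketch (test_dataloader is an assumption; the model is constructed as in the calculate_loss example):

    for batch_data in test_dataloader:
        predicted_equation, target_equation = model.model_test(batch_data)
        # compare the two symbol sequences, e.g. for equation accuracy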

out_expression_expr(item, num_list)[source]
out_expression_op(item, num_list)[source]

predict(batch_data: dict, output_all_layers=False)[source]

Predict samples without a target.

Parameters
  • batch_data (dict) – one batch data.

  • output_all_layers (bool) – return all layer outputs of the model; default False.

Returns

token_logits, symbol_outputs, all_layer_outputs
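
A usage sketch (assumes a constructed model and a batch built as in training; with output_all_layers=True the per-layer outputs are returned as well):

    token_logits, symbol_outputs, all_layer_outputs = model.predict(
        batch_data, output_all_layers=True)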

shift_target(target: Tensor, fill_value=-1) → Tensor[source]

Shift matrix to build generation targets.

Parameters
  • target (torch.Tensor) – Target tensor to build generation targets. Shape [B, T]

  • fill_value – Value to be filled at the padded positions.

Return type

torch.Tensor

Returns

Tensor of shape [B, T] whose (i, j)-th entry is the (i, j+1)-th entry of the input target tensor; the last column is filled with fill_value.
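
A small worked example of the documented shift (an illustrative reimplementation, not the library's code):

    import torch

    def shift_target_sketch(target: torch.Tensor, fill_value=-1) -> torch.Tensor:
        # entry (i, j) of the result is entry (i, j+1) of target;
        # the last column is filled with fill_value
        pad = target.new_full((target.size(0), 1), fill_value)
        return torch.cat([target[:, 1:], pad], dim=1)

    t = torch.tensor([[5, 6, 7],
                      [8, 9, 10]])
    print(shift_target_sketch(t))
    # tensor([[ 6,  7, -1],
    #         [ 9, 10, -1]])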

training: bool

mwptoolkit.model.Seq2Seq.ept.Submodule_types(decoder_type)[source]