mwptoolkit.model.Seq2Seq.ept¶
- class mwptoolkit.model.Seq2Seq.ept.EPT(config, dataset)[source]¶
Bases:
Module
- Reference:
Kim et al. “Point to the Expression: Solving Algebraic Word Problems using the Expression-Pointer Transformer Model” in EMNLP 2020.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- calculate_loss(batch_data: dict) float [source]¶
Run forward propagation, compute the loss, and perform back-propagation.
- Parameters
batch_data – one batch data.
- Returns
loss value.
batch_data should include keywords ‘question’, ‘ques len’, ‘equation’, ‘ques mask’, ‘num pos’, ‘num size’ and ‘max numbers’.
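The keyword list above implies a batch layout like the following (a hypothetical sketch: the key names come from the docstring, but all shapes and values here are illustrative, not taken from the library):

```python
# Hypothetical batch_data layout implied by the required keywords.
# All ids, lengths and positions below are made-up example values.
batch_data = {
    "question": [[12, 45, 7, 0]],   # token ids, [batch_size, seq_len] (padded)
    "ques len": [3],                # true sequence lengths before padding
    "equation": [[2, 9, 4]],        # target equation token ids
    "ques mask": [[1, 1, 1, 0]],    # 1 = real token, 0 = padding
    "num pos": [[1, 2]],            # positions of numbers in each question
    "num size": [2],                # number count per problem
    "max numbers": 2,               # maximum number count in the batch
}
```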
- decoder_forward(encoder_output, text_num, text_numpad, src_mask, target=None, output_all_layers=False)[source]¶
- forward(src, src_mask, num_pos, num_size, target=None, output_all_layers=False)[source]¶
- Parameters
src (torch.Tensor) – input sequence.
src_mask (list) – mask of input sequence.
num_pos (list) – number position of input sequence.
num_size (list) – number of numbers of input sequence.
target (torch.Tensor) – target, default None.
output_all_layers (bool) – return output of all layers if output_all_layers is True, default False.
- Returns
token_logits: [batch_size, output_length, output_size], symbol_outputs: [batch_size, output_length], model_all_outputs.
- gather_vectors(hidden: Tensor, mask: Tensor, max_len: int = 1)[source]¶
Gather hidden states of indicated positions.
- Parameters
hidden (torch.Tensor) – Float Tensor of hidden states. Shape [B, S, H], where B = batch size, S = length of sequence, and H = hidden dimension
mask (torch.Tensor) – Long Tensor which indicates number indices that we’re interested in. Shape [B, S].
max_len (int) – Expected maximum length of vectors per batch. 1 by default.
- Return type
Tuple[torch.Tensor, torch.Tensor]
- Returns
Tuple of Tensors:
- [0]: Float Tensor of indicated hidden states. Shape [B, N, H], where N = max(number of interested positions, max_len).
- [1]: Bool Tensor of padded positions. Shape [B, N].
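The gather-and-pad behaviour described above can be sketched in plain Python (a hypothetical reimplementation over nested lists rather than the library's tensor code; `gather_vectors` here is an illustrative stand-in, not the actual method):

```python
# Sketch of the gather-and-pad logic: pick hidden states at masked
# positions, then pad every batch item to a common length N.
# hidden: [B, S, H] nested lists; mask: [B, S] with 1 at positions of interest.
def gather_vectors(hidden, mask, max_len=1):
    # N = max(number of marked positions in any batch item, max_len)
    n = max(max_len, max(sum(row) for row in mask))
    hidden_dim = len(hidden[0][0])
    gathered, pad_flags = [], []
    for states, row in zip(hidden, mask):
        picked = [s for s, m in zip(states, row) if m == 1]
        pad_count = n - len(picked)
        # pad with zero vectors; True marks padded slots
        gathered.append(picked + [[0.0] * hidden_dim] * pad_count)
        pad_flags.append([False] * len(picked) + [True] * pad_count)
    return gathered, pad_flags
```

For example, with one batch item of three hidden states and a mask marking the last two positions, the result keeps exactly those two vectors with no padding; if only one position were marked and `max_len=2`, a zero vector and a `True` pad flag would fill the second slot.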
- model_test(batch_data: dict) tuple [source]¶
Model test.
- Parameters
batch_data – one batch data.
- Returns
predicted equation, target equation.
batch_data should include keywords ‘question’, ‘equation’, ‘ques mask’, ‘num pos’, ‘num size’.
- predict(batch_data: dict, output_all_layers=False)[source]¶
Predict samples without target equations.
- Parameters
batch_data (dict) – one batch data.
output_all_layers (bool) – return all layer outputs of model.
- Returns
token_logits, symbol_outputs, all_layer_outputs
- shift_target(target: Tensor, fill_value=-1) Tensor [source]¶
Shift matrix to build generation targets.
- Parameters
target (torch.Tensor) – Target tensor to build generation targets. Shape [B, T]
fill_value – Value to be filled at the padded positions.
- Return type
torch.Tensor
- Returns
Tensor with shape [B, T], where the (i, j) entry is the (i, j+1) entry of the target tensor.
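The shift described above amounts to dropping the first column and filling the vacated last column with fill_value. A minimal pure-Python sketch over nested lists (illustrative, not the library's tensor implementation):

```python
def shift_target(target, fill_value=-1):
    # The (i, j) entry of the result is the (i, j+1) entry of target;
    # the last column of each row is filled with fill_value.
    return [row[1:] + [fill_value] for row in target]

shift_target([[1, 5, 9]])  # -> [[5, 9, -1]]
```

This is the standard way to build generation targets from a decoder input sequence: the model at step j is trained to predict the token at step j+1.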
- training: bool¶