mwptoolkit.model.Seq2Seq.transformer¶
- class mwptoolkit.model.Seq2Seq.transformer.Transformer(config, dataset)[source]¶
Bases: Module
- Reference:
Vaswani et al. “Attention Is All You Need”.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
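The constructor takes the toolkit's configuration and dataset objects. A minimal instantiation sketch, assuming `config` and `dataset` have already been built by mwptoolkit's own data-preparation pipeline (they are not created here); the method sketches below reuse this `model`:

```python
from mwptoolkit.model.Seq2Seq.transformer import Transformer

# Hypothetical: `config` and `dataset` are assumed to come from mwptoolkit's
# own pipeline; they carry the vocabulary sizes, special-token ids and
# hyperparameters that the constructor reads.
model = Transformer(config, dataset)
```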
- calculate_loss(batch_data: dict) → float [source]¶
Run the forward pass, compute the loss, and perform back-propagation.
- Parameters
batch_data (dict) – one batch of data.
- Returns
loss value.
batch_data should include the keys ‘question’ and ‘equation’.
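A hedged usage sketch: the key contents, vocabulary sizes and shapes below are illustrative assumptions, not the toolkit's exact preprocessing output, and `model` comes from the instantiation sketch above.

```python
import torch

batch_data = {
    "question": torch.randint(0, 100, (4, 20)),  # [batch_size, seq_length], illustrative ids
    "equation": torch.randint(0, 30, (4, 12)),   # [batch_size, target_length], illustrative ids
}
loss = model.calculate_loss(batch_data)  # forward pass, loss, back-propagation
print(loss)  # a plain float
```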
- forward(src, target=None, output_all_layers=False) → Tuple[Tensor, Tensor, Dict[str, Any]] [source]¶
- Parameters
src (torch.Tensor) – input sequence, shape: [batch_size, seq_length].
target (torch.Tensor|None) – target, shape: [batch_size, target_length], default None.
output_all_layers (bool) – whether to additionally return the outputs of all layers; default False.
- Returns
token_logits, symbol_outputs, model_all_outputs.
- Return type
tuple(torch.Tensor, torch.Tensor, dict)
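For illustration, direct calls to forward; the shapes follow the parameter descriptions above, while the vocabulary sizes and the `model` object are assumptions carried over from the earlier sketches.

```python
import torch

src = torch.randint(0, 100, (4, 20))  # [batch_size, seq_length]
tgt = torch.randint(0, 30, (4, 12))   # [batch_size, target_length]

# With a target: decoding is conditioned on the target during training.
token_logits, symbol_outputs, _ = model(src, target=tgt)

# Without a target: the model decodes freely; also collect per-layer outputs.
token_logits, symbol_outputs, model_all_outputs = model(src, output_all_layers=True)
```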
- model_test(batch_data: dict) → tuple [source]¶
Evaluate the model on one batch of test data.
- Parameters
batch_data (dict) – one batch of data.
- Returns
predicted equation, target equation.
batch_data should include the keys ‘question’, ‘equation’ and ‘num list’.
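A sketch of one test step; the ‘num list’ entry (the numbers extracted from each problem text) and all other batch contents are illustrative assumptions, as is `model` from the sketches above.

```python
import torch

batch_data = {
    "question": torch.randint(0, 100, (4, 20)),
    "equation": torch.randint(0, 30, (4, 12)),
    "num list": [[3, 5], [2, 7], [10, 4], [6, 9]],  # per-problem numbers, illustrative
}
predicted_equation, target_equation = model.model_test(batch_data)
```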
- predict(batch_data: dict, output_all_layers=False)[source]¶
Predict outputs for a batch of samples without targets.
- Parameters
batch_data (dict) – one batch of data.
output_all_layers (bool) – whether to return all layer outputs of the model.
- Returns
token_logits, symbol_outputs, all_layer_outputs
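And a target-free sketch, e.g. for inference on unseen problems; `model` and the batch contents remain assumptions carried over from the sketches above.

```python
import torch

batch_data = {"question": torch.randint(0, 100, (4, 20))}  # illustrative ids
token_logits, symbol_outputs, all_layer_outputs = model.predict(
    batch_data, output_all_layers=True
)
```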
- training: bool¶