mwptoolkit.utils.preprocess_tool.number_transfer
- mwptoolkit.utils.preprocess_tool.number_transfer.get_num_pos(input_seq, mask_type, pattern)
- mwptoolkit.utils.preprocess_tool.number_transfer.num_transfer_alg514(data, mask_type, equ_split_symbol=';', vocab_level='word', word_lower=False)
- mwptoolkit.utils.preprocess_tool.number_transfer.num_transfer_draw(data, mask_type, equ_split_symbol=';', vocab_level='word', word_lower=False)
- mwptoolkit.utils.preprocess_tool.number_transfer.num_transfer_hmwp(data, mask_type, equ_split_symbol=';', vocab_level='word', word_lower=False)
- mwptoolkit.utils.preprocess_tool.number_transfer.num_transfer_multi(data, mask_type, equ_split_symbol=';', vocab_level='word', word_lower=False)
- mwptoolkit.utils.preprocess_tool.number_transfer.number_transfer(datas, dataset_name, task_type, mask_type, min_generate_keep, linear_dataset, equ_split_symbol=';', vocab_level='word', word_lower=False) → Tuple[list, list, int, list]
Run number transfer on a dataset.
- Parameters
datas (list) – dataset.
dataset_name (str) – dataset name.
task_type (str) – [single_equation | multi_equation], task type.
mask_type –
min_generate_keep (int) – count threshold; generate numbers whose count is greater than this value are kept in the output symbols.
linear_dataset (bool) –
equ_split_symbol (str) – the symbol that separates equations in a multiple-equation dataset. This symbol will be replaced with the special token SpecialTokens.BRG.
vocab_level (str) –
word_lower (bool) –
- Returns
processed data, generate number list, copy number, unk symbol list.
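The min_generate_keep and equ_split_symbol descriptions above can be illustrated with a small standalone sketch. It does not call mwptoolkit and is not the library's implementation: the helper names keep_generate_numbers and replace_split_symbol are hypothetical, and the string "<BRG>" is only an assumed placeholder for SpecialTokens.BRG.

```python
from collections import Counter

# Assumed placeholder for SpecialTokens.BRG; the real token string may differ.
BRG = "<BRG>"


def keep_generate_numbers(counts, min_generate_keep):
    """Hypothetical helper: keep numbers whose count is strictly greater
    than min_generate_keep, as the parameter description states."""
    return sorted(num for num, c in counts.items() if c > min_generate_keep)


def replace_split_symbol(equation_tokens, equ_split_symbol=";"):
    """Hypothetical helper: swap the equation split symbol for the
    bridge token in a tokenized multi-equation target."""
    return [BRG if tok == equ_split_symbol else tok for tok in equation_tokens]


# Made-up counts of constants observed across a dataset's equations.
counts = Counter(["1", "1", "1", "3.14", "2", "2"])
kept = keep_generate_numbers(counts, min_generate_keep=1)

# A made-up two-equation target joined by ';'.
tokens = replace_split_symbol(
    ["x", "+", "y", "=", "10", ";", "x", "-", "y", "=", "2"]
)

print(kept)
print(tokens)
```

With a threshold of 1, the constant that occurs only once is dropped, while the two that occur more often are kept. The split symbol in the equation is replaced in place, so the token count of the equation does not change.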
- mwptoolkit.utils.preprocess_tool.number_transfer.number_transfer_ape200k(data, mask_type, linear, vocab_level='word', word_lower=False)
- mwptoolkit.utils.preprocess_tool.number_transfer.number_transfer_asdiv_a(data, mask_type, linear, vocab_level='word', word_lower=False)
- mwptoolkit.utils.preprocess_tool.number_transfer.number_transfer_math23k(data, mask_type, linear, vocab_level='word', word_lower=False)
- mwptoolkit.utils.preprocess_tool.number_transfer.number_transfer_mawps(data, mask_type, linear, vocab_level='word', word_lower=False)
- mwptoolkit.utils.preprocess_tool.number_transfer.number_transfer_mawps_single(data, mask_type, linear, vocab_level='word', word_lower=False)
- mwptoolkit.utils.preprocess_tool.number_transfer.number_transfer_single(data, mask_type, linear, vocab_level='word', word_lower=False)
- mwptoolkit.utils.preprocess_tool.number_transfer.number_transfer_svamp(data, mask_type, linear, vocab_level='word', word_lower=False)
- mwptoolkit.utils.preprocess_tool.number_transfer.seg_and_tag_ape200k(st, nums_fraction, nums)
- mwptoolkit.utils.preprocess_tool.number_transfer.seg_and_tag_asdiv_a(st, nums_fraction, nums)
- mwptoolkit.utils.preprocess_tool.number_transfer.seg_and_tag_math23k(st, nums_fraction, nums)
- mwptoolkit.utils.preprocess_tool.number_transfer.seg_and_tag_mawps(st, nums_fraction, nums)
- mwptoolkit.utils.preprocess_tool.number_transfer.seg_and_tag_mawps_single(st, nums_fraction, nums)
- mwptoolkit.utils.preprocess_tool.number_transfer.seg_and_tag_multi(st, nums_fraction, nums)