Mirror of https://github.com/syssec-utd/pylingual.git (synced 2026-05-11 02:40:13 -07:00)
seq2seq
- train_tokenizer_auto.py:
  - trains the manual tokenizer
- tokenize_seq2seq.py:
  - tokenizes the dataset for the seq2seq model
- train_seq2seq.py:
  - fine-tunes the pretrained model
  - produces a sequence-to-sequence translation model
- StatementConfiguration.py:
  - defines the JSON format for statement translation training
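
As a rough illustration of what tokenizing a dataset for a seq2seq model involves, the sketch below turns (source, target) text pairs into integer ID sequences using a toy whitespace vocabulary. It does not reproduce `tokenize_seq2seq.py`: the function names, the special tokens, and the sample bytecode/statement pair are all hypothetical, and a real pipeline would use the trained tokenizer instead.

```python
from typing import Dict, List, Tuple

# Hypothetical special tokens; the real tokenizer may define different ones.
SPECIAL_TOKENS = ["<pad>", "<unk>", "<s>", "</s>"]


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Assign an integer ID to every whitespace-separated token."""
    vocab = {tok: i for i, tok in enumerate(SPECIAL_TOKENS)}
    for text in texts:
        for tok in text.split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Wrap the token IDs in start/end markers; unknown tokens map to <unk>."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in text.split()]
    return [vocab["<s>"]] + ids + [vocab["</s>"]]


def tokenize_pairs(
    pairs: List[Tuple[str, str]], vocab: Dict[str, int]
) -> List[Dict[str, List[int]]]:
    """Encode each (source, target) pair as model inputs and labels."""
    return [
        {"input_ids": encode(src, vocab), "labels": encode(tgt, vocab)}
        for src, tgt in pairs
    ]


# Hypothetical example pair: a bytecode-like source and its Python statement.
pairs = [("LOAD_CONST 1 STORE_NAME x", "x = 1")]
vocab = build_vocab([text for pair in pairs for text in pair])
dataset = tokenize_pairs(pairs, vocab)
```

Each entry in `dataset` holds the encoded source under `input_ids` and the encoded target under `labels`, which is the general shape a seq2seq trainer expects.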
manual1
Contains JSON files mapping bytecode instructions and their configurations, for use in training.
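
A minimal sketch for inspecting these files, assuming they sit in a directory named `manual1` relative to the working directory and use a `.json` extension. The helper name `load_json_dir` is hypothetical, and nothing is assumed about the schema inside each file.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_json_dir(directory: Path) -> Dict[str, Any]:
    """Return {file stem: parsed JSON} for every *.json file in `directory`."""
    return {
        path.stem: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(directory.glob("*.json"))
    }


if __name__ == "__main__":
    # "manual1" is an assumed relative path; adjust it to the checkout location.
    configs = load_json_dir(Path("manual1"))
    for name, content in configs.items():
        # Print only the top-level type, since the schema is not documented here.
        print(name, type(content).__name__)
```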