# seq2seq
- train_tokenizer_auto.py:
    - trains the manual tokenizer
- tokenize_seq2seq.py:
    - tokenizes the dataset for the seq2seq model
- train_seq2seq.py:
    - fine-tunes the pretrained model
    - creates a sequence-to-sequence translation model
- StatementConfiguration.py:
    - defines the JSON format for statement translation training
# manual1
Contains JSON files that map bytecode instructions and their configurations for use in training.
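
To get a quick look at the JSON files in this directory, something like the sketch below may help. It assumes only that the directory holds `*.json` files and makes no assumption about their schema. `load_json_configs` and `summarize` are hypothetical helper names used for illustration, not part of the scripts above.

```python
import json
from pathlib import Path


def load_json_configs(directory):
    """Return {file name: parsed JSON} for every *.json file in `directory`."""
    configs = {}
    # Sort so the result order is stable across platforms.
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            configs[path.name] = json.load(handle)
    return configs


def summarize(configs):
    """Return (file name, top-level type name, entry count) for each file."""
    summary = []
    for name, data in configs.items():
        # Scalars have no length, so count them as a single entry.
        size = len(data) if isinstance(data, (dict, list)) else 1
        summary.append((name, type(data).__name__, size))
    return summary
```

Calling `summarize(load_json_configs("manual1"))` from this directory (the relative path is an assumption) would list each file with its top-level type and entry count, without interpreting the contents.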