pylingual/dev_scripts/statement/README.md

# seq2seq

- train_tokenizer_auto.py:
  - trains the manual tokenizer

- tokenize_seq2seq.py:
  - tokenizes the dataset for the seq2seq model

- train_seq2seq.py:
  - fine-tunes the pretrained model
  - produces a sequence-to-sequence translation model

- StatementConfiguration.py:
  - defines the JSON format for statement translation training
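
To illustrate what a JSON-backed training configuration can look like, here is a minimal sketch. It is not the actual schema from StatementConfiguration.py: the class name `ExampleStatementConfig` and every field name and value below are hypothetical and chosen only for illustration. Check StatementConfiguration.py for the real keys.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ExampleStatementConfig:
    """Hypothetical config; field names are assumptions, not the real schema."""

    base_model: str        # name or path of the pretrained model to fine-tune
    tokenizer_dir: str     # where the trained tokenizer is stored
    dataset_dir: str       # where the tokenized dataset is stored
    output_dir: str        # where the fine-tuned model is written
    num_epochs: int = 1    # optional, defaults to a single epoch
    batch_size: int = 8    # optional

    @classmethod
    def from_json(cls, text: str) -> "ExampleStatementConfig":
        # Unknown keys raise TypeError, which catches typos in the JSON early.
        return cls(**json.loads(text))


# Hypothetical JSON document; all values are placeholders.
example_json = """
{
    "base_model": "example-pretrained-model",
    "tokenizer_dir": "out/tokenizer",
    "dataset_dir": "out/tokenized",
    "output_dir": "out/model",
    "num_epochs": 3
}
"""

config = ExampleStatementConfig.from_json(example_json)
print(json.dumps(asdict(config), indent=2))
```

Loading the JSON into a dataclass keeps the required keys explicit and gives optional keys a single place for their defaults.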

# manual1

Contains JSON files mapping bytecode instructions and their configurations for use in training.