The Whisper model is pre-trained on 96 languages, which means the pre-trained tokenizer already has a vast vocabulary encompassing many thousands of words. I would recommend that you leverage this pre-trained tokenizer directly rather than training a new one. Why? Because then we can also leverage all of the pre-trained Whisper weights directly: if we keep the pre-trained tokenizer, we can load all of the weights (and so all of the knowledge!). If we build a new tokenizer instead, we have to randomly initialise some of the Whisper weights to match the new vocabulary, meaning we lose some of the knowledge from pre-training. The model also quickly learns which parts of the pre-trained tokenizer's vocabulary it needs during fine-tuning. My advice would be to follow this blog post for fine-tuning the model.
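As a minimal sketch of what "keeping the pre-trained tokenizer" looks like in practice (not taken from the original post; the model checkpoint and language here are example choices), you can load the processor and model straight from the Hugging Face Hub so that no weights need to be re-initialised:

```python
# Sketch: reuse the pre-trained Whisper tokenizer rather than training a new one.
# "openai/whisper-small" and language="hindi" are illustrative choices.
from transformers import WhisperProcessor, WhisperForConditionalGeneration

# The processor bundles the feature extractor and the pre-trained tokenizer.
processor = WhisperProcessor.from_pretrained(
    "openai/whisper-small", language="hindi", task="transcribe"
)

# Because the tokenizer (and hence the output vocabulary) is unchanged,
# every pre-trained weight loads as-is; nothing is randomly initialised.
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# The multilingual vocabulary already covers the target language's text.
ids = processor.tokenizer("नमस्ते दुनिया").input_ids
print(processor.tokenizer.decode(ids, skip_special_tokens=True))
```

Swapping in a freshly trained tokenizer would change the output vocabulary size, forcing the embedding and output projection layers to be re-initialised, which is exactly the loss of pre-trained knowledge described above.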