Understanding model evolution during pretraining using the JET Expansion framework.
The pretraining dynamics of large language models reveal how a model's behavior changes as it processes increasing amounts of data. With JET Expansion, we can trace these dynamics without requiring additional data or training, and visualize how specific aspects of the model's computation evolve over the course of pretraining. For example, we can track when concepts emerge, study how individual layers behave at different pretraining stages, and compare pretraining strategies to evaluate their impact on model behavior.
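To make the idea of data-free analysis concrete, here is a toy sketch of the underlying expansion principle, not the actual JET Expansion API: for a linear residual network, the end-to-end map factors into a sum over "paths" (subsets of layers whose weights are taken instead of the identity skip connection), and the strength of each path can be measured directly from checkpoint weights, with no input data. The checkpoint weights below are randomly generated placeholders.

```python
import numpy as np
from itertools import combinations

def path_terms(weights):
    # Toy sketch (not the actual JET Expansion API): for a linear
    # residual net y = (I + W_L) ... (I + W_1) x, expand the product
    # into a sum over "paths" -- subsets of layers whose weight matrix
    # is taken instead of the identity skip connection.
    d = weights[0].shape[0]
    terms = {}
    for r in range(len(weights) + 1):
        for subset in combinations(range(len(weights)), r):
            term = np.eye(d)
            for i in subset:              # ascending layer order
                term = weights[i] @ term  # later layers act on the left
            terms[subset] = term
    return terms

def path_strengths(weights):
    # Spectral norm of each path's term: a data-free proxy for how
    # strongly that path contributes at a given checkpoint.
    return {s: np.linalg.norm(t, 2) for s, t in path_terms(weights).items()}

# Hypothetical "checkpoints": same architecture, weights scaled up
# to mimic growth over training (real checkpoints would be loaded
# from saved pretraining snapshots).
rng = np.random.default_rng(0)
base = [rng.standard_normal((4, 4)) * 0.1 for _ in range(3)]
early = path_strengths(base)
late = path_strengths([3.0 * W for W in base])
# Comparing `early` and `late` shows multi-layer paths gaining
# strength fastest -- a toy analogue of tracking layer behavior
# across pretraining stages.
```

Because the full product of residual blocks equals the sum of all path terms, the decomposition is exact here; tracking per-path strengths across checkpoints is what "tracing dynamics without additional data or training" looks like in this simplified setting.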