Pretraining Dynamics Analysis

Understanding model evolution during pretraining using the JET Expansion framework.

Overview

The pretraining dynamics of large language models reveal how a model's behavior changes as it processes increasing amounts of data. Using JET Expansion, we can trace these dynamics without requiring additional data or training, which lets us visualize and analyze how specific aspects of the model's computation evolve over the course of pretraining. For example, we can track the emergence of concepts, study how individual layers behave at different pretraining stages, and compare pretraining strategies to evaluate their impact on model behavior. A minimal sketch of this kind of checkpoint sweep follows below.
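The sketch below is not the JET Expansion implementation itself; it only illustrates the general workflow of probing the same quantity across publicly released intermediate pretraining checkpoints. It assumes EleutherAI's Pythia checkpoints on Hugging Face (the model name, step list, and prompt are illustrative choices) and uses a simple per-layer logit-lens readout as a stand-in for the JET-style analysis.

```python
# Sketch: probe per-layer predictions across pretraining checkpoints.
# Assumptions: EleutherAI Pythia checkpoints exposed as Hugging Face
# revisions ("step1000", ...); a plain logit-lens readout stands in for
# the JET Expansion analysis. Requires `torch` and `transformers`.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "EleutherAI/pythia-70m"        # assumed checkpoint family
STEPS = [1000, 16000, 64000, 143000]   # assumed pretraining steps to compare
PROMPT = "The capital of France is"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
inputs = tokenizer(PROMPT, return_tensors="pt")

for step in STEPS:
    # Each revision is a snapshot of the same model at a different point
    # in pretraining, so the same probe can be compared across steps.
    model = AutoModelForCausalLM.from_pretrained(MODEL, revision=f"step{step}")
    model.eval()
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)

    # Project each layer's residual stream through the final norm and the
    # unembedding matrix, then read off the top token at the last position.
    unembed = model.get_output_embeddings().weight      # [vocab, hidden]
    final_norm = model.gpt_neox.final_layer_norm
    print(f"step {step}:")
    for layer, hidden in enumerate(out.hidden_states):
        logits = final_norm(hidden[0, -1]) @ unembed.T
        top = tokenizer.decode(logits.argmax().item())
        print(f"  layer {layer:2d} -> {top!r}")
```

Running this prints, for each chosen pretraining step, the token each layer would predict next, giving a rough picture of when and where a given behavior emerges during training.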

Explore the Streamlit App

If the app doesn't load, click here.

Contact

For more information or inquiries, please reach out via email.