Pretraining Dynamics Analysis

Understanding model evolution during pretraining using the JET Expansion framework.

Overview

The pretraining dynamics of large language models reveal how a model's behavior changes as it processes increasing amounts of data. Using JET Expansion, we can trace these dynamics without requiring additional data or training, which lets us visualize and analyze how specific aspects of the model's computation evolve over the course of pretraining. For example, we can track the emergence of concepts, study how individual layers behave at different pretraining stages, and compare pretraining strategies to evaluate their impact on model behavior. A minimal sketch of this kind of checkpoint sweep follows below.
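The sketch below is not the JET Expansion implementation itself; it only illustrates the general workflow of probing the same quantity across publicly released intermediate pretraining checkpoints. It assumes EleutherAI's Pythia checkpoints on Hugging Face (the model name, step list, and prompt are illustrative choices) and uses a simple per-layer logit-lens readout as a stand-in for the JET-style analysis.

```python
# Sketch: probe per-layer predictions across pretraining checkpoints.
# Assumptions: EleutherAI Pythia checkpoints exposed as Hugging Face
# revisions ("step1000", ...); a plain logit-lens readout stands in for
# the JET Expansion analysis. Requires `torch` and `transformers`.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "EleutherAI/pythia-70m"        # assumed checkpoint family
STEPS = [1000, 16000, 64000, 143000]   # assumed pretraining steps to compare
PROMPT = "The capital of France is"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
inputs = tokenizer(PROMPT, return_tensors="pt")

for step in STEPS:
    # Each revision is a snapshot of the same model at a different point
    # in pretraining, so the same probe can be compared across steps.
    model = AutoModelForCausalLM.from_pretrained(MODEL, revision=f"step{step}")
    model.eval()
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)

    # Project each layer's residual stream through the final norm and the
    # unembedding matrix, then read off the top token at the last position.
    unembed = model.get_output_embeddings().weight      # [vocab, hidden]
    final_norm = model.gpt_neox.final_layer_norm
    print(f"step {step}:")
    for layer, hidden in enumerate(out.hidden_states):
        logits = final_norm(hidden[0, -1]) @ unembed.T
        top = tokenizer.decode(logits.argmax().item())
        print(f"  layer {layer:2d} -> {top!r}")
```

Running this prints, for each chosen pretraining step, the token each layer would predict next, giving a rough picture of when and where a given behavior emerges during training.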

Explore the Streamlit App

If the app doesn't load, click here.

Contact

For more information or inquiries, please reach out via email.