EMNLP 2024
Sparse Autoencoders (SAEs) have gained popularity as a tool for enhancing the interpretability of Large Language Models (LLMs). However, training SAEs can be computationally intensive, especially as model complexity grows. In this study, we explore the potential of transfer learning to accelerate SAE training by capitalizing on the shared representations found across adjacent layers of LLMs. Our experimental results demonstrate that fine-tuning SAEs from models pre-trained on nearby layers not only maintains but often improves the quality of the learned representations, while significantly accelerating convergence. These findings indicate that the strategic reuse of pre-trained SAEs is a promising approach, particularly in settings where computational resources are constrained.
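As a rough illustration of the idea (a minimal sketch, not the exact training recipe from the paper; the `SparseAutoencoder` class, the `transfer_finetune` helper, and all hyperparameters below are hypothetical), the transfer amounts to warm-starting an SAE for one layer from the weights of an SAE trained on an adjacent layer and then fine-tuning it on the new layer's activations:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """A standard SAE: an overcomplete dictionary trained with an L1 sparsity penalty."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        z = torch.relu(self.encoder(x))      # sparse feature activations
        return self.decoder(z), z

def transfer_finetune(sae_prev: SparseAutoencoder,
                      activations: torch.Tensor,
                      steps: int = 1000,
                      l1_coeff: float = 1e-3,
                      lr: float = 1e-4) -> SparseAutoencoder:
    """Fine-tune a copy of an SAE trained on an adjacent layer on the target layer's activations."""
    sae = SparseAutoencoder(sae_prev.encoder.in_features,
                            sae_prev.encoder.out_features)
    sae.load_state_dict(sae_prev.state_dict())          # warm start from the nearby layer
    opt = torch.optim.Adam(sae.parameters(), lr=lr)
    for _ in range(steps):
        batch = activations[torch.randint(0, activations.shape[0], (256,))]
        recon, z = sae(batch)
        loss = ((recon - batch) ** 2).mean() + l1_coeff * z.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return sae
```

Only the fine-tuning loop is sketched here; the warm start from the adjacent layer's SAE is the part responsible for the faster convergence described above.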
Meta Llama User Day
The Good Scientist was invited to present at the Meta Llama User Day in recognition of our platform's innovative use of Llama 3.
2024 British Conference of Undergraduate Research
We test the autoencoder asset pricing models of Gu, Kelly, and Xiu (GKX, 2019) on a dataset that is two orders of magnitude smaller than the one they used and has higher dimensionality: 123 predictive characteristics versus the original 94. It is also more geographically diverse: it comes from Qi4M and therefore includes EMEA-based securities, whereas the original data from the Center for Research in Security Prices (CRSP) are US-based. We fit the model on both the original and the Qi4M dataset and probe the robustness of its performance on the more challenging data by comparing the respective R² values. We also examine the degree to which the larger number of predictive characteristics increases the model's tendency to overfit the training data.
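For readers unfamiliar with the model class, the sketch below is a deliberately simplified conditional autoencoder in the spirit of GKX and is not the architecture we actually fit; the class name, layer sizes, and the uncentered total-R² definition are illustrative assumptions on our part.

```python
import torch
import torch.nn as nn

class ConditionalAutoencoder(nn.Module):
    """Simplified conditional autoencoder: a beta network maps firm characteristics
    to factor loadings, and a linear factor network maps characteristic-managed
    portfolio returns to latent factors."""
    def __init__(self, n_chars: int, n_factors: int, hidden: int = 32):
        super().__init__()
        self.beta_net = nn.Sequential(
            nn.Linear(n_chars, hidden), nn.ReLU(),
            nn.Linear(hidden, n_factors),
        )
        self.factor_net = nn.Linear(n_chars, n_factors)

    def forward(self, chars, returns):
        # chars: (N, n_chars) lagged characteristics; returns: (N,) excess returns
        betas = self.beta_net(chars)                     # (N, n_factors) loadings
        ports = chars.T @ returns / chars.shape[0]       # (n_chars,) managed-portfolio returns
        factors = self.factor_net(ports)                 # (n_factors,) latent factors
        return betas @ factors                           # (N,) fitted returns

def total_r2(model, chars, returns):
    """Uncentered total R-squared of fitted vs. realized returns."""
    with torch.no_grad():
        fitted = model(chars, returns)
    resid = returns - fitted
    return 1.0 - (resid ** 2).sum().item() / (returns ** 2).sum().item()
```

The comparison referred to above then amounts to fitting one model per dataset and contrasting `total_r2` on held-out periods for the CRSP-fitted and Qi4M-fitted models.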
2023 Advanced Computing User Day
Presented how we use the SURF supercomputer to accelerate our research in the field of AI.