Speaking

My work has been presented at conferences around the world, and I have been interviewed by the likes of Google DeepMind.

Conferences

Accelerating Sparse Autoencoder Training via Layer-Wise Transfer Learning in Large Language Models

EMNLP 2024

Sparse Autoencoders (SAEs) have gained popularity as a tool for enhancing the interpretability of Large Language Models (LLMs). However, training SAEs can be computationally intensive, especially as model complexity grows. This study explores the potential of transfer learning to accelerate SAE training by capitalizing on the shared representations found across adjacent layers of LLMs. Our experimental results demonstrate that fine-tuning SAEs from pretrained SAEs of nearby layers not only maintains but often improves the quality of learned representations, while significantly accelerating convergence. These findings indicate that the strategic reuse of pretrained SAEs is a promising approach, particularly in settings where computational resources are constrained.

Llama3 at TGS

Meta Llama User Day

The Good Scientist was invited to present at the Meta Llama User Day for our platform's innovative use of Llama3.

Applications of Autoencoder Asset Pricing Models to a Highly Dimensional Cross-Section

2024 British Conference of Undergraduate Research

We test the autoencoder asset pricing models of Kelly, Gu, and Xiu (KGX, 2019) on a dataset that is two orders of magnitude smaller than the one they used and has higher dimensionality: 123 variables, as opposed to 94 in the original dataset. It is also more geographically diverse: it comes from Qi4M and therefore includes EMEA-based securities, whereas the original dataset, from the Center for Research in Security Prices (CRSP), covers US-based securities. We fit the model on both the original and the Qi4M datasets and probe the robustness of its performance on challenging data by comparing the respective R² values. Lastly, we check the degree to which the increase in the number of predictive characteristics affects the model's tendency to overfit the training dataset.

TGS x SURF

2023 Advanced Computing User Day

Presented our usage of the SURF supercomputer to accelerate our impact in the field of AI.

Interviews