Discussion about this post

Matteo Santelmo

Hey Ludovico! First of all, thanks for the super interesting content you're sharing.
It's not entirely clear to me how you actually interpret the sparse vector computed by the encoder. From my understanding, the SAE generates a very high-dimensional, very sparse representation of the contextual embedding of whatever the LLM receives as input, but I don't really understand how you get to say that the n-th feature of the sparse encoding is related to a certain topic/concept/word. Do they just brute-force this by feeding in very diverse text and observing the sparse encodings? Thanks in advance :)
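For readers wondering what that "brute-force" loop might look like in practice, here is a minimal sketch of the approach the comment describes: run a large, diverse corpus through the SAE encoder and inspect the contexts where a given feature fires hardest (its max-activating examples). This assumes PyTorch; `encode`, `top_activating_examples`, `W_enc`, and `b_enc` are illustrative names, not from the post.

```python
# A minimal sketch of the "diverse text + inspect" approach described in
# the comment above, assuming a trained SAE and PyTorch. All names here
# (encode, top_activating_examples, W_enc, b_enc) are illustrative.
import torch

def encode(x: torch.Tensor, W_enc: torch.Tensor, b_enc: torch.Tensor) -> torch.Tensor:
    """Sparse SAE features for a batch of LLM activations.

    x:     (batch, d_model)  residual-stream activations at some layer
    W_enc: (d_model, d_sae)  encoder weights, with d_sae >> d_model
    b_enc: (d_sae,)          encoder bias
    """
    return torch.relu(x @ W_enc + b_enc)  # ReLU keeps most features at exactly 0

def top_activating_examples(feature_idx: int,
                            features: torch.Tensor,
                            tokens: list[str],
                            k: int = 10) -> list[tuple[str, float]]:
    """Rank token positions in a corpus by how strongly they fire one feature.

    features: (n_tokens, d_sae) sparse codes from `encode`, one row per token
    tokens:   n_tokens token strings aligned with the rows of `features`
    Reading the top-k contexts is how a feature earns a human-readable label,
    e.g. "fires on legal boilerplate" or "fires on base64 strings".
    """
    scores = features[:, feature_idx]
    top = torch.topk(scores, k=k).indices.tolist()
    return [(tokens[i], scores[i].item()) for i in top]
```

In published SAE work, inspecting max-activating examples like this is typically the first step toward assigning a feature a human label, sometimes followed by automated labeling or activation-patching checks.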

