Stars
Sparse Autoencoder for Mechanistic Interpretability
Baidicoot / sparse_coding
Forked from HoagyC/sparse_coding
Work on sparse coding, replicating and extending the sparse coding approach to taking transformer features out of superposition.
Sparse probing paper full code.
loganriggs / sparse_coding
Forked from HoagyC/sparse_coding
Keeping language models honest by directly eliciting knowledge encoded in their activations.
A library for mechanistic interpretability of GPT-style language models
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Multiversal tree writing interface for human-AI collaboration