I ran an fMRI on LLMs: a concept is a direction, not a region
The study applied fMRI-like techniques to LLMs to map how meaning is organized inside them. Its main finding is that a concept is not stored in a dedicated region of neurons but as a direction in activation space: a particular weighted combination of many neurons, read out by projecting the activation vector onto that direction. This contrasts with the brain, where categories are localized to specific regions. In LLMs, concepts are distributed and superposed, so a single neuron contributes to many concepts and no single neuron cleanly encodes one. Researchers designing and training models should therefore treat concepts as distributed directions rather than look for them in individual neurons or regions.
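To make "a direction, not a region" concrete, here is a minimal sketch on synthetic data. It is not the study's method: the activations are random numbers, the planted direction `true_dir` and the shift size `4.0` are assumptions chosen for illustration, and difference of class means is just one common way to estimate a concept direction. The point it illustrates is that projecting onto the direction separates "concept present" from "concept absent" well, while even the best single neuron barely does.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_samples = 64, 500

# Hypothetical ground truth: a unit-length concept direction spread evenly
# across all neurons (every neuron carries a small part of the signal).
true_dir = rng.choice([-1.0, 1.0], size=n_neurons) / np.sqrt(n_neurons)

# Synthetic "activations": pure noise without the concept, and noise shifted
# along the concept direction with it. The shift size 4.0 is an assumption.
acts_without = rng.normal(size=(n_samples, n_neurons))
acts_with = rng.normal(size=(n_samples, n_neurons)) + 4.0 * true_dir

# Estimate the concept direction as the normalized difference of class means.
diff = acts_with.mean(axis=0) - acts_without.mean(axis=0)
direction = diff / np.linalg.norm(diff)


def balanced_accuracy(scores_with, scores_without):
    """Classify by thresholding at the midpoint between the two class means."""
    threshold = (scores_with.mean() + scores_without.mean()) / 2.0
    hits_with = (scores_with > threshold).mean()
    hits_without = (scores_without <= threshold).mean()
    return 0.5 * (hits_with + hits_without)


# Read the concept out by projecting every activation onto the direction.
proj_acc = balanced_accuracy(acts_with @ direction, acts_without @ direction)

# Compare with reading it out from one neuron at a time (sign-corrected so
# that a larger value always means "more concept").
signs = np.sign(direction)
neuron_accs = [
    balanced_accuracy(acts_with[:, i] * signs[i], acts_without[:, i] * signs[i])
    for i in range(n_neurons)
]
best_neuron_acc = max(neuron_accs)

# How well the estimate recovers the planted direction (1.0 is perfect).
cosine = float(direction @ true_dir)

print(f"cosine(estimated, true) = {cosine:.3f}")
print(f"accuracy using the direction = {proj_acc:.3f}")
print(f"accuracy using the best single neuron = {best_neuron_acc:.3f}")
```

In this toy setup each neuron carries only one sixty-fourth of the squared signal, which is why inspecting neurons one by one misses a concept that the projection recovers easily.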