Mapping the Moral Compass: Do LLMs Encode Ethics in Their Latent Space?
A new study uses cross-lingual probing to investigate whether language models develop internal moral representations or merely learn to produce morally acceptable outputs.
When a language model declines to help with…