Downes.ca ~ Stephen's Web ~ A Primer on the Inner Workings of Transformer-based Language Models

Stephen Downes

Knowledge, Learning, Community

This is a very technical paper, but a lovely read if you're really interested in the deep details of today's transformer-based language models such as ChatGPT. This, to me, is what a real science of learning looks like. GPTs are by no means the last word, and indeed the paper is very careful to identify their shortcomings and limitations. One highlight is the warning not to trust an LLM when it offers an explanation of its own result, specifically "the tendency of LMs to produce explanations that are very plausible according to human intuition, but unfaithful to model inner workings". As an aside, I think there's a PhD dissertation to be had in identifying the mechanisms described in this paper and mapping them to social learning, that is, to how people in a society interact in order to help society as a whole learn new things.



Stephen Downes, Casselman, Canada
stephen@downes.ca

Copyright 2024
Last Updated: Dec 21, 2024 11:16 a.m.

Creative Commons License.
