Downes.ca ~ Stephen's Web ~ NVIDIA: Copyrighted Books Are Just Statistical Correlations to Our AI Models

Stephen Downes

Knowledge, Learning, Community

This is a fairly in-depth look at the details of a case between content authors and Nvidia, a manufacturer of AI chips. Some parts aren't in contention - for example, "the company's use of the 'Books3' dataset, which was scraped from the library of 'pirate' site Bibliotik." But others are contested. There are two major elements here. First, is copying a book for the purpose of analyzing it fair use? Second, and more significantly, is extracting certain 'facts' from the book fair use? For example, suppose I learn from a book that "the Battle of Hastings was in 1066" (which, in fact, I did). If I restate it, is that fair use? It seems so. But what about things like grammatical principles and word order? We start a sentence with 'The Battle of Hastings...' but never 'Battle the of Hastings'. This, Nvidia argues, is what it extracts from books, not the content.



Stephen Downes, Casselman, Canada
stephen@downes.ca

Copyright 2024
Last Updated: Aug 24, 2024 10:07 a.m.

Creative Commons License.
