Downes.ca ~ Stephen's Web ~ Selection of Audio Learning Resources Based on Big Data

Stephen Downes

Knowledge, Learning, Community

We're so focused on text-based content recommendation systems that it's easy to forget about other types of data, such as audio. Looking at such systems takes us into a world of concepts and entities that are, to me, new. This paper (16 page PDF) looks at audio similarity based on "mel-frequency cepstral coefficient features" (see the Wikipedia article on the subject). These audio features are also used by speech recognition engines, which map sequences of them onto words and phrases. The recommendation system described here then adds other, more familiar elements, such as language, scene, genre and mood, to fine-tune the categorization. A question for future discussion is whether feature-based audio recognition and recommendation will be superseded by more general transformer neural network algorithms, which I have discussed elsewhere.
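To make the idea of feature-based audio similarity a bit more concrete, here is a minimal sketch in plain Python. It is not the paper's method; it is a textbook-style approximation of mel-frequency cepstral coefficients (window, power spectrum, mel filterbank, log, discrete cosine transform) followed by a cosine similarity between clips. Every name in it (`mfcc`, `clip_features`, `tone`), the synthetic test tones, and the parameter choices (8 kHz sample rate, 256-sample frames, 12 filters, 7 coefficients) are assumptions made for illustration only.

```python
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame, via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):   # rising edge of the triangle
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge of the triangle
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank

def mfcc(frame, sample_rate, n_filters=12, n_coeffs=7):
    """Cepstral coefficients 1..n_coeffs of one frame (coefficient 0 is skipped)."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_energy = [math.log(sum(w * p for w, p in zip(f, spectrum)) + 1e-10) for f in bank]
    # DCT-II of the log filterbank energies
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energy))
            for c in range(1, n_coeffs + 1)]

def clip_features(samples, sample_rate, frame_len=256):
    """Average the per-frame coefficients into one vector per clip."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    vectors = [mfcc(f, sample_rate) for f in frames]
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine_similarity(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def tone(freq, sample_rate, n_samples=1024):
    """Synthetic stand-in for a real recording: a pure sine wave."""
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n_samples)]

RATE = 8000  # assumed sample rate in Hz
a = clip_features(tone(440, RATE), RATE)
b = clip_features(tone(450, RATE), RATE)
c = clip_features(tone(3000, RATE), RATE)
sim_close = cosine_similarity(a, b)  # two tones of nearly the same pitch
sim_far = cosine_similarity(a, c)    # two tones far apart in pitch
print("440 Hz vs 450 Hz:", round(sim_close, 3))
print("440 Hz vs 3000 Hz:", round(sim_far, 3))
```

The two nearby tones should score as more similar than the two distant ones. A real system would presumably work from recorded audio with an established signal-processing library rather than a hand-rolled transform, and, as the paper describes, would layer attributes such as language, scene, genre and mood on top of the raw acoustic similarity.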


Stephen Downes, Casselman, Canada
stephen@downes.ca

Copyright 2024
Last Updated: Nov 22, 2024 10:53 p.m.

Creative Commons License.
