Downes.ca ~ Stephen's Web ~ Selection of Audio Learning Resources Based on Big Data

Selection of Audio Learning Resources Based on Big Data

Peng Wang, Xia Wang, Xia Liu, International Journal of Emerging Technologies in Learning, Mar 29, 2022
Commentary by Stephen Downes

We're so focused on text-based content recommendation systems that it's easy to forget about other types of data, such as audio. Looking at audio systems takes us into a world of concepts and entities that are, to me, new. This paper (16 page PDF) looks at audio similarity based on "mel-frequency cepstral coefficient features" (see the Wikipedia article on the subject). These features are compact summaries of the short-term spectrum of a sound, and speech recognition engines use sequences of them to identify words and phrases. The recommendation system described here then adds other, more familiar elements, such as language, scene, genre and mood, to fine-tune the categorization. A question for the future is whether feature-based audio recognition and recommendation will be superseded by more general transformer neural network algorithms, which I have discussed elsewhere.
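To make the idea of MFCC-based audio similarity a little more concrete, here is a minimal sketch in Python. It is not the method from the paper: the frame sizes, the number of filters, the choice to average coefficients over frames, and the use of cosine similarity are all my own assumptions for illustration, as are the function names (`mfcc`, `summarize`, `cosine_similarity`). It uses only the standard library and a slow, naive DFT so that it stays self-contained; a real system would use an audio library and real recordings rather than synthetic tones.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of one Hamming-windowed frame (naive DFT, for clarity)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge of the triangle
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge of the triangle
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def mfcc(signal, sample_rate, frame_len=128, hop=64, n_filters=12, n_coeffs=8):
    """One list of MFCCs per frame: log mel energies followed by a DCT-II."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        power = power_spectrum(signal[start:start + frame_len])
        log_energy = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                      for row in bank]
        frames.append([
            sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energy))
            for c in range(n_coeffs)
        ])
    return frames


def summarize(signal, sample_rate):
    """Average the MFCCs over all frames, dropping c0 (overall loudness)."""
    frames = mfcc(signal, sample_rate)
    return [sum(f[c] for f in frames) / len(frames)
            for c in range(1, len(frames[0]))]


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def tone(freq, sample_rate=4000, n_samples=1000):
    """Synthetic test signal: a pure sine tone (stand-in for a real clip)."""
    return [math.sin(2 * math.pi * freq * t / sample_rate)
            for t in range(n_samples)]


SAMPLE_RATE = 4000
a = summarize(tone(220), SAMPLE_RATE)
near = summarize(tone(225), SAMPLE_RATE)
far = summarize(tone(1500), SAMPLE_RATE)
print("220 Hz vs 225 Hz:", cosine_similarity(a, near))
print("220 Hz vs 1500 Hz:", cosine_similarity(a, far))
```

The two nearby tones should score as more similar than the two distant ones. A recommender along the lines described could rank clips by a score like this before applying the additional filters for language, scene, genre and mood.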

