Various aspects of the organisation of media archives and collections have attracted keen interest in recent years. The Music Information Retrieval community has gained many insights into abstract representations of music by means of audio signal processing. On top of that, recommendation engines are built to provide novel ways of creating playlists based on users' preferences. Another important application of audio representation is automatic genre categorisation, i.e. the automatic assignment of genre tags to untagged audio files. However, for many applications representations based on audio features alone do not contain enough information. A song's lyrics often describe its genre better than its sound does, e.g.
`Christmas carols' or `love songs'. Therefore, approaches that combine additional data such as song lyrics, artist biographies, or album reviews for music recommendation are examined. Further, the application of the Self-Organising Map for clustering, i.e. for mapping the resulting high-dimensional feature spaces onto two-dimensional maps, is investigated for the explorative analysis of audio collections with respect to multi-modal (audio/text) feature sets. Additionally, a new visualisation for the simultaneous display of multi-modal clusterings is presented, together with cluster validation metrics. Finally, a short overview and an outlook on future work are given.
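The mapping the Self-Organising Map performs can be illustrated with a minimal sketch: each node of a small 2-D grid holds a weight vector in the feature space, and training pulls the best-matching node and its grid neighbours towards each input vector, so that nearby map cells end up representing similar feature vectors. Grid size, learning rate, and the Gaussian neighbourhood schedule below are illustrative assumptions, not the parameters used in this work.

```python
import numpy as np

def train_som(data, grid_w=8, grid_h=8, epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small Self-Organising Map on `data` (n_samples x dim)."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    weights = rng.random((grid_h, grid_w, dim))
    # grid coordinates of every node, used by the neighbourhood function
    ys, xs = np.mgrid[0:grid_h, 0:grid_w]
    coords = np.stack([ys, xs], axis=-1).astype(float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            # decay learning rate and neighbourhood radius over time
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 0.5
            # best-matching unit: node whose weights are closest to x
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighbourhood around the BMU, measured on the grid
            grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            # pull BMU and neighbours towards the input vector
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

def map_to_grid(weights, x):
    """Return the 2-D grid cell onto which feature vector x is projected."""
    dists = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)
```

Each song (or text document), represented as one high-dimensional feature vector, is then placed on the map via `map_to_grid`, which is what makes side-by-side audio and text clusterings of the same collection comparable.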