Introducing Freesound Datasets (and more!)

Dear Freesounders,

Today we are very happy to introduce Freesound Datasets, a new platform that we have been developing over the past year to foster the reuse of Freesound content in research contexts, and that will eventually help us make Freesound better and better. Curious? Check out the website at

But what exactly is a dataset? In short, a dataset is a collection of items (in our case, sounds) annotated with labels chosen from a limited vocabulary of concepts. Well-curated datasets are among the most important resources for advancing research in many fields, including sound- and music-related research.
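To make the idea concrete, here is a minimal sketch of such a collection in Python. The sound IDs, the labels, and the `sounds_with_label` helper are all made up for illustration; they are not part of the actual platform.

```python
# A tiny illustration of "items annotated with labels chosen from a
# limited vocabulary". All IDs and labels here are hypothetical.
VOCABULARY = {"Bark", "Rain", "Guitar", "Speech"}

dataset = [
    {"sound_id": 1001, "labels": {"Bark"}},
    {"sound_id": 1002, "labels": {"Rain", "Speech"}},
    {"sound_id": 1003, "labels": {"Guitar"}},
]

def sounds_with_label(dataset, label):
    """Return the IDs of all sounds annotated with the given label."""
    assert label in VOCABULARY, f"{label!r} is not in the vocabulary"
    return [item["sound_id"] for item in dataset if label in item["labels"]]

print(sounds_with_label(dataset, "Rain"))  # → [1002]
```

Restricting annotations to a fixed vocabulary is what makes a dataset usable for research: every sound is described in the same terms, so algorithms can be trained and evaluated consistently.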

Freesound Datasets is a platform that allows users to explore the contents of datasets made with Freesound sounds. Even more importantly, it allows anyone to help improve the datasets by providing new annotations. It also promotes discussion about the datasets it hosts, and allows (or rather, will allow) anyone to download different timestamped versions of them. If you’d like a more academic description of the platform, you can check out the paper we presented at the International Society for Music Information Retrieval Conference last year: Freesound Datasets: A Platform for the Creation of Open Audio Datasets.

Using Freesound Datasets, we have already started creating a first dataset, which we call FSD. FSD is a large, general-purpose dataset composed of Freesound content and annotated with labels from Google’s AudioSet Ontology (a vocabulary of more than 600 sound classes). FSD is currently much smaller than we would like, but we are sure that, with the help of people all around the world, it will get bigger and bigger. Needless to say, you are more than welcome to contribute (in other words, please contribute!). All you need to do is visit the Freesound Datasets website and click on Get started with our annotation tasks! We will simply ask you to listen to some sounds and have fun 🙂 You’ll see an interface like this (you can log in with your Freesound credentials):

That’s really cool, isn’t it!?

Yeah that’s awesome, take me to this interface because I can’t wait any longer to start annotating!

But you know what? There is even more! We have been awarded a Google Faculty Research Award to support the development of Freesound Datasets and FSD, and, in connection with that, we have started a collaboration with colleagues from Google’s Machine Perception Team to do research on machine listening. As the first outcome of this collaboration, we recently launched a competition on Kaggle (see the Freesound General-Purpose Audio Tagging Challenge), in which participants are challenged to build artificial intelligence algorithms able to recognize 41 diverse categories of everyday sounds. The dataset used for this competition is a small subset of FSD.

The great, great, great thing is that the outcomes of all these research efforts will help us improve Freesound in many ways. By training our search engine with FSD, we could, for example, return search results that point inside sounds (say, a fragment of a field recording containing bird chirps), or let you browse Freesound sounds using a hierarchical structure. This, and many other things we will discover in the future 🙂

That’s it for now, thanks for reading…

the Freesound Team
