
Million Song Database

Group 4: Chandrakumar N, Arun Patrose, Sajith M, Goutham K S, Keerthi Penmasta, Raghavendra Rao

DMBI PROJECT

Problem Description
The digital revolution has created a problem of plenty in music: the more songs are available, the harder it is to find relevant ones. The task is to find songs similar to the ones the user has chosen.


Common Recommendation criteria


Popularity
Content-based filtering
Collaborative filtering (not used in our solution, as we assume login-less recommendation)
Artist filtering


Algorithm Used
[Diagram labels: Music Dataset, Popularity, Content-based filtering, Collaborative filtering, Artist Similarity, Rank Generation, Recommendation]


Data Source
Audio features and metadata for one million songs, provided by The Echo Nest. The metadata also makes it easy to link tracks and artists to online resources and APIs. The Echo Nest uses web crawlers to collect acoustic and cultural information and stores it in files. Each file represents one track with all of its related information.

Data Preparation
Data preparation involves the following steps:
1. Checking or logging the data in
2. Entering the data into the computer
3. Transforming the data
4. Developing and documenting an integrated database structure


Descriptive Statistics of Data


1,000,000 songs / files
273 GB of data
44,745 unique artists
7,643 unique terms (The Echo Nest tags)
2,321 unique MusicBrainz tags
43,943 artists with at least one term
2,201,916 asymmetric similarity relationships
515,576 dated tracks starting from 1922
18,196 cover songs identified

Modified Filtering Algorithm


Field Name         Description                                 Range
Artist Hotttnesss  The popularity of the artist.               1-10
Danceability       How danceable the song is.                  1-10
Song Duration      Duration of the song in seconds.            120-300
Key                Key the song is in.                         1-7
Energy             Energy from the listener's point of view.   1-10
Loudness           Overall loudness in dB.                     1-10
Segments_timbre    The quality of sound.                       1-10
Song Hotness       The popularity of the song.                 1-10
Tempo              Estimated tempo in BPM.                     1-10
Year               Year the song was released.                 1990-2010
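The ranges in the table suggest that most attributes were brought onto a common 1-10 scale before comparison. The slides do not say how this was done; the following is a minimal sketch that assumes plain linear (min-max) rescaling. The function name `rescale` and the sample tempo values are hypothetical.

```python
def rescale(values, new_min=1.0, new_max=10.0):
    """Linearly map values onto [new_min, new_max] (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical: place them in the middle of the target range.
        return [(new_min + new_max) / 2.0 for _ in values]
    span = new_max - new_min
    return [new_min + (v - lo) * span / (hi - lo) for v in values]


# Hypothetical raw tempo values in BPM for four songs (not from the dataset).
tempo_bpm = [60.0, 90.0, 120.0, 150.0]
tempo_scaled = rescale(tempo_bpm)
```

Putting every attribute on the same scale keeps a wide-ranging field such as duration from dominating a distance computed over all attributes.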

Modified Filtering Algorithm


Attributes: Artist Hotness, Danceability, Song Duration, Key, Energy, Loudness, Timbre, Song Hotness, Tempo, Year

[Diagram labels: Music dataset, Artist Id, Genre Id, Level 1 filtering, Level 2 filtering, Song Suggestions]
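The slide names two filtering levels but does not spell out what each one does. One possible reading, sketched below purely as an assumption, is that Level 1 keeps songs whose attribute values lie close to the chosen song, and Level 2 narrows that set to songs sharing its artist id or genre id. The function `suggest`, the `max_distance` threshold, the record layout, and the sample songs are all hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two attribute vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def suggest(seed, songs, max_distance=3.0):
    # Level 1 (assumed): keep songs whose attributes are close to the seed song.
    level1 = [
        s for s in songs
        if s["id"] != seed["id"]
        and distance(s["features"], seed["features"]) <= max_distance
    ]
    # Level 2 (assumed): keep songs that share the seed's artist id or genre id.
    level2 = [
        s for s in level1
        if s["artist_id"] == seed["artist_id"] or s["genre_id"] == seed["genre_id"]
    ]
    # If Level 2 removes everything, fall back to the Level 1 result.
    chosen = level2 or level1
    return sorted(chosen, key=lambda s: distance(s["features"], seed["features"]))


# Hypothetical songs with two rescaled attributes each.
seed = {"id": "S1", "artist_id": "A1", "genre_id": "G1", "features": [5.0, 5.0]}
songs = [
    seed,
    {"id": "S2", "artist_id": "A1", "genre_id": "G2", "features": [5.0, 6.0]},
    {"id": "S3", "artist_id": "A2", "genre_id": "G3", "features": [5.0, 5.5]},
    {"id": "S4", "artist_id": "A3", "genre_id": "G1", "features": [9.0, 9.0]},
    {"id": "S5", "artist_id": "A4", "genre_id": "G1", "features": [7.0, 5.0]},
]
suggested_ids = [s["id"] for s in suggest(seed, songs)]
```

In this sample, S3 passes Level 1 but is dropped at Level 2 because it shares neither artist nor genre with the seed, and S4 is dropped at Level 1 because it is too far away.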


Clustering: Dendrogram
The output from clustering a subset of the data is shown.
In Level 1, we find 7 clusters.
In Level 2, we find 5 clusters.
In Level 3, we find 2 clusters.
If a song in any cluster is selected, the other songs in the same cluster are recommended. In case of a tie, the proximity matrix is used, as shown in the next slide.
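The idea of cutting a dendrogram at different levels and recommending within a cluster can be sketched as follows. The slides do not state which linkage or distance measure was used, so this sketch assumes single linkage with Euclidean distance. The functions `agglomerate` and `recommend` and the sample points are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two attribute vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(points, k):
    """Merge the two closest clusters (single linkage) until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > max(k, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    euclidean(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def recommend(song_index, clusters):
    """Return the other songs in the cluster that contains song_index."""
    for cluster in clusters:
        if song_index in cluster:
            return [i for i in cluster if i != song_index]
    return []


# Hypothetical songs described by two rescaled attributes each.
points = [[1.0, 1.0], [1.5, 1.0], [5.0, 5.0], [5.0, 5.5], [9.0, 1.0]]
three_clusters = agglomerate(points, 3)
two_clusters = agglomerate(points, 2)
```

Asking for fewer clusters corresponds to cutting the dendrogram higher up, which is how the same data yields 7, 5, or 2 clusters at different levels.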

Proximity Matrix
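A proximity matrix holds the pairwise distance between every two songs. The matrix from the slide is not reproduced here; the following is a minimal sketch that assumes Euclidean distance over the song attributes and shows how such a matrix could break a tie by picking the nearest candidate. The functions `proximity_matrix` and `closest` and the sample points are hypothetical.

```python
import math


def proximity_matrix(points):
    """Build a symmetric matrix of pairwise Euclidean distances."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((x - y) ** 2 for x, y in zip(points[i], points[j])))
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


def closest(song_index, candidates, matrix):
    """Among the candidate songs, pick the one nearest to song_index."""
    return min(candidates, key=lambda j: matrix[song_index][j])


# Hypothetical songs described by two attributes each.
points = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
matrix = proximity_matrix(points)
nearest_to_first = closest(0, [1, 2], matrix)
```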


Thank You!

