An Adaptive Approach to Identify Genre in Music Videos Using Word2Vec Model

S. Metilda Florence, S. Mohan


The vector representations of words produced by the Word2Vec model have proven very useful in many applications because of the semantic information they convey. This paper proposes an analogous representation, MusicGenre2Vec, which encodes the numerical genre features of music segments in a vector, with the aim of describing the phonetic structure of those segments well. We expect vectors obtained in this way to describe the phonetic structure of music signals more precisely, so that music segments that sound alike have vector representations that lie close together in the vector space. The proposed system achieves 80% accuracy in identifying the genre of a video song.





© International Journals of Advanced Research in Computer Science and Software Engineering (IJARCSSE)| All Rights Reserved | Powered by Advance Academic Publisher.