Article

  • Title

    Applying latent Dirichlet allocation for analysis of publications in scientometric databases

  • Authors

    Kolyada A. S.
    Yakovenko B. A.
    Gogunsky Victor Dmitriyevich

  • Subject

    COMPUTER AND INFORMATION NETWORKS AND SYSTEMS. MANUFACTURING AUTOMATION

  • Year 2014
    Issue 1(43)
    UDC 004.62
    DOI 10.15276/opu.1.43.2014.32
    Pages 186-191
  • Abstract

    The aim of the work is to determine the most appropriate model for the thematic classification of scientific publications by authors who share the same surname. Probabilistic models are analyzed, and latent Dirichlet allocation is proposed; it is the leading probabilistic model owing to its numerous generalizations and applications to the analysis of collections of text documents. The latent semantic analysis model is chosen for comparison. The model is used in a project for extracting publications from scientometric databases. In this project, topic modeling solves the problem of separating the publications of authors with the same surname, with publication titles serving as the collection of documents. The results show that latent Dirichlet allocation is inferior to latent semantic analysis when the documents contain only a small volume of text. Therefore, latent semantic analysis is preferable for document collections of small volume, and latent Dirichlet allocation for large volumes.

  • Keywords
  • References

    1. Kolyada, A.S. Automation of information extraction from scientometric databases [in Russian] / A.S. Kolyada, V.D. Gogunsky // Управління розвитком складних систем. - 2013. - Iss. 16. - PP. 96 - 99.

    2. Kolyada, A.S. A latent semantic approach to the analysis of information from scientometric databases [in Russian] / A.S. Kolyada // Управління розвитком складних систем. - 2014. - Iss. 17. - PP. 101 - 108.

    3. Vorontsov, K.V. Probabilistic topic modeling [Electronic resource, in Russian] / K.V. Vorontsov // MachineLearning.ru. - Available at: http://www.machinelearning.ru/wiki/images/2/22/Voron-2013-ptm.pdf (accessed: 03.03.2014).

    4. Daud, A. Knowledge discovery through directed probabilistic topic models: a survey / A. Daud, J. Li, L. Zhou, F. Muhammad // Frontiers of Computer Science in China. - 2010. - Vol. 4, Iss. 2. - PP. 280 - 301.

    5. Blei, D.M. Latent Dirichlet Allocation / D.M. Blei, A.Y. Ng, M.I. Jordan // Journal of Machine Learning Research. - 2003. - Vol. 3. - PP. 993 - 1022.

  • Creative Commons License by Author(s)
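
The approach summarized in the abstract, treating publication titles as a document collection and using topic modeling to separate the works of authors who share a surname, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the abstract does not say which inference method was used, so collapsed Gibbs sampling is swapped in as one common choice; the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the number of topics, and the four sample titles are all hypothetical.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, not the paper's code).

    docs is a list of token lists; returns one topic distribution per document.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # count of topic k in doc d
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # total tokens in topic k
    assign = []                                           # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions.
    return [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]


# Hypothetical titles from two same-surname authors in different fields.
titles = [
    "probabilistic topic models for text collections",
    "topic models for analysis of text documents",
    "corrosion of steel pipelines in marine environments",
    "steel corrosion protection for pipelines",
]
docs = [title.lower().split() for title in titles]
theta = lda_gibbs(docs, num_topics=2)
# Assign each title to its most probable topic.
labels = [max(range(len(row)), key=row.__getitem__) for row in theta]
for title, label in zip(titles, labels):
    print(label, title)
```

Titles that land in the same topic would be candidates for attribution to the same author. With so few tokens per title the topic estimates are unstable, which is consistent with the abstract's finding that latent Dirichlet allocation does worse than latent semantic analysis on small volumes of text.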