Control Systems and Computers, N5-6, 2021, Article 6

https://doi.org/10.15407/csc.2021.05-06.055

Control Systems and Computers, 2021, Issue 5-6 (295-296), pp. 55-60.

UDC 004.4

Khodiakova Halyna V., PhD (Education), Associate Professor at the Department of Computer Science of the V. O. Sukhomlynsky Mykolaiv National University, Shneerson ave., 11, Mykolaiv, 54001, Ukraine, khodiakovagalina@gmail.com

Khodiakova Nataliia V., Senior Software Developer at Ray Sono AG, Bruderhofstraße 3, Munich, 81371, Germany, nathalie.mk.ua@gmail.com

Pozdeev Valery A., Doctor of Physical and Mathematical Sciences, Professor at the V. O. Sukhomlynsky Mykolaiv National University, st. Nikolskaya, 24, Nikolaev, 54030, Ukraine, valer.al.pozdeev@gmail.com

FULL-TEXT SEARCH SET UP ON A WEBSITE

Introduction. Text search on a website can be implemented with approaches that differ in complexity and performance. Implementation also involves a sequence of related tasks: choosing a text indexing option, sending text for indexing, selecting texts for indexing from the CMS database, choosing a search engine, and others. These approaches do not always provide satisfactory search results.

Purpose. The purpose of the article is to describe existing solutions for full-text search on a website, with their advantages and disadvantages, and to develop a full-text search algorithm using the Elasticsearch system.

Methods. Analysis of approaches to the implementation of full-text search on a website, varying in complexity and performance. Identification of flaws and vulnerabilities in more primitive approaches and the development of more advanced and complex algorithms that eliminate the identified deficiencies. Step-by-step implementation of full-text search using third-party systems.

Results. A method for implementing full-text search using Elasticsearch is described. The advantage of the new approach is that the page content and its address are sent asynchronously to a dedicated service responsible for communication with Elasticsearch. As a result, normal work with the CMS is not blocked and does not depend on the availability of the indexing service. The approach described in the article is flexible and adaptable to various website architectures. Asynchronous processing of indexing requests ensures high query execution speed and system fault tolerance.
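The asynchronous hand-off described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `IndexRequest`, `IndexingQueue`, `publish`, and the injected `send` callable are hypothetical, and an in-process `queue.Queue` with a worker thread stands in for whatever message queue and indexing service a real site would use. The sketch only shows the idea that the CMS enqueues the page address and content and returns immediately, while delivery to the indexing service is retried in the background.

```python
import queue
import threading
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class IndexRequest:
    """Hypothetical message: a page address and the text to index."""
    url: str
    content: str


class IndexingQueue:
    """In-process stand-in (an assumption) for a message queue between CMS and indexer."""

    def __init__(self, send: Callable[[IndexRequest], None],
                 retry_delay: float = 0.01, max_attempts: int = 3) -> None:
        self._send = send                  # delivers one request to the indexing service
        self._retry_delay = retry_delay    # pause between delivery attempts, in seconds
        self._max_attempts = max_attempts
        self._queue: "queue.Queue[IndexRequest]" = queue.Queue()
        self.failed: List[IndexRequest] = []  # requests given up on after all attempts
        threading.Thread(target=self._worker, daemon=True).start()

    def publish(self, url: str, content: str) -> None:
        """Called by the CMS on save; only enqueues, so the editor is not blocked."""
        self._queue.put(IndexRequest(url, content))

    def wait_until_drained(self) -> None:
        """Block until every published request has been delivered or given up on."""
        self._queue.join()

    def _worker(self) -> None:
        while True:
            request = self._queue.get()
            try:
                self._deliver(request)
            finally:
                self._queue.task_done()

    def _deliver(self, request: IndexRequest) -> None:
        # Retry so that a temporarily unavailable indexing service does not lose the page.
        for _ in range(self._max_attempts):
            try:
                self._send(request)
                return
            except Exception:
                time.sleep(self._retry_delay)
        self.failed.append(request)


if __name__ == "__main__":
    # A real `send` would call the service that talks to Elasticsearch; here it only prints.
    indexing = IndexingQueue(send=lambda r: print("indexing", r.url))
    indexing.publish("/about", "About our company")
    indexing.wait_until_drained()
```

In this sketch a failure of `send` never reaches the code that called `publish`, which mirrors the claim that CMS work does not depend on the availability of the indexing service. A production setup would more likely use a persistent message broker so that queued requests survive a restart.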

Conclusions. The article discusses various approaches to implementing full-text search on a website, their advantages and disadvantages. Based on the analysis, a more flexible and universal approach to the implementation of a full-text search system has been developed. A solution is proposed with step-by-step implementation and setup of advanced full-text search using Elasticsearch.


Keywords: Full-text search, algorithm, natural language processing, search robot, website, text indexing, search engine.


Received 16.11.2021