Upravlyayushchie sistemy i mashiny (Control Systems and Machines), No. 1, 2018, article 6

DOI: https://doi.org/10.15407/usim.2018.01.057
Oursatyev A. Big Data. Analytical Databases and Data Warehouses: Vertica, Kdb. 2018. No. 1. P. 58-71.

UDC 004.65:004.7:004.75:004.738.5

A.A. Oursatyev, Cand. Sci. (Eng.), International Research and Training Center for Information Technologies and Systems of the NAS and MES of Ukraine, 40 Glushkov Ave., Kyiv 03187, Ukraine
E-mail: aleksei@irtc.org.ua

Big Data. Analytical Databases and Data Warehouses: Vertica, Kdb

This article continues the author's study of Big Data and of the toolkit that is evolving into a new generation of technologies and platform architectures for databases and data warehouses aimed at intelligent inference. It reviews a number of advanced developments by globally known IT companies.

Keywords: MPP architecture, HTAP (hybrid transactional/analytical processing), LDW (logical data warehouse), cloud storage, database platform as a service (DBPaaS), SaaS-model analytics, data management environment, IMC technology.

Received 16.01.2018