Computational Statistics in Data Science

An essential roadmap to the application of computational statistics in contemporary data science
Wiley StatsRef: Statistics Reference Online

Streaming is an active research area, but some aspects of it have received little attention. One is transactional guarantees. Current stream processing systems can provide basic guarantees, such as processing each data point in the stream exactly once or at least once, but cannot provide guarantees that span multiple operations or stream elements. Another area that deserves more research effort is data stream pre‐processing. Data quality is a vital determinant in the knowledge discovery pipeline, since low‐quality data yield low‐quality models and decisions [69]. The data stream pre‐processing stage [67] needs to be strengthened to cope with the multi‐label [70], imbalance [71], and multi‐instance [72] problems associated with data streams [66]. In addition, social media posts must be represented in a way that preserves the semantics of their content [74, 75]. Finally, pre‐processing techniques with low computational requirements [73] still need to be developed and remain an open research problem.
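The difference between the basic delivery guarantees mentioned above can be illustrated with a small sketch. It is not taken from any particular stream processing system: the message format `(msg_id, value)` and both function names are hypothetical, and the deduplication set is kept in memory purely for illustration. Under at-least-once delivery a message may be redelivered after a failure, so a naive aggregate counts it twice; remembering the IDs already applied makes the update idempotent, which yields an exactly-once effect for this single operation.

```python
from typing import Iterable, Set, Tuple

Message = Tuple[int, float]  # (msg_id, value); a hypothetical message format


def process_at_least_once(deliveries: Iterable[Message]) -> float:
    """Sum every delivered value, so redelivered messages are counted again."""
    total = 0.0
    for _msg_id, value in deliveries:
        total += value
    return total


def process_effectively_once(deliveries: Iterable[Message]) -> float:
    """Sum values while applying each message ID at most once."""
    seen: Set[int] = set()  # grows without bound here; illustration only
    total = 0.0
    for msg_id, value in deliveries:
        if msg_id in seen:
            continue  # duplicate delivery: skip it
        seen.add(msg_id)
        total += value
    return total


# Message 2 is delivered twice, simulating a retry after a failure.
deliveries = [(1, 10.0), (2, 5.0), (2, 5.0), (3, 1.0)]
naive_total = process_at_least_once(deliveries)
deduped_total = process_effectively_once(deliveries)
```

Here `naive_total` includes the duplicate and `deduped_total` does not. The sketch protects only one operation on one stream; it says nothing about keeping several operations or several stream elements consistent with each other, which is the gap the paragraph identifies.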

Data stream processing demands both storage capacity and computational power, because data are generated without bound, at high velocity, and with a brief life span. To cope with these requirements, approximate computing, which trades an acceptable loss of quality for low latency, has been a practical solution [110]. Although approximate computing has been used extensively for data stream processing, combining it with distributed processing models opens new research directions, including approximation with heterogeneous resources, pricing models with approximation, intelligent data processing, and energy‐aware approximation.
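As one concrete instance of trading accuracy for bounded resources, the following minimal sketch uses reservoir sampling (the classic Algorithm R) to estimate the mean of a stream while storing only k items. The technique is chosen here for illustration and is not claimed to be the method of [110]; the function name, the synthetic stream, and the parameter values are assumptions.

```python
import random
from typing import Iterable, List


def reservoir_sample(stream: Iterable[float], k: int, rng: random.Random) -> List[float]:
    """Keep a uniform random sample of k items from a stream of unknown length."""
    reservoir: List[float] = []
    for i, x in enumerate(stream):
        if i < k:
            reservoir.append(x)  # fill the reservoir with the first k items
        else:
            j = rng.randint(0, i)  # uniform over 0..i, both ends inclusive
            if j < k:
                reservoir[j] = x  # item i replaces a slot with probability k/(i+1)
    return reservoir


rng = random.Random(0)  # fixed seed so the run is repeatable
stream = (float(i % 100) for i in range(100_000))  # synthetic stream, true mean 49.5
sample = reservoir_sample(stream, k=1_000, rng=rng)
approx_mean = sum(sample) / len(sample)
```

Memory use is fixed at k items however long the stream runs, and the estimate `approx_mean` carries a sampling error that shrinks roughly with the square root of k. That is the acceptable quality loss the paragraph refers to, bought in exchange for bounded storage and constant work per element.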

References

1 World Economic Forum (2019) How Much Data is Generated Each Day? Visual Capitalist, https://www.visualcapitalist.com/how-much-data-is-generated-each-day.

2 Huynh, V. and Phung, D. (2017) Streaming clustering with Bayesian nonparametric models. Neurocomputing, 258, 52–62. doi: 10.1016/j.neucom.2017.02.078.

3 Ray, I., Adaikkalavan, R., Xie, X., and Gamble, R. (2015) Stream Processing with Secure Information Flow Constraints. 29th IFIP Annual Conference on Data and Applications Security and Privacy. Fairfax, USA, pp. 311–329. doi: 10.1007/978-3-319-20810-7_22.

4 Sibai, R.E., Chabchoub, Y., Demerjian, J. et al. (2016) Sampling Algorithms in Data Stream Environment. 2016 International Conference on Digital Economy Carthage. IEEE, Tunisia, pp. 29–36. doi: 10.1109/ICDEC.2016.7563142.

5 Youn, J., Shim, J., and Lee, S.G. (2018) Efficient data stream clustering with sliding windows based on locality sensitive hashing. IEEE Access, 6, 63757–63776. doi: 10.1109/ACCESS.2018.2877138.

6 Das, S., Behera, R.K., Kumar, M., and Rath, S.K. (2018) Real‐time sentiment analysis of Twitter streaming data for stock prediction. Procedia Comput. Sci., 132, 956–964.

7 Wang, J., Zhu, R., and Liu, S. (2018) A differentially private unscented Kalman filter for streaming data in IoT. IEEE Access, 6 (1), 6487–6495. doi: 10.1109/ACCESS.2018.2797159.

8 Kolchinsky, I. and Schuster, A. (2019) Real‐Time Multi‐Pattern Detection Over Event Streams. Proceedings of the 2019 International Conference on Management of Data, Amsterdam, Netherlands. ACM, New York, NY, USA, pp. 589–606. doi: 10.1145/3299869.3319869.

9 Tozi, C. (2017) Dummy's Guide to Batch vs Streaming. Retrieved from Trillium Software, https://www.precisely.com/blog/big-data/big-data-101-batch-stream-processing.

10 Kolajo, T., Daramola, O., and Adebiyi, A. (2019) Big data stream analysis: a systematic literature review. J. Big Data, 6, 47.

11 Kusumakumari, V., Sherigar, D., Chandran, R., and Patil, N. (2017) Frequent pattern mining on stream data using Hadoop CanTree‐GTree. Procedia Comput. Sci., 115, 266–273.

12 Giustozzi, F., Saunier, J., and Zanni‐Merk, C. (2019) Abnormal situations interpretation in industry 4.0 using stream reasoning. Procedia Comput. Sci., 159, 620–629.

13 Liu, R., Li, Q., Li, F. et al. (2014) Big Data Architecture for IT Incident Management. Proceedings of IEEE international conference on service operations and logistics, and informatics. Qingdao, China, pp. 424–429.

14 Sakr, S. (2013) An Introduction to Infosphere Streams: A Platform for Analyzing Big Data in Motion, IBM, https://www.ibm.com/developerworks/library/bd-streamsintro/index.html.

15 Inoubli, W., Aridhi, S., Mezni, H. et al. (2018) An experimental survey on big data frameworks. Future Gener. Comput. Syst., 86, 546–564. doi: 10.1016/j.future.2018.04.032.

16 International Business Machines (2019) Stream Computing Platforms, Applications and Analytics, https://researcher.watson.ibm.com/researcher/view_group.php?id=2531.

17 Vidyasankar, K. (2017) On continuous queries in stream processing. Procedia Comput. Sci., 109C, 640–647.

18 Joseph, S., Jasmin, E.A., and Chandran, S. (2015) Stream computing: opportunities and challenges in smart grid. Procedia Tech., 21, 49–53.

19 Wozniak, M., Ksieniewicz, P., Cyganek, B. et al. (2016) Active learning classification of drifted streaming data. Procedia Comput. Sci., 80, 1724–1733.

20 Kim, T. and Park, C.H. (2020) Anomaly pattern detection for streaming data. Expert Syst. Appl., 149, 113252. doi: 10.1016/j.eswa.2020.113252.

21 Sethi, T.S. and Kantardzic, M. (2018) Handling adversarial concept drift in streaming data. Expert Syst. Appl., 97, 18–40.

22 Toor, A.A., Usman, M., Younas, F. et al. (2020) Mining massive e‐health data streams for IoMT enabled healthcare systems. Sensors, 20 (7), 2131. doi: 10.3390/s20072131.

23 Shan, J., Luo, J., Ni, G. et al. (2016) CVS: fast cardinality estimation for large‐scale data streams over sliding windows. Neurocomputing, 194, 107–116.

24 Liu, W., Wang, Z., Liu, X. et al. (2017) A survey of deep neural network architectures and their applications. Neurocomputing, 234, 11–26.

25 Priya, S. and Uthra, R.A. (2020) Comprehensive analysis for class imbalance data with concept drift using ensemble based classification. J. Ambient Intell. Humaniz. Comput. doi: 10.1007/s12652-020-01934-y.

26 Zhou, L., Pan, S., Wang, J., and Vasilakos, A.V. (2017) Machine learning on big data: opportunities and challenges. Neurocomputing, 237, 350–361. doi: 10.1016/j.neucom.2017.01.026.

27 O'Donovan, P., Leahy, K., Bruton, K., and O'Sullivan, D.T.J. (2015) An industrial big data pipeline for data‐driven analytics maintenance applications in large‐scale smart manufacturing facilities. J. Big Data, 2, 25. doi: 10.1186/s40537-015-0034-z.

28 Zaharia, M., Das, T., Li, H. et al. (2013) Discretized Streams: Fault‐Tolerant Streaming Computation at Scale. Proceedings of the 24th ACM Symposium on Operating System Principles (SOSP 2013), Farmington: ACM Press, pp. 423–438.

29 Jayasekara, S., Harwood, A., and Karunasekera, S. (2020) A utilization model for optimization of checkpoint intervals in distributed stream processing systems. Future Gener. Comput. Syst., 110, 68–79. doi: 10.1016/j.future.2020.04.019.

30 Chong, D. and Shi, H. (2015) Big data analytics: a literature review. J. Manag. Anal., 2 (3), 175–201.

31 Qian, Z., He, Y., Su, C. et al. (2013) TimeStream: Reliable Stream Computation in the Cloud. Proceedings of the 8th ACM European Conference on Computer Systems. ACM, Prague, pp. 1–14. doi: 10.1145/2465351.2465353.

32 Shi, P., Cui, Y., Xu, K. et al. (2019) Data consistency theory and case study for scientific big data. Information, 10, 137. doi: 10.3390/info10040137.

33 Santipantakis, G., Kotis, K., and Vouros, G.A. (2017) OBDAIR: ontology‐based distributed framework for accessing, integrating and reasoning with data in disparate data sources. Expert Syst. Appl., 90, 464–483.