Data Analytics in Bioinformatics


Machine learning techniques are increasingly being used to address problems in computational biology and bioinformatics. Novel machine learning computational techniques to analyze high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. Machine learning techniques such as Markov models, support vector machines, neural networks, and graphical models have been successful in analyzing life science data because of their capabilities in handling randomness and uncertainty of data noise and in generalization. Machine Learning in Bioinformatics compiles recent approaches in machine learning methods and their applications in addressing contemporary problems in bioinformatics approximating classification and prediction of disease, feature selection, dimensionality reduction, gene selection and classification of microarray data and many more.

* Corresponding author : satyasundara123@gmail.com

Introduction to Unsupervised Learning in Bioinformatics

Nancy Anurag Parasa¹, Jaya Vinay Namgiri¹, Sachi Nandan Mohanty² and Jatindra Kumar Dash¹*

¹ Department of Computer Science and Engineering, SRM University-AP, Andhra Pradesh, Amaravathi, India

² Department of Computer Science and Engineering, IcfaiTech, ICFAI Foundation for Higher Education, Hyderabad, India

* Corresponding author: jatinkdash@gmail.com

Abstract

Unsupervised learning techniques group data according to similar attributes, shared patterns, or relationships among the data points. Such machine learning models are also referred to as self-organizing models, and many of them operate by clustering. Each algorithm takes a distinct approach to splitting data into clusters. Unsupervised machine learning uncovers previously unknown patterns in data and is typically applied when labelled data are scarce or unavailable. Applications of unsupervised machine learning include clustering and anomaly detection. In bioinformatics, clustering algorithms are mostly used to extract the salient information in gene expression data, which helps to identify the biological processes at work in an organism. These models aid drug design through gene expression profiling. Self-organizing maps are used for data reduction, which provides a better understanding of genomics. Various clustering algorithms are deployed in microarray analysis, which is useful in clinical research for keeping track of gene expression data. In simpler terms, unsupervised learning is a technique that works on input data to produce an output that is hidden or not known in advance. This chapter presents various unsupervised algorithms used for knowledge exploration in bioinformatics and highlights several novel works reported in the recent literature.

Keywords: Clustering, self-organizing maps, microarray

2.1 Introduction

Machine learning can be described as equipping machines (computers) to learn from their environment through experience: the machine is given tasks whose performance can be measured with some metric, together with algorithms that improve that performance as experience accumulates. This broad field is subdivided into a few areas, described below.

Supervised learning: A learning system in which the input data are provided together with the known output, so that the output depends on the input. Having learned from the data provided, this approach predicts labels for newly given data.

Reinforcement learning: This approach is goal-oriented and operates in an interactive environment. It works through a feedback system of rewards and punishments based on the learner's interactions and their outcomes.

Unsupervised learning: This approach explores the hidden patterns in the input provided, since the output is unknown. The algorithms are applied directly to the dataset, and the structure they find is the resulting outcome [1].

Biological data are vast because of complex protein structures and genome sequences, so understanding and deciphering the function of cells is difficult. For studying the fundamental biological processes, machine learning approaches pave the way for developing tools, software and algorithms that make this work easier. This chapter introduces unsupervised learning approaches, algorithms and their use in bioinformatics, an interdisciplinary field that brings together biology, statistics and computer science in order to analyse and assess huge amounts of biological data [2].

In the unsupervised learning approach, the machine learns from the dataset given as input and labels or groups the data accordingly [1]. This can also be referred to as self-organization: the algorithm structures the data based on the input provided, with minimal human intervention. The approach draws out the hidden patterns that exist in the data and reveals the relationships among those patterns.
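To make the idea of grouping unlabelled data concrete, the following is a minimal sketch of k-means clustering (Lloyd's algorithm) in plain Python. The chapter does not prescribe this particular algorithm at this point; it is used here only as a familiar illustration. The function name `kmeans`, its parameters, and the toy measurements are all assumptions introduced for the example.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group points (tuples of floats) into k clusters with Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k distinct input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Made-up 2-D measurements: two visibly separate groups, no labels supplied.
data = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centroids = kmeans(data, k=2)
print(labels)
```

No labels are passed in; the grouping emerges from the distances between points alone. Which cluster receives index 0 or 1 depends on the random initialisation, so only the grouping itself is meaningful.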

Unsupervised learning commonly draws on a few families of techniques [3]:

Clustering

Association

Anomaly detection

Latent variable models

Dimensionality reduction
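As an illustration of one item in this list, anomaly detection, the sketch below flags values that lie far from the bulk of a sample using the median absolute deviation (MAD). This is one simple technique chosen for the example, not a method taken from the chapter. The function name `mad_outliers`, the threshold of 3.5 (a commonly used convention for the modified z-score), and the readings are assumptions.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return []  # no spread to measure deviations against
    # The factor 0.6745 rescales the MAD so that scores are comparable to
    # ordinary z-scores when the data are roughly normally distributed.
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Made-up measurements with one value far from the rest.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0, 10.1]
print(mad_outliers(readings))
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value distorts them far less, so the outlier cannot hide itself by inflating the measure of spread.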


Figure 2.1 Machine learning in bioinformatics.

Among the above approaches, this chapter explores the algorithmic techniques that are widely applied in bioinformatics.

Unsupervised learning in bioinformatics: Machine learning in bioinformatics is spread across six domains [6], as shown in Figure 2.1.

Genomics and proteomics: The complete set of genes in a cell of an organism is called the genome. Genes are segments of DNA that are transcribed into RNA (mRNA, messenger RNA), which in turn is translated into proteins [7]. Cells are built from proteins, and the set of proteins in a cell, the proteome, is dynamic in nature because different tissues produce non-identical sets of proteins. This dynamic behaviour is reflected in gene expression data. Unlike proteomes, genomes are constant. The set of proteins present in a cell provides insights into the structure and function of the cell [8]. Gene expression data are difficult to handle manually because of their size. Hence machine learning approaches such as clustering algorithms are applied to varied gene expression data in order to group similar functions and structures of tissues and to explore hidden information.
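Grouping genes by the similarity of their expression profiles can be sketched as follows. This is a minimal illustration, assuming average-linkage agglomerative clustering with the distance 1 − Pearson correlation, a common choice for expression data but not one the chapter specifies here. The gene names, expression values, and function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # assumes neither profile is constant

def cluster_genes(profiles, n_clusters):
    """Average-linkage agglomerative clustering with distance 1 - correlation."""
    names = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names}
    clusters = [[name] for name in names]  # start with one cluster per gene

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical expression levels for four genes across five samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 4.1, 3.0, 2.2, 1.0],
    "geneD": [9.8, 8.1, 6.0, 3.9, 2.0],
}
print(cluster_genes(expression, n_clusters=2))
```

A correlation-based distance groups genes whose expression rises and falls together across samples, regardless of absolute level, so geneA and geneB land in one cluster even though geneB's values are roughly twice as large. Real expression matrices contain thousands of genes, where this quadratic-memory sketch would be replaced by an optimised library routine.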