Bioinformatics

Bioinformatics: summary, description, and annotation

Praise for the third edition of Bioinformatics
“This book is a gem to read and use in practice.”
“This volume has a distinctive, special value as it offers an unrivalled level of details and unique expert insights from the leading computational biologists, including the very creators of popular bioinformatics tools.”
“A valuable survey of this fascinating field. . . I found it to be the most useful book on bioinformatics that I have seen and recommend it very highly.”
“This should be on the bookshelf of every molecular biologist.”

The field of bioinformatics is advancing at a remarkable rate. With the development of new analytical techniques that make use of the latest advances in machine learning and data science, today’s biologists are gaining fantastic new insights into the natural world’s most complex systems. These rapidly progressing innovations can, however, be difficult to keep pace with.
The expanded fourth edition of the best-selling Bioinformatics aims to remedy this by providing students and professionals alike with a comprehensive survey of the current field. Revised to reflect recent advances in computational biology, it offers practical instruction on the gathering, analysis, and interpretation of data, as well as explanations of the most powerful algorithms presently used for biological discovery.
Bioinformatics offers the most readable, up-to-date, and thorough introduction to the field for biologists at all levels, covering both key concepts that have stood the test of time and the new and important developments driving this fast-moving discipline forwards.
This new edition features:
- New chapters on metabolomics, population genetics, metagenomics and microbial community analysis, and translational bioinformatics
- A thorough treatment of statistical methods as applied to biological data
- Special topic boxes and appendices highlighting experimental strategies and advanced concepts
- Annotated reference lists, comprehensive lists of relevant web resources, and an extensive glossary of commonly used terms in bioinformatics, genomics, and proteomics
Bioinformatics is an indispensable companion for researchers, instructors, and students of all levels in molecular biology and computational biology, as well as investigators involved in genomics, clinical research, proteomics, and related fields.


1 Bairoch, A. (2000). Serendipity in bioinformatics: the tribulations of a Swiss bioinformatician through exciting times! Bioinformatics. 16: 48–64. A personal narrative conveying the early history of the development of sequence databases and related software tools, events that set the groundwork for the modern bioinformatics landscape.

2 Green, E.D., Rubin, E.M., and Olson, M.V. (2017). The future of DNA sequencing. Nature. 550: 179–181. An insightful perspective regarding the next several decades of the application of DNA sequencing methodologies in novel contexts and the implications of those applications to issues of data storage and data sharing.

3 Rigden, D.J. and Fernández, X.M. (2018). The 2018 Nucleic Acids Research database issue and the online molecular biology database collection. Nucleic Acids Res. 46: D1–D7. The 25th overview of the annual database issue published by Nucleic Acids Research, capturing the wide variety of bioinformatic databases publicly available to the community. This overview is updated yearly, and the individual papers describing these database resources are freely available through the Nucleic Acids Research web site.


This chapter was written by Dr. Andreas D. Baxevanis in his private capacity. No official support or endorsement by the National Institutes of Health or the United States Department of Health and Human Services is intended or should be inferred.

2 Information Retrieval from Biological Databases

Andreas D. Baxevanis

Introduction

On April 14, 2003, the biological community celebrated the achievement of the Human Genome Project's major goal: the complete, accurate, and high-quality sequencing of the human genome (International Human Genome Sequencing Consortium 2001; Schmutz et al. 2004). The attainment of this goal, which many have compared to landing a person on the moon, has profoundly changed how biological and biomedical research is conducted and will undoubtedly continue to shape its direction in the future. The availability of not just human genome data but also human sequence variation data, model organism sequence data, and information on gene structure and function provides fertile ground for biologists to better design and interpret their experiments in the laboratory, fulfilling the promise of bioinformatics in advancing and accelerating biological discovery.

One of the most important databases available to biologists is GenBank, the annotated collection of all publicly available DNA and protein sequences (Benson et al. 2017; see Chapter 1). This database, maintained by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health (NIH), represents a collaborative effort among NCBI, the European Molecular Biology Laboratory (EMBL), and the DNA Data Bank of Japan (DDBJ). At the time of this writing, GenBank contained over 200 million sequences and over 300 trillion nucleotide bases. The completion of human genome sequencing and the sequencing of an ever-expanding number of model organism genomes, as well as the existence of a gargantuan number of sequences in general, provide a golden opportunity for biological scientists, owing to the inherent value of these data. At the same time, however, the sheer magnitude of the data presents a conundrum to the inexperienced user, resulting not just from the size of the “sequence information space” but from the fact that this space continues to grow by leaps and bounds, at a pace that will continue to accelerate, even though human genome sequencing has long been “completed.”

The effect of the Human Genome Project and other systematic sequencing projects on the continued accumulation of sequence data is illustrated by the growth of GenBank, as shown in Figure 2.1; the exponential growth rate illustrated in the figure is expected to continue for some time to come. The continued expansion of not just the sequence space but of the myriad biological data now available because of that expansion underscores the necessity for all biologists to learn how to navigate this information effectively in their work. In some cases, the data found within these virtual treasure troves may even allow investigators to avoid performing expensive experiments themselves.

GenBank (or any other biological database, for that matter) serves little purpose unless the data can be easily searched and entries retrieved in a useful, meaningful format. Otherwise, sequencing efforts such as those described above have no useful end: without effective search and retrieval tools, the biological community as a whole cannot make use of the information hidden within these millions of bases and amino acids, much less the structures they form or the mutations they harbor. Much effort has gone into making such data accessible to the biologist, and a selection of the programs and interfaces resulting from these efforts is the focus of this chapter. The discussion will center on querying databases maintained by NCBI, as these more “general” repositories are far and away the ones most often accessed by biologists, but attention will also be given to specialized databases that provide information not necessarily found through the use of Entrez, NCBI's integrated information retrieval system.
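
To make the idea of search and retrieval concrete, the following is a minimal sketch of how a query to Entrez might be composed programmatically. It assumes NCBI's public E-utilities web service (ESearch to find matching record identifiers, EFetch to retrieve the records), which the text above does not name. The function names, the example query, and the record identifiers are hypothetical and chosen only for illustration. The code only builds request URLs and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumption: base address of NCBI's public E-utilities service.
# The chapter text itself only refers to Entrez in general terms.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL that asks for IDs of records in `db` matching `term`."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def build_efetch_url(db, ids, rettype="fasta"):
    """Return an EFetch URL that retrieves the records with the given IDs as text."""
    params = {"db": db, "id": ",".join(ids), "rettype": rettype, "retmode": "text"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# Step 1: search the nucleotide database with a fielded query (hypothetical example).
search_url = build_esearch_url("nuccore", "BRCA1[Gene Name]", retmax=5)

# Step 2: fetch sequences for the IDs a search would return.
# These IDs are placeholders, not real search results.
fetch_url = build_efetch_url("nuccore", ["12345", "67890"])
```

Separating the search step from the fetch step mirrors the two halves of the problem described above: first locating the relevant entries, then retrieving them in a useful format.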