BLAST |
European Bioinformatics Institute (EBI) | www.ebi.ac.uk/blastall
National Center for Biotechnology Information (NCBI) | blast.ncbi.nlm.nih.gov
BLAST-Like Alignment Tool (BLAT) | genome.ucsc.edu/cgi-bin/hgBlat
NCBI Conserved Domain Database (CDD) | ncbi.nlm.nih.gov/cdd
Cancer Genome Anatomy Project (CGAP) | ocg.cancer.gov/programs/cgap
FASTA |
EBI | www.ebi.ac.uk/Tools/sss/fasta
University of Virginia | fasta.bioch.virginia.edu
RefSeq | ncbi.nlm.nih.gov/refseq
Structural Classification of Proteins (SCOP) | scop.berkeley.edu
Swiss-Prot | www.uniprot.org
1 Altschul, S.F., Boguski, M.S., Gish, W., and Wootton, J.C. (1994). Issues in searching molecular sequence databases. Nat. Genet. 6: 119–129. A review of the issues that are of importance in using sequence similarity search programs, including potential pitfalls.
2 Fitch, W. (2000). Homology: a personal view on some of the problems. Trends Genet. 16: 227–231. A classic treatise on the importance of using precise terminology when describing the relationships between biological sequences.
3 Henikoff, S. and Henikoff, J.G. (2000). Amino acid substitution matrices. Adv. Protein Chem. 54: 73–97. A comprehensive review covering the factors critical to the construction of protein scoring matrices.
4 Koonin, E. (2005). Orthologs, paralogs, and evolutionary genomics. Annu. Rev. Genet. 39: 309–338. An in-depth explanation of orthologs, paralogs, and their subtypes, with a discussion of their evolutionary origin and strategies for their detection.
5 Pearson, W.R. (2016). Finding protein and nucleotide similarities with FASTA. Curr. Protoc. Bioinf. 53: 3.9.1–3.9.23. An in-depth discussion of the FASTA algorithm, including worked examples and additional information regarding run options and use scenarios.
6 Wheeler, D.G. (2003). Selecting the right protein scoring matrix. Curr. Protoc. Bioinf. 1: 3.5.1–3.5.6. A discussion of PAM, BLOSUM, and specialized scoring matrices, with guidance regarding the proper choice of matrices for particular types of protein-based analyses.
1 Agarwal, P. and States, D.J. (1998). Comparative accuracy of methods for protein similarity search. Bioinformatics. 14: 40–47.
2 Altschul, S.F. (1991). Amino acid substitution matrices from an information theoretic perspective. J. Mol. Biol. 219: 555–565.
3 Altschul, S.F. and Koonin, E.V. (1998). Iterated profile searches with PSI-BLAST: a tool for discovery in protein databases. Trends Biochem. Sci. 23: 444–447.
4 Altschul, S.F., Gish, W., Miller, W. et al. (1990). Basic local alignment search tool. J. Mol. Biol. 215: 403–410.
5 Altschul, S.F., Madden, T.L., Schäffer, A.A. et al. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25: 3389–3402.
6 Brenner, S.E., Chothia, C., and Hubbard, T.J.P. (1998). Assessing sequence comparison methods with reliable structurally identified evolutionary relationships. Proc. Natl. Acad. Sci. USA. 95: 6073–6078.
7 Bücher, P., Karplus, K., Moeri, N., and Hofmann, K. (1996). A flexible motif search technique based on generalized profiles. Comput. Chem. 20: 3–23.
8 Chen, Z. (2003). Assessing sequence comparison methods with the average precision criterion. Bioinformatics. 19: 2456–2460.
9 Dayhoff, M.O., Schwartz, R.M., and Orcutt, B.C. (1978). A model of evolutionary change in proteins. In: Atlas of Protein Sequence and Structure, vol. 5 (ed. M.O. Dayhoff), 345–352. Washington, DC: National Biomedical Research Foundation.
10 Doolittle, R.F. (1981). Similar amino acid sequences: chance or common ancestry. Science. 214: 149–159.
11 Doolittle, R.F. (1989). Similar amino acid sequences revisited. Trends Biochem. Sci. 14: 244–245.
12 Gonnet, G.H., Cohen, M.A., and Benner, S.A. (1992). Exhaustive matching of the entire protein sequence database. Science. 256: 1443–1445.
13 Gribskov, M., McLachlan, A.D., and Eisenberg, D. (1987). Profile analysis: detection of distantly-related proteins. Proc. Natl. Acad. Sci. USA. 84: 4355–4358.
14 Henikoff, S. and Henikoff, J.G. (1991). Automated assembly of protein blocks for database searching. Nucleic Acids Res. 19: 6565–6572.
15 Henikoff, S. and Henikoff, J.G. (1992). Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA. 89: 10915–10919.
16 Henikoff, S. and Henikoff, J.G. (1993). Performance evaluation of amino acid substitution matrices. Proteins Struct. Funct. Genet. 17: 49–61.
17 Henikoff, S. and Henikoff, J.G. (2000). Amino acid substitution matrices. Adv. Protein Chem. 54: 73–97.
18 Jones, D.T., Taylor, W.R., and Thornton, J.M. (1992). The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8: 275–282.
19 Karlin, S. and Altschul, S.F. (1990). Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc. Natl. Acad. Sci. USA. 87: 2264–2268.
20 Kent, W.J. (2002). BLAT: the BLAST-like alignment tool. Genome Res. 12: 656–664.
21 Lipman, D.J. and Pearson, W.R. (1985). Rapid and sensitive protein similarity searches. Science. 227: 1435–1441.
22 Ma, B., Tromp, J., and Li, M. (2002). PatternHunter: faster and more sensitive homology search. Bioinformatics. 18: 440–445.
23 Pearson, W.R. (1995). Comparison of methods for searching protein sequence databases. Protein Sci. 4: 1145–1160.
24 Pearson, W.R. (2000). Flexible sequence similarity searching with the FASTA3 program package. Methods Mol. Biol. 132: 185–219.
25 Pearson, W.R. (2016). Finding protein and nucleotide similarities with FASTA. Curr. Protoc. Bioinf. 53: 3.9.1–3.9.23.
26 Pearson, W.R. and Lipman, D.J. (1988). Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA. 85: 2444–2448.
27 Rost, B. (1999). Twilight zone of protein sequence alignments. Protein Eng. 12: 85–94.
28 Ryan, J.F., Pang, K., Schnitzler, C.E. et al., and NISC Comparative Sequencing Program. (2013). The genome of the ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science. 342: 1242592.
29 Schneider, T.D., Stormo, G.D., Gold, L., and Ehrenfeucht, A. (1986). Information content of binding sites on nucleotide sequences. J. Mol. Biol. 188: 415–431.
30 Schnitzler, C.E., Simmons, D.K., Pang, K. et al. (2014). Expression of multiple Sox genes through embryonic development in the ctenophore Mnemiopsis leidyi is spatially restricted to zones of cell proliferation. EvoDevo. 5: 15.
31 Smith, T.F. and Waterman, M.S. (1981). Identification of common molecular subsequences. J. Mol. Biol. 147: 195–197.
32 Staden, R. (1988). Methods to define and locate patterns of motifs in sequences. Comput. Appl. Biosci. 4: 53–60.
33 Tatusov, R.L., Altschul, S.F., and Koonin, E.V. (1994). Detection of conserved segments in proteins: iterative scanning of sequence databases with alignment blocks. Proc. Natl. Acad. Sci. USA. 91: 12091–12095.
34 Tatusova, T.A. and Madden, T.L. (1999). BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiol. Lett. 174: 247–250.
35 Török, A., Schiffer, P.H., Schnitzler, C.E. et al. (2016). The cnidarian Hydractinia echinata employs canonical and highly adapted histones to pack its DNA. Epigenet. Chromatin. 9: 36.
36 Vogt, G., Etzold, T., and Argos, P. (1995). An assessment of amino acid exchange matrices in aligning protein sequences: the twilight zone revisited. J. Mol. Biol. 249: 816–831.