1 From Mendel to Molecules
Since the nineteenth century, scientists have been working to unravel the biological basis of inheritance. With Gregor Mendel’s mid‐nineteenth‐century discovery of the basic mechanisms of heredity, genetics was born, and humanity took its first small steps toward deciphering the genetic code. No longer would heredity be solely the domain of philosophers and farmers. Indeed, Mendel’s discoveries set the stage for major advances in genetics in the twentieth century and helped set in motion the series of discoveries that led to the sequencing of human and nonhuman genomes. This age of discovery, from Mendel to genome sequencing, is the subject of the first four chapters of this book. Chapter 1 covers some basic biology and tells the story of the evolution of genetics by examining some of the most significant discoveries in the field, discoveries that enabled the development of genomics. Chapter 2 looks specifically at the evolution of genetic and genomic sequencing technologies. Chapter 3 examines the human genome itself and the ways in which we are exploring and exploiting it, now and in the future. And, finally, Chapter 4 looks at the sequencing and genome analysis tools of the post‐genomic era, known as next‐generation sequencing (NGS).
Without any further ado, may we present to you the human genome!
This photo (Figure 1.1), also known as a karyotype, shows the 46 human chromosomes, the physical structures in the nuclei of your cells that carry almost the entire complement of your genetic material, also known as your genome. But don’t let this two‐dimensional representation of the genome fool you into believing in its simplicity. Almost 20 years ago, biologist Richard Lewontin used the image of a “triple helix” to explain how genes function and how they interact with each other and the environment. The strands of this triple helix are largely inseparable, and genetics doesn’t make sense unless these interactions are taken into account.
We could also have introduced you to your genome with a slew of the DNA sequence units—As, Ts, Gs, and Cs—in a string, or we could have shown you a picture of DNA in a test tube or even a picture of a nucleus of one of your cells where the DNA would be visible as dark stringy stuff. There are many ways to visualize the genome and this is part of its beauty.
Figure 1.1 This picture, known as a karyotype, is a photograph of all 46 human chromosomes. With an X and a Y chromosome, this is a male’s karyotype. A female’s karyotype would show two X chromosomes.
Credit: Photo Researchers
Figure 1.2 The nucleus of every human cell (the large purple mass inside the cell) contains DNA. Mitochondria, organelles in cells that produce energy (the smaller purple objects within the cell), also contain some DNA.
Credit: Wiley
Still, to understand function, we do need to learn about basic form. A karyotype, despite its limitations as a representation of the genome, illustrates that almost all the cells in the human body contain 22 pairs of chromosomes plus two sex‐determining chromosomes. These cells are called somatic cells, and they make up almost all nonreproductive tissue. The double helices that make up your chromosomes are composed of deoxyribonucleic acid, also known as DNA, on which approximately 20,000 genes are found.
Humans also have cells with 23 unpaired chromosomes. In these cells, each chromosome is a single double helix of DNA, and together the 23 chromosomes contain approximately 20,000 genes. These cells are called germ cells; they are the sperm and egg cells produced for reproduction. Germ cells carry a single genome’s worth of DNA, or more than 3 billion bases’ worth of nucleic acids.
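The chromosome counts in the last two paragraphs can be checked with a little arithmetic. The Python sketch below is only an illustration; the constant and function names are ours and are not standard terminology.

```python
# Illustrative names; the counts themselves come from the text above.
AUTOSOME_PAIRS = 22     # pairs of non-sex chromosomes in a somatic cell
SEX_CHROMOSOMES = 2     # XX or XY


def somatic_chromosome_count() -> int:
    """Chromosomes in a typical somatic cell: 22 pairs plus two sex chromosomes."""
    return AUTOSOME_PAIRS * 2 + SEX_CHROMOSOMES


def germ_cell_chromosome_count() -> int:
    """Chromosomes in a sperm or egg cell: one member of each pair."""
    return somatic_chromosome_count() // 2


print(somatic_chromosome_count())    # 46
print(germ_cell_chromosome_count())  # 23
```

When a sperm and an egg combine, the two sets of 23 restore the 46 chromosomes seen in the karyotype.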
Chromosomes are somewhat like genetic scaffolding: they hold in place the long, linearly arranged sequences of nucleotides, or base pairs, that make up our genetic code. Four different nucleotides make up this code: adenine, thymine, guanine, and cytosine, commonly abbreviated as A, T, G, and C. Found along that scaffolding are our genes, which are made from DNA, the most basic building block of life. These genes code for proteins, the structural and machine‐like molecules that make up our bodies and shape our physiology and our mental state. Through the Human Genome Project, scientists are not simply learning the order of this DNA sequence but are also beginning to locate and study the genes that lie on our chromosomes. But not all DNA contains genes.
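Because the genetic code is written in just four letters, a stretch of DNA can be treated as an ordinary string of text. The Python sketch below assumes a short, made-up sequence (it is not taken from any real genome) and simply tallies how often each base appears; the function name is hypothetical.

```python
from collections import Counter

VALID_BASES = "ATGC"  # adenine, thymine, guanine, cytosine


def count_bases(sequence: str) -> dict:
    """Return how many times each of the four bases occurs in a DNA string."""
    sequence = sequence.upper()
    unexpected = set(sequence) - set(VALID_BASES)
    if unexpected:
        raise ValueError(f"not a DNA base: {sorted(unexpected)}")
    counts = Counter(sequence)
    # Report every base, including any that never appear.
    return {base: counts[base] for base in VALID_BASES}


example = "ATGCGATACGCTTGA"   # invented 15-base sequence, for illustration only
print(count_bases(example))   # {'A': 4, 'T': 4, 'G': 4, 'C': 3}
```

A real human chromosome would be handled the same way in principle, only with tens or hundreds of millions of letters instead of fifteen.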
Roughly 3 billion base pairs exist in the collection of chromosomes your mother transmitted to you. Add to that the chromosomes your father gave you, and your cells contain around 6 billion base pairs, a complete diploid human genome. There are long stretches of DNA between genes, known as intergenic or noncoding regions. And even within genes, some DNA may not code for proteins. These areas, when they are found within genes, are called introns. While these genomic regions were once believed to have no products and no function, scientists now understand that both introns and intergenic regions play a role in regulating DNA function. The Encyclopedia of DNA Elements (ENCODE) Project estimates, for example, that while only 2.94% of the entire human genome is protein coding, 80.4% of genome sequences might govern the regulation of genes. (1) Unlike the human genome and all other eukaryotic genomes, however, bacterial genomes do not have introns and have very short intergenic regions. Curiously, though, the archaea, a third major domain of life (in addition to eukaryotes and bacteria), do have introns, but not necessarily the same kind of introns as eukaryotes.
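To get a feel for these proportions, the percentages can be turned into rough base-pair counts. The Python sketch below assumes a round 3 billion base pairs for one parental set of chromosomes, as in the text, and uses the ENCODE percentages quoted above; the names are ours, and the results are order-of-magnitude estimates, not measurements.

```python
HAPLOID_BASE_PAIRS = 3_000_000_000   # approximate size of one parental set
PROTEIN_CODING_FRACTION = 0.0294     # ENCODE estimate quoted in the text
REGULATORY_FRACTION = 0.804          # ENCODE estimate quoted in the text


def diploid_base_pairs(haploid: int = HAPLOID_BASE_PAIRS) -> int:
    """Base pairs in a cell carrying both the maternal and paternal sets."""
    return 2 * haploid


def base_pairs_in_fraction(fraction: float, total: int = HAPLOID_BASE_PAIRS) -> int:
    """Approximate number of base pairs covered by a fraction of the genome."""
    return round(fraction * total)


print(f"{diploid_base_pairs():,}")                             # 6,000,000,000
print(f"{base_pairs_in_fraction(PROTEIN_CODING_FRACTION):,}")  # 88,200,000
print(f"{base_pairs_in_fraction(REGULATORY_FRACTION):,}")      # 2,412,000,000
```

In other words, on these estimates fewer than 100 million of the roughly 3 billion base pairs code for protein, while well over 2 billion may be involved in regulation.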
Let’s begin our tour of the human genome with a very basic lesson in genetic terminology. What exactly is genetics, and how is it different from genomics? Genetics is the study of the mechanisms of heredity. The distinction between genetics and genomics is one of scale. Geneticists may study single or multiple human traits. In genomics, an organism’s entire collection of genes, or at least many of them, is examined to see how entire networks of genes influence various traits.
A genome is the entire set of an organism’s genetic material. The fundamental goal of the Human Genome Project was to sequence all of the DNA in the human genome. Sequencing a genome, whether human or nonhuman, simply means deciphering the linear arrangement of the bases in the DNA that makes up that genome. In eukaryotes (plants, animals, fungi, and single‐celled organisms called protists), the vast majority of the genetic material is found in the cell’s nucleus. The Human Genome Project has been primarily interested in the more than 3 billion base pairs of nuclear DNA.
A tiny amount of DNA is also found in the mitochondria, the cellular structures responsible for the production of energy within a cell. Whereas the human nuclear genome contains more than 3 billion base pairs of DNA and approximately 20,000 genes (that’s nearly 10,000 genes fewer than when the first edition of this book was published in 2005), the reference human mitochondrial genome contains only 16,568 bases and 37 genes. (2) Like bacterial genomes, mitochondrial DNA, or mtDNA, has short intergenic regions, and its genes do not contain introns. Another interesting characteristic of mtDNA is that it is always maternally inherited. This has made mtDNA very helpful for tracing human evolutionary history through the female line. Discoveries of this kind were made possible, in part, by sequencing mtDNA.
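One way to see how differently the nuclear and mitochondrial genomes are packed is to divide each genome’s size by its number of genes. The Python sketch below uses the approximate figures from the text (treating “more than 3 billion” as a round 3 billion); the class and field names are ours, and base pairs per gene is a crude average that ignores how long individual genes actually are.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeSummary:
    """Approximate size and gene count of a genome (illustrative structure)."""
    name: str
    base_pairs: int
    genes: int

    def base_pairs_per_gene(self) -> float:
        """Average amount of DNA per gene, including any noncoding DNA."""
        return self.base_pairs / self.genes


# Figures as given in the text; the nuclear values are rounded.
NUCLEAR = GenomeSummary("nuclear", 3_000_000_000, 20_000)
MITOCHONDRIAL = GenomeSummary("mitochondrial", 16_568, 37)

for genome in (NUCLEAR, MITOCHONDRIAL):
    print(f"{genome.name}: about {genome.base_pairs_per_gene():,.0f} base pairs per gene")
```

On these numbers the nuclear genome averages about 150,000 base pairs per gene and the mitochondrial genome about 448, which fits the description of mtDNA as compact, with short intergenic regions and no introns.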