March 1, 2006
Genes and Genomes
David Botstein
Professor of Molecular Biology and Director of the Lewis-Sigler Institute for Integrative Genomics, Princeton University
Minutes of the 22nd Meeting of the 64th Year
After enjoying refreshments prepared by the Hospitality Committee, Bill Haynes called the meeting to order at 10:15 a.m. and John Marks led the 95 members present in the Invocation. John Frederick read the minutes of last week’s meeting, a thorough and eloquent account of that week’s talk. Jim Johnson introduced his visitor, Bob Caghan. Bill Haynes announced some good news: Charlie Ufford’s raincoat had been found! John Schmidt then introduced the speaker, Prof. David Botstein, Director of the Lewis-Sigler Institute for Integrative Genomics. The title of his talk: Genes and Genomes.
Prof. Botstein planned to begin with an account of his own research and follow with a description of an innovative program in introductory science education he has developed. He stressed that time constraints would make details true to only a first approximation, so I hope you will forgive my précis of that approximation, which cannot do justice to such a fact-filled presentation.
To introduce his research, Prof. Botstein gave a whirlwind review of a science that began as biochemistry in the 19th century, became genetics during the first half of the 20th century, morphed into molecular biology in the 1950s, and in the last 25 years gave birth to the new science of genomics. He observed that it is no coincidence that this new field evolved contemporaneously with computer science, for it is on this new technology that genomics depends for its very existence.
Introducing key concepts, Prof. Botstein explained that DNA encodes inherited biological information as a sequence of nucleotide bases. He likened DNA to a recording tape and the cell to a tape recorder. The cell reads information on the tape and translates it into amino acids, which are the building blocks of proteins. And it is proteins that do the work of the cell: metabolizing food, contracting muscles, conducting nerve impulses and so on. Although the concept of the genome is at least 70 years old, it was not until Watson and Crick established the structure of DNA in 1953, and the ability to read the code was subsequently developed, that the science of genomics began to flourish. Initial research studied viral phage particles, yeast, microscopic worms, and fruit flies. But even these simple organisms turned out to have strings of tens of millions of bases in their genetic code. When the human genome was finally sequenced at the turn of the century, some three billion bases encoding some 30 thousand genes were found.
When Prof. Botstein showed a small segment of the code, a few hundred seemingly random letters representing amino acids in a simple protein molecule, he dramatically illustrated the truth that the unaided human brain is unable to read, let alone interpret, the message in such a code. And this is the challenge of genomics, an information science that analyzes raw biochemical data and reformats it in a way we can understand. With this technology some remarkable insights have been obtained. In dramatic confirmation of Darwin’s theory, we have learned that much of the basic cellular machinery in yeast and bacteria, organisms thought to be some three billion years old, is powering our bodies today. For instance, actin, the protein that allows our muscles to contract, is also found in yeast. In another illustration, Prof. Botstein described how a yeast, fatally deprived of a vital gene for producing the protein HMG-CoA reductase (the very same enzyme inhibited by cholesterol-lowering drugs), can be reanimated by inserting the corresponding gene from a human cell. He wryly commented that these experiments are relatively easy to do: there are no yeast-rights groups to interfere.
So what is a human muscle gene doing in a sedentary yeast culture? And since every cell in the human body has the same complement of genes, how is it that the muscle cells differ from liver cells, brain cells from blood cells? It appears that the answer lies in the differential activation of genes and the seemingly infinite combinatorial possibilities in the interaction of thousands of different proteins. It is the understanding of these mechanisms that is Prof. Botstein’s Holy Grail, and what genomics is all about. He summarized the objectives of the Institute as addressing two problems: “What’s going on in the cell to begin with, and getting the computer to explain it to us.”
The gene chip is an example of the kind of technology that, combined with computer analysis, is helping to answer these questions. The chip resembles a microscope slide on which thousands of different gene sequences are arrayed. When an unknown cell sample is applied to the surface of the chip, active biochemical states are recognized by a change in color, somewhat analogous to the way litmus paper changes from red to blue as an acid solution becomes alkaline. Instead of recognizing acidity, the gene chip recognizes which genes are activated, and as a camera scans the chip, the computer converts colors to digital information. At this point, Prof. Botstein observed, he is doing experiments in the computer, long after the test tubes have been put away. And it is the development of algorithms that sort and recognize patterns in this enormous wealth of raw data that is Prof. Botstein’s brave new world. In his words, “We educate the computer and the computer educates us.”
To illustrate the way in which computer algorithms can help to analyze complex data, Prof. Botstein showed a painting by Raphael. He decomposed the image into a random mixture of pixels, and then, with a sorting algorithm that correlated colors, vectors, and their relationship to each other, reconstructed the picture. I have to admit I appreciated this demonstration in the same way I enjoy a magic show. I have no idea how it works, and I can only marvel that a rabbit really does come out of the hat. A direct application of these techniques to the study of human disease is seen in breast cancer, which often appears as a single entity under the microscope but can have a variety of genetic patterns that result in different outcomes and call for different treatments.
After limning these exciting prospects of the future, and dazzling us with a startling array of mathematical equations that flashed across the screen, Prof. Botstein moved to the second half of his talk and described the new course of study he has established at Princeton. It was clear that the traditional departmental course structure is ill-suited to creating scientists who must be equally at ease with evolutionary dynamics, cellular metabolism, computer programming, and quantum mechanics, not to mention arcane areas of mathematical analysis.
So, with a group of like-minded, adventurous colleagues representing these varied disciplines, he formulated an integrated approach. Bravely they triaged their cherished teaching traditions into two piles: Pile A consisted of truly fundamental concepts, and Pile B, all those wonderful ideas that are usually taught but could be set aside. And since the teachers in the course receive half their salary from the Institute, Prof. Botstein gleefully noted, they are beholden to him. As for the students, they are a self-selected group of potential scientists prepared to undertake this arduous new course of study. And what a curriculum they have! In the first semester the students embark on challenging experiments to establish such fundamental notions as Avogadro’s number and the Boltzmann constant, and interpret their data using Java programs of their own devising. In the second semester, having heuristically confirmed the basic laws of physics and chemistry, with the aid of differential calculus, functional analysis, and probability models, they move on to the study of the quantum world of molecular structure, and derive Einstein’s equations that explain Brownian motion. By the end of their sophomore year, these extraordinary young students are on the frontiers of scientific knowledge, and the results of their studies are appearing in scientific journals. For the final exam, students all wore T-shirts emblazoned with their war cry, “Desperately Fighting Against Ignorance, One Integral at a Time.”
Prof. Botstein was clearly delighted by his young protégés, and we should be pleased, too, for these are surely the scientists who will revolutionize our understanding of living organisms and human disease in the years to come.
A brief question and answer period followed, and the meeting adjourned at 11:30 a.m.
Respectfully submitted,
Roger Moseley