SNP Count Up to 1.4 Million: Map Accelerates Discovery of Disease Genes and Human Population History

February 12, 2001

CAMBRIDGE, Mass. – In a companion volume to the "Book of Life," scientists have created the largest publicly available catalog of single-letter DNA differences (SNPs), 1.4 million in all, with their exact locations in the human genome. The SNP map promises to revolutionize both the mapping of diseases and the tracing of human history. Already, it is accelerating discovery of disease genes and providing a "fossil record" of human population history, which suggests that we are all descended from a small group of about 10,000 people.

The current SNP map results from the coordinated efforts of an international team of industry and academic scientists, termed the International SNP Map Working Group. More than 95% of SNPs on the map come from two large-scale efforts: The SNP Consortium and the public Human Genome Project.

"With the SNP map, we have overcome a huge barrier to doing human genetic studies—whether it is to study a particular defect in a gene or to trace ancestry. For a vast majority of genes, scientists can turn to the database of SNPs instead of wasting precious time and money hunting them down," says Eric Lander, director of the Whitehead Center for Genome Research. "Instead scientists can spend their time doing the actual research."

SNPs are the bedrock of human genetics: they can be used to track inheritance of any gene, contribute to the traits that make us unique, and underlie our susceptibilities to common diseases such as cancer, diabetes, and heart disease. It is also believed that SNPs may help explain why individuals respond differently to drugs.

Though the 1.4 million SNPs identified thus far represent a fraction of the total, every last SNP isn't necessary to get started with a genetic study because SNPs travel together—one SNP carries information about its nearby SNP neighbors.
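The idea that neighboring SNPs "travel together" can be illustrated with a small sketch. The chromosome data below is invented purely for illustration and is not taken from the SNP map. The function names are likewise hypothetical, and the calculation assumes each SNP has only two alleles. It counts the distinct allele combinations observed and computes r-squared, a standard measure of how well one SNP predicts another.

```python
from collections import Counter

# Hypothetical toy data (NOT from the SNP map): each string lists the
# alleles one chromosome carries at three nearby SNP positions.
haplotypes = [
    "ACG", "ACG", "ACG", "ACG",
    "GTA", "GTA", "GTA",
    "ATA",
]

def distinct_haplotypes(haps):
    """Count how often each combination of alleles is observed."""
    return Counter(haps)

def r_squared(haps, i, j):
    """Squared correlation between SNP positions i and j.

    Assumes both SNPs are biallelic. A value of 1.0 means the allele at
    one position fully predicts the allele at the other.
    """
    n = len(haps)
    a, b = haps[0][i], haps[0][j]  # pick one reference allele per SNP
    p_a = sum(h[i] == a for h in haps) / n
    p_b = sum(h[j] == b for h in haps) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haps) / n
    d = p_ab - p_a * p_b  # departure from statistical independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom else 0.0

if __name__ == "__main__":
    print(distinct_haplotypes(haplotypes))
    print(r_squared(haplotypes, 1, 2))
    print(r_squared(haplotypes, 0, 1))
```

In this toy data only three of the eight possible combinations appear, and the second and third SNPs predict each other perfectly, so typing one of them would be enough to know the other.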

Today scientists are using the SNP map as a tool to correlate diseases with their genetic origins. Instead of spending a year collecting variants in a disease gene, researchers now find the variants a click away and can start a study in one afternoon.

"Last month, we were able to examine how a gene, which affects testosterone levels in the body, affects prostate cancer risk. We pulled 15 SNPs off the web, and looked for them in our patients. The 15 SNPs came in only four combinations. So that gene can now be reduced to four flavors, and each tested for a role in influencing disease. A massive, costly project was reduced to about two weeks," explains David Altshuler, who led the Whitehead team and is senior author on the paper. Altshuler is Director of Medical and Population Genetics at the Whitehead Center for Genome Research, and an Assistant Professor of Genetics and Medicine at Harvard Medical School and the Massachusetts General Hospital.

Altshuler and David Bentley of the Sanger Centre in Hinxton, England, serve as communicating authors.

The quality of the SNPs is exceptionally high, the scientists report. It is critical for SNP research that the vast majority of SNPs on the map represent true polymorphisms rather than laboratory or computer errors. To confirm the validity of the data, the group tested more than 2,700 SNPs in a range of tests. These tests, performed in part by independent laboratories, showed that 95% of claimed SNPs represent true differences (rather than errors), and that 82% of the SNPs have frequencies of greater than 10% in human populations.

The SNP map described in the Nature paper is more than just a reference for disease genes. It provides the first genome-wide view of how SNPs are distributed throughout the genome. By examining this pattern, the scientists could observe the "fossil record" of human population history. This record tells the tale of a small group of about 10,000 people expanding rapidly to populate the whole earth in the last 50,000 to 100,000 years.

This story of human history explains the distribution of SNPs observed throughout the genome. The scientists report that SNPs aren't evenly distributed, but rather vary widely in their density in different neighborhoods of the human genome. Some areas are deserts without a single SNP, while others have a great number of variants. "In order to interpret studies of SNPs and disease, we first need to understand the landscape of human genome variation," says Dr. Altshuler. "You could easily be misled into thinking a particular gene was unusual because it had too few or too many SNPs. It turns out that such variability is entirely normal."

The SNP map was created by a collaborative effort of five major centers, termed The International SNP Map Working Group. The contributing centers were (in alphabetical order) Cold Spring Harbor Labs, National Center for Biotechnology Information, The Sanger Centre, Washington University in St. Louis, and the Whitehead/MIT Center for Genome Research. In addition to SNPs produced by these groups as part of The SNP Consortium and Human Genome Project, SNPs deposited by over 50 additional labs were integrated into the map.

CONTACT

Communications and Public Affairs
Phone: 617-258-6851
Email: newsroom@wi.mit.edu

Whitehead Institute is a world-renowned non-profit research institution dedicated to improving human health through basic biomedical research. Wholly independent in its governance, finances, and research programs, Whitehead shares a close affiliation with Massachusetts Institute of Technology through its faculty, who hold joint MIT appointments.

© Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142