A massive effort to sequence and compare 29 mammalian genomes has shed new light on the “dark matter” of the genome, the more than 98% of DNA that doesn’t code for proteins. The DNA that lies outside of gene sequences was once called “junk DNA.” But researchers now know that these non-coding regions have important biological functions. Many disease-causing mutations have been found in these areas, and scientists have pieced together some clues to their functions. For example, some regions regulate the expression of genes, controlling when genes are turned on and off. Nevertheless, this vast genetic dark matter remains largely uncharted.
To gain new insights, an international team of researchers set out to compare the genome sequences of several mammalian species. Regions that remain the same or have evolved only gradually, they reasoned, must have some function.
In the early online edition of Nature on October 12, 2011, the researchers reported the sequencing of 20 new mammalian genomes, including those of the rabbit, dolphin and elephant. They compared these new sequences with 9 others that were previously described, including the human genome. The scientists found that at least 5% of the genome appears to be constrained by evolution. They were able to identify 3.6 million specific elements under constraint, which together make up over 4% of the human genome. These elements include hundreds of new families of RNA, thousands of previously undetected segments of protein-coding DNA, and 2.7 million elements thought to play a role in controlling gene expression.
Significantly, the researchers found that many of the elements they identified overlap with variants that were linked to diseases and conditions in previous genomics studies.