DNA :: The bumper book of DNA no-no’s

Most genome sequencers are looking for genes inside living species to understand their function. But one genome project is deliberately searching for DNA sequences that are absent from species — perhaps because they are too dangerous to life to exist. The US team have developed software that calculates all the possible sequences of nucleotides and then scans sequence databases to identify sequences that aren’t present. They believe their results will have far-reaching applications.

COULD there be forbidden sequences in the genome, ones so harmful that they are not compatible with life? One group of researchers thinks so. Unlike most genome sequencing projects, which set out to search for genes that are conserved within and between species, this one aims to identify “primes”: DNA sequences and chains of amino acids so dangerous to life that they do not exist.

“It’s like looking for a needle that’s not actually in the haystack,” says Greg Hampikian, professor of genetics at Boise State University in Idaho, who is leading the project. “There must be some DNA or protein sequences that are not compatible with life, perhaps because they bind some essential cellular component, for example, and have therefore been selected out of circulation. There may also be some that are lethal in some species, but not others. We’re looking for those sequences.”

To do this, Hampikian and his colleague Tim Anderson, also at Boise, have developed software that calculates all the possible sequences of nucleotides (the “letters” of DNA) up to a certain length, and then scans sequence databases such as the US National Institutes of Health’s GenBank to identify the smallest sequences that aren’t present. Those that don’t occur in one species but do in others are termed “nullomers”, while those that aren’t found in any species are termed primes.
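
The search described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the Boise team's software: the function names (`observed_kmers`, `absent_kmers`, `shortest_absent`, `classify`) are hypothetical, the sequences are passed in as plain strings rather than read from GenBank, and only one DNA strand is examined.

```python
from itertools import product

ALPHABET = "ACGT"  # the four nucleotide "letters"


def observed_kmers(sequence, k):
    """Return the set of all length-k substrings found in one sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def absent_kmers(sequences, k):
    """Return every possible k-letter sequence that occurs in none of the inputs."""
    seen = set()
    for seq in sequences:
        seen |= observed_kmers(seq, k)
    every_kmer = {"".join(p) for p in product(ALPHABET, repeat=k)}
    return every_kmer - seen


def shortest_absent(sequences, max_k):
    """Find the smallest length (up to max_k) that has at least one absent sequence."""
    for k in range(1, max_k + 1):
        missing = absent_kmers(sequences, k)
        if missing:
            return k, missing
    return None, set()


def classify(species_to_sequences, k):
    """Split absent sequences into per-species nullomers and primes.

    Primes are absent from every species supplied; a species' nullomers are
    absent from that species but present in at least one other.
    """
    absent = {name: absent_kmers(seqs, k)
              for name, seqs in species_to_sequences.items()}
    primes = set.intersection(*absent.values())
    nullomers = {name: missing - primes for name, missing in absent.items()}
    return nullomers, primes


# Toy data standing in for a sequence database (made up for illustration).
toy_database = {"species_a": ["ACGTAC"], "species_b": ["TTGCA"]}
toy_nullomers, toy_primes = classify(toy_database, 2)
```

Enumerating every candidate in memory is workable for short lengths such as the 11-letter case (about 4.2 million possibilities), but the candidate set grows by a factor of four with each extra letter, so a real tool would need a more compact representation.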

Hampikian’s team is deliberately searching for the shortest absent sequences in order to minimise the possibility that sequences are missing simply due to chance. So far they have found 86 sequences, each 11 nucleotides long, that have never been reported in humans.
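
A rough calculation shows why short sequences are the interesting ones. The sketch below is an illustration under strong assumptions that are not taken from the researchers: it treats a genome as a string of independent, equally likely letters, which real DNA is not, and it uses a round figure of about 3 billion letters for the human genome. The function name `expected_absent_by_chance` is hypothetical.

```python
import math


def expected_absent_by_chance(k, genome_length, alphabet_size=4):
    """Expected number of length-k sequences missing from a random genome.

    Assumes every letter is independent and equally likely, and treats
    overlapping positions as independent trials (a simplification).
    """
    possible = alphabet_size ** k          # e.g. 4**11 = 4,194,304
    positions = genome_length - k + 1      # places a k-letter sequence could start
    # log of P(one particular sequence is missing at every position)
    log_p_absent = positions * math.log1p(-1.0 / possible)
    return possible * math.exp(log_p_absent)


HUMAN_GENOME_LETTERS = 3_000_000_000  # approximate, assumed for illustration

for k in (11, 13, 15):
    expected = expected_absent_by_chance(k, HUMAN_GENOME_LETTERS)
    print(f"k={k}: about {expected:.3g} sequences expected to be absent by chance")
```

Under this simplified model essentially no 11-letter sequence should be missing from a genome of that size by chance alone, whereas at 15 letters a large number would be missing from a single genome for purely statistical reasons. Real genomes are far from random, so the model only indicates why absence is more striking at shorter lengths.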

They have also identified more than 60,000 primes of 15 nucleotides in length and 746 protein “peptoprime” strings of five amino acids that have never been reported in any species. “These represent the largest possible set of lethal sequences,” says Hampikian, who expects the numbers to shrink as more sequence information is added to the database. He is presenting his results at the Pacific Symposium on Biocomputing in Maui, Hawaii, this week.

Whether these sequences have any biological significance in living organisms is not yet known. The next step is to test 20 of the peptoprimes in bacteria and human cells to see whether they have any effect, such as causing death or provoking an immune reaction.

Hampikian believes the applications of his work could be wide-ranging. He has already received a $1 million grant from the US Department of Defense to develop a DNA “safety tag” that could be added to voluntary DNA reference samples in criminal cases to distinguish them from forensic samples. Such tags would not necessarily have to consist of lethal sequences, but could be based on primes that would be easy to detect using a simple kit.

Further down the line there is the possibility of constructing a “suicide gene” to code for deadly amino acid primes. It could be attached to genetically modified organisms and activated to destroy them at a later date if they turned out to be dangerous, Hampikian suggests.
