Mammalian Genome Project

Identification of the functional elements in the human genome – including both coding and non-coding – is a key foundation for biomedical research. One of the most powerful ways to discover these elements is through cross-species comparisons with other mammalian genomes – in effect, deciphering evolution's laboratory notebook containing the results of 100 million years of evolution.

The mammalian genome project is an NIH-funded effort to expand the current genome coverage of the mammals (human, chimpanzee, mouse, dog, opossum) by sequencing 24 additional mammals to low coverage (2x). The goal is to create low-coverage genome assemblies and align the resulting sequence to the human genome to permit comparative genomic analysis.

The Broad Institute is sequencing 15 mammals, while two other centers are sequencing the other 9 mammals. We are also developing algorithms to identify regions of sequence similarity across species, which have persisted through evolution and are indicative of genomic functionality. These regions include genes and smaller regulatory elements, such as transcription factor binding sites, which play key roles in determining the activation of genes and pathways in different cellular contexts.
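
As a rough illustration of what "regions of sequence similarity across species" means, the sketch below scores each column of a toy multiple alignment by how many species match the reference base, then reports windows in which every column is fully conserved. It is not the project's algorithm: the sequences, the function names, the window size and the threshold are all hypothetical, and real methods typically rely on phylogenetic models rather than simple identity.

```python
from typing import List, Tuple

def column_identity(alignment: List[str]) -> List[float]:
    """Fraction of non-reference sequences that match the reference base
    at each alignment column. The first sequence is the reference."""
    reference, others = alignment[0], alignment[1:]
    scores = []
    for i, ref_base in enumerate(reference):
        # Gap characters in the reference never count as matches.
        matches = sum(1 for seq in others if ref_base != "-" and seq[i] == ref_base)
        scores.append(matches / len(others))
    return scores

def conserved_windows(scores: List[float], window: int,
                      threshold: float) -> List[Tuple[int, int]]:
    """Half-open (start, end) intervals where the mean score over a sliding
    window meets the threshold; overlapping windows are merged."""
    hits: List[Tuple[int, int]] = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean >= threshold:
            if hits and start <= hits[-1][1]:
                hits[-1] = (hits[-1][0], start + window)
            else:
                hits.append((start, start + window))
    return hits

# Hypothetical toy alignment (reference first); not real genomic data.
toy_alignment = [
    "ACGTACGGATTA",  # reference
    "ACGTACTCATTA",  # species 2
    "ACGTACAGCTTA",  # species 3
]

print(conserved_windows(column_identity(toy_alignment), window=4, threshold=1.0))
```

In this toy case the first six columns are identical in all three sequences, so they are reported as a single merged interval.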

The mammals receiving low coverage sequence were chosen primarily to maximize the total branch length of the evolutionary tree. Emphasis was also placed on organisms that represent the diversity of the mammalian tree and, where possible, are biologically useful models.
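
One simple way to picture "maximizing the total branch length" is a greedy selection: starting from a reference species, repeatedly add the species that contributes the most branch length not already covered. The sketch below does this on a made-up four-leaf tree. The tree, its branch lengths and the function names are hypothetical, and this is not a description of how the project's species were actually chosen.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy tree: child -> (parent, branch length).
TREE: Dict[str, Tuple[str, float]] = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("n2", 3.0),
    "D": ("n2", 0.5),
    "n2": ("root", 1.0),
}

def path_to_root(tree: Dict[str, Tuple[str, float]], leaf: str) -> List[str]:
    """Edges from a leaf up to the root; each edge is named by its child node."""
    edges = []
    node = leaf
    while node in tree:
        edges.append(node)
        node = tree[node][0]
    return edges

def greedy_select(tree: Dict[str, Tuple[str, float]], leaves: List[str],
                  start: str, k: int) -> Tuple[List[str], float]:
    """Pick k leaves, beginning with `start`, each time adding the leaf that
    contributes the most branch length not yet covered."""
    chosen = [start]
    covered: Set[str] = set(path_to_root(tree, start))
    while len(chosen) < k:
        best, best_gain = None, -1.0
        for leaf in leaves:
            if leaf in chosen:
                continue
            gain = sum(tree[e][1] for e in path_to_root(tree, leaf)
                       if e not in covered)
            if gain > best_gain:
                best, best_gain = leaf, gain
        if best is None:  # no leaves left to add
            break
        chosen.append(best)
        covered.update(path_to_root(tree, best))
    total = sum(tree[e][1] for e in covered)
    return chosen, total

print(greedy_select(TREE, ["A", "B", "C", "D"], start="A", k=3))
```

Here leaf C is added first because it sits on a long, previously uncovered lineage, even though B is the closest relative of the starting leaf.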

Although low-coverage sequence is effective for identifying features of the human genome shared across most mammals, we recognize the inherent limitations of low-coverage genome analyses. We will therefore obtain higher-quality sequence data (6-7X coverage) from a limited set (8 of the 24) of the mammals picked for low coverage, which will significantly aid in the annotation and understanding of the human genome.
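
To give a sense of the gap between low and deep coverage, the following back-of-the-envelope sketch uses the idealized Poisson (Lander-Waterman style) approximation, in which the expected fraction of bases covered by at least one read at coverage c is 1 - e^(-c). This assumes uniformly random read placement, which real sequencing only approximates; the function name is illustrative.

```python
import math

def expected_fraction_covered(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read, assuming reads
    land uniformly at random (Poisson approximation)."""
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)

for depth in (2, 7):
    print(f"{depth}X: {expected_fraction_covered(depth):.1%}")
# prints:
# 2X: 86.5%
# 7X: 99.9%
```

Under this simplified model, roughly one base in seven is expected to be missing from a 2X data set, while a 7X data set is expected to be nearly complete.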

In addition to mammalian genomes, we are also sequencing several other model organisms for comparative genomic analysis:

Additional Vertebrates and Invertebrates


Project Data Release Summary


| Organism | Status | GenBank Accession | Notes |
| --- | --- | --- | --- |
| African savannah elephant (Loxodonta africana) | 2X assembled | AAGU00000000 | Scheduled for deep coverage; Broad |
| Nine-banded armadillo (Dasypus novemcinctus) | 2X assembled | AAGV00000000 | Scheduled for deep coverage; Baylor |
| European rabbit (Oryctolagus cuniculus) | 2X assembled | AAGW00000000 | Scheduled for deep coverage; Broad |
| Lesser hedgehog tenrec (Echinops telfairi) | 2X assembled | AAIY00000000 | 2X |
| Common shrew (Sorex araneus) | 2X assembled | AALT00000000 | 2X |
| Guinea pig (Cavia porcellus) | 7X assembled | AAKN00000000 | 7X |
| European hedgehog (Erinaceus europaeus) | 2X assembled | AANN00000000 | 2X |
| Cat (Felis catus) | 2X assembled | AANG00000000 | Sequenced at Agencourt; scheduled for deep coverage; WashU |
| Little brown bat (Myotis lucifugus) | 2X assembled | AAPE00000000 | Scheduled for deep coverage; Broad |
| Ground squirrel (Spermophilus tridecemlineatus) | 2X assembled | AAQQ00000000 | 2X |
| Bushbaby (Otolemur garnettii) | 2X assembled | AAQR00000000 | 2X |
| Tree shrew (Tupaia belangeri) | 2X assembled | AAPY00000000 | Scheduled for deep coverage; WashU |
| Horse (Equus caballus) | 7X assembled | AAWR02000000 | No 2X assembly; Broad |
| Pika (Ochotona princeps) | 2X assembled | AAYZ01700000 | 2X |
| Mouse lemur (Microcebus murinus) | 2X assembled | | 2X |
| Hyrax (Procavia capensis) | | | Baylor; 2X |
| Megabat (Pteropus vampyrus) | | | Baylor; 2X |
| Dolphin (Tursiops truncatus) | | | Baylor; 2X |
| Tarsier (Tarsius syrichta) | | | Baylor; 2X |
| Kangaroo rat (Dipodomys panamintinus or ordii) | | | Baylor; 2X |
| Chinese pangolin (Manis pentadactyla) | | | Washington; 2X |
| Two-toed sloth (Choloepus hoffmanni) | | | Washington; 2X |
| Llama (Lama glama) | | | Washington; 2X |
| Flying lemur (Cynocephalus spp.) | | | Washington; 2X |