FAQ

Answers

  • Sequencing
    1. How big is the Magnaporthe grisea genome?

      Our current total unique contig length is 39,429,053 bp (base pairs). The estimated genome size of M. grisea is x.

    2. What strain was sequenced?

      Magnaporthe grisea strain 70-15.

    3. What is the current state of the assembly?

      The current assembly contains 739 sequence contigs in 197 supercontigs (scaffolds). There are no current plans for additional sequencing or finishing. See Assembly for details.

    4. What is finishing?

      Finishing entails laboratory and analytic steps that make use of existing and new sequence data to close gaps between contigs and between scaffolds, to resolve sequence ambiguities and to bring all sequence to a uniform high quality standard.

    5. Will finishing close all gaps in the genome sequence?

      Not necessarily. Eukaryotic genomes contain regions of sequence that cannot be resolved by current technology. Finished genomes often retain gaps in regions that cannot be cloned or that contain structures that inhibit sequencing reactions. In addition, regions that contain highly repetitive, high-identity sequence elements, such as centromeres, telomeres, or arrays of rDNA repeats, remain gaps in finished sequence.

    6. How has the sequence been generated for the Magnaporthe grisea project?

      Our data consist of over 0 individual sequencing reads obtained by sequencing each end of plasmids and Fosmids from libraries containing randomly sheared fragments of 4 kb, 10 kb, and 40 kb average size respectively. See Assembly for details.

    7. How will we know the assembly is correct?

      The quality of the assembly will be assessed in several ways. In addition to requiring that the paired plasmid and Fosmid ends occur in a logical manner, our assembly of the Magnaporthe grisea genome will be verified through comparison with available genomic sequences.
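
      As an illustration of what it can mean for paired ends to "occur in a logical manner", here is a minimal Python sketch. The criteria it checks (same contig, opposite strands, reads pointing toward each other, separation close to the library's average insert size) are assumptions about a typical paired-end check, not a description of the project's actual validation. The ReadPlacement type, the pair_is_consistent function, and the 25% tolerance are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPlacement:
    """Where one end read landed in the assembly (hypothetical structure)."""
    contig: str  # contig or supercontig name
    start: int   # leftmost aligned coordinate, 0-based
    end: int     # rightmost aligned coordinate, exclusive
    strand: str  # "+" or "-"


def pair_is_consistent(a, b, insert_size, tolerance=0.25):
    """Return True if two end reads from one clone are placed plausibly.

    insert_size is the library's average fragment size in bp (for example
    4000, 10000 or 40000); tolerance is an assumed allowed fractional
    deviation from that size.
    """
    if a.contig != b.contig:
        return False
    if {a.strand, b.strand} != {"+", "-"}:
        return False  # the two ends must lie on opposite strands
    fwd, rev = (a, b) if a.strand == "+" else (b, a)
    if fwd.start > rev.start:
        return False  # the reads would point away from each other
    span = rev.end - fwd.start
    return abs(span - insert_size) <= tolerance * insert_size


left = ReadPlacement("contig_1", 1000, 1600, "+")
right = ReadPlacement("contig_1", 4500, 5100, "-")
print(pair_is_consistent(left, right, insert_size=4000))
```

      A pair that fails such a check may point to a misassembly, or simply to a clone whose insert is far from the library average, so a real pipeline would weigh many pairs rather than reject on one.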

    8. What data are available?

      In this version of our data release, all sequence contigs and supercontigs are available. Sequence data can be accessed in several ways: by searching with BLASTN or TBLASTN, by retrieving a specific region of the assembly, or by downloading the entire genome. These functions apply to this particular release of Magnaporthe. Other M. grisea sites at the Broad offer different functionality, based on the data available for a particular release.
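
      If you download the entire genome, summary figures such as the number of contigs and their total length can be recomputed locally. The Python sketch below assumes the download is a FASTA file (the format is not stated here); the contig_lengths function and the two sample records are hypothetical.

```python
import io


def contig_lengths(lines):
    """Map each FASTA record name to its sequence length in bp."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record name is the first whitespace-delimited token.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


# Tiny made-up example standing in for a downloaded contig file.
sample = io.StringIO(">contig_1 example\nACGTACGT\nACGT\n>contig_2\nGGCC\n")
lengths = contig_lengths(sample)
print(len(lengths), "contigs,", sum(lengths.values()), "bp total")
```

      For a real file, pass an open file handle instead of the in-memory sample, for example `with open(path) as handle: contig_lengths(handle)`.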

    9. This sequence release looks different from previous releases, like Neurospora crassa. What's different?

      Important information about this release can be found here.

  • Annotated Genes
    1. Is gene XXX annotated in the sequence?

      Maybe. We have run automated tools for finding putative genes, relying on ab initio gene finders and sequence similarity to known proteins.

      You can search for a gene by name, or by a blastx hit to a known gene. However, the gene names are extremely preliminary, and you will find that most genes are named either 'predicted protein' (meaning no or weak homology to known genes), 'hypothetical protein' (indicating weak homology to known genes), or 'hypothetical protein (name)' (indicating strong homology).

      We are not yet in a position to curate manual annotations.
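
      When working with a list of gene names, the naming convention above can be turned into a simple lookup. This is a minimal Python sketch; the homology_category function and its return labels are hypothetical, and the example gene name is made up for illustration.

```python
import re


def homology_category(gene_name):
    """Classify a preliminary gene name by the homology it implies.

    Returns "none or weak", "weak" or "strong", or None when the name
    does not follow one of the three preliminary naming patterns.
    """
    name = gene_name.strip()
    if name == "predicted protein":
        return "none or weak"
    if name == "hypothetical protein":
        return "weak"
    # 'hypothetical protein (name)': the parenthesised part names the
    # known gene the prediction resembles.
    if re.fullmatch(r"hypothetical protein \(.+\)", name):
        return "strong"
    return None


print(homology_category("hypothetical protein (example kinase)"))
```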

    2. Gene XXX is annotated incorrectly in your sequence - can I submit an update to your gene name?

      Unfortunately, we are not yet in a position to curate manual annotations. We are still discussing future annotation plans.

    3. How were the genes annotated?

      See Automated Gene Calling.

    4. There seem to be two different ways of seeing my gene in the FeatureMap - what's going on here?

      There are two different ways of looking at a gene in the FeatureMap or GenomeBrowser.

      1. Gene within a supercontig (e.g. title "supercontig 1.1")

        If you bring up the FeatureMap or GenomeBrowser on a region of a supercontig, then you are seeing the result of DNA-based analyses. You'll know you are in this mode if the title of the FeatureMap gives a contig number.

        This graphical view shows the results of analyses performed on the nucleotide sequence. For example:

        1. De novo gene prediction programs: Fgenesh, Genscan
        2. Blastn searches against NT
        3. Blastn searches against ESTs
        4. Blastx searches of the translated nucleotide sequences against proteins in NR
        5. HMMER searches of the translated DNA against PFAM

        You can get to these FeatureMaps by using the Search Regions page.

      2. A single gene by itself (e.g. title "MGG#####")

        You can also bring up the FeatureMap or GenomeBrowser on a protein sequence corresponding to a particular gene. In this view you will see the results of protein-based analyses on the amino-acid sequence. For example:

        1. Blastp searches of the protein against proteins in NR
        2. HMMER searches of the protein against PFAM

        You can get the FeatureMap of a particular gene from the Feature Detail page corresponding to that locus.

      Note, however, that the HMMER searches performed at the DNA level can be misleading, since they do not take the exon structure of the gene into account.
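
      To see why ignoring exon structure matters, consider the toy Python sketch below. It translates a made-up two-exon gene once straight through the genomic DNA and once after splicing out the intron. The sequences are hypothetical, only a single forward reading frame is shown, and the code illustrates the general effect rather than how HMMER itself processes translated DNA.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 0, ignoring any trailing partial codon."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3)
    )


# Hypothetical gene: two exons separated by a 13 bp intron (GT...AG).
exon1 = "ATGGCTGAA"
intron = "GTAAGTCTAACAG"
exon2 = "TGGAAATAA"

genomic = exon1 + intron + exon2
spliced = exon1 + exon2

print(translate(spliced))  # MAEWK*
print(translate(genomic))  # MAEVSLTVEI
```

      The two translations agree only up to the end of the first exon. After that, the intron contributes spurious residues and shifts the reading frame, so a protein domain that spans the exon boundary would be split or missed in the DNA-level translation.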

      In addition to the HMMER searches of the DNA, we also perform HMMER searches against our predicted gene set. These HMMER protein searches are likely to be more accurate, so we present the protein-based PFAM results in the "Feature Detail" summary. We also use the protein-based PFAM results when searching for genes by PFAM domain, in the Advanced Search for Annotated Genes and in the gene index of Genes by PFAM.

      The Feature Search mechanism provides access to the results of the DNA analyses, so the HMMER Feature Search shows the results of the DNA-based HMMER searches.

      [Screenshot: DNA-based HMMER results]

      [Screenshot: Protein-based HMMER results]

  • Misc
    1. How do I cite the sequence for publication?

      Publications should include the following citation:

      Magnaporthe grisea Sequencing Project. Broad Institute of Harvard and MIT (http://www.broad.mit.edu)

    2. Who do I contact with questions about the sequencing?

      Our email address is annotation-webmaster@broad.mit.edu.