Table I. In silico translation of assembled transcripts allows MS analysis of unsequenced genomes. Assembled transcript reads from 454 pyrosequencing and Illumina sequencing were annotated using BlastX. The homology search used an E-value threshold of 1.0 × 10⁻³ and >20% similarity. This produced 9,445 BLAST-identified proteins (60% of the 15,764 assembled contigs), of which 3,442 (36%) have Gene Ontology annotation mappings. Searches against the RNA-Seq provisional “Anopheles protein database” using the Anopheles albimanus and Anopheles gambiae MS data yielded 252 Anopheles albimanus proteins and 192 Anopheles gambiae proteins at a 5% peptide false discovery rate, with a minimum of two peptides per protein.
| Transcript reads | Oases/Velvet assembled contigs | BlastX predicted proteins | Proteins with ontology mapping |
| --- | --- | --- | --- |
| ∼210 million Illumina and ∼430,000 454 sequence reads | 15,764 | 9,445 | 3,442 |

Provisional protein database search identifications:

| Species | Proteins identified | Peptides | Spectra |
| --- | --- | --- | --- |
| A. albimanus | *252 | 1,624 | 22,484 |
| A. gambiae | *192 | 1,183 | 15,601 |
  • * Indicates number of protein groups (see “Experimental Procedures”).
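
The percentages quoted in the caption can be checked against the counts in the table. The following is a minimal sketch in Python, assuming that the 60% figure is taken relative to the number of assembled contigs and the 36% figure relative to the number of BlastX predicted proteins; the variable names and the `percent` helper are illustrative and not part of the original analysis.

```python
# Counts copied from Table I.
contigs = 15_764         # Oases/Velvet assembled contigs
blast_proteins = 9_445   # BlastX predicted proteins
go_mapped = 3_442        # proteins with ontology mapping


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Assumption: 60% is relative to contigs, 36% relative to BlastX proteins.
blast_pct = percent(blast_proteins, contigs)
go_pct = percent(go_mapped, blast_proteins)

print(f"BlastX-identified proteins among contigs: {blast_pct}%")
print(f"GO-mapped proteins among BlastX hits: {go_pct}%")
```

Under these assumed denominators, the rounded values agree with the 60% and 36% stated in the caption.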