Prophage Finder
Prophage Finder is a web application that lets researchers quickly predict potential prophage loci in prokaryotic genome sequences. It does not predict whether an identified prophage is functional, and the identified region will most likely not span the entire prophage. More information on Prophage Finder and its output files can be found here.
Prophage Finder first uses BLASTX to compare input DNA sequences against a database of predicted proteins from all phage genomes sequenced as of April 2005, available at http://www.ncbi.nlm.nih.gov/genomes/static/phg.html (1). A Perl program then processes the BLAST results to identify potential prophages: it clusters BLAST hits according to the number of base pairs between them and reports the clusters larger than a specified size.
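The clustering step can be illustrated with a minimal sketch. This is not the Perl program itself; it is written in Python and assumes that "size" means the number of hits in a cluster and that the gap is measured from the end of the current cluster to the start of the next hit. The names Hit, cluster_hits, max_gap and min_hits are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Hit:
    """One BLASTX hit on the input DNA sequence (1-based, inclusive coordinates)."""
    start: int
    end: int


def cluster_hits(hits: List[Hit], max_gap: int, min_hits: int) -> List[Tuple[int, int, int]]:
    """Group hits separated by at most max_gap bp and keep groups with at least min_hits hits.

    Returns a list of (region_start, region_end, hit_count) tuples.
    Assumption: the gap is measured from the furthest end seen so far in the
    current cluster to the start of the next hit.
    """
    regions: List[Tuple[int, int, int]] = []
    current: List[Hit] = []
    current_end = 0
    for hit in sorted(hits, key=lambda h: (h.start, h.end)):
        # Close the current cluster if the next hit is too far away.
        if current and hit.start - current_end > max_gap:
            if len(current) >= min_hits:
                regions.append((current[0].start, current_end, len(current)))
            current = []
        if not current:
            current_end = hit.end
        current.append(hit)
        current_end = max(current_end, hit.end)
    # Report the final cluster, if any.
    if current and len(current) >= min_hits:
        regions.append((current[0].start, current_end, len(current)))
    return regions


example_hits = [Hit(100, 400), Hit(900, 1500), Hit(2000, 2600), Hit(50000, 50300)]
print(cluster_hits(example_hits, max_gap=1000, min_hits=3))  # [(100, 2600, 3)]
```

In this toy example the first three hits lie within 1000 bp of each other and form one region, while the isolated fourth hit is dropped because a single hit does not reach the minimum of three.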
Prophage Finder has been used to analyze all of the DOE draft microbial genome sequences. The genome sequences are available at http://genome.ornl.gov/microbial/. The Prophage Finder results for these genome sequences are available here.
Options:
The default options were chosen after testing on a variety of prokaryotic genome sequences. The options can be changed to adjust the specificity and sensitivity of the predictions.
E-value: This is the expectation value that BLAST uses as a cutoff; hits with a higher E-value are discarded. Decreasing this value increases the specificity of the program, which can help eliminate false positives but is likely to increase the number of false negatives.
Hits per Prophage: This option defines the minimum number of hits in a region required for a prophage to be identified. Increasing this number can help eliminate false positives or allow the user to search primarily for larger prophage sequences.
Hit Spacing: This option sets the maximum number of base pairs allowed between two BLAST hits for them to be grouped into the same region. Decreasing this value can help eliminate false positives.
tRNA Scan: When the tRNA Scan box is checked, the program tRNAscan-SE, available at http://www.genetics.wustl.edu/eddy/tRNAscan-SE/, is run to predict the locations of tRNAs in the input DNA sequences (3). This option is useful because prophages have often been observed to insert into their host genomes next to tRNAs (2). The identification of potential tRNAs can therefore provide further support for prophages predicted by Prophage Finder. Note that an absence of tRNAscan-SE output could be due to an improperly formatted input DNA sequence.
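As an illustration of how tRNA predictions might be used to support a predicted prophage, the sketch below lists the tRNAs that overlap a predicted region or lie within a chosen distance of either end. It is a hypothetical helper, not part of Prophage Finder: the function name trnas_near_region, the flank parameter and the example coordinates are all assumptions.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # (start, end) in base pairs


def trnas_near_region(region: Region, trnas: List[Region], flank: int) -> List[Region]:
    """Return the tRNAs that overlap the region or lie within `flank` bp of either end."""
    start, end = region
    window_lo, window_hi = start - flank, end + flank
    nearby = []
    for t_start, t_end in trnas:
        # Reverse-strand features may be listed end-first, so normalise the order.
        t_lo, t_hi = min(t_start, t_end), max(t_start, t_end)
        if t_hi >= window_lo and t_lo <= window_hi:
            nearby.append((t_start, t_end))
    return nearby


predicted_region = (10000, 45000)  # made-up predicted prophage region
predicted_trnas = [(9800, 9875), (60000, 60076), (47100, 47025)]  # made-up tRNA coordinates
print(trnas_near_region(predicted_region, predicted_trnas, flank=2500))
# [(9800, 9875), (47100, 47025)]
```

The choice of flank is arbitrary here. Because the predicted region will most likely not cover the entire prophage, a generous flank may be more appropriate than a strict adjacency test.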
Predicted prophage codon usage information can be compared to genome codon usage information obtained here.
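One simple way to make such a comparison is to compute relative codon frequencies for the coding sequences of the predicted prophage and of the genome, then sum the absolute differences. The sketch below is only an assumed approach, not the method used by Prophage Finder; the function names and the toy sequences are hypothetical, and the input is assumed to be an in-frame coding sequence.

```python
from collections import Counter
from typing import Dict


def codon_frequencies(coding_seq: str) -> Dict[str, float]:
    """Relative frequency of each codon in an in-frame coding sequence."""
    seq = coding_seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Skip codons containing ambiguous bases such as N.
    codons = [c for c in codons if set(c) <= set("ACGT")]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


def usage_difference(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Sum of absolute frequency differences (0 = identical usage, 2 = no shared codons)."""
    return sum(abs(a.get(c, 0.0) - b.get(c, 0.0)) for c in set(a) | set(b))


prophage_usage = codon_frequencies("ATGAAAAAATAA")  # toy sequence
genome_usage = codon_frequencies("ATGAAGAAGTAA")    # toy sequence
print(usage_difference(prophage_usage, genome_usage))  # 1.0
```

A larger difference suggests that the predicted region uses codons differently from the rest of the genome. The score has no built-in threshold, so it is best read relative to other regions of the same genome.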
References:
Bose, M. and Barber, R. (2006). Prophage Finder: a prophage loci prediction tool for prokaryotic genome sequences. In Silico Biol. 6, 0020. </isb/2006/06/0020/>