Submitted on April 24, 2006
Chemistry and Biochemistry, University of Colorado, Boulder, CO 80309
Corresponding Author: rob{at}spot.colorado.edu
DivergentSet addresses the important but so far neglected bioinformatics task of choosing a representative set of sequences from a larger collection. We found that using a phylogenetic tree to guide the construction of divergent sets of sequences can be up to two orders of magnitude faster than the naive method of using a full distance matrix. By providing a user-friendly interface (available at http://bmf.colorado.edu/divergentset) that integrates the tasks of finding additional sequences, building and refining the divergent set, producing random divergent sets from the same sequences, and exporting identifiers, this software facilitates a wide range of bioinformatics analyses including finding significant motifs and covariations. As an example application of DivergentSet, we demonstrate that the motifs identified by the motif-finding package MEME are highly unstable with respect to the specific choice of sequences. This instability suggests that the types of sensitivity analysis enabled by DivergentSet may be widely useful for identifying the motifs of biological significance.
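For readers unfamiliar with the baseline, the "naive method of using a full distance matrix" can be pictured roughly as follows. This is only an illustrative sketch and not the DivergentSet implementation: the abstract does not state the distance measure or the selection rule, so the p-distance on pre-aligned sequences, the greedy keep-if-far-enough rule, and the names `p_distance`, `naive_divergent_set`, and `min_dist` are all assumptions made here.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing positions between two aligned sequences.

    Assumption: sequences are already aligned to the same non-zero length.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def naive_divergent_set(seqs, min_dist):
    """Greedy divergent-set selection from a full pairwise distance matrix.

    seqs: dict mapping sequence id -> aligned sequence (hypothetical input).
    min_dist: hypothetical threshold; a sequence is kept only if it is at
    least this far from every sequence already kept.
    """
    ids = list(seqs)

    # Build the full matrix up front: O(n^2) distance computations.
    # This is the cost that a tree-guided approach aims to avoid.
    dist = {}
    for i, j in combinations(ids, 2):
        d = p_distance(seqs[i], seqs[j])
        dist[i, j] = dist[j, i] = d

    # Walk the sequences in input order and keep those far from all kept ones.
    chosen = []
    for cand in ids:
        if all(dist[cand, kept] >= min_dist for kept in chosen):
            chosen.append(cand)
    return chosen


example = {"a": "AAAA", "b": "AAAT", "c": "TTTT", "d": "TTTA"}
selected = naive_divergent_set(example, min_dist=0.5)
```

In this toy input, `b` is dropped for being too close to `a`, and `d` for being too close to `c`. The quadratic matrix construction dominates the running time for large collections, which is the step the abstract reports a phylogenetic tree can shortcut.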
Revised on May 26, 2006
Accepted on June 11, 2006
DivergentSet: picking non-redundant sequences from large sequence collections
This article has been cited by other articles:
C. A. Lozupone and R. Knight. Global patterns in bacterial diversity. PNAS, July 3, 2007; 104(27): 11436-11440.