I just came across this message, and I wonder whether Galaxy users who make this request are aware of MEME/MAST and other related motif-searching tools. They may not be exactly what is being requested of Galaxy, but they could be useful.
Dear Stuart:
Unfortunately, there are currently no tools in Galaxy for actually
producing multiple alignments, yet this is being requested more and more.
I have created a ticket for this issue
(http://bitbucket.org/galaxy/galaxy-central/issue/218/multiple-alignemnts),
so you can follow its status.
Thanks,
anton
On Nov 2, 2009, at 1:41 PM, Brown, Stuart wrote:
>
> I am trying to come up with a nice workflow/tutorial for the use of
> Galaxy to search for Transcription Factor binding sites on a genome
> wide scale using pattern search tools. I want to train my students
> to think genomically and to use clever tools to leverage their
> abilities.
>
> Galaxy is absolutely awesome for grabbing the upstream promoter
> regions for all genes from any organism with a whole genome in UCSC.
> It is also possible to use the integrated EMBOSS tools such as
> fuzznuc and dreg to search for a known TFBS (or any other simple
> nucleotide pattern). However, I can't get past the simple search
> into a more clever information-based search. In particular I have the
> following workflow in mind:
>
>
> 1. Collect upstream regions for all mouse (or human) genes
> 2. Search for a published TF binding site with a single base
> mismatch using FUZZNUC
> 3. Make a multiple alignment of the sequences returned by FUZZNUC
> (not possible in any way that I have been able to find)
> 4. Make a logo from the alignment to identify informative positions
> and conserved substitutions (not in Galaxy)
> 5. Make a PSSM profile, HMM profile, or other smart searching tool
> from the aligned sequences (not in Galaxy)
> 6. Search the upstream regions again with this more sensitive
> pattern search method. (not in Galaxy).
> 7. Make a list of genes targeted with this TFBS,
> 8. Compare list of genes to microarray data showing co-regulation
> of this gene set, or to pathways
>
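For what it is worth, steps 3 to 6 can be approximated outside Galaxy when the search pattern has a fixed length: the hits are then all the same width, so they already line up column by column without a separate alignment tool. The sketch below is only an illustration under that assumption. The function names (build_pssm, scan), the pseudocount, the uniform 0.25 background, the threshold, and the example sites are all hypothetical choices, not anything provided by Galaxy or EMBOSS.

```python
import math

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds scoring matrix from equal-length, ungapped sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pssm = []
    for pos in range(length):
        column = [s[pos].upper() for s in sites]
        total = len(column) + 4 * pseudocount
        scores = {}
        for base in "ACGT":
            # Pseudocounts keep unseen bases from producing log(0)
            freq = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(freq / background)
        pssm.append(scores)
    return pssm

def scan(sequence, pssm, threshold):
    """Yield (start, window, score) for each window scoring >= threshold."""
    width = len(pssm)
    seq = sequence.upper()
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(b not in "ACGT" for b in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pssm[i][b] for i, b in enumerate(window))
        if score >= threshold:
            yield start, window, score

# Hypothetical example sites, not real binding-site data
sites = ["TGACGTCA", "TGACGTAA", "TGACGCCA", "TTACGTCA"]
pssm = build_pssm(sites)
pssm_hits = list(scan("AAATGACGTCAGGG", pssm, threshold=8.0))
```

This only scans the forward strand and assumes a uniform base composition; a real analysis would also scan the reverse complement and pick the threshold from a background score distribution. The per-column frequencies computed in build_pssm are the same numbers a sequence logo is drawn from.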
> I am frustrated at step 3. Even if I bring the FUZZNUC results to my
> desktop, there is no easy way to extract just sequences and make a
> multiple alignment. Many of the 'allowed' Fuzznuc optional output
> formats produce an error or no usable output.
>
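One possible workaround for the extraction problem is to redo the mismatch search in a short script that writes the matched sequences directly as FASTA, so nothing has to be parsed out of a report. This is a stand-in for FUZZNUC, not a reimplementation of it: it assumes a plain A/C/G/T pattern (no IUPAC ambiguity codes), searches the forward strand only, and the function names, sequence name, and example sequence are made up for illustration.

```python
def find_with_mismatches(sequence, pattern, max_mismatches=1):
    """Return (start, matched_text, mismatches) for each window that
    differs from the pattern at no more than max_mismatches positions."""
    seq, pat = sequence.upper(), pattern.upper()
    hits = []
    for start in range(len(seq) - len(pat) + 1):
        window = seq[start:start + len(pat)]
        mismatches = sum(1 for a, b in zip(window, pat) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits

def hits_to_fasta(name, hits):
    """Format hits as FASTA records; header positions are 1-based."""
    lines = []
    for start, window, mismatches in hits:
        lines.append(f">{name}_{start + 1}_mm{mismatches}")
        lines.append(window)
    return "\n".join(lines)

# Hypothetical upstream sequence and pattern, for illustration only
upstream = "CCTGACGTCATTTGACGTAAGG"
site_hits = find_with_mismatches(upstream, "TGACGTCA", max_mismatches=1)
fasta = hits_to_fasta("geneX_upstream", site_hits)
print(fasta)
```

Because every record has the same width as the pattern, the resulting FASTA file can be treated as an ungapped alignment by logo or profile-building tools.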
> Thanks for any suggestions.
>
> Stuart M. Brown, Ph.D.
> Associate Professor
> Center for Health Informatics and Bioinformatics
> NYU School of Medicine
> 550 First Ave, NY, NY 10016
> stuart.brown@med.nyu.edu
> (212)263-7689 FAX (212) 263-8139
>
> _______________________________________________
> galaxy-user mailing list
> galaxy-user@bx.psu.edu
> http://mail.bx.psu.edu/cgi-bin/mailman/listinfo/galaxy-user
Anton Nekrutenko
http://nekrut.bx.psu.edu
http://galaxyproject.org