Re: [galaxy-user] RNA seq analysis

6 May 2011

Hi,

You need to run FASTQ Groomer on your RNA-seq reads first, so that they are
in the fastqsanger format TopHat expects.  Your reference is fine as a FASTA
file.
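
To illustrate what the grooming step changes, here is a minimal sketch of
re-encoding quality scores to the Sanger (Phred+33) convention. It assumes the
input reads use the older Illumina Phred+64 encoding; the function names are
made up for this example, and the real Galaxy tool also validates records and
handles other variants, so treat this as an illustration only.

```python
from typing import Tuple


def illumina64_to_sanger(quality: str) -> str:
    """Re-encode a Phred+64 quality string as Phred+33 (Sanger)."""
    converted = []
    for ch in quality:
        phred = ord(ch) - 64  # assumption: input is Phred+64 encoded
        if phred < 0:
            raise ValueError("character %r is below the Phred+64 range" % ch)
        converted.append(chr(phred + 33))
    return "".join(converted)


def groom_record(record: Tuple[str, str, str, str]) -> Tuple[str, str, str, str]:
    """Convert one four-line FASTQ record (header, sequence, '+', quality)."""
    header, seq, plus, qual = record
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    return header, seq, plus, illumina64_to_sanger(qual)
```

For example, groom_record(("@read1", "ACGT", "+", "hhhh")) keeps the header
and sequence unchanged and rewrites only the quality line.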

Austin

On Fri, May 6, 2011 at 10:26 AM, <puvan001@umn.edu> wrote:
...
Hi David,
Thanks! When I tried to run TopHat, it did not recognise my FASTA file and
reported "History does not include a dataset of the required format / build".
Do you have any thoughts about this?
The point about "multihits" makes more sense now. Thanks for sharing your
workflow.
With regards
Sumathy
On May 6 2011, David Matthews wrote:
Hi,
...
I have done exactly the same kind of thing for adenovirus, so I can help
with it. In answer to question 1, you do not need to index the reference;
that will be done for you when TopHat is called. Secondly, you should leave
the 40 multihits as it is and filter out the multihits after the analysis.
This will allow you to determine whether you have a multihit problem and, if
so, how big it is and where it lies on the genome. I have a workflow on
Galaxy which you can use, called "Bristol workflow to get sorted unique
proper pair mapped reads". If you plug in your SAM file, it should give you
files listing only the unique hits and those which map more than once. This
workflow assumes you have paired-end data, but it can be modified to work
with single-end reads as well.
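
The idea of filtering multihits after mapping can be sketched in a few lines
of Python. This is not the Bristol workflow itself, only an illustration: it
assumes each SAM alignment line carries the optional NH:i tag (number of
reported alignments for the read), which TopHat output typically includes,
and it ignores the proper-pair filtering. The function names are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple


def hit_count(sam_line: str) -> Optional[int]:
    """Return the NH:i value of a SAM alignment line, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start at the 12th column of a SAM record.
    for tag in fields[11:]:
        if tag.startswith("NH:i:"):
            return int(tag[len("NH:i:"):])
    return None


def split_by_hits(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split alignment lines into uniquely mapped and multiply mapped."""
    unique: List[str] = []
    multi: List[str] = []
    for line in lines:
        if line.startswith("@"):  # SAM header line, not an alignment
            continue
        count = hit_count(line)
        if count == 1:
            unique.append(line)
        elif count is not None and count > 1:
            multi.append(line)
        # Lines without an NH tag are skipped (assumption for this sketch).
    return unique, multi
```

Comparing the sizes of the two lists, and the positions in the multiply
mapped one, shows whether multihits are a problem and where they fall.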
...
Hope this helps.
Best Wishes,
David.
__________________________________
Dr David A. Matthews
Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.
Tel. +44 117 3312058
Fax. +44 117 3312091
D.A.Matthews@bristol.ac.uk
On 6 May 2011, at 17:09, puvan001@umn.edu wrote:
Hi
...
I have a couple of questions regarding RNA-seq analysis:
1. I need to use a viral genome (very small, ~2 kb) as a reference genome,
and it is not available in Galaxy. I guess I can use the data from my
history. I have a FASTA file, but I am not sure whether I have to do some
kind of indexing.
...
...
2. In TopHat, the default for "maximum number of alignments to be allowed"
is 40. My understanding is that a single read can be aligned to at most 40
different places. I am wondering why this is 40. Is there any specific
reason? If I need unique mapping, I have to use 1 instead of 40. Am I
correct?
...
...
Thanks
SP
___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists,
please use the interface at:
http://lists.bx.psu.edu/
--
Sumathy Puvanendiran
Graduate student

Re: [galaxy-user] RNA seq analysis

Austin Paul