Hi Jen, I have RNA-seq data for 2 biological samples loaded into Galaxy. The samples are human skin biopsies taken during genital herpes reactivation. I will use TopHat to align the reads to the human genome, but I am not sure how to build the genital herpes (HSV-2) genome and align reads to it. Any suggestion is greatly appreciated!
Thanks,
tao
-----Original Message----- From: Jennifer Jackson [mailto:jen@bx.psu.edu] Sent: Friday, August 12, 2011 11:54 AM To: Peng, Tao; galaxy-bugs@bx.psu.edu Subject: Re: [galaxy-bugs] loading data
Hello,
Glad to hear that you were able to load your data.
When logged in, FTP loaded data will initially be in the FTP upload area under the "Get Data -> Upload" tool (in the center pane). From here, load data into the history that you wish to work with. If you are not sure where this is exactly, please note the graphics in the FTP wiki and screencast.
As a reminder, please send all new and followup questions with a to or cc to the mailing list, not to individual team members. This is important for our team to be able to track and answer questions. We would appreciate your helping out with this going forward.
Hopefully this helps.
Jen Galaxy team
On 8/12/11 11:39 AM, Peng, Tao wrote:
Hi jen, I just finished FTP of R4 data set with 8.4 GB. Why can't I see the data on the right panel of galaxy after logging in?
Tao
-----Original Message----- From: Jennifer Jackson [mailto:jen@bx.psu.edu] Sent: Tuesday, August 09, 2011 3:26 PM To: Peng, Tao Cc: galaxy-bugs@bx.psu.edu Subject: Re: [galaxy-bugs] loading data
Hello Tao,
It sounds like the load works with an uncompressed file but is failing when compressed? Perhaps there is a problem with the compression itself.
We can't really help with this part of the process except to note which compression types we accept. The help wiki link again is: http://wiki.g2.bx.psu.edu/Learn/Upload%20via%20FTP
The final goal would be to have all of the data compressed and then load using FTP in a batch. If this takes too long to run, then the next option is to run either a local or cloud install. Help can be found at:
http://getgalaxy.org http://galaxyproject.org/Admin/Cloud
Please send all questions related to local or cloud installations to the galaxy-dev@bx.psu.edu mailing list and not to individual team members or to the galaxy-bugs@bx.psu.edu mailing list.
Best wishes for your project,
Jen Galaxy team
On 8/9/11 2:25 PM, Peng, Tao wrote:
If I up-load one file (uncompressed) at one time, I will have 4x8=32 files to up-load. Each file takes about 1 hour to load. This is untenable. Any suggestion?
Thanks,
tao
-----Original Message----- From: Jennifer Jackson [mailto:jen@bx.psu.edu] Sent: Tuesday, August 09, 2011 2:04 PM To: Peng, Tao; galaxy-bugs@bx.psu.edu Subject: Re: [galaxy-bugs] loading data
On 8/9/11 1:55 PM, Peng, Tao wrote:
Hi Jen, if I have 8 fastq.gz files for each of my 4 samples, how should I load the data for analysis in galaxy?
Thanks,
tao
-----Original Message----- From: Jennifer Jackson [mailto:jen@bx.psu.edu] Sent: Tuesday, August 09, 2011 1:31 PM To: Peng, Tao Cc: galaxy-bugs@bx.psu.edu Subject: Re: [galaxy-bugs] loading data
Hello Tao,
For your other question sent to me directly, Galaxy will accept only one file per archive, so sending a multi-file .gz will result in only the first file in the archive loaded into the "Get Data -> Upload" FTP area.
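One way to follow the one-file-per-archive rule when preparing a batch is to compress each FASTQ file on its own before starting the FTP transfer. The following is only a minimal sketch: the function name `gzip_each` and the directory names in the usage note are made up for illustration and are not part of Galaxy.

```python
import gzip
import shutil
from pathlib import Path


def gzip_each(paths, out_dir):
    """Compress every input file into its own .gz archive (one file per archive)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in map(Path, paths):
        # a.fastq -> <out_dir>/a.fastq.gz; the original file is left untouched
        dst = out_dir / (src.name + ".gz")
        with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        written.append(dst)
    return written
```

For example, `gzip_each(sorted(Path("fastq").glob("*.fastq")), "fastq_gz")` would write one archive per file from a hypothetical `fastq` directory into `fastq_gz`, ready to be queued together in the FTP client.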
Given the current issues, please try a restart. Log out of Galaxy and FileZilla (and restart your computer if possible). Then begin again, testing with a single, uncompressed file. You do not have to be logged into your Galaxy account to load with FileZilla, but you will need to be logged in to access the files on the "Get Data -> Upload" form after upload is complete.
If this fails, perhaps try reinstalling FileZilla or even a different FTP client. This tutorial covers an alternative that can also be used from the desktop: http://galaxyproject.org/Learn/Upload via FTP
If you do have more questions, please leave galaxy-bugs@bx.psu.edu on the cc list so that our entire team can help contribute to replies. This may also be a question that you want to ask the larger user community at galaxy-user@bx.psu.edu, as this sounds like an external issue. Another user may have encountered this problem using a PC and have suggestions.
Thanks!
Jen Galaxy team
On 8/9/11 1:16 PM, Peng, Tao wrote:
Hi jen, I attached a screen shot of FileZilla. Do you know why the top panel says "disconnected from server" while the bottom panel indicates it is transferring the data?
Thanks,
tao
-----Original Message----- From: Jennifer Jackson [mailto:jen@bx.psu.edu] Sent: Monday, August 08, 2011 4:55 PM To: Peng, Tao Cc: galaxy-bugs@bx.psu.edu Subject: Re: [galaxy-bugs] loading data
Hello Tao,
The loading times do seem to be long for the size of the files, perhaps the internet connection is the problem. That said, FileZilla can restart an FTP if it is interrupted and a message would be reported. I didn't see any restarts reported in the log in the screenshot you sent.
The compression is another place to look for a problem. You might try to use an alternate compression or no compression and load that way. Try one file, through FileZilla, as a test to see if that performs better. Winzip can compress in a few different formats, .bz2 is also accepted by Galaxy.
Hopefully one of these will work. Please keep galaxy-bugs in the cc for any follow-up so that our team can help contribute to replies,
Thanks,
Jen
On 8/8/11 4:04 PM, Peng, Tao wrote:
Hi jen, when I FTP 9 of the fastq.gz files (each about 350 MB) for a sample, the FTP program (FileZilla) keeps copying the files again and again, which is strange to me. Do you know what went wrong here?
I attached a PowerPoint slide with a screenshot of FileZilla (R1_002.fasq.gz has been copied 3x so far??)
Thanks,
tao
-----Original Message----- From: Jennifer Jackson [mailto:jen@bx.psu.edu] Sent: Friday, July 29, 2011 1:20 PM To: Peng, Tao Cc: galaxy-bugs@bx.psu.edu Subject: Re: [galaxy-bugs] loading data
Hello Tao,
Our apologies, the public Galaxy instance at http://usegalaxy.org has been experiencing higher than usual usage the last few days. This should now be improved. Please try your FTP load again - it is the best way to load files and the sizes you mention are well within the accepted limits.
Thank you for your patience,
Best,
Jen Galaxy team
On 7/29/11 9:47 AM, Peng, Tao wrote:
Hi I work at FHCRC in Seattle. I am trying to load the FASTQ data from Illumina HiSeq. I have 4 samples with 2.4 GB of seq data per sample. For the overnight FTP, I can't even finish loading half of one sample. Is there any faster way to load the data? Can GALAXY do the analysis for this type of large data?
Thanks,
tao
Hello Tao,
When using a custom reference genome, the file must be in fasta format and the datatype must be set as fasta (use the pencil icon to modify "Edit Attributes" as needed). Any fasta file is allowed; however, those with simplified identifiers (no pipes "|", in particular) tend to have fewer problems with certain tools. If there are errors, the most likely cause is formatting.
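As an illustration of what "simplified identifiers" could look like in practice, here is a rough sketch that rewrites each fasta title line to a short name with no pipes or description text. The rule used (keep the last non-empty pipe-separated field of the first word) is just one reasonable choice, not a Galaxy requirement, and the function name is hypothetical.

```python
def simplify_fasta_headers(lines):
    """Yield fasta lines with each '>' title reduced to a short, pipe-free identifier."""
    for line in lines:
        if line.startswith(">"):
            # Keep only the first whitespace-delimited word of the title line.
            words = line[1:].split()
            first = words[0] if words else ""
            # For pipe-delimited ids, keep the last non-empty field;
            # ids without pipes pass through unchanged.
            fields = [f for f in first.split("|") if f]
            name = fields[-1] if fields else "sequence"
            yield ">" + name + "\n"
        else:
            # Sequence lines are passed through untouched.
            yield line
```

With a made-up title line such as `>gi|12345|ref|EXAMPLE_1.1| example virus, complete genome`, the rewritten line would be `>EXAMPLE_1.1`. The identifiers in the alignment output will then match the shortened names, so the same rewritten file should be used everywhere in the analysis.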
If there are problems and you cannot resolve the format issue, please send a bug report associated with a failed mapping run using the custom reference genome. Use the green bug icon for the failed dataset when reporting the error (instead of screenshots).
I will watch for your bug report. If you use a different email than this one for your galaxy account, please note that in the bug report comments area so that I can link the two threads.
Thanks,
Jen Galaxy team
On 8/15/11 10:38 AM, Peng, Tao wrote:
Hi Jen, I have RNA-seq data for 2 biological samples loaded into Galaxy. The samples are human skin biopsies taken during genital herpes reactivation. I will use TopHat to align the reads to the human genome, but I am not sure how to build the genital herpes (HSV-2) genome and align reads to it. Any suggestion is greatly appreciated!
Thanks,
tao
Hi jen, I have used Bowtie to align my RNA-seq reads to the HSV-2 genome; out of 35,000,000 lines, only 621 lines were left when I chose to keep mapped reads only. How can I visualize these aligned reads on the HSV-2 genome?
In the panel for the converted SAM-to-BAM dataset, I tried to use the data in Trackster, but I am not sure how to build an HSV genome as a reference.
I appreciate your help,
tao
Hello Tao,
For the Bowtie results, the aligned results may be low because the data is RNA and not DNA. TopHat is generally considered a better choice for RNA since it allows for bridges over splice sites (introns). The full documentation for each program is on each tool's form and/or you can contact the tool authors with scientific questions at tophat.cufflinks@gmail.com.
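To put a number on how many alignments were reported, the SAM output can be tallied directly. This is a small sketch, assuming standard SAM text where column 2 is the bitwise FLAG and bit 0x4 marks an unmapped read; the function name is made up for illustration. It counts alignment records, so a read reported at several locations is counted more than once.

```python
def count_sam_alignments(lines):
    """Return (mapped, unmapped) counts for the alignment records in a SAM stream."""
    mapped = unmapped = 0
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])  # column 2 is the bitwise FLAG
        if flag & 0x4:  # bit 0x4 set means the read is unmapped
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped
```

Called as `count_sam_alignments(open("aligned.sam"))` on a hypothetical file name, it returns the two counts, which makes it easy to compare runs of Bowtie and TopHat on the same reads.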
Also, a tutorial and FAQ are available here: http://usegalaxy.org/u/jeremy/p/galaxy-rna-seq-analysis-exercise http://usegalaxy.org/u/jeremy/p/transcriptome-analysis-faq
For visualization, an update that allows the use of a user-specified fasta reference genome is coming out very soon. For now, you can view annotation by creating a custom genome build, but the actual reference will not be included. Use "Visualization -> New Track Browser" and follow the instructions for "Is the build not listed here? Add a Custom Build".
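If the custom build form asks for sequence names and their lengths (an assumption here; the form itself describes what it needs), those can be computed from the same fasta file used for mapping. A minimal sketch, with a hypothetical function name:

```python
def fasta_lengths(lines):
    """Return a list of (name, length) pairs, one per sequence in a fasta stream."""
    lengths = []
    name, size = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                lengths.append((name, size))  # close the previous record
            words = line[1:].split()
            name = words[0] if words else ""  # first word of the title line
            size = 0
        elif name is not None:
            size += len(line)  # sequence lines add to the current record
    if name is not None:
        lengths.append((name, size))  # close the final record
    return lengths
```

Each pair can then be written out as a tab-separated line, for example with `"\t".join([name, str(length)])`. The names must match the identifiers that appear in the alignment output.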
Help for using the tool is available here: http://galaxyproject.org/Learn/Visualization
As stated before, please email the mailing list directly and not individual team members. Specifically, with a "to" to the mailing list (only) and not including team members as a "to" or "cc" unless asked to do so when sharing private data. Our internal tracking system and public archives rely on this method. Thank you for your future cooperation.
Best,
Jen Galaxy team
On 8/18/11 3:15 PM, Peng, Tao wrote:
Hi jen, I have used Bowtie to align my RNA-seq reads to the HSV-2 genome; out of 35,000,000 lines, only 621 lines were left when I chose to keep mapped reads only. How can I visualize these aligned reads on the HSV-2 genome?
In the panel for the converted SAM-to-BAM dataset, I tried to use the data in Trackster, but I am not sure how to build an HSV genome as a reference.
I appreciate your help,
tao
Hi jen, I followed the GALAXY web cast to check the quality of RNA-seq data: one sample seems to have scores above 20 at most bases (R2), but the other one is around 6-8 at most bases (R4) (see the attached PDF files).
Does this mean R4 RNA-seq data are BAD? What exactly does it mean anyway?
Thanks for your help,
tao
-----Original Message----- From: Jennifer Jackson [mailto:jen@bx.psu.edu] Sent: Thu 8/18/2011 3:46 PM To: galaxy-user@bx.psu.edu Cc: Peng, Tao Subject: visualization of alignment
-- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/Support
===> Please use "Reply All" when responding to this email!<===
Hello Tao,
The tool "NGS: QC and manipulation -> FastQC" (last tool in group) may be helpful for your project.
In general, sequence with quality scores this low would be considered unusable. Perhaps double check the options used with the Fastq Groomer tool? Or check/filter the data before grooming?
This may not be the case for your data, but just in case, please note that CASAVA 1.8+ now produces both filtered and unfiltered results and would need to be used with the "Sanger" option with the "Fastq Groomer" tool.
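To see why the Groomer option matters, it may help to decode the same quality string under the two common encodings: Sanger (ASCII offset 33, which CASAVA 1.8+ also uses) and Illumina 1.3-1.7 (ASCII offset 64). The quality string below is made up for illustration, and this is only a sketch of one possible explanation, not a diagnosis of the R4 data.

```python
def phred_scores(quality, offset):
    """Decode a FASTQ quality string into integer Phred scores for a given ASCII offset."""
    return [ord(ch) - offset for ch in quality]


SANGER_OFFSET = 33      # Sanger / CASAVA 1.8+ encoding
ILLUMINA13_OFFSET = 64  # Illumina 1.3-1.7 encoding

quality = "FGHGFHGG"  # made-up quality string for illustration
as_sanger = phred_scores(quality, SANGER_OFFSET)
as_illumina = phred_scores(quality, ILLUMINA13_OFFSET)

print(as_sanger)    # [37, 38, 39, 38, 37, 39, 38, 38]
print(as_illumina)  # [6, 7, 8, 7, 6, 8, 7, 7]
```

The two readings differ by a constant 31, so good Sanger-encoded data can look like scores of 6-8 if it is interpreted with the older Illumina offset.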
This prior Q&A explains the filtering: http://gmod.827538.n3.nabble.com/Filtering-Illumina-CASAVA-1-8-FASTQ-files-t...
Hopefully this helps. Please send future questions directly to the mailing list as the "to" recipient. There is no need to include any of the Galaxy team as a "to" or "cc" recipient. This helps us to track and address questions quickly and as a team.
Best,
Jen Galaxy team
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Thanks, jen. I have asked informatics scientists at the Hutch to do the QC for me, and both R2 and R4 are OK according to the FastQC analysis.
My question is: Do I still need to use the groomer in GALAXY and use the groomed data for further analysis such as TOPHAT? Should I skip the steps to compute quality statistics and draw boxplots using the groomed data?
Thanks,
tao
-----Original Message----- From: Jennifer Jackson [mailto:jen@bx.psu.edu] Sent: Fri 8/26/2011 7:19 PM To: galaxy-user Cc: Peng, Tao Subject: Re: [galaxy-user] quality score
Hi how can I specify a GTF gene annotation file when running tophat to guide the alignment to human genome? What is the best way to visualize the tophat results in the context of annotated human genome, i.e. RefSeq?
Thanks,
tao
Hello Tao,
Sorry for the delayed reply, your question did not post to the mailing list since the "to" was not _only_ to galaxy-user.
Going forward, please leave off any "to" or "cc" to team members when asking a question. Send all questions directly "to" "galaxy-user@bx.psu.edu" and do not include any "Re" or "Fwd" text in the subject line.
Regarding RNA-seq analysis and reference GTF files, the place to incorporate the GTF file is in the Cufflinks step; the option to select the GTF file from your history is on the tool's form. If you have questions about the tools that are not addressed by these help links:
http://usegalaxy.org/u/jeremy/p/transcriptome-analysis-faq http://usegalaxy.org/u/jeremy/p/galaxy-rna-seq-analysis-exercise
then contacting the tool authors would be the next step: email tophat.cufflinks@gmail.com
To visualize the data, the available options will be links associated with each dataset (expand the dataset box to locate these). The Galaxy Track Browser (GTB) aka "Trackster", UCSC Genome Browser, Ensembl, and GeneTrack are potential options; the datatype will determine which links are provided.
Hopefully this helps,
Best,
Jen Galaxy team
-------- Original Message -------- Subject: run tophat in galaxy Date: Sun, 28 Aug 2011 08:50:04 -0700 From: Peng, Tao tpeng@fhcrc.org To: Jennifer Jackson jen@bx.psu.edu, galaxy-user galaxy-user@lists.bx.psu.edu
Hi how can I specify a GTF gene annotation file when running tophat to guide the alignment to human genome? What is the best way to visualize the tophat results in the context of annotated human genome, i.e. RefSeq?
Thanks,
tao
Hi Tao,
I made an error in my prior reply; it is possible to guide assembly in TopHat. To do this, on the TopHat form, change "TopHat settings to use:" from "Use Defaults" to "Full parameter list". In the expanded form:
1 - change "Use Own Junctions:" to be "yes"
2 - change "Use Gene Annotation Model:" to be "yes"
3 - in the new pull-down menu, select the GTF file from your history
Great question! Glad that we were able to provide you with the correct instruction,
Best,
Jen Galaxy team
On 9/15/11 1:38 PM, Jennifer Jackson wrote:
===> Please use "Reply All" when responding to this email! <===
Hello Tao,
Sorry for the delayed reply; your question did not post to the mailing list since the "to" was not _only_ to galaxy-user.
Going forward, please leave off any "to" or "cc" to team members when asking a question. Send all questions directly "to" "galaxy-user@bx.psu.edu" and do not include any "Re" or "Fwd" text in the subject line.
Regarding RNA-seq analysis and reference GTF files, the place to incorporate the GTF file is in the Cufflinks step; the option to select the GTF file from your history is on the tool's form. If you have questions about the tools that are not addressed by these help links:
http://usegalaxy.org/u/jeremy/p/transcriptome-analysis-faq http://usegalaxy.org/u/jeremy/p/galaxy-rna-seq-analysis-exercise
then contacting the tool authors would be the next step: email tophat.cufflinks@gmail.com
To visualize the data, the available options will be links associated with each dataset (expand the dataset box to locate these). The Galaxy Track Browser (GTB) aka "Trackster", UCSC Genome Browser, Ensembl, and GeneTrack are potential options; the datatype will determine which links are provided.
Hopefully this helps,
Best,
Jen Galaxy team
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Hi, when I finished the Cuffdiff analysis in Galaxy, I got many tabular outputs for gene, transcript, promoter, and CDS differential expression testing. Are there any systematic ways to look at the output data from a Cuffdiff analysis?
Thanks,
tao
I wrote a Python script to do just this and am writing the XML interface for it at this very moment, but it uses matplotlib to draw the images. I would be happy to share it with you, but it probably would not work on the public site unless they have matplotlib installed or would be willing to install it.
The result is attached; the "error" bars are the 95% confidence intervals. The file type can be PDF or PNG (or pretty much anything with a small code change).
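This is not Gus's script, which was not posted to the list. It is only a minimal sketch of one step such a plot needs: turning an expression value and its 95% confidence bounds into the lower and upper bar lengths that plotting libraries usually expect. The gene names and numbers are made up for illustration.

```python
def error_bar_lengths(value, conf_lo, conf_hi):
    """Return (lower, upper) bar lengths measured from the plotted value."""
    if not conf_lo <= value <= conf_hi:
        raise ValueError("value must lie inside its confidence interval")
    return value - conf_lo, conf_hi - value


# Made-up rows: (name, expression value, lower 95% bound, upper 95% bound)
rows = [("geneA", 12.0, 9.5, 15.0), ("geneB", 3.0, 1.0, 6.5)]
bars = {name: error_bar_lengths(v, lo, hi) for name, v, lo, hi in rows}
print(bars)
```

With matplotlib available, the two lengths for each gene could be passed as asymmetric `yerr` values to a bar or errorbar plot.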
Gus
On Wed, Sep 21, 2011 at 1:37 PM, Peng, Tao tpeng@fhcrc.org wrote:
Hi, when I finished the Cuffdiff analysis in Galaxy, I got many tabular outputs for gene, transcript, promoter, and CDS differential expression testing. Are there any systematic ways to look at the output data from a Cuffdiff analysis?
Thanks,
tao
Jen, thank you for following up on my question. Is there any tool in Galaxy to visualize the coverage of aligned reads from TopHat on human chromosomes (as a histogram or density plot)?
Tao
-----Original Message----- From: Jennifer Jackson [mailto:jen@bx.psu.edu] Sent: Thursday, September 15, 2011 2:37 PM To: galaxy-user Cc: Peng, Tao Subject: Re: [galaxy-user] run tophat in galaxy
Hi Tao,
I made an error in my prior reply: it is possible to guide the alignment in TopHat with a GTF file. To do this, on the TopHat form, change "TopHat settings to use:" from "Use Defaults" to "Full parameter list". In the expanded form:
1 - change "Use Own Junctions:" to be "yes"
2 - change "Use Gene Annotation Model:" to be "yes"
3 - in the new pull-down menu, select the GTF file from your history
Great question! Glad that we were able to provide you with the correct instruction,
Best,
Jen Galaxy team
Hi Tao,
Yes, the resulting SAM dataset can be converted to BAM and viewed in the GTB (Galaxy Track Browser).
http://galaxyproject.org/wiki/Learn -> scroll to "Visualization" to find: http://galaxyproject.org/wiki/Learn/Visualization
The GTB can be reached through several different links, but one quick way to do this is:
1 - start with TopHat's output SAM dataset
2 - use the tool "NGS: SAM Tools -> SAM-to-BAM"
3 - hover over "Visualization" in the top menu bar, then click on "New Track Browser"
4 - at the prompt, name the visualization, specify the reference genome, and click on "Continue"
5 - once the browser opens, click on the "Add Datasets to Visualization" prompt to add datasets. Histories and Libraries can be navigated and selected, then individual datasets selected and loaded.
6 - the default view for BAM datasets is a coverage histogram.
7 - adding more tracks and other functions can be performed by using the left menu "Actions" on the GTB interface. Be sure to use "Save" before navigating away if you want to use the same browser again.
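To make the coverage histogram in step 6 concrete, here is a small sketch of how per-base coverage can be counted from SAM records. It is not how the GTB computes its display; it is a plain-Python illustration that walks each read's CIGAR string and skips over introns (`N`), which matter for spliced TopHat alignments. The three sample records are made up.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def coverage_from_sam(lines):
    """Count reads covering each (reference name, 1-based position)."""
    depth = Counter()
    for line in lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":
            continue  # unmapped read
        for length, op in CIGAR_OP.findall(cigar):
            length = int(length)
            if op in "M=X":
                # aligned bases: count every covered reference position
                for p in range(pos, pos + length):
                    depth[(rname, p)] += 1
                pos += length
            elif op in "DN":
                # deletion or intron: advance on the reference without counting
                pos += length
            # I, S, H and P do not consume reference positions
    return depth


# Made-up records: one unspliced read, one spliced read, one unmapped read
sam_lines = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t255\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t0\tchr1\t103\t255\t2M3N2M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
depth = coverage_from_sam(sam_lines)
for position in range(100, 110):
    print(position, depth[("chr1", position)])
```

Positions 103 and 104 are covered by both reads, while the intron at 105 to 107 gets no coverage. For real datasets a dedicated tool is far faster; this only shows what the histogram bars represent.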
If datasets are not available to add to a browser, then likely there is a mismatch between the browser's reference database and the database assigned to the dataset. In some cases this can and should be adjusted (perhaps database was unassigned after an analysis).
The SAM file can also be converted to interval, then BED format for visualization, but BAM is the most direct route and preserves the sequence content when zoomed in at the base level.
Hopefully this helps you and others to learn more about the GTB. This tool is under active development. Screencasts and more example documentation are on the way soon to offer more help.
Best,
Jen Galaxy team
On 9/27/11 12:45 PM, Peng, Tao wrote:
Jen, thank you for following up on my question. Is there any tool in Galaxy to visualize the coverage of aligned reads from TopHat on human chromosomes (as a histogram or density plot)?
Tao
Thanks a lot. Please see the attached screenshot of the Track Browser in Galaxy showing the TopHat results. What do those numbers mean? Is there any way to adjust the Y-axis to make the bars taller so they are easier to see?
Thanks,
tao
-----Original Message----- From: Jennifer Jackson [mailto:jen@bx.psu.edu] Sent: Tuesday, September 27, 2011 1:13 PM To: Peng, Tao Cc: galaxy-user Subject: Re: [galaxy-user] run tophat in galaxy
Hi Tao,
Yes, the resulting SAM dataset can be converted to BAM and viewed in the GTB (Galaxy Track Browser).
http://galaxyproject.org/wiki/Learn -> scroll to "Visualization" to find: http://galaxyproject.org/wiki/Learn/Visualization
The GTB can be reached through several different links, but one quick way to do this is:
1 - start with TopHat's output SAM dataset
2 - use the tool "NGS: SAM Tools -> SAM-to-BAM"
3 - hover over "Visualization" in the top menu bar, then click on "New Track Browser"
4 - at the prompt, name the visualization, specify the reference genome, and click on "Continue"
5 - once the browser opens, click on the "Add Datasets to Visualization" prompt to add datasets. Histories and Libraries can be navigated and selected, then individual datasets selected and loaded.
6 - the default view for BAM datasets is a coverage histogram.
7 - adding more tracks and other functions can be performed by using the left menu "Actions" on the GTB interface. Be sure to use "Save" before navigating away if you want to use the same browser again.
If datasets are not available to add to a browser, then likely there is a mismatch between the browser's reference database and the database assigned to the dataset. In some cases this can and should be adjusted (perhaps database was unassigned after an analysis).
The SAM file can also be converted to interval, then BED format for visualization, but BAM is the most direct route and preserves the sequence content when zoomed in at the base level.
Hopefully this helps you and others to learn more about the GTB. This tool is under active development. Screencasts and more example documentation are on the way soon to offer more help.
Best,
Jen Galaxy team
Hi, I wonder how I can quantitatively visualize the expression of a gene/transcript that has a significant p-value/q-value from the splicing differential testing in a Cuffdiff analysis. I tried the Track Browser in Galaxy and the UCSC Genome Browser, and neither really provides a quantitative view of gene/transcript expression from Cuffdiff results.
Thanks,
tao
-----Original Message----- From: Peng, Tao Sent: Friday, September 30, 2011 3:13 PM To: 'Jennifer Jackson' Cc: galaxy-user Subject: RE: [galaxy-user] run tophat in galaxy
Thanks a lot. Please see the attached screenshot of the Track Browser in Galaxy showing the TopHat results. What do those numbers mean? Is there any way to adjust the Y-axis to make the bars taller so they are easier to see?
Thanks,
tao
Tao,
Cuffdiff output is tabular data, and hence any statistical tool that supports tabular data can be used to analyze/visualize Cuffdiff outputs. In Galaxy, using tools in the categories Statistics and Graph/Display Data should enable you to visualize basic aspects of your data.
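As one illustration of treating the output as plain tabular data, the sketch below reads a tab-separated table and keeps the rows flagged as significant, sorted by p-value. The column names (`test_id`, `value_1`, `value_2`, `p_value`, `significant`) are assumptions about the Cuffdiff header and may differ between Cufflinks versions, so check them against your own file; the two sample rows are made up.

```python
import csv
import io


def significant_rows(handle, flag_column="significant"):
    """Return rows of a tab-separated table whose flag column says 'yes'."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if row[flag_column].strip().lower() == "yes"]


# Made-up stand-in for a Cuffdiff differential expression table
sample = io.StringIO(
    "test_id\tsample_1\tsample_2\tvalue_1\tvalue_2\tp_value\tsignificant\n"
    "geneA\tq1\tq2\t10.0\t40.0\t0.0001\tyes\n"
    "geneB\tq1\tq2\t5.0\t5.5\t0.6\tno\n"
)
hits = significant_rows(sample)
hits.sort(key=lambda row: float(row["p_value"]))
for row in hits:
    print(row["test_id"], row["value_1"], row["value_2"], row["p_value"])
```

For a real dataset, replace the `io.StringIO` object with an open file handle on the downloaded table.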
Good luck, J.
On Oct 4, 2011, at 4:05 PM, Peng, Tao wrote:
Hi, I wonder how I can quantitatively visualize the expression of a gene/transcript that has a significant p-value/q-value from the splicing differential testing in a Cuffdiff analysis. I tried the Track Browser in Galaxy and the UCSC Genome Browser, and neither really provides a quantitative view of gene/transcript expression from Cuffdiff results.
Thanks,
tao
-- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/Support
I have a related question: if I want to use the Ensembl mouse GTF file (Mus_musculus.NCBIM37.64), do I have to download and reformat it, or can Galaxy take it from the source directly?
Thanks
Hello,
Chromosome names will need to be modified once the file is imported, as explained in #5 of the FAQ: http://usegalaxy.org/u/jeremy/p/transcriptome-analysis-faq
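The FAQ's exact procedure is not reproduced here. As a rough sketch of what such a renaming involves, the code below assumes the mismatch is between Ensembl-style names (1, 2, X, MT) and UCSC-style names (chr1, chr2, chrX, chrM) and rewrites the first column of each GTF line. The function names are made up for illustration, and unplaced scaffolds would need their own mapping.

```python
def ensembl_to_ucsc_name(name):
    """Map an Ensembl-style chromosome name to a UCSC-style one."""
    if name.startswith("chr"):
        return name  # already UCSC-style
    if name == "MT":
        return "chrM"  # the mitochondrial genome is named differently
    return "chr" + name


def rename_gtf_chromosomes(lines):
    """Yield GTF lines with the first (seqname) column renamed."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line  # pass comments and blank lines through unchanged
            continue
        seqname, rest = line.split("\t", 1)
        yield ensembl_to_ucsc_name(seqname) + "\t" + rest


# Made-up GTF lines for illustration
gtf = [
    "#!genome-build NCBIM37\n",
    '1\tprotein_coding\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'MT\tprotein_coding\texon\t5\t50\t.\t-\t.\tgene_id "g2"; transcript_id "t2";\n',
]
renamed = list(rename_gtf_chromosomes(gtf))
print("".join(renamed), end="")
```

Within Galaxy itself, the same change is normally made with the text manipulation tools rather than a script.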
Hopefully this helps,
Best, Jen Galaxy team
On 9/15/11 8:22 PM, shamsher jagat wrote:
I have a related question: if I want to use the Ensembl mouse GTF file (Mus_musculus.NCBIM37.64), do I have to download and reformat it, or can Galaxy take it from the source directly?
Thanks
Hi Jen and Galaxy users, when I try to use Cufflinks in Galaxy and select "use reference annotation", I am not sure how to get the annotation file in GTF format for Cufflinks.
I was trying to download the annotation file from the Illumina-donated package on the Cufflinks website. Is this the right way to go?
Thanks,
tao
If using Ensembl make sure it is properly formatted. Vasu
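What "properly formatted" means was not spelled out in the thread. As a rough illustration, the sketch below checks a few basic GTF properties: nine tab-separated fields, numeric start and end with start not after end, a valid strand, and `gene_id` and `transcript_id` attributes. It also collects the chromosome names so they can be compared with the reference genome's names. The function name and sample lines are made up.

```python
def check_gtf(lines):
    """Return (problems, seqnames) for an iterable of GTF lines."""
    problems, seqnames = [], set()
    for number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            problems.append((number, "expected 9 tab-separated fields"))
            continue
        seqnames.add(fields[0])
        start, end, strand, attributes = fields[3], fields[4], fields[6], fields[8]
        if not (start.isdigit() and end.isdigit()):
            problems.append((number, "start/end are not integers"))
        elif int(start) > int(end):
            problems.append((number, "start is after end"))
        if strand not in {"+", "-", "."}:
            problems.append((number, "unexpected strand value"))
        if "gene_id" not in attributes or "transcript_id" not in attributes:
            problems.append((number, "missing gene_id or transcript_id"))
    return problems, seqnames


# Made-up lines: one valid, one with too few fields, one with start > end
sample_gtf = [
    'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1\tsrc\texon\t100\t200\t.\t+\tgene_id "g1";\n',
    'chr2\tsrc\texon\t300\t250\t.\t-\t.\tgene_id "g2"; transcript_id "t2";\n',
]
problems, seqnames = check_gtf(sample_gtf)
print(problems)
print(sorted(seqnames))
```

If the collected names look like `1` or `MT` while the genome in Galaxy uses `chr1` and `chrM`, the file needs renaming before Cufflinks can match it to the alignments.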
On the Cufflinks site, they have annotation from UCSC, NCBI and Ensembl. I am not sure which one to download and then load into Galaxy. Thanks, Tao
Hi, I have downloaded the UCSC hg19 annotation files from the Cufflinks site. Should I just load the genes.GTF file (about 100 MB) into Galaxy for the Cufflinks analysis?
Thanks,
tao
UCSC should work fine. Vasu
I had a similar question. After I downloaded a bacterial reference genome GFF file and loaded it into Galaxy, the "use reference annotation" feature of Cufflinks kept telling me it needs to be a GTF file instead. When I check the Cufflinks manual at http://cufflinks.cbcb.umd.edu/manual.html, it looks like both GFF and GTF should be OK. Any insight? Gary
This post might help with converting GFF to GTF: http://bioperl.org/pipermail/bioperl-l/2005-November/020109.html
J
Hi, I am not sure why running Cufflinks failed here. Thanks for any suggestions,
tao
-----------------------
Tool: Cufflinks
Name: Cufflinks on data 6 and data 26: assembled transcripts
Created: Sep 01, 2011
Filesize: 81.3 Mb
Dbkey: hg19
Format: gtf
Tool Version:

Input Parameter                              Value
SAM or BAM file of aligned RNA-Seq reads     6: Tophat for R4_CG_wh_accepted_hits
Max Intron Length                            300000
Min Isoform Fraction                         0.05
Pre MRNA Fraction                            0.05
Perform quartile normalization               Yes
Conditional (reference_annotation)           1
Reference Aonnotation                        26: Homo_sapiens.GRCh37.63.gtf
Conditional (bias_correction)                0
Conditional (seq_source)                     0
Conditional (singlePaired)                   0
----------------------------------------------------------
Message from History panel in GALAXY:
An error occurred running this job:
cufflinks v1.0.3
cufflinks -q --no-update-check -I 300000 -F 0.050000 -j 0.050000 -p 8 -N -b /galaxy/data/hg19/sam_index/hg19.fa
Error running cufflinks.
[18:40:45] Inspecting reads and determining fragment length distribution.
Processed 915556 loci.
Hello,
This is the same reply as for the bug report, repeated here for others who may run into the same problem: a job that fails with this error:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
A std::bad_alloc means that a memory allocation failed. The reason is explained in #3 of the RNA-seq FAQ: http://usegalaxy.org/u/jeremy/p/transcriptome-analysis-faq#faq3
A local or cloud instance may be the solution. These options are explained here: http://galaxyproject.org/wiki/Big%20Picture/Choices
Our apologies for any inconvenience,
Best,
Jen Galaxy team
Hi, I had 3 GTF files from Ensembl, UCSC and NCBI for annotation. Only the Ensembl file was recognized by Galaxy Cufflinks as a GTF file, although they all have the .GTF extension. I am not sure why the UCSC and NCBI GTF files were seen as GFF files.
Thx,
tao
I'm pretty sure Galaxy ignores the original filename extension (it will be stored on disk as *.dat once uploaded to Galaxy).
If you could post the start of each file (or links to the complete files) that would be very helpful for working out why Galaxy has misidentified the GTF files as GFF.
Peter
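As a rough way to inspect the start of each file before posting it, the sketch below looks at the ninth (attributes) column of the first records. GTF attributes look like gene_id "X"; while GFF3 attributes look like ID=X. This is only a heuristic written for this thread: it is not how Galaxy decides on a datatype, and the name guess_annotation_format is made up for the example.

```python
import re

# GTF attributes: key "value";   GFF3 attributes: key=value
GTF_ATTR = re.compile(r'^\s*\w+ "[^"]*";')
GFF3_ATTR = re.compile(r'^\s*\w+=')


def guess_annotation_format(lines, max_records=100):
    """Count how many of the first records look like GTF, GFF3, or neither."""
    votes = {"gtf": 0, "gff3": 0, "unknown": 0}
    seen = 0
    for line in lines:
        # Skip comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            # Not nine tab-separated columns, e.g. spaces instead of tabs.
            votes["unknown"] += 1
        elif GTF_ATTR.match(fields[8]):
            votes["gtf"] += 1
        elif GFF3_ATTR.match(fields[8]):
            votes["gff3"] += 1
        else:
            votes["unknown"] += 1
        seen += 1
        if seen >= max_records:
            break
    return votes


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        print(guess_annotation_format(handle))
```

A file whose records mostly land in "unknown" is worth opening in a text editor to check for extra header lines or columns separated by spaces rather than tabs.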
Hi Jen, I have been using the following link to learn how to visualize TopHat results:
http://main.g2.bx.psu.edu/u/jeremy/p/galaxy-rna-seq-analysis-exercise
How can I get the RefGene annotation for the whole human genome instead of only chr19, as listed in the tutorial?
Thanks,
tao
-----Original Message----- From: Jennifer Jackson [mailto:jen@bx.psu.edu] Sent: Thu 8/18/2011 3:46 PM To: galaxy-user@bx.psu.edu Cc: Peng, Tao Subject: visualization of alignment
Hello Tao,
For the Bowtie results, the number of aligned reads may be low because the data is RNA and not DNA. TopHat is generally considered a better choice for RNA since it allows reads to bridge splice sites (introns). The full documentation for each program is on each tool's form and/or you can contact the tool authors with scientific questions at tophat.cufflinks@gmail.com.
Also, a tutorial and FAQ are available here: http://usegalaxy.org/u/jeremy/p/galaxy-rna-seq-analysis-exercise http://usegalaxy.org/u/jeremy/p/transcriptome-analysis-faq
For visualization, an update that allows the use of a user-specified fasta reference genome is coming out very soon. For now, you can view annotation by creating a custom genome build, but the actual reference will not be included. Use "Visualization -> New Track Browser" and follow the instructions for "Is the build not listed here? Add a Custom Build".
Help for using the tool is available here: http://galaxyproject.org/Learn/Visualization
As stated before, please email the mailing list directly and not individual team members. Specifically, send with a "to" to the mailing list (only), and do not include team members as a "to" or "cc" unless asked to do so when sharing private data. Our internal tracking system and public archives rely on this method. Thank you for your future cooperation.
Best,
Jen Galaxy team
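If the custom build form asks for chromosome names and lengths (an assumption here; the message above does not say what the form requires), they can be computed from the reference FASTA with a short script. This is a minimal sketch: the function name fasta_lengths and the tab-separated "name, length" output are illustrative choices, not a documented Galaxy format.

```python
def fasta_lengths(lines):
    """Return {sequence_name: length} for FASTA text, in file order."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else "unnamed"
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        for seq_name, length in fasta_lengths(handle).items():
            print(f"{seq_name}\t{length}")
```

For a single-sequence viral genome such as HSV-2 the output would be one line. The sequence name must match the reference name used in the alignments being displayed.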
On 8/18/11 3:15 PM, Peng, Tao wrote:
Hi Jen, I have used Bowtie to align my RNA-seq reads to the HSV-2 genome; out of 35,000,000 lines, only 621 lines were left when I chose to keep mapped reads only. How can I visualize these aligned reads on the HSV-2 genome?
In the panel of the converted SAM to BAM dataset, I tried to use the data in Trackster, but I am not sure how to build an HSV genome as a reference.
I appreciate your help,
tao
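To double-check a mapped-read count like the one above outside Galaxy, the alignment lines in a SAM file can be tallied by their FLAG field: bit 0x4 set means the read is unmapped. The sketch below is illustrative (count_sam_reads is a made-up name) and counts alignment lines, not distinct reads, so reads reported more than once are counted more than once.

```python
def count_sam_reads(lines):
    """Return (total_alignment_lines, mapped_lines) for SAM text."""
    total = mapped = 0
    for line in lines:
        # Header lines start with "@".
        if line.startswith("@"):
            continue
        fields = line.split("\t")
        # A SAM alignment line has at least 11 mandatory fields.
        if len(fields) < 11:
            continue
        flag = int(fields[1])
        total += 1
        # FLAG bit 0x4 marks an unmapped read.
        if not flag & 0x4:
            mapped += 1
    return total, mapped


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        total, mapped = count_sam_reads(handle)
    print(f"{mapped} of {total} alignment lines are mapped")
```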
galaxy-user@lists.galaxyproject.org