Hi,

My name is Ying, and I am a postdoctoral associate at Yale University. I am trying to use Tophat in Galaxy to analyze paired-end RNA-seq data. However, after I run the groomer and then Tophat on the data, I only get an empty file, and I am wondering what the problem is.

One thing I am concerned about is that my files are quite large: each end is a 20.7 GB file, so 41.4 GB in total for both ends of this sample. Do you think the problem is that the files are too big?

Also, when the genome center sent me the data, they had already separated the two ends into two files, so I uploaded two files for one sample. When I run the Tophat analysis, do I need to merge those two files into one?

Finally, do you know the usual parameter settings for paired-end RNA-seq analysis (with a 300 bp fragment)?

Thank you very much! I really appreciate your help.

Best,
Ying Zhang, M.D., Ph.D.
Postdoctoral Associate
Department of Genetics, Yale University School of Medicine
300 Cedar Street, S320
New Haven, CT 06519
Tel: (203)737-2616
Fax: (203)737-2286
Hi Ying,

One thing I am concerned about is that my files are kind of large, e.g., for each end I got a file of 20.7 GB, so 41.4 GB in total for both ends of this sample. So do you think it is because the files are too big?
You'll want to use the FTP upload process to upload data this large to Galaxy. See the upload page for details; we recommend FileZilla as a nice GUI for performing FTP: http://filezilla-project.org/
Also, when the genome center sent me the data, they had already separated the two ends into two files, so I uploaded two files for one sample. But when I do the Tophat analysis, do I need to merge those two files into one file?
You'll want to perform a single Tophat analysis on both files since they're paired. However, no merging is required: Tophat will accept both files in a single run; see the Tophat tool for details.
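For anyone running Tophat outside Galaxy, here is a minimal sketch of how the two files go into one run on the command line. It only builds and prints the command; it does not execute anything. The index name `hg19`, the FASTQ file names, the output directory, and the inner-distance value of 150 are all placeholders I made up for illustration, so substitute your own.

```python
import shlex

def tophat_paired_command(index_base, reads_1, reads_2, inner_dist, out_dir="tophat_out"):
    """Build (but do not run) a Tophat command line for one paired-end sample."""
    return [
        "tophat",
        "-r", str(inner_dist),  # expected mean inner distance between mates
        "-o", out_dir,          # output directory
        index_base,             # Bowtie index base name (placeholder)
        reads_1,                # first-end reads
        reads_2,                # second-end reads, same order as reads_1
    ]

# All argument values below are hypothetical placeholders.
cmd = tophat_paired_command("hg19", "sample_1.fastq", "sample_2.fastq", 150)
print(shlex.join(cmd))
```

The two read files are passed as separate arguments after the index, which is why no merging step is needed.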
Do you know what the usual parameter setup is for paired-end RNA-seq analysis (with a 300 bp fragment)?
Tophat has a set of default parameters, but you'll likely want to experiment with different values to see what works best for your data. Note that the mean inner distance between pairs is (Insert_size - 2*N), where Insert_size is the fragment size (300 bp in your case) and N is the length of each read.

Best,
J.
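As a quick worked example of the inner-distance formula, here is a small sketch. The 300 bp fragment size comes from the question; the read lengths are assumptions, since the thread does not say how long the reads are.

```python
def mean_inner_distance(fragment_size, read_length):
    """Mean inner distance between mates: fragment size minus both read lengths."""
    return fragment_size - 2 * read_length

# 300 bp fragment from the question; read lengths are assumed examples.
print(mean_inner_distance(300, 75))   # 150
print(mean_inner_distance(300, 100))  # 100
```

A negative result would mean the two reads overlap within the fragment.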
participants (2)
- Jeremy Goecks
- Ying Zhang