Dear Noa and Bomba,

I am in a similar situation to yours: I have hardly any programming experience but want to analyse a bacterial transcriptome before and after a stress.

I am very new to Galaxy and would be obliged if anyone could give me more hints. I have a few queries.

In brief: we did our sequencing on a HiSeq 2000, and from our partner I received two FASTQ files for each treatment, for example R1 and R2.

I was able to upload these FASTQ files and the reference genome file (FASTA) via FTP.

After the upload I ran the FASTQ Groomer. As I understand it, I must quality-filter my sequences, but I am not sure what the cutoff should be.
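
To make the idea of a quality cutoff concrete, here is a minimal Python sketch of a mean-quality filter. It assumes the reads are in Sanger (Phred+33) encoding, which is what FASTQ Groomer typically produces. The cutoff of 20 is only an illustrative default, not a recommendation from this thread, and the function names are hypothetical. It is not how Galaxy's own filtering tools are implemented.

```python
def mean_quality(quality_line):
    """Mean Phred score of one FASTQ quality string (assumes Phred+33)."""
    scores = [ord(ch) - 33 for ch in quality_line]
    return sum(scores) / len(scores) if scores else 0.0


def filter_fastq_records(lines, min_mean_quality=20):
    """Keep 4-line FASTQ records whose mean quality meets the cutoff.

    `lines` is a list of FASTQ lines without trailing newlines.
    `min_mean_quality=20` is an assumed, illustrative threshold.
    """
    kept = []
    for i in range(0, len(lines) - 3, 4):
        header, sequence, _plus, quality = lines[i:i + 4]
        if mean_quality(quality) >= min_mean_quality:
            kept.append((header, sequence, quality))
    return kept


if __name__ == "__main__":
    example = [
        "@read1", "ACGT", "+", "IIII",  # 'I' encodes Phred 40
        "@read2", "ACGT", "+", "####",  # '#' encodes Phred 2
    ]
    for header, sequence, quality in filter_fastq_records(example):
        print(header, sequence, quality)
```

Raising or lowering `min_mean_quality` shows how many reads a given cutoff would discard, which can help when choosing a value for the equivalent Galaxy tool.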

After reading Noa's email, it seems I should do the Bowtie mapping after the groomer and filter steps. Is there any other intermediate step before I go to Bowtie?

Another question: should I join the two files (FASTQ joiner) at any stage of the analysis?
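
Whichever way the R1 and R2 files are handled, paired-end mapping expects the mates to appear in the same order in both files. The Python sketch below checks that ordering from the read headers alone. It assumes the two common Illumina naming styles (a trailing `/1` or `/2`, or a space followed by `1:...` or `2:...`). The function names are hypothetical and this is not part of any Galaxy tool.

```python
def mate_id(header):
    """Return the part of a FASTQ header shared by both mates.

    Assumes either 'name/1' and 'name/2', or 'name 1:N:0:...' and
    'name 2:N:0:...' style headers.
    """
    name = header[1:] if header.startswith("@") else header
    name = name.split()[0]  # drop anything after the first space
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def first_mismatch(r1_headers, r2_headers):
    """Index of the first pair whose IDs differ, or None if all pairs match."""
    for index, (h1, h2) in enumerate(zip(r1_headers, r2_headers)):
        if mate_id(h1) != mate_id(h2):
            return index
    if len(r1_headers) != len(r2_headers):
        # One file has extra reads after the shared prefix.
        return min(len(r1_headers), len(r2_headers))
    return None


if __name__ == "__main__":
    r1 = ["@read_a/1", "@read_b/1"]
    r2 = ["@read_a/2", "@read_b/2"]
    print(first_mismatch(r1, r2))
```

If filtering is applied to R1 and R2 separately, a check like this can reveal whether reads were dropped from one file but not the other.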

Bomba, could you please send me the link you mentioned for the RNA-seq analysis on the University of Alabama website? Although it is for eukaryotes, it may help me to understand the steps.

Thanking you all

From: Noa Sher <noa.sher@gmail.com>
To: Jeremy Goecks <jeremy.goecks@emory.edu>
Cc: "galaxy-user@lists.bx.psu.edu (galaxy-user@lists.bx.psu.edu)" <galaxy-user@lists.bx.psu.edu>; closeticket@galaxyproject.org; Bomba Dam <bomba.dam@mpi-marburg.mpg.de>
Sent: Thursday, February 16, 2012 8:31 PM
Subject: Re: [galaxy-user] Using galaxy for Bacterial RNA-seq

Hi Bomba

I have been recently struggling with the same issue (RNA-Seq on a bacteria; no programming experience).

I was in touch with the authors of the tophat-cufflinks suite and they all suggested that, given that bacteria have few or no introns, you can do bowtie followed by cufflinks and skip the tophat. Alternatively, if you do decide to do tophat and then cufflinks, make SURE to change the intron size parameters, otherwise you will get a mess. I used min intron distance 10bp, max size 1000bp. If you just want to do comparative work you can do bowtie and then cuffdiff directly. Look at your data with the Galaxy genome browser after you run tophat or bowtie to make sure it looks OK. (I found it easier to convert the bowtie output from SAM to BAM and then view it.) One thing I haven't fully looked into is what happens at the ends of the mapping: bowtie assumes linear chromosomes, not circular ones, so be aware that not all reads will align properly at the ends.

Feel free to contact me if you need more help - I am definitely not an expert but have been struggling through doing RNA-Seq on Galaxy for the past month or so, so may be able to help with some things. Make sure you use a GTF reference annotation (if you have one) that matches your genome!
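
The point about chromosome ends can be illustrated with a small Python sketch. It uses exact string matching as a stand-in for alignment, so it is not how bowtie works internally. The workaround shown (appending the start of the genome to its end before searching) is a commonly used trick and is an assumption here, not something tested in this thread. The function names and the toy sequence are hypothetical.

```python
def find_on_linear(genome, read):
    """Position of `read` in the genome treated as linear, or -1."""
    return genome.find(read)


def find_on_circular(genome, read):
    """Position of `read` in the genome treated as circular, or -1.

    Appends the first len(read) - 1 bases to the end so that reads
    spanning the junction between the last and first base can match.
    """
    extended = genome + genome[:len(read) - 1]
    return extended.find(read)


if __name__ == "__main__":
    genome = "ACGTTGCA"  # toy circular chromosome
    read = "CAAC"        # last two bases followed by the first two
    print(find_on_linear(genome, read))    # no match on the linear sequence
    print(find_on_circular(genome, read))  # match starting near the end
```

Reads that start near the last base of the reference behave like `read` above: they only match once the reference is treated as circular.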

Good luck!

noa

On 16/02/2012 20:01, Jeremy Goecks wrote:

Bomba,

I'm not familiar enough with bacterial/prokaryotic transcriptomes to suggest a possible workflow. You might try the standard Tophat-Cufflinks-Cuffcompare/merge-Cuffdiff workflow and see whether you get meaningful results; Tophat runs Bowtie internally, so there's no reason to run Bowtie separately unless there are Bowtie-specific parameters that you need to modify. I've had very little experience with PALMapper and can't speak to its efficacy, either for eukaryotic or prokaryotic transcriptome analyses.

Finally, I've cc'd the galaxy-user mailing list. Using this list is the best way to reach the Galaxy user community and get in touch with someone that has used Galaxy to analyze bacterial transcriptomes.

Good luck,
J.


On Feb 16, 2012, at 9:17 AM, Bomba Dam wrote:

Dear Dr. Goecks,

I am working as a post-doctoral fellow at MPI Marburg, Germany. We are trying to understand the differential expression of genes in a methanotrophic bacterium under different growth conditions. We are sequencing the transcriptome using Illumina HiSeq. As I don't have expertise in programming languages, I found the Galaxy interface very user-friendly for doing such transcriptome analysis. However, I could not find a stepwise protocol/workflow for mapping bacterial RNA-seq reads against the reference genome (we have the completely sequenced genome of our model organism). I have found a detailed step-by-step workflow for RNA-seq analysis on the University of Alabama website (uab.edu). However, it refers to the eukaryotic system.
Most examples provided and used for analysis are from eukaryotic systems. I am a bit confused about whether the same workflow will also work well for bacterial systems, as there are no splicing events, or whether I should make some modifications. Could you kindly suggest which workflow I should follow for mapping the bacterial reads (Bowtie, Tophat or PALMapper) and for the subsequent quantification steps? I would appreciate some guidance in this regard.

With kind regards,

Bomba Dam
--
Dr. BOMBA DAM
Alexander von Humboldt Postdoctoral Research Fellow
Max-Planck-Institut für terrestrische Mikrobiologie
Karl-von-Frisch-Straße 10
D-35043 Marburg, Germany
E mail: bomba.dam@mpi-marburg.mpg.de
PHONE: +49 176 321 321 75 (Mobile); +49 6421 178 721 (LAB); +49 6421 2828516 (ROOM)

Assistant Professor of Microbiology
Department of Botany, Institute of Science
Visva-Bharati (A Central University)
Santiniketan, West Bengal 731235, India.
E mail: bumba_micro@visva-bhatari.ac.in, bumba_micro@rediffmail.com;

___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/
