Hi Vasu,

Please cc the galaxy-user email list so that everyone can benefit from this discussion and so that it is archived.

On to your questions:

OK, I have two replicates of a sample. My question is: when I run Cuffcompare or Cuffdiff, I have to use them as a single file in the RNA-seq analysis. I can either combine the BAM files or combine the reads before running Tophat. Which is suggested?

The answer depends on what you're looking for. There are two options:

(1) If you're looking for differences between the two samples--e.g. the samples came from two different tissues or from two different time points in the same tissue--you should run each sample through Tophat and Cufflinks separately, run Cuffcompare on the GTF files generated by Cufflinks, and then run Cuffdiff using the combined transcripts from Cuffcompare together with each sample's Tophat alignments.

(2) If you want to treat the two samples as a single sample--e.g. the samples are two lanes of sequencing data from the same biological replicate--then you should combine all your reads before running Tophat-->Cufflinks-->Cuffcompare.
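
If you ever run these tools outside Galaxy, the two options above could be sketched roughly as follows. This is only an illustration, not the exact Galaxy workflow: the function name plan_commands, the sample names, and the output directory names are all made up, and I'm assuming a Tophat version that writes accepted_hits.bam (older releases write accepted_hits.sam). The script only builds the command lines; it does not run anything.

```python
def plan_commands(reads, index, compare_samples):
    """Build the command lines (as argv lists) for the two options.

    reads: dict mapping a sample name to its FASTQ file (hypothetical names).
    index: Bowtie index prefix that Tophat aligns against.
    compare_samples: True for option (1), False for option (2).
    """
    cmds = []
    if compare_samples:
        # Option (1): align and assemble each sample on its own.
        gtfs, bams = [], []
        for name, fastq in reads.items():
            cmds.append(["tophat", "-o", f"{name}_tophat", index, fastq])
            bam = f"{name}_tophat/accepted_hits.bam"  # assumed Tophat output name
            cmds.append(["cufflinks", "-o", f"{name}_cufflinks", bam])
            gtfs.append(f"{name}_cufflinks/transcripts.gtf")
            bams.append(bam)
        # Cuffcompare merges the per-sample assemblies into cuffcmp.combined.gtf.
        cmds.append(["cuffcompare", "-o", "cuffcmp"] + gtfs)
        # Cuffdiff takes one transcript GTF plus one alignment file per sample.
        cmds.append(["cuffdiff", "-o", "cuffdiff_out", "cuffcmp.combined.gtf"] + bams)
    else:
        # Option (2): Tophat accepts a comma-separated list of read files,
        # so all lanes are aligned together as a single sample.
        combined = ",".join(reads.values())
        cmds.append(["tophat", "-o", "combined_tophat", index, combined])
        cmds.append(["cufflinks", "-o", "combined_cufflinks",
                     "combined_tophat/accepted_hits.bam"])
        cmds.append(["cuffcompare", "-o", "cuffcmp",
                     "combined_cufflinks/transcripts.gtf"])
    return cmds


if __name__ == "__main__":
    lanes = {"lane1": "lane1.fastq", "lane2": "lane2.fastq"}  # hypothetical files
    for cmd in plan_commands(lanes, "genome_index", compare_samples=False):
        print(" ".join(cmd))
```

Each inner list could be handed to subprocess.run once the file names match your own data.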


Secondly, I have downloaded the Ensembl file as you suggested, but how do I make Cufflinks or Cuffcompare read this file when I run the analysis?

The Ensembl gene annotation GTF should be used as the "reference" for Cuffcompare; optionally, you can use the same GTF as a reference for Cufflinks as well.
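
For what it's worth, on the command line the reference annotation is passed with -r for Cuffcompare and with -G for Cufflinks (note that -G makes Cufflinks quantify only the annotated transcripts rather than assemble new ones). A small sketch of that, where the helper name with_reference and the file names are made up for illustration:

```python
def with_reference(cmd, reference_gtf):
    """Return a copy of cmd with the reference-annotation flag added.

    cmd: argv list whose first element is the tool name.
    reference_gtf: path to the annotation GTF (hypothetical file name below).
    """
    flags = {"cuffcompare": "-r", "cufflinks": "-G"}
    flag = flags.get(cmd[0])
    if flag is None:
        # Other tools are returned unchanged.
        return list(cmd)
    return [cmd[0], flag, reference_gtf] + list(cmd[1:])


if __name__ == "__main__":
    cmd = ["cuffcompare", "-o", "cuffcmp", "transcripts.gtf"]
    print(" ".join(with_reference(cmd, "ensembl_genes.gtf")))
```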

Best,
J.