Re: [galaxy-user] question about Filtering Cufflink files
Jagat, First, a couple of housekeeping issues: (a) the questions you're asking are better suited to the galaxy-user list (questions about using Galaxy and performing analyses) than to galaxy-dev (questions about installing Galaxy locally and tool development), so I've moved this thread to galaxy-user; (b) please start new threads when appropriate rather than replying to older threads, as this keeps threads shorter and more focused. On to your questions:
I have another question: when I filter the gene list, the filtered list has multiple rows per gene. Shouldn't I have one gene per row? I have attached a snapshot of the output, but I am not sure whether the Galaxy server will accept it. I did see the discussion on another forum: http://seqanswers.com/forums/showthread.php?t=8830
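GTF files have multiple lines per feature (for example, separate lines for a transcript and for each of its exons), so seeing several rows per gene in your filtered output is reasonable.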
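To see why a filter on gene ID returns several rows, here is a minimal Python sketch that counts GTF lines per gene_id. The sample lines are made up for illustration (they only imitate the layout of a Cufflinks GTF, and the IDs and FPKM values are not from your data), and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Made-up GTF lines imitating Cufflinks-style output (illustrative values only).
SAMPLE_GTF = """\
chr1\tCufflinks\ttranscript\t100\t900\t1000\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.1"; FPKM "12.5";
chr1\tCufflinks\texon\t100\t400\t1000\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.1"; exon_number "1"; FPKM "12.5";
chr1\tCufflinks\texon\t600\t900\t1000\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.1"; exon_number "2"; FPKM "12.5";
chr1\tCufflinks\ttranscript\t100\t400\t1000\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.2"; FPKM "3.0";
chr1\tCufflinks\texon\t100\t400\t1000\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.2"; exon_number "1"; FPKM "3.0";
chr2\tCufflinks\ttranscript\t50\t300\t1000\t-\t.\tgene_id "CUFF.2"; transcript_id "CUFF.2.1"; FPKM "7.25";
chr2\tCufflinks\texon\t50\t300\t1000\t-\t.\tgene_id "CUFF.2"; transcript_id "CUFF.2.1"; exon_number "1"; FPKM "7.25";
"""

# Matches attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_attributes(field):
    """Turn the ninth GTF column into a dict of attribute name -> value."""
    return dict(ATTR_RE.findall(field))


def rows_per_gene(gtf_text):
    """Count how many GTF lines carry each gene_id."""
    counts = defaultdict(int)
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a well-formed GTF record
        attrs = parse_attributes(cols[8])
        counts[attrs.get("gene_id")] += 1
    return dict(counts)


if __name__ == "__main__":
    for gene, n in rows_per_gene(SAMPLE_GTF).items():
        print(gene, n)
```

In this toy input, the gene with two transcripts appears on five lines and the gene with one transcript on two, which is the same pattern you are seeing after filtering.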
which suggests possible complications in getting one gene per row. My next question is: in that scenario, what is the best way of representing one FPKM value per gene? Should we take the average FPKM per gene? I think the gene-filtered file is still giving the transcript FPKM values, but these values are different from those in the previous file filtered by transcript ID.
As Vasu noted, this is an ongoing area of research. For some experiments, it may be reasonable to group alternatively spliced isoforms of the same gene and jointly estimate FPKM, and for others it may not. Fortunately, if you do want to group transcripts to get gene FPKM values, Cuffdiff does this for you: see its gene FPKM expression file. Best, J.
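P.S. If you want to work with the gene-level values outside Galaxy, a rough Python sketch for reading a tab-delimited gene FPKM table into one record per gene follows. The column names used here (gene_id, and per-sample columns ending in _FPKM) and the sample rows are assumptions for illustration; check the header of your own Cuffdiff output before relying on them.

```python
import csv
import io

# Made-up table in the general shape of a gene FPKM tracking file.
# Column names and values are assumptions, not taken from real output.
SAMPLE_TRACKING = (
    "tracking_id\tgene_id\tgene_short_name\tlocus\tq1_FPKM\tq2_FPKM\n"
    "XLOC_000001\tXLOC_000001\tGeneA\tchr1:100-900\t12.5\t30.0\n"
    "XLOC_000002\tXLOC_000002\tGeneB\tchr2:50-300\t7.25\t0.0\n"
)


def gene_fpkm_table(handle):
    """Return {gene_id: {fpkm_column: value}} with one entry per gene."""
    reader = csv.DictReader(handle, delimiter="\t")
    table = {}
    for row in reader:
        table[row["gene_id"]] = {
            name: float(value)
            for name, value in row.items()
            if name.endswith("_FPKM")  # keep only the per-sample FPKM columns
        }
    return table


if __name__ == "__main__":
    table = gene_fpkm_table(io.StringIO(SAMPLE_TRACKING))
    for gene_id, values in table.items():
        print(gene_id, values)
```

To use it on a real file, pass an open file handle instead of the io.StringIO object.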
participants (1)
-
Jeremy Goecks