Jeremy,

 

I have another question. When I filter by the gene list, the filtered list contains multiple rows per gene. Shouldn't I get one gene per row? I have attached a snapshot of the output, but I am not sure whether the Galaxy server will accept it. I did see the discussion on another forum:

http://seqanswers.com/forums/showthread.php?t=8830

 

which suggests that there are possible complications in getting one gene per row. My next question is: in that scenario, what is the best way of representing one FPKM value per gene? Should we take the average of the FPKM values per gene? I think the gene-filtered file is still giving the transcript FPKM values, but these values are different from those in the previous file, which was filtered by transcript ID.

 

Thanks.

Jagat
 
 

 
On Tue, May 3, 2011 at 3:08 AM, shamsher jagat <kanwarjag@gmail.com> wrote:
Jeremy,
 
I have been trying to follow the steps for filtering Cufflinks output files that you described in one of your previous messages (http://gmod.827538.n3.nabble.com/Re-downstream-analysis-of-cuffdiff-out-put-td2836457.html):
 
I have shared my history with you, but in summary:
 

File 35: Filter GTF data by attribute values list on data 11 (combined GTF) and data 33 (the gene expr file). Shouldn't this have one gene per row? It does not.

File 39: Filter GTF data by attribute values list on data 11 and data 38 (Cuffdiff splicing expr) failed. I would assume that it should filter on the basis of tss_id. The error message is:

Traceback (most recent call last):
  File "/var/opt/galaxy/g2test/galaxy_test/tools/filters/gff/gtf_filter_by_attribute_values_list.py", line 67, in <module>
    filter( gff_file, attribute_name, ids_file, output_file )
  File "/var/opt/galaxy/g2test/galaxy_test/tools/filters/gff/gtf_filter_by_attribute_values_list.py", line 57, in filter
    if attributes[ attribute_name ] in ids_dict:
KeyError: 'tss_id'

File 40: Filter GTF data by attribute values list on data 11 and data 34 (TSS group expr) failed, and the error message is:

Traceback (most recent call last):
  File "/var/opt/galaxy/g2test/galaxy_test/tools/filters/gff/gtf_filter_by_attribute_values_list.py", line 67, in <module>
    filter( gff_file, attribute_name, ids_file, output_file )
  File "/var/opt/galaxy/g2test/galaxy_test/tools/filters/gff/gtf_filter_by_attribute_values_list.py", line 57, in filter
    if attributes[ attribute_name ] in ids_dict:
KeyError: 'tss_id'
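
For illustration, here is a minimal sketch of how a KeyError like the one above can arise. It is not the actual Galaxy tool code. The function names, the example records, and the IDs are all hypothetical. The assumption is that at least one record in the combined GTF has no tss_id attribute, so a direct dictionary lookup fails on that record; using .get() skips such records instead.

```python
def parse_attributes(attr_field):
    """Parse the 9th GTF column, e.g. 'gene_id "XLOC_1"; tss_id "TSS1";'."""
    attributes = {}
    for part in attr_field.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition(" ")
        attributes[name] = value.strip().strip('"')
    return attributes


def filter_gtf_lines(gtf_lines, attribute_name, wanted_ids):
    """Keep records whose attribute value is in wanted_ids.

    Records that lack the attribute are skipped rather than raising KeyError.
    """
    kept = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        attributes = parse_attributes(fields[8])
        # attributes[attribute_name] would raise KeyError for a record
        # without that attribute; .get() returns None instead.
        if attributes.get(attribute_name) in wanted_ids:
            kept.append(line)
    return kept


# Hypothetical records: the second one has no tss_id attribute.
EXAMPLE_LINES = [
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\t'
    'gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; tss_id "TSS1";',
    'chr1\tCufflinks\texon\t300\t400\t.\t+\t.\t'
    'gene_id "XLOC_000002"; transcript_id "TCONS_00000002";',
]

kept = filter_gtf_lines(EXAMPLE_LINES, "tss_id", {"TSS1"})
print(len(kept))
```

If this assumption holds, filtering on transcript_id would work while filtering on tss_id would fail, because every record carries a transcript_id but not every record carries a tss_id.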

 

I would assume that if one gene has more than one ID, then there is splicing.

In contrast, however, the isoform file with transcript IDs works fine (File 20).

On a different note, can I convert a GTF file to a tab-delimited text file? I tried to convert file 11 to txt (using Edit attributes), but the file is not properly formatted, especially the p_id and tss_id columns. Am I doing something wrong?
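
To make the goal concrete, here is a rough sketch of the kind of conversion meant here, done outside Galaxy. It assumes a standard nine-column GTF whose last column holds name "value"; pairs. The function names, the column labels, and the example record are hypothetical. Each requested attribute gets its own column, and a record that lacks an attribute gets an empty cell.

```python
import csv
import io

# Labels for the eight fixed GTF columns (the ninth holds the attributes).
FIXED_COLUMNS = ["seqname", "source", "feature", "start",
                 "end", "score", "strand", "frame"]


def gtf_to_rows(gtf_lines, attribute_names):
    """Turn GTF records into flat rows with one column per requested attribute."""
    rows = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        attributes = {}
        for part in fields[8].split(";"):
            part = part.strip()
            if part:
                name, _, value = part.partition(" ")
                attributes[name] = value.strip().strip('"')
        # Missing attributes become empty cells so columns stay aligned.
        rows.append(fields[:8] + [attributes.get(n, "") for n in attribute_names])
    return rows


def write_table(rows, attribute_names, handle):
    """Write a header line plus the rows as tab-delimited text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(FIXED_COLUMNS + list(attribute_names))
    writer.writerows(rows)


# Hypothetical record with p_id and tss_id attributes.
example = [
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\t'
    'gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; '
    'p_id "P1"; tss_id "TSS1";',
]
names = ["gene_id", "transcript_id", "p_id", "tss_id"]
out = io.StringIO()
write_table(gtf_to_rows(example, names), names, out)
print(out.getvalue())
```

Passing an open file handle instead of the io.StringIO object would write the table to disk.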

Thanks.