Hi Jen:

I still have a small problem with the chromosome names. The chloroplast and mitochondrial genes are named "ChrC" and "ChrM" in the gff3 file, and I need to change these to "chrC" and "chrM". How do I change the case of only the initial letter rather than the entire word?

Thanks,
Yang

----- Original Message -----
From: "Jennifer Jackson" <jen@bx.psu.edu>
To: "Yang Bi" <beyond@stanford.edu>
Cc: galaxy-user@lists.bx.psu.edu
Sent: Monday, January 13, 2014 6:54:53 PM
Subject: Re: [galaxy-user] all FPKMs are 0 in the tmap files produced by cuffcompare

Hello Yang,

Glad the problem was isolated - the mismatched chromosome names are definitely something to be fixed. The tools in "Text Manipulation" can help. The tool "Change Case of selected columns" can change the case for you. Click on the pencil icon after running the tool to reassign the datatype correctly as needed.

Take care,
Jen
Galaxy team

On 1/13/14 6:31 PM, Yang Bi wrote:
Hi Jen:
Thank you for the prompt reply. The FPKMs produced by Cufflinks look normal (from an assembled transcript file):
Seqname Source Feature Start End Score Strand Frame Attributes
chr1 Cufflinks transcript 11960 13178 1000 . . gene_id "CUFF.180"; transcript_id "CUFF.180.1"; FPKM "6.5441928094"; frac "1.000000"; conf_lo "3.594986"; conf_hi "8.987465"; cov "2.413218"; full_read_support "yes";
chr1 Cufflinks exon 11960 13178 1000 . . gene_id "CUFF.180"; transcript_id "CUFF.180.1"; exon_number "1"; FPKM "6.5441928094"; frac "1.000000"; conf_lo "3.594986"; conf_hi "8.987465"; cov "2.413218";
chr1 Cufflinks transcript 4536 5314 1000 + . gene_id "CUFF.178"; transcript_id "CUFF.178.1"; FPKM "11.0556332840"; frac "1.000000"; conf_lo "3.645830"; conf_hi "13.216134"; cov "4.076844"; full_read_support "no";
chr1 Cufflinks exon 4536 4605 1000 + . gene_id "CUFF.178"; transcript_id "CUFF.178.1"; exon_number "1"; FPKM "11.0556332840"; frac "1.000000"; conf_lo "3.645830"; conf_hi "13.216134"; cov "4.076844";
chr1 Cufflinks exon 4706 5095 1000 + . gene_id "CUFF.178"; transcript_id "CUFF.178.1"; exon_number "2"; FPKM "11.0556332840"; frac "1.000000"; conf_lo "3.645830"; conf_hi "13.216134"; cov "4.076844";
chr1 Cufflinks exon 5174 5314 1000 + . gene_id "CUFF.178"; transcript_id "CUFF.178.1"; exon_number "3"; FPKM "11.0556332840"; frac "1.000000"; conf_lo "3.645830"; conf_hi "13.216134"; cov "4.076844";
I checked the chromosome names and realized that the BAM outputs use lowercase for "RNAME", e.g. "chr1", while my gff3 file uses an initial capital letter for "seqId", e.g. "Chr1". Could this be the problem? What is the fastest way to convert the capital C in my gff3 file to lowercase?
Thank you very much,
Yang
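
Outside Galaxy, one possible way to lowercase only the initial letter of the sequence ID is a short script. The following is a minimal sketch, assuming a tab-delimited gff3 file with the sequence ID in column 1; the function names and the file names in the usage note are hypothetical. It leaves comment and directive lines (those starting with "#") untouched, so a "##sequence-region" directive, if present, would still need to be edited separately.

```python
def lowercase_seqid_initial(line: str) -> str:
    """Lowercase the first character of column 1 on a GFF3 feature line."""
    # Leave blank lines, comments, and directives (e.g. "##gff-version 3") as-is.
    if not line.strip() or line.startswith("#"):
        return line
    # Split off column 1 only; the rest of the line is passed through unchanged.
    seqid, sep, rest = line.partition("\t")
    return seqid[:1].lower() + seqid[1:] + sep + rest


def convert_file(src_path: str, dst_path: str) -> None:
    """Write a copy of src_path with the initial letter of each seqid lowercased."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(lowercase_seqid_initial(line))
```

For example, convert_file("genes.gff3", "genes.lowercase.gff3") would turn "Chr1", "ChrC", and "ChrM" in column 1 into "chr1", "chrC", and "chrM", while the attributes column is not modified.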
--
Jennifer Hillman-Jackson
http://galaxyproject.org