How to find the gene ID corresponding to a CUFF ID
Dear Everyone:
I ran Cufflinks and got an output file containing gene expression information. However, each gene_id in it has a format like CUFF.1151175. Do you have any idea how to find the official gene ID corresponding to a CUFF ID? Thank you very much!
Best
Ying Zhang, M.D., Ph.D. Postdoctoral Associate, Department of Genetics, Yale University School of Medicine, 300 Cedar Street, S320, New Haven, CT 06519 Tel: (203)737-2616 Fax: (203)737-2286
In reply to Ying Zhang (28 Feb 2011, 21:33):
Hi,
You need to supply a gene annotation file to Cufflinks to get the gene ID information easily. Without it, Cufflinks simply does its best to work out which genes are present. The Ensembl GTF file is quite a comprehensive one; there is a link to it on the Cufflinks manual page.
Good luck! David
_______________________________________________
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/
In reply to David Matthews (Feb 28, 2011, 6:10 PM):
Ying, you could also try using the tools 'Fetch closest non-overlapping feature' and 'Intersect' to find genes near the transcripts/genes/TSSes of interest; for both tools, you'll want a reference annotation, either from UCSC or Ensembl.
Best, J.
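To make the overlap idea concrete, here is a minimal Python sketch of the kind of lookup an interval intersection performs. It is not how the Galaxy 'Intersect' tool is implemented. It assumes that both the Cufflinks output and the reference annotation are GTF files with exon records carrying a gene_id attribute, that both use the same chromosome names, and it ignores strand. The function names and the example identifiers (ENSG_EXAMPLE_A and so on) are made up for illustration.

```python
from collections import defaultdict


def parse_attributes(field):
    """Turn a GTF attribute string such as 'gene_id "X"; gene_name "Y";' into a dict."""
    attrs = {}
    for part in field.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def read_features(gtf_lines, feature_type):
    """Yield (seqname, start, end, attributes) for records of one feature type."""
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature_type:
            continue
        yield cols[0], int(cols[3]), int(cols[4]), parse_attributes(cols[8])


def overlapping_genes(cuff_lines, ref_lines):
    """Map each Cufflinks gene_id to the reference gene_ids whose exons it overlaps."""
    ref_by_chrom = defaultdict(list)
    for chrom, start, end, attrs in read_features(ref_lines, "exon"):
        ref_by_chrom[chrom].append((start, end, attrs.get("gene_id")))

    hits = defaultdict(set)
    for chrom, start, end, attrs in read_features(cuff_lines, "exon"):
        for r_start, r_end, r_gene in ref_by_chrom[chrom]:
            # GTF coordinates are 1-based and inclusive on both ends.
            if start <= r_end and r_start <= end:
                hits[attrs.get("gene_id")].add(r_gene)
    return {cuff_id: sorted(genes) for cuff_id, genes in hits.items()}


# Hypothetical two-exon Cufflinks output and a one-exon reference annotation.
cuff = [
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "CUFF.1"; transcript_id "CUFF.1.1";\n',
    'chr1\tCufflinks\texon\t5000\t5100\t.\t+\t.\tgene_id "CUFF.2"; transcript_id "CUFF.2.1";\n',
]
ref = [
    'chr1\tensembl\texon\t150\t300\t.\t+\t.\tgene_id "ENSG_EXAMPLE_A"; transcript_id "ENST_EXAMPLE_A";\n',
]
print(overlapping_genes(cuff, ref))  # {'CUFF.1': ['ENSG_EXAMPLE_A']}
```

CUFF IDs with no overlapping reference exon (CUFF.2 here) simply do not appear in the result. The nested loop compares every pair of exons on a chromosome, so it is only suitable for small inputs; for whole-genome files the Galaxy tools are the practical route.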
In reply to Jeremy Goecks (1 Mar 2011, 04:51):
Hi,
Yeah, that's a good idea too! I did not know about that tool, which shows what I know (!). Thanks for the info!
Cheers, David
In reply to David Matthews:
Dear David:
I followed your advice and downloaded the reference file from Ensembl, uploaded it to Galaxy, and then ran Cufflinks using the file as a reference annotation. However, the job failed with the following error message:
    An error occurred running this job: cufflinks v0.9.3
    cufflinks -I 300000 -F 0.050000 -j 0.050000 -p 8 -Q 0 -G /galaxy/main_database/files/002/122/dataset_2122219.dat -r /galaxy/data/hg19/sam_index/hg19.fa
    Error running cufflinks.
    [11:47:14] Loading reference and sequence.
    GFF warning: mergi
Do you have any idea what is going wrong here?
Best
Ying
In reply to Ying Zhang (Tuesday, March 1, 2011, 10:59 AM):
I believe you need to reformat the Ensembl file: its chromosome column is not in the correct format.
Vasu
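One way to check for this kind of mismatch is to compare the names in column 1 of the annotation with the sequence names in the reference FASTA. The following Python sketch does that on lines already read into memory. The function names and the tiny example inputs are illustrative assumptions, not part of Galaxy or Cufflinks.

```python
def gtf_seqnames(gtf_lines):
    """Collect the distinct values of column 1 (seqname) from GTF lines."""
    names = set()
    for line in gtf_lines:
        if line.strip() and not line.startswith("#"):
            names.add(line.split("\t", 1)[0])
    return names


def fasta_seqnames(fasta_lines):
    """Collect sequence names from FASTA header lines ('>name optional description')."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def compare_names(gtf_lines, fasta_lines):
    """Report which sequence names are shared and which appear in only one file."""
    gtf = gtf_seqnames(gtf_lines)
    fasta = fasta_seqnames(fasta_lines)
    return {
        "shared": sorted(gtf & fasta),
        "gtf_only": sorted(gtf - fasta),
        "fasta_only": sorted(fasta - gtf),
    }


# Hypothetical inputs: an annotation naming chromosomes '1' and 'MT',
# and a reference naming them 'chr1' and 'chrM'.
gtf_example = [
    '1\tensembl\texon\t100\t200\t.\t+\t.\tgene_id "ENSG_EXAMPLE_A";\n',
    'MT\tensembl\texon\t10\t50\t.\t+\t.\tgene_id "ENSG_EXAMPLE_B";\n',
]
fasta_example = [">chr1\n", "ACGT\n", ">chrM\n", "ACGT\n"]
print(compare_names(gtf_example, fasta_example))
```

An empty "shared" list, as in this example, means that none of the annotated features can be placed on the reference sequences.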
In reply to vasu punj:
Dear Vasu:
Thank you for your information!
I have checked the reference file and cannot find a specific column that contains chromosome information. Do you mean the first column (seqname)? Do you happen to have a correctly formatted file that I could use as the reference annotation? Thanks a lot! I only have limited experience in computing, so I do not know how to format this file.
Best
Ying
In reply to Ying Zhang (1 Mar 2011, 20:06):
Hi,
Yes, column 1 refers to the chromosome name, and the naming must be the same throughout (i.e. your hg19 reference file must also call the chromosomes 1, 2, 3 etc.). A simpler solution than reformatting the annotation is to use a copy of hg19 that lists the chromosomes as 1, 2, 3 etc. instead of Chr1, Chr2 etc.
Unfortunately I'm only in intermittent contact with the web; I might be able to help you properly next week when I am back at work. However, I've just publicly shared a history called "Bristol hg19..." containing an hg19 file, a female hg19 (missing chromosome Y) and an Ensembl GTF file that all work together (i.e. all use the same names for the chromosomes!). Just look under "shared data". You will probably need to repeat your TopHat alignments using your reads and these files together.
Good luck! David
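For readers who would rather rename the chromosomes in the annotation than switch reference genomes, a rough Python sketch of rewriting column 1 follows. It assumes the annotation uses Ensembl-style names (1, 2, ..., X, Y, MT) and that the reference uses names of the form chr1, ..., chrX, chrY, chrM; check the names in your own files before relying on this mapping. Names outside that set (for example unplaced scaffolds) are left unchanged, and all function names are hypothetical.

```python
# Primary human chromosomes in Ensembl-style naming (assumption: human data).
PRIMARY = {str(n) for n in range(1, 23)} | {"X", "Y"}


def to_chr_style(name):
    """Map an Ensembl-style chromosome name to a 'chr'-prefixed one."""
    if name == "MT":
        return "chrM"  # assumed name of the mitochondrial sequence in the reference
    if name in PRIMARY:
        return "chr" + name
    return name  # scaffolds and anything unrecognised are left untouched


def rename_gtf_line(line):
    """Rewrite column 1 (seqname) of one GTF line; pass comments and blanks through."""
    if not line.strip() or line.startswith("#"):
        return line
    seqname, sep, rest = line.partition("\t")
    return to_chr_style(seqname) + sep + rest


def rename_gtf(lines):
    """Apply the renaming to every line of a GTF file."""
    return [rename_gtf_line(line) for line in lines]


example = [
    "#comment line\n",
    '1\tensembl\texon\t100\t200\t.\t+\t.\tgene_id "ENSG_EXAMPLE_A";\n',
    'MT\tensembl\texon\t10\t50\t.\t+\t.\tgene_id "ENSG_EXAMPLE_B";\n',
]
for out_line in rename_gtf(example):
    print(out_line, end="")
```

Whichever naming is chosen, the alignments have to use it as well, which is why David suggests repeating the TopHat step against a matching reference.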
participants (4)
- David Matthews
- Jeremy Goecks
- vasu punj
- Ying Zhang