Hi Aaron,

I don't, and I am not sure it is there anyway (my quick scan didn't find it, though I may have missed it). This will take some research, unless someone else on the list knows where to look and can send info.

I checked "Get Data -> GrameneMart Central server", since I saw some documentation at one of these sites yesterday (maybe the last link I sent, with all the rice resources listed) noting that rice annotation data was now available there. But when I just checked the builds, it wasn't this strain. "BioMart Central server" also has the other strain.

I would look around at IRGSP or the other resources in that last link. I did see that Ensembl (the next-to-last link before) had many files for download, including one where they mapped annotations (SNPs, I think) from the IRGSP build to the MSU build, which means they had the IRGSP file to start from. So even if it is not in the download, they may have the sources documented somewhere (which would tell you where to fetch it, and potentially gene annotations as well), or you could write to them (or to the IRGSP project) directly to find out what they recommend. You could also search seqanswers.com to see if anyone else has looked for or found the annotation data.

Too bad iGenomes did not pick this up, since if you are using RNA-seq tools, the extra attributes will maximize the Cuffdiff results you can produce. It couldn't hurt to ask them to add it to the set; I'm sure they get requests all the time. I know the MSU build is popular, but both strains are important, and IRGSP is public without any restrictions, which has certain advantages of course. Galaxy only includes the IRGSP build as a native index for this reason. But you could use the MSU build as a custom reference genome if you really wanted to, provided you meet the usage criteria:
http://wiki.galaxyproject.org/Support#Custom_reference_genome

Wish I could just find this for you! But I think this will take a bit of digging, and then you'll have to make a call about the reliability of the source, resolve format issues (if any), that sort of thing. So investigating this on your own is the better path in the absence of a definitive go-to source. I've tried to share all the leads I could find. If you do find a good file and it works well, please consider sharing it on the Public server (with the source well documented).
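
On format issues, one thing worth checking early is whether the sequence names in column 1 of the annotation file match the sequence names in the reference genome exactly (for example "chr1" versus "Chr1"), since a mismatch is a common cause of empty results. Below is a minimal Python sketch of that check. The function names and the tiny in-memory data are made up for illustration; with real files you would pass open file handles instead.

```python
import io

def fasta_names(handle):
    """Collect sequence names (first word after '>') from a FASTA file."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names

def gff_seqids(handle):
    """Collect the values found in column 1 of a GFF/GTF file."""
    seqids = set()
    for line in handle:
        if line.startswith("##FASTA"):
            break  # optional embedded sequence section, not annotation rows
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        seqids.add(line.split("\t", 1)[0])
    return seqids

def unmatched_seqids(fasta_handle, gff_handle):
    """Return annotation sequence names that are absent from the genome."""
    return sorted(gff_seqids(gff_handle) - fasta_names(fasta_handle))

# Made-up example data, standing in for a real genome and annotation file.
fasta = io.StringIO(">chr1 example header\nACGT\n>chr2\nGGCC\n")
gff = io.StringIO(
    "##gff-version 3\n"
    "chr1\tsrc\tgene\t1\t4\t.\t+\t.\tID=g1\n"
    "Chr3\tsrc\tgene\t1\t4\t.\t-\t.\tID=g2\n"
)
print(unmatched_seqids(fasta, gff))
```

An empty list means every annotated sequence name exists in the genome; anything listed would need renaming (or a different source file) before use.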

Good luck!

Jen
Galaxy team


On 12/14/12 7:33 AM, Aaron P Smith wrote:

Hi Jen,

Do you have a direct link for the gene coordinates file (GFF?) from NCBI for the rice genome?

Thanks,

Aaron

 

______________________________
Aaron Smith, Ph.D.
Assistant Professor
Biological Sciences
Louisiana State University
202 Life Sciences Building
Baton Rouge, LA 70803
apsmith@lsu.edu
http://www.smith.biology.lsu.edu/lab

 

From: Jennifer Jackson [mailto:jen@bx.psu.edu]
Sent: Thursday, December 13, 2012 11:51 AM
To: galaxy-user
Cc: Aaron P Smith
Subject: rice reference genome

 

Hello Aaron,

The genome was obtained from Genbank:
http://www.ncbi.nlm.nih.gov/genome/10
http://www.ncbi.nlm.nih.gov/assembly/313038/

Download links can be found at the International Rice Genome Sequencing Project (IRGSP), but we source from NCBI directly whenever possible.
http://rgp.dna.affrc.go.jp/E/IRGSP/index.html
http://rgp.dna.affrc.go.jp/E/IRGSP/Build4/build4.html

These resources may also be helpful in general:
http://www.gramene.org/resources/
http://plants.ensembl.org/Oryza_sativa/Info/Index
http://rice.genomics.org.cn/rice/link/link.jsp

Best,

Jen
Galaxy team

P.S. Going forward, please use the mailing list, galaxy-user@bx.psu.edu, for new questions to help us out with tracking. Thanks!

On 12/13/12 9:11 AM, Aaron P Smith wrote:

Hi Jen,

I’m trying to obtain the coordinates for the rice genome used in Galaxy to compare some sequences from another source.  Can you direct me to the version of the genome used in Galaxy (oryza_sativa_japonica_nipponbare_IRGSP4.0)?

Thanks!

Aaron

 



-- 
Jennifer Jackson
http://galaxyproject.org
