Re: [galaxy-user] stuck

27 Sep 2010

Hello Amy,

For your own SNP file, make the start coordinate "0-based" (the end stays 
"1-based") to be consistent with the BED (aka interval) file format used 
by UCSC and Galaxy.

This means that if a SNP is located at base "36" on chr1, then the line 
in your file (chromosome, start, end) would be:

chr1 35 36
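
If you would rather convert the positions in bulk before uploading, the 
rule above can be sketched in a few lines of Python. This is only an 
illustration: the function names are made up for this example, and it 
assumes each SNP is a single base given as a 1-based position.

```python
def snp_to_bed(chrom, pos_1based):
    """Turn a 1-based single-base position into a BED-style interval.

    BED uses a 0-based start and an end that is not included in the
    interval, so base 36 becomes start=35, end=36.
    """
    if pos_1based < 1:
        raise ValueError("1-based positions start at 1")
    return (chrom, pos_1based - 1, pos_1based)


def bed_line(chrom, pos_1based):
    """Format the interval as one tab-separated line."""
    chrom, start, end = snp_to_bed(chrom, pos_1based)
    return f"{chrom}\t{start}\t{end}"


# The example from this message: a SNP at base 36 on chr1.
print(bed_line("chr1", 36))
```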

Also double-check that the chromosome names are exactly the same in both 
files (for example "chr1", not "Chr1" or "1"; capitalization matters). 
Together, these changes should fix the joining problems.
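
One quick way to spot a naming mismatch is to list the chromosome names 
that appear in your file but not in the UCSC file. The sketch below is a 
hypothetical helper (not a Galaxy tool) and assumes you have already 
pulled the first column out of each file into a list.

```python
def chrom_names_missing(my_chroms, reference_chroms):
    """Return names used in my_chroms that never occur in reference_chroms.

    The comparison is exact, so "Chr1" and "1" do not match "chr1".
    """
    reference = set(reference_chroms)
    return sorted({c for c in my_chroms if c not in reference})


# Made-up example values: only "chr1" matches the reference naming.
print(chrom_names_missing(["chr1", "Chr1", "1"], ["chr1", "chr2"]))
```

Any name this prints would need to be renamed before a join can find it.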

A "join" is probably what you want to do if you have the entire UCSC SNP 
file.
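
To show what the labeling amounts to once the coordinates agree, here is a 
rough sketch outside of Galaxy. It is a simplification: it matches on 
exactly identical chromosome, start and end, which is not necessarily how 
the Galaxy tool decides on a match. The function name, the "novel" label 
and the SNP name in the sample data are all placeholders invented for 
this example.

```python
def label_known_snps(my_rows, known_snps, default_label="novel"):
    """Label each (chrom, start, end) row with a known SNP name, if any.

    my_rows:    iterable of (chrom, start, end)
    known_snps: iterable of (chrom, start, end, name)
    Rows with no exact coordinate match get default_label.
    """
    lookup = {}
    for chrom, start, end, name in known_snps:
        # Keep the first name seen for a given coordinate.
        lookup.setdefault((chrom, start, end), name)
    return [
        (chrom, start, end, lookup.get((chrom, start, end), default_label))
        for chrom, start, end in my_rows
    ]


# Placeholder data, both in 0-based start form.
my_rows = [("chr1", 35, 36), ("chr1", 99, 100)]
known = [("chr1", 35, 36, "rs_example_1")]
for row in label_known_snps(my_rows, known):
    print("\t".join(str(field) for field in row))
```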

The "profile annotation" function would pull out UCSC's SNP information 
directly plus other features and may also be interesting to test out.

More help is at:
http://bitbucket.org/galaxy/galaxy-central/wiki/GopsDesc

Please let us know if this does not help,

Jen
Galaxy Team

On 9/26/10 11:21 AM, Hsu, Amy (NIH/NIAID) [E] wrote:
...
I'm sure this is a silly question, but I have been stuck since yesterday.
I have some whole genome sequencing data - What was given to me is an enormous XL file with all the single base changes identified in 4 people.
I don't want to track down all the changes, yet I've noticed some of them labeled as "novel" are in fact SNPs - particularly if I use build 130.
I imported just chr, start, end using same base number for start/end from part of my file (chr 1) and then pulled down all the SNPs from UCSC for Chr. 1.
What I would like to do is label the lines in my file that are snps.  I have tried intersect, join, subtract, all to no avail.
What am I doing wrong? Any help would be appreciated.
thanks -
Amy Hsu
-- 
Jennifer Jackson
http://usegalaxy.org