Hello Amy,

For your own SNP file, create it so that the start is "0-based" to be consistent with the BED (aka interval) file format used by UCSC and Galaxy. This means that if a SNP is located at base "36" on chr1, then your file would be:

chr1 35 36

Double check that the chromosome naming format is exactly the same (capitalization matters) and this should fix the joining problems.

A "join" is probably what you want to do if you have the entire UCSC SNP file. The "profile annotation" function would pull out UCSC's SNP information directly plus other features and may also be interesting to test out.

More help is at:
http://bitbucket.org/galaxy/galaxy-central/wiki/GopsDesc

Please let us know if this does not help,

Jen
Galaxy Team

On 9/26/10 11:21 AM, Hsu, Amy (NIH/NIAID) [E] wrote:
I'm sure this is a silly question, but I have been stuck since yesterday. I have some whole genome sequencing data - What was given to me is an enormous XL file with all the single base changes identified in 4 people.
I don't want to track down all the changes, yet I've noticed some of them labeled as "novel" are in fact SNPs - particularly if I use build 130.
I imported just chr, start, end using same base number for start/end from part of my file (chr 1) and then pulled down all the SNPs from UCSC for Chr. 1.
What I would like to do is label the lines in my file that are SNPs. I have tried intersect, join, and subtract, all to no avail.
What am I doing wrong? Any help would be appreciated.
thanks -
Amy Hsu
-- Jennifer Jackson http://usegalaxy.org
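
For anyone who wants to check the coordinate conversion outside Galaxy, here is a minimal Python sketch of the 0-based start described in the reply. The function names (`to_bed_interval`, `label_known_snps`) and the sample data are hypothetical, made up for illustration. They are not part of Galaxy or the UCSC tools, and the matching is a plain exact comparison rather than Galaxy's own join.

```python
def to_bed_interval(chrom, pos_1based):
    """Turn a 1-based SNP position into a BED-style (chrom, start, end).

    The start is 0-based and the end is the original 1-based position,
    so base 36 on chr1 becomes ("chr1", 35, 36).
    """
    if pos_1based < 1:
        raise ValueError("1-based positions start at 1")
    return (chrom, pos_1based - 1, pos_1based)


def label_known_snps(variants, known_snps):
    """Pair each variant interval with True/False for 'is a known SNP'.

    Matching is exact on (chrom, start, end), so chromosome names must
    agree character for character, including capitalization.
    """
    known = set(known_snps)
    return [(interval, interval in known) for interval in variants]


# Hypothetical example data: two variants from a user's own file (1-based)
my_variants = [to_bed_interval("chr1", 36), to_bed_interval("chr1", 100)]

# Hypothetical known-SNP intervals; note the mismatched "Chr1" in the second
ucsc_snps = [("chr1", 35, 36), ("Chr1", 99, 100)]

labeled = label_known_snps(my_variants, ucsc_snps)
for interval, is_snp in labeled:
    print(*interval, "known SNP" if is_snp else "novel", sep="\t")
```

The second variant comes out unmatched only because the name is written "Chr1" instead of "chr1", which is the kind of naming mismatch that makes a join return nothing.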