For your own SNP file, create it so that the start is "0-based" to be
consistent with the BED (aka interval) file format used by UCSC and Galaxy.
This means that if a SNP is located at base "36" on chr1, then your file
would contain the line:
chr1 35 36
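If you would rather script the conversion than edit the spreadsheet by hand, here is a rough Python sketch of the idea. It is not a Galaxy tool, and the function name `to_bed_line` is made up for illustration. It assumes each SNP is a single base given as a 1-based position:

```python
def to_bed_line(chrom, pos_1based):
    """Return a tab-separated BED line for a single-base SNP.

    BED starts are 0-based and ends are exclusive, so a SNP at
    1-based position N becomes the interval [N-1, N).
    """
    if pos_1based < 1:
        raise ValueError("1-based positions start at 1")
    return "\t".join([chrom, str(pos_1based - 1), str(pos_1based)])


# The example from this message: a SNP at base 36 on chr1.
print(to_bed_line("chr1", 36))
```

The start is reduced by one and the end is left alone, so a file that used the same base number for both start and end only needs its start column changed.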
Double check that the chromosome naming format is exactly the same
(capitalization matters) and this should fix the joining problems.
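One quick way to check the naming is to compare the set of chromosome names in the two files. The sketch below is only an illustration: the helper `chrom_names` and the sample rows are hypothetical, and it assumes the chromosome is the first whitespace-separated column:

```python
def chrom_names(lines):
    """Collect the distinct chromosome names (column 1) from BED-like lines."""
    return {line.split()[0] for line in lines if line.strip()}


# Made-up rows standing in for your file and the UCSC download.
mine = ["Chr1 35 36", "Chr1 99 100"]
ucsc = ["chr1 35 36", "chr1 499 500"]

# Names that appear in your file but never in the UCSC file
# will not match anything in a join.
only_in_mine = chrom_names(mine) - chrom_names(ucsc)
print(sorted(only_in_mine))
```

Any name reported here (for example "Chr1" against UCSC's "chr1") would need to be renamed before the join can find matches.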
A "join" is probably what you want to do if you have the entire UCSC SNP
The "profile annotation" function would pull out UCSC's SNP information
directly plus other features and may also be interesting to test out.
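For intuition about what a join on chromosome, start, and end does once both files use the same 0-based coordinates, here is a small Python sketch. It is not how Galaxy's join is implemented; the function `label_snps`, the sample rows, and the placeholder ID "rs_example" are all invented for the example:

```python
def label_snps(my_rows, known_rows):
    """Pair each of my rows with True if the same interval is in known_rows.

    Rows are tuples whose first three fields are (chrom, start, end).
    """
    known = {tuple(row[:3]) for row in known_rows}
    return [(row, tuple(row[:3]) in known) for row in my_rows]


# Made-up data: two of "my" changes and one known SNP.
my_rows = [("chr1", 35, 36), ("chr1", 99, 100)]
known_rows = [("chr1", 35, 36, "rs_example")]

for row, is_snp in label_snps(my_rows, known_rows):
    print(*row, "known" if is_snp else "novel", sep="\t")
```

A row is only labeled as known when all three fields match exactly, which is why an off-by-one start or a differently capitalized chromosome name makes every row look novel.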
More help is at:
Please let us know if this does not help,
On 9/26/10 11:21 AM, Hsu, Amy (NIH/NIAID) [E] wrote:
I'm sure this is a silly question, but I have been stuck since
I have some whole genome sequencing data - What was given to me is an enormous XL file
with all the single base changes identified in 4 people.
I don't want to track down all the changes, yet I've noticed some of them labeled
as "novel" are in fact SNPs - particularly if I use build 130.
I imported just chr, start, end using same base number for start/end from part of my file
(chr 1) and then pulled down all the SNPs from UCSC for Chr. 1.
What I would like to do is label the lines in my file that are snps. I have tried
intersect, join, subtract, all to no avail.
What am I doing wrong? Any help would be appreciated.
galaxy-user mailing list