retaining or reclaiming bed ids

24 Dec 2013

Hi!

I provided a name (field 4) for each feature in a UCSC BED file ( http://www.genome.ucsc.edu/FAQ/FAQformat.html#format1 ) and then sought RefSeq names using the UCSC Table Browser ( http://www.genome.ucsc.edu/cgi-bin/hgTables ). I would now like to recover which line of the BED file produced which line of the output file. However, I am told I need a Galaxy workflow to do this. Can anyone explain how? For example, one line of my BED file looks like:
chr2	2723752	2723777	seqid6354405	0	-
and one line of my intersected Table Browser output looks like:
chr1	176432306	176811970	NM_020318	0	+	176525458	176811590	0	23	248,1835,1072,146,294,193,122,490,129,92,194,147,136,217,172,178,214,169,136,110,72,99,455,	0,92236,131353,207799,226966,228955,232567,235929,239436,243188,246812,248664,276455,276809,302495,306436,307796,326638,328176,330389,336890,377002,379209,

Clearly the first line of my BED file doesn't correspond to the first line of my intersection output. As my BED file is long, what reference can I use to unambiguously identify which line of the BED file each line of the intersection output corresponds to? How do I do this in Galaxy?
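
To make the goal concrete, here is a minimal Python sketch of the mapping I am after, done outside Galaxy. It assumes both files are tab-separated BED-style text with the name in field 4, and it pairs records purely by coordinate overlap on the same chromosome (strand is ignored, which may or may not be appropriate). The function names and the `seqid_example` line are hypothetical and for illustration only; the other two lines are taken from above, with the RefSeq line cut to its first six fields.

```python
from collections import defaultdict


def parse_bed(lines):
    """Yield (chrom, start, end, name) from BED lines, skipping blanks and headers."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        yield fields[0], int(fields[1]), int(fields[2]), fields[3]


def map_ids(query_lines, refseq_lines):
    """Return {query name: sorted RefSeq names whose interval overlaps the query}."""
    by_chrom = defaultdict(list)
    for chrom, start, end, name in parse_bed(refseq_lines):
        by_chrom[chrom].append((start, end, name))

    result = {}
    for chrom, start, end, name in parse_bed(query_lines):
        # BED intervals are 0-based and half-open, so two intervals overlap
        # when each one starts before the other ends.
        hits = {r for s, e, r in by_chrom.get(chrom, []) if start < e and s < end}
        result[name] = sorted(hits)
    return result


# Hypothetical query line placed inside the NM_020318 interval, plus the
# BED line quoted above (which has no partner in this tiny example).
query = [
    "chr1\t176525500\t176525525\tseqid_example\t0\t+",
    "chr2\t2723752\t2723777\tseqid6354405\t0\t-",
]
refseq = ["chr1\t176432306\t176811970\tNM_020318\t0\t+"]

mapping = map_ids(query, refseq)
for bed_name, refseq_names in mapping.items():
    print(bed_name, ",".join(refseq_names) or "-")
```

This scans every RefSeq record on a chromosome for each query line, which is fine for a sketch but would be slow for a very long BED file. What I would like is the equivalent of this join as a Galaxy workflow.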

PS - I tried this workflow earlier today without success, aiming to achieve a similar objective: https://usegalaxy.org/u/james/w/workflow-from-ucsc-genes-and-symbols

PPS - I also note that similar issues were raised in this discussion, with Galaxy promoted as the solution, but with no real details about how to achieve the desired results:
http://redmine.soe.ucsc.edu/forum/index.php?t=msg&goto=10615&S=0d1b303e6dfdceaf3b240804fd0f52aa

Bert Gold, Ph.D., FACMG
Staff Scientist
NCI-Frederick
Frederick, MD 21702
VOICE: 301-846-5098
EMAIL: golda@mail.nih.gov
