Gareth:
Sorry for the delay. There are two ways of dealing with this. I have attached a link to a screencast that highlights both approaches. In the first, you upload your datasets into Galaxy and simply run the join tool. The second approach is to use a new Galaxy tool called "Annotation profiler". It allows you to compare your set of genomic features against the entire UCSC database in a single pass (at this point it can only be used against hg18 annotations). Try it out and let us know if you have any further questions.
The movie is here: http://screencast.g2.bx.psu.edu/SNPs_TFBS.mov
anton
On Jun 5, 2008, at 11:21 AM, Whiteley, Gareth wrote:
Hi Anton,
That's exactly what I'm trying to do. I'm also trying to intersect the SNPs with the UCSC tracks (CpG islands, T-ScanS miRNA, PicTar miRNA) and any other regulatory sequence data. A step-by-step demo would be wonderful.
Kind Regards,
Gareth
-----Original Message----- From: Anton Nekrutenko [mailto:anton@bx.psu.edu] Sent: 05 June 2008 15:58 To: Whiteley, Gareth Cc: galaxy-user@bx.psu.edu Subject: Re: [galaxy-user] galaxy query
Gareth:
You can do all the intersects within Galaxy. If I understand correctly, you are trying to intersect conserved TFBSs with SNPs, right?
Let me know and I will send you a detailed step-by-step demo.
anton galaxy team
On Jun 5, 2008, at 8:02 AM, Whiteley, Gareth wrote:
Hello,
If I choose the following search criteria (Group: all tracks, track: SNP, region: chr5:7922217-7954237), then click INTERSECT, choose Group: all tracks, track: TFBS conserved, and select GTF output format, the search returns any SNPs that lie in transcription factor binding sites within the genomic region that codes for the MTRR gene. However, the output does not tell me that SNP rs6868871 is found in TFBS V$58_01 914. I have to work that out for myself by searching the whole table again, this time starting from the TFBS track and intersecting with SNPs.
I am trying to use Galaxy to Join two Queries side by side on a specified field, so that I can relate SNPs to the TFBS they are associated with. For example, SNP rs6868871 is found in TFBS V$58_01 914. However, I cannot seem to get Galaxy to work. I think this is because the SNP site is something like ‘165878639 - 165878640’ and the TFBS site is something like ‘165878630 - 165878641’, and although the positions overlap, Galaxy cannot tell that. Is this the case? Or do you know how I can get around it?
Regards,
Gareth Whiteley
Gareth Whiteley
University of Liverpool
Department of Pharmacology and Therapeutics
The Sherrington Buildings
Ashton Street
Liverpool L69 7GE
Tel: 0151 795 4224
E-mail: g.whiteley@liverpool.ac.uk
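The overlap problem described in the message above can be illustrated outside Galaxy with a small sketch. A join on a single field compares values for equality, so a SNP starting at 165878639 never matches a TFBS starting at 165878630, even though the two intervals overlap. The Python below is only an illustration of an interval-overlap join, not Galaxy's implementation. The record names (snp_A, tfbs_A, tfbs_B), the chromosome label, the second TFBS interval, and the half-open coordinate convention are all assumptions made for the example; only the two coordinate pairs come from the message.

```python
from typing import List, NamedTuple, Tuple


class Feature(NamedTuple):
    chrom: str
    start: int  # assumed 0-based, half-open: start is included
    end: int    # end is excluded
    name: str


def overlaps(a: Feature, b: Feature) -> bool:
    """True if two features on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def overlap_join(snps: List[Feature], sites: List[Feature]) -> List[Tuple[str, str]]:
    """Pair every SNP with every site it overlaps (simple nested loop)."""
    return [(s.name, t.name) for s in snps for t in sites if overlaps(s, t)]


# Hypothetical records; "chrN" is a placeholder chromosome label.
snps = [Feature("chrN", 165878639, 165878640, "snp_A")]
sites = [
    Feature("chrN", 165878630, 165878641, "tfbs_A"),  # contains the SNP
    Feature("chrN", 165879000, 165879010, "tfbs_B"),  # made-up, no overlap
]

# An equality join on the start field finds nothing...
exact = [(s.name, t.name) for s in snps for t in sites if s.start == t.start]
# ...while the overlap join reports the pair.
pairs = overlap_join(snps, sites)
print(exact)   # []
print(pairs)   # [('snp_A', 'tfbs_A')]
```

The nested loop is quadratic and only suitable for small inputs; for whole-genome tracks, sorting both lists by position or using an interval index would be the usual choice.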
_______________________________________________ galaxy-user mailing list galaxy-user@bx.psu.edu http://mail.bx.psu.edu/cgi-bin/mailman/listinfo/galaxy-user
Anton Nekrutenko Asst. Professor Department of Biochemistry and Molecular Biology Center for Comparative Genomics and Bioinformatics Penn State University anton@bx.psu.edu http://nekrut.bx.psu.edu 814.865.4752