I have a set of genomic regions, about 8000 intervals, and I'd like to sort them by expression in certain tissues - does anyone know if that is possible? I am not having much luck in UCSC determining which track of data corresponds to a certain tissue. For example - I would like to filter on only those expressed in lymphocytes (which may mean combining several fields - whole blood, T cells, B cells, monocytes). Any insight or suggestions would be appreciated. Thanks - Amy
Hello Amy,

For hg18, the full complement of ENCODE experiments, plus other expression data, is available in the "Expression" track group at UCSC. We advise exploring the contents through the UCSC Browser's Gene Sorter or the track description pages (click on a track name). For detailed help on exactly which tracks/tables to use, we recommend contacting UCSC directly (genome@soe.ucsc.edu). There are many choices, and most will require some subjective stringency filtering. Also, since much of the data is normalized in the schema and will require file merges to create the final reference file, they can help you identify which tables are most important for your particular experimental goals.

Once the tracks and tables are identified, move the data into Galaxy using "Get data -> Bx Main" with the Table browser (a mirror of the UCSC browser that will allow you to load larger files; the UCSC Table browser has a transfer limit of ~100k lines). Merge/format/filter as needed using the tools in "Text manipulation", load your intervals with "Get data -> Upload File", and use the Interval comparison tools to perform the analysis. Save the steps as a workflow so you can run the analysis path multiple times (useful while tuning parameters, for reuse in future experiments, etc.).

We know that UCSC is in the process of mapping ENCODE onto the hg19 genome, but that project is not completed yet. Currently there are only two expression tracks available for hg19, and I don't think they will address your question, but they are worth a look. This is another question you could ask UCSC if hg19 is your ultimate target.

Sorry that we could not specifically help with the data part of your question. Expression data usually requires complex data reduction methods, and the ENCODE experts are the best qualified folks to offer help. But we can certainly offer more help with Galaxy's tools once you have that part settled.
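As a rough illustration of the merge, filter, and interval-comparison steps described above, here is a minimal Python sketch of the same logic outside Galaxy. It is not what the Galaxy tools do internally. The tissue labels, the record layout (chrom, start, end, tissue, score), the `min_score` threshold, and the function names are all assumptions made up for the example; the real track and table names have to come from UCSC. Coordinates are assumed to be 0-based and half-open, as in BED files.

```python
from collections import defaultdict

# Hypothetical tissue labels; real names depend on the UCSC tables chosen.
LYMPHOID_TISSUES = {"whole_blood", "t_cells", "b_cells", "monocytes"}


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def rank_by_expression(regions, expression, tissues, min_score=0.0):
    """Return (region, best_score) pairs sorted by descending score.

    regions:    list of (chrom, start, end)
    expression: list of (chrom, start, end, tissue, score)
    Regions with no overlapping record in the chosen tissues are dropped.
    """
    # Filter step: keep only records from the tissues of interest
    # that pass the (subjective) stringency threshold.
    by_chrom = defaultdict(list)
    for chrom, start, end, tissue, score in expression:
        if tissue in tissues and score >= min_score:
            by_chrom[chrom].append((start, end, score))

    # Interval comparison step: a simple all-against-all scan per chromosome.
    ranked = []
    for chrom, start, end in regions:
        scores = [score for e_start, e_end, score in by_chrom.get(chrom, [])
                  if overlaps(start, end, e_start, e_end)]
        if scores:
            ranked.append(((chrom, start, end), max(scores)))

    # Sort step: highest expression first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```

Calling `rank_by_expression(regions, expression, LYMPHOID_TISSUES)` returns only the regions that overlap a record from one of the listed tissues, ordered by their best score. Taking the maximum score across tissues is just one possible way to combine several fields; a mean or a per-tissue column would be equally reasonable depending on the goal.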
Please let us know if we can assist you further during the analysis phase of your project.

Best!

Jen
Galaxy team

On 10/1/10 8:00 AM, Hsu, Amy (NIH/NIAID) [E] wrote:
I have a set of genomic regions, about 8000 intervals, and I'd like to sort them by expression in certain tissues - does anyone know if that is possible? I am not having much luck in UCSC determining which track of data corresponds to a certain tissue. For example - I would like to filter on only those expressed in lymphocytes (which may mean combining several fields - whole blood, T cells, B cells, monocytes).
Any insight or suggestions would be appreciated.
Thanks -
Amy
_______________________________________________ galaxy-user mailing list galaxy-user@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-user
-- Jennifer Jackson http://usegalaxy.org