Questions about column output in Operate on genomic Intervals -> Profile Annotations
Hello, I tried using this tool today after inputting a BED file containing 1509 intervals of 100 bp each, spread across all 22 autosomes.
First of all, despite the fact that my input file contained intervals for 22 chromosomes, the value of "allCoverage" seemed to be the same as the value of the coverage of that table only for chr1. I was not really sure about the tableRegionCoverage column, as for most of the autosomes I had input data spread throughout the chromosome with points a few Mb away from either end, but I was getting a value in this column only about 1/3 of what I get when downloading the data directly from UCSC and summing the interval sizes.
There were also many cases where nrCoverage > allCoverage, even when I reduced each input genomic interval to only 1 bp to avoid redundancy in the input file. Based on the descriptions of the columns I would expect allCoverage >= nrCoverage at all times.
Just wondering if you could clarify what these columns are supposed to mean or how to reconcile these apparent inconsistencies.
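For reference, the expectation that allCoverage >= nrCoverage can be illustrated with a minimal sketch. It assumes that "all" coverage sums the length of every overlapping interval (so bases shared by several intervals are counted more than once) and that "nr" coverage counts each covered base only once. This is an illustration of that reading of the column descriptions, not the code Profile Annotations actually runs; the function names and the example intervals are hypothetical.

```python
from collections import defaultdict


def total_coverage(intervals):
    """Sum of interval lengths; a base shared by k intervals is counted k times."""
    return sum(end - start for _chrom, start, end in intervals)


def nonredundant_coverage(intervals):
    """Number of distinct bases covered, after merging overlaps per chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    covered = 0
    for spans in by_chrom.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                # Overlapping or adjacent: extend the current merged span.
                cur_end = max(cur_end, end)
            else:
                # Gap: close the current merged span and start a new one.
                covered += cur_end - cur_start
                cur_start, cur_end = start, end
        covered += cur_end - cur_start
    return covered


# Hypothetical BED-style intervals (0-based start, end-exclusive).
example = [
    ("chr1", 100, 200),
    ("chr1", 150, 250),  # overlaps the previous interval by 50 bp
    ("chr2", 100, 200),
]

print(total_coverage(example), nonredundant_coverage(example))  # prints "300 250"
```

Under these assumptions the redundant total can never be smaller than the non-redundant one, so a result with nrCoverage > allCoverage points to a problem in how the values were computed rather than to a property of the input.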
Hi James,
Full column descriptions are at the bottom of the Profile Annotations tool form.
Are you working on the public Main Galaxy instance, or can you duplicate this on Main https://main.g2.bx.psu.edu/ (usegalaxy.org)? It would be helpful if you could share a history link and point to the dataset(s) with these values. At first pass they do seem off, but we can look into why. Please leave all inputs and outputs undeleted in the history when you email back the share link. You can email me directly to keep your data private.
How to share a history: http://wiki.galaxyproject.org/Support#Shared_and_Published_data
Thanks!
Jen
Galaxy team
(In reply to James Wagner's message of 8/14/13, 8:13 PM.)
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/
-- Jennifer Hillman-Jackson http://galaxyproject.org
Hello Jen and other members,
Here is a history with the interval dataset I uploaded and the results I get when running a Profile Annotations summary. In particular, I am concerned about why in some cases tableChromosomeCoverage < tableRegionCoverage and allCoverage < nrCoverage.
https://main.g2.bx.psu.edu/u/jwag/h/unnamed-history
Thanks so much
(In reply to Jennifer Jackson's message of Thu, Aug 15, 2013, 12:55 AM.)
Hi James,
The problem seems to have come from using the default "all tables" when running the tool. The current assumption is that you will need to re-run every table in this set, not just the ones you flagged as having odd calculations. For now, just run in smaller groups to obtain the correct results. We will be working out how best to address the "all data" query going forward. Running through every human track at UCSC (which includes ENCODE, for certain genomes, such as the one you are using) is a vast amount of data to pull and parse.
Thank you for sharing your history, and sorry for the inconvenience,
Jen
Galaxy team
(In reply to James Wagner's message of 8/14/13, 8:13 PM.)
participants (2)
- James Wagner
- Jennifer Jackson