Jen,
A couple of uninformed questions. I gather from your response that the author lab submitted a multiple track group .wig file instead of a single track group .wig file, and that I need to generate a single track group file before the bigWig conversion will work. So, with regard to the instructions below, I am to run the text manipulation on the original author-submitted .wig file, then run "Filter and Sort -> Select lines that match an expression" on the newly created file, matching the pattern "track". This generates yet another file that has the following info:

88: Select on data 87
1 line, 1 comments
format: wig, database: mm8
Info: Matching pattern: track

track type=wiggle_0 visibility=full name="Smc3_mES" autoScale=on color=100,0,100 1
track visibility=dense name="Smc3_mES enriched regions - 1e-09" color=100,0,100 892178

Is the number 892178 the number of track lines? If so, do I then do the "Remove beginning of a file" using 892178 on the original author .wig file?

Mike
On Apr 16, 2012, at 10:35 AM, Jennifer Jackson wrote:
Hi Mike,
I apologize if I wasn't clear, but the 'Select' was to show you how to identify the multi-track group wig files. I wanted to give you a way to screen similar files going forward.
The wig-to-bigWig program in Galaxy comes from UCSC. It accepts .wig files with a single track group as input: http://genome.ucsc.edu/goldenPath/help/bigWig.html (see step #1)
The data author lab can either submit the data as single track group .wig files, or, if you are confident that the multiple track group .wig format is expected and OK from this source, split the file. There are no specific tools in Galaxy to do this, but something like this would work:
- Text Manipulation -> "Add column", "1", Iterate? = yes
- "Select", "track"
- note the line number of track lines
- "Remove beginning of a file", using the line numbers and the -original- .wig file, to break it up into individual .wig files.
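Outside Galaxy, the same split can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a Galaxy or UCSC tool: the function name split_wig and the output naming scheme (prefix_1.wig, prefix_2.wig, ...) are made up here, and the sketch assumes that every track group begins with a line starting with "track". Any lines before the first track line are written to the first output file.

```python
def split_wig(in_path, prefix):
    """Split a multi-track .wig file into one file per track group.

    A new output file is started at every line that begins with "track".
    Returns the list of output paths, in the order they were written.
    Hypothetical helper: the name and the file naming scheme are assumptions.
    """
    paths = []
    out = None
    with open(in_path) as src:
        for line in src:
            # Start a new output file at each track line (or at the very
            # first line, if the file does not begin with a track line).
            if line.startswith("track") or out is None:
                if out is not None:
                    out.close()
                path = f"{prefix}_{len(paths) + 1}.wig"
                paths.append(path)
                out = open(path, "w")
            out.write(line)
    if out is not None:
        out.close()
    return paths


# Example call (file names are placeholders):
# parts = split_wig("author_submitted.wig", "single_track")
```

Each resulting file holds a single track group and could then be uploaded and tried with Wig-to-bigWig separately. Whether every group is valid wiggle data would still need checking, since a track line without type=wiggle_0 may describe a different kind of track.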
Good luck!
Jen
Galaxy team
On 4/16/12 6:57 AM, Michael Sikes wrote:
Jennifer,
Thanks for your help. I ran the Filter and Sort tool as advised, and then ran Wig-to-bigWig on the new history item generated by the filter. This time I got a different error:

84: Wig-to-bigWig on data 83
0 bytes
An error occurred running this job: stdin is empty of data Error running wigToBigWig.

83: Select on data 49
1 line, 1 comments
format: wig, database: mm8
Info: Matching pattern: track
Again, I'm sure I left off something obvious. Could you tell me what I did wrong?
Thanks, Mike
On Apr 13, 2012, at 1:27 PM, Jennifer Jackson wrote:
Hi Michael,
This particular .wig file has a data format problem that is the root cause of the conversion error. Specifically, there is an extra track line in the file. This can be found using unix tools with a grep or in Galaxy with the tool "Filter and Sort -> Select" by matching the pattern "track".
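For screening files outside Galaxy, the grep-style check can also be sketched in Python. This is a minimal sketch rather than an official tool: the function name find_track_lines and the example lines are made up for illustration. It reports only lines that begin with "track", which is narrower than matching the pattern anywhere in a line.

```python
def find_track_lines(lines):
    """Return (line_number, text) for every line that starts with "track".

    Line numbers are 1-based, matching what a text editor or grep -n shows.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("track"):
            hits.append((number, line.rstrip("\n")))
    return hits


# Small made-up example with two track lines.
example = [
    'track type=wiggle_0 name="a"\n',
    "fixedStep chrom=chr1 start=1 step=1\n",
    "0.5\n",
    'track visibility=dense name="b"\n',
    "chr1\t10\t20\n",
]
hits = find_track_lines(example)

# For a real file (placeholder name), pass the open file handle instead:
# with open("author_submitted.wig") as handle:
#     hits = find_track_lines(handle)
```

More than one hit suggests a multi-track group file that Wig-to-bigWig is unlikely to accept as is.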
Ideally this would be corrected and resubmitted by the data author before use, since how/why this was inserted and what impact it has would need to be examined.
Since you noticed problems with other GEO files (conversion problems), verifying the .wig format and making any necessary corrections would also be advised.
Hopefully this helps!
Best,
Jen
Galaxy team
Michael Sikes, Ph.D.
Associate Professor of Immunology
North Carolina State University
Microbiology Department
4524A Gardner Hall
Campus Box 7615
Raleigh, NC 27695
Ph: 919-513-0528
Fax: 919-515-7867
email: mlsikes@ncsu.edu