Dynamic GUI for genome-browser custom tracks (another long email)

3 Nov 2009

Hello,

I'd like to suggest a feature that would make uploading custom tracks from Galaxy to the UCSC genome browser friendlier (IMHO).
I don't have code for it; this is just a suggestion, open for discussion.

The way it works now, there are two options for uploading custom tracks:
1. Click on the "Main" link, which sends the file as is to UCSC's genome browser.
 Track attributes (color, name, visibility, score, etc.) are not sent.
 The track name is "User Track". If another track is uploaded, it overrides the previously uploaded track.

2. Run the "Build Custom Track" tool in the "Graph/Display Data" category.
 This allows setting track options, but it duplicates the existing file,
 and the options are statically stored in the file: they can't be changed,
 and the file can't be further processed (unless one removes the custom track line).

My suggestion has four parts:

Part 1
------
Each viewable file (BED, intervals, PSL, Wiggle, BedGraph and in the future: SAM/BAM, bigWig, bigBed) will have another 'button' called "custom track attributes" (or a shorter, better name).
Clicking on this button will show the user a page in which the user can set the common custom track options.
See a mock-up here:
http://cancan.cshl.edu/labmembers/gordon/custom_tracks_ui/genome_browser_cus...

This page is purely client-side; the code is JavaScript.
The "output" of this page is just one text line - the custom track attributes line.
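
To make the "one text line" output concrete, here is a minimal sketch of the formatting step. It is written in Python only for illustration (the page itself would do this in JavaScript), and the function name format_track_attributes is hypothetical. It assumes that values containing whitespace are wrapped in double quotes, as in the name="Hello World" example used later in this email.

```python
def format_track_attributes(pairs):
    """Join (key, value) pairs into a single custom-track attribute line.

    Hypothetical helper: the name and the quoting rule are assumptions,
    not part of Galaxy or the UCSC genome browser.
    """
    parts = []
    for key, value in pairs:
        text = str(value)
        # Quote values that contain whitespace, e.g. name="Hello World".
        if any(ch.isspace() for ch in text):
            text = '"%s"' % text
        parts.append("%s=%s" % (key, text))
    return " ".join(parts)


line = format_track_attributes(
    [("name", "Hello World"), ("visibility", 4), ("color", "255,0,255")]
)
print(line)
```

A list of pairs is used instead of a dictionary so that the attributes come out in the order the user entered them.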

Part 2
------
A new database table in Galaxy will store the "custom track line" for each eligible dataset.
When the user clicks on the above-mentioned button, the line is read from the table.
When the user clicks "OK" on the above page, the line is saved back to Galaxy.
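
The read/save cycle could look roughly like the sketch below. It uses Python's standard sqlite3 module purely to show the shape of the idea: the table name custom_track_line, its columns, and the helper functions are all hypothetical, and a real implementation would go through Galaxy's own model layer instead.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a database and make sure the (hypothetical) table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS custom_track_line ("
        " dataset_id INTEGER PRIMARY KEY,"
        " line TEXT NOT NULL)"
    )
    return conn


def save_line(conn, dataset_id, line):
    """Store the custom track line for a dataset, replacing any old one."""
    conn.execute(
        "INSERT OR REPLACE INTO custom_track_line (dataset_id, line)"
        " VALUES (?, ?)",
        (dataset_id, line),
    )
    conn.commit()


def load_line(conn, dataset_id):
    """Return the stored line, or an empty string if none was saved."""
    row = conn.execute(
        "SELECT line FROM custom_track_line WHERE dataset_id = ?",
        (dataset_id,),
    ).fetchone()
    return row[0] if row else ""


conn = open_store()
save_line(conn, 42, 'name="Hello World" visibility=4 color=255,0,255')
print(load_line(conn, 42))
```

Keeping one row per dataset means that clicking "OK" a second time simply overwrites the previous line.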

Part 3
------
A new method in the datasets controller will "inject" the dataset's custom line before sending the file's content.

For example, if my dataset is a BED file, and contains:
 chr1 100 3000
 chr2 200 3000

And my custom-track-line for this dataset is:
 name="Hello World" visibility=4 color=255,0,255

The new method will first send the custom-track-line, followed by the actual dataset content:
 track name="Hello World" visibility=4 color=255,0,255
 chr1 100 3000
 chr2 200 3000

This method will require just a couple of tricks:
 adding the "track" keyword at the beginning of the line, and
 adding the "type" keyword if this is not a standard BED file (e.g. for Wiggle/PSL/BedGraph)
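
A minimal sketch of such a method, written as a plain Python generator rather than a real Galaxy controller method. The function name, the "ext" argument, and the extension-to-type table are assumptions made for illustration; only the wiggle_0 and bedGraph type values are taken from the UCSC custom track format, and other formats are left out here.

```python
# Hypothetical mapping from a dataset's file extension to the UCSC "type"
# value. The extension names on the left are assumptions.
TRACK_TYPES = {"wig": "wiggle_0", "bedgraph": "bedGraph"}


def display_with_track_line(attr_line, data_lines, ext="bed"):
    """Yield the track line first, then the dataset content unchanged."""
    header = "track"
    track_type = TRACK_TYPES.get(ext)
    if track_type:
        # Only non-BED formats get an explicit type keyword.
        header += " type=" + track_type
    if attr_line:
        header += " " + attr_line
    yield header + "\n"
    for line in data_lines:
        yield line


bed = ["chr1 100 3000\n", "chr2 200 3000\n"]
attrs = 'name="Hello World" visibility=4 color=255,0,255'
output = "".join(display_with_track_line(attrs, bed))
print(output, end="")
```

Because the function is a generator, the dataset is streamed line by line and never copied, which is the point of not duplicating the file.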

Part 4
------
When the user clicks on the "main" button (to display the track in the genome browser), Galaxy will point the UCSC genome browser at this new method,
instead of at the raw data file.

Outcome
-------
Uploading custom tracks (with more attributes) will be much easier - no need to run an extra tool.
One could then change display options and 'reload' a track (or use the genome browser's reload button).
No need for duplicated data files in Galaxy, and tracks can be further processed, filtered, etc. (if the custom-track-line is copied from one dataset to another like other metadata).

Further improvement
-------------------
When (if?) Galaxy supports large binary files (BAM, bigWig, bigBed), it could take advantage of the genome browser's 'bigDataUrl' feature:

The new method (in part 3) doesn't need to transmit the entire dataset content to the genome browser.
Instead, it can return the custom track line and add the 'bigDataUrl' key, with the URL pointing to the real "display" method of the dataset.
Something like:

 track name="Hello World" type=bigBed visibility=3 color=255,255,0 bigDataUrl=http://main.g2.bx.psu.edu/datasets/f8bca8dcf7f5c1d8/display/?showall=True

Binary tracks would then load almost immediately, and the genome browser would query Galaxy only for the parts of the file that are needed.
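
As a rough sketch, the binary-file variant of the method would return only a single line. The function name below is hypothetical, and the order of the keys is arbitrary; the URL is the one from the example above.

```python
def big_data_track_line(attr_line, track_type, display_url):
    """Build a one-line track definition that points at remote binary data.

    Hypothetical helper: no dataset content is sent, only the track line
    with a bigDataUrl key pointing back at the dataset's display URL.
    """
    parts = ["track", "type=" + track_type]
    if attr_line:
        parts.append(attr_line)
    parts.append("bigDataUrl=" + display_url)
    return " ".join(parts)


track = big_data_track_line(
    'name="Hello World" visibility=3 color=255,255,0',
    "bigBed",
    "http://main.g2.bx.psu.edu/datasets/f8bca8dcf7f5c1d8/display/?showall=True",
)
print(track)
```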

As always, comments are welcome.
  -gordon.

Assaf Gordon
