Hi, I've searched the archives and cannot find my situation. I have an Illumina MiSeq dataset with ~8 million reads that was aligned to the reference genome on the MiSeq machine. (The machine uses BWA to allow overlapping reads, unlike Casava.) The alignment generated dataset.bam and dataset.bam.bai files in the same directory.

I used Get Data to upload both files. The problems, possibly related, are that Galaxy 1) would not recognize the .bam file as a BAM file and 2) would not recognize the dataset.bam.bai file as the BAM file metadata.

After uploading the BAM file, which was originally identified as a text file, I manually used the pen option to tell Galaxy that the data was a .bam file. I then uploaded the .bam.bai file. The data would not display at the UCSC Genome Browser or in Trackster, even though those options were displayed for the .bam file. After uploading, I also used the pen editor to change the name from dataset.bam.bai to dataset.bai.

None of these options worked for Trackster, the UCSC Genome Browser, IGV, or IGB. Interestingly, IGV would display the data if it was loaded directly from my hard drive, but not through Galaxy.

Any advice would be greatly appreciated.

Ann

CONFIDENTIALITY NOTICE: This electronic message is intended to be for the use only of the named recipient, and may contain information that is confidential or privileged. If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution or use of the contents of this message is strictly prohibited. If you have received this message in error or are not the named recipient, please notify us immediately by contacting the sender at the electronic mail address noted above, and delete and destroy all copies of this message. Thank you.
Hi Ann,

When loading a BAM dataset, there is no need to load the index, just the BAM file itself. Galaxy will generate the .bai and create a composite dataset.

For the detection problem, my guess is that the file was not named with a ".bam" extension, which makes detection unreliable. The other problems likely follow from this, and the .bai problems are expected: separate .bai datasets are not supported.

If the file was named with .bam, or appears to be, this would be a very strange case. Maybe check that the file doesn't have a hidden extension on the server you are loading from?

If the .bam extension was not used, please rename the file to use it, load just the BAM file, and see if that works, switching to FTP if you are not already using it. An example of how to do this is in this screencast (the 4th example, FTP, loads a BAM file):

http://wiki.galaxyproject.org/Learn/Screencasts -> Get Data: Upload File
http://screencast.g2.bx.psu.edu/usinggalaxy_upload/flow.html

Hopefully this resolves the problem,

Jen
Galaxy team

On 7/3/13 11:04 PM, Ann Holtz-Morris, M.S. wrote:
Hi, I've searched the archives and cannot find my situation. I have an Illumina MiSeq dataset with ~8 million reads that was aligned to the reference genome on the MiSeq machine. (The machine uses BWA to allow overlapping reads, unlike Casava.) The alignment generated dataset.bam and dataset.bam.bai files in the same directory.
I used Get Data to upload both files. The problems, possibly related, are that Galaxy 1) would not recognize the .bam file as a BAM file and 2) would not recognize the dataset.bam.bai file as the BAM file metadata.
After uploading the BAM file, which was originally identified as a text file, I manually used the pen option to tell Galaxy that the data was a .bam file. I then uploaded the .bam.bai file. The data would not display at the UCSC Genome Browser or in Trackster, even though those options were displayed for the .bam file. After uploading, I also used the pen editor to change the name from dataset.bam.bai to dataset.bai.
None of these options worked for Trackster, the UCSC Genome Browser, IGV, or IGB. Interestingly, IGV would display the data if it was loaded directly from my hard drive, but not through Galaxy.
Any advice would be greatly appreciated. Ann
-- Jennifer Hillman-Jackson Galaxy Support and Training http://galaxyproject.org
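For readers who want to rule out a naming or file-type problem locally before uploading, the following is a minimal Python sketch. It is not part of Galaxy and does not reproduce Galaxy's own detection logic. It relies only on the fact that a BAM file is gzip-compatible (BGZF) and that its decompressed content begins with the bytes `BAM\x01`. The function names `looks_like_bam` and `check_before_upload` are hypothetical, chosen for illustration.

```python
import gzip
import zlib
from pathlib import Path

# First four bytes of a decompressed BAM stream.
BAM_MAGIC = b"BAM\x01"


def looks_like_bam(path):
    """Return True if the file decompresses as gzip/BGZF and starts with the BAM magic."""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == BAM_MAGIC
    except (OSError, EOFError, zlib.error):
        # Not gzip-compressed (plain text, SAM, a .bai index) or truncated/corrupt.
        return False


def check_before_upload(path):
    """Hypothetical helper: list possible problems with a file meant to be uploaded as BAM."""
    path = Path(path)
    problems = []
    if path.suffix.lower() != ".bam":
        problems.append(f"{path.name}: name does not end in .bam")
    if not looks_like_bam(path):
        problems.append(f"{path.name}: content does not look like BAM")
    return problems
```

Calling `check_before_upload("dataset.bam")` should return an empty list for a correctly named, intact BAM file. A file such as dataset.bam.bai would be flagged on both counts, which is consistent with the advice above to upload only the BAM file itself.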
participants (2)
- Ann Holtz-Morris, M.S.
- Jennifer Jackson