On Tue, Mar 15, 2011 at 1:51 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:
Hello Peter,
I've broken this issue into the following 2 parts; here is the status of each.
1. Don't alter the contents of files being uploaded to a data library if using the "upload_directory" or "upload_paths" options in conjunction with the "Link to files without copying into Galaxy" option. This issue has been resolved in change set 5221:b5ecb8f4839d.
2. Determine if a BAM file is sorted before it is introduced into the Galaxy environment so that it will only be sorted if necessary. We have a very simple test for this in the Bam class's _is_coordinate_sorted() method in ~/lib/galaxy/datatypes/binary.py, but this method obviously needs improvement. The improved implementation is a bit non-trivial, but it is high priority, so it should be completed soon. In the meantime, BAM files cannot be uploaded to a data library using the combinations of options described in 1 above if they do not pass the current simple, rigid test in the Bam class's method.
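As a rough illustration of the kind of check described in item 2, the Python sketch below reads the BAM header directly and looks for an SO:coordinate tag on the @HD line. This is not Galaxy's _is_coordinate_sorted() implementation; the function name header_declares_coordinate_sorted is hypothetical, and the check only trusts what the header declares without verifying the order of the alignment records.

```python
import gzip
import struct


def header_declares_coordinate_sorted(bam_path):
    """Return True if the BAM header's @HD line carries SO:coordinate.

    Hypothetical helper for illustration only. It relies on BAM files
    being BGZF-compressed, which Python's gzip module can read.
    """
    with gzip.open(bam_path, "rb") as handle:
        # A BAM file starts with the 4-byte magic string "BAM\1".
        if handle.read(4) != b"BAM\x01":
            raise ValueError("%s does not look like a BAM file" % bam_path)
        # Next is a little-endian int32 giving the header text length.
        (l_text,) = struct.unpack("<i", handle.read(4))
        text = handle.read(l_text).decode("ascii", errors="replace")

    # The header text may be NUL padded, so strip that before parsing.
    for line in text.rstrip("\x00").splitlines():
        if line.startswith("@HD"):
            return "SO:coordinate" in line.split("\t")
    # No @HD line means the sort order is simply not declared.
    return False
```

A file whose header lacks the tag may still be sorted in practice, so a False result here means "not declared" rather than "unsorted".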
Thanks for your message,
Greg Von Kuster
Thanks Greg - I'll have to retest with that update.

I was thinking about this over the weekend, and perhaps you could assume (for the special case of a library import where the file is being linked to) that if the BAI index file already exists then the BAM file should be sorted already, i.e. use both the BAM and BAI files as provided.

Peter
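To make the suggestion above concrete, here is a minimal Python sketch of looking for an existing index next to a linked BAM file. The function name find_existing_bai is hypothetical and nothing here is Galaxy code; it assumes the two common naming conventions, example.bam.bai (as written by samtools index) and example.bai (as written by Picard).

```python
import os


def find_existing_bai(bam_path):
    """Return the path of a BAI index sitting beside bam_path, or None.

    Hypothetical helper for illustration only. The presence of an index
    is taken as a hint that the BAM file is already coordinate sorted.
    """
    # Convention 1: the index name is the BAM name plus ".bai".
    candidates = [bam_path + ".bai"]
    # Convention 2: the ".bam" extension is replaced by ".bai".
    root, ext = os.path.splitext(bam_path)
    if ext.lower() == ".bam":
        candidates.append(root + ".bai")
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None
```

If this returns a path, the importer could link to both files as provided and skip sorting and indexing; if it returns None, it would fall back to whatever sorted check is in place.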