BAM files loading and mandatory grooming
Hello everyone Whenever we try to load BAM files without copying them, we get an error stating the files need grooming, and can't use them at all. Is it this serious? Would there be a way to bypass that? Thanks! L-A
Hello L-A,
Would you be able to help with a bit more detail and testing? #2 sounds like it may be the issue, but without knowing more right now, I'll provide the next troubleshooting steps.
1 - this is in your own install and it is current with the latest -dist or are you using -central (which pull?). Did something change between the last successful BAM load and the new problem?
2 - "without copying" them means maybe that you are using a symbolic link at your own site? This "load" is into a Library (this is how to move data transferred outside of the Galaxy mechanisms into a history, i.e. around a local file system by direct copy or similar). Load into a Library first, then copy to history. Instruction in wiki here: http://wiki.g2.bx.psu.edu/Admin/Data%20Libraries/Uploading%20Library%20Files
3 - or do you mean "without copying" by using the FTP method set up locally, followed by a load into a history?
4 - BAM file metadata (pencil icon -> Edit options) looks OK? Or, this is the problem?
5 - SAMTools is loaded into your instance? Tools function on other BAM files or all? Some simple SAMTools commands (that work with BAM files) function OK from the command line on these (to rule out problem with BAM files themselves - missing .bai index could be a problem).
6 - if you start with the same data and convert SAM->BAM within Galaxy, does the SAM load and is the resulting BAM file OK?
7 - If you load the BAM file at the public Galaxy web site (using FTP), the load is successful and the BAM file once imported into a history appears OK? Please share link from this test in case we need to examine. (Options -> Share or Publish, generate link, email to me and I can share with dev team if needed).
Thanks for providing more info or perhaps you will find that one of these uncovers the issue,
Best,
Jen Galaxy team
On 7/18/11 2:05 AM, Louise-Amélie Schmitt wrote:
Hello everyone
Whenever we try to load BAM files without copying them, we get an error stating the files need grooming, and can't use them at all.
Is it this serious? Would there be a way to bypass that?
Thanks! L-A ___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
-- Jennifer Jackson http://usegalaxy.org/ http://galaxyproject.org/
Hello Jennifer, and thanks for your answer
Would you be able to help with a bit more detail and testing? #2 sounds like it may be the issue, but without knowing more right now, I'll provide the next troubleshooting steps.
Yeah, I realise I was not very clear, sorry about that.
1 - this is in your own install and it is current with the latest -dist or are you using -central (which pull?). Did something change between the last successful BAM load and the new problem?
This is indeed a local install (two of them actually).
$ hg head
changeset: 5585:8c11dd28a3cf
tag: tip
user: Nate Coraor <nate@bx.psu.edu>
date: Thu May 19 10:07:53 2011 -0400
summary: Add Picard and fastqc tools to Main
I can't remember whether I ever could successfully load a BAM file...
2 - "without copying" them means maybe that you are using a symbolic link at your own site? This "load" is into a Library (this is how to move data transferred outside of the Galaxy mechanisms into a history, i.e. around a local file system by direct copy or similar). Load into a Library first, then copy to history. Instruction in wiki here: http://wiki.g2.bx.psu.edu/Admin/Data%20Libraries/Uploading%20Library%20Files
No, it's just that when I add new datasets to a library from filesystem paths I choose the 'Link files without copying into Galaxy' option.
3 - or do you mean "without copying" by using the FTP method set up locally, followed by a load into a history?
4 - BAM file metadata (pencil icon -> Edit options) looks OK? Or, this is the problem?
It looks ok. Well I guess. When I click on the file in the library, in the 'Miscellaneous information' section I have: "The uploaded files need grooming, so change your Copy data into Galaxy? selection to be Copy files into Galaxy instead of Link to files without copying into Galaxy so grooming can be performed."
5 - SAMTools is loaded into your instance? Tools function on other BAM files or all? Some simple SAMTools commands (that work with BAM files) function OK from the command line on these (to rule out problem with BAM files themselves - missing .bai index could be a problem).
Samtools works like a charm. I've been using it through these Galaxy instances for a long time now.
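A quick way to rule out the "missing .bai index" case mentioned in point 5 is to look for an index file next to the BAM. The sketch below is only an illustration: `find_bai_index` is a hypothetical helper, not part of Galaxy or samtools, and it assumes the two common naming conventions (`reads.bam.bai`, as written by `samtools index`, and `reads.bai`, as written by some other tools).

```python
from pathlib import Path


def find_bai_index(bam_path):
    """Return the path of an index file sitting next to the BAM, or None.

    Hypothetical helper: checks the two usual naming conventions only.
    """
    bam = Path(bam_path)
    candidates = [
        bam.with_name(bam.name + ".bai"),  # reads.bam -> reads.bam.bai
        bam.with_suffix(".bai"),           # reads.bam -> reads.bai
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Finding an index file does not prove it is up to date with the BAM; it only rules out the simplest failure.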
6 - if you start with the same data and convert SAM->BAM within Galaxy, does the SAM load and is the resulting BAM file OK?
I can't run anything on them: since they're tagged "error", no tool wants them as an input. If I could, I would just ignore the issue.
7 - If you load the BAM file at the public Galaxy web site (using FTP), the load is successful and the BAM file once imported into a history appears OK? Please share link from this test in case we need to examine. (Options -> Share or Publish, generate link, email to me and I can share with dev team if needed).
If you mean when the data is actually copied into Galaxy, yes, everything works fine since the grooming is performed during the upload process. I never tried the public instance though. If you need me to share one of the files with you there, I'll have to ask for permission first, since the data's not mine.
Thanks for providing more info or perhaps you will find that one of these uncovers the issue,
Thanks for your help :) Regards, L-A
Hello L-A,
The load is necessary to create/link the .bam index. No work-around is available; however, if you develop one, we would be glad to consider it as an addition.
If there are any updates on our side, we will send more information.
Thanks!
Jen Galaxy team
Hello,
Please sign me off the Galaxy email list. I keep getting Galaxy emails that I don't know anything about. I was in a genomics class and took part in a Galaxy project; now I'm done. Again, please sign me off. Thank you so much for your time.
Best regards, Mai H.
P.S. my email is mhuynh@uark.edu
----- Original Message ----- From: Jennifer Jackson <jen@bx.psu.edu> Date: Tuesday, July 26, 2011 12:58 pm Subject: Re: [galaxy-user] BAM files loading and mandatory grooming To: Louise-Amélie Schmitt <louise-amelie.schmitt@embl.de> Cc: galaxy-user@lists.bx.psu.edu, closeticket@galaxyproject.org
Hello Mai, I removed you from the galaxy-user list (you were not subscribed to galaxy-dev). For others who may be following, subscriptions can be managed here: http://lists.bx.psu.edu/listinfo/galaxy-user http://lists.bx.psu.edu/listinfo/galaxy-dev Take care, Jen Galaxy team
Hello Jennifer, Aaaah so the grooming means creating the .bai file? Ok, I didn't know, thanks for the information. But that leaves me wondering why it is necessary to copy the data (unless the alignments are not sorted yet, which is possible knowing where the data comes from). I won't have much time left to look into it but if I can I'll let you know. Thanks, L-A
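One inexpensive first look at whether alignments are sorted is the `SO:` tag on the `@HD` line of the SAM/BAM header. The sketch below is only an illustration: `header_sort_order` is a hypothetical helper, and it assumes the header text has already been obtained (for example with `samtools view -H`). The tag records only what the header says, which may not match the actual record order, so treat the result as a hint rather than proof.

```python
def header_sort_order(header_text):
    """Return the SO value from the @HD header line, or None if absent.

    Hypothetical helper: it reports what the header declares, not how
    the records are actually ordered in the file.
    """
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
            return None
    return None
```

A return value of `None` or `"unsorted"` says nothing definite either way, since a file can be sorted without the header having been updated.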
Sadly samtools sort doesn't set the header information, so it is not enough to look at that to say if and how a SAM/BAM file is sorted. I think what the Galaxy code does now is attempt to index the BAM file (as a safe way to find out if it really is suitably sorted), and if that fails, sort it by coordinate (which will make a new BAM file) and then index the sorted version. Peter
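The index-first, sort-on-failure behaviour described above can be sketched roughly as follows. This is not Galaxy's actual code: `groom_bam` and the `.sorted.bam` output name are hypothetical, and the `samtools sort -o` form is the syntax of samtools 1.x (releases from around the time of this thread took an output prefix instead).

```python
import subprocess
from pathlib import Path


def groom_bam(bam_path, run=subprocess.run):
    """Index a BAM file, sorting it first only if indexing fails.

    Illustrative sketch, not Galaxy's implementation. `run` is injectable
    so the logic can be exercised without samtools installed.
    Returns the path of the BAM file that ended up indexed.
    """
    bam = Path(bam_path)

    # Step 1: try to index as-is; this is expected to fail when the
    # input is not coordinate-sorted.
    attempt = run(["samtools", "index", str(bam)], capture_output=True)
    if attempt.returncode == 0:
        return bam

    # Step 2: indexing failed, so write a coordinate-sorted copy
    # (samtools 1.x syntax; the output name is an assumption).
    sorted_bam = bam.with_name(bam.stem + ".sorted.bam")
    run(["samtools", "sort", "-o", str(sorted_bam), str(bam)], check=True)

    # Step 3: index the sorted copy instead of the original.
    run(["samtools", "index", str(sorted_bam)], check=True)
    return sorted_bam
```

Because step 2 writes a new file, a flow like this cannot leave a linked, read-only original untouched in the unsorted case, which would be consistent with the request to copy files into Galaxy.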
Ah ok, I understand!
It should be fine though, since we won't need to load BAM files anymore shortly. Thanks for all the information! Best, L-A
participants (4)
- Jennifer Jackson
- Louise-Amélie Schmitt
- Mai Huynh
- Peter Cock