Hi,
I'm still setting up a local Galaxy instance. Currently I'm testing the setup of the NGS tools. If I try "SAM to BAM" for a BAM file that has "hg18" set as its build, I get the message "Sequences are not currently available for the specified build." I guess I either have to modify one of the .loc files (but which?) or download additional data from the rsync server. (I already have the complete tool-data/shared/hg18.)
regards, Andreas
By the way, do you have any plans to ease the pain of adding additional builds? Something simpler than having to add one line for each build*tool combination? These lines seem very redundant to me.
On Wed, Oct 31, 2012 at 11:30 AM, Andreas Kuntzagk andreas.kuntzagk@mdc-berlin.de wrote:
Hi,
I'm still setting up a local Galaxy instance. Currently I'm testing the setup of the NGS tools. If I try "SAM to BAM" for a BAM file that has "hg18" set as its build, I get the message "Sequences are not currently available for the specified build." I guess I either have to modify one of the .loc files (but which?) or download additional data from the rsync server. (I already have the complete tool-data/shared/hg18.)
The .loc file you want to modify is 'tool-data/sam_fa_indices.loc'. You can find information about this subject in the wiki[1]. The table there is not complete, though, so you can always find the right XML file under 'tools' and look inside for a line like this one: <validator type="dataset_metadata_in_file" filename="sam_fa_indices.loc" metadata_name="dbkey" metadata_column="1" message="Sequences are not currently available for the specified build." line_startswith="index" />
[1]http://wiki.g2.bx.psu.edu/Admin/NGS%20Local%20Setup
And I agree, dealing with .loc files is quite cumbersome.
Hope it helps, Carlos
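A minimal sketch of what the validator line quoted above appears to check: whether the dataset's dbkey shows up in a given column of a .loc line that starts with "index". It assumes the .loc file is tab-separated and that metadata_column is zero-based; the function name and the sample path are hypothetical, and this is not Galaxy's own code.

```python
def dbkey_in_loc(loc_text, dbkey, column=1, line_startswith="index"):
    """Return True if dbkey is found in `column` of a matching .loc line."""
    for line in loc_text.splitlines():
        # Skip comments and lines that do not carry the expected prefix.
        if line.startswith("#") or not line.startswith(line_startswith):
            continue
        fields = line.split("\t")
        if len(fields) > column and fields[column] == dbkey:
            return True
    return False


# Hypothetical .loc content: one comment line and one hg18 entry.
sample = "#type\tdbkey\tpath\nindex\thg18\t/data/hg18/sam_index/hg18.fa\n"
print(dbkey_in_loc(sample, "hg18"))
print(dbkey_in_loc(sample, "hg19"))
```

If the first call reports no match for a build you expect to be present, the .loc file is the place to look before anything else.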
Hi,
thank you for the pointer. I was only looking at this wiki page: http://wiki.g2.bx.psu.edu/Admin/Data%20Integration
Maybe this should point to your page?
regards, Andreas
On 31.10.2012 17:50, Carlos Borroto wrote:
The .loc file you want to modify is 'tool-data/sam_fa_indices.loc'. You can find information about this subject in the wiki[1]. The table there is not complete, though, so you can always find the right XML file under 'tools' and look inside for a line like this one: <validator type="dataset_metadata_in_file" filename="sam_fa_indices.loc" metadata_name="dbkey" metadata_column="1" message="Sequences are not currently available for the specified build." line_startswith="index" />
[1]http://wiki.g2.bx.psu.edu/Admin/NGS%20Local%20Setup
And I agree, dealing with .loc files is quite cumbersome.
Hope it helps, Carlos
Well, it is not my page; I'm not directly affiliated with Galaxy. Still, I think the Galaxy Project would be happy to receive updates to the wiki. Maybe a complete table could be added to the page you are mentioning.
On Thu, Nov 1, 2012 at 4:27 AM, Andreas Kuntzagk andreas.kuntzagk@mdc-berlin.de wrote:
Hi,
thank you for the pointer. I was only looking at this wiki page: http://wiki.g2.bx.psu.edu/Admin/Data%20Integration
Maybe this should point to your page?
regards, Andreas
-- Andreas Kuntzagk
SystemAdministrator
Berlin Institute for Medical Systems Biology at the Max-Delbrueck-Center for Molecular Medicine Robert-Roessle-Str. 10, 13125 Berlin, Germany
Sorry for my unclear sentence, I meant the page you pointed me to.
regards, Andreas
On 01.11.2012 14:50, Carlos Borroto wrote:
Well, it is not my page; I'm not directly affiliated with Galaxy. Still, I think the Galaxy Project would be happy to receive updates to the wiki. Maybe a complete table could be added to the page you are mentioning.
Hi,
It's still not working. I just noticed that the sam_index dir only contains links to some files in ../seq, which is mostly empty except for some 2bit files. I could not find any documentation on how to obtain these data files.
regards, Andreas
Andreas,
When setting up the rsync server, we decided that .fa files would be excluded from the listing, since the 2bit format contains the same data but takes up to 75% less space. I would recommend downloading the relevant .2bit file and converting it back to FASTA with twoBitToFa, then updating your all_fasta.loc file to point to the resulting .fa file.
--Dave B.
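The conversion Dave describes can be sketched as two command invocations: twoBitToFa to turn the .2bit file back into FASTA, and "samtools faidx" (the indexing step that comes up elsewhere in this thread) to write the .fai index next to it. The helper names and file paths below are placeholders, and both tools are assumed to be installed and on PATH.

```python
import shutil
import subprocess


def build_commands(twobit_path, fasta_path):
    """Return the two commands as argument lists, in the order they should run."""
    return [
        ["twoBitToFa", twobit_path, fasta_path],  # 2bit -> FASTA
        ["samtools", "faidx", fasta_path],        # writes fasta_path + ".fai"
    ]


def run_all(commands):
    """Run each command; stop and return False if a tool is not on PATH."""
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            print("not on PATH, stopping:", cmd[0])
            return False
        subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    # Only print the commands here; call run_all(...) once the paths are real.
    for cmd in build_commands("hg18.2bit", "hg18.fa"):
        print(" ".join(cmd))
```

After both steps, all_fasta.loc still needs an entry pointing at the resulting .fa file, as described above.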
On 11/1/12 06:27:49.000, Andreas Kuntzagk wrote:
Hi,
It's still not working. I just noticed that the sam_index dir only contains links to some files in ../seq, which is mostly empty except for some 2bit files. I could not find any documentation on how to obtain these data files.
regards, Andreas
Dave,
In the meantime I found out by myself how to generate the FASTA and also to run "samtools faidx" on it. The info about "all_fasta.loc" was missing. But it's still not working.
Let me summarize what I did so far:
- tool-data/shared/ucsc/hg18/seq/ contains these files: hg18.2bit hg18.fa hg18.fa.fai
where hg18.2bit was downloaded from the rsync server and the other two generated from it.
- tool-data/shared/ucsc/builds.txt contains this line:
hg18 Human Mar. 2006 (NCBI36/hg18) (hg18)
- tool-data/all_fasta.loc contains this line:
hg18 hg18 Human (Homo sapiens): hg18 tool-data/shared/ucsc/hg18/seq/hg18.fa
- tool-data/sam_fa_indices.loc contains this line:
index hg18 tool-data/shared/ucsc/hg18/sam_index/hg18.fa
- tool-data/srma_index.loc contains this line:
hg18 hg18 hg18 tool-data/shared/ucsc/hg18/srma_index/hg18.fa
So any ideas where to look further?
regards, Andreas
On 01.11.2012 15:36, Dave Bouvier wrote:
Andreas,
When setting up the rsync server, we decided that .fa files would be excluded from the listing, since the 2bit format contains the same data but takes up to 75% less space. I would recommend downloading the relevant .2bit file and converting it back to FASTA with twoBitToFa, then updating your all_fasta.loc file to point to the resulting .fa file.
--Dave B.
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Andreas,
I recommend moving hg18.fa.fai into tool-data/shared/ucsc/hg18/sam_index/, and then samtools should work. Also, if you're using picard tools, you'll want hg18.fa.fai and hg18.dict in tool-data/shared/ucsc/hg18/srma_index/, as well as links to hg18.fa in both directories.
--Dave B.
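A rough sketch of the "links to hg18.fa in both directories" part of this layout, using a temporary directory in place of tool-data/shared/ucsc. The helper name link_into is hypothetical. The sketch covers only the symlinks; the .fai and .dict files mentioned above would still have to be generated and placed separately.

```python
import os
import tempfile


def link_into(index_dir, target_path):
    """Create index_dir if needed and symlink target_path into it by basename."""
    os.makedirs(index_dir, exist_ok=True)
    link_path = os.path.join(index_dir, os.path.basename(target_path))
    if not os.path.lexists(link_path):
        # Link to the absolute target so the link does not depend on the cwd.
        os.symlink(os.path.abspath(target_path), link_path)
    return link_path


with tempfile.TemporaryDirectory() as root:
    seq_dir = os.path.join(root, "hg18", "seq")
    os.makedirs(seq_dir)
    fasta = os.path.join(seq_dir, "hg18.fa")
    open(fasta, "w").close()  # empty stand-in for the real FASTA
    for name in ("sam_index", "srma_index"):
        link = link_into(os.path.join(root, "hg18", name), fasta)
        print(name, os.path.islink(link), os.path.exists(link))
```

Linking to an absolute target is a design choice here: a relative link target is resolved against the link's own directory, which is easy to get wrong.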
On 11/1/12 11:59:24.000, Andreas Kuntzagk wrote:
Dave,
In the meantime I found out by myself how to generate the FASTA and also to run "samtools faidx" on it. The info about "all_fasta.loc" was missing. But it's still not working.
Let me summarize what I did so far:
- tool-data/shared/ucsc/hg18/seq/ contains these files:
hg18.2bit hg18.fa hg18.fa.fai
where hg18.2bit was downloaded from the rsync server and the other two generated from it.
- tool-data/shared/ucsc/builds.txt contains this line:
hg18 Human Mar. 2006 (NCBI36/hg18) (hg18)
- tool-data/all_fasta.loc contains this line:
hg18 hg18 Human (Homo sapiens): hg18 tool-data/shared/ucsc/hg18/seq/hg18.fa
- tool-data/sam_fa_indices.loc contains this line:
index hg18 tool-data/shared/ucsc/hg18/sam_index/hg18.fa
- tool-data/srma_index.loc contains this line:
hg18 hg18 hg18 tool-data/shared/ucsc/hg18/srma_index/hg18.fa
So any ideas where to look further?
regards, Andreas
Oh,
I forgot to mention it:
hg18.fa.fai is already in tool-data/shared/ucsc/hg18/sam_index/ as a link to tool-data/shared/ucsc/hg18/seq
Replacing the link with the actual file did not help (not surprising).
regards, Andreas
On 01.11.2012 17:56, Dave Bouvier wrote:
Andreas,
I recommend moving hg18.fa.fai into tool-data/shared/ucsc/hg18/sam_index/, and then samtools should work. Also, if you're using picard tools, you'll want hg18.fa.fai and hg18.dict in tool-data/shared/ucsc/hg18/srma_index/, as well as links to hg18.fa in both directories.
--Dave B.
Ok, I've found the solution. sam_fa_indices.loc contained relative paths. When I changed to absolute paths it worked.
regards, Andreas
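For anyone hitting the same problem, here is a small sketch of that fix: rewriting relative paths in a .loc file to absolute ones. It assumes the file is tab-separated with the path in the last column; "/opt/galaxy" is a hypothetical install root and absolutize_loc is an illustrative name, not part of Galaxy.

```python
import os


def absolutize_loc(loc_text, galaxy_root, path_column=-1):
    """Return loc_text with relative paths in path_column made absolute."""
    out = []
    for line in loc_text.splitlines():
        # Leave blank lines and comments untouched.
        if not line.strip() or line.startswith("#"):
            out.append(line)
            continue
        fields = line.split("\t")
        path = fields[path_column]
        if not os.path.isabs(path):
            fields[path_column] = os.path.normpath(os.path.join(galaxy_root, path))
        out.append("\t".join(fields))
    return "\n".join(out)


before = "index\thg18\ttool-data/shared/ucsc/hg18/sam_index/hg18.fa"
print(absolutize_loc(before, "/opt/galaxy"))
```

Lines that already hold an absolute path pass through unchanged, so the function can be run repeatedly on the same file content.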
On 02.11.2012 09:26, Andreas Kuntzagk wrote:
Oh,
I forgot to mention it:
hg18.fa.fai is already in tool-data/shared/ucsc/hg18/sam_index/ as a link to tool-data/shared/ucsc/hg18/seq
Replacing the link with the actual file did not help (not surprising).
regards, Andreas
galaxy-dev@lists.galaxyproject.org