From graham.etherington@sainsbury-laboratory.ac.uk Fri Jul 12 08:17:16 2013
From: "graham etherington (TSL)"
To: galaxy-dev@lists.galaxyproject.org
Subject: [galaxy-dev] No samtools build after building index through Data Manager.
Date: Fri, 12 Jul 2013 12:14:46 +0000

Hi,

I've tried using the Data Manager (Admin > Data > Manage local data (beta)) to install builds for BWA and Samtools on my local Galaxy instance.

Before using the Data Manager, I used to add the build to tool-data/shared/ucsc/builds.txt, create the .fai indexes (for samtools) from the command line, add them to tool-data/sam_fa_indices.loc and restart Galaxy (obviously doing a similar thing for BWA and adding the build to bwa_index.loc).

I thought I'd try using the Data Manager to add builds for BWA and Samtools. The BWA builds work fine (I can map to the build), but when I try to use SAM-to-BAM I get the error "Sequences are not currently available for the specified build."

Using the Data Manager creates the directory tool-data/n_sylvestris/, which contains the sub-dirs 'seq', 'bwa_index' and 'sam_index'.
'seq' contains a symlink to the n_sylvestris.fa sequence.
'sam_index' and 'bwa_index' both contain the sub-directory 'n_sylvestris', which contains a symlink to the symlink for n_sylvestris.fa in 'seq' along with their respective n_sylvestris.fa.xxx index files.

OK - all good…

In tool-data/testtoolshed.g2.bx.psu.edu/repos/blankenberg/ there are three subdirectories:
data_manager_bwa_index_builder, data_manager_sam_fa_index_builder and data_manager_fetch_genome_all_fasta
All three directories contain all_fasta.loc, tool_data_table_conf.xml, tool_data_table_conf.xml.sample and (for the sam and bwa dirs) their pertinent index.loc file.

The data_manager_fetch_genome_all_fasta/all_fasta.loc file contains the path to the fasta symlinks.

The all_fasta.loc files in the sam and bwa data_manager_index_builder directories don't contain any uncommented lines.

The index.loc files in the sam and bwa data_manager_index_builder directories point to:
tool-data/n_sylvestris/bwa_index/n_sylvestris/n_sylvestris.fa
tool-data/n_sylvestris/sam_index/n_sylvestris/n_sylvestris.fa

As BWA runs fine, it's obviously reading the bwa_index.loc file from the directory:
tool-data/testtoolshed.g2.bx.psu.edu/repos/blankenberg/data_manager_bwa_index_builder/fe6508204acc/bwa_index.loc

...but it's not reading the samtools indexes at:
tool-data/testtoolshed.g2.bx.psu.edu/repos/blankenberg/data_manager_sam_fa_index_builder/926e50397b83/sam_fa_indices.loc

For Galaxy to find the sam indexes, I have to go to the tool-data/sam_fa_indices.loc file and manually insert into it the contents of:
tool-data/testtoolshed.g2.bx.psu.edu/repos/blankenberg/data_manager_sam_fa_index_builder/926e50397b83/sam_fa_indices.loc

So, I guess my question is: other than inserting the genome builds into builds.txt, should I be doing any other configuration to get the Data Manager to write its newly created builds and Galaxy to read them? I find it strange that the BWA builds work OK, but the Samtools ones don't.

I've done a few greps for mentions of .loc files in Galaxy, and the only difference between the bwa and sam .loc files is that there is a file tool-data/tool_data_table_conf.xml (plus a .sample version) which contains:

line_type, value, path

Could Galaxy be reading this file and ignoring the one in tool-data/testtoolshed.g2.bx.psu.edu/repos/blankenberg/?

Best wishes,
Graham

Dr. Graham Etherington
Bioinformatics Support Officer,
The Sainsbury Laboratory,
Norwich Research Park,
Norwich NR4 7UH.
UK
Tel: +44 (0)1603 450601

From dan@bx.psu.edu Fri Jul 12 10:46:10 2013
From: Daniel Blankenberg
To: galaxy-dev@lists.galaxyproject.org
Subject: Re: [galaxy-dev] No samtools build after building index through Data Manager.
Date: Fri, 12 Jul 2013 10:45:00 -0400
Message-ID: <00AE4B7F-98AD-4433-BD5D-E423E7324E90@bx.psu.edu>

Hi Graham,

Which revision of Galaxy are you currently using? Data managers currently require at least 9952:a28faa6ac188 on the default branch.

When installed from the tool shed, data managers use the shed_tool_data_table_conf.xml file. Could you check the contents of that file?

Also, can you check the paster.log during server startup for any errors? It's most likely the case that this changeset, https://bitbucket.org/galaxy/galaxy-central/commits/af20b15f7eda (10145:af20b15f7eda), creates an incompatible tool data table entry in the main file.

Thanks for using Galaxy,

Dan

From graham.etherington@sainsbury-laboratory.ac.uk Mon Jul 15 06:28:48 2013
From: "graham etherington (TSL)"
To: galaxy-dev@lists.galaxyproject.org
Subject: Re: [galaxy-dev] No samtools build after building index through Data Manager.
Date: Mon, 15 Jul 2013 10:28:05 +0000
In-Reply-To: <00AE4B7F-98AD-4433-BD5D-E423E7324E90@bx.psu.edu>

Hi Dan,

Thanks for your reply. I don't handle the updates, but our instance was last updated on 25 Jun 2013. I'm pretty sure I can rule out the version number, as the BWA builds created using the Data Manager work fine.

Here are the pertinent contents of shed_tool_data_table_conf.xml:

value, dbkey, name, path
testtoolshed.g2.bx.psu.edu data_manager_bwa_index_builder blankenberg fe6508204acc

value, dbkey, name, path
testtoolshed.g2.bx.psu.edu data_manager_sam_fa_index_builder blankenberg 926e50397b83

line_type, value, path
testtoolshed.g2.bx.psu.edu data_manager_sam_fa_index_builder blankenberg 926e50397b83

value, dbkey, name, path
testtoolshed.g2.bx.psu.edu data_manager_fetch_genome_all_fasta blankenberg ca8b3709309e

In ${GALAXY_HOME}/tool_data_table_conf.xml.sample the entry for sam_fa reads as follows:

line_type, value, path

The file tool-data/sam_fa_new_indices.loc (and .sample) does not exist.

If I keep the manually inserted builds listed in tool-data/sam_fa_indices.loc and restart Galaxy, then I get the following (abridged) entries in the paster.log:

galaxy.tools.data DEBUG 2013-07-15 09:51:11,109 Loaded tool data table 'all_fasta'
galaxy.tools.data DEBUG 2013-07-15 09:51:11,115 Loaded tool data table 'bwa_indexes'
galaxy.tools.data DEBUG 2013-07-15 09:51:11,116 Loaded tool data table 'bwa_indexes_color'
galaxy.tools.data DEBUG 2013-07-15 09:51:11,167 Loaded tool data table 'sam_fa_indexes'
...
galaxy.tools.data DEBUG 2013-07-15 09:51:11,324 Loading another instance of data table 'all_fasta', attempting to merge content.
galaxy.tools.data DEBUG 2013-07-15 09:51:11,340 Loading another instance of data table 'bwa_indexes', attempting to merge content.
galaxy.tools.data DEBUG 2013-07-15 09:51:11,348 Loading another instance of data table 'bwa_indexes_color', attempting to merge content.
galaxy.tools.data DEBUG 2013-07-15 09:51:11,410 Loading another instance of data table 'all_fasta', attempting to merge content.
galaxy.tools.data DEBUG 2013-07-15 09:51:11,422 Loading another instance of data table 'sam_fa_indexes', attempting to merge content.
galaxy.tools.data ERROR 2013-07-15 09:51:11,422 Attempted to add fields (['index', 'cfraxinea_s1v1', '/home/galaxy/software/galaxy-central/tool-data/cfraxinea_s1v1/sam_index/cfraxinea_s1v1/c_fraxinea_s1v1.fa']) to data table 'sam_fa_indexes', but this entry already exists and allow_duplicates is False.
galaxy.tools.data ERROR 2013-07-15 09:51:11,422 Attempted to add fields (['index', 'b_distachyon', '/home/galaxy/software/galaxy-central/tool-data/b_distachyon/sam_index/b_distachyon/b_distachyon.fa']) to data table 'sam_fa_indexes', but this entry already exists and allow_duplicates is False.
galaxy.tools.data ERROR 2013-07-15 09:51:11,423 Attempted to add fields (['index', 'n_sylvestris', '/home/galaxy/software/galaxy-central/tool-data/n_sylvestris/sam_index/n_sylvestris/n_sylvestris.fa']) to data table 'sam_fa_indexes', but this entry already exists and allow_duplicates is False.
galaxy.tools.data ERROR 2013-07-15 09:51:11,423 Attempted to add fields (['index', 'n_tomentosiformis', '/home/galaxy/software/galaxy-central/tool-data/n_tomentosiformis/sam_index/n_tomentosiformis/n_tomentosiformis.fa']) to data table 'sam_fa_indexes', but this entry already exists and allow_duplicates is False.
galaxy.tools.data DEBUG 2013-07-15 09:51:11,491 Loading another instance of data table 'all_fasta', attempting to merge content.

The builds listed above with the 'entry already exists' error are all the ones which I attempted to load using the Data Manager.

If I comment out these entries in tool-data/sam_fa_indices.loc and restart Galaxy, the errors disappear, but when I try to use those builds with SAM-to-BAM, I'm back to the "Sequences are not currently available for the specified build." error.

paster.log entries after restart:

galaxy.tools.data DEBUG 2013-07-15 10:02:56,484 Loaded tool data table 'all_fasta'
galaxy.tools.data DEBUG 2013-07-15 10:02:56,491 Loaded tool data table 'bwa_indexes'
galaxy.tools.data DEBUG 2013-07-15 10:02:56,492 Loaded tool data table 'bwa_indexes_color'
galaxy.tools.data DEBUG 2013-07-15 10:02:56,497 Loaded tool data table 'sam_fa_indexes'
...
galaxy.tools.data DEBUG 2013-07-15 10:02:56,508 Loading another instance of data table 'all_fasta', attempting to merge content.
galaxy.tools.data DEBUG 2013-07-15 10:02:56,509 Loading another instance of data table 'bwa_indexes', attempting to merge content.
galaxy.tools.data DEBUG 2013-07-15 10:02:56,510 Loading another instance of data table 'bwa_indexes_color', attempting to merge content.
galaxy.tools.data DEBUG 2013-07-15 10:02:56,512 Loading another instance of data table 'all_fasta', attempting to merge content.
galaxy.tools.data DEBUG 2013-07-15 10:02:56,512 Loading another instance of data table 'sam_fa_indexes', attempting to merge content.
galaxy.tools.data DEBUG 2013-07-15 10:02:56,514 Loading another instance of data table 'all_fasta', attempting to merge content.

On both occasions the following lines are in the paster.log:

galaxy.tools.data_manager.manager DEBUG 2013-07-15 09:51:47,667 Loaded Data Manager: testtoolshed.g2.bx.psu.edu/repos/blankenberg/data_manager_bwa_index_builder/data_manager/bwa_index_builder/0.0.1
galaxy.tools.data_manager.manager DEBUG 2013-07-15 09:51:47,689 Loaded Data Manager: testtoolshed.g2.bx.psu.edu/repos/blankenberg/data_manager_bwa_index_builder/data_manager/bwa_color_space_index_builder/0.0.1
galaxy.tools.data_manager.manager DEBUG 2013-07-15 09:51:47,755 Loaded Data Manager: testtoolshed.g2.bx.psu.edu/repos/blankenberg/data_manager_sam_fa_index_builder/data_manager/sam_fa_index_builder/0.0.1

Any suggestions?

Cheers for now,
Graham
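
The "this entry already exists and allow_duplicates is False" errors in the thread show up when the same row is listed both in the hand-edited tool-data/sam_fa_indices.loc and in the copy written by the Data Manager. A minimal sketch for spotting such rows before a restart might look like the following. It assumes .loc files are tab-separated with '#' comment lines, and it compares whole rows, which may not match Galaxy's own duplicate check exactly. The names read_loc and duplicated_entries are hypothetical helpers for illustration, not part of Galaxy.

```python
from pathlib import Path


def read_loc(path, comment_char="#"):
    """Return the data rows of a .loc file as tuples of fields.

    Assumption: fields are tab-separated; blank lines and lines starting
    with comment_char are skipped.
    """
    rows = []
    for line in Path(path).read_text().splitlines():
        if not line.strip() or line.startswith(comment_char):
            continue
        rows.append(tuple(field.strip() for field in line.split("\t")))
    return rows


def duplicated_entries(main_loc, shed_loc):
    """Return rows of shed_loc that also appear, field for field, in main_loc."""
    main_rows = set(read_loc(main_loc))
    return [row for row in read_loc(shed_loc) if row in main_rows]
```

Calling duplicated_entries with the path to tool-data/sam_fa_indices.loc and the path to the Data Manager's sam_fa_indices.loc would list the rows present in both files, which are the candidates for the duplicate-entry errors above.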