building NGS indexes on user sequences
Hi,
We want to allow users to index sequences in Galaxy (for example, in the data libraries) for use with NGS tools (Bowtie, BWA, etc.). Has anyone made tools for NGS indexing?
It will also require changing the .loc files for the NGS tools and dynamically reconfiguring the tools, which is all possible.
Users might want to index and map reads to either their own sequences or other sequences that aren't available in the default list, and this would remove the need for admin intervention.
Is anyone else doing something similar? Or can it be done now and I just don't see it?
Cheers,
Mike.
On Tue, Nov 30, 2010 at 2:14 PM, Michael Pheasant <mike@pheasant.co.nz> wrote:
Hi
We want to allow users to index sequences in galaxy (for example in the data libraries) for use with NGS tools (Bowtie, BWA, etc). Has anyone made tools for NGS indexing?
Hi Mike,
Could you clarify what you mean by NGS indexing? There are lots of different index schemes - e.g. *.BAI are indexes for BAM files, similarly samtools has its own index scheme for FASTA files. Other examples include BioPerl flat file indexes (including OBDA text file and BSD indices) for sequential files (e.g. FASTA, GenBank, EMBL, ...).
Peter
In particular I was thinking of read mappers like Bowtie and BWA. The indexes are made by the sysadmin and added to the .loc file 'by hand'. It would be good if a user could add their sequence of interest as a library (maybe a new genome, or a reference sequence like mitochondria or something) and then create the required index themselves, so they can run their read mapping on that sequence.
Cheers,
Mike.
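[Editor's sketch] To make the 'by hand' steps above concrete, here is a minimal Python sketch of what the sysadmin currently does: run each mapper's indexer on a FASTA file, then append a line to that mapper's .loc file. The function names, the directory layout, and the four-column tab-separated .loc layout (build id, dbkey, display name, index base path) are assumptions for illustration and should be checked against the .loc.sample files in your Galaxy install. It is not taken from Galaxy or from any script mentioned in this thread.

```python
import subprocess
from pathlib import Path


def index_commands(fasta, out_dir, name):
    """Return {mapper: (command, index_base_path)} for Bowtie and BWA."""
    out_dir = Path(out_dir)
    bowtie_base = out_dir / "bowtie" / name  # hypothetical layout
    bwa_base = out_dir / "bwa" / name        # hypothetical layout
    return {
        # bowtie-build <reference_in> <index_base>
        "bowtie": (["bowtie-build", str(fasta), str(bowtie_base)], bowtie_base),
        # "is" is fine for small references; large genomes need "-a bwtsw".
        "bwa": (["bwa", "index", "-a", "is", "-p", str(bwa_base), str(fasta)],
                bwa_base),
    }


def loc_line(build_id, dbkey, display_name, base_path):
    """Format one .loc entry (assumed four tab-separated columns)."""
    return "\t".join([build_id, dbkey, display_name, str(base_path)]) + "\n"


def build_and_register(fasta, out_dir, name, display_name, loc_files):
    """Build both indexes, then append an entry to each mapper's .loc file.

    loc_files maps "bowtie" and "bwa" to the paths of their .loc files.
    """
    for mapper, (cmd, base) in index_commands(fasta, out_dir, name).items():
        base.parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=True)  # raises if the indexer fails
        with open(loc_files[mapper], "a") as handle:
            handle.write(loc_line(name, name, display_name, base))
```

A call might look like build_and_register("mito.fa", "/data/indexes", "mito", "Mitochondria", {"bowtie": "bowtie_indices.loc", "bwa": "bwa_index.loc"}), where all paths are placeholders. The sketch does not cover getting a running Galaxy to notice the new entries, which is the dynamic reconfiguration part of the original question.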
Michael Pheasant wrote:
Hi
We want to allow users to index sequences in galaxy (for example in the data libraries) for use with NGS tools (Bowtie, BWA, etc). Has anyone made tools for NGS indexing?
It will also require changing the .loc files for the NGS tools and dynamically reconfiguring the tools, which is all possible.
Users might want to index and map reads to either their own sequences or other sequences that aren't available in the default list, and this would remove the need for admin intervention.
Is anyone else doing something similar? Or can it be done now and I just don't see it?
Hi Mike,
With a little modification, I've been using Brad Chapman's script to do this on the cloud indices volume. It'll need to be updated to generate location files for the new data table formats, though. Here's the script:
https://github.com/chapmanb/bcbb/blob/master/ec2/biolinux/data_fabfile.py
--nate
Nate and Mike;
[Automated scripts for indexing and updating .loc files]
With a little modification, I've been using Brad Chapman's script to do this on the cloud indices volume. It'll need to be updated to generate location files for the new data table formats, though. Here's the script:
https://github.com/chapmanb/bcbb/blob/master/ec2/biolinux/data_fabfile.py
Awesome, really glad this is useful. The latest version, updated last Friday, produces location files that work with the new data table formats.
Also, I'm happy to roll in changes to make the script more general. Nate, if you think there are useful things in your modifications, definitely send along a diff or change request and we can get it rolled in.
Brad
Brad Chapman wrote:
Nate and Mike;
[Automated scripts for indexing and updating .loc files]
With a little modification, I've been using Brad Chapman's script to do this on the cloud indices volume. It'll need to be updated to generate location files for the new data table formats, though. Here's the script:
https://github.com/chapmanb/bcbb/blob/master/ec2/biolinux/data_fabfile.py
Awesome, really glad this is useful. The latest version, updated last Friday, produces location files that work with the new data table formats.
Great!
Also, I'm happy to roll in changes to make the script more general. Nate, if you think there are useful things in your modifications definitely send along a diff or change request and we can get it rolled in.
Will do, thanks. --nate
Brad
Participants (4): Brad Chapman, Michael Pheasant, Nate Coraor, Peter