Re: [galaxy-dev] Salmon references and data manager

7 Sep 2018

Yeah, I got confused about the data tables. Sorry about this. I too
would keep the transcriptome indices separate from the reference
genomes; it just makes sense.

@Ignacio, I found that you need to insert the following (in red)

if not os.path.exists( target_directory ):
    os.mkdir( target_directory )
args = ['salmon','index']

in order for anything to happen.
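
For context, here is a minimal sketch of how those lines might fit into the index-building step. The function name and its parameters are hypothetical and not taken from the data manager itself; the `-t` (transcript FASTA) and `-i` (index directory) options follow Salmon's documented `salmon index` usage, and the real data manager may pass more options.

```python
import os


def build_salmon_index_args(transcriptome_fasta, target_directory):
    """Create the index directory if needed and return the salmon command line.

    Hypothetical helper for illustration only.
    """
    # Create the output directory first (the step reported as missing above).
    if not os.path.exists(target_directory):
        os.mkdir(target_directory)
    # Start from the 'salmon index' subcommand, then add input and output paths.
    args = ['salmon', 'index']
    args.extend(['-t', transcriptome_fasta, '-i', target_directory])
    return args
```

The returned list could then be handed to something like `subprocess.check_call(...)`, assuming `salmon` is on the PATH of the job environment.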

I think that's it...but I'll test some more.

Best regards,

Christopher

On 09/07/2018 10:46 AM, Ignacio EGUINOA wrote:
...
Hi Christopher and Björn,
I have some comments about this because I also came up with these
questions some time ago...
------------------------------------------------------------------------
*From: *"Björn Grüning" <bjoern.gruening@gmail.com>
*To: *"Previti" <christopher.previti@dkfz-heidelberg.de>,
"galaxy-dev" <galaxy-dev@lists.galaxyproject.org>
*Sent: *Friday, September 7, 2018 9:56:41 AM
*Subject: *Re: [galaxy-dev] Salmon references and data manager
Hi Christopher!
> Dear Björn,
>
> I just installed Salmon on our Galaxy instance and I have a couple of
> basic questions.
Sure, thanks for getting in touch!
> Currently the reference transcriptomes are put in the same data table as
> the genomes, would it be of interest to separate this and give the
> transcriptomes their own table? I could probably try to do this...
That I don't understand?
Salmon is using this one here, isn't it?
https://github.com/bgruening/galaxytools/blob/master/tools/salmon/salmon.xml...
What he means, I think, is the table the index is built from. Data
managers that take a transcriptome as input get it from the all_fasta
table; I think that is what he means by the genomes table.
As I said, at some point I also thought it might be useful to have a
separate table (e.g. all_transcriptomes) so that the genome and
transcriptome entries of the same build don't get mixed. I think it
would be good to have a way of listing only the transcriptomes from
the all_gff, but that would require some kind of naming standard to
filter on. We had this in our instance at some point, but it didn't
help at all, so I just modified the data manager to use the all_fasta
table, and that is what I published.
So, @Christopher... having a separate table is not the solution,
although it would be easier for the GUI. For now, just giving the
entries a descriptive name to indicate that they correspond to a
transcriptome is enough and works OK for us. In any case this is not
for users, and at least for us it's all handled through the API, so,
again, it's just a matter of taking care of the entry names and you
are fine with using the all_fasta table.
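
To illustrate the naming approach, here is a rough sketch of picking the transcriptome entries out of all_fasta-style rows by their display name. It assumes the usual tab-separated column order of the all_fasta table (value, dbkey, name, path) and a made-up convention that transcriptome entries contain the word "transcriptome" in their name; both function names are hypothetical.

```python
def parse_loc_lines(lines):
    """Parse tab-separated .loc lines into dicts with value, dbkey, name, path."""
    entries = []
    for line in lines:
        line = line.rstrip('\n')
        # Skip blank lines and comments.
        if not line or line.startswith('#'):
            continue
        fields = line.split('\t')
        # Skip malformed rows with fewer than four columns.
        if len(fields) < 4:
            continue
        value, dbkey, name, path = fields[:4]
        entries.append({'value': value, 'dbkey': dbkey,
                        'name': name, 'path': path})
    return entries


def select_transcriptomes(entries, marker='transcriptome'):
    """Keep entries whose display name contains the marker (case-insensitive).

    The marker is an assumed naming convention, not a Galaxy standard.
    """
    return [e for e in entries if marker.lower() in e['name'].lower()]
```

With entries named, say, "Human (hg38)" and "Human (hg38) transcriptome", only the second would be selected, which is all the descriptive-name approach relies on.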
> There is a data manager available that unfortunately has a bug. We fixed
> that and it now populates the reference genome data table.
Do you mean this one?
https://github.com/ieguinoa/data_manager_salmon_index_builder
> I would probably modify this as well to use the new table. Could this be
> useful? I'm not sure how to proceed...would I give you the modified
> Salmon wrapper for inclusion in the package?
If you can, please feel free to create PRs to the repositories, so we
can all review it. And then, when we merge, it gets automatically
updated to the Tool Shed :)
As Björn said, if that's the one you are talking about, please create a
PR or an issue, or contact me.
Cheers,
Ignacio
Thanks!
    Bjoern
> Best regards,
    >
    > Christopher
    >
    >
    > --
    > *Dr. Christopher Previti*
    > Genomics and Proteomics Core Facility
    > High Throughput Sequencing (W190)
    > Bioinformatician
    >
    > German Cancer Research Center (DKFZ)
    > Foundation under Public Law
    > Im Neuenheimer Feld 580
    > 69120 Heidelberg
    > Germany
    > Room: B2.102 (INF580/TP3)
    > Phone: +49 6221 42-4661
    >
    > christopher.previti@dkfz.de <http://www.dkfz.de/>
    > www.dkfz.de <http://www.dkfz.de/>
    >
    > Management Board: Prof. Dr. Michael Baumann, Prof. Dr. Josef Puchta
    > VAT-ID No.: DE143293537
    >
    > Confidentiality notice: This message is intended exclusively for the
    > persons to whom it is addressed.
    > It may contain confidential information and/or information intended
    > only for the recipient(s). If you are not
    > the intended recipient, please contact the
    > sender and delete the message.
    > Any unauthorized use of the information in this message is
    > prohibited.
    >
    >