Re: [galaxy-dev] [galaxy-bugs] Fwd: Fwd: Installing data for Galaxy
Hi Iwe,
In the datatypes_conf.xml I added: <datatype extension="fastabench" type="galaxy.sequence:Fasta" display_in_upload="true"/>
All FASTA type files (fasta, fastabench, etc.) will be valid input for the 'fastabench' datatype. If you want to have a unique datatype that only includes 'fastabench' datatypes, you will need to create a new datatype class; if you want this datatype to be valid for 'fasta' input as well, this datatype should subclass Fasta.

For example, in lib/galaxy/datatypes/sequence.py, add:

    class FastaBench( Fasta ):
        """Class representing a FASTA bench sequence"""
        file_ext = "fastabench"

and change the entry in datatypes_conf.xml to (assuming you want it visible in the upload display):

    <datatype extension="fastabench" type="galaxy.sequence:FastaBench" display_in_upload="true"/>

Also, confirm by looking at the screen output when starting the galaxy server via run.sh that the datatype is loading properly; if a datatype cannot be loaded properly, it will default back to text.
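To illustrate the point about subclassing, here is a minimal sketch. It uses stand-in classes rather than Galaxy's own, and it assumes the tool's format filter is a plain subclass check and that an extension which fails to load is treated as Text, as described above. The names `load_class`, `accepted`, `shared`, `distinct` and `unloaded` are made up for this example.

```python
# Stand-in classes only: they mimic the hierarchy discussed in this thread
# and are NOT imported from Galaxy.
class Text:
    file_ext = "txt"


class Fasta(Text):
    file_ext = "fasta"


class FastaBench(Fasta):
    """Class representing a FASTA bench sequence"""
    file_ext = "fastabench"


def load_class(registry, extension):
    """Hypothetical lookup: an extension that is not registered falls back to Text."""
    return registry.get(extension, Text)


def accepted(registry, wanted_ext, history_exts):
    """Return the history extensions whose class is the wanted class or a subclass."""
    wanted = load_class(registry, wanted_ext)
    return [ext for ext in history_exts
            if issubclass(load_class(registry, ext), wanted)]


# "fastabench" mapped to the plain Fasta class (the original configuration).
shared = {"txt": Text, "fasta": Fasta, "fastabench": Fasta}
# "fastabench" mapped to its own subclass of Fasta.
distinct = {"txt": Text, "fasta": Fasta, "fastabench": FastaBench}
# "fastabench" failed to load, so it is missing from the registry.
unloaded = {"txt": Text, "fasta": Fasta}

history = ["fasta", "fastabench", "txt"]
print(accepted(shared, "fastabench", history))    # fasta and fastabench pass
print(accepted(distinct, "fastabench", history))  # only fastabench passes
print(accepted(distinct, "fasta", history))       # fastabench is still valid fasta
print(accepted(unloaded, "fastabench", history))  # everything passes
```

The last case matches the symptom of a select list that shows every file in the history: the requested format has silently become Text. The stock class named elsewhere in this thread is galaxy.datatypes.data:Text, so if the startup output reports a load failure, it may be worth checking whether the type attribute needs the same galaxy.datatypes prefix (galaxy.datatypes.sequence:FastaBench).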
Also when I choose an already existing (present in datatypes_conf.xml) format like "codcmp", and I add: <param format="codcmp" name="input5" type="data" label="codcmp list example"/> the list shows all the files in the file history tab.
Correct, the codcmp datatype is of class galaxy.datatypes.data:Text, so all text files are valid codcmp files.

Let us know how it goes and if we can be of further assistance.

Thanks,
Dan

On Feb 18, 2010, at 7:09 AM, Iwe Muiser wrote:
Hey Dan,
In the datatypes_conf.xml I added: <datatype extension="fastabench" type="galaxy.sequence:Fasta" display_in_upload="true"/>
Then in the XML file of the tool I added: <param format="fastabench" name="input2" type="data" label="fastabench list example"/>
Also when I choose an already existing (present in datatypes_conf.xml) format like "codcmp", and I add: <param format="codcmp" name="input5" type="data" label="codcmp list example"/> the list shows all the files in the file history tab. It seems to only work well with the very standard formats like fasta, BED, wig and so on. But I can not find the proper place to add my new datatype so that it will be shown in a list.
Iwe
2010/2/17 Daniel Blankenberg <dan@bx.psu.edu>
Hi Iwe,
Can you provide copies of the relevant sections of your datatypes_conf.xml, the actual datatype you created as well as the tool's xml file?
Thanks,
Dan
On Feb 17, 2010, at 8:23 AM, Daniel Blankenberg wrote:
Begin forwarded message:
From: Iwe Muiser <e.c.muiser@gmail.com>
Date: February 17, 2010 6:14:24 AM EST
To: Daniel Blankenberg <dan@bx.psu.edu>
Subject: Re: [galaxy-bugs] Fwd: [galaxy-dev] Installing data for Galaxy
Hello Dan,
I tried that and it's not working. I also removed the underscores, which doesn't seem to help either. The form keeps showing all the files in my history instead of only the files with the new datatype I set. I suspect I did not set the datatype correctly in some file, but I can't find which one it should be.
Iwe
2010/2/16 Daniel Blankenberg <dan@bx.psu.edu>
Hello Iwe,
Have you tried using all lowercase characters for the extension of your new datatype, including in the datatypes_conf.xml file?
Thanks,
Dan
On Feb 16, 2010, at 10:55 AM, Guruprasad Ananda wrote:
Hi Iwe,
I'm forwarding your message to galaxy-bugs mailing list so that the right person can help you.
Thanks, Guru.
Begin forwarded message:
---------- Forwarded message ----------
From: Iwe Muiser <e.c.muiser@gmail.com>
Date: 2010/2/11
Subject: Re: [galaxy-dev] Installing data for Galaxy
To: Guruprasad Ananda <gua110@bx.psu.edu>
Hello Guru,
I have another question concerning adding a data format. I added a more stringent version of fasta to the datatypes as instructed on the website (adding to data_types.xml, building a sniffer, adding this to registry.py). I also added a PWM datatype and installed a tool that needs PWM data as input (I set format="PFMsimple"). If I upload a PWM file, it is automatically recognized as such. However, when I want to make a file selection list for the tool that only shows the PWM files, I run into problems: instead of only showing the PWM format files, it shows all files present in the history. It does seem to work with fasta (or all regular file formats, for that matter). I clearly forgot to set something right. Do you have a clue as to where to start?
Thanks!
Iwe
On Feb 5, 2010, at 4:49 AM, Iwe Muiser wrote:
Hey Guru,
Nate does not seem to answer. Is he absent or did something go wrong with the CC-ing?
Greetings,
Iwe
2010/2/3 Guruprasad Ananda <gua110@bx.psu.edu>:
> Hey Iwe,
>
> I'm glad you got it running! Please don't hesitate to email us if you have any questions in the future.
>
> Thanks for using Galaxy,
> Guru.
>
> On Feb 3, 2010, at 4:59 AM, Iwe Muiser wrote:
>
>> Hey Guru,
>>
>> Thanks for the good description and cc-ing my other issue. I have
>> installed several organisms now and it seems to be working perfectly.
>>
>> Greetings,
>>
>> Iwe
>>
>> 2010/2/2 Guruprasad Ananda <gua110@bx.psu.edu>:
>>> Hi Iwe,
>>>
>>> Yes, the path in alignseq.loc should contain the filename as well. For instance, here's how my alignseq.loc entry looks:
>>> seq hg18 /Users/guru/Desktop/hg18.2bit.2bit
>>> In fact, I tried downloading the 2bit file from UCSC just now, and updated my alignseq.loc and ran galaxy. The 'Extract genomic DNA' tool worked just fine and returned sequences. The only reason I could think why it isn't working for you is the alignseq.loc entry issue. Also, for some reason, the UCSC file has ".2bit" repeated twice in its name. I'm sure you might have noticed that already. If not, please make sure the file name is ok.
>>>
>>> Also, I'm forwarding your email to Nate (on cc), who is the best person to help you with the issues you are facing with executing your binary.
>>>
>>> Please let us know if you have any further questions,
>>> Guru.
>>>
>>> On Feb 2, 2010, at 5:51 AM, Iwe Muiser wrote:
>>>
>>>> Hey Guru,
>>>>
>>>> There are some things I don't understand. I added the directory
>>>> containing my binary executable to the PATH. If I open a terminal, it
>>>> is able to find it by simply typing "32cotrasif_gw". When I make
>>>> Galaxy execute the binary I get the following error: "An error
>>>> occurred running this job: /bin/sh: 32cotrasif_gw: not found". It
>>>> seems Galaxy uses sh to execute which does not seem to look at .bashrc
>>>> or PATH variables.
>>>> So I tried to set the interpreter to "bash". This gives the following
>>>> error: "/home/muiser/Programs/galaxy_dist/tools/_motif_disc/32cotrasif_gw:
>>>> /home/muiser/Programs/galaxy_dist/tools/_motif_disc/32cotrasif_gw:
>>>> cannot execute binary file".
>>>>
>>>> "bash -c" needs quotes everywhere and makes the whole process
>>>> unnecessarily complicated from my point of view (and it has never
>>>> worked either).
>>>>
>>>> A python wrapper script around cotrasif does make the output file
>>>> (dataset_X.dat) but still gives me a red results item because it
>>>> interprets the script's standard output as faulty somehow.
>>>>
>>>> Do you have any ideas that might help me out?
>>>>
>>>> Also, the path to my hg18.2bit is set correctly. Mustn't I set the
>>>> filename in this path as well? Or just the path to the file itself?
>>>>
>>>> Greetings,
>>>>
>>>> Iwe
>>>>
>>>> 2010/2/1 Guruprasad Ananda <gua110@bx.psu.edu>:
>>>>> Hi Iwe,
>>>>>
>>>>> No, you don't need fasta files. Just the 2bit file should do. On looking at the python code for the tool, it seems like your PATH_TO_hg18.2bit may not be set correctly. Please make sure the path is right (and absolute) and try running the tool again.
>>>>>
>>>>> About your tool integration issue: please try using this format in the xml file, and let me know if it works fine:
>>>>> <command>
>>>>> myBinary $input $output
>>>>> </command>
>>>>> Also, please make sure that the binary is in your PATH.
>>>>>
>>>>> Thanks,
>>>>> Guru.
>>>>>
>>>>> On Feb 1, 2010, at 1:17 PM, Iwe Muiser wrote:
>>>>>
>>>>>> Hello Guru,
>>>>>>
>>>>>> Thanks for the tip. It kind of works (no more red errors) except that
>>>>>> it still doesn't actually fetch the sequences.
>>>>>> The error is:
>>>>>> "empty, format: fasta, database: hg18
>>>>>> Info: 55 warnings, 1st is: Chromosome by name 'chr1' was not found for
>>>>>> build 'hg18'.
>>>>>> Skipped 55 invalid lines, 1st is #1, "chr1 1259997 1260525 MACS_peak_6 0 +""
>>>>>>
>>>>>> Do I also need the actual fasta files stored somewhere? I have them
>>>>>> but I don't know where to put them so that galaxy can find them.
>>>>>>
>>>>>> Another problem I have is with integrating tools. I have a binary
>>>>>> executable that gives a lot of trouble. I have tried everything from
>>>>>> making a wrapper python script to all kinds of combinations with "./"
>>>>>> and bash -c. Also I have put quotes pretty much everywhere. Now,
>>>>>> finally, I have it working with a wrapper script, only Galaxy gives me
>>>>>> a message that the script's stdout and stderr are actual errors while
>>>>>> the file is being generated perfectly (dataset_X.dat). Do you know
>>>>>> what is going on?
>>>>>> I hope I made it clear enough because it is a nasty problem to explain.
>>>>>>
>>>>>> Iwe
>>>>>>
>>>>>> 2010/2/1 Guruprasad Ananda <gua110@bx.psu.edu>:
>>>>>>> Hello Iwe,
>>>>>>> I apologise for not covering data integration in detail on our wiki.
>>>>>>> Here's how you add a new sequence to your Galaxy instance. In your case, for
>>>>>>> hg18, please download hg18.2bit file
>>>>>>> from http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/, and then
>>>>>>> update tool-data/alignseq.loc file with the following line:
>>>>>>> seq hg18 PATH_TO_hg18.2bit
>>>>>>> Also, in future if you want to find out which ".loc" file to use, you can
>>>>>>> check the validator tag under the input parameter in the tool's xml file.
>>>>>>> Please let us know if you need any more information.
>>>>>>> Thanks,
>>>>>>> Guru
>>>>>>> Galaxy team.
>>>>>>>
>>>>>>> On Feb 1, 2010, at 9:39 AM, Iwe Muiser wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I have some questions about installing data onto my Ubuntu linux box. Mainly
>>>>>>> genomic data that can be used by the extract_genomic_dna.py script. I have
>>>>>>> done some attempts to get this working but now I'm quite stuck. First I
>>>>>>> downloaded this data from the UCSC ftp site.
>>>>>>> (ftp://hgdownload.cse.ucsc.edu/goldenPath/hg18/chromosomes to be precise).
>>>>>>> Then I added the following line to "faseq.loc" as instructed: "hg18
>>>>>>> /home/muiser/data/Human_data/HG18/chromosomes".
>>>>>>> After this I restarted the galaxy daemon but nothing seems to have happened.
>>>>>>> I have the idea that I am missing certain steps.
>>>>>>>
>>>>>>> The ultimate goal is to get a local galaxy install working like the one on
>>>>>>> "http://main.g2.bx.psu.edu/"
>>>>>>>
>>>>>>> The wiki is helpful to install new tools and to get an idea of how stuff
>>>>>>> works but sparsely mentions data integration. Except for MAFs which I tried
>>>>>>> as well. I'm now building an index for these .maf files which I assume will
>>>>>>> take quite some time.
>>>>>>>
>>>>>>> I hope you can help me out a bit.
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>>
>>>>>>> Iwe Muiser
>>>>>>> _______________________________________________
>>>>>>> galaxy-dev mailing list
>>>>>>> galaxy-dev@lists.bx.psu.edu
>>>>>>> http://lists.bx.psu.edu/listinfo/galaxy-dev
>>>>>>>
>>>>>>> Guruprasad Ananda
>>>>>>> Graduate Student
>>>>>>> Bioinformatics and Genomics
>>>>>>> The Pennsylvania State University
>>>>>>
>>>>>> --
>>>>>> Master student
>>>>>> Bioinformatics & Gene Regulation Group
>>>>>> NTNU Trondheim
>>>>>> +4594157082
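The wrapper-script problem described in the quoted messages above (the output file is written, yet the history item turns red) is not resolved anywhere in this thread. One possible approach is sketched below, under the unconfirmed assumption that the job is flagged because the binary writes to stderr. The function names `run_quietly` and `wrap` are made up for this example, and the `demo` command is only a stand-in for the real binary.

```python
import subprocess
import sys


def run_quietly(command):
    """Run a command and capture its output instead of letting it print."""
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode captured bytes to str
    )
    return proc.returncode, proc.stdout, proc.stderr


def wrap(command):
    """Return the exit code; pass stderr on only if the command failed."""
    code, _out, err = run_quietly(command)
    if code != 0:
        sys.stderr.write(err)
    return code


# Stand-in for the real binary: it succeeds, but writes a note to stderr.
demo = [sys.executable, "-c",
        "import sys; sys.stderr.write('working\\n'); print('done')"]

code, out, err = run_quietly(demo)
print(code, out.strip(), err.strip())
```

In a tool's wrapper, `wrap` would be called with the binary and its arguments, so that harmless chatter on stderr is swallowed and only a real failure (non-zero exit code) is reported.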
-- Iwe EC Muiser Master student NTNU - Department of Cancer Research and Molecular Medicine Bioinformatics & Gene Regulation Group Laboratory Centre, 5th floor. Erling Skjalgssons gt. 1 Trondheim +4594157082
_______________________________________________ galaxy-bugs mailing list galaxy-bugs@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-bugs