How to get raw file name of the uploaded dataset
Dear all,

May I ask how to get the raw file name of the uploaded dataset? The value of $input assigned from <param name="input" type="data" format="xml"> is a file name like "dist-path/database/files/000/dataset_12.dat". However, I really need to get the raw file name like "abc.xml", which seems to be stored as the attribute "name" of the uploaded dataset. How can I get the raw file name? Please help me. Thank you very much.

Yingwei HU
The field $input.name will give you the visible name from the history.

But those names are usually not very useful, since they tend to be something like "Intersect on Data 23 and Data 24".

I have begun to change the labels for most of the tools to something more useful, so that it is clear from the label what the original file was. I did this specifically for FASTQ files, mapping, etc. So when I start with a FASTQ file called "165_1_ACGTAG.fastq" and I run BWA, I get "165_1_ACGTAG-bwa.sam". When I then convert that to BAM, I get "165_1_ACGTAG-bwa.bam", and when I change the header, I get something like "165_1_ACGTAG-bwa-ReHead.bam", etc.

The trick I use for that is something like this, for BWA for instance:

label="#echo os.path.splitext ( str ( $paired.input1.name ) ) [ 0 ] #-PE-bwa.sam"

Later, in a different tool I use:

label="#echo os.path.splitext ( str ( $inputFile.name ) ) [ 0 ] #-ReHead.bam"

That extracts the base name of the original input file name ($inputFile.name), i.e. everything before the last period, and appends "-ReHead.bam" to it. If the name contains no period, it is used unchanged.

This way it becomes much clearer what the file contains and what its history is.

It uses the Cheetah syntax (Cheetah Users' Guide <http://www.cheetahtemplate.org/docs/users_guide_html/>) to extract and reformat the strings. Everything between the two hash marks is treated as Cheetah code.

It is not a trivial task to change all the labels, and sometimes you have to fudge a little. In the case of multiple input files, for instance, I just choose the first. It also becomes trickier when there are conditional statements (such as in BWA); then you sometimes have to duplicate the code to make it work. But it is possible.

I have attached a few of my .xml files for the most popular tools. Hope they help.

Thon

In reply to the message from Yingwei HU <husince@gmail.com> of Feb 24, 2012, 07:05 AM
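For anyone who wants to check what the expression between the hash marks produces before editing a tool's .xml file, here is a minimal sketch in plain Python. It only reproduces the os.path.splitext call from the labels above; `derived_label` is a hypothetical helper name used for illustration and is not part of Galaxy or Cheetah, and the third file name is made up to show the double-extension case.

```python
import os.path


def derived_label(input_name, suffix):
    """Rebuild the label the way the Cheetah expression does (illustrative helper)."""
    # str() mirrors the template, which wraps the dataset name in str() first.
    # splitext removes only the last extension; [0] keeps the part before it.
    stem = os.path.splitext(str(input_name))[0]
    return stem + suffix


# The two cases described in the message above.
print(derived_label("165_1_ACGTAG.fastq", "-PE-bwa.sam"))
print(derived_label("165_1_ACGTAG-bwa.bam", "-ReHead.bam"))

# A name without a period is kept whole.
print(derived_label("Intersect on Data 23 and Data 24", "-ReHead.bam"))

# Only the last extension is stripped, so ".fastq" survives here.
print(derived_label("reads.fastq.gz", "-ReHead.bam"))
```

The last call is worth noting: with compressed inputs the intermediate extension stays in the label, so a tool that expects such names may need an extra splitext call.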
Dear Thondeboer,

Thank you very much for your help. I have tried your method. It works well.

Best Regards,
HU Yingwei

In reply to the message from <thondeboer@me.com> of 2012/2/25
participants (2)
- thondeboer@me.com
- Yingwei HU