Hi Graham,

This may be the long way around the transformation, but the workflow shared here will convert the identifiers without requiring any programming or regular expression knowledge:

http://main.g2.bx.psu.edu/u/jen-bx-galaxy-edu/w/transform-fastq-nameidentife...

To use this:

1 - log into Galaxy and switch histories to one containing this dataset (if needed)
2 - click on the link above
3 - click on "Import workflow" at the top of the page, right of center, next to the green "+" icon
4 - on the "Import successful" page, click on "start using this workflow"
5 - on the "Your workflows" page, click on the down arrow at the end of "imported: Transform fastq name/identifier" to open the menu, then click on "Run" (second choice in the list). If you ever need to reach this page again, just click on "Workflow" in the top menu bar.
6 - your history from step 1 will now display, with the workflow in the center panel
7 - set "Step 1: Input dataset", annotated as "CASAVA 1.8+ FASTQ file", to the FASTQ file with identifiers like: "@N57638:1:64JU0AAXX:1:1:1057:943 1:Y:0:"
8 - click on "Run workflow"

When the workflow has run to completion, the intermediate datasets will be hidden, leaving only the final dataset: a groomed FASTQ file (quality score type "Sanger").

Hopefully this helps. Feel free to make changes; the imported copy of the workflow is yours to modify.

Best,

Jen
Galaxy team

On 9/19/11 2:19 AM, graham etherington (TSL) wrote:
Hi, I currently have read names with the format: @N57638:1:64JU0AAXX:1:1:1057:943 1:Y:0: and would like to change them to the format: @N57638:1:64JU0AAXX:1:1:1057:943/1
I use Manipulate FASTQ, on all reads and set 'Manipulate Reads on:' to 'Name/Identifier', ('String Translate' becomes the only option). I then set the 'From:' field to '1:Y:0:' and the 'To:' field to '/1' (without the literal quotes). I get the following error:
Traceback (most recent call last):
  File "/home/home/galaxy/software/galaxy-central/tools/fastq/fastq_manipulation.py", line 37, in
    main()
  File "/home/home/galaxy/software/galaxy-central/tools/fastq/fastq_manipulation.py", line 25, in main
    new_read = fastq_manipulator.match_and_manipulate_read( fastq_read )
  File "/home/home/galaxy/software/galaxy-central/database/job_working_directory/942/tmpgp13Qy", line 15, in match_and_manipulate_read
    new_read = manipulate_read( fastq_read )
  File "/home/home/galaxy/software/galaxy-central/database/job_working_directory/942/tmpgp13Qy", line 8, in manipulate_read
    new_read.identifier = "@%s" % new_read.identifier[1:].translate( maketrans( binascii.unhexlify( "313a593a303a" ), binascii.unhexlify( "2f31" ) ) )
ValueError: maketrans arguments must have same length
So, do the From and To fields really need to be the same length? This seems rather strange and unhelpful. Am I doing something wrong?
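A note on the same-length question: the traceback shows the tool calling Python's maketrans/translate, which maps single characters to single characters rather than replacing one substring with another. That is why the two fields must be the same length. The sketch below reproduces the behaviour outside Galaxy, assuming Python 3, where str.maketrans has the same restriction as the Python 2 function in the traceback. The variable names are illustrative only and are not part of the Galaxy tool.

```python
# Character-for-character translation: 'a'->'x', 'b'->'y', 'c'->'z'.
table = str.maketrans("abc", "xyz")
translated = "aabbcc".translate(table)
print(translated)

# The From/To values from the post have different lengths (6 vs 2),
# so building the translation table fails before any read is touched.
failed = False
try:
    str.maketrans("1:Y:0:", "/1")
except ValueError as err:
    failed = True
    print("maketrans failed:", err)

# A substring replacement is what the edit actually needs. Note the
# leading space in the search string, so that it is removed as well.
name = "@N57638:1:64JU0AAXX:1:1:1057:943 1:Y:0:"
converted = name.replace(" 1:Y:0:", "/1")
print(converted)
```

Even with equal-length fields, a character translation would also rewrite every ":" and "1" elsewhere in the identifier, so it is not a good fit for this particular change.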
Many thanks, Graham
Dr. Graham Etherington Bioinformatics Support Officer, The Sainsbury Laboratory, Norwich Research Park, Norwich NR4 7UH. UK
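For readers who are comfortable with a short script, the same identifier change can be sketched with a regular expression outside Galaxy. This is only an illustration, not what the shared workflow does internally: the helper names to_slash_style and convert_fastq are hypothetical, the pattern assumes identifiers shaped like the example in this thread (read number, Y/N filter flag, control number, optional index), and the FASTQ is assumed to use exactly four lines per record.

```python
import re

# Assumed identifier layout: "@<name> <read>:<Y|N>:<control>:<index>"
CASAVA_RE = re.compile(r"^(@\S+) ([12]):[YN]:\d+:\S*$")


def to_slash_style(identifier):
    """Turn '@name 1:Y:0:' into '@name/1'; leave other lines unchanged."""
    match = CASAVA_RE.match(identifier)
    if match is None:
        return identifier
    return "{}/{}".format(match.group(1), match.group(2))


def convert_fastq(lines):
    """Rewrite the header (every 4th line, starting at 0) of a FASTQ stream."""
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        # Only header lines are rewritten; a quality line may also start
        # with '@', so the position in the record decides, not the prefix.
        yield to_slash_style(line) if i % 4 == 0 else line


record = [
    "@N57638:1:64JU0AAXX:1:1:1057:943 1:Y:0:",
    "ACGT",
    "+",
    "@III",
]
for out_line in convert_fastq(record):
    print(out_line)
```

Lines that do not match the assumed pattern pass through unchanged, so a file that already uses the "/1" style is left as it is.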
-- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/Support