Unix-Tools package - (small) update
Hello,

I've updated the unix-tools package.
http://hannonlab.cshl.edu/galaxy_unix_tools/

1. Created a combined "Find and Replace" tool, which works on either whole lines or a single column, and supports both simple-string and regular-expression find & replace.
The purpose of this tool is to give users a way to replace text in tabular text files. Without it, users need to save the file and perform the replacements in Excel. Being a Perl script, it works on large files with millions of lines (on which Excel chokes).
A usage example would be: find all words in column 2 which start with a digit, and add a "chr" prefix, effectively converting those drosophila "4L" chromosomes into "chr4L".
See screenshot at: http://hannonlab.cshl.edu/galaxy_unix_tools/galaxy.html#find_and_replace

2. "Select lines by Word-List" tool.
This tool accepts two files: one which will be filtered, and another which contains a list of words to match. If a line from the first file matches one of the words from the second file, it is printed to the output dataset. This tool offers functionality similar to the "advanced filter" option in Excel.
While it is possible to achieve the same functionality by building a regular expression and using Galaxy's native "select" tool, using this tool is easier and more intuitive (IMHO). Furthermore, this tool can be used as part of a workflow.
A usage workflow example: get the DM3 repeat masker track from UCSC, group by CLASS + count, sort descending by count, select the first 10 lines, and cut the first column (this is the word list to filter by). Then use this tool to filter the repeat masker file with the words in the word list. Result: full information for the top ten classes from the repeat masker track.
See screenshot at: http://hannonlab.cshl.edu/galaxy_unix_tools/galaxy.html#grep_word_list

Comments are welcome,
Gordon.
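
For anyone curious about the idea behind the two tools, here is a rough Python sketch. It is not the code from the package (the Find and Replace tool is a Perl script); the function names `replace_in_column` and `select_by_word_list` are made up for illustration, and the whole-word matching in the second function is an assumption about how a "word" should match.

```python
import re


def replace_in_column(lines, column, pattern, replacement):
    """Regex find & replace restricted to one tab-separated column (1-based).

    Hypothetical helper: only illustrates the column mode described above.
    """
    regex = re.compile(pattern)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Lines with too few columns are passed through unchanged.
        if column <= len(fields):
            fields[column - 1] = regex.sub(replacement, fields[column - 1])
        yield "\t".join(fields)


def select_by_word_list(lines, words):
    """Yield the lines that contain at least one word from the word list.

    Assumption: a word must match as a whole word, not as a substring.
    """
    cleaned = [w.strip() for w in words if w.strip()]
    if not cleaned:
        return  # an empty word list selects nothing
    alternatives = "|".join(re.escape(w) for w in cleaned)
    # Lookarounds keep "LINE" from matching inside "LINEAR".
    regex = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")
    for line in lines:
        if regex.search(line):
            yield line.rstrip("\n")


# Add a "chr" prefix to column-2 values that start with a digit.
rows = ["a\t4L\t10", "b\tX\t20", "c\t2R\t30"]
for row in replace_in_column(rows, 2, r"^(\d)", r"chr\1"):
    print(row)

# Keep only the lines that mention a word from the list.
track = ["chr1\tLINE\tx", "chr2\tSINE\ty", "chr3\tLINEAR\tz"]
for row in select_by_word_list(track, ["LINE", "DNA"]):
    print(row)
```

Both functions read line by line rather than loading the whole file, which is the same reason a script copes with files of millions of lines.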
Assaf Gordon