February 2012 - galaxy-user - lists.galaxyproject.org

bug for mate pairs colorspace to sequence conversion
by Philipp.Berninger＠unibas.ch 15 Feb '12

Hi, I have a problem converting paired-end (mate-pair) reads from ABI from colorspace to sequence. In the color code, my reads start with a G followed by numbers, rather than with a T as in T02102103... I guess the program treats the leading T as hardcoded instead of using the G as the last base. Best, Philipp
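For illustration, here is a minimal Python sketch of colorspace decoding in which the leading primer base is read from the read itself rather than assumed to be T. It assumes the usual two-base encoding (A=0, C=1, G=2, T=3, with each color equal to the XOR of the two adjacent bases). The function name decode_colorspace is hypothetical and is not taken from any Galaxy converter.

```python
# Hypothetical helper for illustration; not part of any Galaxy tool.
BASES = "ACGT"  # index of each base doubles as its 2-bit code


def decode_colorspace(read):
    """Decode a colorspace read such as 'T02102103' into bases.

    The first character is the primer base. It seeds the decoding and is
    not included in the returned sequence.
    """
    primer, colors = read[0].upper(), read[1:]
    if primer not in BASES:
        raise ValueError("read must start with a primer base, got %r" % primer)
    prev = BASES.index(primer)
    decoded = []
    for color in colors:
        if color not in "0123":
            raise ValueError("unexpected color call %r" % color)
        # Each color is the XOR of the previous and the next base code.
        prev = prev ^ int(color)
        decoded.append(BASES[prev])
    return "".join(decoded)


if __name__ == "__main__":
    print(decode_colorspace("T02102103"))  # TCAAGTTA
    print(decode_colorspace("G02102103"))  # GACCTGGC
```

The same color string decodes to a different sequence depending on the primer base, which is why a converter that always assumes T would give wrong output for reads that start with G.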

filtering pile-up fails
by Sebahattin Cirak 14 Feb '12

Dear all, I successfully used the online tool to align Illumina paired-end reads (5 GB per direction) and generated a pileup of 680,000,000 lines. However, filtering the pileup always fails: it runs for several hours and I get an empty file back. I have tried different options and different pileups, always with the same result. Could somebody please help, or tell me what the trick is? Thank you, Sebahattin

Clustering with cuffcompare or cuffdiff results
by Zhang Xiaoyu 14 Feb '12

Dear Sir or Madam,

I am planning to cluster several libraries based on the output of cuffcompare or cuffdiff, since they allow me to construct a matrix whose columns represent the libraries and whose rows are the counts of transcripts or genes. I want to construct this matrix because it is the required input format for many RNA-seq clustering packages, e.g. baySeq and HTSCluster. However, the answer to the question "I want to find differentially expressed genes. Can I use Cufflinks in conjunction with count-based differential expression packages?" in the Cufflinks FAQ suggests not converting FPKM values to count data. My questions are:

1. It seems better to run everything up to cuffdiff, but does cuffdiff allow multiple-sample comparison? I read somewhere that even for multiple samples it still compares them pairwise. Since clustering needs some quantitative data source to do the merging, will cuffdiff provide quantitative measures, rather than only the test score and p-value, which are too qualitative to include?

2. If I really need to get count data from the FPKM values, how do I obtain the "effective length" mentioned in the FAQ? Would it be better to treat each assembled transcript, rather than each gene, as an object in the clustering? What does "you'd be throwing away Cufflinks' uncertainty" mean, even when using isoforms as objects? How should I include the uncertainty in my clustering?

Best,
Sherry

Solution for: Error running cuffdiff. Error: cannot open reference GTF file CONDITION, CONTROL for reading
by Carlos Borroto 14 Feb '12

Hi,

I ran into this error running cuffdiff. I had a hard time debugging this user error, so I thought it would be nice to share the solution. This was on a local instance, but I don't see why it wouldn't happen on Galaxy Main under the same circumstances.

Tool execution generated the following error message:

    Error running cuffdiff. Error: cannot open reference GTF file CONDITION,CONTROL for reading

The tool produced the following additional output:

    cuffdiff v1.3.0 ()
    cuffdiff --no-update-check -q -p 4 -c 1000 --FDR 0.050000 -N -b --labels CONDITION,CONTROL /local/db/genomes/illumina/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/genes.gtf /local/opt/galaxy_central/database/files/001/dataset_1339.dat,/local/opt/galaxy_central/database/files/001/dataset_1391.dat /local/opt/galaxy_central/database/files/001/dataset_1452.dat,/local/opt/galaxy_central/database/files/001/dataset_1478.dat

The problem ended up being the use of "Perform Bias Correction" (-b) together with a GTF file that has no "Database/Build" associated. Looking at the cuffdiff wrapper, I found that if a FASTA reference is not selected from the history, the FASTA reference of the build associated with the GTF file is used. If there is no build association, -b is passed without a value, the following arguments are misread, and the cuffdiff run fails with this not-so-helpful error.

My feeling is that cuffdiff should check for a non-dashed string after '-b' and complain if it is absent, but this doesn't happen currently.

Kind regards,
Carlos
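The kind of check suggested above could be sketched in Python as follows. This is only an illustration under stated assumptions: the function name options_missing_value and the set of value-taking options are hypothetical, and nothing here is taken from cuffdiff or from the Galaxy wrapper. The idea is simply to flag any option that expects a value but is followed by another dashed token or by nothing.

```python
# Hypothetical pre-flight check; not part of cuffdiff or the Galaxy wrapper.
def options_missing_value(argv, options_with_value):
    """Return the options in argv that expect a value but did not get one.

    An option counts as missing its value when it is the last token or when
    the next token starts with '-' (i.e. looks like another option).
    """
    missing = []
    for i, token in enumerate(argv):
        if token in options_with_value:
            following = argv[i + 1] if i + 1 < len(argv) else None
            if following is None or following.startswith("-"):
                missing.append(token)
    return missing


if __name__ == "__main__":
    # Shortened version of the failing command line from the post.
    argv = ["cuffdiff", "--no-update-check", "-q", "-p", "4", "-c", "1000",
            "--FDR", "0.050000", "-N", "-b", "--labels", "CONDITION,CONTROL",
            "genes.gtf"]
    # Assumed set of options that take a value in this invocation.
    takes_value = {"-p", "-c", "--FDR", "-b", "--labels"}
    print(options_missing_value(argv, takes_value))  # ['-b']
```

Run against the command line from the error report, the check points at -b, which is more direct than the "cannot open reference GTF file" message.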

Large local NGS files for FASTQ Groomer
by Arthur Zheng 14 Feb '12

Hi,

I have downloaded and installed a local instance of Galaxy on a Linux server under my user account, following the instructions here: http://main.g2.bx.psu.edu/

Then I ran the following command:

    sh run.sh

and accessed Galaxy through the local Firefox browser on the server at http://localhost:8080

Now I am trying to run FASTQ Groomer on some NGS files. Each file is already on the server's disk, but they are very large (~8 GB each). I was not able to use the "upload file from your computer" function under the "Get Data" tab (maybe because each file is too large). What am I supposed to do?

Thank you!
Arthur

2012 Galaxy Community Conference (GCC2012): Now Accepting Abstracts
by Dave Clements 13 Feb '12

Hello all,

Abstracts <http://wiki.g2.bx.psu.edu/Events/GCC2012/Abstracts> are now being accepted for oral presentations at the 2012 Galaxy Community Conference (GCC2012) <http://wiki.g2.bx.psu.edu/Events/GCC2012>. Submissions on any topics of interest to the Galaxy community are encouraged. Areas of interest include, but are not limited to:

- Best practices for local Galaxy installation and management
- Integrating tools and/or data sources into the Galaxy framework
- Deploying Galaxy on different infrastructures
- Compelling or novel uses of Galaxy for biomedical analysis

See the GCC2011 program <http://wiki.g2.bx.psu.edu/Events/GCC2011> for an idea of the breadth of topics that can be covered.

Oral presentations will be approximately 15-20 minutes long, including time for questions and answers. There will also be an opportunity for lightning talks, which will be solicited at the meeting.

The submission deadline is April 16. See the GCC2012 Abstracts <http://wiki.g2.bx.psu.edu/Events/GCC2012/Abstracts> page for more details and how to submit.

GCC2012 <http://wiki.g2.bx.psu.edu/Events/GCC2012> will be held July 25-27 in Chicago, Illinois, United States. The main meeting will run for two full days <http://wiki.g2.bx.psu.edu/Events/GCC2012/Program>, and will be preceded by a full day of training workshops <http://wiki.g2.bx.psu.edu/Events/GCC2012/Program>. If you are a bioinformatics tool developer, data provider, workflow developer, power bioinformatics user, sequencing or bioinformatics core staff, or a data and analysis archival specialist, then GCC2012 is relevant to you.

Registration will open in March.

GCC2012 is hosted by the University of Illinois at Chicago <http://uic.edu/>, the University of Illinois at Urbana-Champaign <http://illinois.edu/>, and the Computation Institute <http://www.ci.anl.gov/>.

Links:
http://galaxyproject.org/GCC2012
http://galaxyproject.org/wiki/Events/GCC2012/Abstracts

Thanks,
Dave Clements
--
http://galaxyproject.org/
http://getgalaxy.org/
http://usegalaxy.org/
http://galaxyproject.org/wiki/

Result of annotation hg18 and hg19 on one file
by La Chi 13 Feb '12

Hi everyone, how can I write the results of both functions to one single file, so that the history panel then shows the combined result? Thanks

Is the result of a Galaxy tool (e.g. FASTQ Groomer) automatically saved somewhere on a local Linux server?
by Arthur Zheng 13 Feb '12

Hi, I have downloaded and installed a local instance of Galaxy on a Linux server under my user account, following the instructions here: http://main.g2.bx.psu.edu/ I then created a library and linked some datasets on the local server. Then I ran FASTQ Groomer on some of the datasets, and the results were shown under "History". I am wondering whether the results were saved automatically somewhere on the local server. If so, what is the default path? Thanks. Arthur

output file
by La Chi 12 Feb '12

Hi,

I have added a tool to Galaxy, and when I execute it the desired result shows up in the history panel. Now I have added another tool, and I want its result to appear in the same file that was the output of my first tool. Here is the relevant code from both the Python and XML files. Thanks.

Python:

    import sys

    def __main__():
        bpup = sys.argv[1]
        bpdwn = sys.argv[2]
        userid = sys.argv[3]
        output_name = sys.argv[4]  # path Galaxy substitutes for $out_file1
        out = open( output_name, 'w' )

XML:

    <tool -------------->
      <command interpreter="python">mytool.py $bpup $bpdwn $userid $out_file1</command>
      <inputs> some inputs </inputs>
      <outputs>
        <data format="txt" name="out_file1" />
      </outputs>
    </tool>

Column Concatenation
by La Chi 10 Feb '12

Hi,

I am trying to fetch multiple columns from a table. Here is my code:

    for row in result_set:
        db += [(row['chr'], str(row['user_id']), False),]

I want to concatenate the 'start' and 'end' columns with this output. How can I do that? Thanks
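One possible way to do this, sketched in Python under assumptions: each row is taken to behave like a dictionary with 'chr', 'start', 'end' and 'user_id' keys, and the combined label format chr:start-end is only a guess at what is wanted. The function name build_options is hypothetical.

```python
# Hypothetical helper; the row keys follow the snippet in the post, and the
# "chr:start-end" label format is an assumption.
def build_options(result_set):
    """Build (label, value, selected) tuples with start/end joined to chr."""
    options = []
    for row in result_set:
        # Concatenate the chromosome with the start and end columns.
        label = "%s:%s-%s" % (row['chr'], row['start'], row['end'])
        options.append((label, str(row['user_id']), False))
    return options


if __name__ == "__main__":
    rows = [{'chr': 'chr1', 'start': 100, 'end': 200, 'user_id': 7}]
    print(build_options(rows))  # [('chr1:100-200', '7', False)]
```

The %s formatting converts numeric start and end values to strings, so no explicit str() call is needed for those columns.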
