May 2012 - galaxy-user - lists.galaxyproject.org

Inconsistency between cufflinks and cuffdiff. This is killing me.
by Jia Meng 18 Jun '12


<<Problem>>
The cufflinks and cuffdiff results are not consistent with each other. This is killing me. Does it make sense?

<<Observation>>
We have 3 control samples and 3 treated samples. For many genes, the FPKM values from cufflinks and cuffdiff are far from consistent. For example, the cufflinks FPKM values for one gene are:
control group (samples 1, 2, 3): 0, 0, 4.8
treated group (samples 1, 2, 3): 0, 0, 6.0
In cuffdiff, the estimated FPKM values are:
control group: 12.6
treated group: 2.0

<<Method>>
Use the UCSC gene annotation GTF file, mm9, downloaded from the UCSC table database.
Use cufflinks on each individual sample.
Cufflinks: Galaxy mirror at Cistrome, minimal count: 10, no quantile normalization, use GTF as reference, no background correction.
Use cuffdiff on the treated group (3 biological replicates) and the control group (3 biological replicates).
Cuffdiff: Galaxy mirror at Cistrome, minimal count: 10, no normalization, use GTF as reference, no background correction.

<<Additional Comments>>
Cufflinks returns 55350 transcripts, while cuffdiff returns 55418 transcripts, even though they use the same gene annotation GTF file. For the 6 cufflinks results (corresponding to the 6 samples), the transcript IDs are all the same, but their order is not.

<<Question>>
Does it make sense? Or did I do anything wrong?
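To make the size of the discrepancy concrete, the numbers quoted in the post can be put side by side. The sketch below takes the plain mean of the three per-sample cufflinks FPKM values in each group and compares it with the single cuffdiff estimate. The plain mean is only a rough yardstick chosen for illustration; cuffdiff is not assumed to compute its estimate this way. The dictionary and function names are made up for this example.

```python
from statistics import mean

# FPKM values copied from the post; the layout of these dicts is illustrative.
cufflinks_fpkm = {"control": [0.0, 0.0, 4.8], "treated": [0.0, 0.0, 6.0]}
cuffdiff_fpkm = {"control": 12.6, "treated": 2.0}


def compare_groups(per_sample, per_group):
    """Return {group: (mean of per-sample FPKM, group FPKM, difference)}."""
    summary = {}
    for group, values in per_sample.items():
        group_mean = mean(values)
        summary[group] = (group_mean, per_group[group], per_group[group] - group_mean)
    return summary


summary = compare_groups(cufflinks_fpkm, cuffdiff_fpkm)
for group, (group_mean, estimate, diff) in summary.items():
    print(f"{group}: cufflinks mean={group_mean:.2f} cuffdiff={estimate:.2f} diff={diff:+.2f}")
```

With these numbers the treated group agrees (mean 2.0 against 2.0), while the control group does not (mean 1.6 against 12.6), so the inconsistency is concentrated in one condition.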


Tophat "Mean Inner Distance between Mate Pairs"
by 杨继文 04 Jun '12


Hi all,

When mapping paired-end RNA-seq reads using tophat, we need to enter the "Mean Inner Distance between Mate Pairs". In Galaxy, the help text reads:

This is the expected (mean) inner distance between mate pairs. For example, for paired end runs with fragments selected at 300bp, where each end is 50bp, you should set -r to be 200. There is no default, and this parameter is required for paired end runs.

I think the fragment size (here 300bp) includes not only the length of the paired-end reads but also the length of the adaptors. So maybe the Mean Inner Distance between Mate Pairs should be: fragment length - paired-end read length - adaptor length. Am I right, or did I miss something? Is it necessary to enter an accurate value?

Looking forward to your reply,
Jiwen
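The arithmetic in the help text can be written out as a small sketch. With no adaptor term it reproduces the quoted example (300 - 2 x 50 = 200). The `adaptor_total` parameter encodes the poster's hypothesis that the selected fragment size also includes adaptor sequence; whether that applies depends on how the fragment size was measured, so it is an assumption here, and the function name is made up for illustration.

```python
def mean_inner_distance(fragment_size, read_length, adaptor_total=0):
    """Inner distance between mates: fragment minus adaptors minus both reads.

    adaptor_total is the combined length of adaptor sequence included in
    fragment_size (hypothetical parameter; 0 reproduces the help-text example).
    """
    return fragment_size - adaptor_total - 2 * read_length


# Example from the Galaxy help text: 300bp fragments, 50bp reads.
print(mean_inner_distance(300, 50))
```

A negative result would mean the two mates overlap, which is worth checking before passing the value to -r.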


galaxy cloud not setting up properly
by Randall, Thomas (NIH/NIEHS) [C] 31 May '12


The last few times I have tried to initiate a galaxy instance on the cloud I have gotten messages like the following:

* 18:42:04 - Master starting
* 18:42:05 - Completed initial cluster configuration.
* 18:42:09 - Prerequisites OK; starting service 'SGE'
* 18:42:20 - Configuring SGE...
* 18:42:29 - Successfully setup SGE; configuring SGE
* 18:42:29 - Saved file 'persistent_data.yaml' to bucket 'cm-26cac39701f0918ab9a9dca54f69e925'
* 18:42:29 - Saved file 'cm_boot.py' to bucket 'cm-26cac39701f0918ab9a9dca54f69e925'
* 18:42:29 - Problem connecting to bucket 'cm-26cac39701f0918ab9a9dca54f69e925', attempt 1/5
* 18:42:32 - Saved file 'cm.tar.gz' to bucket 'cm-26cac39701f0918ab9a9dca54f69e925'
* 18:42:32 - Saved file 'test.clusterName' to bucket 'cm-26cac39701f0918ab9a9dca54f69e925'
* 18:44:34 - Initializing a 'Galaxy' cluster.
* 18:44:34 - Retrieved file 'snaps.yaml' from bucket 'cloudman' to 'cm_snaps.yaml'.
* 18:45:25 - Error mounting file system '/mnt/galaxyData' from '/dev/sdg3', running command '/bin/mount /dev/sdg3 /mnt/galaxyData' returned code '32' and following stderr: 'mount: you must specify the filesystem type '
* 18:45:27 - Prerequisites OK; starting service 'Postgres'
* 18:45:27 - PostgreSQL data directory '/mnt/galaxyData/pgsql/data' does not exist (yet?)
* 18:45:27 - Configuring PostgreSQL with a database for Galaxy...
* 18:45:39 - Prerequisites OK; starting service 'Galaxy'
* 18:45:39 - Setting up Galaxy application
* 18:45:40 - Retrieved file 'universe_wsgi.ini.cloud' from bucket 'cloudman' to '/mnt/galaxyTools/galaxy-central/universe_wsgi.ini'.
* 18:45:40 - Retrieved file 'tool_conf.xml.cloud' from bucket 'cloudman' to '/mnt/galaxyTools/galaxy-central/tool_conf.xml'.
* 18:45:40 - Retrieved file 'tool_data_table_conf.xml.cloud' from bucket 'cloudman' to '/mnt/galaxyTools/galaxy-central/tool_data_table_conf.xml.cloud'.
* 18:45:40 - Starting Galaxy...
* 18:45:51 - Saved file 'persistent_data.yaml' to bucket 'cm-26cac39701f0918ab9a9dca54f69e925'
* 18:49:34 - Galaxy daemon not running.
* 18:49:34 - Galaxy service state changed from 'Starting' to 'Error'
* 18:49:35 - Saved file 'persistent_data.yaml' to bucket 'cm-26cac39701f0918ab9a9dca54f69e925'
* 18:49:41 - Galaxy daemon not running.
* 18:49:58 - Galaxy daemon not running.
* 18:50:15 - Galaxy daemon not running.

I am using 861460482541/galaxy-cloudman-2011-03-22, which is supposed to be the current version.

Tom

Thomas Randall, PhD
Bioinformatics Scientist, Contractor
Integrative Bioinformatics
National Institute of Environmental Health Sciences
P.O. Box 12233, Research Triangle Park, NC 27709
randallta2(a)niehs.nih.gov
919-541-2271
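When reading a long CloudMan log like the one above, it can help to pull out only the entries that signal trouble. The following is a minimal sketch, assuming each entry has the "* HH:MM:SS - message" shape seen in the post. The keyword list and the function name are illustrative choices, not part of CloudMan, and the sample lines are copied from the post.

```python
import re

# Matches entries such as "* 18:42:04 - Master starting".
LOG_ENTRY = re.compile(r"^\*?\s*(\d{2}:\d{2}:\d{2}) - (.*)$")
# Illustrative keywords; extend as needed.
KEYWORDS = ("error", "problem", "not running")


def find_problems(log_text):
    """Return (time, message) pairs whose message contains a keyword."""
    hits = []
    for line in log_text.splitlines():
        match = LOG_ENTRY.match(line.strip())
        if match and any(word in match.group(2).lower() for word in KEYWORDS):
            hits.append((match.group(1), match.group(2)))
    return hits


sample_log = """\
* 18:42:04 - Master starting
* 18:42:29 - Problem connecting to bucket 'cm-26cac39701f0918ab9a9dca54f69e925', attempt 1/5
* 18:45:25 - Error mounting file system '/mnt/galaxyData' from '/dev/sdg3', running command '/bin/mount /dev/sdg3 /mnt/galaxyData' returned code '32' and following stderr: 'mount: you must specify the filesystem type '
* 18:45:39 - Prerequisites OK; starting service 'Galaxy'
* 18:49:34 - Galaxy daemon not running.
"""

problems = find_problems(sample_log)
for time, message in problems:
    print(time, message)
```

In this log the first hard failure is the mount error at 18:45:25, which comes before the PostgreSQL and Galaxy start-up messages.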


Question about fetching sequence from genome
by Qianli Shen 31 May '12


Hi,

I want to fetch sequences from the soybean genome according to a GFF file. My GFF3 file and genome file are attached to the email, because the format is not easy to recognize if I paste it into the email. The job keeps reporting this error:

An error occurred running this job:
Traceback (most recent call last):
File "/galaxy/home/g2main/galaxy_main/tools/extract/extract_genomic_dna.py", line 288, in <module>
if __name__ == "__main__": __main__()
File "/galaxy/home/g2main/galaxy_main/tools/extract/extract_genomic_dna.py"

Could you please tell me where the problem is?

Best,
Qianli


Genome Browser Histogram Visualization of Accepted Hits
by Fowler, Trent 31 May '12


Hello,

I am attempting to load accepted hit data from Tophat output into the UCSC Genome Web Browser to visualize sequencing hits in specific genes. However, the BAM files yield tiles and are too large to present through the browser. Is there a better file format to convert to that would allow better visualization, such as histograms?

From word of mouth, I have been told to convert BAMs to BEDs and put the BED files through the browser. However, I notice that Galaxy does not have an option for this, and the oft-used BEDtools appears to involve writing code, which is above my computer abilities. Any tips or solutions on how to obtain histograms from sequencing data would be very welcome.

Thanks,
Trent Fowler
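The histogram asked about here is essentially a coverage track: for each stretch of the genome, how many reads overlap it. The sketch below shows the idea in plain Python on made-up intervals. It is not BEDTools or a Galaxy tool, and reading real BAM files is out of scope. It assumes BED-style intervals (0-based start, end exclusive) and emits rows in the bedGraph layout (chrom, start, end, value), a format the UCSC browser can draw as a histogram.

```python
from collections import defaultdict


def coverage_runs(intervals):
    """Turn (chrom, start, end) intervals into bedGraph-style coverage rows.

    Intervals are BED-style: 0-based start, end exclusive.
    Returns a list of (chrom, start, end, depth) with depth > 0.
    """
    # +1 where an interval starts, -1 where it ends.
    events = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in intervals:
        events[chrom][start] += 1
        events[chrom][end] -= 1

    rows = []
    for chrom in sorted(events):
        depth = 0
        previous = None
        for position in sorted(events[chrom]):
            if depth > 0:
                last = rows[-1] if rows else None
                if last and last[0] == chrom and last[2] == previous and last[3] == depth:
                    # Same depth continues: extend the previous row.
                    rows[-1] = (chrom, last[1], position, depth)
                else:
                    rows.append((chrom, previous, position, depth))
            depth += events[chrom][position]
            previous = position
    return rows


# Hypothetical read intervals, for illustration only.
reads = [("chr1", 0, 10), ("chr1", 5, 15), ("chr1", 20, 25)]
for chrom, start, end, depth in coverage_runs(reads):
    print(chrom, start, end, depth, sep="\t")
```

Because only runs of equal depth are stored, the output is far smaller than the read-level data, which is the reason a coverage track loads where a full BAM does not.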


Re: [galaxy-user] How to control registration
by Zeeshan Ali Shah 31 May '12


What about changing cm.tar.gz in S3?

Zeeshan

On Thu, May 31, 2012 at 9:34 AM, Jorrit Boekel <jorrit.boekel(a)scilifelab.se> wrote:
> Hi Zeeshan
>
> Rebooting leads to upstart running /opt/cloudman/pkg/ec2autorun.py.
> This downloads a boot script, which downloads a cloudman package
> (cm.tar.gz), unpacks it and starts cloudman from there. This will mean a
> new /mnt/cm/universe.wsgi will be in place.
>
> When cloudman starts, it attaches EBS volumes to /mnt/galaxyTools and
> other mount points from snapshots. The snapshots will include
> /mnt/galaxyTools/galaxy-central/universe_wsgi.ini.
>
> Restarting the node will thus eradicate changes. You will have to use the
> persist changes-functionality to create a file persistent_data.yaml in an
> S3 bucket and saved snapshots. AFAIK, it saves when you terminate the
> session in cloudman. When instantiating a new cluster with the *same name*
> (important) as the old one, it will pick the persistent data from the
> bucket and mount your saved snapshots.
>
> good luck,
> jorrit
>
> On 05/30/2012 04:34 PM, Zeeshan Ali Shah wrote:
>
> I tried changing both files
> /mnt/galaxyTools/galaxy-central/universe_wsgi.ini
> and
> /mnt/cm/universe_wsgi.ini.cloud
> but after a reboot of the head node the changes are gone.
>
> Any hint?
>
> Zeeshan
>
> On Thu, May 24, 2012 at 10:30 AM, Roman Valls Guimera <brainstorm(a)nopcode.org> wrote:
>
>> Zeeshan, AFAIK the only settings you can set regarding this are in
>> universe_wsgi.ini through the directives:
>>
>> # Allow unregistered users to create new accounts (otherwise, they will have to
>> # be created by an admin).
>> allow_user_creation = True
>>
>> # Email administrators when a new user account is created
>> # You also need to have smtp_server set for this to work.
>> new_user_email_admin = False
>>
>> I have not seen a moderation system in place, but maybe you can use PDC's
>> Plone forms for this moderation?
>>
>> Hope that helps!
>> Roman
>>
>> On 24 May 2012 at 10:18, Zeeshan Ali Shah wrote:
>>
>> > Hi, Is there any way to moderate Registration on Galaxy portal ? We are
>> > setting up a cluster for internal users but it seems that by default
>> > registration is open for all.
>> >
>> > Can any way we disable the registration and Moderator create users
>> > separately ? OR Moderate the registration process ?
>> >
>> > Zeeshan
>> > ___________________________________________________________
>> > The Galaxy User list should be used for the discussion of
>> > Galaxy analysis and other features on the public server
>> > at usegalaxy.org. Please keep all replies on the list by
>> > using "reply all" in your mail client. For discussion of
>> > local Galaxy instances and the Galaxy source code, please
>> > use the Galaxy Development list:
>> >
>> > http://lists.bx.psu.edu/listinfo/galaxy-dev
>> >
>> > To manage your subscriptions to this and other Galaxy lists,
>> > please use the interface at:
>> >
>> > http://lists.bx.psu.edu/
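The directive quoted in this thread can also be flipped programmatically. The sketch below uses Python's standard configparser on an in-memory sample rather than a real universe_wsgi.ini. The [app:main] section name is an assumption about the file layout, and the function name is made up for illustration. Note that configparser drops comments when it writes the file back, so editing the line by hand may be preferable on a real install.

```python
import configparser
import io

# Small stand-in for universe_wsgi.ini; the [app:main] section name is assumed.
SAMPLE_INI = """\
[app:main]
# Allow unregistered users to create new accounts (otherwise, they will have to
# be created by an admin).
allow_user_creation = True
new_user_email_admin = False
"""


def disable_self_registration(ini_text, section="app:main"):
    """Return ini_text with allow_user_creation set to False (comments are lost)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    parser.set(section, "allow_user_creation", "False")
    output = io.StringIO()
    parser.write(output)
    return output.getvalue()


updated_ini = disable_self_registration(SAMPLE_INI)
print(updated_ini)
```

As the reply about CloudMan in this thread points out, a change made this way on a cloud head node is lost on reboot unless it is persisted.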


Bowtie mapping problem
by Kelkar, Hemant 30 May '12


It is possible that this question has been asked and answered before. I tried searching through the galaxy-user list archives on Nabble but could not find an applicable answer.
-----------------------
Bowtie alignments done using a de-multiplexed Illumina sequence data set (CASAVA v.1.8.2) appear to lead to alignment problems in our local Galaxy install. At first glance this appears to be because the "@SOMETHING<space>READINFO" read names are not handled correctly by bowtie. This is not a Galaxy issue per se, but what are other users doing to avoid this problem? We are using bowtie v. 0.12.7 (going to upgrade soon) at the moment.

Posting a snippet from the alignment file below:

MACHINE_NAME:2:1101:1533:1944 1:N:0:CGATGT 4 * 0 0 * * 0 0 NACGAAACGGGTCGGTCCGTCGGCATAGCGCGCCACGGCCTGCGGATCGG #4=DDFFFHHHFHIJHIJIHIIJJJIIIIIJJIHHFFDDDDDDDDDDDDD XM:i:0
MACHINE_NAME:2:1101:2523:1962 16 chr5 80936209 255 50M * 0 0 CTAAAAGGAAAAATTCCAGGGATTAAGGAACTTGAAGTTAGAAAAACTAN IIJJJIHHJJJIHFEIHIJIIJIJJJJIIJIIGJJIJFGHHHFEDAD=4# XA:i:1 MD:Z:49C0 NM:i:1
MACHINE_NAME:2:1101:1596:1971 16 chr13 41692461 255 50M * 0 0 GCTGAATAATAGTCCATTGTGAACATATACCATGTTTTCTTTATTTTTAN JJJJJJJJJIJJIJJJJJJJJJJJJJJHFJJJJJJJJHHHHHFFFFD=4# XA:i:1 MD:Z:49T0 NM:i:1
MACHINE_NAME:2:1101:2670:1962 1:N:0:CGATGT 4 * 0 0 * * 0 0 NTGCACTCGCCTGGATACCGTCGCCGGTGAGGTGGCATTCGAACACACCC

--Hemant
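If the space-separated part of the read name is indeed what breaks the alignment output, one possible workaround is to truncate each FASTQ header at the first whitespace before running bowtie. The sketch below assumes plain four-line FASTQ records and uses made-up function names. It discards the "1:N:0:CGATGT" field, so it is only suitable when the barcode and filter flag are not needed downstream, and whether it resolves the problem described here is untested.

```python
def strip_read_comment(header):
    """Return a FASTQ header truncated at the first whitespace."""
    return header.split(None, 1)[0]


def clean_fastq_lines(lines):
    """Yield FASTQ lines with each header (first line of every 4) truncated.

    Assumes strict four-line records; sequence, '+' and quality lines are
    passed through unchanged, even if a quality string starts with '@'.
    """
    for index, line in enumerate(lines):
        text = line.rstrip("\n")
        if index % 4 == 0 and text.startswith("@"):
            text = strip_read_comment(text)
        yield text


# One record built from the first read in the post.
record = [
    "@MACHINE_NAME:2:1101:1533:1944 1:N:0:CGATGT\n",
    "NACGAAACGGGTCGGTCCGTCGGCATAGCGCGCCACGGCCTGCGGATCGG\n",
    "+\n",
    "#4=DDFFFHHHFHIJHIJIHIIJJJIIIIIJJIHHFFDDDDDDDDDDDDD\n",
]
cleaned = list(clean_fastq_lines(record))
print("\n".join(cleaned))
```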


How to control registration
by Zeeshan Ali Shah 30 May '12


Hi,

Is there any way to moderate registration on a Galaxy portal? We are setting up a cluster for internal users, but it seems that by default registration is open to all. Is there any way to disable registration and have a moderator create users separately, or to moderate the registration process?

Zeeshan


How to transfer files between two galaxy instances
by shamsher jagat 29 May '12


I have uploaded files to the Cistrome / Gunner Ratch lab Galaxy instances, which allow users to use their tools. I want to either share workflows from these instances or at least transfer FASTQ files to the Penn State open source Galaxy server. Is this possible or not?

I have another question in this regard. When I tried to upload the FASTQ files via web link or FTP, the job never completed. I have tried it a couple of times. This problem has been there for the last couple of months. Have some changes been implemented recently that prevent me from uploading the files? Indeed, over the last month or so I have seen many messages reporting that a Tophat job is stuck, a job did not complete, or a file could not be uploaded. I am not sure whether all these problems are related (storage) or not. Can someone from the Galaxy team advise?


How To Keep Original Sample Names in Galaxy through Cufflinks pipeline?
by Christopher M. Weber 28 May '12


Hello,

Overall Problem: The given sample names are not carried through the Cufflinks pipeline. Instead, samples are referenced by number, and in the final Cuffdiff output they are referred to as Sample 1/Q1, Sample 2/Q2, etc. instead of by their given names, which requires tedious re-labeling.

Is there a way to implement the -L function in Galaxy so that sample names are carried through from Cufflinks to Cuffdiff?

Originally posted on Seqanswers: http://seqanswers.com/forums/showthread.php?t=20338, where a commenter suggested using -L as a fix, but I don't see this option in Galaxy.
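For readers running Cuffdiff outside Galaxy, the -L suggestion from the Seqanswers thread amounts to passing a comma-separated list of condition labels on the command line. The sketch below only assembles such a command as a list of arguments and does not run it. It assumes the usual Cuffdiff argument order (annotation first, then one comma-separated group of replicate files per condition). All file names and labels are hypothetical, and it does not add the option to Galaxy itself.

```python
def build_cuffdiff_command(gtf, conditions):
    """Build a cuffdiff argument list with -L labels.

    conditions is an ordered list of (label, [replicate files]).
    Labels are joined with commas for -L; each condition's replicates are
    joined with commas into a single argument.
    """
    labels = ",".join(label for label, _ in conditions)
    command = ["cuffdiff", "-L", labels, gtf]
    for _, replicates in conditions:
        command.append(",".join(replicates))
    return command


# Hypothetical inputs, for illustration only.
command = build_cuffdiff_command(
    "genes.gtf",
    [
        ("Control", ["control_1.bam", "control_2.bam"]),
        ("Treated", ["treated_1.bam", "treated_2.bam"]),
    ],
)
print(" ".join(command))
```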
