The cufflinks and cuffdiff results are not consistent with each other. This is killing me.
Does it make sense?
<< Observation >>
We have 3 control samples and 3 treated samples. For many genes, the FPKM values from cufflinks and cuffdiff are far from consistent. For example, in the cufflinks results, one gene's FPKM values are:
control group (samples 1, 2, 3): 0, 0, 4.8
treated group (samples 1, 2, 3): 0, 0, 6.0
In cuffdiff, the estimated FPKM values are:
control group: 12.6
Use the UCSC gene annotation GTF file for mm9, downloaded from the UCSC Table Browser.
Use cufflinks on each individual sample.
Cufflinks: galaxy mirror at cistrome, minimal count:10, no quantile normalization, use gtf as reference, no background correction
Use cuffdiff on the treated group (3 biological replicates) and the control group (3 biological replicates).
Cuffdiff: galaxy mirror at cistrome, minimal count:10, no normalization, use gtf as reference, no background correction
Cufflinks returns 55350 transcripts, while cuffdiff returns 55418 transcripts, even though they use the same gene annotation GTF file.
For the 6 cufflinks results (corresponding to the 6 samples), the transcript IDs are all the same, but their order is not.
Does it make sense? Or did I do anything wrong?
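One way to check whether the transcript IDs really agree across outputs, regardless of row order, is sketched below. It is only an illustration: the column names `tracking_id` and `FPKM` are assumed to match the tab-separated tracking tables, and the two in-memory tables are made-up toy data, not values from the runs described above.

```python
import csv
import io

def read_fpkm(handle, id_column="tracking_id", fpkm_column="FPKM"):
    """Return {transcript_id: FPKM} from a tab-separated table.

    The column names are assumptions; adjust them to the real file header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[id_column]: float(row[fpkm_column]) for row in reader}

def compare_ids(tables):
    """Return (IDs present in every table, IDs missing from at least one)."""
    id_sets = [set(table) for table in tables]
    shared = set.intersection(*id_sets)
    return shared, set.union(*id_sets) - shared

# Toy tables (hypothetical IDs and values) with rows in different order.
sample_a = io.StringIO("tracking_id\tFPKM\nNM_1\t0\nNM_2\t4.8\n")
sample_b = io.StringIO("tracking_id\tFPKM\nNM_2\t6.0\nNM_1\t0\nNM_3\t1.5\n")

tables = [read_fpkm(sample_a), read_fpkm(sample_b)]
shared, not_shared = compare_ids(tables)
print(sorted(shared))
print(sorted(not_shared))
```

Keying each table by transcript ID makes the comparison independent of row order, and the second set lists the IDs that would account for a difference in transcript counts.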
When mapping paired-end RNA-seq reads using tophat, we need to enter the "Mean Inner Distance between Mate Pairs". In Galaxy, the help text reads:
This is the expected (mean) inner distance between mate pairs. For, example, for paired end runs with fragments
selected at 300bp, where each end is 50bp, you should set -r to be 200. There is no default, and this parameter
is required for paired end runs.
I think the fragment size (here 300bp) includes not only the length of the paired-end reads but also the length of the adaptors. So maybe the Mean Inner Distance between Mate Pairs should be: fragment length - paired-end read length - adaptor length. Am I right, or did I miss something?
Is it necessary to enter an accurate value?
Looking forward to your reply
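The arithmetic in question can be written out as a small sketch. The first call reproduces the example from the help text (300bp fragments, 50bp reads, -r 200). The `adaptor_total` parameter and the value 120 are purely hypothetical; they only show what the proposed correction would compute if the measured fragment size does include adaptors, which depends on how that size was obtained.

```python
def mean_inner_distance(fragment_length, read_length, adaptor_total=0):
    """Inner distance between mates: insert size minus both read lengths.

    adaptor_total is the combined adaptor length to subtract first. It is
    an assumption that only applies when the reported fragment size
    includes the adaptors.
    """
    insert_size = fragment_length - adaptor_total
    return insert_size - 2 * read_length

# Example from the help text: 300bp fragments, 50bp reads.
print(mean_inner_distance(300, 50))

# Hypothetical case: the 300bp measurement includes 120bp of adaptors.
print(mean_inner_distance(300, 50, adaptor_total=120))
```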
The last few times I have tried to initiate a galaxy instance on the cloud I have gotten messages like the following:
* 18:42:04 - Master starting
* 18:42:05 - Completed initial cluster configuration.
* 18:42:09 - Prerequisites OK; starting service 'SGE'
* 18:42:20 - Configuring SGE...
* 18:42:29 - Successfully setup SGE; configuring SGE
* 18:42:29 - Saved file 'persistent_data.yaml' to bucket 'cm-26cac39701f0918ab9a9dca54f69e925'
* 18:42:29 - Saved file 'cm_boot.py' to bucket 'cm-26cac39701f0918ab9a9dca54f69e925'
* 18:42:29 - Problem connecting to bucket 'cm-26cac39701f0918ab9a9dca54f69e925', attempt 1/5
* 18:42:32 - Saved file 'cm.tar.gz' to bucket 'cm-26cac39701f0918ab9a9dca54f69e925'
* 18:42:32 - Saved file 'test.clusterName' to bucket 'cm-26cac39701f0918ab9a9dca54f69e925'
* 18:44:34 - Initializing a 'Galaxy' cluster.
* 18:44:34 - Retrieved file 'snaps.yaml' from bucket 'cloudman' to 'cm_snaps.yaml'.
* 18:45:25 - Error mounting file system '/mnt/galaxyData' from '/dev/sdg3', running command '/bin/mount /dev/sdg3 /mnt/galaxyData' returned code '32' and following stderr: 'mount: you must specify the filesystem type '
* 18:45:27 - Prerequisites OK; starting service 'Postgres'
* 18:45:27 - PostgreSQL data directory '/mnt/galaxyData/pgsql/data' does not exist (yet?)
* 18:45:27 - Configuring PostgreSQL with a database for Galaxy...
* 18:45:39 - Prerequisites OK; starting service 'Galaxy'
* 18:45:39 - Setting up Galaxy application
* 18:45:40 - Retrieved file 'universe_wsgi.ini.cloud' from bucket 'cloudman' to '/mnt/galaxyTools/galaxy-central/universe_wsgi.ini'.
* 18:45:40 - Retrieved file 'tool_conf.xml.cloud' from bucket 'cloudman' to '/mnt/galaxyTools/galaxy-central/tool_conf.xml'.
* 18:45:40 - Retrieved file 'tool_data_table_conf.xml.cloud' from bucket 'cloudman' to '/mnt/galaxyTools/galaxy-central/tool_data_table_conf.xml.cloud'.
* 18:45:40 - Starting Galaxy...
* 18:45:51 - Saved file 'persistent_data.yaml' to bucket 'cm-26cac39701f0918ab9a9dca54f69e925'
* 18:49:34 - Galaxy daemon not running.
* 18:49:34 - Galaxy service state changed from 'Starting' to 'Error'
* 18:49:35 - Saved file 'persistent_data.yaml' to bucket 'cm-26cac39701f0918ab9a9dca54f69e925'
* 18:49:41 - Galaxy daemon not running.
* 18:49:58 - Galaxy daemon not running.
* 18:50:15 - Galaxy daemon not running.
I am using 861460482541/galaxy-cloudman-2011-03-22, which is supposed to be the current version.
Thomas Randall, PhD
Bioinformatics Scientist, Contractor
National Institute of Environmental Health Sciences
P.O. Box 12233, Research Triangle Park, NC 27709
I want to fetch sequences from the soybean genome according to a gff file. My
gff3 file and genome file are attached to the email, because it is not easy
to recognize the format if I paste it in the email. It keeps
reporting the error:
An error occurred running this job: Traceback (most recent call last):
line 288, in <module>
if __name__ == "__main__": __main__()
Could you please tell me where the problem is?
I am attempting to run accepted hit data from Tophat output into the UCSC Genome Web Browser for visualization of sequencing hits in specific genes. However, the BAM files yield tiles and are too large to present through the browser. Is there a better file format to convert to that would allow better visualization such as histograms?
From word of mouth, I have been told to convert BAMs to BEDs and put BED files through the browser. However, I notice that Galaxy does not have an option for this and the oft used BEDtools appears to involve writing code, which is above my computer abilities.
Any tips or solutions on how to obtain histograms from sequencing data would be very welcome.
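For illustration only, the idea behind a histogram-style track is to collapse the alignments into per-region read depth (a bedGraph-like table) instead of displaying every read. The sketch below assumes the alignments have already been exported as (chrom, start, end) intervals, which is not something it does itself; in practice a dedicated tool such as the BEDTools genomecov utility would compute coverage directly from the BAM file. The intervals shown are made-up toy data.

```python
from collections import defaultdict

def bedgraph_from_intervals(intervals):
    """Convert half-open (chrom, start, end) intervals into bedGraph rows.

    Returns a list of (chrom, start, end, depth) tuples covering only
    positions with depth > 0, merging adjacent runs of equal depth.
    """
    # Record +1 where an interval starts and -1 where it ends.
    events = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in intervals:
        events[chrom][start] += 1
        events[chrom][end] -= 1

    rows = []
    for chrom in sorted(events):
        depth = 0
        prev = None
        for pos in sorted(events[chrom]):
            if prev is not None and depth > 0:
                last = rows[-1] if rows else None
                if last and last[0] == chrom and last[2] == prev and last[3] == depth:
                    # Same depth continues: extend the previous row.
                    rows[-1] = (chrom, last[1], pos, depth)
                else:
                    rows.append((chrom, prev, pos, depth))
            depth += events[chrom][pos]
            prev = pos
    return rows

# Toy intervals (hypothetical): two overlapping reads on chr1.
reads = [("chr1", 0, 10), ("chr1", 5, 15)]
for row in bedgraph_from_intervals(reads):
    print("\t".join(str(field) for field in row))
```

The resulting table is much smaller than the read-level data because it stores one row per run of constant depth rather than one row per read.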
What about changing cm.tar.gz in S3?
On Thu, May 31, 2012 at 9:34 AM, Jorrit Boekel
> Hi Zeeshan
> Rebooting leads to upstart running /opt/cloudman/pkg/ec2autorun.py.
> This downloads a boot script, which downloads a cloudman package
> (cm.tar.gz), unpacks it and starts cloudman from there. This will mean a
> new /mnt/cm/universe.wsgi will be in place.
> When cloudman starts, it attaches EBS volumes to /mnt/galaxyTools and
> other mount points from snapshots. The snapshots will include
> Restarting the node will thus eradicate changes. You will have to use the
> persist changes-functionality to create a file persistent_data.yaml in an
> S3 bucket and saved snapshots. AFAIK, it saves when you terminate the
> session in cloudman. When instantiating a new cluster with the *same name*
> (important) as the old one, it will pick the persistent data from the
> bucket and mount your saved snapshots.
> good luck,
> On 05/30/2012 04:34 PM, Zeeshan Ali Shah wrote:
> I tried changing
> both files
> but after reboot of headnode the changes are gone .
> any hint ?
> On Thu, May 24, 2012 at 10:30 AM, Roman Valls Guimera <
> brainstorm(a)nopcode.org> wrote:
>> Zeeshan, AFAIK the only settings you can set regarding this are in
>> universe_wsgi.ini through the directives:
>> # Allow unregistered users to create new accounts (otherwise, they will have to
>> # be created by an admin).
>> allow_user_creation = True
>> # Email administrators when a new user account is created
>> # You also need to have smtp_server set for this to work.
>> new_user_email_admin = False
>> I have not seen a moderation system in place, but maybe you can use PDC's
>> Plone forms for this moderation ?
>> Hope that helps !
>> On 24 May 2012 at 10:18, Zeeshan Ali Shah wrote:
>> > Hi, Is there any way to moderate Registration on Galaxy portal ? We are
>> > setting up a cluster for internal users but it seems that by default
>> > registration is open for all.
>> > Can any way we disable the registration and Moderator create users
>> > separately ? OR Moderate the registration process ?
>> > Zeeshan
>> > ___________________________________________________________
>> > The Galaxy User list should be used for the discussion of
>> > Galaxy analysis and other features on the public server
>> > at usegalaxy.org. Please keep all replies on the list by
>> > using "reply all" in your mail client. For discussion of
>> > local Galaxy instances and the Galaxy source code, please
>> > use the Galaxy Development list:
>> > http://lists.bx.psu.edu/listinfo/galaxy-dev
>> > To manage your subscriptions to this and other Galaxy lists,
>> > please use the interface at:
>> > http://lists.bx.psu.edu/
It is possible that this question has been asked/answered before. I tried searching through the galaxy-user list archives on nabble but could not find an applicable answer.
Bowtie alignments of a de-multiplexed Illumina sequence data set (CASAVA v.1.8.2) appear to be causing alignment problems in our local Galaxy install. At first glance this appears to be because the "@SOMETHING<space>READINFO" read names are not handled correctly by bowtie. This is not a Galaxy issue per se, but what are other users doing to avoid this problem? We are using bowtie v. 0.12.7 (going to upgrade soon) at the moment.
Posting a snippet from the alignment file below:
MACHINE_NAME:2:1101:1533:1944 1:N:0:CGATGT 4 * 0 0 * * 0 0 NACGAAACGGGTCGGTCCGTCGGCATAGCGCGCCACGGCCTGCGGATCGG #4=DDFFFHHHFHIJHIJIHIIJJJIIIIIJJIHHFFDDDDDDDDDDDDD XM:i:0
MACHINE_NAME:2:1101:2523:1962 16 chr5 80936209 255 50M * 0 0 CTAAAAGGAAAAATTCCAGGGATTAAGGAACTTGAAGTTAGAAAAACTAN IIJJJIHHJJJIHFEIHIJIIJIJJJJIIJIIGJJIJFGHHHFEDAD=4# XA:i:1 MD:Z:49C0 NM:i:1
MACHINE_NAME:2:1101:1596:1971 16 chr13 41692461 255 50M * 0 0 GCTGAATAATAGTCCATTGTGAACATATACCATGTTTTCTTTATTTTTAN JJJJJJJJJIJJIJJJJJJJJJJJJJJHFJJJJJJJJHHHHHFFFFD=4# XA:i:1 MD:Z:49T0 NM:i:1
MACHINE_NAME:2:1101:2670:1962 1:N:0:CGATGT 4 * 0 0 * * 0 0 NTGCACTCGCCTGGATACCGTCGCCGGTGAGGTGGCATTCGAACACACCC
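One possible workaround, sketched here under stated assumptions rather than as a tested fix, is to cut each FASTQ header at the first whitespace before alignment so that the read name no longer contains the "1:N:0:CGATGT" part. The sketch assumes strict 4-line FASTQ records; the sequence and quality strings in the example are shortened placeholders. Note that this discards the read number and index information carried in the second field.

```python
def strip_read_comments(lines):
    """Yield FASTQ lines with each header line cut at the first whitespace.

    Assumes strict 4-line records (header, sequence, '+', quality), so only
    every fourth line starting at line 0 is treated as a header. Quality
    lines that happen to begin with '@' are left untouched.
    """
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        if index % 4 == 0 and line.startswith("@"):
            line = line.split(None, 1)[0]
        yield line

# Placeholder record using the header layout shown in the snippet above.
fastq = [
    "@MACHINE_NAME:2:1101:1533:1944 1:N:0:CGATGT\n",
    "NACGAAACGG\n",
    "+\n",
    "@4=DDFFFHH\n",
]
for cleaned in strip_read_comments(fastq):
    print(cleaned)
```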
Hi, is there any way to moderate registration on the Galaxy portal? We are
setting up a cluster for internal users, but it seems that by default
registration is open to all.
Is there any way we can disable registration and have a moderator create
users separately, or moderate the registration process?
I have uploaded files to the Cistrome and Gunnar Rätsch lab Galaxy instances, which
allow users to use their tools. I want to either share workflows from these
instances or at least transfer FASTQ files to the Penn State open-source
Galaxy server. Is this possible?
I have another question in this regard: when I try to upload FASTQ
files via web link or FTP, the job never completes. I have tried a
couple of times. This problem has persisted for the last couple of months. Have
any changes been implemented recently that are
preventing me from uploading the files? Indeed, over the last month or so I have seen
many messages reporting that a Tophat job is stuck, that a job did not
complete, or that a file could not be uploaded. I am not sure whether all these problems are
related (storage) or not. Can someone from the Galaxy team advise?
User-supplied sample names are not carried through the Cufflinks pipeline. Instead,
they are referenced by number, and in the final Cuffdiff output they are
referred to as Sample 1/Q1, Sample 2/Q2, etc., rather than by the given name,
which requires tedious re-labeling.
Is there a way to implement the -L option in Galaxy so that sample names
are carried through from Cufflinks to Cuffdiff?
Originally posted in Seqanswers:
http://seqanswers.com/forums/showthread.php?t=20338, where a commenter
suggested using -L as a fix, but I don't see this option in Galaxy.
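Until such an option is available, the re-labeling could be scripted after the fact. The sketch below is a rough illustration, not a description of how Cuffdiff or Galaxy behave: it assumes a tab-separated output table in which columns named `sample_1` and `sample_2` hold the placeholders `q1` and `q2`, and the gene ID and values in the example are made up.

```python
import csv
import io

def relabel(handle, names, columns=("sample_1", "sample_2")):
    """Return the table text with placeholder sample labels replaced.

    names maps placeholder labels (e.g. "q1") to real sample names.
    The column names are assumptions about the output layout.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for column in columns:
            # Leave labels without a mapping unchanged.
            row[column] = names.get(row[column], row[column])
        writer.writerow(row)
    return out.getvalue()

# Hypothetical two-sample table with placeholder labels.
table = io.StringIO(
    "gene_id\tsample_1\tsample_2\tvalue_1\tvalue_2\n"
    "G1\tq1\tq2\t1.0\t2.0\n"
)
relabeled = relabel(table, {"q1": "control", "q2": "treated"})
print(relabeled, end="")
```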