I have a question about the cuffdiff output "differential expression testing".
For most of you it might sound naive, but I'm new to this field and I
have very little background in statistics.
So, I have compared a control sample with 2 biological replicates
using cuffdiff. I have now about 4000 genes which were tested.
1. How do I extract the genes which are up- or downregulated from the 4000?
2. Is there an FPKM value above which a gene is up- or downregulated?
3. I used Excel and sorted the values from highest to lowest:
assuming that the highest control value is 200 and the corresponding
treated value is 2, can I say that that gene is downregulated in the
treated samples by a 100-fold change?
4. Do I have to use the p_values given in the output at all to
extract the most up- or downregulated genes?
I do not have cummeRbund yet and I am not very good with R. And I'm lost!
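For readers without R, questions 1 to 4 can be sketched in plain Python. This is only a minimal sketch under assumptions: it assumes the cuffdiff table is tab-separated and has columns named `gene_id`, `status`, `value_1`, `value_2` and `significant` (check the header of your own file), that `value_1` is the control and `value_2` the treated sample, and that a 2-fold change is the cutoff you care about. The function name `classify_genes` and the example rows are made up for illustration. The idea is that no single FPKM value marks a gene as up- or downregulated: what matters is the ratio between the two values together with the significance flag, not either value alone.

```python
import csv
import io
import math


def classify_genes(diff_file, min_abs_log2fc=1.0):
    """Split significant genes into up- and downregulated lists.

    diff_file is an open, tab-separated table; min_abs_log2fc=1.0
    corresponds to a 2-fold change (an arbitrary, assumed cutoff).
    """
    up, down = [], []
    for row in csv.DictReader(diff_file, delimiter="\t"):
        # Keep only genes that were tested successfully and flagged significant.
        if row["status"] != "OK" or row["significant"] != "yes":
            continue
        control = float(row["value_1"])   # assumed: sample 1 is the control
        treated = float(row["value_2"])   # assumed: sample 2 is the treatment
        if control <= 0 or treated <= 0:
            continue  # a ratio involving zero has no finite log fold change
        log2fc = math.log2(treated / control)
        if log2fc >= min_abs_log2fc:
            up.append((row["gene_id"], log2fc))
        elif log2fc <= -min_abs_log2fc:
            down.append((row["gene_id"], log2fc))
    return up, down


# Made-up rows, for illustration only.
EXAMPLE = (
    "gene_id\tstatus\tvalue_1\tvalue_2\tsignificant\n"
    "geneA\tOK\t200\t2\tyes\n"      # 200 -> 2: 100-fold down
    "geneB\tOK\t5\t40\tyes\n"       # 5 -> 40: 8-fold up
    "geneC\tOK\t10\t11\tno\n"       # not significant, skipped
    "geneD\tNOTEST\t0\t0\tno\n"     # not tested, skipped
)

up, down = classify_genes(io.StringIO(EXAMPLE))
print("up:", up)
print("down:", down)
```

With a real file you would pass `open("gene_exp.diff")` (a hypothetical file name) instead of the in-memory example. A gene going from 200 to 2 is indeed a 100-fold decrease, but whether to trust it depends on the significance flag, which is why the sketch filters on it first.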
I have been trying to use either Unified Genotyper or FreeBayes on one of my BAM files. Both are failing.
1. With Unified Genotyper it gives me a message saying "Sequences are not currently available for specified build". I have hg19-related data and am using the default settings (it picks up hg_g1k_v37, with no other option). I am not sure why it is giving me this error.
2. As an alternative I tried to run FreeBayes with the default settings and choosing hg19. It is not giving any specific message, but under the bug icon it gives me "killed".
Now, in order to make sure my BAM is OK, I tested it outside Galaxy with mpileup and within Galaxy with pileup. Any suggestion why Unified Genotyper is not working? If needed I can share my history.
I am a registered user of the public Galaxy Server (main). I was trying to
upload two files (one is 1.6G and the other is 1.9G) but they were labeled as
"Job is waiting to run" forever. Could you please let me know the possible
I am wondering if I can use a user-defined reference instead of selecting the
reference genome listed in Galaxy while doing mapping. If we can, how can I
choose the uploaded reference instead of selecting the reference genome? Any
ideas are greatly appreciated!
Begin forwarded message:
> From: Mark Lindsay <m.a.lindsay(a)bath.ac.uk>
> Date: 7 August 2012 10:55:03 GMT+01:00
> To: galaxy-user(a)bx.psu.edu
> Subject: Starting a Cloud Instance
> I wondered if you might be able to help.
> I have very little computing experience but have been trying to launch a cloud instance for the last 24 hrs.
> Initially I had problems launching the instance in Safari, i.e. it kept saying that it could not connect to the relevant page.
> When I finally got to the CloudMan launch page it gave the following message:
> [Critical] Volume 'vol-dd6397db' is located in the wrong availability zone for this instance. You MUST terminate this instance and start a new one in zone 'us-east-1a'
> despite the fact that I am sure I started in zone 'us-east-1a'.
> How important are the security codes and keys?
I've just set up my own instance of Galaxy running in AWS and had a security question that I couldn't find an answer to on the wiki.
I'd like to prevent people from hitting my public Amazon IP and using Galaxy. Is there a way to prevent anonymous users from accessing any part of Galaxy except a login prompt and have user registration be invite-only (or perhaps turn it off completely once a handful of accounts have been created)?
I am a Galaxy-naive molecular, developmental biologist studying
repression/derepression of early embryonic gene expression in zebrafish
After attending the Galaxy meeting I returned home and worked up two
mRNAseq files to determine RNA expression differences using cuffdiff
between a treated and an untreated sample (i.e. data from cuffdiff under
the title of "gene differential expression testing").
I downloaded the data, opened it up in an Excel file and captured all the
If I look at the "value 1" and "value 2" columns I find that many of the
numbers are single digits. I expect that in one of the columns the
numbers will be very low (that is, less than 1) because the treatment
should be inducing gene expression in a subfamily of genes that are
My questions are:
1) what do these numbers represent?
2) In the "value" column where I expect a higher number, does a value of
10 or less mean anything, or should one be selecting for values higher than
these single-digit numbers?
3) And in the column of genes that might be repressed is there really a
difference between a "value of 0.1 versus something like 0.01" since that
can change my log ratios significantly--this, of course, goes back to my
I would appreciate any help I could get, sincerely,
Professor of Molecular Genetics and Microbiology
Duke University Medical Center
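On question 3 above, the arithmetic can be checked directly. This is a minimal sketch, not a description of what cuffdiff does internally: the function `log2_ratio` and the pseudocount of 1 are assumptions introduced here for illustration. A pseudocount is one commonly used way to damp ratios between very small values, and whether it suits a given analysis is a judgment call. The sketch shows that with an induced value of 10, a repressed value of 0.1 versus 0.01 shifts the log2 ratio by more than three units, while with a pseudocount the two cases become nearly indistinguishable.

```python
import math


def log2_ratio(treated, control, pseudocount=0.0):
    """log2 of treated/control, optionally adding a pseudocount to both."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Same treated value, two different very low control values.
for control in (0.1, 0.01):
    raw = log2_ratio(10, control)
    damped = log2_ratio(10, control, pseudocount=1.0)  # assumed pseudocount
    print(f"control={control}: raw log2 ratio={raw:.2f}, with pseudocount={damped:.2f}")
```

The raw difference between the two cases is exactly log2(10), because the control value changed tenfold. That is why values this close to zero deserve caution before being read as large fold changes.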