Re: [galaxy-user] [galaxy-bugs] loading data
by Peng, Tao
Hi Jen, I have RNA-seq data for 2 biological samples loaded into Galaxy.
The samples are from human skin biopsies from genital herpes
reactivation. Using Tophat I will align the reads to the human genome. But I
am NOT sure how to build the genital herpes genome (HSV-2) and align reads to
the HSV-2 genome. Any suggestion is greatly appreciated!
Thanks,
tao
-----Original Message-----
From: Jennifer Jackson [mailto:jen@bx.psu.edu]
Sent: Friday, August 12, 2011 11:54 AM
To: Peng, Tao; galaxy-bugs(a)bx.psu.edu
Subject: Re: [galaxy-bugs] loading data
Hello,
Glad to hear that you were able to load your data.
When logged in, FTP loaded data will initially be in the FTP upload area
under the "Get Data -> Upload" tool (in the center pane). From here,
load data into the history that you wish to work with. If you are not
sure where this is exactly, please note the graphics in the FTP wiki and
screencast.
As a reminder, please send all new and followup questions with a to or
cc to the mailing list, not to individual team members. This is
important for our team to be able to track and answer questions. We
would appreciate your helping out with this going forward.
Hopefully this helps.
Jen
Galaxy team
On 8/12/11 11:39 AM, Peng, Tao wrote:
> Hi jen, I just finished FTP of R4 data set with 8.4 GB. Why can't I see
> the data on the right panel of galaxy after logging in?
>
> Tao
>
> -----Original Message-----
> From: Jennifer Jackson [mailto:jen@bx.psu.edu]
> Sent: Tuesday, August 09, 2011 3:26 PM
> To: Peng, Tao
> Cc: galaxy-bugs(a)bx.psu.edu
> Subject: Re: [galaxy-bugs] loading data
>
> Hello Tao,
>
> It sounds like the load works with an uncompressed file but is failing
> when compressed? Perhaps there is a problem with the compression itself.
>
> We can't really help with this part of the process except to note which
> compression types we accept. The help wiki link again is:
> http://wiki.g2.bx.psu.edu/Learn/Upload%20via%20FTP
>
> The final goal would be to have all of the data compressed and then load
> using FTP in a batch. If this takes too long to run, then the next
> option is to run either a local or cloud install. Help can be found at:
> http://getgalaxy.org
> http://galaxyproject.org/Admin/Cloud
>
> Please send all questions related to local or cloud installations to the
> galaxy-dev(a)bx.psu.edu mailing list and not to individual team members or
> to the galaxy-bugs(a)bx.psu.edu mailing list.
>
> Best wishes for your project,
>
> Jen
> Galaxy team
>
> On 8/9/11 2:25 PM, Peng, Tao wrote:
>> If I up-load one file (uncompressed) at one time, I will have 4x8=32
>> files to up-load. Each file takes about 1 hour to load. This is
>> untenable. Any suggestion?
>>
>> Thanks,
>>
>>
>> tao
>>
>> -----Original Message-----
>> From: Jennifer Jackson [mailto:jen@bx.psu.edu]
>> Sent: Tuesday, August 09, 2011 2:04 PM
>> To: Peng, Tao; galaxy-bugs(a)bx.psu.edu
>> Subject: Re: [galaxy-bugs] loading data
>>
>>
>>
>> On 8/9/11 1:55 PM, Peng, Tao wrote:
>>> Hi Jen, if I have 8 fastq.gz files for each of my 4 samples, how should
>>> I load the data for analysis in galaxy?
>>>
>>> Thanks,
>>>
>>> tao
>>>
>>> -----Original Message-----
>>> From: Jennifer Jackson [mailto:jen@bx.psu.edu]
>>> Sent: Tuesday, August 09, 2011 1:31 PM
>>> To: Peng, Tao
>>> Cc: galaxy-bugs(a)bx.psu.edu
>>> Subject: Re: [galaxy-bugs] loading data
>>>
>>> Hello Tao,
>>>
>>> For your other question sent to me directly, Galaxy will accept only one
>>> file per archive, so sending a multi-file .gz will result in only the
>>> first file in the archive loaded into the "Get Data -> Upload" FTP area.
>>>
>>> Given the current issues, please try a restart. Log out of Galaxy and
>>> FileZilla (and restart your computer if possible). Then begin again,
>>> testing with a single, uncompressed file. You do not have to be logged
>>> into your Galaxy account to load with FileZilla, but you will need to be
>>> logged in to access the files on the "Get Data -> Upload" form after
>>> upload is complete.
>>>
>>> If this fails, perhaps try reinstalling FileZilla or even a different
>>> FTP client. This tutorial covers an alternative that can also be used
>>> from the desktop:
>>> http://galaxyproject.org/Learn/Upload via FTP
>>>
>>> If you do have more questions, please leave galaxy-bugs(a)bx.psu.edu on
>>> the cc list so that our entire team can help contribute to replies. This
>>> may also be a question that you want to ask the larger user community at
>>> galaxy-user(a)bx.psu.edu, as this sounds like an external issue. Another
>>> user may have encountered this problem using a PC and have suggestions.
>>>
>>> Thanks!
>>>
>>> Jen
>>> Galaxy team
>>>
>>> On 8/9/11 1:16 PM, Peng, Tao wrote:
>>>> Hi jen, I attached a screen shot of FileZilla. Do you know why the top
>>>> panel says "disconnected from server" while the bottom panel indicates
>>>> it is transferring the data?
>>>>
>>>> Thanks,
>>>>
>>>>
>>>> tao
>>>>
>>>> -----Original Message-----
>>>> From: Jennifer Jackson [mailto:jen@bx.psu.edu]
>>>> Sent: Monday, August 08, 2011 4:55 PM
>>>> To: Peng, Tao
>>>> Cc: galaxy-bugs(a)bx.psu.edu
>>>> Subject: Re: [galaxy-bugs] loading data
>>>>
>>>> Hello Tao,
>>>>
>>>> The loading times do seem to be long for the size of the files, perhaps
>>>> the internet connection is the problem. That said, FileZilla can restart
>>>> an FTP if it is interrupted and a message would be reported. I didn't
>>>> see any restarts reported in the log in the screenshot you sent.
>>>>
>>>> The compression is another place to look for a problem. You might try to
>>>> use an alternate compression or no compression and load that way. Try
>>>> one file, through FileZilla, as a test to see if that performs better.
>>>> Winzip can compress in a few different formats, .bz2 is also accepted by
>>>> Galaxy.
>>>>
>>>> Hopefully one of these will work. Please keep galaxy-bugs in the cc for
>>>> any follow-up so that our team can help contribute to replies,
>>>>
>>>> Thanks,
>>>>
>>>> Jen
>>>>
>>>> On 8/8/11 4:04 PM, Peng, Tao wrote:
>>>>> Hi jen, when I FTP 9 of the fastq.gz files (each about 350 MB) for a
>>>>> sample, the FTP program (FileZilla) keeps copying the files again and
>>>>> again. This is strange to me. Do you know what went wrong here?
>>>>>
>>>>> I attached a PowerPoint slide with a screenshot of FileZilla
>>>>> (R1_002.fasq.gz has been copied 3x so far??)
>>>>>
>>>>> Thanks,
>>>>>
>>>>>
>>>>> tao
>>>>>
>>>>> -----Original Message-----
>>>>> From: Jennifer Jackson [mailto:jen@bx.psu.edu]
>>>>> Sent: Friday, July 29, 2011 1:20 PM
>>>>> To: Peng, Tao
>>>>> Cc: galaxy-bugs(a)bx.psu.edu
>>>>> Subject: Re: [galaxy-bugs] loading data
>>>>>
>>>>> Hello Tao,
>>>>>
>>>>> Our apologies, the public Galaxy instance at http://usegalaxy.org has
>>>>> been experiencing higher than usual usage the last few days. This should
>>>>> now be improved. Please try your FTP load again - it is the best way to
>>>>> load files and the sizes you mention are well within the accepted
>>>>> limits.
>>>>>
>>>>> Thank you for your patience,
>>>>>
>>>>> Best,
>>>>>
>>>>> Jen
>>>>> Galaxy team
>>>>>
>>>>>
>>>>> On 7/29/11 9:47 AM, Peng, Tao wrote:
>>>>>> Hi I work at FHCRC in Seattle. I am trying to load the FASTQ data from
>>>>>> Illumina HiSeq. I have 4 samples with 2.4 GB of seq data per sample. For
>>>>>> the overnight FTP, I can't even finish loading half of one sample. Is
>>>>>> there any faster way to load the data? Can GALAXY do the analysis for
>>>>>> this type of large data?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> tao
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support
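
Following the point made earlier in this thread that Galaxy accepts only one file per archive, here is a minimal Python sketch that compresses each FASTQ file into its own .gz before a batch FTP upload. The function name `gzip_each` and the `*.fastq` pattern are assumptions made for illustration; they are not part of Galaxy or FileZilla.

```python
import gzip
import shutil
from pathlib import Path


def gzip_each(directory, pattern="*.fastq"):
    """Compress every matching file into its own .gz archive.

    Returns the list of archives written, one per input file.
    """
    written = []
    for src in sorted(Path(directory).glob(pattern)):
        dst = src.with_name(src.name + ".gz")
        # Stream the copy so multi-GB files are not read into memory.
        with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        written.append(dst)
    return written
```

Calling `gzip_each("/path/to/sample_dir")` leaves the original files in place and writes one `.fastq.gz` next to each, so every archive sent over FTP holds exactly one file.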
9 years, 3 months
Macs
by Keith Giles
To add to my post, I reread the MACS documentation and it seems that
the FDR is calculated based on the p-value of negative peaks,
regardless of their location. That is to say, a strong peak on one
chromosome could be associated with a high FDR because of an even stronger
negative peak anywhere in the genome.
So, is the FDR then the percent of negative peaks relative to total
peaks (pos + neg) at or below a certain p-value?
Sent from my iPhone
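
A note on the question above: one reading of the MACS 1.x paper (Zhang et al., 2008) is that the empirical FDR at a given p-value cutoff is the number of control ("negative") peaks passing that cutoff divided by the number of ChIP peaks passing it, counted genome-wide, rather than divided by the combined total. The sketch below assumes that definition and assumes peak scores are expressed as -10*log10(p-value); the function name and the scores are made up for illustration and are not MACS output.

```python
def empirical_fdr(chip_scores, control_scores, cutoff):
    """Percent FDR at a score cutoff, where score = -10*log10(p-value).

    Assumed definition: control peaks passing the cutoff divided by
    ChIP peaks passing the same cutoff, regardless of chromosome.
    """
    n_chip = sum(1 for s in chip_scores if s >= cutoff)
    n_control = sum(1 for s in control_scores if s >= cutoff)
    if n_chip == 0:
        return None  # undefined when no ChIP peak passes the cutoff
    return 100.0 * n_control / n_chip


# Illustrative scores only, not real MACS output.
chip = [310.0, 120.0, 95.0, 60.0]
control = [150.0, 40.0]
print(empirical_fdr(chip, control, 100.0))  # 50.0
```

Under this reading, the ChIP peak scoring 120 is assigned a 50% FDR only because one control peak somewhere in the genome scores higher, which matches the behavior described in the post.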
9 years, 3 months
BED to BAM conversion in Galaxy
by shamsher jagat
Is it possible to use some tool in Galaxy to convert a BED file to a BAM/SAM
file? In other words, do we have BEDTools or another option in Galaxy?
Thanks
9 years, 3 months
upload zip file to custom tool
by Brent Pedersen
Hi,
I have followed the wiki and built a custom tool on a local galaxy installation.
One of the inputs is a zip file. It seems galaxy automatically unpacks
it and keeps only the first file. Is there any way I can tell galaxy not to
unzip the file? I tried to register .zip as a datatype, but that didn't seem
to change the behavior.
thanks for any help.
$tool.xml is below in case it helps.
-Brent
<command interpreter="python">charmqc.py $xyszip $output $organism</command>
<inputs>
    <param format="xys.zip" name="xyszip" type="data" label="Zip of .xys files"/>
    <param type="select" name="organism" label="organism">
        <option value="Human">Human (hg18)</option>
        <option value="Mouse">Mouse (mm8)</option>
    </param>
</inputs>
<outputs>
    <data format="data" name="output" />
</outputs>
9 years, 3 months
sequence analysis full tutorial
by frederic lepretre
hi,
I'm a bit of a rookie at using Galaxy, and I'm asking if someone could send me
a tutorial.
More precisely, I'm using Ion Torrent data that I managed to load, but what
should I do after that?
thanks a lot
fred
--
*********************************************************
* Dr Frédéric Leprêtre *
* Institut pour la Recherche sur le Cancer de Lille *
* Plateforme de génomique fonctionnelle *
* Agilent - Affymetrix - SoliD3 *
* Place de Verdun 59045 Lille cedex *
* *
* tel. 03 20 16 92 20 poste 2339 *
* fax. 03 20 16 92 29 *
* http://www.ircl.org/ *
* frederic.lepretre(a)inserm.fr *
* ou *
* frederic.lepretre(a)univ-lille2.fr *
*********************************************************
9 years, 3 months
non-coding RNA
by dongdong zhaoweiming
Hi, I want to evaluate whether my assembled transcripts produced by Trinity are protein-coding or non-coding. I found two methods: the "txCdsPredict" program from UCSC (John R Prensner, 2011) and Codon Substitution Frequencies, CSF (Michael F. Lin, 2008). I wonder if Galaxy can do this? Thanks a lot!
weimin zhao
9 years, 3 months
Fwd: de novo assembly
by John Nash
Forgive the forwarded email. Not used to working from an iPad.
> On 2011-09-29, at 7:58 PM, Peter Cock <p.j.a.cock(a)googlemail.com> wrote:
>
>> On Thu, Sep 29, 2011 at 7:49 PM, Cecilia Tamborindeguy
>> <ctamborindeguy(a)ag.tamu.edu> wrote:
>>> Hello,
>>>
>>> I would like to know if Galaxy can do de novo assembly without a reference
>>> genome.
>>>
>>> Thanks.
>>>
>>> Cecilia
>>
>> Are you trying to use the Public Galaxy or a local install? There
>> are several assemblers with Galaxy Wrappers on the Galaxy
>> ToolShed (e.g. Roche "Newbler", and MIRA 3) which you could
>> add to your own local Galaxy if you have one.
>>
>> However, de novo genome assembly can be very computationally
>> demanding, so not many Galaxy Instances will want to offer it.
>>
>> Peter
>
> I would like to echo Peter's advice. (Again, that's twice in 5 min from 2 different lists. I promise you I'm not stalking you, Peter).
>
> Genome assembly is a bit of a dedicated domain with respect to expertise and time. If possible, if you are assembling a lot of genome data, you really should set yourself up properly with a multiple-CPU unix box with a lot of RAM and dedicate it to assembly. Install MIRA, Newbler, Velvet, AMOS, samtools, bamtools, bedtools, Staden, phred/phrap/consed on it, and you can assemble and interconvert data to your heart's content.
>
> Galaxy is a wonderful and useful service but assembling genomes does require dedicated power and expertise, and preferably in house. I just forked out for a 64 CPU processor with 1 TB RAM bc we assemble lots of genomes. You don't have to go that far but a 3-4 quad processor box with 128 GB RAM and 1 TB disk should be on your mind.
>
> John
9 years, 3 months
Add library to dataset performance metric: developer vs production instances
by Roman Valls
Hello,
Today I was routinely adding a 27GB Illumina lane on my galaxy instance
running on a cluster node. Just the regular cloned-from-hg type of
instance with set_metadata_externally, no more tuning.
It took more than 10 minutes to have the dataset imported into a data
library via the filesystem path upload method... not copying it into
galaxy, just "linking".
galaxy.jobs INFO 2011-09-19 18:05:08,641 job 120 dispatched
(...)
galaxy.jobs DEBUG 2011-09-19 18:16:52,822 job 120 ended
galaxy.datatypes.metadata DEBUG 2011-09-19 18:16:52,824 Cleaning up
external metadata files
Since I cannot add datasets to libraries in usegalaxy.org and compare, I
was wondering if someone can state an approximate average time *for a
production* galaxy installation to do that operation.
I would like to have some empirical number to show how a production
deployment[1] could speed things up, as opposed to having individual
galaxy instances per user in a cluster (as per IT policies):
http://blogs.nopcode.org/brainstorm/2011/08/22/galaxy-on-uppmax-simplified/
Thanks in advance !
Roman
[1] http://usegalaxy.org/production
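
For anyone comparing numbers, the elapsed time in the log excerpt above can be worked out from the "job 120 dispatched" and "job 120 ended" timestamps. This is a small Python sketch; the helper name `elapsed_seconds` is an assumption for illustration, and the format string simply matches the "date time,milliseconds" layout shown in the excerpt.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start, end):
    """Seconds between two timestamps in the log's 'date time,millis' form."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# Timestamps copied from the "job 120 dispatched" and "job 120 ended" lines.
secs = elapsed_seconds("2011-09-19 18:05:08,641", "2011-09-19 18:16:52,822")
print(f"{secs:.3f} s = {int(secs // 60)} min {secs % 60:.1f} s")
```

That comes to roughly 11 minutes 44 seconds for linking the 27GB lane, which is the figure a production instance would need to beat.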
9 years, 3 months
de novo assembly
by Cecilia Tamborindeguy
Hello,
I would like to know if Galaxy can do de novo assembly without a reference genome.
Thanks.
Cecilia
9 years, 3 months