fastx barcode-splitter
by Marek Szubert
Hi
I have been looking into Galaxy with a view to finding a target platform for
developing code. It is a very ambitious project, and you deserve
congratulations for all the development work that has gone into it so far.
I have just started using the fastx barcode splitter tool and found that the
job completed successfully, producing a tabular list of links that display the
contents of the result files. For some strange reason these split files do not
appear in the history list, nor are they automatically accessible to Galaxy.
When I try to load them into Galaxy with 'Get Data - Upload File from your
computer', I can only do this through the URL input field of that tool, and
then the task fails with this error:
http://localhost:8080/datasets/2d9035b3fc152403/display/Clip_on_data_12__...
An error occurred running this job: The uploaded file contains
inappropriate HTML content
The fastx documentation says the split files should be written to /tmp, but
that is not accessible on the public server, and the files are not there on my
installation.
Could anyone suggest alternative working methods for reloading the barcode
split files?
thanks
Dr Jan Szubert
11 years, 6 months
Read shuffler and code contributions
by Florent Angly
Hi,
I was wondering if there is a tool in Galaxy to put mate pair reads
located in two files inside a single file? I made the error of believing
that the FASTQ joiner does that, but it does not.
If this feature is not planned, I am willing to work on it. Which of the
following would be better for integration into Galaxy?
* a clean Python implementation like the other utilities in
tools/fastq/, i.e. fastq_groomer.py and fastq_paired_end_joiner.py
* a wrapper around the Velvet utilities, shuffleSequences_fastq.pl and
shuffleSequences_fasta.pl, given that Velvet already has a wrapper in Galaxy
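For reference, the first option above could be sketched roughly as follows. This is a minimal illustration, not an existing Galaxy tool: the names read_fastq and interleave_fastq are hypothetical, and the sketch assumes that both files hold plain four-line FASTQ records with the mates in the same order.

```python
import itertools


def read_fastq(handle):
    """Yield FASTQ records as lists of four newline-terminated lines.

    Assumes unwrapped, four-line records (header, sequence, '+', quality).
    """
    while True:
        lines = [handle.readline() for _ in range(4)]
        if not lines[0]:
            return  # clean end of file
        if not lines[3]:
            raise ValueError("truncated FASTQ record")
        # Normalize line endings so records can be concatenated safely.
        yield [line.rstrip("\r\n") + "\n" for line in lines]


def interleave_fastq(forward_path, reverse_path, output_path):
    """Write mates alternately (forward, reverse, ...) into a single file.

    Hypothetical helper for illustration; returns the number of pairs written.
    """
    pairs = 0
    with open(forward_path) as fwd, open(reverse_path) as rev, \
            open(output_path, "w") as out:
        for rec1, rec2 in itertools.zip_longest(read_fastq(fwd), read_fastq(rev)):
            if rec1 is None or rec2 is None:
                raise ValueError("input files contain different numbers of reads")
            out.writelines(rec1)
            out.writelines(rec2)
            pairs += 1
    return pairs
```

Calling interleave_fastq("reads_1.fastq", "reads_2.fastq", "shuffled.fastq") with placeholder file names would write the pairs in the alternating layout that the Velvet shuffleSequences scripts are meant to produce. The sketch does not check that the read names of each pair actually match.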
Regarding contributing code to the galaxy-central repository, what is
the best way to get it done? Recently, I cloned the galaxy-central
repository on Bitbucket, made some changes and requested the changes to
be pulled, but I have not heard from the Galaxy Team yet. Let me know if
you like to do things a different way!
Best,
Florent
11 years, 6 months
Re: [galaxy-user] Fwd: Need a generous help regarding uploading
by Jennifer Jackson
Hello Asifullah,
It would be helpful if you would be able to share your history. Please
note which exact datasets (by number) you are having problems saving.
Use Options -> Share or Publish -> click on the button to Share via link
and then email that to me.
Thank you,
Jen
Galaxy team
On 12/15/10 7:56 AM, vasu punj wrote:
> I have done the same, and it gave me a BAM index file. When I removed the
> index extension, or used it as such, it turned out not to be a valid BAM
> file as recognized by various tools.
> Thanks
>
> --- On *Wed, 12/15/10, Jennifer Jackson /<jen(a)bx.psu.edu>/* wrote:
>
>
> From: Jennifer Jackson <jen(a)bx.psu.edu>
> Subject: Re: [galaxy-user] Fwd: Need a generous help regarding uploading
> To: "vasu punj" <punjv(a)yahoo.com>
> Cc: "galaxy-user(a)bx.psu.edu" <galaxy-user(a)bx.psu.edu>
> Date: Wednesday, December 15, 2010, 9:52 AM
>
> Hello Asifullah,
>
> Click on the "save" icon for the dataset to initiate a download. This
> icon is shaped like a little purple disk on the very left side of the
> dataset's box in the right history pane.
>
> Hopefully this is helpful,
>
> Thank you,
>
> Jen
> Galaxy team
>
> On 12/15/10 7:47 AM, vasu punj wrote:
> > Hi,
> > Can we save the BAM file from the TopHat output? If so, how can we do
> > that, please?
> > Thanks.
> >
> > --- On *Wed, 12/15/10, Jennifer Jackson /<jen(a)bx.psu.edu>/* wrote:
> >
> >
> > From: Jennifer Jackson <jen(a)bx.psu.edu>
> > Subject: Re: [galaxy-user] Fwd: Need a generous help regarding uploading
> > To: "asifullah khan" <asifullah111(a)gmail.com>
> > Cc: galaxy-user(a)bx.psu.edu
> > Date: Wednesday, December 15, 2010, 9:38 AM
> >
> > Hello Asifullah,
> >
> > The Galaxy tool "Get Data -> Upload File" has instructions for
> > uploading data to the FTP server. We recommend using the 3rd party
> > utility FileZilla as a desktop FTP client.
> >
> > http://file-zilla.com/
> >
> > Hopefully this will get you started,
> >
> > Best!
> >
> > Jen
> > Galaxy team
> >
> > ps: In the future, if you could send questions directly to
> > galaxy-user(a)bx.psu.edu, it would be very helpful for us.
> >
> > On 12/12/10 8:32 AM, Kelly Vincent wrote:
> > >
> > >
> > > Begin forwarded message:
> > >
> > >> *From: *asifullah khan <asifullah111(a)gmail.com>
> > >> *Date: *December 12, 2010 11:01:27 AM EST
> > >> *To: *galaxy-user-owner(a)lists.bx.psu.edu
> > >> *Subject: **Need a generous help regarding uploading*
> > >>
> > >> Dear All,
> > >>
> > >> I am having trouble uploading 1.2 GB of Illumina data to Galaxy for
> > >> subsequent assembly and analysis steps. I am basically a biologist
> > >> and new to modern bioinformatics tools. Someone suggested that I
> > >> first put my data on an HTTP web page, after which it would be easy
> > >> to upload to Galaxy. But that is also not possible for me, as I have
> > >> little computer knowledge. Could someone kindly help me with this
> > >> issue? I would highly appreciate his/her help.
> > >>
> > >> Regards
> > >> Asifullah Khan
> > >
> > >
> > >
> > > _______________________________________________
> > > galaxy-user mailing list
> > > galaxy-user(a)lists.bx.psu.edu
> > > http://lists.bx.psu.edu/listinfo/galaxy-user
> >
> > --
> > Jennifer Jackson
> > http://usegalaxy.org
> > _______________________________________________
> > galaxy-user mailing list
> > galaxy-user(a)lists.bx.psu.edu
> > http://lists.bx.psu.edu/listinfo/galaxy-user
> >
> >
>
> --
> Jennifer Jackson
> http://usegalaxy.org
>
>
--
Jennifer Jackson
http://usegalaxy.org
11 years, 6 months
Re: [galaxy-user] Fwd: Need a generous help regarding uploading
by Jennifer Jackson
Hello Asifullah,
Click on the "save" icon for the dataset to initiate a download. This
icon is shaped like a little purple disk on the very left side of the
dataset's box in the right history pane.
Hopefully this is helpful,
Thank you,
Jen
Galaxy team
On 12/15/10 7:47 AM, vasu punj wrote:
> Hi,
> Can we save the BAM file from the TopHat output? If so, how can we do that, please?
> Thanks.
>
> --- On *Wed, 12/15/10, Jennifer Jackson /<jen(a)bx.psu.edu>/* wrote:
>
>
> From: Jennifer Jackson <jen(a)bx.psu.edu>
> Subject: Re: [galaxy-user] Fwd: Need a generous help regarding uploading
> To: "asifullah khan" <asifullah111(a)gmail.com>
> Cc: galaxy-user(a)bx.psu.edu
> Date: Wednesday, December 15, 2010, 9:38 AM
>
> Hello Asifullah,
>
> The Galaxy tool "Get Data -> Upload File" has instructions for
> uploading data to the FTP server. We recommend using the 3rd party
> utility FileZilla as a desktop FTP client.
>
> http://file-zilla.com/
>
> Hopefully this will get you started,
>
> Best!
>
> Jen
> Galaxy team
>
> ps: In the future, if you could send questions directly to
> galaxy-user(a)bx.psu.edu, it would be very helpful for us.
>
> On 12/12/10 8:32 AM, Kelly Vincent wrote:
> >
> >
> > Begin forwarded message:
> >
> >> *From: *asifullah khan <asifullah111(a)gmail.com>
> >> *Date: *December 12, 2010 11:01:27 AM EST
> >> *To: *galaxy-user-owner(a)lists.bx.psu.edu
> >> *Subject: **Need a generous help regarding uploading*
> >>
> >> Dear All,
> >>
> >> I am having trouble uploading 1.2 GB of Illumina data to Galaxy for
> >> subsequent assembly and analysis steps. I am basically a biologist
> >> and new to modern bioinformatics tools. Someone suggested that I
> >> first put my data on an HTTP web page, after which it would be easy
> >> to upload to Galaxy. But that is also not possible for me, as I have
> >> little computer knowledge. Could someone kindly help me with this
> >> issue? I would highly appreciate his/her help.
> >>
> >> Regards
> >> Asifullah Khan
> >
> >
> >
> > _______________________________________________
> > galaxy-user mailing list
> > galaxy-user(a)lists.bx.psu.edu
> > http://lists.bx.psu.edu/listinfo/galaxy-user
>
> --
> Jennifer Jackson
> http://usegalaxy.org
> _______________________________________________
> galaxy-user mailing list
> galaxy-user(a)lists.bx.psu.edu
> http://lists.bx.psu.edu/listinfo/galaxy-user
>
>
--
Jennifer Jackson
http://usegalaxy.org
11 years, 6 months
Fwd: Need a generous help regarding uploading
by Kelly Vincent
Begin forwarded message:
> From: asifullah khan <asifullah111(a)gmail.com>
> Date: December 12, 2010 11:01:27 AM EST
> To: galaxy-user-owner(a)lists.bx.psu.edu
> Subject: Need a generous help regarding uploading
>
> Dear All,
>
> I am having trouble uploading 1.2 GB of Illumina data to Galaxy for
> subsequent assembly and analysis steps. I am basically a biologist
> and new to modern bioinformatics tools. Someone suggested that I
> first put my data on an HTTP web page, after which it would be easy
> to upload to Galaxy. But that is also not possible for me, as I have
> little computer knowledge. Could someone kindly help me with this
> issue? I would highly appreciate his/her help.
>
> Regards
> Asifullah Khan
11 years, 6 months
Bam file after Tophat run
by vasu punj
I am running RNA-seq in Galaxy. After running TopHat I am trying to save the BAM file, but by default what I get is a BAM index file. Any suggestion on how I can save the BAM output of TopHat?
Thanks.
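When a download is not what was expected, the first bytes of the file can tell a BAM file from a BAM index. The sketch below is only an illustration and is not part of Galaxy: sniff_alignment_file is a hypothetical helper. It relies on the format definitions, in which a BAM file is BGZF (gzip-compatible) compressed and begins with the magic bytes BAM\1 once decompressed, while a BAI index is uncompressed and begins with BAI\1.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"


def sniff_alignment_file(path):
    """Return "bam", "bai", or "unknown" based on the file's leading bytes.

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as handle:
        head = handle.read(4)
    if head == b"BAI\x01":
        return "bai"  # uncompressed BAM index
    if head[:2] == GZIP_MAGIC:
        try:
            # BGZF blocks are valid gzip members, so gzip can read them.
            with gzip.open(path, "rb") as handle:
                if handle.read(4) == b"BAM\x01":
                    return "bam"
        except (OSError, EOFError, zlib.error):
            pass  # not readable as gzip after all
    return "unknown"
```

If the saved file is reported as "bai", renaming it will not turn it into a BAM file; the alignment dataset itself has to be downloaded.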
11 years, 6 months
help regarding annotation
by pande
Dear galaxy,
Conceptually, if I take sequences from mm9, use the lift-over tool to
convert them to mm8, and then annotate them with any software for cis
element annotation, do you think I will get a fairly good result?
I am a bit confused. Does the annotation of the cis regulome undergo so
much variation?
thanks .
11 years, 6 months
Wiki erratum...
by Richard Bruskiewich
Galaxy Colleagues,
I don't know who is maintaining the Galaxy wiki page at
http://bitbucket.org/galaxy/galaxy-central/wiki/NGSLocalSetup but I noticed
that the Python script under the Megablast instructions has an error: the
"defline" operation after the "line.startswith" should be moved *after* the
if length > 0 statement, otherwise, the defline is reset incorrectly before
the previous sequence is written out. This results in a frameshift in the
FASTA header line identifiers (i.e. the current sequence gets the next
sequence identifier).
I've commented out the erroneous defline below and added the right one:
import sys

length = 0
defline = ''
seq = []

# Read FASTA from standard input and rewrite each header as >ID_LENGTH.
for line in sys.stdin:
    line = line.rstrip('\r\n')
    if line.startswith('>'):
        # defline = line.split("|")[1]  # defline should NOT be here
        if length > 0:
            # Write out the previous record under its own identifier.
            print(">%s_%s" % (defline, length))
            print("\n".join(seq))
            length = 0
            seq = []
        defline = line.split("|")[1]  # defline should be here
    else:
        seq.append(line)
        length += len(line)

# Write out the last record.
print(">%s_%s" % (defline, length))
print("\n".join(seq))
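To see the effect of the corrected placement, the same logic can be wrapped in a function and run on a two-record example. This is a sketch for illustration only: the function name relabel_fasta and the sample headers are made up, it uses Python 3 syntax, and unlike the script above it skips the final write when no sequence was read.

```python
def relabel_fasta(lines):
    """Return FASTA text with each header rewritten as >ID_LENGTH.

    ID is the second "|"-separated field of the original header. The defline
    is updated only after the previous record has been emitted, so each
    sequence keeps its own identifier.
    """
    out = []
    defline, length, seq = "", 0, []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if length > 0:
                # Emit the previous record before touching defline.
                out.append(">%s_%s" % (defline, length))
                out.extend(seq)
                length, seq = 0, []
            defline = line.split("|")[1]
        else:
            seq.append(line)
            length += len(line)
    if length > 0:
        # Emit the last record (skipped for empty input).
        out.append(">%s_%s" % (defline, length))
        out.extend(seq)
    return "\n".join(out)


# Made-up sample input: two records with 6 and 3 bases.
sample = [">gi|111|first\n", "ACGT\n", "AC\n", ">gi|222|second\n", "GGG\n"]
print(relabel_fasta(sample))
# prints:
# >111_6
# ACGT
# AC
# >222_3
# GGG
```

With the defline assignment placed before the length check, the first record would instead be written as >222_6, which is the frameshift described above.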
While on the topic of this page, perhaps the software versions need to be
revisited. Megablast has been superseded already by Blast+. Perhaps new
releases of Galaxy should update this?
BTW, when is the new Galaxy release (cloud man AMI too...) coming out? I
heard rumors that it was due this week.
Cheers
Richard Bruskiewich
--
*Richard Bruskiewich, PhD*
Senior Scientist, Computational and Systems Biology
Applications Team for Computational Genomics
T.T. Chang Genetic Resources Center
International Rice Research Institute
11 years, 6 months
Bioinformatics analyst
by Weidong Zhang
Hello -
There is a Bioinformatics analyst position open at The Jackson Laboratory. If interested, please go to http://www.jax.org/careers/current-jobs.html to apply.
Job Description
There is a position available in the Computational Sciences Statistics and Analysis (CS-SA) Group to provide bioinformatics analysis and statistical consulting support to our research staff and JAX Mice and Services. This position will have a significant focus on providing analytical support for high-throughput-sequence (HTPS) data analysis and will interact directly with research faculty and staff at The Jackson Laboratory. The successful candidate will need to develop a broad understanding of faculty research goals and will consult with faculty on the use of the bioinformatics service. Responsibilities include a majority of the following:
* Analysis and interpretation of HTPS data;
* Developing and managing HTPS data analysis pipelines;
* Involvement in all phases of analysis from data processing and quality control through final analytical results;
* Develop analytical expertise in DNASeq, RNASeq and ChIPSeq as it pertains to HTPS data analysis;
* Statistical consulting;
* QTL Analysis;
* Gene Expression Analysis;
* Work closely with the other Computational Sciences groups, especially the Computational Sciences scientific computing group, to develop synergistic interactions that promote best use and collaboration within the entire Computational Sciences group;
* Direct involvement in writing data analysis scripts in widely used bioinformatics and statistics programming languages such as Perl, Python, R, MATLAB, or SAS.
Required Experience
The successful candidate should have a strong background in bioinformatics tools and resources, molecular biology, genomics and genetics, and data analysis techniques; advanced knowledge of biostatistics is a plus. Minimum requirements are a PhD in a biomedical and/or biostatistics discipline, or an MS degree and significant relevant experience. The successful candidate will be detail-oriented and highly organized, have excellent written and verbal communication skills, and be able to orally present complex materials to scientific audiences. The successful candidate will also need to have a high level of interest in continuous learning of new science and tools for discovery. Flexibility to travel a few times each year is required.
Job Location
Bar Harbor, ME, US.
11 years, 6 months
Re: [galaxy-user] [galaxy-lab] [galaxy-dev] Wiki erratum...] + cloud question
by Enis Afgan
Hi Richard,
Please see some comments below.
On Fri, Dec 10, 2010 at 8:59 PM, <anton(a)bx.psu.edu> wrote:
> ---------------------------- Original Message ----------------------------
> Subject: Re: [galaxy-dev] Wiki erratum...
> From: "Richard Bruskiewich" <r.bruskiewich(a)irri.org>
> Date: Fri, December 10, 2010 7:17 pm
> To: "Anton Nekrutenko" <anton(a)bx.psu.edu>
> --------------------------------------------------------------------------
>
> Hi Anton,
>
> The power of open source: many eyes... Glad to be of help.
>
> BTW, thank you for all your tutorial videos... they are excellent. I
> present them to my staff as an example of how to empower end users so they
> can work more independently.
>
> I am located at the International Rice Research Institute (IRRI;
> www.irri.org) in the Philippines where I've been for over a decade working
> on rice genomics. Due to recent strategic research restructuring here, I
> now have the excuse, after years of senior research management, of simply
> being a bioinformatics hacker again. It's both fun and frustrating.
>
> I'm just getting seriously started with Galaxy although I've known about
> the platform for some years now. It is a very exciting tool. I can't wait
> to put it to good use in our projects here.
>
> In particular, NGS data sets are starting to pour in from many IRRI
> projects. Galaxy promises to make the analysis of such data tractable,
> documented and efficient.
>
> In fact, in 2011, we may be resequencing up to 10,000 new rice genomes.
> Galaxy on the Amazon cloud is a godsend for this, although I'm patiently
> waiting for the AMI to be cloned to the ap-southeast region in Singapore,
> where we do most of our computing deployments (since we are in Asia). I've
> also been told that the next release will also include the Michael Smith
> Genome Sciences Center ABySS assembler... I'm keen on using that software
> within Galaxy.
>
> On that note, a technical question about which I'm curious: does Galaxy
> configuration currently allow specific tools to run on specific sized
> instances? For example, if I fire up a Galaxy CloudMan cluster with a few
> large RAM Amazon instances/nodes, can I specifically request that specific
> software components (e.g. assemblers like ABySS) run only, or preferably,
> on those high capacity nodes?
>
Currently, Galaxy Cloud allows a cluster to be composed of multiple types of
instances, but the selection of which tool runs on which instance is handled
by the job manager (i.e., SGE), and thus a specific job cannot be targeted at
a specific instance type; we should eventually provide support for this type
of functionality. In the meantime, a cluster can be composed of the instance
type that matches the current workload, and the instance type can then be
changed as the workload changes.
> Also, Amazon has so-called "cluster" instances, and now GPU cluster
> instances. Again, the same idea applies: can specific tools be told to only
> run on such a cluster instance? Further ahead, could Galaxy be configured
> to automatically start/stop specific instances only when needed (including
> cluster instances)?
>
MPI-type jobs are the only true beneficiaries of the cluster instances, but
only a handful of bioinformatics programs are actually implemented using MPI,
and those instances require a different AMI. For these reasons we do not
currently support that type of instance; maybe down the line.
Nonetheless, in the upcoming version of Galaxy Cloud (currently being
tested), the application will be able to automatically scale the size of the
cluster based on the current workload.
Thanks for your interest,
Enis
> I know... probably forging recklessly ahead here. I hope to have a stronger
> computing science staff on board in a few months which may allow me to
> explore such topics more proactively, but I'm simply wondering about the
> state-of-the-art here.
>
> I hope that once I get more familiar with the platform, I'll be able to
> contribute back more. I'm configuring Galaxy to connect to rice genome
> data, and there are some other tools I think might be useful in the
> platform (for our work, anyhow) so I'll get them in, then share the
> configuration files with the community. Maybe the deeper I dig, the more
> useful I'll get :-).
>
> Cheers
> Richard
>
> --
> *Richard Bruskiewich, PhD*
> Senior Scientist, Computational and Systems Biology
> Applications Team for Computational Genomics
> T.T. Chang Genetic Resources Center
> International Rice Research Institute
>
> On Fri, Dec 10, 2010 at 9:50 PM, Anton Nekrutenko <anton(a)bx.psu.edu>
> wrote:
>
> > Richard:
> >
> > This beauty was mine. Thanks for pointing this out. It is now fixed.
> >
> > Thanks,
> >
> > anton
> >
> >
> > On Dec 9, 2010, at 10:04 PM, Richard Bruskiewich wrote:
> >
> > Galaxy Colleagues,
> >
> > I don't know who is maintaining the Galaxy wiki page at
> > http://bitbucket.org/galaxy/galaxy-central/wiki/NGSLocalSetup but I
> > noticed that the Python script under the Megablast instructions has an
> > error: the "defline" operation after the "line.startswith" should be
> > moved *after* the if length > 0 statement, otherwise, the defline is
> > reset incorrectly before the previous sequence is written out. This
> > results in a frameshift in the FASTA header line identifiers (i.e. the
> > current sequence gets the next sequence identifier).
> >
> > I've commented out the erroneous defline below and added the right one:
> >
> > import sys
> >
> > length = 0
> > defline = ''
> > seq = []
> >
> > for line in sys.stdin:
> >     line = line.rstrip('\r\n')
> >     if line.startswith('>'):
> >         # defline = line.split("|")[1]  # defline should NOT be here
> >         if length > 0:
> >             print(">%s_%s" % (defline, length))
> >             print("\n".join(seq))
> >             length = 0
> >             seq = []
> >         defline = line.split("|")[1]  # defline should be here
> >     else:
> >         seq.append(line)
> >         length += len(line)
> >
> > print(">%s_%s" % (defline, length))
> > print("\n".join(seq))
> >
> > While on the topic of this page, perhaps the software versions need to be
> > revisited. Megablast has been superseded already by Blast+. Perhaps new
> > releases of Galaxy should update this?
> >
> > BTW, when is the new Galaxy release (cloud man AMI too...) coming out? I
> > heard rumors that it was due this week.
> >
> > Cheers
> > Richard Bruskiewich
> >
> > --
> > *Richard Bruskiewich, PhD*
> > Senior Scientist, Computational and Systems Biology
> > Applications Team for Computational Genomics
> > T.T. Chang Genetic Resources Center
> > International Rice Research Institute
> >
> > _______________________________________________
> > galaxy-dev mailing list
> > galaxy-dev(a)lists.bx.psu.edu
> > http://lists.bx.psu.edu/listinfo/galaxy-dev
> >
> >
> > Anton Nekrutenko
> > http://nekrut.bx.psu.edu
> > http://usegalaxy.org
> >
> >
> >
> >
>
> _______________________________________________
> galaxy-lab mailing list
> galaxy-lab(a)lists.bx.psu.edu
> http://lists.bx.psu.edu/listinfo/galaxy-lab
>
>
11 years, 6 months