<< Graduate Student: Yanfeng Zhang
   Comparative Genomics Group.
   Kunming Institute of Zoology, Chinese Academy of Sciences.
>>




> From: galaxy-user-request@lists.bx.psu.edu
> Subject: galaxy-user Digest, Vol 58, Issue 3
> To: galaxy-user@lists.bx.psu.edu
> Date: Tue, 5 Apr 2011 11:33:23 -0400
>
> Send galaxy-user mailing list submissions to
> galaxy-user@lists.bx.psu.edu
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://lists.bx.psu.edu/listinfo/galaxy-user
> or, via email, send a message with subject or body 'help' to
> galaxy-user-request@lists.bx.psu.edu
>
> You can reach the person managing the list at
> galaxy-user-owner@lists.bx.psu.edu
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of galaxy-user digest..."
>
>
> HEY! This is important! If you reply to a thread in a digest, please
> 1. Change the subject of your response from "Galaxy-user Digest Vol ..." to the original subject for the thread.
> 2. Strip out everything else in the digest that is not part of the thread you are responding to.
>
> Why?
> 1. This will keep the subject meaningful. People will have some idea from the subject line if they should read it or not.
> 2. Not doing this greatly increases the number of emails that match search queries, but that aren't actually informative.
>
> Today's Topics:
>
> 1. Re: MAF (Eccles, David)
> 2. Re: Analyzing Targeted Resequencing data with Galaxy
> (Anton Nekrutenko)
> 3. Re: MAF (Anton Nekrutenko)
> 4. convert formats (Sher, Falak)
> 5. Re: convert formats (Daniel Blankenberg)
> 6. Subject: Analyzing Targeted Resequencing data with> Galaxy
> (Jackie Lighten)
> 7. Re: Analyzing Targeted Resequencing data with Galaxy
> (Anton Nekrutenko)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 5 Apr 2011 14:33:35 +0200
> From: "Eccles, David" <david.eccles@mpi-muenster.mpg.de>
> To: "Ross" <ross.lazarus@gmail.com>
> Cc: galaxy-user@lists.bx.psu.edu
> Subject: Re: [galaxy-user] MAF
> Message-ID:
> <B4B747BF2FE2BB43A6D483192168CBD172D0D5@VSM.exc.top.gwdg.de>
> Content-Type: text/plain; charset="iso-8859-1"
>
> On Tue, Apr 5, 2011 at 5:19 AM, Laura Iacolina <liacolina@uniss.it> wrote:
> > Considering I have to analyse 200 samples with 50K markers is there any way
> > to tell R to analyse each SNP one after the other?
>
> From: Ross [mailto:ross.lazarus@gmail.com]
> > There are some Galaxy wrappers for plink
> > (http://pngu.mgh.harvard.edu/~purcell/plink/) that may be useful for
> > some kinds of analysis available in the rgenetics tools if you have
> > linkage pedigree genotype and map files.
>
> I would also advise using plink for this. Calculating SNP marker statistics
> [1] is one of the things it was designed to do. The main problem is getting
> the data into a format supported by plink: either linkage (one line per
> individual) or transposed pedigree (one line per marker). The plink
> documentation [2] describes both formats.
>
> [1] http://pngu.mgh.harvard.edu/~purcell/plink/summary.shtml#freq
> [2] http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml#tr
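
On going through every SNP in turn: here is a minimal Python sketch (not plink itself, and not the R genetics package) that computes the minor allele frequency for each line of a transposed pedigree file. It assumes the layout described in the plink documentation [2]: chromosome, SNP identifier, genetic distance, base-pair position, then two allele columns per individual, with 0 as the missing-allele code. The function name and the example genotypes are made up for illustration.

```python
from collections import Counter


def minor_allele_frequency(tped_line, missing="0"):
    """Return (snp_id, maf) for one line of a transposed pedigree file.

    Assumes a biallelic marker; maf is None when no allele was called.
    """
    fields = tped_line.split()
    snp_id = fields[1]
    # Columns 5 onwards hold two alleles per individual; skip missing calls.
    called = [allele for allele in fields[4:] if allele != missing]
    counts = Counter(called)
    if not counts:
        return snp_id, None
    if len(counts) == 1:
        return snp_id, 0.0  # monomorphic marker
    if len(counts) > 2:
        raise ValueError(f"{snp_id}: more than two alleles observed")
    return snp_id, min(counts.values()) / len(called)


# Hypothetical data: four individuals, two markers, one missing genotype.
example_lines = [
    "1 rs0001 0 1000 A A A G A A 0 0",
    "1 rs0002 0 2000 C C C C C C C C",
]
for line in example_lines:
    snp_id, maf = minor_allele_frequency(line)
    print(snp_id, maf)
```

Because each marker sits on its own line, 50K markers are just 50K iterations of the loop. For real analyses plink's own frequency report [1] is probably the safer choice.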
>
> --
> David Eccles (gringer)
>
>
>
> ------------------------------
>
> Message: 2
> Date: Tue, 5 Apr 2011 09:56:44 -0400
> From: Anton Nekrutenko <anton@bx.psu.edu>
> To: Lali <laurafe@gmail.com>
> Cc: galaxy-user@lists.bx.psu.edu
> Subject: Re: [galaxy-user] Analyzing Targeted Resequencing data with
> Galaxy
> Message-ID: <8172FBD2-4CDA-4312-B54F-DCC730A40AB9@bx.psu.edu>
> Content-Type: text/plain; charset=us-ascii
>
> Lali:
>
> In your case the workflow for capture re-sequencing should look like this:
>
> 1. QC data (groom fastq files and plot quality distribution)
> 2. Map the reads (use bwa)
> 3. Generate and filter pileup
> 4. Intersect pileup with coordinates of SureSelect baits.
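
Steps 3 and 4 can be approximated outside Galaxy as well. The sketch below is only an illustration under stated assumptions, not the Galaxy tools themselves: it assumes the six-column samtools pileup layout (chromosome, 1-based position, reference base, depth, read bases, qualities) and a BED-style bait file with 0-based, half-open intervals. The function names, the depth threshold, and the example coordinates are hypothetical. It keeps the pileup positions that fall inside a bait and meet a minimum depth.

```python
def load_baits(bed_lines):
    """Group BED-style intervals (0-based start, exclusive end) by chromosome."""
    baits = {}
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        baits.setdefault(chrom, []).append((int(start), int(end)))
    return baits


def filter_pileup(pileup_lines, baits, min_depth=10, depth_col=3):
    """Keep pileup lines that are on target and have depth >= min_depth.

    depth_col is the 0-based index of the depth column; 3 matches the
    assumed six-column layout and must be changed for other layouts.
    """
    kept = []
    for line in pileup_lines:
        fields = line.rstrip("\n").split("\t")
        chrom, pos, depth = fields[0], int(fields[1]), int(fields[depth_col])
        # Pileup positions are 1-based, BED starts are 0-based.
        on_target = any(start < pos <= end for start, end in baits.get(chrom, []))
        if on_target and depth >= min_depth:
            kept.append(line)
    return kept


# Hypothetical bait and pileup records.
baits = load_baits(["chr1\t100\t200\tbait_1"])
pileup = [
    "chr1\t150\tA\t25\t.........................\tIIIIIIIIIIIIIIIIIIIIIIIII",
    "chr1\t160\tC\t3\t...\tIII",
    "chr1\t300\tG\t40\t........................................\t" + "I" * 40,
]
for line in filter_pileup(pileup, baits):
    print(line.split("\t")[:4])
```

The linear scan over baits is fine for a small example; a sorted interval lookup would be needed for a full capture design.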
>
> However, before you dive in, please learn basic Galaxy functionality by taking a look at http://usegalaxy.org/galaxy101 and watching *all* Illumina-related Galaxy quickies (black boxes on the Galaxy front page). Next, take a look at http://usegalaxy.org/heteroplasmy.
>
> Note that we are working on bringing "industrial-strength" diploid genotyping functionality to Galaxy in the next two to three months; it will include more sophisticated genotypers, recalibration and realignment tools, and novel visualization approaches.
>
> Thanks for using Galaxy.
>
> anton
> galaxy team
>
>
>
> On Apr 5, 2011, at 2:44 AM, Lali wrote:
>
> > Hi!
> > I am having problems with my sequencing results, but I am a newbie at this; so I am thinking there is something wrong with my analysis. So far, I've tried Galaxy and CLC Workbench, but with CLC I could not align to the whole genome, only to individual chromosomes (maybe there is a way, but by the time the trial ended I had not found it).
> >
> > I used SureSelect capture kit and did single end sequencing on an Illumina. The files the lab sent me are FastQ Illumina 1.5 files, my samples were indexed, and I got a series of files each representing an Index.
> >
> > What would be the standard workflow for this kind of data?
> > Which tools/settings?
> >
> > Does anyone have an example Galaxy workflow for preparing (clipping adapters, quality trimming) and mapping Targeted Resequencing Data?
> >
> > Is there a way to obtain a coverage report through Galaxy?
> >
> > Is it possible to ignore/discard the reads mapped when the coverage is below a certain threshold?
> >
> > I know, I know, a lot of things, but I am very lost.
> > Any help is appreciated.
> >
> > L
> > ___________________________________________________________
> > The Galaxy User list should be used for the discussion of
> > Galaxy analysis and other features on the public server
> > at usegalaxy.org. Please keep all replies on the list by
> > using "reply all" in your mail client. For discussion of
> > local Galaxy instances and the Galaxy source code, please
> > use the Galaxy Development list:
> >
> > http://lists.bx.psu.edu/listinfo/galaxy-dev
> >
> > To manage your subscriptions to this and other Galaxy lists,
> > please use the interface at:
> >
> > http://lists.bx.psu.edu/
>
> Anton Nekrutenko
> http://nekrut.bx.psu.edu
> http://usegalaxy.org
>
>
>
>
>
>
> ------------------------------
>
> Message: 3
> Date: Tue, 5 Apr 2011 10:09:44 -0400
> From: Anton Nekrutenko <anton@bx.psu.edu>
> To: Laura Iacolina <liacolina@uniss.it>
> Cc: galaxy-user@lists.bx.psu.edu
> Subject: Re: [galaxy-user] MAF
> Message-ID: <0E6E6AC6-0300-4B5E-B462-BA6715FC1F6A@bx.psu.edu>
> Content-Type: text/plain; charset=windows-1252
>
> Laura:
>
> SNP identification and analysis is a very complex subject, and without knowing what you are trying to do it is very difficult to point you in the right direction. Perhaps a good place to start would be the supplement to last year's report from the 1000 Genomes Consortium (Nature. 467(7319): p. 1061-1073). You can perform some of the steps through Galaxy; others are still in development.
>
> Thanks!
>
> anton
> galaxy team
>
>
> On Apr 5, 2011, at 5:19 AM, Laura Iacolina wrote:
>
> > Dear all,
> > I'm analysing SNP data for the first time. I tried the few software packages I found in the literature, but they can only manage small datasets. I am currently trying the 'genetics' package in R, but the Geno function takes into account one marker at a time. Considering I have to analyse 200 samples with 50K markers, is there any way to tell R to analyse each SNP one after the other?
> >
> > Thank you very much for the help.
> >
> > Laura
> >
> >
>
> Anton Nekrutenko
> http://nekrut.bx.psu.edu
> http://usegalaxy.org
>
>
>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Tue, 5 Apr 2011 10:21:38 -0400
> From: "Sher, Falak" <Falak.Sher@childrens.harvard.edu>
> To: "galaxy-user@lists.bx.psu.edu" <galaxy-user@lists.bx.psu.edu>
> Subject: [galaxy-user] convert formats
> Message-ID:
> <28032B153244774DA100823F4D4764210CBC2B2D6F@CHEXCCRV4.CHBOSTON.ORG>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi colleagues,
>
> I used MACS for peak finding through Galaxy, and I want to convert the resulting wig files to bigWig using the Galaxy tool "convert formats".
> The job is submitted but never starts running. I resubmitted it and it still does not start; it is stuck with the message "job is waiting to run".
> Logging out and back in does not help.
>
> Any suggestions or information, please?
>
> F
>
>
>
> ------------------------------
>
> Message: 5
> Date: Tue, 5 Apr 2011 10:30:47 -0400
> From: Daniel Blankenberg <dan@bx.psu.edu>
> To: "Sher, Falak" <Falak.Sher@childrens.harvard.edu>
> Cc: "galaxy-user@lists.bx.psu.edu" <galaxy-user@lists.bx.psu.edu>
> Subject: Re: [galaxy-user] convert formats
> Message-ID: <B7D82135-EF7A-4DD1-81F2-9A99C1BA2023@bx.psu.edu>
> Content-Type: text/plain; charset=us-ascii
>
> Hi Falak,
>
> Because the underlying wigToBigWig executable can use huge amounts of RAM, a single large-memory node is allocated for these jobs. This has the unfortunate side effect that wigToBigWig jobs may need to wait a significant amount of time before being executed. Please be patient; if you have waited for a very long time and suspect a problem, please do report it.
>
> Thanks for using Galaxy,
>
> Dan
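
If waiting for the shared node is not practical, the same conversion can in principle be run on a local machine. The sketch below is a suggestion under assumptions, not how Galaxy runs the job: it assumes the UCSC wigToBigWig program is installed and on the PATH, and that its usage is `wigToBigWig in.wig chrom.sizes out.bw`. The Python function names and the file names are hypothetical.

```python
import shutil
import subprocess


def wig_to_bigwig_command(wig_path, chrom_sizes_path, bigwig_path):
    """Build the argument list: wigToBigWig in.wig chrom.sizes out.bw."""
    return ["wigToBigWig", wig_path, chrom_sizes_path, bigwig_path]


def convert_wig_to_bigwig(wig_path, chrom_sizes_path, bigwig_path):
    """Run the conversion, failing early if the program is not installed."""
    cmd = wig_to_bigwig_command(wig_path, chrom_sizes_path, bigwig_path)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("wigToBigWig was not found on the PATH")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(cmd, check=True)


# Hypothetical file names; this only prints the command that would be run.
print(" ".join(wig_to_bigwig_command("peaks.wig", "hg19.chrom.sizes", "peaks.bw")))
```

The chrom.sizes file must match the genome build the wig file was generated against, and the memory demand Dan describes applies locally too.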
>
>
>
>
> ------------------------------
>
> Message: 6
> Date: Tue, 05 Apr 2011 10:02:12 -0300
> From: Jackie Lighten <jc807177@dal.ca>
> To: <galaxy-user@lists.bx.psu.edu>
> Subject: [galaxy-user] Subject: Analyzing Targeted Resequencing data
> with> Galaxy
> Message-ID: <C9C09924.23C2%jc807177@dal.ca>
> Content-Type: text/plain; charset="US-ASCII"
>
> You can do all the quality filtering with Galaxy, but it may involve various
> manipulations of the data. If I am not mistaken, the "metagenomics" workflow
> may help you out a little. It's designed for 454 data but should give you an
> idea of how to go about things. There is a video tutorial on the site for
> this workflow.
>
> A good place for you to start, however, may be here:
> http://edwards.sdsu.edu/prinseq_beta/#
>
> Prinseq is easy to use and will give you a full breakdown of your raw data
> and enables you to filter by quality/length etc.
>
> FastQC is an Illumina-specialized preliminary analysis tool:
> http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/
>
> I myself was not very impressed with CLC at all. It lacks very rudimentary
> yet critical functions.
>
> I am currently working on population amplicon data, so I couldn't really help
> you much with the latest advances in mapping to a reference, but I found the
> Lasergene SeqMan mapping and de novo assembler much better than the CLC
> assembler.
> Good luck
>
> Jack
>
> On 11
> >
> > Message: 6
> > Date: Tue, 5 Apr 2011 08:44:06 +0200
> > From: Lali <laurafe@gmail.com>
> > To: galaxy-user@lists.bx.psu.edu
> > Subject: [galaxy-user] Analyzing Targeted Resequencing data with
> > Galaxy
> > Message-ID: <BANLkTin1ShWLQQ46+mFFBcxS-dO1GJuw_A@mail.gmail.com>
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> > Hi!
> > I am having problems with my sequencing results, but I am a newbie at this;
> > so I am thinking there is something wrong with my analysis. So far, I've
> > tried Galaxy and CLC Workbench, but with CLC I could not align to the whole
> > genome, only to individual chromosomes (maybe there is a way, but by the
> > time the trial ended I had not found it).
> >
> > I used SureSelect capture kit and did single end sequencing on an Illumina.
> > The files the lab sent me are FastQ Illumina 1.5 files, my samples were
> > indexed, and I got a series of files each representing an Index.
> >
> > What would be the standard workflow for this kind of data?
> > Which tools/settings?
> >
> > Does anyone have an example Galaxy workflow for preparing (clipping
> > adapters, quality trimming) and mapping Targeted Resequencing Data?
> >
> > Is there a way to obtain a coverage report through Galaxy?
> >
> > Is it possible to ignore/discard the reads mapped when the coverage is below
> > a certain threshold?
> >
> > I know, I know, a lot of things, but I am very lost.
> > Any help is appreciated.
> >
> > L
> >
>
>
>
>
> ------------------------------
>
> Message: 7
> Date: Tue, 5 Apr 2011 11:33:18 -0400
> From: Anton Nekrutenko <anton@bx.psu.edu>
> To: Lali <laurafe@gmail.com>
> Cc: galaxy-user <galaxy-user@lists.bx.psu.edu>
> Subject: Re: [galaxy-user] Analyzing Targeted Resequencing data with
> Galaxy
> Message-ID: <C33F2B26-C075-47FC-B2F7-9D468ED3ED0E@bx.psu.edu>
> Content-Type: text/plain; charset="us-ascii"
>
> Lali:
>
> Please always CC the mailing list when you reply.
>
> > My only problem with Galaxy is that I have to keep on clearing my cache in order to get the history to display correctly, is there another way of solving this issue?
>
>
> Which browser/OS are you using?
>
> Thanks,
>
> anton
> galaxy team
>
> On Apr 5, 2011, at 11:25 AM, Lali wrote:
>
> > Thanks so much for the tips Anton!
> > I am very excited about the newer developments.
> > I did watch the quickies and they were very useful for a beginner like me, I actually did my first try at the alignment by following the Illumina single-end tutorial video step by step, but you need to watch the paired-end too, for some of the first steps, which are explained better on that one.
> > I have been playing around a lot with Galaxy, and I have several workflows, my department just started doing sequencing, so we don't have standard procedures set in place. I was assigned to evaluate Galaxy and CLC, and so far CLC has not impressed me, except for the fact that it can generate reports easily.
> > I think Galaxy is the way to go for me (us, if I can convince them to run a local server), since I am not a bioinformatician, and just the fact that you can queue up actions and just walk away is fantastic (amongst other things).
> > But because I am a beginner, I am not 100% sure of the settings I have chosen and my data is not looking too good so far, but I am having a bioinformatician come over and help me on Thursday and I think your tips will be of help.
> > My only problem with Galaxy is that I have to keep on clearing my cache in order to get the history to display correctly, is there another way of solving this issue?
> >
> > Best regards,
> >
> > L
> >
>
> Anton Nekrutenko
> http://nekrut.bx.psu.edu
> http://usegalaxy.org
>
>
>
>
> ------------------------------
>
> _______________________________________________
> galaxy-user mailing list
> galaxy-user@lists.bx.psu.edu
> http://lists.bx.psu.edu/listinfo/galaxy-user
>
>
> End of galaxy-user Digest, Vol 58, Issue 3
> ******************************************