Python error when running Bowtie for Illumina
by Weng Khong Lim
Hi all,
I'm new to next-gen sequencing, so please be gentle. I've just received a
pair of Illumina FASTQ files from the sequencing facility and intend to map
them to the hg19 reference genome. I first used the FASTQ Groomer utility to
convert the reads to Sanger-encoded FASTQ. However, when running Bowtie for
Illumina on the resulting dataset under default settings, I received the
following error:
An error occurred running this job: Error aligning sequence. requested
number of bytes is more than a Python string can hold
Can someone help point out my mistake? My history is accessible at
http://main.g2.bx.psu.edu/u/wengkhong_lim/h/chip-seq-pilot-batch
Appreciate the help!
Weng Khong, LIM
Department of Genetics
University of Cambridge
E-mail: wkl24(a)cam.ac.uk
Tel: +447503225832
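One quick offline sanity check before re-running the aligner is to confirm
that the groomed file really is Sanger-encoded: quality characters below ';'
(ASCII 59) only occur in Phred+33 (Sanger) encodings, never in Phred+64. A
minimal shell sketch, assuming the groomed file is named groomed.fastq (a
hypothetical name):
    # Print the distinct quality characters, sorted from lowest to
    # highest ASCII code; any character below ';' indicates Phred+33.
    awk 'NR % 4 == 0' groomed.fastq | fold -w1 | LC_ALL=C sort -u | tr -d '\n'; echo
If the lowest character printed is '@' or above, the file may still be
Phred+64 and worth re-grooming.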
12 years
The uploaded file contains inappropriate content
by Erick Antezana
Hi,
It seems there is a (serious) bug when "Adding datasets" via the
"Upload directory of files" option if the files are in ZIP format, with or
without copying the data into Galaxy (3545:2590120aed68), i.e. whether the
'No' box is ticked or not. The files actually get erased from the
filesystem.
In the "information" column, I get:
Job error (click name for more info)
Then, after clicking on the file name:
Information about 2010.fastq.zip
Message:
Uploaded by: erick(a)mydomain.com
Date uploaded: 2010-03-23
Build: ?
Miscellaneous information: The uploaded file contains inappropriate content
error
We have no problems adding unzipped files...
thanks,
Erick
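Until this is fixed, a workaround consistent with the observation above (no
problems with unzipped files) is to unpack the archive locally and upload
its members individually, optionally gzip-compressing each one if your
instance accepts gzipped single-file uploads (an assumption worth checking).
A sketch, using the 2010.fastq.zip name from the report:
    # Unpack the ZIP archive into a scratch directory and, optionally,
    # recompress each member on its own before uploading; adjust the
    # *.fastq pattern to whatever the archive actually contains.
    mkdir -p unpacked
    unzip -d unpacked 2010.fastq.zip
    gzip unpacked/*.fastq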
12 years, 8 months
Re: [galaxy-user] downloading huge data sets from history using wget
by Florent Angly
Hi Peter,
Please use 'reply all' so that everyone on the mailing list can
participate in the discussion.
I did not publish my history, so that's probably not what causes
problems for you.
If you can click on the 'save' icon and it starts the download
successfully, then you ought to be able to copy the download link, use it
with wget, and have it work. What happens when you click on 'save'? Does it
start the download?
Florent
On 31/05/10 09:57, pis(a)duke.edu wrote:
> Hi Florent,
>
> Do you think that I need to publish it first as a history and then try
> it again?
> I suspect that may be the reason for the strange behavior.
>
> I will let you know when I get it to work
>
> Thank you very much for your help
>
> Have a nice day
> Peter
>
>
>
> Zitat von Florent Angly <florent.angly(a)gmail.com>:
>
>> Hi Peter,
>>
>> See an example below:
>>> $ wget
>>> http://main.g2.bx.psu.edu/datasets/59a2a6ec00c47fc4/display?to_ext=fasta
>>>
>>> --2010-05-31 09:45:24--
>>> http://main.g2.bx.psu.edu/datasets/59a2a6ec00c47fc4/display?to_ext=fasta
>>>
>>> Resolving main.g2.bx.psu.edu... 128.118.201.93
>>> Connecting to main.g2.bx.psu.edu|128.118.201.93|:80... connected.
>>> HTTP request sent, awaiting response... 200 OK
>>> Length: 1177056689 (1.1G) [text/plain]
>>> Saving to: `display?to_ext=fasta'
>>> 0% [ ] 224,112 243K/s
>> The download link was copied from the "save" icon.
>>
>> When I try with your link, I get:
>>> $ wget
>>> http://main.g2.bx.psu.edu/datasets/c3a8db0a339f7a43/display?to_ext=fastqs...
>>>
>>> --2010-05-31 09:49:10--
>>> http://main.g2.bx.psu.edu/datasets/c3a8db0a339f7a43/display?to_ext=fastqs...
>>>
>>> Resolving main.g2.bx.psu.edu... 128.118.201.93
>>> Connecting to main.g2.bx.psu.edu|128.118.201.93|:80... connected.
>>> HTTP request sent, awaiting response... 416 Request Range Not
>>> Satisfiable
>>>
>>> The file is already fully retrieved; nothing to do.
>> The "request range not satisfiable" makes me think that your download
>> link is not valid for some reason.
>>
>> Florent
>>
>>
>> On 30/05/10 23:43, pis(a)duke.edu wrote:
>>> Dear Florent Angly,
>>>
>>> Thank you very much for your response. I have actually tried to do
>>> that but it still does not work. When I choose "copy link location"
>>> in Firefox (in my version no "save link location" appears), I get a
>>> URL with a strange data file name such as
>>> http://main.g2.bx.psu.edu/datasets/c3a8db0a339f7a43/display?to_ext=fastqs....
>>> This will not work with wget ("wget: No match"). It does not work
>>> either when I replace the data file name with the name of the file
>>> that appears when I want to download the file using the disk icon. I
>>> would be happy for further advice on that, since it has apparently
>>> worked for you.
>>>
>>> Thank you very much for your help
>>> Peter
>>>
>>> Zitat von Florent Angly <florent.angly(a)gmail.com>:
>>>
>>>> Hi Peter,
>>>> It's pretty easy. In the Galaxy interface, use the "save" icon
>>>> represented as a floppy disk. Instead of clicking on the icon, get
>>>> the URL for it (in Firefox: right click > save link location). Then
>>>> simply copy this URL in your terminal for wget to use.
>>>> Florent
>>>>
>>>> On 30/05/10 07:12, pis(a)duke.edu wrote:
>>>>> Dear Galaxy team,
>>>>>
>>>>> I am doing some standard clipping and trimming using Galaxy and
>>>>> would be happy to download the generated files using a Unix
>>>>> terminal and wget. Is there a way to figure out the exact link of
>>>>> the data file so I can wget it easily? I am talking about file
>>>>> sizes over 5 GB that are really time-consuming and error-prone to
>>>>> download using a web browser.
>>>>>
>>>>> Thank you very much for developing this excellent tool
>>>>> Peter
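For what it's worth, the "wget: No match" error quoted above is
characteristic of shells such as csh/tcsh expanding the '?' in the URL as a
filename glob, and the 416 "file is already fully retrieved" response often
just means wget tried to resume into a leftover local file from an earlier
attempt. Quoting the URL and naming the output avoids both; a sketch using
Florent's example link:
    # Quote the URL so the shell does not glob on '?'; -O names the
    # output file and -c resumes an interrupted download.
    wget -c -O dataset.fasta \
      'http://main.g2.bx.psu.edu/datasets/59a2a6ec00c47fc4/display?to_ext=fasta'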
12 years, 8 months
Re: [galaxy-user] downloading huge data sets from history using wget
by Florent Angly
Hi Peter,
See an example below:
> $ wget
> http://main.g2.bx.psu.edu/datasets/59a2a6ec00c47fc4/display?to_ext=fasta
> --2010-05-31 09:45:24--
> http://main.g2.bx.psu.edu/datasets/59a2a6ec00c47fc4/display?to_ext=fasta
> Resolving main.g2.bx.psu.edu... 128.118.201.93
> Connecting to main.g2.bx.psu.edu|128.118.201.93|:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 1177056689 (1.1G) [text/plain]
> Saving to: `display?to_ext=fasta'
> 0% [ ] 224,112 243K/s
The download link was copied from the "save" icon.
When I try with your link, I get:
> $ wget
> http://main.g2.bx.psu.edu/datasets/c3a8db0a339f7a43/display?to_ext=fastqs...
> --2010-05-31 09:49:10--
> http://main.g2.bx.psu.edu/datasets/c3a8db0a339f7a43/display?to_ext=fastqs...
> Resolving main.g2.bx.psu.edu... 128.118.201.93
> Connecting to main.g2.bx.psu.edu|128.118.201.93|:80... connected.
> HTTP request sent, awaiting response... 416 Request Range Not Satisfiable
>
> The file is already fully retrieved; nothing to do.
The "request range not satisfiable" makes me think that your download
link is not valid for some reason.
Florent
On 30/05/10 23:43, pis(a)duke.edu wrote:
> Dear Florent Angly,
>
> Thank you very much for your response. I have actually tried to do
> that but it still does not work. When I choose "copy link location"
> in Firefox (in my version no "save link location" appears), I get a
> URL with a strange data file name such as
> http://main.g2.bx.psu.edu/datasets/c3a8db0a339f7a43/display?to_ext=fastqs....
>
> This will not work with wget ("wget: No match"). It does not work
> either when I replace the data file name with the name of the file
> that appears when I want to download the file using the disk icon. I
> would be happy for further advice on that, since it has apparently
> worked for you.
>
> Thank you very much for your help
> Peter
>
> Zitat von Florent Angly <florent.angly(a)gmail.com>:
>
>> Hi Peter,
>> It's pretty easy. In the Galaxy interface, use the "save" icon
>> represented as a floppy disk. Instead of clicking on the icon, get
>> the URL for it (in Firefox: right click > save link location). Then
>> simply copy this URL in your terminal for wget to use.
>> Florent
>>
>> On 30/05/10 07:12, pis(a)duke.edu wrote:
>>> Dear Galaxy team,
>>>
>>> I am doing some standard clipping and trimming using Galaxy and
>>> would be happy to download the generated files using a Unix
>>> terminal and wget. Is there a way to figure out the exact link of
>>> the data file so I can wget it easily? I am talking about file
>>> sizes over 5 GB that are really time-consuming and error-prone to
>>> download using a web browser.
>>>
>>> Thank you very much for developing this excellent tool
>>> Peter
12 years, 8 months
downloading huge data sets from history using wget
by pis@duke.edu
Dear Galaxy team,
I am doing some standard clipping and trimming using Galaxy and would be
happy to download the generated files using a Unix terminal and wget. Is
there a way to figure out the exact link of the data file so I can wget it
easily? I am talking about file sizes over 5 GB that are really
time-consuming and error-prone to download using a web browser.
Thank you very much for developing this excellent tool
Peter
12 years, 8 months
Problems with Galaxy kindly reply urgently
by Amit Pande
Dear Galaxy,
We have installed Galaxy on one of our Institute's
servers, http://totoro:8080/, but the problem is that whenever a file is
uploaded, for example in BED format, to retrieve sequences with the Fetch
Sequences tool, the message that gets displayed is: "Sequences not
available for the specific build".
Kindly help.
warm regards,
Amit.
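That message usually means the local instance has no sequence data
installed for the build in question: a fresh Galaxy only knows about the
genomes you give it. The exact setup depends on the Galaxy version, but as
a rough sketch (the alignseq.loc name and its "seq <build> <path>" line
format are assumptions based on older releases; check the comments in the
.loc files shipped with your instance):
    # Fetch the hg19 sequence data from UCSC and register it with Galaxy.
    mkdir -p /data/genomes/hg19
    wget -P /data/genomes/hg19 \
      'http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.2bit'
    printf 'seq\thg19\t/data/genomes/hg19\n' >> tool-data/alignseq.loc
After editing a .loc file, restart the Galaxy server so the change is
picked up.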
12 years, 8 months
installing galaxy
by pande
Dear Galaxy,
If I install Galaxy on my Institute's server, will I be in a
position to extract the sequences of my interest and use the tools as I
would on your server? I have the ENCODE data, and it is difficult to
upload these files over the internet because of their enormous size.
Kindly help.
regards,
Amit.
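In short, yes: a local Galaxy is the same application as the public server,
so the same tools are available once you install them and their reference
data (see the build issue above), and large files such as ENCODE data can
be placed on the server directly rather than uploaded through a browser. A
minimal sketch of the install as documented at the time, assuming Mercurial
is available:
    # Clone the Galaxy distribution and start it; by default the web
    # interface listens on http://localhost:8080/.
    hg clone https://bitbucket.org/galaxy/galaxy-dist
    cd galaxy-dist
    sh run.sh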
12 years, 8 months
Issue with saving 'manipulate fastq' in workflow; and request for advice dealing with barcoded 454 data
by Pip Griffin
Hi,
I'm a new user, learning how to use Galaxy while I wait for my 454 results.
So I'm not actually playing with any data yet but I'm trying to set up a
draft workflow as practice. Two issues:
Issue 1.
I am having trouble with the 'Manipulate FASTQ' tool. Without it, my
workflow saves quickly and seems fine, but when I include even a (seemingly
simple) 'Manipulate FASTQ' step, it tries to save for many minutes,
unsuccessfully, until I get sick of it and close the window.
Issue 2.
Well this isn't really an issue, just a request for advice! My dataset will
be a barcoded amplicon library, containing 8 different gene regions (which I
can recognise from the amplicon-specific primer sequences) amplified in 64
different individuals (which I can recognise by an individual-specific
barcode sequence). I thought I'd set up a workflow with the following steps:
1) convert to FASTQ format;
2) groom and filter to remove short reads etc.;
3) 'Manipulate FASTQ' to match all sequences containing one of the eight
reverse primer sequences, and reverse-complement them;
4) FASTQ-to-tabular format conversion;
5) eight separate 'select' steps to select sequences with a match to either
the forward primer or the reverse-complemented reverse primer of the
desired gene region.
My question is: does this seem sensible? Is there a more efficient way to do
this that I haven't discovered yet? I was thinking I'd then set up another
workflow to label barcoded individuals, for which I could use each of the
eight gene 'output files' in turn as input.
Thanks so much for this service! The screencasts are especially great.
Pip Griffin
University of Melbourne, Australia
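On step 5 of the proposed workflow: the eight 'select' steps amount to
pattern matching on the sequence column of the tabular file, which is easy
to prototype outside Galaxy too. A minimal sketch with made-up primer
sequences, assuming the FASTQ-to-tabular output has the read sequence in
column 2 (both the primers and the column position are assumptions):
    # Hypothetical forward and reverse-complemented reverse primers for
    # one gene region; repeat (or loop) for each of the eight regions.
    FWD='ACGTACGTACGT'
    RCREV='CCAATTGGCCAA'
    # Keep rows whose sequence column matches either primer.
    awk -v f="$FWD" -v r="$RCREV" '$2 ~ f || $2 ~ r' \
      reads.tabular > gene1.tabular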
12 years, 8 months