Hi everyone,
I am currently running FASTQ Groomer on some data (about 7 GB). I am surprised because I launched the analysis several hours ago and it is still running. I am wondering if everything is alright, since it is usually faster than that, even on bigger datasets. I had the same problem yesterday with FastQC, which had not finished after 24 hours although it was still shown as running (I finally discarded that job).
Is the Galaxy server busy, or do you think it is because of my data?
Thank you in advance for your help.
Sincerely
Emmanuelle Lerat
Hi Emmanuelle,
I just went in and looked at your data. It appears that you uploaded a tar archive, and that archive format is not supported. Loading by FTP is a two-step process: transfer the file with an FTP client, then load it into your history using either upload tool ("Get Data: Upload File", or "Upload" at the top of the tool panel; the two are used in very similar ways). The FTP instructions are here: https://wiki.galaxyproject.org/FTPUpload
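For anyone who wants to pull the FASTQ file out of a tar archive locally before uploading, here is a minimal Python sketch using the standard `tarfile` module. The function name `extract_fastq_members`, the file paths, and the assumption that the reads end in `.fastq` or `.fq` are all illustrative and not taken from the data discussed here.

```python
import shutil
import tarfile
from pathlib import Path


def extract_fastq_members(archive_path, out_dir):
    """Copy only the regular FASTQ files out of a tar archive.

    Hypothetical helper: it assumes the reads are stored with a
    .fastq or .fq extension. Returns the paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    # "r:*" lets tarfile detect plain, gzip, or bzip2 compressed archives
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive:
            # Keep only the file name, dropping directories stored in the archive
            name = Path(member.name).name
            if not member.isfile() or not name.endswith((".fastq", ".fq")):
                continue
            target = out_dir / name
            with archive.extractfile(member) as source, open(target, "wb") as handle:
                # Stream the copy so a multi-gigabyte file never sits in memory
                shutil.copyfileobj(source, handle)
            extracted.append(target)
    return extracted
```

Called as `extract_fastq_members("reads.tar", "unpacked")` (both names are placeholders), it leaves plain FASTQ files in `unpacked/` that can then be transferred with an FTP client.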
That said, sometimes with tar archives the first file in the archive is extracted and loaded. From a quick look this seems to be what happened, but a closer look shows a few problems with the extracted file.
1. The file is truncated. I used the tool "Text Manipulation -> Select last lines from a dataset" to view the end of the file and see this.
2. The datatype ".fastqillumina" may or may not be the correct designator for the starting quality scores. Once you upload the file again (just the "fastq" file, not the tar archive), run the tool "FastQC" to determine the correct datatype.
3. After that, you can run the groomer as needed. The groomer run you currently have going has settings that do not match the input datatype you had set. This wiki explains how to do all of this correctly: https://wiki.galaxyproject.org/Support#Dataset_special_cases
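The first two checks can also be roughed out locally before uploading. The Python sketch below is not a replacement for FastQC. It assumes the common four-line FASTQ layout (no wrapped sequence lines), and the function names `scan_fastq` and `guess_quality_offset` are made up for illustration. It walks the file record by record, stops at the first incomplete or malformed record (which is how a truncated file typically shows up), and tracks the lowest quality character seen so the likely score offset can be guessed.

```python
def scan_fastq(path):
    """Scan a four-line-per-record FASTQ file.

    Returns the number of complete records, whether an incomplete or
    malformed record was hit, and the lowest quality byte value seen.
    """
    records = 0
    min_quality = None
    truncated = False
    with open(path, "rb") as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # clean end of file
            if not header.strip():
                continue  # tolerate blank lines, e.g. at the very end
            seq = handle.readline().rstrip(b"\r\n")
            plus = handle.readline()
            qual = handle.readline().rstrip(b"\r\n")
            ok = (header.startswith(b"@")
                  and plus.startswith(b"+")
                  and len(seq) == len(qual))
            if not ok:
                truncated = True  # record is cut off or not valid FASTQ
                break
            records += 1
            if qual:
                lowest = min(qual)  # iterating bytes yields integer values
                min_quality = lowest if min_quality is None else min(min_quality, lowest)
    return {"records": records, "truncated": truncated, "min_quality": min_quality}


def guess_quality_offset(min_quality):
    """Heuristic guess at the quality encoding from the lowest byte seen."""
    if min_quality is None:
        return "unknown"
    if min_quality < 59:
        return "phred33"   # Sanger / Illumina 1.8+ (Galaxy: fastqsanger)
    if min_quality < 64:
        return "solexa64"  # older Solexa scores (Galaxy: fastqsolexa)
    return "phred64"       # Illumina 1.3 to 1.7 (Galaxy: fastqillumina)
```

The offset guess is only a heuristic: a file made up entirely of high-quality Phred+33 reads could in principle look like Phred+64, so the FastQC report is still the thing to trust when assigning the datatype.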
As you go through this, please leave datasets undeleted in case you would like feedback. I wasn't able to look at all of the steps you did since some datasets were permanently deleted. Once everything is confirmed as OK, you can delete or permanently delete what isn't needed.
When you stop the current jobs, delete and then permanently delete all of the data to completely stop the processes (and remove all associated data) and recover disk space.
Hopefully this helps things go smoother this time,
Jen, Galaxy team
On 5/27/14 3:58 AM, Emmanuelle Lerat wrote:
galaxy-user@lists.galaxyproject.org