Hello again, Vadim.

I just ran the tool on main using known good data and it runs as expected. ploteig is clearly being found correctly, so it looks like your data is the source of your problem - it definitely has some issues. You can find the data I used (the sample data supplied with plink) and import it for testing from the 'medium' folder in the 'GenotypeSamples' shared data library at http://main.g2.bx.psu.edu/library

In the MDLD data:

1) Affection status (col 6 in the .fam file) is set to missing for all subjects. That's normally used to set the plot symbol and legend, so it may be a source of the eigensoft error.

2) The family id (first column) of the pedigree in your MDLD.fam file contains the population group as a text field. While that seems like a logical thing to do, it's not the way pedigrees with all unrelateds are normally specified, and eigensoft may be getting confused. The normal convention is that every independent family has its own unique family id, but you have dozens of unrelated (no father/mother id) subjects in the same 'family'.

Now that I see how you are using the tool, I think we need a better and more flexible way to specify the nominal population group for the plot, not just case/control as I'd originally conceived the tool. Thanks for bringing up this more general use case. I'll take a look at adding an optional plink style phenotype text file to provide those indicators and let you know when I have something to test.

FYI the GRR plot is interesting. Two 'unrelated' subjects, HGDP00794 and HGDP00801, are almost certainly siblings or a parent/child pair (there is not enough data to tell which). You would have to remove one of that pair to get strictly valid eigenvectors.

Once again, thanks for using Galaxy and thanks again for helping us make it even better...

On Mon, May 16, 2011 at 11:13 PM, Ross <ross.lazarus@gmail.com> wrote:
Hi, thanks for reporting this error.
Ploteig is a Perl script that is part of the eigensoft package and, judging from the error message, it seems to be inaccessible while eigensoft is executing.
We'll fix this as soon as we can.
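As a rough illustration of how a "sh: ploteig: not found" failure could be caught before the job runs, here is a minimal Python sketch that checks whether the external programs are resolvable on PATH. The tool names are taken from the job stdout in the error report below; the helper name find_missing_tools is hypothetical and is not part of rgEigPCA.py or Galaxy.

```python
import shutil


def find_missing_tools(tools, path=None):
    """Return the names in `tools` that cannot be resolved as executables.

    `path` is an optional PATH-style string; None means use the
    current environment's PATH. (Hypothetical helper for illustration.)
    """
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


if __name__ == "__main__":
    # Names as they appear in the job stdout of the error report.
    required = ["smartpca", "ploteig", "evec2pca.perl"]
    missing = find_missing_tools(required)
    if missing:
        print("not found on PATH: " + ", ".join(missing))
    else:
        print("all eigensoft helpers found")
```

Running a check like this in the same environment the job runner uses (rather than an interactive login shell) matters, since the two may not share the same PATH.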
Meanwhile, in case you didn't check, I noticed that if you click on the eye icon of items 2 and 4 in your history, the eigenplot is there - even though the history items are in an error state...I'm not sure I understand exactly how this has happened but it did.
The way the samples are labelled (all are 'sample') on the plot is not very informative - I need to look more closely at the data to see how we can make the plot more useful.
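One way to look more closely at the labelling is to tally the family id (column 1) and phenotype (column 6) of the PLINK .fam file that drives the plot. The sketch below is only an illustration: it assumes the standard six-column whitespace-separated .fam layout and PLINK's usual missing-phenotype codes (0 and -9). The function name summarise_fam and the sample rows are hypothetical and are not taken from the MDLD data.

```python
from collections import Counter

# PLINK's conventional codes for a missing case/control phenotype.
MISSING_PHENOTYPES = {"0", "-9"}


def summarise_fam(lines):
    """Summarise family ids and phenotypes from .fam-formatted lines.

    Each line is expected to hold: family id, individual id,
    father id, mother id, sex, phenotype.
    """
    family_sizes = Counter()
    nonfounders = Counter()
    n_missing = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed rows
        fid, _iid, father, mother, _sex, pheno = fields[:6]
        family_sizes[fid] += 1
        if father != "0" or mother != "0":
            nonfounders[fid] += 1
        if pheno in MISSING_PHENOTYPES:
            n_missing += 1
    total = sum(family_sizes.values())
    # Families with several members but no recorded parent links:
    # unrelated subjects grouped under one family id.
    founder_only = {
        fid: size
        for fid, size in family_sizes.items()
        if size > 1 and nonfounders[fid] == 0
    }
    return {
        "subjects": total,
        "all_phenotypes_missing": total > 0 and n_missing == total,
        "founder_only_families": founder_only,
    }


if __name__ == "__main__":
    # Hypothetical rows: two unrelateds sharing a group label as family id.
    sample = [
        "GroupA ind1 0 0 1 -9",
        "GroupA ind2 0 0 2 -9",
    ]
    print(summarise_fam(sample))
```

If every subject reports a missing phenotype, there is nothing to distinguish plot symbols by, which would be consistent with every point being labelled 'sample'.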
On Mon, May 16, 2011 at 9:48 PM, <galaxy-bugs@bx.psu.edu> wrote:
GALAXY TOOL ERROR REPORT
------------------------
This error report was sent from the Galaxy instance hosted on the server "main.g2.bx.psu.edu"
-----------------------------------------------------------------------------
This is in reference to dataset id 2452824 from history id 554342
-----------------------------------------------------------------------------
You should be able to view the history containing the related history item
3: Ancestry PCA_rgEig.txt
by logging in as a Galaxy admin user to the Galaxy instance referenced above and pointing your browser to the following link.
main.g2.bx.psu.edu/history/view?id=1f6151421eac94f6
-----------------------------------------------------------------------------
The user 'vadimverenich@gmail.com' provided the following information:
sh: ploteig: not found
-----------------------------------------------------------------------------
job id: 2167094
tool id: rgEigPCA1
-----------------------------------------------------------------------------
job command line:
python /galaxy/home/g2main/galaxy_main/tools/rgenetics/rgEigPCA.py "/galaxy/main_database/files/002/452/dataset_2452822_files/MDLD" "Ancestry PCA" "/galaxy/main_pool/pool5/tmp/job_working_directory/2167094/galaxy_dataset_2452823.dat" "/galaxy/main_pool/pool5/tmp/job_working_directory/2167094/dataset_2452823_files" "4" "5" "5" "6" "/galaxy/main_pool/pool5/tmp/job_working_directory/2167094/galaxy_dataset_2452824.dat"
-----------------------------------------------------------------------------
job stderr:
sh: ploteig: not found
----------------------------------------------------------------------------- job stdout: smartpca -p Ancestry_PCA_pca.xls.par >Ancestry_PCA_log.txt ploteig -i Ancestry_PCA_pca.xls.evec -c 1:2 -p ??? -x -y -o Ancestry_PCA_eigensoftplot.pdf.xtxt evec2pca.perl 4 Ancestry_PCA_pca.xls.evec /galaxy/main_database/files/002/452/dataset_2452822_files/MDLD.fam Ancestry_PCA_pca.xls rgEigPCA.py got /galaxy/home/g2main/galaxy_main/tools/rgenetics/rgEigPCA.py /galaxy/main_database/files/002/452/dataset_2452822_files/MDLD Ancestry PCA /galaxy/main_pool/pool5/tmp/job_working_directory/2167094/galaxy_dataset_2452823.dat /galaxy/main_pool/pool5/tmp/job_working_directory/2167094/dataset_2452823_files 4 5 5 6 /galaxy/main_pool/pool5/tmp/job_working_directory/2167094/galaxy_dataset_2452824.dat llist=c("???") glist=c(2) par(lab=c(10,10,10)) par(mai=c(1,1,1,0.5)) pdf("Ancestry_PCA_PCAPlot.pdf",h=8,w=10) par(lab=c(10,10,10)) pca1 = c(0.076000,0.083300,0.072000,0.077100,0.079000,0.070600,0.082900,0.077600,0.077900,0.075400,0.076500,0.071000,0.070000,0.070600,0.072800,0.073100,0.068000,0.066700,0.073700,0.073400,0.074000,0.078700,0.077000,0.080100,0.063700,0.081400,0.077000,0.082500,0.083500,0.064900,0.072500,0.075200,0.078200,0.064600,0.051800,0.076600,0.059900,0.062800,0.058200,0.052800,0.060600,0.064300,0.049700,0.054000,0.058300,0.051700,0.057700,0.042700,0.046800,0.038700,0.048200,0.058600,0.035800,0.039700,0.043700,0.041300,0.033800,0.038400,0.042000,0.030600,0.042200,0.046200,0.039500,0.036500,0.039400,0.045200,0.085400,0.085600,0.084800,0.098500,0.093700,0.071400,0.088400,0.090300,0.096300,0.096900,-0.109500,-0.107400,-0.098500,-0.102900,-0.108700,-0.102900,-0.102000,-0.103400,-0.104600,-0.105300,-0.105100,-0.072000,-0.111300,-0.070000,-0.112000,-0.100200,-0.114700,-0.103900,-0.005000,-0.068300,0.018600,0.002600,0.012400,-0.000600,0.014100,0.012000,0.018300,0.011800,0.003600,0.022300,0.008! 
900,0.012900,0.023900,0.084100,0.077700,0.075600,0.070000,0.077800,0.064400,0.082800,0.067100,0.064200,0.075200,0.064400,0.057500,0.056000,0.052400,0.058800,0.052600,-0.061100,-0.014800,0.047700,-0.056900,0.053200,-0.070900,0.060200,-0.070600,-0.075800,0.059800,-0.049500,0.056200,-0.073500,-0.037000,-0.055400,-0.074600,-0.068600,0.067800,0.055900,-0.072600,-0.069500,-0.077800,-0.068600,-0.070900,0.043300,0.070400,0.068000,-0.014200,-0.024000,-0.020600,-0.020400,-0.012500,-0.010300,-0.020700,-0.025500,-0.034300,-0.018900,-0.018200,-0.017100,-0.012900,-0.032600,-0.040300,-0.058900,-0.061900,-0.067100,-0.062300,-0.062800,-0.070300,-0.063700,-0.065000,-0.078900,-0.068800,-0.053300,-0.050400,-0.067800,-0.055600,-0.065300,-0.052000,-0.054400,-0.065400,-0.069000,-0.080300,-0.089400,-0.087900,-0.090100,-0.007800,-0.084000,-0.087500,-0.095700,-0.087700,-0.099300,-0.062100,-0.006700,-0.084200,-0.086400,-0.087500,-0.101600,0.033600,-0.096300,-0.090500,-0.044900,-0.055000,-0.049300,-0.! 054800,-0.059400,-0.061200,-0.055100,-0.052100,-0.055900,-0.050800,-0. 
063800,-0.055800,-0.061500,-0.057900,-0.065100,-0.056400,-0.057900,-0.044500) pca2 = c(0.007500,0.001300,0.012400,0.000100,-0.001100,0.004400,0.013100,0.015700,0.015300,0.016800,0.000100,0.008200,0.008700,0.006900,0.004600,0.018400,0.003400,0.007400,0.004600,0.023700,-0.001200,-0.008500,0.015300,0.003300,-0.034300,-0.030100,-0.052700,-0.040700,-0.035400,-0.027100,-0.030700,-0.037700,-0.035500,0.117400,0.112900,-0.012000,0.118600,0.126200,0.118500,0.115200,0.118300,0.087600,0.105900,0.111500,0.118000,0.090500,0.111500,-0.026600,-0.034800,-0.047100,-0.036100,-0.040500,-0.042900,-0.036300,-0.044200,-0.041100,-0.038500,-0.042800,-0.033700,-0.042300,-0.034400,-0.033900,-0.051700,-0.040500,-0.036900,-0.048600,-0.041800,-0.041400,-0.040700,-0.045200,-0.040400,-0.046000,-0.043900,-0.055300,-0.045300,-0.035400,-0.042800,-0.039800,-0.030700,-0.032400,-0.028200,-0.039600,-0.039000,-0.035900,-0.025800,-0.030100,-0.036700,-0.014100,-0.031900,-0.026600,-0.041800,-0.031900,-0.035900,-0.011100,-0.014900,-0.027700,-0.049200,-0.027400,-0.040600,-0.032000,-0.040800,-0.0! 
42800,-0.033100,-0.044500,-0.034900,-0.048500,-0.039900,-0.035900,-0.034500,-0.032300,-0.036800,-0.044800,-0.036200,-0.020200,0.005500,-0.018100,-0.029200,-0.004300,-0.027700,-0.068300,-0.057600,-0.055700,-0.049800,-0.048900,-0.058700,0.001000,0.008700,-0.048600,0.004700,-0.049600,-0.030500,-0.059400,-0.012300,-0.013800,-0.045800,0.018700,-0.048800,-0.011000,-0.001700,0.008400,-0.023000,-0.009500,-0.059300,-0.065300,-0.008000,-0.007800,-0.018700,-0.028000,0.016300,-0.051000,-0.019800,-0.031500,0.238200,0.249400,0.216100,0.235900,0.226100,0.236800,0.209900,0.149000,0.106500,0.222100,0.197900,0.210100,0.264400,0.137800,0.134200,0.019800,-0.010000,-0.010400,0.015100,0.009800,-0.002700,0.002500,-0.005600,-0.007600,-0.016600,-0.001800,-0.005400,-0.011200,-0.002700,0.016600,0.004000,0.010300,0.003800,-0.003100,-0.027700,-0.026600,-0.040400,-0.045900,-0.017100,-0.032800,-0.030000,-0.025500,-0.045500,-0.039500,-0.037300,-0.039100,-0.029200,-0.040500,-0.027500,-0.041000,-0.041900,-0! .046800,-0.038000,0.000400,-0.008000,-0.004200,-0.010300,-0.007400,-0. 
014600,-0.015600,-0.016800,-0.007100,-0.012600,-0.009900,-0.012400,-0.011300,-0.018300,-0.015600,-0.014800,-0.001700,-0.006000) pgvec = c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2) plot(pca1,pca2,type='p',main='Ancestry_PCA', ylab='Second ancestry eigenvector',xlab='First ancestry eigenvector',col=pgvec,cex=0.8,pch=pgvec) legend("top",legend=llist,pch=glist,col=glist,title="Sample") grid(nx = 10, ny = 10, col = "lightgray", lty = "dotted") dev.off() png("Ancestry_PCA_PCAPlot.pdf.png",h=8,w=10,units="in",res=72) plot(pca1,pca2,type='p',main='Ancestry_PCA', ylab='Second ancestry eigenvector',xlab='First ancestry eigenvector',col=pgvec,cex=0.8,pch=pgvec) legend("top",legend=llist,pch=glist,col=glist,title="Sample") grid(nx = 10, ny = 10, col = "lightgray", lty = "dotted") dev.off() ['pdf \n', ' 2 \n', 'pdf \n', ' 2 \n']
-----------------------------------------------------------------------------
job info: None
-----------------------------------------------------------------------------
job traceback: None
-----------------------------------------------------------------------------
(This is an automated message).
-- Ross Lazarus MBBS MPH; Associate Professor, Harvard Medical School; Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850; Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;