Index of /~nanjiang/zincpred/download
Name Last modified Size Description
ChangeLog.txt 27-Dec-2009 18:52 1k
predzinc_cygwin.tar.gz 20-Jan-2010 09:27 85.7M
predzinc_linux.tar.gz 27-Dec-2009 18:27 86.1M
* PREDZINC *
PREDZINC is a program for predicting zinc-binding sites in proteins from their
amino acid sequences. The program is written in C/C++ and bash shell scripts.
Currently, PREDZINC can be run on Linux and Windows (with Cygwin).
PREDZINC is copyrighted (c) to Nanjiang Shu, Structural Chemistry, Stockholm
University, Sweden and is free for academic use.
Reference:
Shu, N., T. Zhou, et al. (2008). "Prediction of zinc-binding sites in
proteins from sequence." Bioinformatics 24(6): 775-782.
================================================================================
Usage: predzinc.sh [options] sequence-file | --pssm pssm-file
Note: do not use ';' in the rootname of the sequence file
the supplied sequence should be in FASTA format
Options:
--pssm file : supply pssm file in PSI-BLAST -Q flag output format,
: if supplied, PSI-BLAST will not run
--seqfilelist file : supply a file with a list of sequence files for batch mode prediction
--pssmfilelist file : supply a file with a list of pssm files for batch mode prediction
--blastdb file : database for psi-blast, default = $BLASTDB/nr
--outpath path : output the result to the specified path, default=./
--cpu <int> : set the number of cpu cores to be used when running blastpgp
--not-clean : if supplied, the intermediate files will be kept
--help|-h : print this help message and exit
--version : print version
Note: only one of the four arguments
sequence-file
--pssm <pssm-file>
--seqfilelist <listfile>
--pssmfilelist <listfile>
can be supplied
Examples:
# In the subfolder "test"
# Carry out the prediction by supplying a single sequence file in FASTA format
../predzinc.sh test1.aa
# Carry out the prediction by supplying a single pssm file
../predzinc.sh --pssm test2.pssm
# Carry out the prediction by a list file with a number of sequence files
../predzinc.sh --seqfilelist test3.seqfilelist
Note that the rootname of the file should be <= 30 characters.
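The two file-name constraints noted above (no ';' in the rootname, and a
rootname of at most 30 characters) can be checked before starting a run. The
following is a minimal sketch, not part of PREDZINC: check_rootname is a
hypothetical helper, and it assumes that "rootname" means the file name
without its directory and without its last extension (e.g. "test1" for
"path/test1.aa").

```shell
# Hypothetical helper, not shipped with PREDZINC.
# Returns 0 if the rootname of the given file satisfies both documented
# constraints, otherwise prints a message to stderr and returns 1.
# The body runs in a subshell, so its variables do not leak to the caller.
check_rootname() (
    base=${1##*/}      # strip the directory part
    root=${base%.*}    # strip the last extension (assumed rootname)
    case $root in
        *';'*)
            echo "rootname contains ';': $root" >&2
            exit 1
            ;;
    esac
    if [ "${#root}" -gt 30 ]; then
        echo "rootname longer than 30 characters: $root" >&2
        exit 1
    fi
)
```

For example, "check_rootname test1.aa && ../predzinc.sh test1.aa" runs the
prediction only when the file name passes both checks.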
================================================================================
DOWNLOAD:
www.fos.su.se/~nanjiang/zincpred/download
================================================================================
INSTALLATION:
download the package
for Linux : predzinc_linux.tar.gz
for Windows (Cygwin) : predzinc_cygwin.tar.gz
then untar the file by
tar -xvzf predzinc_linux.tar.gz
or
tar -xvzf predzinc_cygwin.tar.gz
go into the predzinc directory
cd predzinc
export the environment variable
export PREDZINC=$PWD
Make sure that the NCBI nr database, formatted for PSI-BLAST, is installed.
The environment variable BLASTDB must point to the directory storing the nr
BLAST database needed by PSI-BLAST
export BLASTDB=path-storing-blast-nr-database
e.g.
export BLASTDB=/data/blastdb
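Before the first run, it can help to confirm that both variables are set as
described. The following is a minimal sketch, not part of PREDZINC:
check_predzinc_env is a hypothetical helper, and the file-name pattern it uses
to detect a formatted nr database (files such as nr.pal or nr.00.phr) is an
assumption about how the database was formatted.

```shell
# Hypothetical pre-flight check, not shipped with PREDZINC.
# Returns 0 if PREDZINC and BLASTDB both point to existing directories and
# BLASTDB appears to contain a formatted nr database, otherwise returns 1.
check_predzinc_env() (
    status=0
    if [ -z "${PREDZINC:-}" ] || [ ! -d "$PREDZINC" ]; then
        echo "PREDZINC is unset or not a directory" >&2
        status=1
    fi
    if [ -z "${BLASTDB:-}" ] || [ ! -d "$BLASTDB" ]; then
        echo "BLASTDB is unset or not a directory" >&2
        status=1
    else
        # Assumption: a formatted protein database leaves files such as
        # nr.pal, nr.phr or nr.00.phr in the database directory.
        set -- "$BLASTDB"/nr*.p*
        if [ ! -e "$1" ]; then
            echo "no formatted nr database found in $BLASTDB" >&2
            status=1
        fi
    fi
    exit "$status"
)
```

For example, "check_predzinc_env && echo ready" prints "ready" only when all
checks pass.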
If your operating system is x86 Linux or Cygwin, you should be able to run the
program right away with the pre-compiled executables.
go into the subfolder "test" and run the test sequence by
cd test
../predzinc.sh test0003.aa
You can also compile the program from source code by running
make
make install
make clean
in the "src" folder
You can also copy predzinc.sh to a common bin folder, e.g. /usr/bin/, so
that you can run the program from anywhere by running
predzinc.sh testseq.aa
================================================================================
Others:
The blastpgp used in this version of PREDZINC is 2.2.17.
If you want to use a different version of PSI-BLAST, please copy the program
'blastpgp' to $PREDZINC/bin and the corresponding matrix file to $PREDZINC/data
cp blastpgp $PREDZINC/bin
cp BLOSUM62 $PREDZINC/data
The gist-svm used in this version of PREDZINC is 2.1.1.
If you want to use a different version of gist-svm, please copy the programs
'gist-svm-train', 'gist-classify' and 'gist-score-svm' to $PREDZINC/bin
cp gist-svm-train $PREDZINC/bin
cp gist-classify $PREDZINC/bin
cp gist-score-svm $PREDZINC/bin
================================================================================
Running time:
It takes on average about 6 minutes to predict a protein sequence with 300
amino acids when running on a single core with a 2 GHz CPU and 1 GB RAM. Most
of the time is taken by PSI-BLAST for building the sequence profile. Once the
sequence profile is obtained, it takes about one minute to get the
prediction result.
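Since building the profile dominates the running time, an existing pssm file
can be reused through the --pssm option. The following is a minimal sketch,
not part of PREDZINC: run_predzinc and the PREDZINC_SH variable are
hypothetical, and the sketch assumes that the profile for NAME.aa, if one
exists, is stored next to it as NAME.pssm.

```shell
# Hypothetical wrapper, not shipped with PREDZINC.
# Runs predzinc.sh with --pssm when a non-empty profile exists next to the
# sequence file, and with the sequence file otherwise. Extra arguments are
# passed through unchanged.
# PREDZINC_SH (assumed name) selects the script to call; default predzinc.sh.
run_predzinc() (
    seq=$1
    shift
    case $seq in
        */*) dir=${seq%/*} ;;
        *)   dir=. ;;
    esac
    base=${seq##*/}
    pssm=$dir/${base%.*}.pssm      # assumed location of the profile
    if [ -s "$pssm" ]; then
        "${PREDZINC_SH:-predzinc.sh}" --pssm "$pssm" "$@"
    else
        "${PREDZINC_SH:-predzinc.sh}" "$seq" "$@"
    fi
)
```

For example, in the "test" folder, "PREDZINC_SH=../predzinc.sh" followed by
"run_predzinc test0003.aa" uses test0003.pssm when that file is present.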
================================================================================
Examples:
in the "test" folder, run
../predzinc.sh test0003.aa
or if the pssm file is already built
../predzinc.sh --pssm test0003.pssm
The prediction result will be written to test0003_final_zn.report,
as shown below.
Res : three-letter amino acid code
SerialNo : the residue number in sequence, starting from 1
Score : the predicted zinc-binding score for the residue, ranging from 0 to 1
*********** Begin the file test0003_final_zn.report ***********************
Zinc-binding site prediction by PREDZINC version 1.1 (c) Shu.
Reference:
Shu, N., Zhou, T. and Hovmoller, S. (2008) Prediction of zinc-binding
sites in proteins from sequence, Bioinformatics, 24, 775-782.
The following 9 residues were predicted as zinc-binding for protein "test0003" (with score >= 0.450)
Res SerialNo Score
HIS 92 0.886
HIS 94 0.931
ASP 96 0.818
HIS 97 0.795
HIS 167 0.753
ASP 189 0.669
ASP 327 0.531
HIS 382 0.818
HIS 404 0.769
Prediction scores for all 90 selected residues, sorted by scores
Res SerialNo Score
HIS 94 0.931
HIS 92 0.886
ASP 96 0.818
HIS 382 0.818
HIS 97 0.795
HIS 404 0.769
HIS 167 0.753
ASP 189 0.669
ASP 327 0.531
GLU 425 0.315
GLU 216 0.253
ASP 35 0.158
ASP 166 0.130
GLU 282 0.129
HIS 193 0.116
GLU 417 0.105
GLU 237 0.103
ASP 349 0.095
ASP 105 0.092
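The three-column layout of the report (Res, SerialNo, Score) lends itself to
filtering with a different score cutoff. The following is a minimal sketch,
not part of PREDZINC: zn_sites is a hypothetical helper, and it assumes that
the columns are separated by whitespace and that the full score table follows
a line containing "Prediction scores for all", as in the report above.

```shell
# Hypothetical helper, not shipped with PREDZINC.
# Usage: zn_sites REPORT [CUTOFF]
# Prints "Res SerialNo Score" for every residue in the full score table
# whose score is >= CUTOFF (default 0.450, the cutoff quoted in the report).
zn_sites() {
    awk -v cutoff="${2:-0.450}" '
        # The full table starts after this heading line.
        /Prediction scores for all/ { in_table = 1; next }
        # Data rows have three fields with a numeric residue number;
        # the "Res SerialNo Score" header row is skipped by the second test.
        in_table && NF == 3 && $2 ~ /^[0-9]+$/ && ($3 + 0) >= (cutoff + 0) {
            print $1, $2, $3
        }
    ' "$1"
}
```

For example, "zn_sites test0003_final_zn.report 0.8" lists only the residues
scoring at least 0.8.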
== Updated 2009-12-27