Index of /~nanjiang/zincpred/download

      Name                     Last modified       Size  Description

      Parent Directory         20-Jan-2010 09:30      -
      ChangeLog.txt            27-Dec-2009 18:52     1k
      predzinc_cygwin.tar.gz   20-Jan-2010 09:27  85.7M
      predzinc_linux.tar.gz    27-Dec-2009 18:27  86.1M

						 * PREDZINC *

PREDZINC is a program for predicting zinc-binding sites in proteins from their
amino acid sequences. The program is written in C/C++ and bash shell scripts.
Currently, PREDZINC can be run on Linux and Windows (with Cygwin).

PREDZINC is copyright (c) Nanjiang Shu, Structural Chemistry, Stockholm
University, Sweden, and is free for academic use.

Reference:
    Shu, N., Zhou, T. and Hovmoller, S. (2008). "Prediction of zinc-binding
    sites in proteins from sequence." Bioinformatics 24(6): 775-782.

================================================================================
Usage: predzinc.sh [options] sequence-file | --pssm pssm-file
 Note: do not use ';' in the rootname of the sequence file
       the supplied sequence should be in FASTA format

Options:
 --pssm         file : supply pssm file in PSI-BLAST -Q flag output format,
                     : if supplied, PSI-BLAST will not run
 --seqfilelist  file : supply a file with a list of sequence files for batch mode prediction
 --pssmfilelist file : supply a file with a list of pssm files for batch mode prediction
 --blastdb      file : database for psi-blast, default = $BLASTDB/nr
 --outpath      path : output the result to the specified path, default=./
 --cpu         <int> : set the number of cpu cores to be used when running blastpgp
 --not-clean         : if supplied, the intermediate files will be kept
 --help|-h           : print this help message and exit
 --version           : print version 

Note: only one of the four arguments
    sequence-file
    --pssm         <pssm-file>
    --seqfilelist  <listfile>
    --pssmfilelist <listfile>
can be supplied

Examples:

# In the subfolder "test" 
# Carry out the prediction by supplying a single sequence file in FASTA format
    ../predzinc.sh test1.aa

# Carry out the prediction by supplying a single pssm file
    ../predzinc.sh --pssm test2.pssm

# Carry out the prediction by a list file with a number of sequence files
    ../predzinc.sh --seqfilelist test3.seqfilelist

Note that the rootname of the file should be <= 30 characters.
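
The two naming rules stated above (no ';' in the rootname, rootname of at most
30 characters) can be checked before a batch run. The following is only a
sketch under two assumptions that this README does not confirm: that
"rootname" means the file name without its directory and last extension, and
that a list file for --seqfilelist holds one path per line. The helper names
rootname, check_rootname and make_seqfilelist are hypothetical and are not
part of PREDZINC.

```shell
#!/bin/bash
# Hypothetical helpers, not part of PREDZINC.
# Assumption: "rootname" = file name without directory and last extension.

# Print the rootname of a path, e.g. test/test1.aa -> test1
rootname() {
    local base="${1##*/}"       # strip the directory part
    printf '%s\n' "${base%.*}"  # strip the last extension, if any
}

# Return 0 if the rootname obeys both documented constraints.
check_rootname() {
    local root
    root="$(rootname "$1")"
    case "$root" in
        *";"*) echo "rejected (rootname contains ';'): $1" >&2; return 1 ;;
    esac
    if [ "${#root}" -gt 30 ]; then
        echo "rejected (rootname longer than 30 characters): $1" >&2
        return 1
    fi
    return 0
}

# Print the accepted paths, one per line (assumed list-file layout).
make_seqfilelist() {
    local f
    for f in "$@"; do
        check_rootname "$f" && printf '%s\n' "$f"
    done
    return 0
}
```

A possible use, with my.seqfilelist as a made-up file name:
    make_seqfilelist *.aa > my.seqfilelist
    ../predzinc.sh --seqfilelist my.seqfilelist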
================================================================================

DOWNLOAD: 

www.fos.su.se/~nanjiang/zincpred/download
================================================================================

INSTALLATION:

Download the package
  for Linux            : predzinc_linux.tar.gz
  for Windows (Cygwin) : predzinc_cygwin.tar.gz
then unpack the archive with
  tar -xvzf predzinc_linux.tar.gz
or
  tar -xvzf predzinc_cygwin.tar.gz

Go into the predzinc directory
  cd predzinc

Export the environment variable PREDZINC
  export PREDZINC=$PWD

Make sure that the NCBI nr database, formatted for PSI-BLAST, is installed. The
environment variable BLASTDB must point to the directory storing the nr BLAST
database needed by PSI-BLAST
  export BLASTDB=path-storing-blast-nr-database
  e.g. 
  export BLASTDB=/data/blastdb
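
Both variables can be sanity-checked before the first run. The sketch below
assumes the usual file layout of a formatted protein database (nr.phr, nr.pin
and nr.psq for a single volume, or an alias file nr.pal for a database split
into several volumes). The function names has_nr_db and check_predzinc_env
are hypothetical and are not part of PREDZINC.

```shell
#!/bin/bash
# Hypothetical helpers, not part of PREDZINC.

# Return 0 if a formatted protein database named "nr" appears to exist in
# the given directory (default: $BLASTDB).
has_nr_db() {
    local dir="${1:-${BLASTDB:-}}"
    [ -n "$dir" ] && [ -d "$dir" ] || return 1
    # Single-volume database: nr.phr, nr.pin, nr.psq
    if [ -f "$dir/nr.phr" ] && [ -f "$dir/nr.pin" ] && [ -f "$dir/nr.psq" ]; then
        return 0
    fi
    # Multi-volume database: alias file nr.pal (assumed layout)
    [ -f "$dir/nr.pal" ]
}

# Return 0 if PREDZINC is a directory and BLASTDB holds an nr database.
check_predzinc_env() {
    local status=0
    if [ -z "${PREDZINC:-}" ] || [ ! -d "$PREDZINC" ]; then
        echo "PREDZINC is not set or is not a directory" >&2
        status=1
    fi
    if ! has_nr_db; then
        echo "no formatted nr database found in BLASTDB='${BLASTDB:-}'" >&2
        status=1
    fi
    return "$status"
}
```

After the two export commands above, running check_predzinc_env prints
nothing and returns 0 when both settings look usable.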

If your operating system is x86 Linux or Cygwin, you should be able to run the
program with the pre-compiled executables right away. Go into the subfolder
"test" and run the program
  cd test
  ../predzinc.sh test0003.aa

You can also compile the program from source code by running 
  make 
  make install
  make clean
in the "src" folder

You can also copy predzinc.sh to a common bin folder, e.g. /usr/bin/, so
that you can run the program from anywhere by running
  predzinc.sh testseq.aa

================================================================================
Others:
The blastpgp used in this version of PREDZINC is 2.2.17.
If you want to use a different version of PSI-BLAST, please copy the program
'blastpgp' to $PREDZINC/bin and the corresponding matrix file to $PREDZINC/data
  cp blastpgp $PREDZINC/bin
  cp BLOSUM62 $PREDZINC/data

The gist-svm used in this version of PREDZINC is 2.1.1.
If you want to use a different version of gist-svm, please copy the programs
'gist-svm-train', 'gist-classify' and 'gist-score-svm' to $PREDZINC/bin
  cp gist-svm-train $PREDZINC/bin
  cp gist-classify $PREDZINC/bin
  cp gist-score-svm $PREDZINC/bin
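
After replacing any of these programs, it may help to confirm that everything
named in this section is in place. The sketch below only looks for the files
mentioned above under bin and data; the function name check_predzinc_tools is
hypothetical and is not part of PREDZINC.

```shell
#!/bin/bash
# Hypothetical helper, not part of PREDZINC.
# Usage: check_predzinc_tools [predzinc-dir]   (default: $PREDZINC)
# Reports programs named in this README that are missing or not executable
# under <predzinc-dir>/bin, and a missing <predzinc-dir>/data/BLOSUM62.
check_predzinc_tools() {
    local root="${1:-${PREDZINC:-}}" missing=0 prog
    if [ -z "$root" ]; then
        echo "usage: check_predzinc_tools <predzinc-dir> (or set PREDZINC)" >&2
        return 2
    fi
    for prog in blastpgp gist-svm-train gist-classify gist-score-svm; do
        if [ ! -x "$root/bin/$prog" ]; then
            echo "missing or not executable: $root/bin/$prog" >&2
            missing=1
        fi
    done
    if [ ! -f "$root/data/BLOSUM62" ]; then
        echo "missing matrix file: $root/data/BLOSUM62" >&2
        missing=1
    fi
    return "$missing"
}
```

With PREDZINC exported as described under INSTALLATION, calling
check_predzinc_tools without arguments checks the installed copy.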

================================================================================
Running time:

On average, it takes about 6 minutes to run the prediction for a protein
sequence of 300 amino acids on a single core of a 2 GHz CPU with 1 GB RAM. Most
of the time is taken by PSI-BLAST for building the sequence profile. Once the
sequence profile is available, it takes about one minute to get the
prediction result.

================================================================================
Examples:

In the "test" folder, run
  ../predzinc.sh test0003.aa
or, if the pssm file is already built,
  ../predzinc.sh --pssm  test0003.pssm

The prediction result will be written to test0003_final_zn.report
as shown below.

Res       : three-letter amino acid code
SerialNo  : the residue number in the sequence, starting from 1
Score     : the predicted zinc-binding score for the residue, ranging from 0 to 1

***********  Begin the file test0003_final_zn.report ***********************
Zinc-binding site prediction by PREDZINC version 1.1 (c) Shu.
Reference: 
   Shu, N., Zhou, T. and Hovmoller, S. (2008) Prediction of zinc-binding
   sites in proteins from sequence, Bioinformatics, 24, 775-782.


The following 9 residues were predicted as zinc-binding for protein "test0003" (with score >=  0.450)

Res SerialNo  Score

HIS       92  0.886
HIS       94  0.931
ASP       96  0.818
HIS       97  0.795
HIS      167  0.753
ASP      189  0.669
ASP      327  0.531
HIS      382  0.818
HIS      404  0.769

Prediction scores for all 90 selected residues, sorted by scores

Res SerialNo  Score

HIS       94  0.931
HIS       92  0.886
ASP       96  0.818
HIS      382  0.818
HIS       97  0.795
HIS      404  0.769
HIS      167  0.753
ASP      189  0.669
ASP      327  0.531
GLU      425  0.315
GLU      216  0.253
ASP       35  0.158
ASP      166  0.130
GLU      282  0.129
HIS      193  0.116
GLU      417  0.105
GLU      237  0.103
ASP      349  0.095
ASP      105  0.092
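
For post-processing, the rows of a report such as test0003_final_zn.report
can be filtered by score with a small awk wrapper. This is only a sketch that
assumes the layout of the sample report in this section (a line starting with
"Prediction scores for all", followed by rows of Res, SerialNo and Score).
The function name zn_filter is hypothetical and is not part of PREDZINC.

```shell
#!/bin/bash
# Hypothetical helper, not part of PREDZINC.
# Usage: zn_filter <report-file> [min-score]
# Prints "Res SerialNo Score" rows from the "Prediction scores for all ..."
# table whose score is >= min-score. The default of 0.450 is the cutoff
# printed in the sample report.
zn_filter() {
    local report="$1" cutoff="${2:-0.450}"
    awk -v cutoff="$cutoff" '
        # The full score table starts after this heading line.
        /^Prediction scores for all/ { in_all = 1; next }
        # Data rows have three fields: residue name, integer, decimal score.
        in_all && NF == 3 && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]*\.?[0-9]+$/ {
            if ($3 + 0 >= cutoff + 0) print $1, $2, $3
        }
    ' "$report"
}
```

For example, the following call lists only the residues scoring at least 0.8:
    zn_filter test0003_final_zn.report 0.8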


== Updated 2009-12-27