Searching the IBM Bio-Dictionary-based Annotations


Overview
In our annotations, the following two types of results are reported:

  • local and global similarities shared between the query sequence and
        known protein families (or individual sequences if family membership
        is not reported/known). The captions of these similarities are derived
        from the DESCRIPTION ("DE") line of the SwissProt/TrEMBL
        database and are in plain English.
  • nature, location and extent of features that can be identified in
        the query sequence. The captions of these reported features are derived
        from the FEATURE TABLE ("FT") line of the SwissProt/TrEMBL
        database and consist of a short text in plain English and one of the
        following keywords (note that we make use of only a subset of the
        valid "FT" keywords):
        act_site    binding    carbohyd    ca_bind    disulfid       
        dna_bind    domain     helix       init_met   lipid          
        metal       mod_res    np_bind     se_cys     signal 
        similar     site       strand      thioeth    thiolest 
        transit     transmem   turn        zn_fing
    

    Example similarity strings
    First, select "Discovered Similarity Results".
    Then type the search string in lower case.
    Examples include (notice the '.' before the '*'; the pair '.*'
    matches any run of characters):

      -.].*exon1-2, complete (meaning: find and report all genes 
    			  whose discovered similarities
    			  contain 'exon1-2, complete'
    			  and appear in the top 9 ranked
    			  positions)
      exon1-2, complete      (meaning: find and report all genes
    			  whose discovered similarities
    			  contain 'exon1-2, complete'
    			  and appear in any position of the
    			  ranked output)
      -[1-5]].*elong.*factor (meaning: find and report all genes
    			  whose discovered similarities
    			  contain 'elong' followed by 'factor'
    			  and appear in positions 1 through
    			  5, i.e. the top 5 ranked positions)
      -[1-4]]		   (meaning: show the discovered similarities
    			  occupying the top 4 positions for 
    			  each of the genes of the organism -- this
    			  essentially allows you to list the 
    			  complete Bio-Dictionary annotation 
    			  for this organism)
      cytochr.*oxidase
      mhc.*class.*ii
      2-hydroxypent-2,4-dienoate
      acetyltransferase
      major surface glycoprotein
      hypothetical.*21.8.*kda.*protein
      cytochrome.*oxidase.*subunit.*iii
    
    etc.
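
    The similarity strings above read like regular expressions. The
    sketch below tries three of them on a few invented lines. It assumes
    that each line of ranked output carries a position tag of the form
    '-N]' ahead of the caption; this page does not show the real output
    layout, so the sample lines and gene names are hypothetical. Python's
    re module stands in for whatever matching engine the search page
    actually uses.

```python
import re

# Hypothetical ranked-output lines, assumed layout: "<gene>-<rank>] <caption>".
# These are invented for illustration and are not real annotation output.
SAMPLE_LINES = [
    "gene0001-1] elongation factor tu",
    "gene0001-2] translation elongation factor g",
    "gene0002-7] cytochrome c oxidase subunit iii",
    "gene0003-12] elongation factor ts",
]

def search(pattern, lines):
    """Return the lines in which the pattern matches anywhere."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]

# Positions 1 through 5 only: '-', one digit in 1-5, then a literal ']'.
top5 = search(r"-[1-5]].*elong.*factor", SAMPLE_LINES)

# No position tag in the pattern: any rank qualifies.
anywhere = search(r"elong.*factor", SAMPLE_LINES)

# '-.]' allows exactly one character between '-' and ']',
# so only single-digit positions can match.
top9 = search(r"-.].*oxidase", SAMPLE_LINES)

for name, hits in (("top5", top5), ("anywhere", anywhere), ("top9", top9)):
    print(name, hits)
```

    Under these assumptions the rank-12 line is returned only by the
    pattern that has no position tag, because neither '-[1-5]]' nor
    '-.]' can match a two-digit position.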


    Example feature strings
    First, select "Discovered Feature Results".
    Then type the search string in lower case.
    Examples include (notice the '.' before the '*'; the pair '.*'
    matches any run of characters):

      -.].*dna.*bind  (meaning: find and report all genes 
    		  whose discovered features 
    		  contain 'dna' and 'bind' 
    		  in this order and appear
    		  in the top 9 ranked positions)
      dna.*bind      (meaning: find and report all genes
    		  whose discovered features
    		  contain 'dna' and 'bind'
    		  in this order and appear
    		  in any position of the ranked output)
      -[1-7]].*h-t-h  (meaning: find and report all genes 
    		  whose discovered features 
    		  contain 'h-t-h' appearing
    		  in the top 7 ranked positions)
      -[1-9]]        (meaning: list the top 9 ranked
    		  features discovered by the Bio-Dictionary
    		  and do this for each of the
    		  genes of this organism)
      -1[1-9]]        (meaning: list the features discovered
    		  by the Bio-Dictionary that occupy
    		  positions 11 through 19 in the ranked
    		  output and do so for each of the
    		  genes of this organism)
      bind.*motif
      transmem
      h-t-h
      ankyrin
      bind.*by.*similarity
      carbohyd.*n-linked
      carbohyd.*o-linked
      act_site.*donor
      mod_res.*phosphorylation
      mod_res.*phosphor
      act_site.*acceptor
    
    etc.
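
    Two points from the feature strings above are easy to get wrong: the
    '.' before the '*', and position ranges beyond 9. The sketch below
    checks both on invented lines. As before, the '-N]' position tag
    layout, the gene names and the captions are assumptions made for
    illustration, and Python's re module is only a stand-in for the
    search page's own matcher.

```python
import re

# Hypothetical feature lines, assumed layout: "<gene>-<rank>] <keyword> <text>".
# These are invented for illustration and are not real annotation output.
FEATURES = [
    "gene0007-3] dna_bind h-t-h motif (by similarity)",
    "gene0007-11] transmem potential",
    "gene0009-15] carbohyd n-linked (potential)",
    "gene0009-20] mod_res phosphorylation",
]

def matches(pattern, line):
    """True if the pattern matches anywhere in the line."""
    return re.search(pattern, line) is not None

# 'dna.*bind': 'dna', then any run of characters, then 'bind'.
with_dot = [f for f in FEATURES if matches(r"dna.*bind", f)]

# 'dna*bind': 'dn', then zero or more 'a', then 'bind' immediately after.
# The '_' in 'dna_bind' blocks the match.
without_dot = [f for f in FEATURES if matches(r"dna*bind", f)]

# '-1[1-9]]': '-', the digit 1, one digit in 1-9, then a literal ']',
# which covers positions 11 through 19.
ranks_11_to_19 = [f for f in FEATURES if matches(r"-1[1-9]]", f)]

print(with_dot)
print(without_dot)
print(ranks_11_to_19)
```

    With these sample lines, dropping the '.' makes the DNA-binding
    search come back empty, which is why the examples always write '.*'.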






    Last modification to this page: Tuesday, August 28 2001