Installation

Back to main page

Installation of OrChem

This page describes how to set up OrChem in an 11g Database. It is assumed that you have some knowledge of Oracle.

Quick check requirements

  • Oracle 11g database
  • A full Oracle client installation (sqlplus, loadjava)
  • A newly created Oracle database schema (call it "ORCHEM" or something) with 'developer' type privileges
  • A compound table that has a single primary key column and an MDL Clob column. In the near future we'll cater for other types of compound data types. Your "compound table" can also be a view - OrChem will only select from it.
  • Granted SELECT access to "ORCHEM" so it can see the compound table/view. Perhaps create a synonym for the compound table.
  • Downloaded OrChem zip file (can be found on the file pages)
  • Tablespace: as an estimate OrChem will be 10-15% in size compared to the compound table

Installation step by step

  • Check your environment

    • On a command line, type:
      loadjava 
      As a result you should see something like:
      loadjava: Usage: loadjava [-definer] [-encoding encoding] [-force] ...

      Next, type on the command line:
      sqlplus -version
      Now you should see something like:
      SQL*Plus: Release 11.1.0.6.0 - Production
      Both sqlplus and loadjava are required. If things don't work, set up a proper Oracle client on your computer. These client downloads can be found on the Oracle website


  • DBA steps

    • Orchem account
      It's best to create a new schema for OrChem (although you could install into an existing schema - not recommended).
      Let your DBA create a user (say "ORCHEM") that has privileges to create tables, views, procedures, Java etc.
    • InChI software and server directory permissions
      OrChem can generate
      InChI identifiers. InChIs are used to for exact searches. If you are interested in this feature, read on. If not, you can omit this step.

      The InChI generator is a ported C-program that works with input and output files when generating the InChI. In order for this to work inside the database, OrChem needs to write temporary files to an existing directory somewhere visible for the database server. These temporary files are deleted after the InChI generation.

      Assuming your OrChem schema is called ORCHEM (first argument) and your choice for an existing directory visible on the server is /var/tmp/ (third argument), then a DBA must grant the following privileges:
      		    
      dbms_java.grant_permission( 'ORCHEM', 'SYS:java.io.FilePermission', '/var/tmp/*', 'read' );
      dbms_java.grant_permission( 'ORCHEM', 'SYS:java.io.FilePermission', '/var/tmp/*', 'write' );
      dbms_java.grant_permission( 'ORCHEM', 'SYS:java.io.FilePermission', '/var/tmp/*', 'delete' );
      			


  • Verify Oracle Java set up

    • Connect to the database with the "ORCHEM" user
    • Test if Java is installed and available to your ORCHEM account. The steps below should be completed succesfully. If not and you get some error along the way, check the Java installation.

      (1) On the database command line, create a little Java class:
                          
      CREATE OR REPLACE AND COMPILE JAVA SOURCE NAMED "HelloThere" AS
      package uk.ac.ebi.orchem;
      public class HelloThere
      {
        public static void hello()
        {
          System.out.println("Whatever..");
        }
      }
      /
      
      (2) On the database command line again, wrap it with PL/SQL:
                          
      CREATE OR REPLACE PROCEDURE hello_there
      AS LANGUAGE JAVA NAME 'uk.ac.ebi.orchem.HelloThere.hello()';
      /
      
      (3) Test it, for example with with SQL*Plus:
                          
      set serverout on
      
      exec dbms_java.set_output(10000);
      
      exec hello_there
      Whatever..
      PL/SQL procedure successfully completed.
      


  • Download OrChem

    The OrChem download page provides access to OrChem releases. Find the latest release (tar.gz file) and save it locally. Unzip it and unpack the tar file somewhere. The unzipped directory contains a lib directory with jar files that need to be loaded into the database, and the sql DDL scripts that create the schema objects in your ORCHEM schema.


  • Loadjava

    • Navigate to the unzipped orchem/lib directory and issue the following "loadjava" commands, with the values for USER, PW and INSTANCE substituted with valid values
      loadjava -verbose -user USER/PW@INSTANCE ambit2.smarts.jar
      loadjava -verbose -user USER/PW@INSTANCE vecmath.jar
      loadjava -verbose -user USER/PW@INSTANCE commons-collections-3.2.1.jar
      loadjava -verbose -user USER/PW@INSTANCE jgrapht-0.6.0.jar
      loadjava -verbose -user USER/PW@INSTANCE log4j-1.2.15.jar
      loadjava -verbose -user USER/PW@INSTANCE jama-1.0.2.jar 
      loadjava -verbose -user USER/PW@INSTANCE cdk.jar
      loadjava -verbose -user USER/PW@INSTANCE jcp.jar
      loadjava -verbose -user USER/PW@INSTANCE inchi103.jar
      loadjava -verbose -user USER/PW@INSTANCE nestedVm.jar
      loadjava -verbose -user USER/PW@INSTANCE orchem.jar
      
      You should see output like in the example below:
                      
      $ loadjava -verbose -user orchem/orchem@marx vecmath.jar
      arguments: '-user' 'orchem/***@marx' '-verbose' 'vecmath.jar'
      
      creating : resource META-INF/MANIFEST.MF
      loading  : resource META-INF/MANIFEST.MF
      creating : resource META-INF/SUN_MICR.SF
      loading  : resource META-INF/SUN_MICR.RSA
      creating : resource javax/COPYRIGHT.txt
      loading  : resource javax/COPYRIGHT.txt
      creating : resource javax/LICENSE-JDL.txt
      creating : resource javax/LICENSE.txt
      loading  : resource javax/LICENSE.txt
      creating : class javax/vecmath/AxisAngle4d
      loading  : class javax/vecmath/AxisAngle4d
      creating : class javax/vecmath/Color3b
      loading  : class javax/vecmath/Color3b
      creating : class javax/vecmath/Color3f
      .....
      
    A note on INVALID CLASSES
    If you would query the database and look for the status of the Java classes just loaded, you'd see many classes with status INVALID. This is expected: not all the classes loaded into the database are necessarily needed by OrChem, and their invalid status has no direct effect. It would be possible to resolve the invalid statuses, but that would require loading in more third-party Java archives (.jar files), which in turn may require further jar files. Instead, the status INVALID is accepted for classes not required by OrChem.


  • JIT

    After 'loadjava', just-in-time compilation (JIT) can be done for better performance. More on that can be found in the Oracle® Database Java Developer's Guide. The chapter on "Oracle Database Java Application Performance" contains useful information regarding Java performance, so you may want to read this to tweak your own Oracle instance.

    Known issue Sun Solaris
    On Sun Solaris 10 (oracle 11.1), a JIT related problem was observed. It led to the InChI generation failing after a number of attempts.

      select orchem_convert.molfiletoinchi(orchem_convert.smilestomolfile('CCO’)) from dual;
      ORA-03113: end-of-file on communication channel

    The problem was gone after disabling Java_jit within Oracle (set JAVA_JIT_ENABLED to FALSE) apparently because of a bug Invoking Portal Procedures within Database JVM (Doc ID 1317956.1). Fixed in oracle 11.2, alternatively you can upgrade to 11.1.0.7 and request a backport for bug 7713193 on top of 11.1.0.7 for a specific platform.


    To use JIT:
    • Navigate into the orchem/sql directory.
    • Connect to the database using SQL*Plus with your ORCHEM account
    • In SQL*Plus, execute the script jit.sql, located in orchem/sql. This compiles Java classes in the ORCHEM schema. Below you can see what it should look like. Note that the compilation can take quite a LONG time (up to an hour or more), so find something else to do. Eventually the script should return and prompt messages like below.
    •                 
      > cd orchem/sql/
      
      > sqlplus orchem/orchem@marx
      
          SQL*Plus: Release 11.1.0.6.0 - Production on Tue May 19 17:48:27 2009
          Copyright (c) 1982, 2007, Oracle.  All rights reserved.
          
          @jit
      
          Total of 2 methods compiled in 9011 ms.
          Compiled javax/vecmath/MismatchedSizeException
          Total of 2 methods compiled in 38 ms.
          Compiled javax/vecmath/SingularMatrixException
          Total of 1 methods compiled in 157 ms.
          Compiled javax/vecmath/VecMathI18N
          Total of 11 methods compiled in 410 ms.
          Compiled javax/vecmath/Point2f
          Total of 40 methods compiled in 743 ms.
          Compiled javax/vecmath/Tuple2f
          Total of 21 methods compiled in 459 ms.
          ............ ... .. .
          ...
      


  • Create OrChem database objects

    • In SQL*Plus, now execute the script setup.sql, located in orchem/sql. This script creates various database objects. The script should prompt messages like below. Again, there should be no errors. (If any error occurs, it's probably because you lack some privilege; examine the error and fix the cause, then re-run the setup.sql script)
                      
      > cd orchem/sql/
      > sqlplus orchem/orchem@marx
      
          SQL*Plus: Release 11.1.0.6.0 - Production on Tue May 19 17:48:27 2009
          Copyright (c) 1982, 2007, Oracle.  All rights reserved.
          
          @setup
          
          PL/SQL procedure successfully completed.
          creating type "orchem_compound"
          Type created.
          creating type "orchem_compound_list"
          Type created.
          creating type "compound_id_table"
          Type created.
          creating sequence "orchem_sequence_log_id" for "orchem_parameters"
          Sequence created.
          .......
      


  • Populate table ORCHEM_PARAMETERS
    • In the previous step you created an empty table ORCHEM_PARAMETERS. You can describe the table in SQL*Plus.
                      
      SQL> desc orchem_parameters
       Name                                                                    Null?    Type
       ----------------------------------------------------------------------- -------- -------------------------------------------------
       COMP_TAB_NAME                                                           NOT NULL VARCHAR2(30)
       COMP_TAB_PK_COL                                                         NOT NULL VARCHAR2(30)
       COMP_TAB_MOLECULE_COL                                                   NOT NULL VARCHAR2(30)
       TMP_DIR_ON_SERVER                                                       NOT NULL VARCHAR2(1000)
      
    • The table crucially needs ONE row. Column COMP_TAB_NAME must contain your compound table name (or combined view of multiple tables), COMP_TAB_PK_COL the name of the primary key column of that table, and COMP_TAB_MOLECULE_COL the name of mol file column of that table.
      If you want to let Orchem generate InChi descriptors as explained earlier, you must also insert the name (including a final slash) of the working directory for which you have been granted privileges.
      Use an insert statement like this one to accomplish this (with proper values substituted), and commit the insert to the database.
      insert into orchem_parameters(comp_tab_name,comp_tab_pk_col,comp_tab_molecule_col,tmp_dir_on_server) values ('x','y','z','dirname');
      
    • Alternatively, for example if your compound data is split over more than one table, you can define a VIEW on your compound table(s) and refer to this view in ORCHEM_PARAMETERS.
    • An example is given below, showing the setup for a the ChEMBL schema. The example shows insertion of values 'COMPOUNDS','MOLREGNO' and 'MOLFILE', describing correctly the ChEMBL objects.
                      
      SQL> desc COMPOUNDS
       Name                            Null?    Type
       ------------------------------- -------- -------------
       MOLREGNO                        NOT NULL NUMBER(38)
       MOLWEIGHT                                NUMBER
       MOLFORMULA                               VARCHAR2(250)
       MOLFILE                                  CLOB
      
      SQL>  insert into orchem_parameters  
                        (comp_tab_name,comp_tab_pk_col,comp_tab_molecule_col,tmp_dir_on_server) 
                        values 
                        ('COMPOUNDS','MOLREGNO','MOLFILE','/var/tmp/');
      
      1 row created.
      
      SQL> commit;
      Commit complete.
      
      SQL> select * from orchem_parameters;
      
      COMP_TAB_NAME    COMP_TAB_PK_COL    COMP_TAB_MOLECULE_COL    TMP_DIR_ON_SERVER
      ---------------- ------------------ ----------------------  -----------------
      COMPOUNDS        MOLREGNO           MOLFILE                 /var/tmp/ 
      
      
    • With ORCHEM_PARAMETERS correctly populated, OrChem will now be able to find your compound data.


  • Verify the OrChem installation

    You are now at a point where you can test OrChem. There is a test class that selects a compound by primary key, creates a CDK molecule, makes a fingerprint and outputs the molecule as a SMILES string (and InChi if configured). The test class depends on ORCHEM_PARAMETERS being properly populated, and of course all previous steps having completed succesfully.

    • Connect to the database using SQL*Plus with your ORCHEM account
    • Pick a valid primary key value from your compound table, say for example 12345
    • In SQL*Plus, execute the function as below to verify OrChem works. Note that it make take a while to run the procedure for the first time, but following runs are done much quicker.
      Replace 12345 with a valid primary key value from your database.
    •                 
      SQL> SET LONG 100000
      SQL> select orchem_utils.verify_orchem(12345) from dual;
        
      
    • See below an example what the output should look like. No errors should be thrown.
                      
      SQL> SQL> select orchem_utils.verify_orchem(1100) from dual;
      
      ORCHEM_UTILS.VERIFY_ORCHEM(1100)
      --------------------------------------------------------------------------------
      
      OKAY: compound found with primary key 1100
      OKAY: CDK molecule created
      OKAY: Fingerprint made
      OKAY: Smiles generated O=C(C=CC(=O)N3CCN(C=1N=C(N)C2=CC(OC)=C(OC)C=C2(N=1))CC3)C
      =4C=CC=CC=4
      OKAY: InChi created is InChI=1S/C24H25N5O4/c1-32-20-14-17-18(15-21(20)33-2)26-24
      (27-23(17)25)29-12-10-28(11-13-29)22(31)9-8-19(30)16-6-4-3-5-7-16/h3-9,14-15H,10
      -13H2,1-2H3,(H2,25,26,27)/b9-8+
      
      PL/SQL procedure successfully completed.
      
      
  • When the test above works correctly, you're ready to fingerprint (index) your compound data so it can be searched by substructure or similarity.
    To do so, follow the link on the main page.




Back to main page