Practical virtual screening using the Instant JChem platform

Silicon and Iron have been “co-operating” for a very, very long time.

The problem
Virtual screening (VS) is not a definitive answer to finding the needle in the haystack, but it can provide a sensible set of starting points relative to random selection. The available screening techniques are many and varied in design and approach: they use different molecular representations, alignments (or none) and scoring functions and, where available, information obtained from the target protein. The merits of 2D versus 3D screening are much debated in the literature, which comes down marginally in favour of 2D methods as the more effective at isolating true positives (reference 1). This is interesting, of course, since 2D representations do not inherently consider chirality.
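To make the idea of a 2D similarity screen concrete, here is a minimal sketch in plain Python. It assumes each molecule has already been reduced to a set of "on" fingerprint bits and scores candidates with the Tanimoto coefficient, which is one common choice and not one this post prescribes. The function names and the toy bit sets are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    # Two empty fingerprints share nothing; treat their similarity as 0.
    return len(a & b) / union if union else 0.0


def rank_by_similarity(query_bits, library):
    """Return (cd_id, score) pairs sorted from most to least similar.

    `library` maps a molecule identifier (for example a CD_ID) to its bit set.
    Ties are broken by identifier so the ordering is reproducible.
    """
    scored = [(cd_id, tanimoto(query_bits, bits)) for cd_id, bits in library.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Toy data: hypothetical bit positions, not real fingerprints.
    query = {1, 4, 7, 9}
    library = {101: {1, 4, 7, 9}, 102: {1, 4, 8}, 103: {2, 3}}
    for cd_id, score in rank_by_similarity(query, library):
        print(cd_id, round(score, 2))
```

Note that nothing in this scoring distinguishes stereoisomers unless the fingerprint itself encodes chirality, which is the caveat raised above.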
For 3D approaches, one must generate one or n conformations as input to the VS protocol. The nature of the conformation set is debated to some extent, since it is one form of “flexible search”, but “extrema representatives” is not a bad approach for ensuring some diversity within an ensemble (http://svl.chemcomp.com/filedetails.php?lid=813&cid=40).
Many command line tools are available: those from ChemAxon (Screen3D/DISCO) or other commercial vendors (Schrödinger, OpenEye, Cresset…) are astute choices. But if you don’t accept the Doctor’s prescription, you might already have written your own approach in Java or C++ (or some scripting language) and be wondering how to integrate it into Instant JChem (IJC), so as to make use of the data management capabilities that will assist you before and after the VS events…
A proposed solution
IJC is exceedingly helpful for the administrator and allows 2D databases to be created rapidly and easily. From this point, calculator plugins can be called and used in your VS if you prefer property-based similarity. Beyond this, you can access internal functionality and numerical data items exposed through Chemical Terms or the JChem API, or derived in your own indexes.
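As an illustration of what property-based similarity can mean in practice, the sketch below ranks molecules by Euclidean distance to a query in an autoscaled property space. It is plain Python and does not call any IJC or JChem API. The property columns (for example logP and molecular weight exported from calculated fields), the scaling choice and all names are assumptions made for the example.

```python
from math import sqrt
from statistics import mean, pstdev


def autoscale(rows):
    """Scale every property column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero spread; divide by 1.0 instead to avoid errors.
    spreads = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, spreads)] for row in rows]


def rank_by_property_distance(query, library):
    """Return (cd_id, distance) pairs, nearest to the query first.

    `query` is a tuple of property values; `library` maps an identifier
    (for example a CD_ID) to a tuple with the same properties in the same order.
    """
    ids = list(library)
    # The query is scaled together with the library so both share one scale.
    scaled = autoscale([query] + [library[i] for i in ids])
    q, rest = scaled[0], scaled[1:]
    distances = {
        i: sqrt(sum((a - b) ** 2 for a, b in zip(q, row))) for i, row in zip(ids, rest)
    }
    return sorted(distances.items(), key=lambda pair: (pair[1], pair[0]))


if __name__ == "__main__":
    # Hypothetical (logP, molecular weight) values.
    query = (2.0, 300.0)
    library = {1: (2.0, 300.0), 2: (5.0, 500.0), 3: (2.5, 320.0)}
    for cd_id, distance in rank_by_property_distance(query, library):
        print(cd_id, round(distance, 3))
```

Scaling matters here because raw molecular weight would otherwise dominate logP simply through its larger numerical range.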
An entry point, or interface, between IJC and VS is the built-in Groovy language, and we have provided two scripts in the script repository: http://www.chemaxon.com/instantjchem/ijc_latest/docs/developer/scripts/CallExternalTool.html
These can help you get going by deploying the VS tool of your choice, closely integrated with IJC structure-based entities for ease of management and analysis, before and after… What do we have to do, we hear you cry?
I. Load the structural data set (SDF) you want to screen into IJC via standard import; this will generate a CD_ID primary key handle per molecule. You might use your own in-house data or some benchmark sets (for example DUD or WOMBAT).
II. Obtain the script (amending it is likely to be required) and plug in your choice of VS tool, i.e. change the command line call in example 1 (don’t forget your query, which you will need!) [http://www.sciencedaily.com/releases/2001/11/011119072232.htm]
III. For all those who prefer their own bespoke approaches, it’s still OK: just write your Groovy code directly at the data tree level and access the API (and import any other Java you like; for example, the JAMA libraries are highly useful for more complex matrix operations: http://math.nist.gov/javanumerics/jama/).
For example, if yours is a 3D approach, you can extend script example 2 with your choice of 3D coordinate generation and force field.
IV. The results will be built back into the entity by this scripting, ready for your further analysis of the rank orders derived.
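The pattern behind steps II and IV (hand structures to an external command, read scores back keyed by CD_ID, then rank) can be sketched outside IJC as follows. This is not the repository's Groovy script: it is a plain Python illustration, and the tool command, the "CD_ID score" output format and every function name are assumptions. A one-line Python process stands in for the real VS tool.

```python
import subprocess
import sys


def run_external_tool(command, sdf_text):
    """Pipe SDF text to an external command and return what it prints."""
    done = subprocess.run(
        command, input=sdf_text, capture_output=True, text=True, check=True
    )
    return done.stdout


def parse_scores(output):
    """Parse lines of the assumed form 'CD_ID score' into {cd_id: score}."""
    scores = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            scores[int(parts[0])] = float(parts[1])
    return scores


def rank_hits(scores, higher_is_better=True):
    """Return (rank, cd_id, score) tuples with rank 1 as the best hit."""
    ordered = sorted(scores.items(), key=lambda pair: pair[1], reverse=higher_is_better)
    return [(rank, cd_id, score) for rank, (cd_id, score) in enumerate(ordered, start=1)]


if __name__ == "__main__":
    # Stand-in for a real VS tool: reads stdin, prints made-up scores.
    fake_tool = [
        sys.executable,
        "-c",
        "import sys; sys.stdin.read(); print('101 0.42'); print('102 0.91')",
    ]
    output = run_external_tool(fake_tool, "placeholder SDF text")
    for rank, cd_id, score in rank_hits(parse_scores(output)):
        print(rank, cd_id, score)
```

Keeping the CD_ID alongside every score is what allows the ranks to be written back to the right rows of the entity afterwards. Set higher_is_better to False for tools that report distances or energies.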
We would be interested to hear about your integration experiences…
Can you successfully integrate your choice of tool(s) into IJC with the above general approach?
Which approaches do you find most effective at predicting viable hit options: 2D or 3D methods? Which 2D or 3D approach/paradigm do you rate the highest or find the most reliable? Perhaps your workflow uses both 2D and 3D concepts, in which case we would be very interested to hear about those observations! We think reference 2 is a good option to pursue.
References
  1. Brown, R. D., Martin, Y. C. (1996) “Use of structure-activity data to compare structure-based clustering methods and descriptors for use in compound selection”. Journal of Chemical Information and Computer Sciences 36:572-584. DOI: 10.1021/ci9501047.
  2. Bonachera, F., Parent, B., Barbosa, F., Froloff, N., Horvath, D. (2006) “Fuzzy tricentric pharmacophore fingerprints. 1. Topological fuzzy pharmacophore triplets and adapted molecular similarity scoring schemes”. Journal of Chemical Information and Modeling 46:2457-2477. DOI: 10.1021/ci6002416.