QSAR/QSPR modelling

Data mining techniques within OpenMolGRID are used to develop predictive models for estimating various chemical properties and biological activities. The developed models are represented in the form of quantitative structure-property/activity relationships (QSPR/QSAR). A QSAR/QSPR model describes the modelled activity or property as a mathematical function of the molecular structure. The molecular structure is characterised in these models by parameters called molecular descriptors. QSAR/QSPR models are particularly suitable for drug design, material design, molecular modelling, and chemical engineering problems.
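
As a minimal sketch of what "a mathematical function of the molecular structure" can look like, the Python example below fits the simplest possible case: a linear model with a single descriptor, using ordinary least squares. The data values and the function names fit_linear and predict are invented for illustration and are not part of OpenMolGRID; real QSAR/QSPR models typically use several descriptors and more elaborate fitting methods.

```python
# Hypothetical toy data: one descriptor value per compound and a
# measured property value. The numbers are invented for illustration.
descriptor = [1.0, 2.0, 3.0, 4.0]
measured = [3.1, 4.9, 7.2, 8.8]


def fit_linear(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Estimate the property for a compound with descriptor value x."""
    intercept, slope = model
    return intercept + slope * x


# Fit the model on the toy training set, then estimate a new compound.
model = fit_linear(descriptor, measured)
estimate = predict(model, 5.0)
```

Here the fitted pair (intercept, slope) plays the role of the QSPR model, and predict applies it to a compound that was not in the training set.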
The model development process on large data sets is complicated because it involves multiple time-consuming data processing steps. For example, a typical model building workflow involves the preparation of a training set, the generation of 3D structures for the compounds in the training set, quantum chemical calculations, the calculation of molecular descriptors, and finally the building of the QSPR/QSAR model. Usually, different software packages are involved in this workflow. A more detailed description of this process is available. The OpenMolGRID system provides a flexible infrastructure for automating this kind of scientific workflow. Complicated workflows can be represented in XML format, easily shared between colleagues, customised, and extended.
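
The sequence of steps described above can be sketched as an ordered list of operations, each consuming the output of the previous one. The Python example below only illustrates that chaining idea: the step bodies are placeholders that record what would happen, and the names make_step, WORKFLOW, and run_workflow are assumptions made for this sketch. It does not reproduce the OpenMolGRID XML workflow format or call any real chemistry software.

```python
def make_step(name):
    """Build a placeholder step that only records that it ran."""
    def step(state):
        # A real step would invoke an external software package here.
        state["log"].append(name)
        return state
    return step


# The steps of a typical model building workflow, in execution order.
WORKFLOW = [
    make_step("prepare training set"),
    make_step("generate 3D structures"),
    make_step("quantum chemical calculations"),
    make_step("calculate molecular descriptors"),
    make_step("build QSPR/QSAR model"),
]


def run_workflow(steps, state):
    """Run each step in turn, passing the shared state along."""
    for step in steps:
        state = step(state)
    return state


# Compound names are illustrative inputs only.
result = run_workflow(WORKFLOW, {"compounds": ["ethanol", "benzene"], "log": []})
```

Because the workflow is plain data (an ordered list), it can be stored, shared, or extended by inserting or replacing steps, which is the property the text attributes to the XML representation.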


Figure: Typical model building workflow

The OpenMolGRID architecture is built on top of the UNICORE infrastructure, which follows a client-server model, and is organised around the concept of tasks, or services. Each task is specialised to carry out one specific operation and has a well-defined interface. A client plugin is provided for each task, and application wrappers are provided for the available software packages that can carry out that task. Details of the architecture are available under Grid Integration.
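
The relationship between a task with a well-defined interface and the application wrappers that implement it can be illustrated with a short Python sketch. Everything in it is hypothetical: the class names, the calculate method, and the two imaginary packages are assumptions for illustration and do not correspond to the actual OpenMolGRID or UNICORE interfaces.

```python
from abc import ABC, abstractmethod


class DescriptorCalculationTask(ABC):
    """Hypothetical task interface: one operation, one fixed signature."""

    @abstractmethod
    def calculate(self, structures):
        """Return a dict mapping each structure name to descriptor values."""


class PackageAWrapper(DescriptorCalculationTask):
    """Wrapper for an imaginary 'package A' (placeholder logic only)."""

    def calculate(self, structures):
        return {name: [float(len(name))] for name in structures}


class PackageBWrapper(DescriptorCalculationTask):
    """Wrapper for an imaginary 'package B' (placeholder logic only)."""

    def calculate(self, structures):
        return {name: [float(name.count("e"))] for name in structures}


def run_task(task, structures):
    """Client code depends on the task interface, not on a package."""
    return task.calculate(structures)


# The same client call works with either wrapper.
wrappers = (PackageAWrapper(), PackageBWrapper())
results = [run_task(w, ["ethanol", "benzene"]) for w in wrappers]
```

The point of the design is that the client side never needs to know which software package performs the task, so packages can be swapped or added without changing the workflow.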

The OpenMolGRID system has Grid adapters for several existing software packages that are required for carrying out tasks in the QSAR/QSPR model development workflows. The details about supported tasks are available under Available Applications.
The managing partner for Data Mining is the University of Tartu.