Simpler Substructure Search: Updated Chemcaster API, Ruby Client

Substructure search is one of the most important capabilities in a chemical registration system. Chemcaster's substructure search system was recently updated to make it much easier to integrate.

Previously, a substructure search Execution listed an array of Links to Structures. But most Chemcaster developers only want to work with Substances, so every Structure link had to be converted into its Substances by a separate method, a costly proposition in terms of both system resources and developer time.

Chemcaster is about eliminating this kind of busy-work, so we needed a better way.

So we replaced the existing approach with one in which an Execution returns a nested array of Links to Substances. Each inner array contains the Links to Substances that share a common Structure. For registries that use only single-Structure Substances, each inner array will only ever contain one element.
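As a rough illustration of the nested result shape described above, the following Python sketch models an Execution's results as a list of lists and flattens it. The field names (`name`, `uri`), the substance names, and the example URLs are assumptions made for illustration only; they are not taken from the Chemcaster API documentation.

```python
from typing import Dict, List

# A Link is modelled here as a plain dict; the keys are hypothetical.
Link = Dict[str, str]


def flatten_substances(groups: List[List[Link]]) -> List[Link]:
    """Collapse the nested result (one inner list per Structure) into a flat list."""
    return [link for group in groups for link in group]


# Hypothetical results: two matching Structures, the first shared by two Substances.
results = [
    [
        {"name": "bipy", "uri": "https://example.com/substances/1"},
        {"name": "bipy hydrochloride", "uri": "https://example.com/substances/2"},
    ],
    [
        {"name": "pyridine", "uri": "https://example.com/substances/3"},
    ],
]

flat = flatten_substances(results)
for link in flat:
    print(link["name"], link["uri"])
```

With single-Structure Substances every inner list has one element, so the flattened list has the same length as the outer list.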

You can read the full details on the Execution API documentation page.

The Ruby client for Chemcaster has also been updated to reflect the recent changes.

About Chemcaster

Chemcaster is the cheminformatics Web services platform optimized for rapid creation of chemistry-focussed websites. It supports compound registration, exact- and substructure search, and dynamic 2D structure image creation, both through a browser-based administrative interface and a RESTful Web API.

We're looking for developers interested in testing Chemcaster with their next project. If the idea of streamlining the creation of rich chemistry websites sounds interesting, please consider requesting an invitation to create an account.

Substructure Search With The Chemcaster Web API

Update August 12, 2009: The REST-RPC hybrid API described in this article has been replaced by a fully-RESTful API.

Substructure search is an essential element of any chemical structure database. Chemcaster, the cheminformatics Web services platform, now supports substructure search through the same Web API as its exact structure search feature.

Creating a Query

To test the new substructure search API, create a file in your working directory called query.xml with the following content (representing a substructure search for pyridine):

<?xml version="1.0" encoding="UTF-8"?>
<query>
  <molfile>[NO NAME]
  CHEMWRIT          2D
Created with ChemWriter - http://metamolecular.com/chemwriter
  6  6  0  0  0  0  0  0  0  0  0 V2000
   -1.2400    1.0200    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    0.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    1.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    2.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.2400    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
  6  1  1  0  0  0  0
M  END</molfile>
  <mode>Substructure</mode>
</query>
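If you would rather generate the query document than write it by hand, a minimal Python sketch might look like the one below. It only reproduces the `query`, `molfile`, and `mode` elements shown above; the function name `build_query` and the placeholder molfile text are assumptions for illustration, not part of Chemcaster.

```python
import xml.etree.ElementTree as ET


def build_query(molfile: str, mode: str = "Substructure") -> bytes:
    """Build a <query> document with <molfile> and <mode> children as UTF-8 bytes."""
    query = ET.Element("query")
    ET.SubElement(query, "molfile").text = molfile
    ET.SubElement(query, "mode").text = mode
    # Prepend the XML declaration by hand so the output matches the example above.
    declaration = b'<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(query, encoding="utf-8")


# Placeholder only: substitute the full pyridine molfile from the example above.
PLACEHOLDER_MOLFILE = "[NO NAME]\n  (molfile body goes here)\nM  END"

payload = build_query(PLACEHOLDER_MOLFILE)
print(payload.decode("utf-8"))
# To reuse the payload with the cURL command in this article, write it to query.xml.
```

ElementTree escapes any `<`, `>`, or `&` characters in the molfile text and preserves its line breaks, which matters because the molfile format is whitespace-sensitive.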

Executing the Query

This query can be run through a number of HTTP clients, including cURL:

$ curl -H 'Accept: application/xml' -H 'Content-Type: application/xml' -F upload=@query.xml -u you@example.com:password https://chemcaster.com/registries/1/queries

POSTing this query returns the following XML document, which gives the results of running the query:

<?xml version="1.0" encoding="UTF-8"?>
<query>
  <id type="integer">8</id>
  <registry-id type="integer">1</registry-id>
  <molfile>[NO NAME]
  CHEMWRIT          2D
Created with ChemWriter - http://metamolecular.com/chemwriter
  6  6  0  0  0  0  0  0  0  0  0 V2000
   -1.2400    1.0200    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    0.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    1.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    2.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.2400    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
  6  1  1  0  0  0  0
M  END</molfile>
  <results type="array">
    <structure>
      <id>4</id>
      <name>bipy</name>
    </structure>
    <structure>
      <id>21</id>
      <name>pyridine</name>
    </structure>
  </results>
</query>
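To pull the hits out of a response like this one, something along the following lines should work. This is a sketch using Python's standard library against an abbreviated copy of the response above (the molfile element is omitted for brevity); the function name `parse_hits` is an assumption for illustration.

```python
import xml.etree.ElementTree as ET


def parse_hits(response_xml: bytes):
    """Return (query_id, registry_id, hits), where hits is a list of (id, name) pairs."""
    root = ET.fromstring(response_xml)
    query_id = int(root.findtext("id"))
    registry_id = int(root.findtext("registry-id"))
    hits = [
        (int(structure.findtext("id")), structure.findtext("name"))
        for structure in root.findall("results/structure")
    ]
    return query_id, registry_id, hits


# Abbreviated copy of the response shown above, without the molfile element.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<query>
  <id type="integer">8</id>
  <registry-id type="integer">1</registry-id>
  <results type="array">
    <structure>
      <id>4</id>
      <name>bipy</name>
    </structure>
    <structure>
      <id>21</id>
      <name>pyridine</name>
    </structure>
  </results>
</query>"""

query_id, registry_id, hits = parse_hits(SAMPLE)
for hit_id, name in hits:
    print(hit_id, name)
```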

Drilling Down

Information about each hit, such as its molfile, can be accessed by using the name returned in the hitset along with the returned registry-id:

$ curl -H 'Accept: application/xml' -u you@example.com:password https://chemcaster.com/registries/1/structures/bipy

An image can be obtained by setting the Accept header to image/png (note the use of the "http" protocol and lack of login):

$ curl -H 'Accept: image/png' http://chemcaster.com/registries/1/structures/bipy
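The two URLs above follow the same pattern, so they can be assembled from a hit's name and the registry-id. The helper below is a small Python sketch of that pattern; the function name `structure_url` is hypothetical, and percent-encoding the name is an assumption about how names containing special characters would need to be handled.

```python
from urllib.parse import quote


def structure_url(registry_id: int, name: str, secure: bool = True) -> str:
    """Build a structure URL following the /registries/<id>/structures/<name> pattern."""
    scheme = "https" if secure else "http"
    # Assumption: names are percent-encoded so they are safe as a single path segment.
    encoded_name = quote(name, safe="")
    return f"{scheme}://chemcaster.com/registries/{registry_id}/structures/{encoded_name}"


detail_url = structure_url(1, "bipy")               # authenticated XML request
image_url = structure_url(1, "bipy", secure=False)  # unauthenticated PNG request
print(detail_url)
print(image_url)
```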

Re-Running the Query

To refer back to this query, we can use the returned id to build a URL and run a request against it:

$ curl -H 'Accept: application/xml' -u you@example.com:password https://chemcaster.com/queries/8

The response contains the same results as before unless the underlying collection has changed, in which case the results reflect the collection's new state.

The same API can be used to perform exact structure queries. Just replace the content of the mode element with "Exact Structure". If no mode is specified, it defaults to "Substructure".

Conclusions

Chemcaster simplifies the creation of Web sites and services that manipulate chemical structures. The new substructure search capability is one example of a fully-integrated set of Web-centric utilities. We're looking for testers who want to incorporate Chemcaster into their next project. If you're interested, please apply online for a free account or contact us directly.