Substructure Search With The Chemcaster Web API

Update August 12, 2009: The REST-RPC hybrid API described in this article has been replaced by a fully RESTful API.

Substructure search is an essential element of any chemical structure database. Chemcaster, the cheminformatics Web services platform, now supports substructure search through the same Web API as its exact structure search feature.

Creating a Query

To test the new substructure search API, create a file in your working directory called query.xml with the following content (representing a substructure search for pyridine):

<?xml version="1.0" encoding="UTF-8"?>
<query>
  <molfile>[NO NAME]
  CHEMWRIT          2D
Created with ChemWriter - http://metamolecular.com/chemwriter
  6  6  0  0  0  0  0  0  0  0  0 V2000
   -1.2400    1.0200    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    0.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    1.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    2.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.2400    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
  6  1  1  0  0  0  0
M  END</molfile>
  <mode>Substructure</mode>
</query>
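
The same document can also be generated programmatically rather than typed by hand. The following is a minimal Python sketch, assuming only the element names shown above (query, molfile, mode); the helper name build_query and the constant PYRIDINE_MOLFILE are hypothetical and not part of Chemcaster.

```python
import xml.etree.ElementTree as ET

# Molfile for pyridine, copied from the query document above.
PYRIDINE_MOLFILE = """[NO NAME]
  CHEMWRIT          2D
Created with ChemWriter - http://metamolecular.com/chemwriter
  6  6  0  0  0  0  0  0  0  0  0 V2000
   -1.2400    1.0200    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    0.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    1.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    2.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.2400    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
  6  1  1  0  0  0  0
M  END"""


def build_query(molfile, mode="Substructure"):
    """Return a query document as UTF-8 bytes (hypothetical helper)."""
    query = ET.Element("query")
    # ElementTree escapes &, < and > but leaves the molfile's spacing intact.
    ET.SubElement(query, "molfile").text = molfile
    ET.SubElement(query, "mode").text = mode
    return ET.tostring(query, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    # Write the document to query.xml in the working directory.
    with open("query.xml", "wb") as handle:
        handle.write(build_query(PYRIDINE_MOLFILE))
```

Running the script writes query.xml to the working directory. Molfiles are whitespace-sensitive, so the sketch passes the text through unchanged instead of reformatting it.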

Executing the Query

This query can be run through a number of HTTP clients, including cURL:

$ curl -H 'Accept: application/xml' -H 'Content-Type: application/xml' -F upload=@query.xml -u you@example.com:password https://chemcaster.com/registries/1/queries

POSTing this query returns the following XML document, which contains the search results:

<?xml version="1.0" encoding="UTF-8"?>
<query>
  <id type="integer">8</id>
  <registry-id type="integer">1</registry-id>
  <molfile>[NO NAME]
  CHEMWRIT          2D
Created with ChemWriter - http://metamolecular.com/chemwriter
  6  6  0  0  0  0  0  0  0  0  0 V2000
   -1.2400    1.0200    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    0.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    1.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    2.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.2400    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
  6  1  1  0  0  0  0
M  END</molfile>
  <results type="array">
    <structure>
      <id>4</id>
      <name>bipy</name>
    </structure>
    <structure>
      <id>21</id>
      <name>pyridine</name>
    </structure>
  </results>
</query>
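
A client will usually want the hits as data instead of raw XML. Here is a minimal Python sketch of one way to extract them, assuming the response layout shown above; the function name parse_hits is hypothetical, and the molfile element is left out of the sample to keep it short.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the response above (molfile element omitted for brevity).
RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<query>
  <id type="integer">8</id>
  <registry-id type="integer">1</registry-id>
  <results type="array">
    <structure>
      <id>4</id>
      <name>bipy</name>
    </structure>
    <structure>
      <id>21</id>
      <name>pyridine</name>
    </structure>
  </results>
</query>"""


def parse_hits(xml_bytes):
    """Return (registry_id, [(structure_id, name), ...]) from a query response."""
    root = ET.fromstring(xml_bytes)
    registry_id = int(root.findtext("registry-id"))
    hits = [
        (int(structure.findtext("id")), structure.findtext("name"))
        for structure in root.findall("results/structure")
    ]
    return registry_id, hits


registry_id, hits = parse_hits(RESPONSE)
for structure_id, name in hits:
    print(registry_id, structure_id, name)
```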

Drilling Down

Information about each hit, such as its molfile, can be retrieved by combining the name returned in the results with the returned registry-id:

$ curl -H 'Accept: application/xml' -u you@example.com:password https://chemcaster.com/registries/1/structures/bipy

An image can be obtained by setting the Accept header to image/png (note the use of the "http" protocol and lack of login):

$ curl -H 'Accept: image/png' http://chemcaster.com/registries/1/structures/bipy
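
Both drill-down requests follow the same URL pattern and differ only in the Accept header and the protocol. Here is a minimal Python sketch of that pattern, which builds the requests without sending them. The helper name structure_request is hypothetical, percent-encoding the name is an assumption (the article only shows plain names such as "bipy"), and authentication is left out.

```python
from urllib.parse import quote
from urllib.request import Request


def structure_request(registry_id, name, accept, scheme="https"):
    """Build (but do not send) a GET request for one structure."""
    # Assumption: names with special characters would need percent-encoding.
    path = f"/registries/{registry_id}/structures/{quote(name, safe='')}"
    return Request(f"{scheme}://chemcaster.com{path}", headers={"Accept": accept})


# XML details over https (credentials would still have to be supplied).
details = structure_request(1, "bipy", "application/xml")
# PNG image over plain http, as in the curl example above.
image = structure_request(1, "bipy", "image/png", scheme="http")

print(details.full_url)
print(image.full_url, image.get_header("Accept"))
```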

Re-Running the Query

To refer back to this query, we can use the returned id to build a URL and run a request against it:

$ curl -H 'Accept: application/xml' -u you@example.com:password https://chemcaster.com/queries/8

The response returns the same results as before unless the underlying collection has changed, in which case the results reflect the new state of the collection.

The same API can be used to perform exact structure queries. Just replace the content of the mode element with "Exact Structure". If not specified, the mode defaults to "Substructure".

Conclusions

Chemcaster simplifies the creation of Web sites and services that manipulate chemical structures. The new substructure search capability is one example of a fully-integrated set of Web-centric utilities. We're looking for testers who want to incorporate Chemcaster into their next project. If you're interested, please apply online for a free account or contact us directly.

Exact Structure Search with Chemcaster

Chemcaster now supports exact structure queries, both through the graphical administrative interface and through the Web API.

The new capability can be tested using the curl HTTP client. As an example, prepare a file called query.xml in your working directory with the following content, which represents an exact structure search for benzene:

<?xml version="1.0" encoding="UTF-8"?>
<query>
  <molfile>[NO NAME]
  CHEMWRIT          2D
Created with ChemWriter - http://metamolecular.com/chemwriter
  6  6  0  0  0  0  0  0  0  0  0 V2000
   -1.2400    1.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    0.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    1.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    2.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.2400    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
  6  1  1  0  0  0  0
M  END</molfile>
</query>

We can now issue a POST request, assuming a registry id of 1, to create a new query:

$ curl -H 'Accept: application/xml' -H 'Content-Type: application/xml' -F upload=@query.xml -u you@example.com:password https://chemcaster.com/registries/1/queries

The server responds with the following XML and an HTTP status code of 201 (Created):

<?xml version="1.0" encoding="UTF-8"?>
<query>
  <id type="integer">2</id>
  <molfile>[NO NAME]
  CHEMWRIT          2D
Created with ChemWriter - http://metamolecular.com/chemwriter
  6  6  0  0  0  0  0  0  0  0  0 V2000
   -1.2400    1.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    0.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    1.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    2.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.2400    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
  6  1  1  0  0  0  0
M  END</molfile>
</query>

We can read the query id (2) from the response and then retrieve our search results with:

$ curl -H 'Accept: application/xml' -u you@example.com:password https://chemcaster.com/queries/2
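
Reading the id and building the follow-up URL can also be scripted. Here is a minimal Python sketch, assuming the 201 response body has the layout shown above; the function name query_url is hypothetical, and the sample body omits the molfile element for brevity.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the 201 response body (molfile element omitted).
CREATED = b"""<?xml version="1.0" encoding="UTF-8"?>
<query>
  <id type="integer">2</id>
</query>"""


def query_url(created_xml, base="https://chemcaster.com"):
    """Return the URL of the query described by a 201 response body."""
    query_id = int(ET.fromstring(created_xml).findtext("id"))
    return f"{base}/queries/{query_id}"


print(query_url(CREATED))
```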

The response contains the original query in molfile format as well as the name and id of the matching structure:

<query>
  <id type="integer">2</id>
  <registry-id type="integer">1</registry-id>
  <molfile>[NO NAME]
  CHEMWRIT          2D
Created with ChemWriter - http://metamolecular.com/chemwriter
  6  6  0  0  0  0  0  0  0  0  0 V2000
   -1.2400    1.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    0.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    1.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.4921    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3740    2.5200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.2400    2.0200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0  0  0  0
  2  3  1  0  0  0  0
  3  4  2  0  0  0  0
  4  5  1  0  0  0  0
  5  6  2  0  0  0  0
  6  1  1  0  0  0  0
M  END</molfile>
  <results type="array">
    <structure>
      <id>16</id>
      <name>benzene</name>
    </structure>
  </results>
</query>

Chemcaster is the Web service for sites with rich chemical content. Request your testing invitation today and see how easy it can be to create your next chemistry Web application.