Googling for Substances with Chempedia Global Substance Identifiers (GSIDs)

You may have noticed the sequence of numbers assigned to each Chempedia substance record. These numbers are called "Global Substance Identifiers", or GSIDs for short.

Although I'll have more to say about GSIDs in future installments, one of the things they're useful for is text searching through services like Google. By creating a numerical identifier with a unique format, Chempedia improves the chances that you'll be able to use tools like Google to find information about the substances you're looking for.

I'm now happy to report that Google has started indexing Chempedia GSIDs.

For example, Hirose Yoichiro, an early Chempedia user, has registered several substances, including the one given GSID 1-5039-8389-4491, or allyl mercaptan.

Now this Google search takes us to the Chempedia entry Hirose created. This isn't quite as useful as it could be just yet: Google's search results currently point to pages that link to the Substance Summary rather than to the summary itself. You'll also notice that Google's coverage of Chempedia GSIDs is incomplete, but this will change over time as Google's bots continue to explore the site.
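To make the search concrete, here is a minimal Python sketch that builds an exact-phrase Google query for a GSID. The function name gsid_search_url is hypothetical, and wrapping the GSID in double quotes is my assumption about how best to keep the hyphenated groups together as one phrase; nothing here is part of Chempedia itself.

```python
from urllib.parse import urlencode


def gsid_search_url(gsid: str) -> str:
    """Return a Google search URL that looks for a GSID as an exact phrase.

    Hypothetical helper for illustration only; not a Chempedia API.
    """
    # Double quotes request an exact-phrase match, so the GSID is
    # searched as a single token rather than as four separate numbers.
    query = f'"{gsid}"'
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # The allyl mercaptan GSID mentioned above.
    print(gsid_search_url("1-5039-8389-4491"))
```

Pasting the printed URL into a browser should run the same kind of search described above.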

Although this feature is useful for Chempedia itself, it offers even more exciting possibilities for those using Chempedia GSIDs in their online documents. As others start to do so, Google's search results will return those sites as well.
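If you want to pick GSIDs out of your own documents, a short regular expression is probably enough. The sketch below assumes that every GSID follows the same shape as the one example above (one digit followed by three hyphen-separated groups of four digits); that pattern is inferred from a single identifier and may not cover all GSIDs. The names GSID_PATTERN and find_gsids are hypothetical.

```python
import re

# Assumed GSID shape, inferred from the single example 1-5039-8389-4491:
# one digit, then three groups of four digits, separated by hyphens.
# The lookarounds avoid matching inside longer digit-and-hyphen runs.
GSID_PATTERN = re.compile(r"(?<![\d-])\d-\d{4}-\d{4}-\d{4}(?![\d-])")


def find_gsids(text: str) -> list[str]:
    """Return all substrings of text that look like GSIDs, in order."""
    return GSID_PATTERN.findall(text)


if __name__ == "__main__":
    sample = "Allyl mercaptan (GSID 1-5039-8389-4491) has a strong odor."
    print(find_gsids(sample))
```

Because the format is distinctive, a pattern like this is unlikely to collide with phone numbers or dates, which is the same property that makes GSIDs easy for a search engine to match.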

So, not only will it be possible to find all Web documents referring to a particular Chempedia substance, but the structure and all synonyms (with peer review scores) can be quickly determined simply by visiting the original Chempedia Substance Summary.

One of Chempedia's main goals is to create a hub around which chemical substance information can be organized on the Web. Google's coverage of Chempedia GSIDs is one step in that direction.

Comments

  1. Eric Milgram
    November 07, 2009 @ 3:49 PM

    Hi Rich,

    I haven't been following Chempedia closely, but I just took a quick look. The structures render nicely and the response times are very good. I'm curious how GSIDs relate to other chemical identifiers, such as InChI, CAS #, or ChemSpider IDs.

  2. Rich Apodaca
    November 07, 2009 @ 7:58 PM

    @Eric, thanks for the feedback. Getting structure rendering right and minimizing response times have been significant technical goals for Chempedia. It's come a long way, but there's always room for improvement. If you notice something odd, don't hesitate to let me know.

    GSIDs are independent of all other chemical identifiers. If the InChI/InChIKey code changes, no GSIDs should need to be re-assigned. Ditto with other IDs.

    But, as you probably saw, a major use of the Chempedia registry is as a dictionary through which relationships between external identifiers and Chempedia GSIDs can be found.

    Chempedia currently doesn't accept namings that are either InChIKeys or InChIs because both can be generated from existing Chempedia information. The idea is for Chempedia GSID synonyms to be used only for identifiers that can't be machine-generated.
