AppleScript Tutorial 8: Rendering chemical structures embedded in graphics files
Author: drc
Web Site: http://www.macinchem.org
Rich Apodaca has been discussing embedding molecular information in images of molecules, such as a PNG file depicting a 2D structure. As we move to a more web-centric view of the world, it is apparent that much research information will be available only via the web. Whilst images of chemical structures are usually adequate for a human viewer, the chemical structure cannot be indexed and subsequently searched. In a subsequent article Rich showed a method of extracting the information as text. In this tutorial I'm going to show how to use AppleScript to extract the information from the PNG file and then display the structure, in an editable form, in a couple of chemical display packages.
This script requires three things: ChemBioDraw (aka ChemDraw), MacPyMOL (http://pymol.sourceforge.net/), and the excellent ExifTool by Phil Harvey (http://www.sno.phy.queensu.ca/~phil/exiftool/). ExifTool is a platform-independent Perl library plus a command-line application for reading, writing and editing meta information in image, audio and video files. You will also need a couple of image files.
The first file is Lipitor, generated by Geoff Hutchison; it contains the chemical information embedded in both SMILES and molfile formats.

The second is rosiglitazone from Rich Apodaca which has the chemical information embedded in molfile format only. You can drag these images to your desktop to work with.

The first part of the script simply asks the user to choose the image file, and then creates the POSIX path to the file, since this is what ExifTool needs.
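As a minimal sketch of this step, the fragment below chooses a file, converts it to a POSIX path and hands it to ExifTool. Wrapping the path with quoted form is an addition here, so that paths containing spaces survive the shell. It also assumes that exiftool can be found on the limited PATH that do shell script uses; if it cannot, the full path to the exiftool binary would be needed instead.

```applescript
-- Ask for the image file; choose file returns an alias
set theFile to (choose file with prompt "Choose an image file")

-- ExifTool is a command-line tool, so it needs a POSIX (slash-separated) path
set posix_path to POSIX path of theFile

-- quoted form wraps the path in single quotes so spaces are not split by the shell
set theScript to "exiftool -SMILES -b " & quoted form of posix_path

-- Assumption: exiftool is on the PATH seen by do shell script
set theMetadata to (do shell script theScript)
```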
The next part creates a three-button dialog box allowing the user to choose the application in which to view the resulting structure.
The main part of the script then uses ExifTool to extract the metadata. In the case of ChemDraw we can generate the structure from either the SMILES string or the molfile data. As written, the script first extracts the SMILES and then checks whether a string was returned; if there is no SMILES, it gets the molfile information instead. When using the molfile we need to save the data to a file called temp.mol using the write_to_file routine (the file is actually saved in the temporary items folder), and this file is then opened in ChemBioDraw. If the SMILES data is present, we can simply use menu item scripting within ChemBioDraw to create the structure using the "Paste Special" option "SMILES". The metadata sometimes contains tabs, so the replace_chars find-and-replace routine is used to remove them.
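The tab-stripping step can be tried on its own, without ExifTool or ChemBioDraw. The sketch below uses the same text item delimiters technique as the replace_chars routine in the script; the SMILES string is a made-up example, and saving and restoring the previous delimiters is an addition rather than part of the original routine.

```applescript
-- Hypothetical SMILES string with a stray tab in the middle
set rawSmiles to "CC(C)" & tab & "c1ccccc1"
set cleanSmiles to replace_chars(rawSmiles, tab, "")
-- cleanSmiles is now "CC(C)c1ccccc1"

on replace_chars(this_text, search_string, replacement_string)
	-- Remember the current delimiters so they can be restored afterwards
	set oldDelims to AppleScript's text item delimiters
	-- Split the text wherever the search string occurs
	set AppleScript's text item delimiters to the search_string
	set the item_list to every text item of this_text
	-- Join the pieces back together with the replacement string
	set AppleScript's text item delimiters to the replacement_string
	set this_text to the item_list as string
	set AppleScript's text item delimiters to oldDelims
	return this_text
end replace_chars
```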
As an alternative, MacPyMOL can be used to display the structures. Since MacPyMOL cannot convert SMILES to structures (this will be possible using the next version of OpenBabel, which will be released in early 2008), we can only use the molfile information.
If we now run the script, choosing the rosiglitazone.png file and then selecting ChemBioDraw for display, you should get this result.
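Because MacPyMOL depends entirely on the molfile data, it can be worth checking that some was actually found before trying to open it. The following is only a sketch of that idea: the file path is a placeholder, the empty-string check and the warning dialog are additions not present in the script, and the open command is the same one the script sends to MacPyMOL.

```applescript
-- Hypothetical path; substitute the image you want to read
set posix_path to "/Users/username/Desktop/lipitor1.png"
set theMolfile to (do shell script "exiftool -molfile -b " & quoted form of posix_path) as text

if theMolfile is "" then
	-- No molfile tag in the image, so there is nothing MacPyMOL can display
	display dialog "No molfile data was found in this image." buttons {"OK"} default button "OK"
else
	-- Save the molfile into the temporary items folder as temp.mol
	set target_file to (path to temporary items folder as string) & "temp.mol"
	set fileRef to open for access file target_file with write permission
	try
		set eof of fileRef to 0
		write theMolfile to fileRef
		close access fileRef
	on error errMsg
		-- Always release the file before passing the error on
		close access fileRef
		error errMsg
	end try
	tell application "MacPyMOL"
		activate
		open file target_file
	end tell
end if
```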

Using lipitor1.png and MacPyMOL you should see this.

You can download a copy of the script here.
set theMetadata to ""
set theFile to (choose file with prompt "Choose an image file")
set the_path to theFile as string
--display dialog the_path
set posix_path to POSIX path of the_path

--store the chosen button so that later commands cannot overwrite the result
set theChoice to button returned of (display dialog "How would you like to display the structure? " buttons {"Cancel", "ChemDraw", "MacPyMol"} default button 1)

if theChoice is "ChemDraw" then
	--set theScript to "exiftool -SMILES -b /Users/username/Desktop/lipitor1.png"
	--quoted form protects paths that contain spaces
	set theScript to "exiftool -SMILES -b " & quoted form of posix_path
	set theMetadata to (do shell script theScript)
	if theMetadata is "" then
		--no SMILES found, so fall back to the molfile
		set theScript to "exiftool -molfile -b " & quoted form of posix_path
		--use as text to remove non-printing characters
		set theMetadata to (do shell script theScript) as text
		set target_file to (path to temporary items folder as string) & "temp.mol"
		write_to_file(theMetadata, target_file, false)
		tell application "CS ChemBioDraw Ultra"
			activate
			open file target_file
		end tell
	else
		--SMILES found: remove any tabs and paste it into ChemBioDraw
		set this_text to (replace_chars(theMetadata, tab, ""))
		set the clipboard to this_text as text
		tell application "CS ChemBioDraw Ultra"
			activate
			if enabled of menu item "Paste" then do menu item "SMILES" of menu "Paste Special" of menu "Edit"
		end tell
	end if
else if theChoice is "MacPyMol" then
	set theScript to "exiftool -molfile -b " & quoted form of posix_path
	--use as text to remove non-printing characters
	set theMetadata to (do shell script theScript) as text
	set target_file to (path to temporary items folder as string) & "temp.mol"
	write_to_file(theMetadata, target_file, false)
	tell application "MacPyMOL"
		activate
		open file target_file
	end tell
else
	--not normally reached: the Cancel button raises a user-cancelled error
	quit
end if

--Routines
on replace_chars(this_text, search_string, replacement_string)
	set AppleScript's text item delimiters to the search_string
	set the item_list to every text item of this_text
	set AppleScript's text item delimiters to the replacement_string
	set this_text to the item_list as string
	set AppleScript's text item delimiters to ""
	return this_text
end replace_chars

on write_to_file(this_data, target_file, append_data)
	try
		set the target_file to the target_file as text
		set the open_target_file to open for access file target_file with write permission
		if append_data is false then set eof of the open_target_file to 0
		write this_data to the open_target_file starting at eof
		close access the open_target_file
		return true
	on error
		try
			close access file target_file
		end try
		return false
	end try
end write_to_file


