Written by Colin Batchelor.
I’ll be talking at the 6th Joint Sheffield Conference on Cheminformatics in July on Validation and Standardization of Molecular Structures in General and Sugars in Particular. This is a taster.
Sugars in Particular
One of the big problems with chemical structure algorithms is that they can’t, in general, cope with the ways that chemists are accustomed to drawing sugar molecules. They will lose the stereochemistry around the sugar ring, collapsing D-glucose, say, on to L-glucose, not to mention allose, altrose, gulose and all the others.
(ChemDraw, I should note, can interpret chair stereo properly, but it is very much an exception.)
The first step in determining correct stereochemistry for a chair atom is recognizing a chair hexagon. That is the subject of this post.
Have you ever been in the same car as a satnav (US readers: this is the same as a GPS)? Whereas a human navigator will give general instructions like “go straight over all of the roundabouts till we reach the Red Lion”, a satnav only ever gives single-step, local instructions. “At the roundabout, take the third exit.” “In 100 metres, turn left.” Machine structure perception is rather like this. Instead of apprehending in an instant that the hexagon is a chair or a boat, as you or I would, the algorithm needs to step around the structure atom by atom, bond by bond.
The trick to identifying what kind of hexagon we are dealing with is to walk once around the ring and see whether, at each atom, we turn left or right. If we keep turning in the same direction all the way round, then we have a regularish hexagon. If we turn once in one direction, then twice in the other, then once in the first, then twice in the other (left, right, right, left, right, right, say), then we have a chair. There are six other sorts of hexagon you can draw, and they’re all depicted below alongside the corresponding sequences of turns.
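To make the turn-counting concrete, here is a minimal sketch in Python. It is not the code behind the talk: the function names (`turn_sequence`, `classify`) and the chair coordinates are made up for illustration, and it assumes the six ring atoms arrive as 2D points in ring order. The sign of the cross product of the incoming and outgoing bond vectors says whether we turn left or right at each atom.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def turn_sequence(ring: Sequence[Point], eps: float = 1e-9) -> str:
    """Return one 'L' or 'R' per atom, walking the ring in the order given.

    'L' means a left turn with the y axis pointing up; with screen
    coordinates (y down) the letters swap, which does not affect classify().
    """
    n = len(ring)
    turns: List[str] = []
    for i in range(n):
        x0, y0 = ring[i - 1]          # previous atom (wraps round for i == 0)
        x1, y1 = ring[i]              # atom we are standing on
        x2, y2 = ring[(i + 1) % n]    # next atom
        # z component of (incoming bond) x (outgoing bond)
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if abs(cross) < eps:
            raise ValueError(f"bonds at atom {i} are collinear; no turn to report")
        turns.append("L" if cross > 0 else "R")
    return "".join(turns)


def is_rotation(a: str, b: str) -> bool:
    """True if a is b read from a different starting atom."""
    return len(a) == len(b) and a in b + b


def classify(ring: Sequence[Point]) -> str:
    """Label a hexagon as 'regularish', 'chair' or 'other' from its turns."""
    turns = turn_sequence(ring)
    if len(turns) != 6:
        raise ValueError("expected a six-membered ring")
    if turns in ("LLLLLL", "RRRRRR"):
        return "regularish"
    # Once one way, twice the other, repeated; either handedness will do.
    if is_rotation(turns, "LRRLRR") or is_rotation(turns, "RLLRLL"):
        return "chair"
    return "other"


# Hypothetical chair drawing: atom 1 is the low left tip, atom 4 the high right tip.
chair = [(0.0, 0.3), (1.0, 1.0), (2.5, 0.8), (3.2, 1.1), (2.2, 0.4), (0.7, 0.6)]
print(turn_sequence(chair))  # RRLRRL
print(classify(chair))       # chair
```

Comparing against both `LRRLRR` and `RLLRLL`, and allowing any starting atom, means the answer does not depend on where the walk starts or which way round it goes.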
Some of them are familiar, like the boat, the twist boat, and the envelope. Others, less so.
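The total of eight turn patterns (the regularish hexagon, the chair and the six others) can be checked by brute force. The sketch below is my own bookkeeping rather than anything from the talk. It assumes that two sequences describe the same sort of hexagon when they differ only in the starting atom, the direction of the walk, or a mirror reflection of the drawing, and the helper name `canonical` is hypothetical. It counts patterns only and does not try to draw them.

```python
from itertools import product


def canonical(turns: str) -> str:
    """Pick one representative for all equivalent ways of reading a ring."""
    swapped = turns.translate(str.maketrans("LR", "RL"))  # mirror image
    variants = []
    # Reversal and L/R swapping together cover walking the ring backwards.
    for s in (turns, turns[::-1], swapped, swapped[::-1]):
        variants.extend(s[i:] + s[:i] for i in range(len(s)))  # any start atom
    return min(variants)


# Every possible left/right sequence for six atoms, reduced to representatives.
classes = sorted({canonical("".join(p)) for p in product("LR", repeat=6)})
print(len(classes))  # 8
```

Under those assumptions the 64 raw sequences fall into eight classes, with `LLLLLL` standing for the regularish hexagon and `LLRLLR` for the chair.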