This post is confusingly written. As the author points out, the issue is not with Damerau-Levenshtein distance, which certainly does obey the triangle inequality; rather, the issue is with the incorrect algorithm used to compute it. Nonetheless, after the introduction he usually calls the quantity that algorithm computes "Damerau-Levenshtein", when in fact it's an incorrect version of Damerau-Levenshtein.
The difference he's pointing out isn't that Levenshtein obeys the triangle inequality but Damerau-Levenshtein doesn't; it's that a naive algorithm to compute Levenshtein works, but a naive algorithm to compute Damerau-Levenshtein doesn't -- and that the measure it does compute does not obey the triangle inequality.
While it's clear that the author recognizes this, he should really be more explicit and avoid conflating terms like this; this sort of thing is going to confuse people.
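To make the distinction concrete, here is a minimal sketch of the naive algorithm, i.e. the usual Levenshtein table plus one extra case for swapping two adjacent characters. The function name `osa_distance` and the test strings are my own choices, not taken from the post. What it computes is the restricted edit distance (often called optimal string alignment), not true Damerau-Levenshtein, and the standard CA / AC / ABC example shows it breaking the triangle inequality.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted edit distance: Levenshtein plus adjacent transpositions,
    where no substring may be edited more than once."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # The "naive" extra case: swap the last two characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


print(osa_distance("CA", "AC"))   # 1 (one transposition)
print(osa_distance("AC", "ABC"))  # 1 (one insertion)
print(osa_distance("CA", "ABC"))  # 3, which exceeds 1 + 1
```

True Damerau-Levenshtein distance gives 2 for CA to ABC (transpose to AC, then insert B), so the inequality holds there. The naive table cannot find that path because it never inserts between two characters it has already transposed.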
Read more closely -- it's the "restricted edit distance" which does not obey the triangle inequality, which, if you read the "Algorithm" section, is not actually the same thing as Damerau-Levenshtein distance. (Notice also how the section you point to makes sure to differentiate between restricted edit distance and real edit distance, i.e. Damerau-Levenshtein distance.)
That Damerau-Levenshtein distance obeys the triangle inequality follows trivially from the definition, since it's just distance in an appropriate graph.
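Spelled out, the graph argument goes roughly like this. The notation is mine, not the post's: take the graph whose vertices are strings, with an edge between two strings whenever a single insertion, deletion, substitution, or adjacent transposition turns one into the other, and let $d(x, y)$ be the length of a shortest path.

```latex
% Let P be a shortest path from x to y and Q a shortest path from y to z.
% Concatenating them gives a walk from x to z of length |P| + |Q|,
% and a shortest path can be no longer than any particular walk:
\[
  d(x, z) \;\le\; |P| + |Q| \;=\; d(x, y) + d(y, z).
\]
```

The restricted edit distance is not a shortest-path length in this graph, because it forbids some paths (those that edit the same substring twice), and that is why the same argument does not carry over to it.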
You know, I have a mathematics and economics background. I love coming across CS/Applied Math gems like this. In my daily work I never even consider the computational consequences of resorting. Love it.
I must be losing my mind, but I can't figure out what 3 edits would get you from rick->irkc in the final diagram. It seems like the distance is 4, not 3 (not problematic because the triangle inequality still holds, but it's bugging the heck out of me).
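For what it's worth, if the final diagram uses plain Levenshtein distance (an assumption on my part, since the diagram isn't reproduced here), 3 is correct: delete the leading r (rick to ick), substitute c with r (ick to irk), then append c (irk to irkc). A quick sketch to check it, with a function name of my own choosing:

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


print(levenshtein("rick", "irkc"))  # 3
```

With transpositions allowed the distance drops to 2 (swap r and i, then swap c and k).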