7.24.06
The page includes this: <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
but it doesn't do anything for the rendering of those goofy characters.
Serious Perl programmers should go for the REAL solution, John Walker's Demoroniser, at http://www.fourmilab.ch/webtools/demoroniser/
and read his comments in the .pl file. Priceless.
I made a quick and dirty solution in my crude babytalk version of Perl: I pasted the damnable characters from the MS Works (!!) word processor into TextPad, which displays them as a vertical black bar. I haven't a clue what character code TextPad actually captures, but as far as I can see it works to clean up the text when a user of my content management tools dumps in copy using MS "smart" stuff. (As John Walker puts it: Translate moronic Microsoft bit-drool into vaguely readable and compatible HTML.)
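If you want to see which codes the editor really saved, a few lines of Perl will list them. This is only a rough sketch: high_bytes is a made-up helper name, and the sample string assumes the text arrived as raw Windows-1252 bytes, where 0x92 is the right single quote.

```perl
use strict;
use warnings;

# Hypothetical helper: report every byte above 0x7F in a raw byte string,
# together with its offset, so the mystery characters can be identified.
sub high_bytes {
    my ($bytes) = @_;
    my @found;
    while ($bytes =~ /([\x80-\xFF])/g) {
        # pos() points just past the match, so step back one byte
        push @found, sprintf("offset %d: 0x%02X", pos($bytes) - 1, ord($1));
    }
    return @found;
}

# Assumed sample: "It" + Windows-1252 right single quote (0x92) + "s"
print "$_\n" for high_bytes("It\x92s");
```

In Windows-1252 the usual suspects are 0x91 and 0x92 (single quotes), 0x93 and 0x94 (double quotes), 0x96 and 0x97 (short and long dash), and 0x85 (ellipsis).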
This is a right single quote: '
"This is enclosed in smart quotes"
This is a long dash: -
This is a short dash: -
This is an ellipsis: ...
These should be replaced by straight quotes and non-special characters.
The code more or less looks like this:
# Replace every left smart double quote in the headline with a straight quote
$headline =~ s/“/"/g;
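Stretched to cover all the sample characters above, the same idea might look something like the sketch below. It is not the Demoroniser and not my exact code: the %plain table and the plain_text name are made up for illustration, and it assumes the characters show up either as raw Windows-1252 bytes or as the matching Unicode code points. Both dashes become a single hyphen, as in the samples.

```perl
use strict;
use warnings;

# Assumed mapping from "smart" characters to plain ASCII. Each entry lists
# the Windows-1252 byte value and the equivalent Unicode code point.
my %plain = (
    "\x91" => "'",   "\x{2018}" => "'",    # left single quote
    "\x92" => "'",   "\x{2019}" => "'",    # right single quote
    "\x93" => '"',   "\x{201C}" => '"',    # left double quote
    "\x94" => '"',   "\x{201D}" => '"',    # right double quote
    "\x96" => '-',   "\x{2013}" => '-',    # short dash
    "\x97" => '-',   "\x{2014}" => '-',    # long dash
    "\x85" => '...', "\x{2026}" => '...',  # ellipsis
);

# Hypothetical helper: return a copy of the text with every character
# listed in %plain swapped for its plain replacement.
sub plain_text {
    my ($text) = @_;
    $text =~ s/([\x85\x91-\x94\x96\x97\x{2013}\x{2014}\x{2018}\x{2019}\x{201C}\x{201D}\x{2026}])/$plain{$1}/g;
    return $text;
}

# Windows-1252 bytes for: left quote, "smart", right quote, " quotes", ellipsis
print plain_text("\x93smart\x94 quotes\x85"), "\n";
```

A lookup table keeps the list of characters in one place, so adding another offender means adding one entry to %plain and one code to the character class.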