<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body smarttemplateinserted="true">
<div id="smartTemplate4-template">Hi,<br>
</div>
<p>I am supposed to find invalid escape sequences when parsing JSON
and replace them with a user-defined fallback. Invalid in the
sense that the Unicode code point is not defined or is an unpaired
surrogate, not syntactically invalid.<br>
</p>
<p>For example, any occurrence of \uFFFF and \uDEAD should be
replaced by the literal text \uffff and \udead respectively, or
alternatively with ????, depending on the settings.<br>
</p>
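<p>To make "invalid" concrete, here is a minimal sketch of such a
check. It assumes that only surrogates and the Unicode noncharacters
count; detecting unassigned code points would need a table from the
Unicode character database. TEscapeKind and ClassifyEscape are
hypothetical names, not part of the existing scanner:</p>
<pre>program EscapeCheck;
{$mode objfpc}{$H+}

type
  TEscapeKind = (ekValid, ekHighSurrogate, ekLowSurrogate, ekNonCharacter);

// Classifies the 16-bit value of a single \uXXXX escape.
// A surrogate is only invalid if it is unpaired, so the caller still has
// to check that a high surrogate is directly followed by a low one.
function ClassifyEscape(u: Word): TEscapeKind;
begin
  if (u &gt;= $D800) and (u &lt;= $DBFF) then
    Result := ekHighSurrogate
  else if (u &gt;= $DC00) and (u &lt;= $DFFF) then
    Result := ekLowSurrogate
  else if ((u &gt;= $FDD0) and (u &lt;= $FDEF)) or (u &gt;= $FFFE) then
    Result := ekNonCharacter   // U+FDD0..U+FDEF, U+FFFE, U+FFFF
  else
    Result := ekValid;
end;

begin
  WriteLn(ClassifyEscape($FFFF) = ekNonCharacter);
  WriteLn(ClassifyEscape($DEAD) = ekLowSurrogate);
  WriteLn(ClassifyEscape($0041) = ekValid);
end.</pre>
<p>Each of the three lines should print TRUE.</p>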
<p>I think I need to change the JSON scanner to be able to do that.</p>
<p>I could add a callback function OnInvalidEscape: function
(escapeStart: pchar): string of object;<br>
Or perhaps OnInvalidEscape: function (unicodePoint,
previousUnicodePointSurrogate: integer): string of object;
{although that would be troublesome if \uDEAD and \udead are
supposed to be replaced with different fallbacks}<br>
Or OnInvalidEscape: function (const escapedString: string[4]):
string of object;</p>
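<p>A rough sketch of the third variant, to show how a user would plug
in a fallback. This is only an illustration under assumptions:
TEscapeDigits, TInvalidEscapeEvent and TFallbackPolicy are
hypothetical names, and the scanner side is reduced to a plain
variable holding the callback:</p>
<pre>program CallbackSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  // Named type for the four hex digits, so it can be used as a parameter type.
  TEscapeDigits = string[4];
  // Hypothetical callback type; returns the text that replaces the escape.
  TInvalidEscapeEvent = function(const escapedString: TEscapeDigits): string of object;

  TFallbackPolicy = class
    UseQuestionMarks: Boolean;
    function HandleInvalid(const escapedString: TEscapeDigits): string;
  end;

function TFallbackPolicy.HandleInvalid(const escapedString: TEscapeDigits): string;
var
  digits: string;
begin
  digits := escapedString;
  if UseQuestionMarks then
    Result := '????'
  else
    Result := '\u' + LowerCase(digits); // keep the escape as literal text
end;

var
  Policy: TFallbackPolicy;
  OnInvalidEscape: TInvalidEscapeEvent; // stands in for a scanner property
begin
  Policy := TFallbackPolicy.Create;
  try
    OnInvalidEscape := @Policy.HandleInvalid;
    Policy.UseQuestionMarks := False;
    WriteLn(OnInvalidEscape('DEAD'));
    Policy.UseQuestionMarks := True;
    WriteLn(OnInvalidEscape('DEAD'));
  finally
    Policy.Free;
  end;
end.</pre>
<p>Because the handler sees the digits exactly as written, it can
still treat \uDEAD and \udead differently if it wants to.</p>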
<p>The function would return the unescaped value. Alternatively, the
current string could be passed to it as var parameter, and the
function would append its unescaped value directly.<br>
</p>
<p>Or move all unescaping to a callback function, which could be
called OnUnescape or OnDecodeEscape, so the scanner does not need to
decide which escapes are invalid. Then <br>
</p>
<pre>if (joUTF8 in Options) or (DefaultSystemCodePage=CP_UTF8) then
  S:=Utf8Encode(WideString(WideChar(u1)+WideChar(u2))) // ToDo: use faster function
else
  S:=String(WideChar(u1)+WideChar(u2)); // WideChar converts the encoding. Should it warn on loss?</pre>
<p>could be replaced by one function call. And if the user does not
set a callback function, the scanner would set its own callback
function depending on the option.</p>
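<p>Roughly like this, as a sketch and not a patch. TSketchScanner,
TDecodeEscapeEvent and the FUseUTF8 flag are hypothetical stand-ins
(the flag replaces the joUTF8/DefaultSystemCodePage test), and I
assume u2 is 0 when the escape is not part of a surrogate pair:</p>
<pre>program DefaultDecoder;
{$mode objfpc}{$H+}

type
  // Hypothetical callback type: decodes one escape (or a surrogate pair).
  TDecodeEscapeEvent = function(u1, u2: Word): string of object;

  TSketchScanner = class
  private
    FUseUTF8: Boolean;
    FOnDecodeEscape: TDecodeEscapeEvent;
    function DecodeToUTF8(u1, u2: Word): string;
    function DecodeToSystemCP(u1, u2: Word): string;
  public
    constructor Create(AUseUTF8: Boolean);
    function Unescape(u1, u2: Word): string;
    property OnDecodeEscape: TDecodeEscapeEvent
      read FOnDecodeEscape write FOnDecodeEscape;
  end;

// Builds the UTF-16 string for one or two code units (u2 = 0 means "none").
function ToUnicode(u1, u2: Word): UnicodeString;
begin
  Result := WideChar(u1);
  if u2 &lt;&gt; 0 then
    Result := Result + WideChar(u2);
end;

constructor TSketchScanner.Create(AUseUTF8: Boolean);
begin
  inherited Create;
  FUseUTF8 := AUseUTF8;
end;

function TSketchScanner.DecodeToUTF8(u1, u2: Word): string;
begin
  Result := UTF8Encode(ToUnicode(u1, u2));
end;

function TSketchScanner.DecodeToSystemCP(u1, u2: Word): string;
begin
  // Converts to the system code page; this may lose characters.
  Result := string(ToUnicode(u1, u2));
end;

function TSketchScanner.Unescape(u1, u2: Word): string;
begin
  // If the user did not set a callback, pick a default based on the option.
  if not Assigned(FOnDecodeEscape) then
    if FUseUTF8 then
      FOnDecodeEscape := @DecodeToUTF8
    else
      FOnDecodeEscape := @DecodeToSystemCP;
  Result := FOnDecodeEscape(u1, u2);
end;

var
  Scanner: TSketchScanner;
begin
  Scanner := TSketchScanner.Create(True);
  try
    // \u00E9 decoded through the default UTF-8 callback; prints the byte length.
    WriteLn(Length(Scanner.Unescape($00E9, 0)));
  finally
    Scanner.Free;
  end;
end.</pre>
<p>A user-supplied OnDecodeEscape would then be the single place
that decides what counts as invalid and what the fallback is.</p>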
<p>Any interest in a patch that adds such a callback function? Or is
there another way to do this?<br>
</p>
<div> Best,<br>
Benito <br>
</div>
</body>
</html>