Hi,<div><br></div><div>Just to go a step further: even in a UTF-16 string, where the "Code Element Size" probably returns 2, there is still no guarantee that every character fits in one element. For example, all common characters from European languages, Korean, and Japanese can be handled in 2 bytes in UTF-16, but some uncommon special characters in Japanese might need an additional 2 bytes. (And, for example, the Unicode code points for Egyptian hieroglyphic characters require surrogate pairs.)</div>
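<div><br></div><div>To make the surrogate pair point concrete, here is a minimal sketch in Free Pascal. It assumes a UnicodeString holds UTF-16 code units; CodePointCount is a hypothetical helper written for this example, not an RTL routine.</div>
<pre>
program SurrogateDemo;
{$mode objfpc}{$H+}

// Hypothetical helper, not an RTL routine: counts code points in a UTF-16
// string. A high surrogate (D800..DBFF) followed by a low surrogate
// (DC00..DFFF) is counted as a single code point.
function CodePointCount(const S: UnicodeString): SizeInt;
var
  I: SizeInt;
begin
  Result := 0;
  I := 1;
  while I &lt;= Length(S) do
  begin
    if (I &lt; Length(S)) and
       (Ord(S[I]) &gt;= $D800) and (Ord(S[I]) &lt;= $DBFF) and
       (Ord(S[I + 1]) &gt;= $DC00) and (Ord(S[I + 1]) &lt;= $DFFF) then
      Inc(I, 2)   // surrogate pair: two code units, one code point
    else
      Inc(I);     // BMP character (or an unpaired surrogate)
    Inc(Result);
  end;
end;

var
  S: UnicodeString;
begin
  // 'A' followed by U+13000 (an Egyptian hieroglyph), which UTF-16
  // encodes as the surrogate pair D80C DC00.
  SetLength(S, 3);
  S[1] := 'A';
  S[2] := WideChar($D80C);
  S[3] := WideChar($DC00);
  WriteLn('Code units:  ', Length(S));          // 3
  WriteLn('Code points: ', CodePointCount(S));  // 2
end.
</pre>
<div>So Length() reports three elements for what a reader would call two characters.</div>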
<div><br></div><div>At any rate, I wish more of this stuff were handled transparently by the base runtime. For example, the original Pascal spec for Copy() says it will copy X characters, not X bytes, but FPC copies X bytes. (This means we have to use different functions to handle e.g. UTF-8 strings properly.) Wishing won't make it come true, but...</div>
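<div><br></div><div>As a rough illustration of the difference, the sketch below compares Copy() with a code-point-aware copy. Utf8CopySketch is a hypothetical helper made up for this example (it is not part of the FPC RTL), and it assumes the string holds valid UTF-8 bytes.</div>
<pre>
program Utf8CopyDemo;
{$mode objfpc}{$H+}

// Hypothetical helper, not part of the FPC RTL: copies Count code points
// starting at code point StartCP (1-based) from a UTF-8 encoded string.
// Assumes S is valid UTF-8, where continuation bytes look like 10xxxxxx.
function Utf8CopySketch(const S: AnsiString; StartCP, Count: SizeInt): AnsiString;
var
  I, CP, FirstByte: SizeInt;
begin
  Result := '';
  if (StartCP &lt; 1) or (Count &lt; 1) then
    Exit;
  CP := 0;
  FirstByte := 0;
  I := 1;
  while I &lt;= Length(S) do
  begin
    if (Ord(S[I]) and $C0) &lt;&gt; $80 then   // not a continuation byte
    begin
      Inc(CP);                             // a new code point starts here
      if CP = StartCP then
        FirstByte := I;
      if CP = StartCP + Count then
        Break;                             // first byte past the wanted range
    end;
    Inc(I);
  end;
  if FirstByte &gt; 0 then
    Result := Copy(S, FirstByte, I - FirstByte);
end;

var
  S: AnsiString;
begin
  // h, e-acute, l, l, o as raw UTF-8 bytes; the e-acute is C3 A9.
  S := 'h'#$C3#$A9'llo';
  WriteLn(Length(Copy(S, 1, 2)));            // 2 bytes: 'h' plus half of the e-acute
  WriteLn(Length(Utf8CopySketch(S, 1, 2)));  // 3 bytes: 'h' plus the whole e-acute
end.
</pre>
<div>Copy(S, 1, 2) cuts the two-byte character in half, which is exactly the kind of thing I would like the runtime to handle for us.</div>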
<div><br></div><div>Thank you,</div><div> Noah Silva<br><br><div class="gmail_quote">2012/11/5 Jonas Maebe <span dir="ltr">&lt;<a href="mailto:jonas.maebe@elis.ugent.be" target="_blank">jonas.maebe@elis.ugent.be</a>&gt;</span><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"><br>
On 05 Nov 2012, at 11:49, ik wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
As I understand, AnsiString and AnsiChar contain the environment type<br>
of string (it can be ISO8859x, utf-8 etc...).<br>
If that is so, how can I know the size (in bytes) of AnsiChar?<br>
</blockquote>
<br></div>
The size of the type "AnsiChar" is always one byte. It is impossible to give the size of a generic "character" in a 1-byte string, because in e.g. UTF-8 the size depends on the code point. To get the size of a specific character, you can use<br>
widestringmanager.codepointlengthproc(pchar,maxlookahead)<br>
<br>
"maxlookahead" is the maximum number of bytes that routine is allowed to check to find the complete character (e.g. "widestringmanager.codepointlengthproc(@ansistringvar[3],length(ansistringvar)-2)"). It seems this routine has not yet been implemented for the Windows widestringmanager though.<span class="HOEnZb"><font color="#888888"><br>
<br>
<br>
Jonas</font></span><div class="HOEnZb"><div class="h5"><br>
_______________________________________________<br>
fpc-pascal maillist - <a href="mailto:fpc-pascal@lists.freepascal.org" target="_blank">fpc-pascal@lists.freepascal.org</a><br>
<a href="http://lists.freepascal.org/mailman/listinfo/fpc-pascal" target="_blank">http://lists.freepascal.org/mailman/listinfo/fpc-pascal</a><br>
</div></div></blockquote></div><br></div>