[fpc-devel] simple UTF tests

peter green plugwash at p10link.net
Mon Jan 9 12:40:28 CET 2012


>But this seems to be a proprietary Microsoft definition while AFAIK,
>"ANSI" denotes "American National Standards Institute".
While ANSI does denote the American National Standards Institute in
general, it doesn't really mean that in this context.

A Windows machine has two main code pages in use (both are language
dependent, and for some languages they may be the same code page): the
"OEM" code page and the "ANSI" code page. The "OEM" code page is one of
the original PC code pages and afaict is mostly used for the console.
The "ANSI" code page is used for the non-Unicode versions of functions
in Windows itself.
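
To make the OEM/ANSI split concrete, here is a minimal Python sketch. It
assumes a US English setup, where the OEM code page is 437 and the ANSI
code page is 1252; other locales use other pairs. The variable names are
only illustrative.

```python
# Assumption: US English Windows, i.e. OEM = code page 437, ANSI = 1252.
OEM_CODEC = "cp437"
ANSI_CODEC = "cp1252"

text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# The same character gets a different byte value in each code page.
ansi_bytes = text.encode(ANSI_CODEC)
oem_bytes = text.encode(OEM_CODEC)

# Bytes written under the ANSI code page but read back as OEM
# (e.g. printed raw to a console) come out as a different character.
misread = ansi_bytes.decode(OEM_CODEC)

print(ansi_bytes.hex(), oem_bytes.hex(), ascii(misread))
```

This is why text that looks fine in a GUI program can show up garbled in
a console window on the same machine.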

The term "ANSI" comes from the fact that the initial "ANSI" code page
(1252) was based on an ANSI draft of what became ISO-8859-1. 1252 is
fairly close to ISO-8859-1 (it just replaces the rarely used C1 control
characters at bytes 0x80-0x9F with printable characters), but most of
the other "ANSI code pages" bear little to no relationship to any ANSI
or ISO standard encoding.
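
The relationship between 1252 and ISO-8859-1 can be checked with a short
Python sketch that compares how the two codecs decode every byte value.
Note that Python's "cp1252" codec leaves five bytes in that range
undefined, so the sketch decodes with errors="replace"; Windows itself
may treat those bytes differently.

```python
# Collect every byte value that decodes differently in the two encodings.
differing = []
for value in range(256):
    raw = bytes([value])
    as_latin1 = raw.decode("latin-1")
    # errors="replace" because Python's cp1252 table has undefined slots.
    as_cp1252 = raw.decode("cp1252", errors="replace")
    if as_latin1 != as_cp1252:
        differing.append(value)

# Only the 32 bytes 0x80-0x9F differ; for example 0x80 is the euro sign
# in 1252 but a control character in ISO-8859-1.
print([hex(v) for v in differing])
```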

Afaict in Europe, America and Australasia both the "ANSI" and "OEM" code
pages are simple encodings with one byte per user-visible character and
all characters drawn left to right. Once you move to Asia and Africa,
though, that no longer holds: CJK languages are represented by
multibyte encodings, Vietnamese is represented using combining
characters, and Middle Eastern languages bring the complications of
bidirectional text. MS encourages programmers to use Unicode nowadays,
and afaict languages added more recently to Windows (like the Indic
languages) don't have any non-Unicode support at all.
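
As a small illustration of the "one byte per character" assumption
breaking down, the Python sketch below encodes two Japanese characters
with code page 932 (the Japanese "ANSI" code page) and two Latin letters
with 1252. The sample strings are arbitrary.

```python
# Two CJK characters (the word for "Japan"), written as escapes.
cjk_text = "\u65e5\u672c"
latin_text = "ab"

# Code page 932 is a multibyte encoding: each of these characters
# takes two bytes, so byte length and character count diverge.
cjk_bytes = cjk_text.encode("cp932")

# Code page 1252 is single-byte: one byte per character.
latin_bytes = latin_text.encode("cp1252")

print(len(cjk_text), len(cjk_bytes))
print(len(latin_text), len(latin_bytes))
```

Code that indexes or truncates a string by byte offset works by accident
under 1252 and can split a character in half under 932.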

Windows also defines other code page numbers that are used as neither
ANSI nor OEM code pages. UTF-8 falls into this category.

Delphi is a Windows program (yeah, there was an abortive Linux port, but
that came much later and didn't stick around for long), so it inherits
Windows terminology. FPC/Lazarus is essentially a Delphi clone but is
cross-platform, so it's put in the position of trying to interpret and
stretch Windows-grounded ideas to fit a cross-platform context.



