[fpc-devel] Trying to understand the wiki-Page "FPC Unicodesupport"

Sat Nov 29 15:02:58 CET 2014

In our previous episode, Hans-Peter Diettrich said:
> >> storage, we'll have to take that into account.
> > 
> > (16-bit codepages were designed into OS/2 and Windows NT before utf-8 even
> > existed)
> 
> Right, both systems were developed by Microsoft :-]

A cooperation between IBM and Microsoft that ran from 1984 until somewhere in
the early nineties, yes. (Or Micro Soft, I can't remember when they dropped
the space.)

From
http://en.wikipedia.org/wiki/Microsoft#1984.E2.80.9394:_Windows_and_Office

"Microsoft released its version of OS/2 to original equipment manufacturers
(OEMs) on April 2, 1987"

> No problem, as long as proper host/network byteorder conversion is 
> applied in reading/writing such files. 

I don't see that as self-evident. CRLF vs LF is not fully transparent
either; just open an LF-only file with Notepad. Many Unix editors show the CRs, etc.
There isn't even a universal marker to signal it (like a BOM does for byte order).
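
To make the asymmetry concrete, here is a rough sketch (in Python rather than
Pascal, purely for brevity). The function names sniff_bom and sniff_newlines
are made up for this illustration and are not part of any RTL. Byte order can
be read off a BOM when one is present, while the line-ending convention can
only be guessed by counting:

```python
import codecs

# Known byte-order marks. UTF-32 must be checked before UTF-16 because
# the UTF-32-LE BOM begins with the same two bytes as the UTF-16-LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) when no BOM is present."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def sniff_newlines(text: str):
    """Guess the line-ending convention by counting (hypothetical helper).

    Returns "crlf", "lf", "cr", "mixed", or None if there are no line breaks.
    """
    crlf = text.count("\r\n")
    lf = text.count("\n") - crlf   # bare LF only
    cr = text.count("\r") - crlf   # bare CR only
    counts = {"crlf": crlf, "lf": lf, "cr": cr}
    kinds = [kind for kind, n in counts.items() if n > 0]
    if not kinds:
        return None
    return kinds[0] if len(kinds) == 1 else "mixed"
```

Note that sniff_newlines is a heuristic over the whole text, whereas sniff_bom
only has to look at the first few bytes.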

Putting layer upon layer in a misguided attempt to make anything accept
anything transparently is IMHO a waste of both time and computing resources.  Better
to intensively maintain a few good converters, and to strengthen metadata
processing and retention so that conversion is automatic in the few places where it
really matters. I'm no security expert, but I guess that is better from a security
viewpoint too.

> But times have changed, nowadays the Internet requires certain common 
> standards (e.g. 8-bit bytes = octets, HTML, Unicode and more), which 
> allow for data exchange across machine and country boundaries.

Internet protocols are properly annotated with metadata, so they are the easiest to
deal with.  That doesn't make it a requirement to push this throughout the
whole RTL; a simple routine in e.g. the webserver can handle that at the
gate without bogging down the rest of the system with redundant checks.
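
A minimal sketch of what "handling it at the gate" could look like, again in
Python for brevity. The names charset_from_content_type and decode_at_gate are
hypothetical, and the UTF-8 fallback is an assumption of this sketch, not
something any protocol mandates here. The declared charset is read once from
the Content-Type metadata, the body is converted once, and nothing behind the
gate needs to check encodings again:

```python
from email.message import Message


def charset_from_content_type(header_value: str, default: str = "utf-8") -> str:
    """Extract the charset parameter from a Content-Type header value.

    The fallback to `default` is an assumption made for this sketch.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset lower-cased, or None.
    return msg.get_content_charset() or default


def decode_at_gate(body: bytes, content_type: str) -> str:
    """Convert once at the boundary; code past this point only sees str."""
    return body.decode(charset_from_content_type(content_type))
```

For example, decode_at_gate(raw_body, "text/plain; charset=ISO-8859-1")
decodes raw_body as Latin-1. A body that does not match its declared charset
raises UnicodeDecodeError right at the gate instead of somewhere deep in the
system.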