[fpc-pascal] TZipper and special file names like "atenção.txt" (#26213)

Craig Peterson craig at scootersoftware.com
Fri May 23 20:50:39 CEST 2014


> Nice. I can do it, opening a new issue in bugtracker.

Filename encoding in zip files is poorly defined.  The current
APPNOTE.txt says that the only valid encoding is OEM 437, or UTF-8 if
bit 11 (the language encoding flag) of the general purpose bit flag is
set in the header, but those were recent additions.  In practice,
Windows applications generally use either the OEM or ANSI codepage of
the current system locale, and files generated on Unix are typically
UTF-8 but don't have the language encoding bit set.
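
As a rough illustration of that decoding rule, here is a minimal sketch
in Python (chosen only for brevity; it is not Abbrevia's or paszlib's
code).  The names decode_zip_name and legacy_codepage are made up for
this example.  It honours the language encoding bit, otherwise tries
UTF-8 first because Unix tools often leave the bit unset, and finally
falls back to a codepage supplied by the caller:

```python
UTF8_FLAG = 0x0800  # bit 11 of the general purpose bit flag


def decode_zip_name(raw: bytes, flag_bits: int,
                    legacy_codepage: str = "cp437") -> str:
    """Decode a raw zip filename (hypothetical helper, heuristic only)."""
    if flag_bits & UTF8_FLAG:
        # The archive explicitly declares UTF-8.
        return raw.decode("utf-8")
    try:
        # Flag not set: the name may still be UTF-8 (common on Unix).
        # Pure ASCII names also pass through this branch unchanged.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume the caller's OEM or ANSI codepage.
        # Bytes the codepage does not define become U+FFFD.
        return raw.decode(legacy_codepage, errors="replace")


if __name__ == "__main__":
    name = "atenção.txt"
    print(decode_zip_name(name.encode("utf-8"), 0))
    print(decode_zip_name(name.encode("cp850"), 0, "cp850"))
    print(decode_zip_name(name.encode("cp1252"), 0, "cp1252"))
```

The UTF-8 check is only a heuristic: non-ASCII text in an OEM or ANSI
codepage is rarely valid UTF-8 by accident, but it can happen, and the
sketch has no way to tell OEM from ANSI on its own.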

Abbrevia's zip encoding/decoding tries to handle the issue in as
compatible a manner as possible.  It stores the original filenames as
OEM/ANSI based on the current system, and stores a UTF-8 copy in an
extended header so there's a known way to decode it when changing
locales.  When reading, it has to use lookup tables to guess whether
the filenames are likely OEM or ANSI.  On Unicode-enabled Delphi
releases it fully supports Unicode filenames; on FreePascal and older
Delphi releases it only supports ANSI filenames but still does proper
encoding/decoding.
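
To make the "UTF-8 copy in an extended header" idea concrete, here is a
small Python sketch that builds and reads the Info-ZIP Unicode Path
extra field (header ID 0x7075) as described in APPNOTE.txt.  That
Abbrevia uses exactly this field is an assumption on my part, and the
function names are hypothetical.  The field carries a version byte, a
CRC-32 of the legacy filename bytes, and the UTF-8 name; the CRC lets a
reader ignore the copy if the legacy name was changed afterwards:

```python
import struct
import zlib

UNICODE_PATH_ID = 0x7075  # Info-ZIP Unicode Path extra field


def build_unicode_path_field(legacy_name: bytes, unicode_name: str) -> bytes:
    """Return the extra-field bytes holding a UTF-8 copy of the name."""
    crc = zlib.crc32(legacy_name) & 0xFFFFFFFF
    payload = struct.pack("<BI", 1, crc) + unicode_name.encode("utf-8")
    return struct.pack("<HH", UNICODE_PATH_ID, len(payload)) + payload


def read_unicode_path(extra: bytes, legacy_name: bytes):
    """Return the UTF-8 name from the extra data, or None if unusable."""
    pos = 0
    while pos + 4 <= len(extra):
        field_id, size = struct.unpack_from("<HH", extra, pos)
        data = extra[pos + 4:pos + 4 + size]
        pos += 4 + size
        if field_id != UNICODE_PATH_ID or len(data) < 5:
            continue  # some other extra field, or truncated data
        version, crc = struct.unpack_from("<BI", data, 0)
        # Only trust the copy if it matches the stored legacy name.
        if version == 1 and crc == (zlib.crc32(legacy_name) & 0xFFFFFFFF):
            return data[5:].decode("utf-8")
    return None


if __name__ == "__main__":
    legacy = "atenção.txt".encode("cp850")
    extra = build_unicode_path_field(legacy, "atenção.txt")
    print(read_unicode_path(extra, legacy))
```

A reader that finds no usable field still has to fall back to guessing
between OEM and ANSI, which is where the lookup tables come in.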

The relevant code is in AbZipTyp.pas in TAbZipItem.SetFilename and
TAbZipItem.LoadFromStream if you want a reference.  It's under the MPL,
but I'm the original author and I'm happy to relicense it if someone
else wants to incorporate the code into paszlib.

https://sourceforge.net/p/tpabbrevia/code/HEAD/tree/trunk/source/AbZipTyp.pas

-- 
Craig Peterson
Scooter Software



