[fpc-pascal] helpsystem, some numbers
Marco van de Voort
marcov at stack.nl
Tue Oct 28 12:01:05 CET 2008
In our previous episode, Graeme Geldenhuys said:
> > In fact deflate/zip is 18-19 years old and there are a lot of better
> > compression algorithms, like LZX. I think there is one implemented in Pascal
> > (ABC, if memory doesn't fail).
>
> I'm still trying to find a compression algorithm that beats whatever
> 7-zip uses. The results are orders of magnitude smaller than any other
> compression algorithm I have seen.
>
> The important thing for TZipFile component is that the archive format
> must compress every file separately. Otherwise you can't extract a
> specific file without unpacking everything first.
But the ZIP is 5-6 times larger than the CHM, which can do all this too, and
for CHM we already have the whole software shebang without dependencies.
I was somewhat surprised that bz2 was smaller by another factor of 2, and
according to Eduardo it is possible to extract blocks separately, without
changing compression parameters. Then you could index the tar+bz2 (which file
is in which block, plus the offset within the block) by decompressing it fully
once, and afterwards extract single files.
Still, that adds another index with its own handling, and a lot of work,
whereas CHM already works and is not too bad.
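To make the indexing idea concrete, here is a minimal sketch in Python. Note
that it is a simplification and not the scheme described above: instead of
locating blocks inside one existing tar+bz2 stream, it compresses every file
as its own bz2 stream and concatenates them, which gives up some compression
ratio. The function names build_archive and extract_one and the layout of the
index are made up for illustration.

```python
import bz2
import io


def build_archive(files):
    """Compress each file as an independent bz2 stream.

    `files` maps a name to its raw bytes. Returns (blob, index), where
    `index` maps each name to the (offset, length) of its compressed
    stream inside `blob`.
    """
    blob = io.BytesIO()
    index = {}
    for name, data in files.items():
        compressed = bz2.compress(data)
        # Remember where this member starts and how long it is.
        index[name] = (blob.tell(), len(compressed))
        blob.write(compressed)
    return blob.getvalue(), index


def extract_one(blob, index, name):
    """Decompress a single member without touching the others."""
    offset, length = index[name]
    return bz2.decompress(blob[offset:offset + length])


if __name__ == "__main__":
    docs = {
        "rtl/system.html": b"<html>system unit</html>" * 200,
        "fcl/classes.html": b"<html>classes unit</html>" * 200,
    }
    blob, index = build_archive(docs)
    print(len(blob), index)
    print(extract_one(blob, index, "fcl/classes.html")[:26])
```

The extra index is exactly the additional bookkeeping mentioned above; a
block-level index over a single bz2 stream would need more work still, since
block boundaries there are not byte aligned.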
> The other thing is the algorithms need to be free and supporting
> Unicode.
A compression algorithm is not related to Unicode. That's the job of the
archive component.
> 7-zip's LZMA does pass both requirements. I'm just not sure if it
> compresses files separately - I would imagine it can/does.
A portable, not overly complex implementation in Pascal is also a
requirement IMHO. Not a hard one, but the fact that it is already there for
CHM makes it one for an alternative.