[fpc-pascal] FPC with case insensitive file system under Linux

Mark Morgan Lloyd markMLl.fpc-pascal at telemetry.co.uk
Fri Feb 24 17:32:04 CET 2012


Graeme Geldenhuys wrote:
> Hi,
> 
> [rant]
> I'm just sick of the idiocy of Linux/Unix with their case sensitive
> file systems! Googling around for the reason for this, it seems
> that in the 60's, it was C programmers that decided that searching for
> case sensitive files was easier to implement (and marginally faster).

I find that rather difficult to believe, since C was barely conceived in 
the '60s, and back in those days the dominant character I/O devices were 
(EBCDIC) punched cards and the (ASCII) ASR-33, both of which IIRC were 
de facto uppercase-only.

Now it might be that by the time the Bell workers were hacking UNIX and 
C they'd got video terminals with full character sets, and it might 
be that they gravitated towards lower-case because of received wisdom 
that the increased variation between letters makes them easier to read. 
But in any event their reluctance to assume any particular mapping 
between upper- and lower-case was probably influenced by the fact that 
there were still two major character sets (EBCDIC and ASCII) as well as 
several minor ones some of which were being used by the US government 
(e.g. Fieldata). As discussed in the context of an IBM mainframe port of 
FPC, it's bad enough having to deal with multiple mappings in the system 
library without having to define them as part of the language.

In any event, history has shown that they probably made the right 
decision, and similarly made the right decision when they pegged UNIX 
timestamps to GMT/UTC. The fact that Microsoft made different choices 
has caused nothing but grief.

All of which suggests that at the current time, when increasing numbers 
of people are wrestling with Unicode, we should all be very much aware 
of the possible problems that converting (or not) between similar 
characters can cause. For example, I found myself writing this yesterday:

       $50: { p P } inject:= #$002A; { * }
       $DB: { [  } inject:= #$2190; { ← }
       $DD: { ]  } inject:= #$2192; { → }

Some founts (e.g. the one used by default by GTK2 Lazarus) use almost 
identical glyphs for the braces in those comments, and it's only a 
matter of time before somebody with more ingenuity than common sense 
tries to use these to slip backdoor code past the casual reviewer.
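
A crude guard against that sort of thing is to flag every byte outside 
7-bit ASCII before a reviewer reads the source. What follows is only a 
minimal sketch, assuming UTF-8 source files; the program name 
FlagNonAscii and the helper FirstNonAscii are made up for illustration 
and are not part of FPC or Lazarus.

```pascal
program FlagNonAscii;
{$mode objfpc}{$H+}

{ Hypothetical helper: returns the 1-based byte offset of the first
  byte above #127 in Line, or 0 if the line is pure 7-bit ASCII. }
function FirstNonAscii(const Line: string): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(Line) do
    if Ord(Line[i]) > 127 then
    begin
      Result := i;
      Exit;
    end;
end;

var
  F: Text;
  Line: string;
  LineNo, Col: Integer;
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: flagnonascii <source-file>');
    Halt(1);
  end;
  Assign(F, ParamStr(1));
  Reset(F);
  LineNo := 0;
  { Report only the first suspicious byte on each line. }
  while not Eof(F) do
  begin
    ReadLn(F, Line);
    Inc(LineNo);
    Col := FirstNonAscii(Line);
    if Col > 0 then
      WriteLn(ParamStr(1), ':', LineNo, ': non-ASCII byte at offset ', Col);
  end;
  Close(F);
end.
```

It cannot tell a legitimate arrow in a comment from a lookalike brace; 
it only tells the reviewer which lines deserve a second look.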

So in summary, (not) translating characters is something that shouldn't 
be approached without deep understanding of the issues.

> Anybody know of other Linux file systems that have a case insensitive
> option? I really thought ext2 had this, but searching now through the
> man pages, it seems I was mistaken. Anybody know if Btrfs would have
> such an option?

The obvious ones are the Windows filesystems (vfat below) and possibly 
CD filesystems, mounted with appropriate options in /etc/fstab:

# Additional devices.

/dev/fd0   /mnt/floppy vfat    noauto,user,shortname=winnt     0   0
/dev/hda4  /mnt/zip    vfat    noauto,user,shortname=winnt     0   0
/dev/cdrom /mnt/cdrom  iso9660 noauto,user,ro                  0   0

This obviously includes USB mass storage devices such as cameras.

Not sure about btrfs. I tried using it a few months ago, but it was very 
unclear whether I was getting any useful compression when it was 
enabled. I subsequently discovered that this depends on kernel version. 
Frankly, it's too near the "bleeding edge" to be used in anger.

-- 
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]


