[fpc-devel] Unit for handling UTF-8 strings
Michael Van Canneyt
michael at freepascal.org
Tue Apr 9 08:55:15 CEST 2013
On Tue, 9 Apr 2013, Mattias Gaertner wrote:
> On Tue, 09 Apr 2013 08:24:11 +0200
> Michael Schnell <mschnell at lumino.de> wrote:
>
>> On 04/08/2013 07:02 PM, Mattias Gaertner wrote:
>>> I guess, you mean encoded string types.
>>
>> AFAIK, you can just create string variables of the appropriate coding
>> type and an assignment will do auto-conversion.
>
> Yes.
> But how do you examine the characters?
> If I understand Michael right, there will be some "implicit functions"
> for that. I wonder how they work.

See the character unit:

// flat functions
function ConvertFromUtf32(AChar : UCS4Char) : UnicodeString;
function ConvertToUtf32(const AString : UnicodeString; AIndex : Integer) : UCS4Char; overload;
function ConvertToUtf32(const AString : UnicodeString; AIndex : Integer; out ACharLength : Integer) : UCS4Char; overload;
function ConvertToUtf32(const AHighSurrogate, ALowSurrogate : UnicodeChar) : UCS4Char; overload;
function GetNumericValue(AChar : UnicodeChar) : Double; overload;
function GetNumericValue(const AString : UnicodeString; AIndex : Integer) : Double; overload;
function GetUnicodeCategory(AChar : UnicodeChar) : TUnicodeCategory; overload;
function GetUnicodeCategory(const AString : UnicodeString; AIndex : Integer) : TUnicodeCategory; overload;
function IsControl(AChar : UnicodeChar) : Boolean; overload;
function IsControl(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsDigit(AChar : UnicodeChar) : Boolean; overload;
function IsDigit(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsSurrogate(AChar : UnicodeChar) : Boolean; overload;
function IsSurrogate(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsHighSurrogate(AChar : UnicodeChar) : Boolean; overload;
function IsHighSurrogate(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsLowSurrogate(AChar : UnicodeChar) : Boolean; overload;
function IsLowSurrogate(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsSurrogatePair(const AHighSurrogate, ALowSurrogate : UnicodeChar) : Boolean; overload;
function IsSurrogatePair(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsLetter(AChar : UnicodeChar) : Boolean; overload;
function IsLetter(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsLetterOrDigit(AChar : UnicodeChar) : Boolean; overload;
function IsLetterOrDigit(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsLower(AChar : UnicodeChar) : Boolean; overload;
function IsLower(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsNumber(AChar : UnicodeChar) : Boolean; overload;
function IsNumber(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsPunctuation(AChar : UnicodeChar) : Boolean; overload;
function IsPunctuation(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsSeparator(AChar : UnicodeChar) : Boolean; overload;
function IsSeparator(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsSymbol(AChar : UnicodeChar) : Boolean; overload;
function IsSymbol(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsUpper(AChar : UnicodeChar) : Boolean; overload;
function IsUpper(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function IsWhiteSpace(AChar : UnicodeChar) : Boolean; overload;
function IsWhiteSpace(const AString : UnicodeString; AIndex : Integer) : Boolean; overload;
function ToLower(AChar : UnicodeChar) : UnicodeChar; overload;
function ToLower(const AString : UnicodeString) : UnicodeString; overload;
function ToUpper(AChar : UnicodeChar) : UnicodeChar; overload;
function ToUpper(const AString : UnicodeString) : UnicodeString; overload;
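
To answer the "how do you examine the characters" question, here is a rough sketch of walking a UnicodeString code point by code point with the functions above. It is an illustration only, not code from the unit: the program name and variables are made up, and it assumes an FPC version that ships the character unit with exactly these declarations and 1-based string indexes.

```pascal
program ExamineChars;
{$mode objfpc}{$H+}

uses
  character;

var
  S: UnicodeString;
  I, Len: Integer;
  CP: UCS4Char;
begin
  // Build a test string: 'a', then U+1F600 (outside the BMP, so it is
  // stored as a surrogate pair in UTF-16), then '1'.
  S := 'a';
  S := S + ConvertFromUtf32(UCS4Char($1F600));
  S := S + '1';

  I := 1;
  while I <= Length(S) do
  begin
    // Len receives the number of UTF-16 code units that make up the
    // code point starting at index I.
    CP := ConvertToUtf32(S, I, Len);
    WriteLn('index ', I,
            ': U+', HexStr(LongInt(CP), 6),
            ' units=', Len,
            ' letter=', IsLetter(S, I),
            ' digit=', IsDigit(S, I));
    // Step over the whole code point, not just one code unit.
    Inc(I, Len);
  end;
end.
```

The point of using the (AString, AIndex) overloads instead of the UnicodeChar ones is that a single UnicodeChar cannot hold a code point outside the BMP, so advancing by the returned length keeps the index from landing in the middle of a surrogate pair.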