Filter the data !!!
-
@Alan-Kilborn said in Filter the data !!!:
So do people that use non-basic encodings get stuck looking at odd sequences in the Find what box as they are composing a search term?
I don’t use non-basic encodings, but I don’t think this is an issue, because the system font, which as far as I know is what the dialog uses, normally handles this.
-
Hi, @alan-kilborn and All,
Seemingly, in the config.xml file, all characters above \x{007F} ( so not pure ASCII ) are encoded with the usual XML syntax &#x....; , where each dot stands for a hexadecimal digit.
Characters over \x{FFFF} ( so outside the Unicode Basic Multilingual Plane - BMP ) are represented with two 16-bit code units called a surrogate pair. Refer to :
https://en.wikipedia.org/wiki/Universal_Character_Set_characters#Surrogates
https://en.wikipedia.org/wiki/UTF-16#Code_points_from_U+010000_to_U+10FFFF
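For anyone who wants to see the arithmetic, here is a minimal Python sketch of the two rules above. The function names surrogate_pair and xml_char_refs are made up for illustration, and writing one &#x....; reference per UTF-16 code unit is only an assumption drawn from what the file seems to contain. I have not checked it against the Notepad++ source.

```python
def surrogate_pair(code_point):
    """Split a code point above U+FFFF into its UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above the BMP have a surrogate pair")
    offset = code_point - 0x10000       # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low


def xml_char_refs(text):
    """Write every UTF-16 code unit above 0x7F as an &#x....; reference.

    Assumption: one reference per code unit, as the post describes.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0x7F:
            out.append(ch)                      # pure ASCII stays as-is
        elif cp <= 0xFFFF:
            out.append(f"&#x{cp:04X};")         # BMP character, one code unit
        else:
            out.extend(f"&#x{unit:04X};" for unit in surrogate_pair(cp))
    return "".join(out)


print(xml_char_refs("caf\u00e9 \U0001F600"))  # caf&#x00E9; &#xD83D;&#xDE00;
```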
An example :
In this example, the last character, displayed by the Courier New font as a small white square box, is the OSMANYA letter BA ( Unicode code-point 10481 ), which can be described with the surrogate pair \x{D801}\x{DC81}, correctly handled and decoded by your OS !
Refer to http://www.unicode.org/charts/PDF/U10480.pdf
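If you want to double-check the pair yourself, a small Python sketch can reverse the computation. The helper name from_surrogates is hypothetical; unicodedata is the standard-library module, and the name it reports depends on the Unicode database bundled with your Python.

```python
import unicodedata


def from_surrogates(high, low):
    """Combine a UTF-16 high/low surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


cp = from_surrogates(0xD801, 0xDC81)
print(f"U+{cp:04X}", unicodedata.name(chr(cp)))  # U+10481 OSMANYA LETTER BA
```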
Best Regards,
guy038
-
This is how it looks on your system, but I assume it might look different on a system where OSMANYA is more common.
That is, of course, if there is a localized version of Windows that takes OSMANYA into account.