Lexicographically or Alphabetically
-
Should "Lexicographically" perhaps be "Alphabetically" in these strings?
<Item id="42059" name="Sort Lines Lexicographically Ascending"/>
<Item id="42060" name="Sort Lines Lexicographically Descending"/>
I don't know the difference. I would just think "Alphabetically" is a better-known word, but maybe that only applies to A-Z and not to other characters.
-
Hello, @scootergrisen, and All,
Strictly speaking, the N++ sort feature sorts lines by the Unicode code points of their characters!
Take, for instance, these few characters, picked at random from my Courier New font, v2.90, which I sorted alphabetically by their official Unicode names:
ﯿ U+FBFF ARABIC LETTER FARSI YEH MEDIAL FORM
ج U+062C ARABIC LETTER JEEM
ق U+0642 ARABIC LETTER QAF
♫ U+266B BEAMED EIGHTH NOTES
♣ U+2663 BLACK CLUB SUIT
┼ U+253C BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
U+0009 CHARACTER TABULATION
̀ U+0300 COMBINING GRAVE ACCENT
© U+00A9 COPYRIGHT SIGN
Ќ U+040C CYRILLIC CAPITAL LETTER KJE
к U+043A CYRILLIC SMALL LETTER KA
2 U+0032 DIGIT TWO
= U+003D EQUALS SIGN
€ U+20AC EURO SIGN
≥ U+2265 GREATER-THAN OR EQUAL TO
Δ U+0394 GREEK CAPITAL LETTER DELTA
δ U+03B4 GREEK SMALL LETTER DELTA
א U+05D0 HEBREW LETTER ALEF
אָ U+FB2F HEBREW LETTER ALEF WITH QAMATS
ר U+05E8 HEBREW LETTER RESH
Ă U+0102 LATIN CAPITAL LETTER A WITH BREVE
Ầ U+1EA6 LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
É U+00C9 LATIN CAPITAL LETTER E WITH ACUTE
N U+004E LATIN CAPITAL LETTER N
ă U+0103 LATIN SMALL LETTER A WITH BREVE
ầ U+1EA7 LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE
é U+00E9 LATIN SMALL LETTER E WITH ACUTE
n U+006E LATIN SMALL LETTER N
ø U+00F8 LATIN SMALL LETTER O WITH STROKE
“ U+201C LEFT DOUBLE QUOTATION MARK
_ U+005F LOW LINE
ˉ U+02C9 MODIFIER LETTER MACRON
× U+00D7 MULTIPLICATION SIGN
U+FFFC OBJECT REPLACEMENT CHARACTER
% U+0025 PERCENT SIGN
” U+201D RIGHT DOUBLE QUOTATION MARK
→ U+2192 RIGHTWARDS ARROW
U+0020 SPACE
√ U+221A SQUARE ROOT
™ U+2122 TRADE MARK SIGN
| U+007C VERTICAL LINE
½ U+00BD VULGAR FRACTION ONE HALF
☺ U+263A WHITE SMILING FACE
U+FEFF ZERO WIDTH NO-BREAK SPACE
If you use the option Edit > Line Operations > Sort Lines Lexicographically Ascending, this list becomes :
U+0009 CHARACTER TABULATION
U+0020 SPACE
% U+0025 PERCENT SIGN
2 U+0032 DIGIT TWO
= U+003D EQUALS SIGN
N U+004E LATIN CAPITAL LETTER N
_ U+005F LOW LINE
n U+006E LATIN SMALL LETTER N
| U+007C VERTICAL LINE
© U+00A9 COPYRIGHT SIGN
½ U+00BD VULGAR FRACTION ONE HALF
É U+00C9 LATIN CAPITAL LETTER E WITH ACUTE
× U+00D7 MULTIPLICATION SIGN
é U+00E9 LATIN SMALL LETTER E WITH ACUTE
ø U+00F8 LATIN SMALL LETTER O WITH STROKE
Ă U+0102 LATIN CAPITAL LETTER A WITH BREVE
ă U+0103 LATIN SMALL LETTER A WITH BREVE
ˉ U+02C9 MODIFIER LETTER MACRON
̀ U+0300 COMBINING GRAVE ACCENT
Δ U+0394 GREEK CAPITAL LETTER DELTA
δ U+03B4 GREEK SMALL LETTER DELTA
Ќ U+040C CYRILLIC CAPITAL LETTER KJE
к U+043A CYRILLIC SMALL LETTER KA
א U+05D0 HEBREW LETTER ALEF
ר U+05E8 HEBREW LETTER RESH
ج U+062C ARABIC LETTER JEEM
ق U+0642 ARABIC LETTER QAF
Ầ U+1EA6 LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
ầ U+1EA7 LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE
“ U+201C LEFT DOUBLE QUOTATION MARK
” U+201D RIGHT DOUBLE QUOTATION MARK
€ U+20AC EURO SIGN
™ U+2122 TRADE MARK SIGN
→ U+2192 RIGHTWARDS ARROW
√ U+221A SQUARE ROOT
≥ U+2265 GREATER-THAN OR EQUAL TO
┼ U+253C BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
☺ U+263A WHITE SMILING FACE
♣ U+2663 BLACK CLUB SUIT
♫ U+266B BEAMED EIGHTH NOTES
אָ U+FB2F HEBREW LETTER ALEF WITH QAMATS
ﯿ U+FBFF ARABIC LETTER FARSI YEH MEDIAL FORM
U+FEFF ZERO WIDTH NO-BREAK SPACE
U+FFFC OBJECT REPLACEMENT CHARACTER
And it's obvious that the list is sorted according to the U+#### value of each character! So, the correct wording should be Sort Lines by Unicode values Ascending / Descending :-D
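The same ordering can be reproduced outside Notepad++. Here is a minimal sketch in Python, which compares strings code point by code point; it only illustrates the ordering and says nothing about how N++ implements its sort. The sample code points are a handful taken from the list above, and the variable names are mine.

```python
# A few code points taken from the list above, deliberately out of order.
code_points = [0x00E9, 0x006E, 0x20AC, 0x004E, 0x0394, 0x0032, 0x0025]

# Build lines of the form "<character> U+XXXX", like the entries in the list.
lines = [f"{chr(cp)} U+{cp:04X}" for cp in code_points]

# Python compares str values by code point, so a plain sort is enough.
by_code_point = sorted(lines)

for line in by_code_point:
    print(line)
```

Upper-case N lands before lower-case n, and both land before é, simply because their code points are smaller.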
BTW, it would be nice if we could sort according to our local language. For instance, this original list of some French words:
ère école bateau euro colis eau ferme élu à emploi lit embarras émoi zoo elle errer avion été sceau ébène
is, presently, sorted as:
avion bateau colis eau elle embarras emploi errer euro ferme lit sceau zoo à ère ébène école élu émoi été
However, the correct order, in a French dictionary, is:
à avion bateau colis eau ébène école elle élu embarras émoi emploi ère errer été euro ferme lit sceau zoo
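As a rough illustration of the difference between the two orders, here is a small Python sketch. It is an approximation only: it compares words with their accents stripped and uses the original spelling as a tie-breaker, which happens to reproduce the dictionary order for this word list but is not a full locale-aware collation (no Unicode Collation Algorithm, no French-specific tailoring). The helper names `strip_accents` and `dictionary_key` are hypothetical.

```python
import unicodedata

# The French words from the post, normalized so each accented letter is one code point.
words = unicodedata.normalize(
    "NFC",
    "ère école bateau euro colis eau ferme élu à emploi lit embarras "
    "émoi zoo elle errer avion été sceau ébène",
).split()


def strip_accents(word):
    """Decompose the word and drop combining marks (é becomes e)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def dictionary_key(word):
    """Primary key: unaccented letters; secondary key: the original spelling."""
    return (strip_accents(word).casefold(), word)


code_point_order = sorted(words)                      # what a code-point sort gives
dictionary_order = sorted(words, key=dictionary_key)  # accent-insensitive approximation

print(" ".join(code_point_order))
print(" ".join(dictionary_order))
```

The first printed line puts every accented word after zoo; the second interleaves them with their unaccented neighbours.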
In the same way, the regex (?-i)[e-f] should match, for instance, the lower-case letters e and f and all accented Latin forms of the letter e. In other words, it should be equivalent to the regex (?-i)[eèéêëēĕėęěẹẻẽếềểễệ℮f], if I consider only the Courier New font!

Best Regards,
guy038
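The character-class point above can be illustrated with a short Python sketch. It uses Python's `re` module rather than the regex engine inside Notepad++, so it only shows the general idea: a range such as [e-f] is a range of code points, and accented letters match only if they are first folded to their base letter. The `base_letter` helper is hypothetical, and the ℮ symbol is left out because it has no decomposition to e.

```python
import re
import unicodedata

# The range [e-f] covers only the code points U+0065 and U+0066.
plain_class = re.compile(r"[e-f]")

# Accented forms of e from the post, one code point each after NFC normalization.
accented = unicodedata.normalize("NFC", "èéêëēĕėęěẹẻẽếềểễệ")


def base_letter(ch):
    """Return the first code point of the canonical decomposition (é gives e)."""
    return unicodedata.normalize("NFD", ch)[0]


# Accented letters that the range matches as-is.
matched_directly = [ch for ch in accented if plain_class.fullmatch(ch)]

# Accented letters that match once reduced to their base letter.
matched_after_folding = [
    ch for ch in accented if plain_class.fullmatch(base_letter(ch))
]

print(len(matched_directly), "matched directly")
print(len(matched_after_folding), "matched after folding")
```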
-
@guy038 :
english_customizable.xml:
<Item id="42059" name="Sort Lines Lexicographically Ascending"/>
<Item id="42060" name="Sort Lines Lexicographically Descending"/>
->
<Item id="42059" name="Sort Lines by Unicode values Ascending"/>
<Item id="42060" name="Sort Lines by Unicode values Descending"/>
=

:-)