Lexicographically or Alphabetically



  • Should "Lexicographically" perhaps be "Alphabetically" in these strings?
    <Item id="42059" name="Sort Lines Lexicographically Ascending"/>
    <Item id="42060" name="Sort Lines Lexicographically Descending"/>

    I don't know the difference. I would just think "Alphabetically" is the better-known word, but maybe it only applies to A-Z and not to other characters.



  • Hello, @scootergrisen, and All,

    Strictly speaking, the N++ sort feature sorts characters by their Unicode code points!

    Take, for instance, these few characters, picked at random from my Courier New font, v2.90, which I sorted alphabetically by their official Unicode names:

       ﯿ   U+FBFF   ARABIC LETTER FARSI YEH MEDIAL FORM
       ج   U+062C   ARABIC LETTER JEEM
       ق   U+0642   ARABIC LETTER QAF
       ♫   U+266B   BEAMED EIGHTH NOTES
       ♣   U+2663   BLACK CLUB SUIT
       ┼   U+253C   BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
       	   U+0009   CHARACTER TABULATION 
       ̀   U+0300   COMBINING GRAVE ACCENT
       ©   U+00A9   COPYRIGHT SIGN
       Ќ   U+040C   CYRILLIC CAPITAL LETTER KJE
       к   U+043A   CYRILLIC SMALL LETTER KA
       2   U+0032   DIGIT TWO
       =   U+003D   EQUALS SIGN
       €   U+20AC   EURO SIGN
       ≥   U+2265   GREATER-THAN OR EQUAL TO
       Δ   U+0394   GREEK CAPITAL LETTER DELTA
       δ   U+03B4   GREEK SMALL LETTER DELTA
       א   U+05D0   HEBREW LETTER ALEF
       אָ   U+FB2F   HEBREW LETTER ALEF WITH QAMATS
       ר   U+05E8   HEBREW LETTER RESH
       Ă   U+0102   LATIN CAPITAL LETTER A WITH BREVE
       Ầ   U+1EA6   LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
       É   U+00C9   LATIN CAPITAL LETTER E WITH ACUTE
       N   U+004E   LATIN CAPITAL LETTER N
       ă   U+0103   LATIN SMALL LETTER A WITH BREVE
       ầ   U+1EA7   LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE
       é   U+00E9   LATIN SMALL LETTER E WITH ACUTE
       n   U+006E   LATIN SMALL LETTER N
       ø   U+00F8   LATIN SMALL LETTER O WITH STROKE
       “   U+201C   LEFT DOUBLE QUOTATION MARK
       _   U+005F   LOW LINE
       ˉ   U+02C9   MODIFIER LETTER MACRON
       ×   U+00D7   MULTIPLICATION SIGN
          U+FFFC   OBJECT REPLACEMENT CHARACTER
       %   U+0025   PERCENT SIGN
       ”   U+201D   RIGHT DOUBLE QUOTATION MARK
       →   U+2192   RIGHTWARDS ARROW
           U+0020   SPACE
       √   U+221A   SQUARE ROOT
       ™   U+2122   TRADE MARK SIGN
       |   U+007C   VERTICAL LINE
       ½   U+00BD   VULGAR FRACTION ONE HALF
       ☺   U+263A   WHITE SMILING FACE
           U+FEFF   ZERO WIDTH NO-BREAK SPACE
    

    If you use the option Edit > Line Operations > Sort Lines Lexicographically Ascending, this list becomes :

       	   U+0009   CHARACTER TABULATION 
           U+0020   SPACE
       %   U+0025   PERCENT SIGN
       2   U+0032   DIGIT TWO
       =   U+003D   EQUALS SIGN
       N   U+004E   LATIN CAPITAL LETTER N
       _   U+005F   LOW LINE
       n   U+006E   LATIN SMALL LETTER N
       |   U+007C   VERTICAL LINE
       ©   U+00A9   COPYRIGHT SIGN
       ½   U+00BD   VULGAR FRACTION ONE HALF
       É   U+00C9   LATIN CAPITAL LETTER E WITH ACUTE
       ×   U+00D7   MULTIPLICATION SIGN
       é   U+00E9   LATIN SMALL LETTER E WITH ACUTE
       ø   U+00F8   LATIN SMALL LETTER O WITH STROKE
       Ă   U+0102   LATIN CAPITAL LETTER A WITH BREVE
       ă   U+0103   LATIN SMALL LETTER A WITH BREVE
       ˉ   U+02C9   MODIFIER LETTER MACRON
       ̀   U+0300   COMBINING GRAVE ACCENT
       Δ   U+0394   GREEK CAPITAL LETTER DELTA
       δ   U+03B4   GREEK SMALL LETTER DELTA
       Ќ   U+040C   CYRILLIC CAPITAL LETTER KJE
       к   U+043A   CYRILLIC SMALL LETTER KA
       א   U+05D0   HEBREW LETTER ALEF
       ר   U+05E8   HEBREW LETTER RESH
       ج   U+062C   ARABIC LETTER JEEM
       ق   U+0642   ARABIC LETTER QAF
       Ầ   U+1EA6   LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
       ầ   U+1EA7   LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE
       “   U+201C   LEFT DOUBLE QUOTATION MARK
       ”   U+201D   RIGHT DOUBLE QUOTATION MARK
       €   U+20AC   EURO SIGN
       ™   U+2122   TRADE MARK SIGN
       →   U+2192   RIGHTWARDS ARROW
       √   U+221A   SQUARE ROOT
       ≥   U+2265   GREATER-THAN OR EQUAL TO
       ┼   U+253C   BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
       ☺   U+263A   WHITE SMILING FACE
       ♣   U+2663   BLACK CLUB SUIT
       ♫   U+266B   BEAMED EIGHTH NOTES
       אָ   U+FB2F   HEBREW LETTER ALEF WITH QAMATS
       ﯿ   U+FBFF   ARABIC LETTER FARSI YEH MEDIAL FORM
           U+FEFF   ZERO WIDTH NO-BREAK SPACE
          U+FFFC   OBJECT REPLACEMENT CHARACTER
    

    And it's obvious that the list is now sorted according to the U+#### value of each character! So, the correct wording should be "Sort Lines by Unicode values Ascending / Descending" :-D
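
    The ordering described above can be reproduced with a minimal Python sketch. It only illustrates sorting by code point; it makes no claim about how Notepad++ implements the feature internally. The helper name `sort_by_code_point` is an assumption made for this example.

```python
def sort_by_code_point(items):
    """Sort strings the way Python compares them: code point by code point."""
    return sorted(items)


# A few characters taken from the list above, in arbitrary order.
sample = ["\u00e9", "n", "N", "\u00f8", "2", "\u20ac", "\u0394", "_"]

for ch in sort_by_code_point(sample):
    # Print each character with its U+#### value, as in the lists above.
    print(f"{ch}   U+{ord(ch):04X}")
```

    In this sample, the digit comes first, then the upper-case N, the underscore and the lower-case n, and only then the accented Latin letters, the Greek delta and the euro sign, which is the same relative order as in the sorted list above.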


    BTW, it would be nice if we could sort according to our local language. For instance, this original list of some French words, below:

    ère
    école
    bateau
    euro
    colis
    eau
    ferme
    élu
    à
    emploi
    lit
    embarras
    émoi
    zoo
    elle
    errer
    avion
    été
    sceau
    ébène
    

    is currently sorted as:

    avion
    bateau
    colis
    eau
    elle
    embarras
    emploi
    errer
    euro
    ferme
    lit
    sceau
    zoo
    à
    ère
    ébène
    école
    élu
    émoi
    été
    

    However, the correct order, in a French dictionary, is :

    à
    avion
    bateau
    colis
    eau
    ébène
    école
    elle
    élu
    embarras
    émoi
    emploi
    ère
    errer
    été
    euro
    ferme
    lit
    sceau
    zoo
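
    As a rough illustration of such a dictionary order, here is a small Python sketch. It is only an approximation, assuming that comparing the words with their accents removed, and using the accented spelling as a tie-breaker, is good enough for this list. Real French collation follows more detailed rules (see the Unicode Collation Algorithm). The names `strip_accents` and `dictionary_key` are hypothetical helpers, not part of Notepad++.

```python
import unicodedata


def strip_accents(text):
    """Decompose the text (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def dictionary_key(word):
    """Primary key: unaccented lower-case form; tie-breaker: the word itself."""
    return (strip_accents(word).lower(), word)


words = [
    "\u00e8re", "\u00e9cole", "bateau", "euro", "colis", "eau", "ferme",
    "\u00e9lu", "\u00e0", "emploi", "lit", "embarras", "\u00e9moi", "zoo",
    "elle", "errer", "avion", "\u00e9t\u00e9", "sceau", "\u00e9b\u00e8ne",
]

sorted_words = sorted(words, key=dictionary_key)
print("\n".join(sorted_words))
```

    For this particular list, the sketch gives the same order as the French dictionary order shown above.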
    

    In the same way, the regex (?-i)[e-f] should match, for instance, the lower-case letters e and f and all accented Latin forms of the letter e. In other words, it should be equivalent to the regex (?-i)[eèéêëēĕėęěẹẻẽếềểễệ℮f], if I consider only the Courier New font!
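
    The behaviour wished for here can be sketched in Python, under the assumption that "accented form of e" means "a character whose canonical decomposition starts with e". This is not how the N++ regex engine works; `base_letter` and `in_range_ignoring_accents` are hypothetical names used only for this example.

```python
import unicodedata


def base_letter(ch):
    """Return the first code point of the canonical decomposition (NFD) of ch."""
    return unicodedata.normalize("NFD", ch)[0]


def in_range_ignoring_accents(ch, low, high):
    """True if the base letter of ch lies in the range [low-high], case-sensitively."""
    return low <= base_letter(ch) <= high


# Accented e forms from the regex above, plus a few characters that should not match.
candidates = ["e", "\u00e8", "\u00e9", "\u0119", "\u1ebf", "f", "\u212e", "E", "g"]

for ch in candidates:
    print(f"U+{ord(ch):04X}", in_range_ignoring_accents(ch, "e", "f"))
```

    Note that, with this assumption, the character ℮ (U+212E) is not matched, because it has no decomposition to the letter e; it would still have to be listed explicitly.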

    Best Regards,

    guy038



  • @guy038 :

    english_customizable.xml:

    <Item id="42059" name="Sort Lines Lexicographically Ascending"/>
    <Item id="42060" name="Sort Lines Lexicographically Descending"/>
    

    ->

    <Item id="42059" name="Sort Lines by Unicode values Ascending"/>
    <Item id="42060" name="Sort Lines by Unicode values Descending"/>
    

    =

    Imgur

    :-)

