    Please help for RegEx query

    • tzrtnlutz

      I have a problem with a regex:

      Example:
      Das ist richTig. ISt das ein TeXt. Ja, dAS ist in OrDnung.

      If I search for:

      (?-is).+

      and replace with:

      \L$0

      I get:
      das ist richtig. ist das ein text. ja, das ist in ordnung.

      The first letter of each word should keep its case. The correct result would be:
      Das ist richtig. Ist das ein Text. Ja, das ist in Ordnung.

      I hope someone can help. Many thanks!

      • gerdb42

        You could try this:
        Search for: (\<.)(\w*)((?:\s|[[:punct:]])*)
        Replace with: $1\L$2$3

        Then hit “Replace All”.

        Breakdown:
        (\<.) matches the first character at the start of a word and captures it as group $1
        (\w*) matches all following word characters and captures them as group $2
        ((?:\s|[[:punct:]])*) matches all following whitespace or punctuation characters and captures them as group $3

        The replacement emits group $1 as is, then group $2 converted to lowercase, then group $3 as is. “Replace All” repeats this for every subsequent word.
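
For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch of the search/replace above. It is only an approximation: Python's re module has no \< anchor, no [[:punct:]] class and no \L in replacement strings, so the lookaround in WORD_START, the PUNCT class built from string.punctuation (ASCII only) and the callback are stand-ins I am assuming behave closely enough. The name lower_after_first is made up for this example.

```python
import re
import string

# Stand-ins (assumptions) for Boost syntax that Python's re lacks:
#   \<           -> a position not preceded by, but followed by, a word character
#   [[:punct:]]  -> the ASCII punctuation listed in string.punctuation
WORD_START = r"(?<!\w)(?=\w)"
PUNCT = "[" + re.escape(string.punctuation) + "]"

# Same three groups as the Notepad++ search expression.
PATTERN = re.compile(WORD_START + r"(.)(\w*)((?:\s|" + PUNCT + r")*)")


def lower_after_first(text: str) -> str:
    """Emulate '$1\\L$2$3': keep groups 1 and 3, lowercase group 2."""
    return PATTERN.sub(
        lambda m: m.group(1) + m.group(2).lower() + m.group(3),
        text,
    )


if __name__ == "__main__":
    sample = "Das ist richTig. ISt das ein TeXt. Ja, dAS ist in OrDnung."
    print(lower_after_first(sample))
    # -> Das ist richtig. Ist das ein Text. Ja, das ist in Ordnung.
```

The callback is needed because Python replacement strings cannot change case by themselves; in Notepad++ the \L in the replacement field does that job.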

        • tzrtnlutz

          Wonderful. That’s what I’ve done.
          Big thanks again!

          • guy038

            Hello, @tzrtnlutz, @gerdb42 and All,

            After some tests, I think the regex can even be shortened:

            SEARCH (\w)(\w*)

            REPLACE $1\L$2


            Indeed, the only ASCII character (that is, below \x{0080}) that is both a word character and a punctuation character is the low line symbol _ (\x{005F}). But since \w* is greedy and matches as many word characters as possible, it already includes any _ symbols. So the punctuation character that follows a word is necessarily something other than _ :-))

            Moreover, the \w and \s character sets have no element in common either, so the trailing part (?:\s|[[:punct:]])* is unnecessary!
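
The two claims above can be checked with a short Python sketch. This is an illustration under assumptions, not the Notepad++ (Boost) engine itself: string.punctuation and string.whitespace stand in for the ASCII part of [[:punct:]] and \s, a callback stands in for \L, and the name lower_after_first_short is made up for this example.

```python
import re
import string

# ASCII punctuation characters that also count as word characters.
punct_and_word = [c for c in string.punctuation if re.fullmatch(r"\w", c)]

# ASCII whitespace characters that also count as word characters.
space_and_word = [c for c in string.whitespace if re.fullmatch(r"\w", c)]


def lower_after_first_short(text: str) -> str:
    """Emulate SEARCH (\\w)(\\w*) / REPLACE $1\\L$2 with a callback."""
    return re.sub(
        r"(\w)(\w*)",
        lambda m: m.group(1) + m.group(2).lower(),
        text,
    )


if __name__ == "__main__":
    print(punct_and_word)   # -> ['_']
    print(space_and_word)   # -> []
    sample = "Das ist richTig. ISt das ein TeXt. Ja, dAS ist in OrDnung."
    print(lower_after_first_short(sample))
    # -> Das ist richtig. Ist das ein Text. Ja, das ist in Ordnung.
```

Since whitespace and punctuation between words are never matched, they are simply left in place, which is why the third group is not needed.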

            Cheers,

            guy038

            • Scott Sumner

              @guy038

              As you know, I’m not much for shortening already published and working regexes here, but in this case I think it is worthwhile as it makes what is being done much clearer.

              The Community of users of the Notepad++ text editor.