Please help for RegEx query

  • T
    tzrtnlutz
    last edited by Sep 26, 2018, 7:18 AM

    I have a problem with this regex.

    Example:
    Das ist richTig. ISt das ein TeXt. Ja, dAS ist in OrDnung.

    If I run this search and replace:

    Search for: (?-is).+
    Replace with: \L$0

    I get: das ist richtig. ist das ein text. ja, das ist in ordnung.

    But the first letter of each word must be left unchanged. The correct result would be:
    Das ist richtig. Ist das ein Text. Ja, das ist in Ordnung.

    I hope someone can help. Many thanks!

    • G
      gerdb42
      last edited by Sep 26, 2018, 8:20 AM

      You may try this:
      Search for: (\<.)(\w*)((?:\s|[[:punct:]])*)
      Replace with: $1\L$2$3

      Then hit “Replace all”

      Breakdown:
      (\<.) matches the first character at the start of a word and captures it as group $1
      (\w*) matches all following word characters and captures them as group $2
      ((?:\s|[[:punct:]])*) matches all following whitespace or punctuation characters and captures them as group $3

      The replacement emits group $1 as is, followed by group $2 converted to lowercase, followed by group $3 as is. This is repeated for all subsequent words.
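
For readers who want to try the same idea outside Notepad++, here is a minimal Python sketch. It is not the Notepad++ regex itself: Python's `re` module has no `\<` anchor, no `[[:punct:]]` class and no `\L` in replacement strings, so each of those is approximated, and the function name `lower_after_first` is made up for this illustration.

```python
import re
import string

# Assumed stand-ins for the Notepad++ (Boost) syntax, which Python's re lacks:
#   \<           -> \b(?=\w)   (a word boundary followed by a word character)
#   [[:punct:]]  -> a character class built from string.punctuation (ASCII only)
#   \L$2         -> str.lower() applied inside a replacement function
PUNCT = re.escape(string.punctuation)
PATTERN = re.compile(r"(\b(?=\w).)(\w*)((?:\s|[" + PUNCT + r"])*)")

def lower_after_first(text):
    """Keep each word's first character and lowercase the rest of the word."""
    def repl(m):
        # group 1 as is, group 2 lowercased, group 3 as is
        return m.group(1) + m.group(2).lower() + m.group(3)
    return PATTERN.sub(repl, text)

sample = "Das ist richTig. ISt das ein TeXt. Ja, dAS ist in OrDnung."
print(lower_after_first(sample))
```

Run on the example sentence from the question, this should print the sentence with only the first letter of each word left in its original case.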

      • T
        tzrtnlutz
        last edited by Sep 26, 2018, 9:21 AM

        Wonderful. That’s what I’ve done.
        Big thanks again!

        • G
          guy038
          last edited by Sep 26, 2018, 12:35 PM

          Hello, @tzrtnlutz, @gerdb42 and All,

          After some tests, I think the regex can even be shortened:

          SEARCH (\w)(\w*)

          REPLACE $1\L$2


          Indeed, the only ASCII character (so, below \x{0080}) that is both a word character and a punctuation character is the Low Line symbol _ (\x{005F}). But since the regex \w* matches as many word characters as possible, it includes any _ symbols anyway! Thus the punctuation character after a word is necessarily a character other than _ :-))
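
The claim about `_` can be checked with a few lines of Python. This is only a sketch: it assumes that `string.punctuation` is a fair stand-in for `[[:punct:]]` on ASCII input, and it uses Python's `\w`, which may differ from the Boost engine in Notepad++ outside the ASCII range.

```python
import re
import string

# Collect every ASCII character that is both a word character (\w)
# and a punctuation character (assumed here to be string.punctuation).
both = [c for c in map(chr, range(128))
        if re.fullmatch(r"\w", c) and c in string.punctuation]
print(both)
```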

          Moreover, since the \w and \s character sets have no element in common either, the trailing part (?:\s|[[:punct:]])* is unnecessary!
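
As a rough illustration of the shortened regex, the following Python sketch applies `(\w)(\w*)` with a replacement function standing in for `$1\L$2` (Python's `re` has no `\L`). The function name `lower_after_first_short` is hypothetical.

```python
import re

def lower_after_first_short(text):
    """Shortened variant: keep the first word character, lowercase the rest."""
    # (\w)(\w*) needs no trailing whitespace/punctuation group, because
    # characters outside a match are copied to the result unchanged.
    return re.sub(r"(\w)(\w*)",
                  lambda m: m.group(1) + m.group(2).lower(),
                  text)

sample = "Das ist richTig. ISt das ein TeXt. Ja, dAS ist in OrDnung."
print(lower_after_first_short(sample))
```

Because `\w*` is greedy, an underscore inside a word stays part of the same match, so `fOO_BAR` is handled as a single word.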

          Cheers,

          guy038

          • S
            Scott Sumner @guy038
            last edited by Sep 26, 2018, 12:43 PM

            @guy038

            As you know, I’m not much for shortening already published and working regexes here, but in this case I think it is worthwhile as it makes what is being done much clearer.

            The Community of users of the Notepad++ text editor.