Cannot find the fitting regular expression for counting the right class of words...

    Quängelosaurus Rex
    last edited by Jun 4, 2016, 11:11 AM

    Hello there!

    My name’s Wolf and I’m working on my PhD thesis right now. It’s about communication in online games, and I’m using Notepad++ for counting words and word groups (I’m from Germany, so please forgive my bad English).

    Right now, I’m trying to count a special type of posting. It contains the following structure:

    “Gilde [X] sucht” (in English: “guild [X] is looking for”)

    X = any number of any characters. I tried “Gilde [[:alnum:]]+ sucht”, which seems to work partially, BUT I think this expression only covers the unaccented base characters. For my search, I need the base characters as well as every variant that differs from them, e.g. A, a, À, Á, Â, Ã, Ä, Å, A, a, à, á, â, ã, ä and å (it has to work for any other character as well).

    Sadly, I have no idea how to put that into a regular expression. Could you please help me?

    Thank you so much!

    Kind greetings
    Wolf

      Quängelosaurus Rex
      last edited by Jun 4, 2016, 12:44 PM

      Oh, and there is an additional issue: guild names are often framed by special characters like *, -, /, , <, > etc. The [X] part should match those as well… I hope someone has an idea. :)

        guy038
        last edited by guy038 Jun 6, 2016, 9:26 PM Jun 6, 2016, 9:19 PM

        Hello Quangelosaurus Rex,

        Contrary to what you assume, in Notepad++ the POSIX classes [[:alpha:]] and [[:alnum:]] also match the accented Unicode letters of the Latin, Greek, Cyrillic, Hebrew and Arabic scripts!

        So, taking into account the specific characters from your second post, including the space character, we finally get the two regexes below:

        Gilde [[:alpha:]/<>* -]+ sucht matches the string Gilde, followed by a space, then any non-empty run of letters, slashes, less-than signs, greater-than signs, asterisks, spaces or hyphen-minus signs, followed by a space and the string sucht

        Gilde [[:alnum:]/<>* -]+ sucht matches the string Gilde, followed by a space, then any non-empty run of digits, letters, slashes, less-than signs, greater-than signs, asterisks, spaces or hyphen-minus signs, followed by a space and the string sucht
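
For readers who want to reproduce the count outside Notepad++, here is a minimal Python sketch of the two regexes above. It is an approximation, not the Boost engine that Notepad++ uses: Python's `re` module has no POSIX bracket classes, so `[^\W\d_]` (any Unicode letter) and `[^\W_]` (any Unicode letter or digit) are substituted as assumptions. The names `GUILD_ALPHA`, `GUILD_ALNUM`, `count_guild_posts` and the sample text are hypothetical and only serve the illustration.

```python
import re

# Python's re has no [[:alpha:]] / [[:alnum:]]; these stand-ins are an
# assumption of this sketch. In str patterns \W is Unicode-aware, so
# accented letters such as Ä or ö count as letters here.
LETTER = r"[^\W\d_]"       # roughly [[:alpha:]]
LETTER_OR_DIGIT = r"[^\W_]"  # roughly [[:alnum:]]

# Counterpart of: Gilde [[:alpha:]/<>* -]+ sucht
GUILD_ALPHA = re.compile(rf"Gilde (?:{LETTER}|[/<>* -])+ sucht")

# Counterpart of: Gilde [[:alnum:]/<>* -]+ sucht
GUILD_ALNUM = re.compile(rf"Gilde (?:{LETTER_OR_DIGIT}|[/<>* -])+ sucht")


def count_guild_posts(text: str, pattern: re.Pattern = GUILD_ALNUM) -> int:
    """Count non-overlapping matches of the pattern in text."""
    return sum(1 for _ in pattern.finditer(text))


# Made-up sample lines, not data from the thesis corpus.
sample = (
    "Gilde Ärger sucht Heiler\n"
    "Gilde <*Nacht-Wölfe*> sucht Tanks\n"
    "Gilde Team 42 sucht DDs\n"
    "Die Gilde sucht niemanden\n"
)

print(count_guild_posts(sample))               # letters and digits allowed
print(count_guild_posts(sample, GUILD_ALPHA))  # letters only
```

The line break is not part of the character class, so a match cannot run from one line into the next. The last sample line does not match because the pattern requires at least one character between Gilde and sucht.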

        Best Regards,

        guy038

        P.S. :

        You’ll find good documentation about the Boost C++ Regex library, v1.55.0 (its syntax is similar to Perl regular expressions), used by Notepad++ since version 6.0, at the two addresses below (both point to the v1.48.0 documentation):

        http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html

        http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/format/boost_format_syntax.html

        • The FIRST link explains the syntax of regular expressions in the SEARCH part

        • The SECOND link explains the syntax of regular expressions in the REPLACEMENT part

        You may also find valuable information on the sites below:

        http://www.regular-expressions.info

        http://www.rexegg.com

        http://perldoc.perl.org/perlre.html

        Finally, feel free to ask the N++ Community about any tricky regex that you come across, OR for help building a tricky regex for a particular purpose :-))

          Quängelosaurus Rex
          last edited by Jul 7, 2016, 10:37 AM

          Sorry for the late reply. Thank you VERY much for the awesome help! You really saved me here! =) Thank you a thousand times! =)
