  • Characters Not Appearing Correctly

    3
    0 Votes
    3 Posts
    27k Views
    guy038G

    Hello Jdl Jacob, Klauss and All,

    From what you said, you are trying to extract valid strings, without control characters, from an executable program’s code, in order to run further searches on these strings, aren’t you? It’s a well-known general problem!

    A first approach would be to simply delete all the C0 control characters ( from \x00 to \x1f ). However, this method is too drastic. Indeed, executable programs may contain UNICODE strings, stored with their true Unicode code-points in the UCS-2 Little Endian encoding ( two bytes per character, with the least significant byte first ).

    For instance, taking the string Test One, on two lines, the ANSI encoding gives us the list of bytes \x54\x65\x73\x74\x0D\x0A\x4F\x6E\x65, whereas the same UNICODE string gives the list of bytes \x54\x00\x65\x00\x73\x00\x74\x00\x0D\x00\x0A\x00\x4F\x00\x6E\x00\x65\x00
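    As a quick check of the two byte lists above, here is a minimal Python sketch. It assumes that "ANSI" means the Windows-1252 code page ( cp1252 ) and that UCS-2 Little Endian can be produced with Python's utf-16-le codec, which is equivalent for these characters. The helper name as_escapes is only illustrative.

```python
# The two-line string "Test" / "One", with a Windows line break.
text = "Test\r\nOne"

# ANSI (assumed here to be Windows-1252): one byte per character.
ansi_bytes = text.encode("cp1252")

# UCS-2 / UTF-16 Little Endian: two bytes per character, low byte first.
ucs2_bytes = text.encode("utf-16-le")


def as_escapes(data: bytes) -> str:
    """Render bytes in the \\xNN notation used in this post."""
    return "".join(f"\\x{b:02X}" for b in data)


print(as_escapes(ansi_bytes))
print(as_escapes(ucs2_bytes))
```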

    So, you can’t simply search for the ANSI string Test One and ignore the UNICODE strings. In our example, in addition to the classical search for the string Test One, on two lines, with the Match case option checked, a search for \x54\x00\x65\x00\x73\x00\x74\x00\x0D\x00\x0A\x00\x4F\x00\x6E\x00\x65\x00 would give a second, valid match!
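    Outside Notepad++, the same double search could be sketched in Python as follows. This is only an illustration: the function names find_all and find_both_encodings are hypothetical, and cp1252 is again assumed for the ANSI form.

```python
def find_all(data: bytes, needle: bytes) -> list:
    """Return the offsets of every occurrence of needle in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


def find_both_encodings(data: bytes, text: str) -> dict:
    """Look for text in its ANSI form and in its UCS-2 Little Endian form."""
    return {
        "ansi": find_all(data, text.encode("cp1252")),
        "utf-16-le": find_all(data, text.encode("utf-16-le")),
    }


# A small fake "executable" holding the string once in each encoding.
blob = (
    b"\x01\x02"
    + "Test\r\nOne".encode("cp1252")
    + b"\xff"
    + "Test\r\nOne".encode("utf-16-le")
    + b"\x00\x00"
)
print(find_both_encodings(blob, "Test\r\nOne"))
```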

    Now, you should understand the need to NOT delete the NUL character ( \x00 ), too soon, as it’s part of UNICODE strings, which may contain useful information, too.

    Let us try, from an example executable file, to extract all the pertinent strings. Of course, I don’t have any game level file, but we can simply use a copy of the Notepad++.exe file. I think that you’ll just have to follow the same method for your specific game level files!

    IMPORTANT : For this example, I will use the 6.8 version

    So, first of all, copy your Notepad++.exe file and rename the copy Test.txt.

    From now on, we’re going to perform some successive Searches/Replacements ( CTRL + H ), on the Test.txt file, using regular expressions.

    For all the S/R, below, I suppose that :

    The cursor location is the very beginning of the Test.txt file

    The Regular expression radio button is checked, in the Replace dialog

    The . matches newline option is UNCHECKED, in the Replace dialog

    All other options are unchecked

    The initial state of the Test.txt file is 9 194 lines long, for 2 054 656 bytes

    As the classical EOL characters \n and \r have no meaning in an executable file, we first normalize all kinds of End of Line characters to the string \r\n, to get a classical Windows text file! So, SEARCH = \r\0\n\0|\r\n|\r|\n and REPLACEMENT = \r\n

    We must now delete any character which is neither a standard character, nor an EOL character, nor the NUL character. So, SEARCH = [^\0\n\r\x20-\x7e]+ and REPLACEMENT = NOTHING

    As, in UNICODE strings, any NUL character is normally separated from the next NUL character by a standard character, we can replace any run of consecutive NUL characters ( except the first one, which may be the last byte of a UNICODE string ) with an EOL, to easily notice all the valid ANSI or UNICODE strings. So, SEARCH = (?<=\0)\0+ and REPLACEMENT = \r\n

    => Now, the Test.txt file is 64 341 lines long, for 817 228 bytes. All lines have a Windows EOL and the file contains mostly standard characters and, also, some NUL characters ( \x00 ). But you’ll notice that, from now on, there is no longer any sequence of two consecutive NUL characters!
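    For readers who prefer a script, the first three S/R can be approximated with Python's re module working directly on the raw bytes. This is a sketch, not the Notepad++ engine: the function name normalize_and_strip is hypothetical, \x00 is written instead of \0, and I assume that byte-level matching behaves like Notepad++ does on an ANSI-encoded file.

```python
import re


def normalize_and_strip(data: bytes) -> bytes:
    """Apply S/R 1 to 3 of the post to raw bytes."""
    # S/R 1: normalize every kind of line break (UCS-2 CRLF included) to \r\n.
    data = re.sub(rb"\r\x00\n\x00|\r\n|\r|\n", b"\r\n", data)
    # S/R 2: delete everything that is not NUL, EOL or a standard character.
    data = re.sub(rb"[^\x00\n\r\x20-\x7e]+", b"", data)
    # S/R 3: keep the first NUL of a run, replace the rest of the run by an EOL.
    data = re.sub(rb"(?<=\x00)\x00+", b"\r\n", data)
    return data


sample = b"AB\x00\x00\x00\x01\x02CD\nE\x00F\x00"
print(normalize_and_strip(sample))
```

    On the real file, you would read the bytes with open("Test.txt", "rb").read() and feed them to the function.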

    As NUL characters placed in the replacement part are NEVER re-written ( Bug! ), we simply have to replace every NUL character with a specific character, which is NOT one of the standard characters. I chose the Bullet, of ANSI code \x95, or \x{2022} ( its UNICODE code-point ) in a NON-ASCII file. So, SEARCH = \0 and REPLACEMENT = \x95

    Now, we’ll try to isolate and mark the different UNICODE strings. However, we must take care of a special case, where an ANSI string is followed by a UNICODE string, with only one NUL ( or Bullet ) character as a separator.

    For instance, assuming that the symbol Ø stands for the NUL character, the sequence TestØTØeØsØtØ must be decoded as the ANSI string Test, followed by the same string in UNICODE, with a NUL character as separator ( and NOT as the ANSI string Tes, immediately followed by the UNICODE string tTest )

    => SEARCH = (?<![\x20-\x7e][\x20-\x7e])(?:[\x20-\x7e]\x95){3,} and REPLACEMENT = \r\n\x93$0\x94\r\n

    Note : I assume that any valid UNICODE string must contain at least three characters. So, we search for a minimum of 3 consecutive pairs made of a standard character followed by a NUL ( Bullet ) character, ONLY IF the match is NOT preceded by 2 standard characters

    In the replacement part, we re-write the entire search match $0, surrounded by the double quotation marks ( \x93 or \x{201c} and \x94 or \x{201d} ), and preceded and followed by an EOL.
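    The NUL-to-Bullet replacement and the marking of the UNICODE strings could be sketched in Python like this. The function name mark_unicode_strings is hypothetical. A replacement function is used instead of a replacement string, because Python's re.sub does not accept \x93 style escapes in a replacement template.

```python
import re

BULLET = b"\x95"       # stands for the NUL character
OPEN_QUOTE = b"\x93"   # ANSI opening double quotation mark
CLOSE_QUOTE = b"\x94"  # ANSI closing double quotation mark

# At least 3 pairs "standard character + Bullet", NOT preceded by
# 2 standard characters (same regex as in the post).
UNICODE_RUN = re.compile(
    rb"(?<![\x20-\x7e][\x20-\x7e])(?:[\x20-\x7e]\x95){3,}"
)


def mark_unicode_strings(data: bytes) -> bytes:
    """Replace NUL by the Bullet, then quote each UNICODE run on its own line."""
    data = data.replace(b"\x00", BULLET)
    return UNICODE_RUN.sub(
        lambda m: b"\r\n" + OPEN_QUOTE + m.group(0) + CLOSE_QUOTE + b"\r\n",
        data,
    )


# The special case of the post: ANSI "Test", one NUL, then UNICODE "Test".
print(mark_unicode_strings(b"Test\x00T\x00e\x00s\x00t\x00"))
```

    In this sample, the ANSI string Test keeps its trailing Bullet, and only the UNICODE run is surrounded by the quotation marks.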

    As the UNICODE strings are now clearly identified, we can get rid of the Bullet characters ( which represent the NUL character ) inside each “…” sequence, for easier reading! So, SEARCH = (?=.*\x94)\x95 and REPLACEMENT = NOTHING

    Note : At each position where it matches the Bullet, the look-ahead structure verifies that a closing double quotation mark ( \x94 ) exists further on in the current line. This way, we’re sure that each deleted bullet was, indeed, part of a UNICODE string ONLY!
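    A possible Python equivalent of this look-ahead S/R is shown below, as a sketch only. The function name drop_bullets_inside_quotes is hypothetical. Since the dot of Python also matches \r, I wrote [^\r\n]* in its place, to mimic the Notepad++ behaviour when the . matches newline option is unchecked.

```python
import re


def drop_bullets_inside_quotes(data: bytes) -> bytes:
    """Delete each Bullet (\\x95) followed, on the same line, by a closing quote (\\x94)."""
    return re.sub(rb"(?=[^\r\n]*\x94)\x95", b"", data)


# First line: an ANSI string with its trailing Bullet (kept).
# Second line: a marked UNICODE string (its Bullets are removed).
marked = b"Test\x95\r\n\x93T\x95e\x95s\x95t\x95\x94\r\n"
print(drop_bullets_inside_quotes(marked))
```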

    => After these 3 other S/R, the Test.txt file is, now, 75 625 lines long for a size of 796 660 bytes. The UNICODE strings are correctly extracted.

    We can, from now on, replace the remaining bullet characters, located between the ANSI strings, with an EOL, to clearly see the ANSI strings. So, SEARCH = \x95 and REPLACEMENT = \r\n

    With all the EOL characters successively added, we must clean up the file! We’re going to delete any line which is empty or contains ONLY space characters. So, SEARCH = ^ *\R and REPLACEMENT = NOTHING

    Finally, as for the UNICODE strings, we’ll delete any ANSI string containing less than 3 characters. So, SEARCH = ^.{1,2}\R and REPLACEMENT = NOTHING
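    These last three S/R could be approximated in Python as follows. It is a sketch under assumptions: the function name split_and_clean is hypothetical, Python's re has no \R, so the alternation \r\n|\r|\n replaces it, and the (?m) flag is needed for ^ to match at the beginning of each line.

```python
import re


def split_and_clean(data: bytes) -> bytes:
    """Apply S/R 7 to 9 of the post to raw bytes."""
    # S/R 7: each remaining Bullet separates two ANSI strings.
    data = data.replace(b"\x95", b"\r\n")
    # S/R 8: delete the lines which are empty or contain only spaces.
    data = re.sub(rb"(?m)^ *(?:\r\n|\r|\n)", b"", data)
    # S/R 9: delete the lines of 1 or 2 characters.
    data = re.sub(rb"(?m)^[^\r\n]{1,2}(?:\r\n|\r|\n)", b"", data)
    return data


sample = b"Test\x95\r\n\x93Test\x94\r\nab\x95Hello\x95"
print(split_and_clean(sample))
```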

    After these 9 S/R, you get a Test.txt file of 52 830 lines, for a total of 566 652 bytes! But, looking through this file, it is easy enough to see that the valuable strings are located in two main zones : from line 36866 to line 41392 and from line 50857 to line 52094. Once these two parts are isolated, you’ll still have to manually delete some non-pertinent strings.

    I personally got a Test.txt file of 3667 lines. If you have a hexadecimal editor, you may translate some of these strings or sentences into your mother tongue. Don’t forget to rewrite any UNICODE string according to the UCS-2 Little Endian encoding, with the same length!

    For instance, in lines 39500 and 39501 of the Test.txt file, there are the UNICODE strings “OVR” and “INS”, which give the writing mode of text ( Insertion or Overwriting ) in the N++ status bar. In French, my mother tongue, these two strings are INS and RFP. So I could replace, in a copy of Notepad++.exe, the string \x4f\x00\x56\x00\x52\x00 with the sequence \x52\x00\x46\x00\x50\x00. Et voilà !
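    If you do not have a hexadecimal editor, the same kind of patch could be sketched in Python. The function name patch_wide_string is hypothetical. Note that this sketch replaces EVERY occurrence of the byte sequence, so it may also hit unrelated data : check the matches first and always work on a copy of the executable.

```python
def patch_wide_string(data: bytes, old: str, new: str) -> bytes:
    """Replace a UCS-2 Little Endian string by another one of the SAME length."""
    old_bytes = old.encode("utf-16-le")
    new_bytes = new.encode("utf-16-le")
    if len(old_bytes) != len(new_bytes):
        # A different length would shift every following byte of the file.
        raise ValueError("old and new strings must have the same length")
    return data.replace(old_bytes, new_bytes)


# The OVR -> RFP example of the post, on a tiny fake binary.
fake_exe = b"\x00\x00O\x00V\x00R\x00\x00\x00"
patched = patch_wide_string(fake_exe, "OVR", "RFP")
print(patched)
```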

    I hope, Jdl Jacob, that you can use these same S/R, for your specific needs, on your game level files. You could even shorten the method a bit, if your file doesn’t contain any UNICODE string. The S/R number 4, 5 and 6 would then be useless. In that case, the 7th S/R needs a small change : SEARCH = \0 and REPLACEMENT = \r\n

    Best Regards,

    guy038

  • Zen Coding Plugin

    Locked
    4
    0 Votes
    4 Posts
    9k Views
    Ray SilvaR

    @Yaron Thanks.

  • NP++ spying on my clipboard?

    Locked
    3
    0 Votes
    3 Posts
    4k Views
    jonandrJ

    I fixed some clipboard bugs a few releases back, so try the latest version and see if the problem still exists.

  • 0 Votes
    3 Posts
    3k Views
    oroborosO

    Maybe we could suggest to add the option of creating a hotkey in the setup?

  • Plugin Template for vs2015

    Locked
    1
    0 Votes
    1 Posts
    3k Views
    No one has replied
  • I miss my macros after updating to 6.8.1

    Locked
    1
    0 Votes
    1 Posts
    2k Views
    No one has replied
  • notepad++ in Windows 10 wont Open .nfo files

    Locked
    3
    0 Votes
    3 Posts
    8k Views
    Chris CullenC

    That did the trick!
    its a shame tho as i use this plugin a lot.
    Regards,
    Chris

  • Pasting text grabbed by Alt-Shift?

    Locked
    1
    0 Votes
    1 Posts
    2k Views
    No one has replied
  • Win 10 Error - JS Macro Console

    Locked
    1
    0 Votes
    1 Posts
    2k Views
    No one has replied
  • " showing as @

    Locked
    2
    0 Votes
    2 Posts
    2k Views
    Simon CowellS

    ok got it now, must have pressed left Alt + Shift accidentally

  • Slowdown with pasting text

    12
    0 Votes
    12 Posts
    8k Views
    RicardoR

    The screenshot shows an auto-complete list of terms. It can be configured in Settings > Auto-Completion

  • Customise search engine (Google Search to other search)

    Locked
    2
    0 Votes
    2 Posts
    3k Views
    Jan SchreiberJ

    You can easily do that by editing the file shortcuts.xml, usually located in the folder %appdata%\Notepad++.

  • Computer Freezes Whenever I Launch Program

    3
    0 Votes
    3 Posts
    4k Views
    donhoD

    @AwkwardIrishman If your Notepad++ version is v6.8, the problem seems to be caused by your Windows (graphics?) drivers, which take a long time to handle the otf (font) format.

    I believe the improved binary (using ttf instead of otf) solves this problem:
    https://notepad-plus-plus.org/temp/npp.7z
    Please unzip it into a new created folder, then launch notepad++.exe.

    Let me know if your issues are solved.

  • UTF-32

    Locked
    1
    0 Votes
    1 Posts
    3k Views
    No one has replied
  • Shopping for a new go-to text editor/word processor

    3
    0 Votes
    3 Posts
    4k Views
    David BaileyD

    I much prefer tools that don’t get ‘enhanced’ in all sorts of arbitrary directions, so you can browse the internet, read email, compose music, write code, edit binary, and interface to Facebook all within one package (I exaggerate very slightly!).

    Editing formatted text is vastly different from editing source code or other raw text, and I think it is far better to use different tools for those two processes. Cut/paste and the file-store provide an excellent way to communicate between your tools!

    With free software, you don’t need a tool that does everything, you can afford several tools!

    David

  • Compiling notepad++

    Locked
    3
    0 Votes
    3 Posts
    3k Views
    Sdvfeefefsdfe CfvfewefesdfS

    @jonandr

    Thanks!

  • 0 Votes
    3 Posts
    4k Views
    RicardoR

    Next version will have fixes for that issue.

  • Command line

    Locked
    4
    0 Votes
    4 Posts
    5k Views
    RicardoR

    Sorry for the misunderstanding…
    I could not find a command-line option for that, but you can load Session files…
    A session file is a XML file with this structure:

    <NotepadPlus>
        <Session activeView="1">
            <mainView activeIndex="0">
                <File firstVisibleLine="0" xOffset="0" scrollWidth="288" startPos="0" endPos="0" selMode="0" lang="XML" encoding="-1" filename="C:\file1.xml" backupFilePath="" originalFileLastModifTimestamp="1438295994" />
            </mainView>
            <subView activeIndex="0">
                <File firstVisibleLine="0" xOffset="0" scrollWidth="288" startPos="0" endPos="0" selMode="0" lang="XML" encoding="-1" filename="C:\file2.xml" backupFilePath="" originalFileLastModifTimestamp="1438295994" />
            </subView>
        </Session>
    </NotepadPlus>

    Files in “other view” (at right) are inside <subView> tag.
    You can open it with a command-line such as this:
    notepad++.exe -openSession "MySession.xml"
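    If you need to generate such a session file from a script, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The function name build_session is hypothetical, and all the attribute values are simply copied from the sample above : I have not checked which attributes Notepad++ really requires, so treat them as assumptions.

```python
import xml.etree.ElementTree as ET

# Attribute values copied from the sample session above (assumed to be
# acceptable defaults; lang and the timestamp belong to that sample).
FILE_DEFAULTS = {
    "firstVisibleLine": "0", "xOffset": "0", "scrollWidth": "288",
    "startPos": "0", "endPos": "0", "selMode": "0", "lang": "XML",
    "encoding": "-1", "backupFilePath": "",
    "originalFileLastModifTimestamp": "1438295994",
}


def build_session(main_files, sub_files) -> str:
    """Build a session document: main_files go to mainView, sub_files to subView."""
    root = ET.Element("NotepadPlus")
    session = ET.SubElement(root, "Session", activeView="1")
    for tag, files in (("mainView", main_files), ("subView", sub_files)):
        view = ET.SubElement(session, tag, activeIndex="0")
        for path in files:
            ET.SubElement(view, "File", {**FILE_DEFAULTS, "filename": path})
    return ET.tostring(root, encoding="unicode")


xml_text = build_session([r"C:\file1.xml"], [r"C:\file2.xml"])
print(xml_text)
```

    The resulting text can be saved as MySession.xml and opened with the command line given above.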

  • F95

    Locked
    2
    0 Votes
    2 Posts
    3k Views
    jonandrJ

    This is mainly a forum for N++. I think you will have much better luck getting answers if you seek out a forum dedicated either to Fortran in general or the specific Fortran compiler you are using.