• changing font not working in v8.6 (but changing size and color does work)

    25
    1 Votes
    25 Posts
    5k Views
    xomxX

    @Andi-Kiissel said in changing font not working in v8.6 (but changing size and color does work):

    In N++ UI this can not be changed,

    In the upcoming N++ version (probably v8.7.8) these Scintilla rendering modes will be accessible via the standard N++ Preferences > MISC.: GitHub commit.

    @Tobias-Lind said in changing font not working in v8.6 (but changing size and color does work):

    When DirectWrite is enabled, I can only use the base version of the font. Any other variant will default to some fallback font.

    The whole problem with the N++ DirectWrite mode & fonts is that DirectWrite uses the WSS (Weight-Stretch-Style) font family model, whereas the older GDI uses the RBIZ (Regular-Bold-Italic…) one. Notepad++ originally supported only the older GDI font handling; it later made the newer DirectWrite mode accessible, but did not adapt the existing N++ GDI font handling code to the WSS model of the DirectWrite font families.

    There was a nice patch for the Scintilla library, which solves exactly that but unfortunately it has not been accepted. So someone has to fix that directly in the N++ codebase. There is the MS DirectWrite interface IDWriteGdiInterop helper intended exactly for that job, so maybe a challenge for someone capable with free time…?

  • 2 Votes
    3 Posts
    81 Views
    CoisesC

    @HalfOffHell said in notepad not responding due to searching the regular expression "(?-i)\u*(?=[^\l])" in a file that has "└" as the last character:

    put “└” at the end of a file
    Ctrl+F
    paste “(?-i)\u*(?=[^\l])” into da box
    click “Find All in Current Document”
    notepad not responding

    I’ve filed an issue for this.

  • Windows content is blank on remote notepad++ sessions

    16
    0 Votes
    16 Posts
    2k Views
    A

    I made an account just to mention that this might be an issue with the display driver for Windows 11. While the solution of changing the rendering technology works, I have had issues with other programs not rendering correctly because Windows 11 does not know what to do when the laptop lid is closed while remoting in.

    The solution I am using for my job on remote machines is to add a virtual display adapter. Method 3 uses a display driver which, once installed, correctly renders the windows that previously had issues. This is the method I have been using with much success. Here is the link to the explanations and files:

    https://itproexpert.com/windows-server-vm-screen-resolution-fix/

  • Button to toggle Tabs/Spaces?

    12
    0 Votes
    12 Posts
    785 Views
    PeterJonesP

    @Torax-Malu ,

    @Alan-Kilborn messaged me a link to this post, where he shared a script for the PythonScript plugin, which can be used to set the current file to either tabs or spaces.

    Useful link: FAQ: How to install and run scripts with PythonScript plugin – also includes instructions for assigning a keyboard shortcut. You can even use Plugins > PythonScript > Configuration… to add a script to the toolbar, giving you a toolbar button to activate it.

  • ctrl+alt+5 ALWAYS opens Notepad++ in Windows for me

    4
    0 Votes
    4 Posts
    86 Views
    PeterJonesP

    @mkupper said in ctrl+alt+5 ALWAYS opens Notepad++ in Windows for me:

    on the desktop and possibly elsewhere

    see if there is a shortcut for Notepad++ on your desktop

    a second location that Windows checks is a shortcut pinned to your task bar

    Also, if @ddalthorp has AutoHotKey or similar “hotkey utility”, there might be a “launch Notepad++” action for the keycombo defined in that utility, as well.

  • File ignoring tab settings

    3
    0 Votes
    3 Posts
    79 Views
    W

    @PeterJones Thanks - I had completely forgotten that I had that plugin installed, and somehow also missed it when posting the question. Disabling the plugin resolved the issue.

  • Style Configurator

    10
    0 Votes
    10 Posts
    276 Views
    PeterJonesP

    @Zainab ,

    I just learned something new today. If I don’t have a stylers.xml, and I hadn’t previously selected an existing theme, and the install directory (the one containing notepad++.exe) doesn’t include stylers.model.xml, then Notepad++ prompts with a dialog that tells me loading stylers.xml failed, like:
    3059d3f3-9683-4954-886c-a297fb5e2c86-image.png

    …if I OK from that point, Notepad++ loads, but if I try to launch Style Configurator, nothing happens. This seems similar to your circumstance.

    (Yes, the way I accidentally reproduced your problem was by launching a copy I had built without the right XML support files while running in doLocalConf mode.)

    I get the same error if I choose Dark Mode (so that the DarkModeDefault theme is selected) while still having no stylers.xml or stylers.model.xml, then delete DarkModeDefault.xml and restart Notepad++.

    Given that you seem to have Dark Mode in your screenshots, but the new one looks more like normal light mode, which matches what I saw in my Dark Mode experiment, I am now hypothesizing:

    You had selected Dark Mode, and one of the dark themes (possibly DarkModeDefault, or any of the other dark themes)

    Your copy of stylers.xml was deleted (or you changed to an alternate config setup, like cloud or --settingsDir or doLocalConf), and your copy of stylers.model.xml is not in the same directory as notepad++.exe

    You ran Notepad++, and saw that dialog, but closed it and neglected to tell us

    If you can re-populate stylers.model.xml (grab the official zip download, and copy the stylers.model.xml from that and put it in the same directory as the notepad++.exe that you are running), then re-start Notepad++, then it should not give that error, it should create stylers.xml from stylers.model.xml, and the Style Configurator should be able to launch again. (And if you want, you can also copy the themes from the zip to your <install dir>\themes folder to give you access to the other themes as well)

    Sharing your debug info (?-menu, Debug Info, Copy debug info into clipboard and paste into your reply) could confirm for us if you have a non-standard installation or portable Notepad++, which will help us be more specific about where to look for files and where to put them.

    update: also, I noticed that you have logged in since our replies, but didn’t respond to any of our requests for more information. We won’t be able to help you any further if you choose not to respond. The ball is in your court at this point.

  • bookmark line

    2
    0 Votes
    2 Posts
    53 Views
    Alan KilbornA

    @miki-simone

    Sure. Use Regular expression search mode and separate search terms with |, e.g. apple|banana|orange|grape. Use the Mark tab and checkmark the Bookmark line option.
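
    As a rough illustration of what the alternation does, here is a minimal Python sketch that reports which lines such a search would bookmark. It uses Python's re module rather than the Boost engine behind Notepad++, and the sample text and the function name lines_to_bookmark are made up for the example.

```python
import re

def lines_to_bookmark(text, pattern=r"apple|banana|orange|grape"):
    """Return the 1-based numbers of the lines that contain any search term."""
    regex = re.compile(pattern)
    return [number
            for number, line in enumerate(text.splitlines(), start=1)
            if regex.search(line)]

# Hypothetical sample text: lines 1, 3 and 4 contain one of the terms.
sample = "apple pie\nplain bread\ngrape juice\nbanana split\n"
print(lines_to_bookmark(sample))  # [1, 3, 4]
```

    Each term is tried at every position of a line, so a line is reported as soon as any one of the alternatives occurs anywhere in it.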

  • move lines containing specific words to a specific position

    5
    0 Votes
    5 Posts
    72 Views
    Rockberto ManentiR

    @Alan-Kilborn

    thanx again. I solved with a macro! Thanks again for the idea and initial regex!

  • Maintain Indent While Pasting Multiple Lines

    26
    1 Votes
    26 Posts
    3k Views
    Dennis BareisD

    @Alan-Kilborn

    It’s been a while, but I finally got some time to look at this again.

    Now that I saw the comment in the code pointing to PythonScript and setting up a HotKey it ended up being quite easy.

    Your script seems to do the right thing for what I need and if it doesn’t I’ll learn enough Python to update it and update this post.

  • An issue with background color

    2
    0 Votes
    2 Posts
    85 Views
    PeterJonesP

    @sdfh6srtytydgf-65cghj67ed56u8 ,

    Settings > Style Configurator > Language: Global Style > Style: Default > Background sets the default bg color used by any style that doesn’t override the bg.

  • Column mode edit is sticky and needs Esc to leave

    3
    0 Votes
    3 Posts
    74 Views
    CoisesC

    @Peter-Barnett said in Column mode edit is sticky and needs Esc to leave:

    What use is this sticky selection?

    Alan showed you where to switch it off.

    The use is for people who use multiple selections; it allows converting a column selection into a multiple selection. It was included when new features for working with multiple selections were added a few versions ago. For those of us who use rectangular selections often and multiple selections almost never, it’s more annoying than helpful. As I recall, the option to turn it off was added pursuant to our complaints.

    If you never use multiple selections, unchecking the Enable Multi-Editing box will also restore another behavior that was changed. With Multi-Editing enabled, you cannot hold down the Ctrl key, then click and drag in a selection to copy it. You must begin dragging, then press and hold Ctrl. (Ctrl first is used to remove or replace a selection from a multiple selection.) The familiar behavior, in which you can press and hold Ctrl before or after you begin dragging, works when the box is not checked.

  • JSON5 formatted data cannot be fully collapsed.

    7
    1 Votes
    7 Posts
    292 Views
    Mark OlsonM

    @sdurham
    Have you tried changing the logger_level setting to JSON5? As described in the documentation, the parser will log errors for many JSON5 features unless you do that.

    If you can give me an example of valid JSON5 (can be parsed by some major parser) that JsonTools v8.3.1 logs an error on when logger_level is JSON5, that is a bug and I will fix it.

  • Clean up text of non-printing characters

    9
    0 Votes
    9 Posts
    177 Views
    guy038G

    Hello, @m-fessler, @mathlete2, @alan-kilborn, @coises and All,

    @m-fessler, here is, below, a list of all the special Unicode characters which belong, either, to :

    The Z separator category ( Zs, Zl and Zp categories )

    The Cc Control character category ( except for the TAB, LF and CR ones )

    The Cf Format character category

    Two So Other Symbol characters ( \x{FFFC} and \x{FFFD} )

    This list contains 121 characters

    •-------•------------------•------------------------------------------•--------•----•
    | Code  | Regex            | Character                                | Abbre. | GC |
    •-------•------------------•------------------------------------------•--------•----•
    | 0000  | \x{0000}         | NULL                                     | NUL    | Cc |
    | 0001  | \x{0001}         | START OF HEADING                         | SOH    | Cc |
    | 0002  | \x{0002}         | START OF TEXT                            | STX    | Cc |
    | 0003  | \x{0003}         | END OF TEXT                              | ETX    | Cc |
    | 0004  | \x{0004}         | END OF TRANSMISSION                      | EOT    | Cc |
    | 0005  | \x{0005}         | ENQUIRY                                  | ENQ    | Cc |
    | 0006  | \x{0006}         | ACKNOWLEDGE                              | ACK    | Cc |
    | 0007  | \x{0007}         | BELL                                     | BEL    | Cc |
    | 0008  | \x{0008}         | BACKSPACE                                | BS     | Cc |
    | 000B  | \x{000B}         | VERTICAL TABULATION                      | VT     | Cc |
    | 000C  | \x{000C}         | FORM FEED                                | FF     | Cc |
    | 000E  | \x{000E}         | SHIFT OUT                                | SO     | Cc |
    | 000F  | \x{000F}         | SHIFT IN                                 | SI     | Cc |
    | 0010  | \x{0010}         | DATA LINK ESCAPE                         | DLE    | Cc |
    | 0011  | \x{0011}         | DEVICE CONTROL ONE                       | DC1    | Cc |
    | 0012  | \x{0012}         | DEVICE CONTROL TWO                       | DC2    | Cc |
    | 0013  | \x{0013}         | DEVICE CONTROL THREE                     | DC3    | Cc |
    | 0014  | \x{0014}         | DEVICE CONTROL FOUR                      | DC4    | Cc |
    | 0015  | \x{0015}         | NEGATIVE ACKNOWLEDGE                     | NAK    | Cc |
    | 0016  | \x{0016}         | SYNCHRONOUS IDLE                         | SYN    | Cc |
    | 0017  | \x{0017}         | END OF TRANSMISSION BLOCK                | ETB    | Cc |
    | 0018  | \x{0018}         | CANCEL                                   | CAN    | Cc |
    | 0019  | \x{0019}         | END OF MEDIUM                            | EM     | Cc |
    | 001A  | \x{001A}         | SUBSTITUTE                               | SUB    | Cc |
    | 001B  | \x{001B}         | ESCAPE                                   | ESC    | Cc |
    | 001C  | \x{001C}         | FILE SEPARATOR                           | FS     | Cc |
    | 001D  | \x{001D}         | GROUP SEPARATOR                          | GS     | Cc |
    | 001E  | \x{001E}         | RECORD SEPARATOR                         | RS     | Cc |
    | 001F  | \x{001F}         | UNIT SEPARATOR                           | US     | Cc |
    •-------•------------------•------------------------------------------•--------•----•
    | 007F  | \x{007F}         | DELETE                                   | DEL    | Cc |
    •-------•------------------•------------------------------------------•--------•----•
    | 0080  | \x{0080}         | PADDING CHARACTER                        | PAD    | Cc |
    | 0081  | \x{0081}         | HIGH OCTET PRESET                        | HOP    | Cc |
    | 0082  | \x{0082}         | BREAK PERMITTED HERE                     | BPH    | Cc |
    | 0083  | \x{0083}         | NO BREAK HERE                            | NBH    | Cc |
    | 0084  | \x{0084}         | INDEX                                    | IND    | Cc |
    | 0085  | \x{0085}         | NEXT LINE                                | NEL    | Cc |
    | 0086  | \x{0086}         | START OF SELECTED AREA                   | SSA    | Cc |
    | 0087  | \x{0087}         | END OF SELECTED AREA                     | ESA    | Cc |
    | 0088  | \x{0088}         | HORIZONTAL TABULATION SET                | HTS    | Cc |
    | 0089  | \x{0089}         | HORIZONTAL TABULATION WITH JUSTIFICATION | HTJ    | Cc |
    | 008A  | \x{008A}         | VERTICAL TABULATION SET                  | VTS    | Cc |
    | 008B  | \x{008B}         | PARTIAL LINE DOWN                        | PLD    | Cc |
    | 008C  | \x{008C}         | PARTIAL LINE UP                          | PLU    | Cc |
    | 008D  | \x{008D}         | REVERSE INDEX                            | RI     | Cc |
    | 008E  | \x{008E}         | SINGLE-SHIFT 2                           | SS2    | Cc |
    | 008F  | \x{008F}         | SINGLE-SHIFT 3                           | SS3    | Cc |
    | 0090  | \x{0090}         | DEVICE CONTROL STRING                    | DCS    | Cc |
    | 0091  | \x{0091}         | PRIVATE USE 1                            | PU1    | Cc |
    | 0092  | \x{0092}         | PRIVATE USE 2                            | PU2    | Cc |
    | 0093  | \x{0093}         | SET TRANSMIT STATE                       | STS    | Cc |
    | 0094  | \x{0094}         | CANCEL CHARACTER                         | CCH    | Cc |
    | 0095  | \x{0095}         | MESSAGE WAITING                          | MW     | Cc |
    | 0096  | \x{0096}         | START OF PROTECTED AREA                  | SPA    | Cc |
    | 0097  | \x{0097}         | END OF PROTECTED AREA                    | EPA    | Cc |
    | 0098  | \x{0098}         | START OF STRING                          | SOS    | Cc |
    | 0099  | \x{0099}         | SINGLE GRAPHIC CHARACTER INTRODUCER      | SGCI   | Cc |
    | 009A  | \x{009A}         | SINGLE CHARACTER INTRODUCER              | SCI    | Cc |
    | 009B  | \x{009B}         | CONTROL SEQUENCE INTRODUCER              | CSI    | Cc |
    | 009C  | \x{009C}         | STRING TERMINATOR                        | ST     | Cc |
    | 009D  | \x{009D}         | OPERATING SYSTEM COMMAND                 | OSC    | Cc |
    | 009E  | \x{009E}         | PRIVACY MESSAGE                          | PM     | Cc |
    | 009F  | \x{009F}         | APPLICATION PROGRAM COMMAND              | APC    | Cc |
    •-------•------------------•------------------------------------------•--------•----•
    | 00A0  | \x{00A0}         | NO-BREAK SPACE                           | NBSP   | Zs |
    •-------•------------------•------------------------------------------•--------•----•
    | 00AD  | \x{00AD}         | SOFT HYPHEN                              | SHY    | Cf |
    •-------•------------------•------------------------------------------•--------•----•
    | 061C  | \x{061C}         | ARABIC LETTER MARK                       | ALM    | Cf |
    •-------•------------------•------------------------------------------•--------•----•
    | 070F  | \x{070F}         | SYRIAC ABBREVIATION MARK                 | SAM    | Cf |
    •-------•------------------•------------------------------------------•--------•----•
    | 0890  | \x{0890}         | ARABIC POUND MARK ABOVE                  |        | Cf |
    | 0891  | \x{0891}         | ARABIC PIASTRE MARK ABOVE                |        | Cf |
    •-------•------------------•------------------------------------------•--------•----•
    | 1680  | \x{1680}         | OGHAM SPACE MARK                         | OSPM   | Zs |
    •-------•------------------•------------------------------------------•--------•----•
    | 180E  | \x{180E}         | MONGOLIAN VOWEL SEPARATOR                | MVS    | Cf |
    •-------•------------------•------------------------------------------•--------•----•
    | 2000  | \x{2000}         | EN QUAD                                  | NQSP   | Zs |
    | 2001  | \x{2001}         | EM QUAD                                  | MQSP   | Zs |
    | 2002  | \x{2002}         | EN SPACE                                 | ENSP   | Zs |
    | 2003  | \x{2003}         | EM SPACE                                 | EMSP   | Zs |
    | 2004  | \x{2004}         | THREE-PER-EM SPACE                       | 3/MSP  | Zs |
    | 2005  | \x{2005}         | FOUR-PER-EM SPACE                        | 4/MSP  | Zs |
    | 2006  | \x{2006}         | SIX-PER-EM SPACE                         | 6/MSP  | Zs |
    | 2007  | \x{2007}         | FIGURE SPACE                             | FSP    | Zs |
    | 2008  | \x{2008}         | PUNCTUATION SPACE                        | PSP    | Zs |
    | 2009  | \x{2009}         | THIN SPACE                               | THSP   | Zs |
    | 200A  | \x{200A}         | HAIR SPACE                               | HSP    | Zs |
    •-------•------------------•------------------------------------------•--------•----•
    | 200B  | \x{200B}         | ZERO WIDTH SPACE                         | ZWSP   | Cf |
    | 200C  | \x{200C}         | ZERO WIDTH NON-JOINER                    | ZWNJ   | Cf |
    | 200D  | \x{200D}         | ZERO WIDTH JOINER                        | ZWJ    | Cf |
    | 200E  | \x{200E}         | LEFT-TO-RIGHT MARK                       | LRM    | Cf |
    | 200F  | \x{200F}         | RIGHT-TO-LEFT MARK                       | RLM    | Cf |
    •-------•------------------•------------------------------------------•--------•----•
    | 2028  | \x{2028}         | LINE SEPARATOR                           | LS     | Zl |
    | 2029  | \x{2029}         | PARAGRAPH SEPARATOR                      | PS     | Zp |
    •-------•------------------•------------------------------------------•--------•----•
    | 202A  | \x{202A}         | LEFT-TO-RIGHT EMBEDDING                  | LRE    | Cf |
    | 202B  | \x{202B}         | RIGHT-TO-LEFT EMBEDDING                  | RLE    | Cf |
    | 202C  | \x{202C}         | POP DIRECTIONAL FORMATTING               | PDF    | Cf |
    | 202D  | \x{202D}         | LEFT-TO-RIGHT OVERRIDE                   | LRO    | Cf |
    | 202E  | \x{202E}         | RIGHT-TO-LEFT OVERRIDE                   | RLO    | Cf |
    •-------•------------------•------------------------------------------•--------•----•
    | 202F  | \x{202F}         | NARROW NO-BREAK SPACE                    | NNBSP  | Zs |
    | 205F  | \x{205F}         | MEDIUM MATHEMATICAL SPACE                | MMSP   | Zs |
    •-------•------------------•------------------------------------------•--------•----•
    | 2060  | \x{2060}         | WORD JOINER                              | WJ     | Cf |
    •-------•------------------•------------------------------------------•--------•----•
    | 2061  | \x{2061}         | FUNCTION APPLICATION                     | (FA)   | Cf |
    | 2062  | \x{2062}         | INVISIBLE TIMES                          | (IT)   | Cf |
    | 2063  | \x{2063}         | INVISIBLE SEPARATOR                      | (IS)   | Cf |
    | 2064  | \x{2064}         | INVISIBLE PLUS                           | (IP)   | Cf |
    •-------•------------------•------------------------------------------•--------•----•
    | 2066  | \x{2066}         | LEFT-TO-RIGHT ISOLATE                    | LRI    | Cf |
    | 2067  | \x{2067}         | RIGHT-TO-LEFT ISOLATE                    | RLI    | Cf |
    | 2068  | \x{2068}         | FIRST STRONG ISOLATE                     | FSI    | Cf |
    | 2069  | \x{2069}         | POP DIRECTIONAL ISOLATE                  | PDI    | Cf |
    | 206A  | \x{206A}         | INHIBIT SYMMETRIC SWAPPING               | ISS    | Cf |
    | 206B  | \x{206B}         | ACTIVATE SYMMETRIC SWAPPING              | ASS    | Cf |
    | 206C  | \x{206C}         | INHIBIT ARABIC FORM SHAPING              | IAFS   | Cf |
    | 206D  | \x{206D}         | ACTIVATE ARABIC FORM SHAPING             | AAFS   | Cf |
    | 206E  | \x{206E}         | NATIONAL DIGIT SHAPES                    | NADS   | Cf |
    | 206F  | \x{206F}         | NOMINAL DIGIT SHAPES                     | NODS   | Cf |
    •-------•------------------•------------------------------------------•--------•----•
    | 3000  | \x{3000}         | IDEOGRAPHIC SPACE                        | IDSP   | Zs |
    •-------•------------------•------------------------------------------•--------•----•
    | FEFF  | \x{FEFF}         | ZERO WIDTH NO-BREAK SPACE                | ZWNBSP | Cf |
    •-------•------------------•------------------------------------------•--------•----•
    | FFF9  | \x{FFF9}         | INTERLINEAR ANNOTATION ANCHOR            | IAA    | Cf |
    | FFFA  | \x{FFFA}         | INTERLINEAR ANNOTATION SEPARATOR         | IAS    | Cf |
    | FFFB  | \x{FFFB}         | INTERLINEAR ANNOTATION TERMINATOR        | IAT    | Cf |
    •-------•------------------•------------------------------------------•--------•----•
    | FFFC  | \x{FFFC}         | OBJECT REPLACEMENT CHARACTER             | OBJ    | So |
    | FFFD  | \x{FFFD}         | REPLACEMENT CHARACTER                    | ?      | So |
    •-------•------------------•------------------------------------------•--------•----•
    | 1BCA0 | \x{D82F}\x{DCA0} | SHORTHAND FORMAT LETTER OVERLAP          | SFLO   | Cf |
    | 1BCA1 | \x{D82F}\x{DCA1} | SHORTHAND FORMAT CONTINUING OVERLAP      | SFCO   | Cf |
    | 1BCA2 | \x{D82F}\x{DCA2} | SHORTHAND FORMAT DOWN STEP               | SFDS   | Cf |
    | 1BCA3 | \x{D82F}\x{DCA3} | SHORTHAND FORMAT UP STEP                 | SFUS   | Cf |
    •-------•------------------•------------------------------------------•--------•----•

    From this list, @m-fessler, which characters do you want to Search / Mark / Replace ?

    Moreover, do you want to ignore all characters above the BMP ( so, over \x{FFFF} ) or do you consider these characters as normal chars ?

    Once you know which characters you want to consider, it will be easy to get the appropriate REGEX search !
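
    For readers who prefer to test the category rules outside Notepad++, here is a minimal Python sketch, assuming the categories listed above. It relies on Python's unicodedata module, so the exact set of characters it removes depends on the Unicode version shipped with your Python; the names is_unwanted and clean are made up for the example, and this is not the regex that would be used in Notepad++ itself.

```python
import unicodedata

# Categories from the list above: separators (Zs, Zl, Zp), controls (Cc), formats (Cf).
UNWANTED_CATEGORIES = {"Zs", "Zl", "Zp", "Cc", "Cf"}
# TAB, LF and CR are excluded from the list; the plain space U+0020 is also
# absent from the table, so this sketch keeps it as well.
KEEP = {"\t", "\n", "\r", " "}
# The two "So" characters singled out in the list.
EXTRA = {"\uFFFC", "\uFFFD"}

def is_unwanted(ch):
    """True if the character belongs to the list of special characters."""
    if ch in KEEP:
        return False
    return ch in EXTRA or unicodedata.category(ch) in UNWANTED_CATEGORIES

def clean(text):
    """Remove every unwanted character, leaving the rest untouched."""
    return "".join(ch for ch in text if not is_unwanted(ch))

# ZERO WIDTH SPACE, NO-BREAK SPACE and BELL are removed; TAB and space stay.
print(repr(clean("a\u200bb\u00a0c\td\x07 e")))  # 'abc\td e'
```

    Removing the characters is only one possible choice: replacing the Zs characters with a normal space may be preferable, which is exactly why the question of which characters to consider matters.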

    Best Regards,

    guy038

  • .tab files are opening as toml files. How do I fix this?

    9
    0 Votes
    9 Posts
    90 Views
    mathlete2M

    @PeterJones not a big deal, but a more accurate quote of my reply would have been “[we] actually referenced…”. I understand where the inadvertent mistake was made, but technically, the current quote implies that I was claiming to have referenced two different files within my responses (specifically, the ones prior to the one that you quoted me from), which isn’t true.

    Either way, thanks for explaining the relationship between those two files! You’re quite right to suggest that I hadn’t looked this up yet. Knowing this dynamic now, and given that the OP didn’t actually specify which file(s) he had already looked at, it would have been a good idea to ask him to clarify this; it’s possible that he was looking at the same file that I was, and that the one that you mentioned has the TOML configuration that he was expecting to find.

  • Need to set tab STOPS, not spacing.

    22
    0 Votes
    22 Posts
    3k Views
    Alan KilbornA

    @Coises said :

    If what you want is for the tab key to insert a number of spaces,

    I guess we’re still unsure of what the posters that showed interest really want…

    include the ability to specify the exact tabstops for different file types and activate based on the file extension and/or the language. I don’t know if that degree of complexity is practical in a script.

    It’s certainly not a problem to script something like that.
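
    To give an idea of the logic such a script would need, here is a minimal plain-Python sketch, not the actual script discussed in this thread. The extensions, the tab stop columns, and the function name spaces_to_next_stop are all hypothetical, and the Notepad++ side (reading the caret column and inserting the spaces via PythonScript) is deliberately left out.

```python
from bisect import bisect_right
import os

# Hypothetical per-extension tab stops (0-based columns); purely illustrative.
TAB_STOPS = {
    ".asm": [8, 16, 40],
    ".cob": [7, 11, 72],
}

def spaces_to_next_stop(filename, column, default_width=4):
    """Number of spaces that moves the caret from `column` to the next tab stop."""
    ext = os.path.splitext(filename)[1].lower()
    stops = TAB_STOPS.get(ext, [])
    i = bisect_right(stops, column)      # index of the first stop beyond `column`
    if i < len(stops):
        return stops[i] - column
    # Past the last explicit stop (or no stops defined): fall back to a regular grid.
    return default_width - column % default_width

print(spaces_to_next_stop("prog.asm", 0))   # 8
print(spaces_to_next_stop("prog.asm", 8))   # 8
print(spaces_to_next_stop("prog.asm", 20))  # 20
print(spaces_to_next_stop("notes.txt", 5))  # 3
```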

  • Broken emoji

    5
    0 Votes
    5 Posts
    209 Views
    Hosein GSDH

    @PeterJones
    thank you so much for your detailed answer 🙏

    Actually the problem was with the output files of this chrome extension.
    I wrote a comment about that, and today I used it again and noticed that its problem is solved.

    (before that, I tried to solve many problems using regex manually, which was useful for things like & but not for things like 💀. )

    @guy038 said in Broken emoji:

    I think that the clever @peterjones’s method can be recorded as a N++ macro !

    absolutely!

  • 0 Votes
    8 Posts
    941 Views
    Vladimír KordíkV

    @xomx Hello. I’m trying this DPI settings solution and it’s not working. I’ve tried all the options, including Application, System, and System (Enhanced).

  • Search for character classes but not replace them

    51
    0 Votes
    51 Posts
    3k Views
    guy038G

    Hi, @coises and All,

    You said,

    My thought is that it should be the same things Scintilla recognizes as line breaks and the Notepad++ documentation states: just \n and \r.

    I think that this reasoning is the right one ! Moreover, note that we use the same reasoning when we want to find all chars but a specific one, in each single line : we use the regex [^c\r\n], where c is the character we do not want to match !

    Thus, against my Total_Chars.txt file, the regex (?s). should return 325,590 occurrences and the regex (?-s). should return 325,588 occurrences
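
    The counting idea can be tried on a tiny string with the Python sketch below. It uses Python's re module, not the Boost engine of Notepad++, and in Python a dot without the (?s) flag excludes only \n, so the sketch spells out the [^\r\n] class explicitly instead of using (?-s). The sample string is made up.

```python
import re

text = "abc\r\nxyz"  # two lines separated by one Windows line break (\r\n)

every_char = re.findall(r"(?s).", text)        # dot also matches \r and \n
no_line_breaks = re.findall(r"[^\r\n]", text)  # only \r and \n are excluded
all_but_c = re.findall(r"[^c\r\n]", text)      # as above, and "c" is excluded too

print(len(every_char), len(no_line_breaks), len(all_but_c))  # 8 6 5
```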

    Now, regarding my question :

    Just because you do not allow backward searches when choosing the Regular expression search mode ! Maybe you could add it among all the Columns++ options ?

    I do understand all the reasons why you are not inclined to do so ! However, note that, as I regularly use the regexBackward4PowerUser="yes" option, in the FindHistory node of the config.xml file, I can assure you that a lot, but not all, of regexes can be processed in backward direction ! Unfortunately, with our present Boost regex engine, you can verify my assertion :

    Backward regex searches, for NON ANSI files, stop as soon as they match a character with code-point over \x{007F}

    I also tested the search of invalid UTF-8 bytes. To do so :

    Open a new N++ tab. ( I assume that its current encoding is UTF-8 ! )

    Run the Encoding > Convert to ANSI menu option

    Paste the text below, in this new ANSI tab

    ABC ퟿ XYZ \x{D7FF} ED 9F BF LAST valid char BEFORE Surrogates range ABC í € XYZ \x{D800} ED A0 80 FIRST SURROGATE char ABC í¿¿ XYZ \x{DFFF} ED BF BF LAST SURROGATE char ABC  XYZ \x{E000} EE 80 80 First valid char AFTER Surrogates range ABC € XYZ ABC  XYZ ABC ‚ XYZ ABC ƒ XYZ ABC „ XYZ ABC … XYZ ABC † XYZ ABC ‡ XYZ ABC ˆ XYZ ABC ‰ XYZ ABC Š XYZ ABC ‹ XYZ ABC Œ XYZ ABC  XYZ ABC Ž XYZ ABC  XYZ ABC  XYZ ABC ‘ XYZ ABC ’ XYZ ABC “ XYZ ABC ” XYZ ABC • XYZ ABC – XYZ ABC — XYZ ABC ˜ XYZ ABC ™ XYZ ABC š XYZ ABC › XYZ ABC œ XYZ ABC  XYZ ABC ž XYZ ABC Ÿ XYZ ABC   XYZ ABC ¡ XYZ ABC ¢ XYZ ABC £ XYZ ABC ¤ XYZ ABC ¥ XYZ ABC ¦ XYZ ABC § XYZ ABC ¨ XYZ ABC © XYZ ABC ª XYZ ABC « XYZ ABC ¬ XYZ ABC ­ XYZ ABC ® XYZ ABC ¯ XYZ ABC ° XYZ ABC ± XYZ ABC ² XYZ ABC ³ XYZ ABC ´ XYZ ABC µ XYZ ABC ¶ XYZ ABC · XYZ ABC ¸ XYZ ABC ¹ XYZ ABC º XYZ ABC » XYZ ABC ¼ XYZ ABC ½ XYZ ABC ¾ XYZ ABC ¿ XYZ ABC À XYZ ABC Á XYZ ABC  XYZ ABC à XYZ ABC Ä XYZ ABC Å XYZ ABC Æ XYZ ABC Ç XYZ ABC È XYZ ABC É XYZ ABC Ê XYZ ABC Ë XYZ ABC Ì XYZ ABC Í XYZ ABC Î XYZ ABC Ï XYZ ABC Ð XYZ ABC Ñ XYZ ABC Ò XYZ ABC Ó XYZ ABC Ô XYZ ABC Õ XYZ ABC Ö XYZ ABC × XYZ ABC Ø XYZ ABC Ù XYZ ABC Ú XYZ ABC Û XYZ ABC Ü XYZ ABC Ý XYZ ABC Þ XYZ ABC ß XYZ ABC à XYZ ABC á XYZ ABC â XYZ ABC ã XYZ ABC ä XYZ ABC å XYZ ABC æ XYZ ABC ç XYZ ABC è XYZ ABC é XYZ ABC ê XYZ ABC ë XYZ ABC ì XYZ ABC í XYZ ABC î XYZ ABC ï XYZ ABC ð XYZ ABC ñ XYZ ABC ò XYZ ABC ó XYZ ABC ô XYZ ABC õ XYZ ABC ö XYZ ABC ÷ XYZ ABC ø XYZ ABC ù XYZ ABC ú XYZ ABC û XYZ ABC ü XYZ ABC ý XYZ ABC þ XYZ ABC ÿ XYZ Now, choose the Encoding > UTF-8 encoding. So all characters of this ANSI file are re-interpreted as they were UTf_8 chars

    => You should see, between the strings ABC and XYZ :

    The last VALID UTF-8 char ( ED 9F BF ) before the SURROGATE range

    The 3-byte sequence of the first SURROGATE char, which is an INVALID sequence

    The 3-byte sequence of the last SURROGATE char, which is an INVALID sequence

    The first VALID UTF-8 char ( EE 80 80 ) after the SURROGATE range

    Then, a list of the 128 INVALID UTF-8 characters as the UTF-8 encoding does NOT allow any 1-byte character OVER \x{007F} !
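
    The validity of these byte sequences can be cross-checked outside Notepad++ with the small Python sketch below, which relies on Python's strict UTF-8 decoder. The helper name is_valid_utf8 is made up. Python's surrogateescape error handler maps each undecodable byte to a code point between U+DC80 and U+DCFF; whether Columns++ uses the same mapping internally is an assumption, not something this sketch can confirm.

```python
def is_valid_utf8(raw):
    """True if the bytes decode under Python's strict UTF-8 rules."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

samples = {
    "ED 9F BF": bytes.fromhex("ED9FBF"),  # U+D7FF, last char before the surrogates
    "ED A0 80": bytes.fromhex("EDA080"),  # would encode U+D800, first surrogate
    "ED BF BF": bytes.fromhex("EDBFBF"),  # would encode U+DFFF, last surrogate
    "EE 80 80": bytes.fromhex("EE8080"),  # U+E000, first char after the surrogates
    "80":       bytes.fromhex("80"),      # a lone byte over 7F
}
for label, raw in samples.items():
    print(label, "valid" if is_valid_utf8(raw) else "INVALID")

# With surrogateescape, every invalid byte becomes one code point in DC80..DCFF.
escaped = bytes.fromhex("EDA080").decode("utf-8", errors="surrogateescape")
print([hex(ord(ch)) for ch in escaped])  # ['0xdced', '0xdca0', '0xdc80']
```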

    Now :

    Move the caret to the first empty line

    Run the option Plugins > Columns++ > Search...

    Enter the range [\x{DC80}-\x{DCFF}] in the Find what : zone

    Click on the Find First button

    =>

    The Search region is set to the entire document

    The first INVALID byte \xED is selected

    Click on the Find Next button => It will select, one after another, all the other INVALID UTF-8 characters of this new tab !

    So, @coises, your new implementation works correctly, regarding the INVALID UTF-8 chars and I’m longing for your second experimental version ;-))

    Best Regards,

    guy038

  • Notepad++ RegEx question

    4
    0 Votes
    4 Posts
    105 Views
    guy038G

    Hello, @smudge-the-cat, @mkupper and All,

    @mkupper, if we’re certain that each record contains both the 2025-02-12 date and a name, another solution could be :

    FIND / MARK (?-is)(?=.*2025-02-12)(?=.*(Soradesuu|Setokaiba|.......|Sauron80|Karim_Dommez)).+

    In other words, it would match, from beginning of line, all the line contents ONLY IF, BOTH :

    It contains the 2025-02-12 date

    It contains a name of the list, with the same case, between parentheses which can be re-used as ${1}, if necessary
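
    The two look-aheads can be tried with the Python sketch below, which uses Python's re module rather than the Boost engine of Notepad++. The leading (?-is) is omitted because Python's re is already case-sensitive and its dot does not match line breaks by default. Only the four names visible in the regex above are used (the elided part of the list stays elided), and the sample records are made up.

```python
import re

# Only the names visible in the regex above; the real list is longer.
NAMES = ["Soradesuu", "Setokaiba", "Sauron80", "Karim_Dommez"]
pattern = re.compile(
    r"(?=.*2025-02-12)(?=.*(" + "|".join(map(re.escape, NAMES)) + r")).+"
)

# Hypothetical records, one per line.
records = [
    "2025-02-12 10:01 Soradesuu logged in",     # date and name: match
    "2025-02-11 09:30 Sauron80 logged in",      # wrong date: no match
    "2025-02-12 11:45 someone_else logged in",  # no listed name: no match
    "Karim_Dommez posted on 2025-02-12",        # order does not matter: match
]
results = []
for record in records:
    m = pattern.match(record)
    results.append(m.group(1) if m else None)
    print(bool(m), results[-1])
```

    Because both conditions are look-aheads anchored at the same position, the date and the name may appear in either order, and group 1 holds the name that was found.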

    Best Regards,

    guy038