• convert .txt file to pdf - missing last char

    4
    0 Votes
    4 Posts
    363 Views
    PeterJonesP

    If sodapdf.com is doing the conversion from .txt to .pdf, why do you think it’s a Notepad++ problem?

    If you can show that sodapdf.com works with a .txt file with more than 80 characters on a line that was created in some other application, but does not work with a .txt file with more than 80 characters on a line that was created in Notepad++, then we might be able to give some ideas. But if the same thing happens no matter what creates the >80-character text line, then you might want to ask the folks at sodapdf.

  • Hide/Unhide Non-Bookmarked Lines

    7
    0 Votes
    7 Posts
    6k Views
    Mike SmithM

    @Alan-Kilborn

    It seems like you could run your bookmarking operation, then do an “Inverse Bookmark” command, and then run the script @Ekopalypse provided…to get what you want?

    Yes, that’s a good idea! I did just that, and it hid everything I didn’t need to edit. Thank you.

    24000 things to examine and edit is a huge manual task.

    I don’t think it’s something that can be automated though. The issue is that I have all the download links in one database (which I’m editing through Notepad++), and the files are stored on a separate server. The task I am currently working on is to randomize all filenames:

    https://www.example.com/74547854787fileone.zip
    https://www.example.com/56647979548filetwo.zip
    https://www.example.com/01324679462filethree.zip
    https://www.example.com/64647452105filefour.zip

    So not only do I need to complete the task of randomizing the filenames (A task I’m achieving using Bulk Rename Utility), I have to then ensure the download links are changed to represent the relevant filename.
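    The link-updating half of that task could be scripted outside Notepad++. The following is only a minimal sketch: the `RENAMES` mapping, the `update_links` helper and the unprefixed old names are assumptions made for illustration, and the mapping would have to come from whatever list of old and new names the renaming tool can provide.

```python
import re

# Hypothetical mapping from old to new filenames (assumed for illustration).
RENAMES = {
    "fileone.zip": "74547854787fileone.zip",
    "filetwo.zip": "56647979548filetwo.zip",
}

def update_links(text, renames, base="https://www.example.com/"):
    """Replace each known old filename that follows `base` with its new name."""
    pattern = re.compile(re.escape(base) + r"([^\s/]+)")

    def swap(match):
        old = match.group(1)
        # Unknown filenames are left untouched.
        return base + renames.get(old, old)

    return pattern.sub(swap, text)

links = "https://www.example.com/fileone.zip https://www.example.com/other.zip"
updated = update_links(links, RENAMES)
```

    Links whose filename is not in the mapping pass through unchanged, so the script can be re-run safely as the mapping grows.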

  • Find&Replace pairs of capitalised words

    4
    1 Votes
    4 Posts
    314 Views
    Joaquin BuenoJ

    Thanks to both, @Ekopalypse and @guy038 :) :) :)

    Both codes worked like a charm.

    For anyone interested in these codes, @Ekopalypse's code finds every space-separated pair of words starting with a capital letter, and @guy038's code finds every space between words starting with a capital letter. Both were good for what I needed to do!
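    The regexes themselves are not quoted in this excerpt. As a rough illustration of the first idea only, here is a sketch in Python; the pattern `CAPITALISED_PAIR` is an assumption of mine, not the code posted by @Ekopalypse or @guy038, and it only treats ASCII `A-Z` as capital letters.

```python
import re

# Assumed pattern: two words, each starting with an ASCII capital letter,
# separated by exactly one space.
CAPITALISED_PAIR = re.compile(r"\b[A-Z]\w*\x20[A-Z]\w*\b")

sample = "I visited New York and Los Angeles last year"
pairs = CAPITALISED_PAIR.findall(sample)
```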

  • Colors from stylers.xml are wrong for JSON

    2
    1 Votes
    2 Posts
    273 Views
    raul1roR

    I also tried with other themes and same problem. Wrong colors.

  • marked and copy

    2
    0 Votes
    2 Posts
    137 Views
  • Regex - Positive Look Behind With *

    7
    1 Votes
    7 Posts
    4k Views
    Ray-HR

    Hello @guy038,

    Thank you for the detailed breakdown of the problem. There is a lot of useful knowledge here. I especially appreciate you pointing out the subtleties brought upon the look-arounds by their atomic structure.

    Best,
    Ray

  • Set file type to default to ".txt" file

    7
    0 Votes
    7 Posts
    740 Views
    EkopalypseE

    @Alan-Kilborn

    In the end it depends on what one is doing.
    If most of the time I create a new file by cutting/posting/pasting then
    this might be the solution - for other tasks it might be not.

  • one sentence per line

    5
    0 Votes
    5 Posts
    1k Views
    Dragoon 35D

    Thank you @guy038

  • Windows 10 Speech Recognition does not work well with Notepad++

    1
    0 Votes
    1 Posts
    257 Views
    No one has replied
  • shortening url

    8
    0 Votes
    8 Posts
    1k Views
    guy038G

    Hi, @wessel-bogers, @peterjones and All,

    @wessel-bogers, and All, when you ask for modifications of data with regular expressions, please, please …

    Give us a fairly large amount of your  INPUT  text ( as you did, @wessel-bogers, in your previous post )

    Give us, also, the expected  OUTPUT  text, from your specific  INPUT  text. That’s the  KEY  point !

    Then, from the differences between these two texts, even if you cannot simply express your goal yourself, we should be able to guess your needs, most of the time ;-))

    Remember : If you clearly define the rules to process in your  INPUT  text, half the job is already done ;-))

    See you later,

    Best Regards,

    guy038

  • UDL code folding

    5
    0 Votes
    5 Posts
    716 Views
    EkopalypseE

    This example doesn’t seem to be appropriate to show your issue with code folding.
    If I put ( and ) in Folding in code 1 style then I get the same result as LISP seems to give. Left one is UDL, right one is LISP

    b8ebe726-8822-4082-9458-46da11fa680d-image.png

    The link provided doesn’t explain code workflow.
    At the moment I have to assume that a code block is, indeed,
    identified as the part between an open bracket and a closing bracket.

  • File Too Big Inconsistency

    2
    0 Votes
    2 Posts
    195 Views
    PeterJonesP

    Memory usage varies – both in Notepad++ and in the other applications and background tasks running on your machine. The closing of Notepad++ and reloading will free all the memory except that used by the active file, which gives you the biggest chance. For files in the hundreds of megabytes, it’s probably better to use the sequence of exiting all NPP instances, then looping on “start new NPP instance with no files open; open huge file; process; close file; exit NPP”

    You don’t say which version of Notepad++ you use, or whether you’re 32-bit or 64-bit (? > Debug Info would be able to easily tell us), but 64-bit should be able to handle bigger files than 32-bit (unfortunately, because of the way the Scintilla editor component is written, it cannot handle files as big as 64-bit addresses would imply).

    Other than that, you might try closing as many other applications as you can, and doing anything you can to limit memory usage. You might try running without plugins ("c:\program files\Notepad++\notepad++.exe" -noPlugin "superhugefile.txt") to limit Notepad++'s memory usage.

    If all your processing involves is running one or more regex inside the file, you might want to try getting a windows copy of sed or windows copy of gawk, or use a full programming language like Perl or Python – all of which may (or may not) be able to make the transformations in your text that you need without requiring the entire file to be held in memory – it depends on the processing you need to do. (Using the PythonScript plugin as a Python interpreter, you could write your script and run it inside Notepad++, even if you aren’t using the PythonScript-specific access to the currently-open-files: you’d basically be using the PythonScript.dll instance of Python instead of installing your own python.exe. Note: it wouldn’t be enough to use PythonScript to load the huge file into Notepad++, because that would still have the full memory requirement, which you are running up against; you’d have to just use standard Python text-processing, trying to do it line-at-a-time or chunk-at-a-time, rather than whole-file-in-memory.)

    (edit: in case I wasn’t clear, the sed/gawk/perl/python solution would be outside of Notepad++; you could use NPP to write your script for one of those tools, and even use its run menu and similar functionality to launch the process… but it wouldn’t be doing the processing to the open files inside Notepad++, and wouldn’t be a solution that we’re really equipped to help you with, since this isn’t a general-programming forum.)
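    To make the line-at-a-time idea concrete, here is a minimal sketch in standard Python. The function name `process_large_file` and its parameters are made up for illustration, and it assumes the regex never needs to match across a line break.

```python
import re

def process_large_file(src_path, dst_path, pattern, replacement, encoding="utf-8"):
    """Apply a regex substitution one line at a time, never loading the whole file."""
    regex = re.compile(pattern)
    changed = 0
    # newline="" keeps the original line endings untouched on read and write.
    with open(src_path, "r", encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        for line in src:  # the file object yields one line at a time
            new_line, count = regex.subn(replacement, line)
            changed += count
            dst.write(new_line)
    return changed
```

    Writing to a separate output file means the original stays intact if something goes wrong partway through.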

  • Regex single dot character in group behaves differently than not in group

    3
    0 Votes
    3 Posts
    1k Views
    guy038G

    Hi, @matthews-dylan and All,

    I apologize for my very late reply, but I needed to do numerous verifications and tests ! I’m going to start with some general topics, and, then, I’ll come back to your specific problem to tell you why your second regex ^(.)*$ matches empty lines only and I’ll give you a solution in order to delete any line which does not contain any Emoji character. Take your time and have a drink : this post is quite long ;-))

    First, I would say that most of the monospaced fonts, used in code editors, can display the glyphs of traditional characters only ! So, you need to get a more robust font, which could display most of Unicode symbols properly ;-))

    So, refer to the last section of my other post, below :

    https://community.notepad-plus-plus.org/post/50673

    Now, after pasting the input line of your post, with my current N++ Courier New font, I get the line, below, where your character, not handled with that font, is simply replaced with a small white square box :

    Input line: □

    To get information on that character, refer, again, to the last section of this other post, which speaks about a very handy on-line UTF-8 tool :

    https://community.notepad-plus-plus.org/post/50983

    With the help of this tool, we deduce that your special char has the following characteristics :

    Character name : SPLASHING SWEAT SYMBOL
    Hex code point : 1F4A6
    Decimal code point : 128166
    Hex UTF-8 bytes : F0 9F 92 A6
    Octal UTF-8 bytes : 360 237 222 246
    UTF-8 bytes as Latin-1 characters bytes : ð <9F> <92> ¦
    Hex UTF-16 Surrogates : D83D DCA6
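    These characteristics can also be cross-checked without the on-line tool. A small sketch using only the Python standard library (the variable names are arbitrary, and `bytes.hex` with a separator needs Python 3.8 or later):

```python
import unicodedata

ch = "\U0001F4A6"  # the character discussed above

name = unicodedata.name(ch)                    # official Unicode character name
code_point = ord(ch)                           # decimal code point
utf8_hex = ch.encode("utf-8").hex(" ").upper() # UTF-8 bytes, space separated
utf16 = ch.encode("utf-16-be")                 # big-endian UTF-16, no BOM
# Split the UTF-16 bytes into 16-bit units: these are the two surrogates.
surrogates = [utf16[i:i + 2].hex().upper() for i in range(0, len(utf16), 2)]
```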

    Refer to the link, below, to see all the characters of the Unicode Miscellaneous Symbols and Pictographs block :

    http://www.unicode.org/charts/PDF/U1F300.pdf

    Note that the Unicode code-point of this character is 1F4A6, which is beyond the first 65536 characters of the Basic Multilingual Plane ( BMP ). Therefore, this means that :

    It is correctly encoded in an UTF-8 encoded file. So, you must use the N++ UTF-8 or UTF-8 BOM encodings, which can handle all Unicode characters, from \x{0000} to \x{10FFFF}

    It cannot be inserted in an ANSI encoded file, which handle 256 characters, only, from \x{00} to \x{FF}

    It cannot be inserted in a N++ UCS-2 BE BOM and UCS-2 LE BOM encoded file, which can handle only the 65536 characters of the BMP, from \x{0000} to \x{FFFF}

    Moreover, as the code-point of your character is over \x{FFFF} :

    It cannot be represented with the regex syntax \x{1F4A6}, due to a bug of the present Boost regex engine, which does not handle all characters in true 32-bit encoding :-(( Also, searching for \x{1F4A6} results in the error message Find: Invalid regular expression

    The simple regex dot symbol . cannot match a character, with Unicode code-point > \x{FFFF}, too !

    Luckily, if you paste your character in the Find what: zone, it does find all occurrences of the SPLASHING SWEAT SYMBOL character !

    Now, the surrogates mechanism allows the UTF-16 encoding ( not used in Notepad++ ) to be able to code all characters with code-point over \x{FFFF}. Refer below :

    https://en.wikipedia.org/wiki/UTF-16#Description

    And I found out that if I write a regex, involving the surrogates pair ( 2 16-bit units ) of a character, which is over the BMP, the regex engine is able to match this character. For instance, as the surrogates pair of your character is : D83D DCA6, the regex \x{D83D}\x{DCA6} does find all occurrences of your SPLASHING SWEAT SYMBOL character !

    I’ve done a lot of tests and, unfortunately, using a similar syntax, to get any char, with code over \x{FFFF}, most of the regexes do not work.

    Indeed, as the high 16-bits surrogate belongs to the [\x{D800}-\x{DBFF}] range and the low 16-bits surrogate belongs to the [\x{DC00}-\x{DFFF}] range :

    The regex [\x{D800}-\x{DBFF}][\x{DC00}-\x{DFFF}] does not find any match

    The regex [\x{D800}-\x{DBFF}]\x{DCA6} does not find any match, too

    Luckily, the regex \x{D83D}[\x{DC00}-\x{DFFF}] does match your special 💦 character :-))

    So, in summary, because of the wrong handling of characters, in the present implementation of the Boost Regex library, within Notepad++ :

    To match any standard character, from \x{0000} to \x{FFFF} ( NOT EOL chars and the Form Feed char \x0c ), use the simple regex .

    To match any standard character from \x{10000} to \x{10FFFF}, use the regex .[\x{DC00}-\x{DFFF}] OR the shorter syntax ..

    To match all standard characters, from \x{0000} to \x{10FFFF}, use the regex .[\x{DC00}-\x{DFFF}]? OR the shorter syntax ..?

    And :

    To match a specific character of the BMP, from \x{0000} to \x{FFFF} use the regex syntax \x{....}, with four hexadecimal numbers

    To match a specific character over the BMP, from \x{10000} to \x{10FFFF}, use the high and low surrogates equivalent pair, with the regex syntax \x{<high>}\x{<low>}, replacing the <high> and <low> values with their exact hexadecimal values, using 4 hexadecimal numbers
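    Working out the surrogate pair by hand is error-prone, so here is a small sketch that builds the search string from a code point. The helper name `npp_regex_for` is hypothetical; the arithmetic is the standard UTF-16 surrogate calculation.

```python
def npp_regex_for(code_point):
    """Return a \\x{....} regex for one code point, using a surrogate pair above the BMP."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if code_point <= 0xFFFF:
        # Inside the BMP: a single four-digit \x{....} escape is enough.
        return "\\x{%04X}" % code_point
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits go into the low surrogate
    return "\\x{%04X}\\x{%04X}" % (high, low)
```

    For example, `npp_regex_for(0x1F4A6)` gives the `\x{D83D}\x{DCA6}` regex used earlier in this post.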

    First example :

    From the list of chars, below :

    •----------------------------------•------------•-------•-------------------------•-------------------•--------------------------•
    | Character NAME                   | Code-Point | Char  | In a UTF-8 encoded file | Hex-16 Surrogates | SEARCH Regex             |
    •----------------------------------•------------•-------•-------------------------•-------------------•--------------------------•
    | LATIN CAPITAL LETTER A           | 0041       | A     | 41                      | N/A               | \x{0041} or .            |
    | MATHEMATICAL BOLD CAPITAL A      | 1D400      | 𝐀     | F0 9D 90 80             | D835 + DC00       | \x{D835}\x{DC00} or ..   |
    | COMBINING GRAVE ACCENT BELOW     | 0316       | ̖     | CC 96                   | N/A               | \x{0316} or .            |
    | COMBINING LEFT ANGLE ABOVE       | 031A       | ̚     | CC 9A                   | N/A               | \x{031A} or .            |
    | MUSICAL SYMBOL COMBINING MARCATO | 1D17F      | 𝅿     | F0 9D 85 BF             | D834 + DD7F       | \x{D834}\x{DD7F} or ..   |
    •----------------------------------•------------•-------•-------------------------•-------------------•--------------------------•

    We may build up some COMPOSED characters, as below :

    •-----------------------•-------•-------------------------•----------------------------•--------------------------------------------•
    | Code-Points           | Chars | In a UTF-8 encoded file | Hex-16 Surrogates          | SEARCH Regex                               |
    •-----------------------•-------•-------------------------•----------------------------•--------------------------------------------•
    | 0041 + 031A           | A̚     | 41 CC 9A                | NO                         | \x{0041}\x{031A} or ..                     |
    | 0041 + 1D17F          | A𝅿     | 41 F0 9D 85 BF          | D834 + DD7F ( on 2nd char) | \x{0041}\x{D834}\x{DD7F} or ...            |
    | 1D400 + 031A          | 𝐀̚     | F0 9D 90 80 CC 9A       | D835 + DC00 ( on 1st char) | \x{D835}\x{DC00}\x{031A} or ...            |
    | 1D400 + 1D17F         | 𝐀𝅿     | F0 9D 90 80 F0 9D 85 BF | D835 + DC00 + D834 + DD7F  | \x{D835}\x{DC00}\x{D834}\x{DD7F} or ....   |
    | 0041 + 1D17F + 031A   | A𝅿̚     | 41 F0 9D 85 BF CC 9A    | D834 + DD7F ( on 2nd char) | \x{0041}\x{D834}\x{DD7F}\x{031A} or ....   |
    | 0041 + 031A + 1D17F   | A𝅿̚     | 41 CC 9A F0 9D 85 BF    | D834 + DD7F ( on 3rd char) | \x{0041}\x{031A}\x{D834}\x{DD7F} or ....   |
    | 1D400 + 031A + 0316   | 𝐀̖̚     | F0 9D 90 80 CC 9A CC 96 | D835 + DC00 ( on 1st char) | \x{D835}\x{DC00}\x{031A}\x{0316} or ....   |
    •-----------------------•-------•-------------------------•----------------------------•--------------------------------------------•

    Second example: If we use any of the 3 following regex S/R :

    SEARCH (?-s)^.+(.[\x{DC00}-\x{DFFF}]).+

    or :

    SEARCH (?-s)^.+\x20(..)\x20.+

    or :

    SEARCH (?-s)^.+(\x{D83D}\x{DCA6}).+

    and :

    REPLACE A necklace of the SPLASHING SWEAT SYMBOL ––\1––\1––\1––\1––\1––\1––\1––\1––\1––

    against the text This is the 💦 character, at the beginning of a line, we get the resulting text :

    A necklace of the SPLASHING SWEAT SYMBOL ––💦––💦––💦––💦––💦––💦––💦––💦––💦––

    Now, let’s go back to your problem :

    Fundamentally, the problem arises because your special 💦 character can be matched with the regex .., only, regarding our present regex engine. It looks like, for these characters, the regex engine doesn’t see the character itself, but the two surrogate 16-bit code units !

    When you process the regex ^.*$ against your text : Input line: 💦, it does match the entire line, as the regex syntax .* means any number of chars ( . or .. or ..., and so on )

    Now, let’s consider the following regex syntaxes, with a capturing group 1, against this 4-lines text, pasted in a new tab :

    (empty line)
    💦
    (empty line)
    Input line: 💦

    Note that the 1st and 3rd line are empty, the 2nd line contains your 💦 special char, only and the 4th line ends with that special char

    Regarding the following regex examples, below, you may test them, using the -->\1<-- Replace zone

    Before, a quick reminder :

    The INPUT text :

    167844894321
    16784
    4566499

    with the regex S/R :

    SEARCH (\d)+

    REPLACE -->\1<--

    would result in :

    -->1<--
    -->4<--
    -->9<--

    As you can see, group 1 always contains the last stored value of the group. So, the regex could also have been rewritten as \d*(\d)
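    The "last stored value" behaviour of a repeated group is not specific to Notepad++. A quick sketch with Python's `re` module shows the same thing, assuming the three numbers of the reminder each sit on their own line:

```python
import re

text = "167844894321\n16784\n4566499"

# A repeated capturing group keeps only its last iteration.
result = re.sub(r"(\d)+", r"-->\1<--", text)

# Equivalent form: consume the leading digits, capture only the final one.
same = re.sub(r"\d*(\d)", r"-->\1<--", text)
```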

    The regex ^(.)$ cannot find anything, as no character, with code <= \x{FFFF}, exists between beginning and end of line

    The regex ^(..)$ does find, in line 2, your 💦 special character, with code > \x{FFFF}, between beginning and end of line

    Your regex ^(.)*$ simply matches the true empty lines 1 and 3. WHY ?
    Well, as the group contains only one dot ., it cannot match your last 💦 special character, in line 2 and 4, which needs to be considered as a pseudo two-chars entity. So the overall regex fails, in these lines !

    The regex ^(..)*$ does match all the lines of the subject text, because, luckily, the part Input line:, followed with a space char, is exactly 12 chars long, so an even number ! And the last value of group 1 is your 2-chars 💦 special char, right before the end of the line

    Notes :

    The regex ^.*(..)$ would match all the non-empty lines 2 and 4, because group 1, .., represents your 💦 special char, ending these lines

    And the regex ^(?:..){6}(..)$ would match the line 4, only

    The regex ^.............(.)$ does not work properly, because group1 does not contain the 💦 special character ( See after the replacement ! )

    On the contrary, the regex ^............(..)$ does find all contents of line 4, as the group 1, .., contains, exactly, the 💦 special character

    On the other hand :

    The regex ^(.)* selects, from the beginning of each line, as many standard characters, with code-point <= \x{FFFF}, as possible, but NOT your LAST 💦 special character ! So it matches the following strings :

    The null string before your 💦 special char, in line 2

    The string Input line:, followed with a space char, in line 4

    And, finally :

    The two regexes (.*)$ and (.*), with group 1 selecting all line contents, would match the four lines

    Now, your last goal : let’s suppose that you would like to delete any line, which does not contain any Unicode Emojis character :

    First, from that link :

    http://www.unicode.org/charts/PDF/U1F600.pdf

    We learn that the Unicode Emoticons block has code-points between \x{1F600} and \x{1F64F}

    With the on-line UTF-8 tool, we verify that the two Hex UTF-16 surrogates are :

    D83D DE00, for the \x{1F600} emoticon

    D83D DE4F, for the \x{1F64F} emoticon

    So, we should match all the characters of the Unicode Emoticons block, with the search regex :

    SEARCH \x{D83D}[\x{DE00}-\x{DE4F}]

    And, yes, it does work as expected. In that case, deleting any non-empty line which does not contain any Emoticon character(s) is easy with the following regex S/R :

    SEARCH (?-s)^(?!.*\x{D83D}[\x{DE00}-\x{DE4F}]).+\R

    REPLACE Leave EMPTY

    In contrast, the regex S/R :

    SEARCH (?-s)^(?=.*\x{D83D}[\x{DE00}-\x{DE4F}]).+\R

    REPLACE Leave EMPTY

    would delete any non-empty line containing one or more emoticon character(s) !
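    For comparison, an engine that works on whole code points does not need the surrogate workaround. The sketch below uses Python 3, where the Emoticons block can be written as a plain character class; the function name `keep_emoticon_lines` is made up, and unlike the N++ regex it also drops a last line that has no line ending.

```python
import re

# Python 3 strings are sequences of code points, so a direct range works.
EMOTICON = re.compile("[\U0001F600-\U0001F64F]")

def keep_emoticon_lines(text):
    """Drop non-empty lines without an Emoticons-block character; keep empty lines."""
    kept = []
    for line in text.splitlines(keepends=True):
        if line.strip("\r\n") == "" or EMOTICON.search(line):
            kept.append(line)
    return "".join(kept)
```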

    Not asleep yet ? That’s good news :-))

    Best Regards,

    guy038

    P.S. :

    Let’s suppose that, instead of the small Unicode Emoticons block, containing 80 characters, we would like to search for any character belonging to the Unicode Miscellaneous Symbols and Pictographs block, which contains 768 characters and where your special 💦 char takes place

    Right now, it’s getting really inextricable ! The Unicode range of that block is from \x{1F300} to \x{1F5FF}, but, because of the surrogates mechanism, it must be split in two parts :

    The range of chars between \x{1F300} and \x{1F3FF}, so with surrogates pairs D83C DF00 to D83C DFFF

    The range of chars between \x{1F400} and \x{1F5FF}, so with surrogates pairs D83D DC00 to D83D DDFF

    Therefore, the correct regex to match all the characters of this block is, indeed :

    \x{D83C}[\x{DF00}-\x{DFFF}]|\x{D83D}[\x{DC00}-\x{DDFF}]

    with an alternative between two regexes, in order to match each subset !

    I confirm that this regex does find the 768 characters of the Unicode Miscellaneous Symbols and Pictographs block, with code-point over \x{FFFF} !

    It’s really a pity that the N++ regex engine does not handle correctly all the characters outside the BMP. If it did, we would simply have to use the classical [\x{1F300}-\x{1F5FF}] character class !!
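    Splitting a block by hand gets tedious for larger ranges, so the alternation can be generated. This is only a sketch: the helper names `to_surrogates` and `surrogate_range_regex` are hypothetical, and it emits one alternative per high surrogate because, as noted above, a class on the high surrogate did not match in my tests.

```python
def to_surrogates(code_point):
    """Return the (high, low) UTF-16 surrogates of a code point above the BMP."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

def surrogate_range_regex(first, last):
    """Build an alternation of \\x{high}[\\x{low}-\\x{low}] parts covering first..last."""
    if not 0x10000 <= first <= last <= 0x10FFFF:
        raise ValueError("expected a range above the BMP")
    hi_first, lo_first = to_surrogates(first)
    hi_last, lo_last = to_surrogates(last)
    parts = []
    for high in range(hi_first, hi_last + 1):
        # Only the first and last high surrogates have a partial low range.
        lo_start = lo_first if high == hi_first else 0xDC00
        lo_end = lo_last if high == hi_last else 0xDFFF
        parts.append("\\x{%04X}[\\x{%04X}-\\x{%04X}]" % (high, lo_start, lo_end))
    return "|".join(parts)
```

    Calling `surrogate_range_regex(0x1F300, 0x1F5FF)` reproduces the two-part regex given above for the Miscellaneous Symbols and Pictographs block.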

  • Regex: Delete all signs/operator like (comma) from tags < >

    11
    -1 Votes
    11 Posts
    1k Views
  • Change a series of numbers when they start with 2

    7
    0 Votes
    7 Posts
    339 Views
    massimo la terraM

    ok grazie ;-)

  • 0 Votes
    2 Posts
    176 Views
    PeterJonesP

    Run > Launch In Edge has been deprecated for multiple versions of Notepad++.

    If you have a recent version of Notepad++ (v7.8.4 recommended; 7.8.2 had a bugfix, so I wouldn’t recommend any versions earlier than that), you should be using View > View Current File In > Edge. For me, that allows me to open my HTML document in
    41e9e816-68f4-4119-bb95-917572ea4b6c-image.png . I have no idea if that is “the newest version of” Edge or not, since I almost never run Edge.

  • converting my text in notepad++ to chrome is not converting

    6
    0 Votes
    6 Posts
    394 Views
    PeterJonesP

    @Kenneth-Keen ,

    Unfortunately, we don’t seem to be able to communicate with each other. Your words are English as are mine, but the meaning behind the groups of words seems to be getting lost somewhere. I am sorry for that.

    it tells me than an “error 2” is showing

    Given that you’re on a Chromebook, and I think Chrome (like Android) is a linux-ish OS, error 2 would normally be ENOENT, “No such file or directory”. I have no idea if that’s really the case for you.
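    On a linux-like system, that mapping from error number to name can be checked with a few lines of standard Python. This is just a lookup sketch; it says nothing about what actually produced the "error 2" in your case.

```python
import errno
import os

code = 2
symbol = errno.errorcode[code]   # symbolic name for the error number
message = os.strerror(code)      # the platform's human-readable description
```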

    Just like I don’t understand how you think you are running notepad++.exe on your chromebook. Maybe you downloaded notepad++.exe and are trying to execute that natively on the chromebook, which would very much surprise me if it worked. Maybe you are trying to say that when you try to run notepad++.exe, you are getting an error 2 “No such file or directory”. That wouldn’t be the error I would expect for an executable that cannot run, but who knows.

    Can someone inform me about a code editor like “notepad ++” that is for the chromebook

    There may or may not be Chromebook users here in the Notepad++ forum. But this isn’t really the best place. You might find a forum dedicated to chromebook users, or let me google “best html code editors for chromebook” for you, where this article has a couple of suggestions, and the website looks like it might be a good resource for an apparently-new-to-chromebook user.

  • Adding text to toolbar icons

    3
    0 Votes
    3 Posts
    322 Views
    Alan KilbornA

    @Michael-Vincent said in Adding text to toolbar icons:

    turn off the toolbar

    I don’t know if I’d want to do that; although I’m not a toolbar junkie, it has its uses (especially the macro buttons, and the single toolbar button I have tied to a Pythonscript).

    use MenuSearch

    I must say, it’s an excellent plugin (having just started using it). No more hunting through the Preferences (ugh). And, it will find my Pythonscripts and let me run them (left-click on them) or pull them into the editor (ctrl+left-click on them). This latter function really reduces the value of my "single toolbar button I have tied to a Pythonscript".

  • Returning from Post-it or Full Screen mode without shortcuts

    7
    0 Votes
    7 Posts
    11k Views
    Alan KilbornA

    @Ekopalypse said in Returning from Post-it or Full Screen mode without shortcuts:

    want only see the coding stuff

    Maybe, but at least for the way I have things set up, not much is eliminated and a lot of non-coding stuff remains.

  • Linux EOT on existing files

    6
    0 Votes
    6 Posts
    1k Views
    EkopalypseE

    @Meh-Di

    just to give you an idea how I am working with it.
    We upload parameter files to embedded devices running real time linux.
    The files are stored on a NAS running debian linux.
    Using npp on a windows PC with a mapped drive to the NAS.
    Edit, save and upload to the embedded devices from within npp.
    Never had any issue that those files got corrupted.