    New user having trouble getting line/blank operations to work

    33 Posts 5 Posters 6.0k Views
    • motreoM
      motreo
      last edited by

      I’m a new Notepad++ user, trying to use this software to create more readable transcripts (generated by downsub.com). I have transcripts that look something like this (I put periods where blank spaces should be to show how some sentences are separated by one blank line while others are separated by two):

      sentence 1
      sentence 2
      .
      sentence 3
      .
      .
      sentence 4
      .
      sentence 5
      sentence 6

      Ideally, what I’d like to do is have a final transcript that looks like this:

      sentence 1 sentence 2 sentence 3

      sentence 4 sentence 5 sentence 6

      What I need to do is find a way to combine consecutive sentences as well as sentences separated by a single blank line into one paragraph, while preserving paragraph breaks where two blank lines would be.

      For some reason, though, I can’t get the functions ‘join lines’, ‘remove leading and trailing space’, or ‘remove empty lines/remove empty lines (containing blank characters)’ to work, whether I’m highlighting groups of lines or highlighting nothing at all when selecting these functions. When I try joining lines, something seems to happen because the selection highlight moves from the first line in the group to the last one, but the text itself looks unchanged. The functions to remove leading and trailing space and to remove empty lines seem to do nothing at all.

      I tried some of the find and replace functions other users have posted about online but haven’t gotten any of them to work. I also updated Notepad++ to see if the software being out of date could’ve been causing these issues, but that didn’t help either. I’m genuinely stumped over what I could be doing wrong. If someone could help me figure out a solution, I’d really appreciate it.

      • Terry RT
        Terry R @motreo
        last edited by Terry R

        @motreo said in New user having trouble getting line/blank operations to work:

        I’m genuinely stumped over what I could be doing wrong. If someone could help me figure out a solution, I’d really appreciate it.

        So much seems to be going wrong with your efforts that I might wonder if your Notepad++ installation is faulty.

        Start from a known position. On a blank tab, type some characters, a line feed, then some more. Also add in what you call blank lines. Test out some of the blank line operations and get familiar with the process. It helps to have the control characters such as line feed, carriage return and blanks/spaces showing. To do this, use the View menu, then Show Symbol, then Show All Characters.

        Also show those additional characters on the transcripts you are working on. It would help to show some examples in your next post, not in the way you did above, but using the process described in the FAQ Desk post “Formatting Forum Posts”.

        Also show us your debug information from Notepad++. This comes from the ? menu, then Debug Info. Copy and paste it into your next post.

        The more information you can give us, the more helpful it will be, and generally the quicker we can help you.

        Terry

        • motreoM
          motreo @Terry R
          last edited by

          @terry-r At first glance it doesn’t seem to be faulty, but I haven’t used it enough to know. Here’s the debug information:

          Notepad++ v8.2.1 (32-bit)
          Build time : Jan 19 2022 - 18:38:49
          Path : C:\Program Files (x86)\Notepad++\notepad++.exe
          Command Line : “C:\Users\name\AppData\Local\Temp[English] The Fermi Paradox Imprisoned Planets [DownSub.com]-3.txt”
          Admin mode : OFF
          Local Conf mode : OFF
          Cloud Config : OFF
          OS Name : Windows 10 Enterprise (64-bit)
          OS Version : 2004
          OS Build : 19041.1415
          Current ANSI codepage : 1252
          Plugins : mimeTools.dll NppConverter.dll NppExport.dll

          When I select show all characters, the only characters that appear are ‘LF’, scattered throughout the transcript at the beginning and end of each sentence. This is what I see:

          2022-02-14 21_34_29-C.png

          If I open a blank note and start typing, blank lines added by hitting enter look different:

          2022-02-14 21_36_54-_new 1 - Notepad++.png

          In this new note, I was able to successfully remove empty lines and join lines by highlighting all the text and then selecting join, which is great. Not sure how to efficiently join lines throughout an entire note, though, rather than going paragraph by paragraph (goal is to join consecutive lines + lines only separated by one empty line into paragraphs, while leaving paragraph breaks where there are two empty lines). Let me play around with it some more.

          • motreoM
            motreo @motreo
            last edited by

            @terry-r - one more thing I just noticed. When I export the transcript directly to Notepad++, I get a version with just ‘LF’. If I export it to another app first, like Windows notepad, and then copy and paste into Notepad++, I get a version with ‘CRLF’.

            • Terry RT
              Terry R @motreo
              last edited by Terry R

              @motreo said in New user having trouble getting line/blank operations to work:

              When I select show all characters, the only characters that appear are ‘LF’, scattered throughout the transcript at the beginning and end of each sentence

              Interesting, thanks for the images. I can see straight away that the file you are using (transcript) is recognized as a Unix style file due to ONLY having LF control characters at the end of each line.
              Secondly you will see some spaces amongst the words have a red dot, that’s the way a space is shown when using the “show characters” option. That means the other spaces aren’t the standard space character. Possibly they are a “non-breaking space”.

              I think the reason you are having issues is that the line functions you are using expect certain characters, which they then join through, or remove. Your transcript file doesn’t fit that format hence the issue you are having.

              As you found out when typing your own test you created a windows file, one that uses the CR (carriage return) and LF (line feed) characters to denote the end of a line.

              In our FAQ section is a post called Formatting Forum Posts. Read that and post the same example via that method. That allows us to further investigate what the spaces really are. It will be possible to create a regex (regular expression) which will do what you need as obviously the built-in functions will not.

              Terry

              I think Windows Notepad assumes there must be CR and LF denoting end of lines so converts the file to that standard. Then when you copy from there to Notepad++ you get the expected CRLF characters.
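For anyone who wants to check a file outside Notepad++, here is a minimal Python sketch of the same inspection: it counts each kind of line ending and lists every other whitespace character by Unicode code point. The function name `summarize_whitespace` is hypothetical, chosen only for this illustration.

```python
from collections import Counter
import unicodedata


def summarize_whitespace(text):
    """Count line-ending styles and list the other whitespace characters in text."""
    crlf = text.count("\r\n")
    lone_lf = text.count("\n") - crlf   # LF not preceded by CR (Unix style)
    lone_cr = text.count("\r") - crlf   # CR not followed by LF (old Mac style)
    # Tally every whitespace character that is not part of a line ending.
    spaces = Counter(ch for ch in text if ch.isspace() and ch not in "\r\n")
    named = {
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}": count
        for ch, count in spaces.items()
    }
    return {"CRLF": crlf, "LF": lone_lf, "CR": lone_cr, "spaces": named}


if __name__ == "__main__":
    sample = "the\u00a0\u00a0 starry\nnight\r\n"
    print(summarize_whitespace(sample))
```

When reading a real file for this, open it with `open(path, newline="")` so that Python does not translate the line endings before they are counted.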

              • guy038G
                guy038
                last edited by guy038

                Hello, @motreo, @terry-r and All,

                @motreo :

                Maybe I should wait a bit for a new post from you containing raw text, in reverse video, but I suppose that the following regex S/R should solve your problem !

                • Open the Replace dialog ( Ctrl + H )

                • SEARCH (\R){3,}|(\R){1,2}

                • REPLACE (?1\r\n\r\n)?2\x20    if the line endings must be CRLF, after replacement

                • OR

                • REPLACE (?1\n\n)?2\x20    if the line endings must be LF, after replacement

                • Untick all box options

                • Tick the Wrap around option

                • Select the Regular expression search mode

                • Click once on the Replace All button or several times, till the end of process, on the Replace button

                • Hit the ESC button to close the Replace dialog


                Note : the advantage of this method is that, whatever the line ending of each line of your INPUT text, your OUTPUT text will always have normalized line endings ;-))

                Best Regards

                guy038

                • Alan KilbornA
                  Alan Kilborn @motreo
                  last edited by Alan Kilborn

                  @motreo said in New user having trouble getting line/blank operations to work:

                  When I export the transcript directly to Notepad++, I get a version with just ‘LF’. If I export it to another app first, like Windows notepad, and then copy and paste into Notepad++, I get a version with ‘CRLF’.

                  First, there is no “export…to Notepad++”.
                  If you are copying and pasting (which I presume from the rest of your statement), then say that, don’t talk of “export”.

                  If you paste into Notepad++ using Ctrl+v some data you’ve copied from a non-Notepad++ source, your line-endings (e.g. LF) will remain however they exist in the source.

                  The default for Notepad++ new files – like the one you got from when you said:

                  If I open a blank note and start typing, blank lines added by hitting enter look different

                  is to have CRLF line-endings. You can see this in the status bar of Notepad++:

                  cc1ee44f-6a03-4af6-9e05-017d91a3cbe8-image.png

                  You have to decide what you want to end up with for the line-endings, LF or CRLF. You may not know enough to make a good choice; in that case go with CRLF.

                  When pasting into Notepad++, Notepad++ can “correct” your line-endings at the time of paste if you use the Edit menu’s Paste command rather than Ctrl+v. This is somewhat of a “quirk” of Notepad++ and further discussion of it as a possible bug is found HERE.
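If CRLF is the target, the conversion can also be done outside the editor. A minimal Python sketch, where `to_crlf` is a hypothetical helper name used only for this illustration:

```python
import re


def to_crlf(text):
    """Rewrite every line ending (CRLF, lone CR, or lone LF) as CRLF."""
    # \r\n is listed first so an existing CRLF is matched as one unit
    # and is not turned into two line endings.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


if __name__ == "__main__":
    mixed = "line one\nline two\r\nline three\rline four"
    print(repr(to_crlf(mixed)))
```

When reading or writing files with this, pass `newline=""` to `open()` so Python leaves the line endings exactly as the function produced them.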

                  • motreoM
                    motreo @guy038
                    last edited by

                    @guy038 Thanks so much for taking a stab at this!

                    For both of the options you listed, I unchecked the Transparency box and selected Wrap around and Regular expression.

                    I tried the second option (REPLACE (?1\n\n)?2\x20) first, on the version of the transcript with LF line endings, and it didn’t work unfortunately. I copied and pasted the transcript into a new note, though, which converted the line endings to CRLF and then tried the first option (REPLACE (?1\r\n\r\n)?2\x20). That one worked almost perfectly, save for two things: (1) everything was merged into one long paragraph instead of keeping paragraph breaks in the places where there are two blank lines, and (2) there are some extra spaces in between words. It looks like this:

                    Imagine the day a civilization discovers the   starry night sky above contains billions of  billions of worlds awaiting their arrival.   Now imagine the day they realize  those voyages will never be made.   So earlier this week we were talking about Kessler  Syndrome, collision cascades around planets that
                    

                    If I select Show all characters, you can see places where there are non-standard spaces (Terry pointed this out to me). Here’s a screenshot.

                    2022-02-15 13_21_18-_new 4 - Notepad++.png

                    In the Replace dialog, I’m able to replace two blank spaces with a single blank space using search and replace. If I try doing the same thing using three blank spaces in the ‘Find what’ box, however, then it doesn’t work. Highlighting all the text in the transcript and selecting ‘Trim Leading and Trailing Space’ before running any expressions doesn’t seem to impact whether superfluous spaces are left in the transcript after running your Regex expression.

                    Do you have a sense of how to convert three blank spaces to one blank space using search and replace, as well as maintain paragraph breaks where two empty lines would be? This probably has something to do with those non-standard spaces Terry pointed out, but I don’t know anything about that.

                    Please let me know if I didn’t explain things well. Thanks again for taking the time to help me!

                    • Alan KilbornA
                      Alan Kilborn @motreo
                      last edited by Alan Kilborn

                      @motreo said in New user having trouble getting line/blank operations to work:

                      non-standard spaces

                      Hmm, smells like some non-U+0020 space character, of which there are probably a few varieties. You do know about Unicode, right? Very likely these are non-breaking space characters (U+00A0).

                      • motreoM
                        motreo @Terry R
                        last edited by

                        @terry-r Thanks for explaining things further. Here’s a screenshot of some text showing where dots are placed:

                        2022-02-15 13_35_18-AppData_Local_Temp_.png

                        This is what it looks like after running the regex expression guy038 recommended to me (copy text from transcript with LF endings, paste into new note so endings are converted into CRLF, Ctrl + H, SEARCH (\R){3,}|(\R){1,2}, REPLACE (?1\r\n\r\n)?2\x20):

                        2022-02-15 13_21_18-_new 4 - Notepad++.png

                        Even after running that regex expression, extra spaces (either two or three) are left in between various words. Selecting ‘Trim Leading and Trailing Space’ doesn’t make a difference, whether I do it before or after running the regex expression.

                        • motreoM
                          motreo @Alan Kilborn
                          last edited by

                          @alan-kilborn I don’t know anything about Unicode. Is there a way to get rid of these non-breaking space characters?

                          • motreoM
                            motreo @Alan Kilborn
                            last edited by

                            @alan-kilborn

                            First, there is no “export…to Notepad++”.
                            If you are copying and pasting (which I presume from the rest of your statement), then say that, don’t talk of “export”.

                            Sorry, export wasn’t the right word to use. What I meant is that when I go to save the file from its source (downsub.com), I get a transcript with LF endings if I choose Open with Notepad++.

                            • Alan KilbornA
                              Alan Kilborn @motreo
                              last edited by Alan Kilborn

                              @motreo said in New user having trouble getting line/blank operations to work:

                              export wasn’t the right word to use. What I meant is that when I go to save the file from its source (downsub.com), I get a transcript with LF endings if I choose Open with Notepad++.

                              Well maybe export was the right word! :-)

                              The saving of the file by whatever is saving it is doing so to a Linux file format. No problem for Notepad++, but as a user of the data, you have to know if you want to keep it in Linux format, or change it over to Windows format.

                              • Alan KilbornA
                                Alan Kilborn @motreo
                                last edited by

                                @motreo said in New user having trouble getting line/blank operations to work:

                                Is there a way to get rid of these non-breaking space characters?

                                Probably should confirm it first. Do a regular expression search for \xa0 and see if it matches the suspect spaces.

                                • Terry RT
                                  Terry R @motreo
                                  last edited by

                                  @motreo said in New user having trouble getting line/blank operations to work:

                                  @terry-r Thanks for explaining things further. Here’s a screenshot of some text showing where dots are placed:

                                  At this point I think you REALLY need to provide examples in the format I requested (read that FAQ post). We need actual text to work on to help you. Images do not show the information, so we are only guessing (informed guesses they might be).

                                  Your issue is certainly fixable, just need the “real text”.

                                  Terry

                                  • Alan KilbornA
                                    Alan Kilborn @Terry R
                                    last edited by

                                    @terry-r said in New user having trouble getting line/blank operations to work:

                                    At this point I think you REALLY need to provide examples in the format I requested (read that FAQ post). We need actual text

                                    I think OP tried to do this, when he said:

                                    effbfe1c-4f4a-4502-94d4-d0bacbc1331c-image.png

                                    I copied and pasted this text but the spaces all seemed to be “normal”. :-(

                                    • PeterJonesP
                                      PeterJones @Alan Kilborn
                                      last edited by PeterJones

                                      @alan-kilborn said in New user having trouble getting line/blank operations to work:

                                      I copied and pasted this text but the spaces all seemed to be “normal”. :-(

                                      Apparently another quirk of the forum. If I copy/paste from there, or View Source on the webpage, the gap between “the” and “starry” is just normal spaces. If I use my moderator powers to “edit” the post (don’t worry, @motreo, I didn’t save my edits) and copy from the original post, it’s actually the\xA0\xA0\x20starry. So yes, there are two NBSP (\xA0) between those words.

                                      So @motreo, you do have fancy spaces. I recommend you just do a search for \xA0 and replace with \x20, which will replace all NBSP with normal spaces.
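The same cleanup can be sketched in Python: swap each NBSP for a normal space, then collapse the runs of spaces that are left behind. This is only an illustration of the idea, and `normalize_spaces` is a hypothetical name.

```python
import re


def normalize_spaces(text):
    """Replace NBSP (U+00A0) with a normal space, then collapse repeated spaces."""
    text = text.replace("\u00a0", " ")
    # After the swap, former NBSPs are ordinary spaces, so a run of two
    # or more spaces can be reduced to a single one.
    return re.sub(r" {2,}", " ", text)


if __name__ == "__main__":
    sample = "discovers the\u00a0\u00a0 starry night sky"
    print(normalize_spaces(sample))
```

Doing the NBSP replacement first is what makes the second step work: a search for two or three normal spaces cannot match a run that mixes NBSP and U+0020.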

                                      (We regulars will have to try to remember that even the text boxes can edit some characters, including the backslash-[ and NBSP)

                                      • Alan KilbornA
                                        Alan Kilborn @motreo
                                        last edited by

                                        @motreo said in New user having trouble getting line/blank operations to work:

                                        I don’t know anything about Unicode

                                        This may be a problem on a bigger scale, given what you seem to be doing. Maybe best to go off and do some learning.

                                        • motreoM
                                          motreo @Alan Kilborn
                                          last edited by

                                          @alan-kilborn that’s it - all the spaces where there isn’t any dot are highlighted when doing a search for \xa0

                                          • motreoM
                                            motreo @PeterJones
                                            last edited by

                                            @peterjones I recommend you just do a search for \xA0 and replace with \x20, which will replace all NBSP with normal spaces.

                                            Worked like a charm! And allows me to get rid of those extra spaces using search/replace :)
