Misalignment of text when Notepad++ edited files are opened in other editors (notepad, wordpad, google drive anyfile editor, etc.)



  • @Compu-chan

    Looks strange, indeed. I would open the file in a hex editor and check for unusual bytes.

    You mentioned that you share this file. Is it already available for download?

    Cheers
    Claudia



  • Sure, here’s one of the 8086 assembly source files:

    https://www.dropbox.com/s/gaxnn56hteuz5n7/g_lib.asm?dl=0

    And here’s a screenshot of bad text alignment in the file. The comment block is shifted beyond 78 columns.
    It just looks bad, and makes it a little awkward to read in a DOS editor.

    https://www.dropbox.com/s/b1b120zj5w28la2/bad_formatting2.png?dl=0



  • @Compu-chan

    Looks like you’re mixing tabs and spaces.
    Activate View->Show Symbol->Show All Characters to make them visible.

    You could then convert the tabs with Edit->Blank Operations->TAB to Space.

    Cheers
    Claudia



  • Problem solved. Thanks!

    Solution:

    Settings->Preferences->Language
    check the “Replace by space” box

    Edit->Blank Operations->TAB to Space

    Make some arbitrary change to the file (type a character then delete it), then re-save the file.

    Formatting/indentation should be fixed.



  • @Compu-chan

    So in your original post you provided misinformation:

    “Tabs have been replaced by spaces”



  • No, I checked the replace tab with spaces box before posting my question.
    Someone else on a different forum mentioned it as a possible solution to a similar but different problem.

    It only prevents future tabulation symbols from appearing in your file.
    It doesn’t get rid of the tabulations that are already present.

    You do that with:
    Edit->Blank Operations->TAB to space



  • @Compu-chan

    Perhaps the wording “Replace by space” in the Preferences is poor: it sounds as if checking the box performs an action, rather than simply changing a setting!



  • Also, I’m not sure if there’s already a way to do this, but being able to perform operations on all open files at the same time would be nice.

    I had to do the TAB to spaces operation on about a dozen files one by one to fix them all.
    Not a big deal, but slightly inconvenient.

    Not sure if I’m supposed to mark this post as solved.
    Don’t really see the option to do so.



  • @Compu-chan

    There is no option to do that specific operation on a group of files. However, it could be done with a regular expression Replace in Files on a bunch of files at once: replace \t with however many spaces you want a tab character to be.

    If you had changed the mode of this thread to “Ask as a Question” then you could later go in and “Mark as Solved”. You can still do it I think…



  • Hi @compu-chan, @scott-sumner and All,

    Scott said :

    There is no option to do that specific operation on a group of files. However, it could be done with a regular expression Replace in Files on a bunch of files at once: replace \t with however many spaces you want a tab character to be.

    Unfortunately, Scott, it’s not that simple, because the physical length of a tabulation character depends on its position :-(( So I began to investigate a way to simulate all the native Blank Operations of Notepad++ with regex search/replacements.

    The main advantage of this method is that you can perform these Blank Operations on multiple files belonging to the same folder, from the Find in Files dialog :-))
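
    To make the position dependence concrete, here is a small Python sketch (not part of the original posts). It compares a fixed-width replacement of \t with Python’s built-in str.expandtabs, assuming a tab size of 4. The sample line is made up for illustration.

```python
# Hypothetical sample line: one tab after 2 characters, another after 5 more.
line = "ab\tcdefg\th"

# Naive approach: every tab becomes exactly 4 spaces, wherever it sits.
naive = line.replace("\t", " " * 4)

# Position-aware approach: each tab only pads up to the next multiple of 4.
aligned = line.expandtabs(4)

print(repr(naive))    # the first tab is over-expanded (4 spaces instead of 2)
print(repr(aligned))
```

    The two results differ as soon as a tab does not start on a tab stop, which is exactly the case a plain \t replacement cannot handle.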


    The first 3 operations are quite easy to implement :

    • Trim Trailing Space :

      • SEARCH [\x20\t]+$ and REPLACE : leave the field EMPTY
    • Trim Leading Space :

      • SEARCH ^[\x20\t]+ and REPLACE : leave the field EMPTY
    • Trim Leading or Trailing Space :

      • SEARCH ^[\x20\t]+|[\x20\t]+$ and REPLACE : leave the field EMPTY
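
    For anyone who wants to try the three trim expressions outside Notepad++, here is a rough Python equivalent using the standard re module. The function names are made up for this sketch, and it assumes plain \n line endings, because Python’s $ only matches right before \n (a \r would stay in the way).

```python
import re

def trim_trailing(text):
    # [\x20\t]+$ : blanks right before the end of each line
    return re.sub(r"[\x20\t]+$", "", text, flags=re.MULTILINE)

def trim_leading(text):
    # ^[\x20\t]+ : blanks right after the start of each line
    return re.sub(r"^[\x20\t]+", "", text, flags=re.MULTILINE)

def trim_both(text):
    # Both alternatives combined, as in the third operation
    return re.sub(r"^[\x20\t]+|[\x20\t]+$", "", text, flags=re.MULTILINE)

sample = "  one \t\n\ttwo  \nthree"
print(repr(trim_trailing(sample)))
print(repr(trim_leading(sample)))
print(repr(trim_both(sample)))
```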

    The following 2 operations are not too difficult to achieve, either !

    • EOL to Space :

      • SEARCH (?<=\x20)\R|(\R) and REPLACE (?1\x20) ( Any line break is replaced by a space, or simply deleted if it is already preceded by a space character )
    • Remove Unnecessary Blank and EOL :

      • SEARCH ^[\x20\t]+|[\x20\t]+$ and REPLACE : leave the field EMPTY ( Deletion of leading and trailing blank characters )

      • SEARCH \R and REPLACE \x20 ( Replacement of any line break by a space character )
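
    The same two operations can be sketched in Python. Two things are assumptions of this sketch: Python’s re module has neither \R nor the conditional replacement (?1...), so \R is approximated by (?:\r\n|\r|\n) and the conditional is handled by a small callback. The function names are made up, and the second function expects \n line endings.

```python
import re

EOL = r"(?:\r\n|\r|\n)"  # stand-in for \R, which Python's re module lacks

def eol_to_space(text):
    # Group 1 is set only when the line break is NOT preceded by a space.
    pattern = re.compile(rf"(?<=\x20){EOL}|({EOL})")
    return pattern.sub(lambda m: " " if m.group(1) is not None else "", text)

def remove_unnecessary_blank_and_eol(text):
    # Step 1: delete leading and trailing blanks on every line.
    trimmed = re.sub(r"^[\x20\t]+|[\x20\t]+$", "", text, flags=re.MULTILINE)
    # Step 2: replace every line break by a single space.
    return re.sub(EOL, " ", trimmed)

print(repr(eol_to_space("a \nb\r\nc")))
print(repr(remove_unnecessary_blank_and_eol("  a  \n\tb\t\n c")))
```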


    Now, due to the tabulation’s behaviour, which always stops at column 4*n whatever n > 0, the last 3 Blank Operations are much harder to work out, from the regex point of view !!

    • TAB to Space :

      • SEARCH (?-s)(?:()|(.)|(..)|(...))\t|(....) and REPLACE (?1\x20\x20\x20\x20)(?2\2\x20\x20\x20)(?3\3\x20\x20)(?4\4\x20)(?5$0)

    Briefly, here are the different cases :

    Let C = Any single STANDARD character
    
    	<   =   ()     + \t   ->   \1 + 4 Spaces   Group 1
    
    1	<   =   (C)    + \t   ->   \2 + 3 Spaces   Group 2
     	<   =   (C)    + \t   ->   \2 + 3 Spaces   Group 2
    
    12	<   =   (CC)   + \t   ->   \3 + 2 Spaces   Group 3
      	<   =   (CC)   + \t   ->   \3 + 2 Spaces   Group 3
    
    123	<   =   (CCC)  + \t   ->   \4 + 1 Space    Group 4
       	<   =   (CCC)  + \t   ->   \4 + 1 Space    Group 4
    
    1234<   =   (CCCC)        ->   $0              Group 5
    

    Notes :

    • Depending on the number of characters preceding the tabulation character, this S/R rewrites these characters, followed by the appropriate number of spaces

    • Note that if a range of 4 chars does not contain any tabulation, it is simply rewritten ( $0 ). This replacement, seemingly useless, is nevertheless necessary, so that the search goes on with the next block of 4 positions !
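
    As a cross-check, the TAB to Space search expression can be ported to Python. This is only a sketch: the conditional replacement is emulated with a callback, the tab size is fixed at 4, the sample text is made up, and the result is compared against Python’s own str.expandtabs.

```python
import re

# Same search expression as above; Python's "." excludes "\n" by default.
TAB_TO_SPACE = re.compile(r"(?:()|(.)|(..)|(...))\t|(....)")

def tabs_to_spaces(text):
    def rewrite(match):
        if match.group(5) is not None:
            # Four characters without a tabulation: rewritten unchanged ($0).
            return match.group(0)
        prefix = match.group(0)[:-1]             # the 0 to 3 characters before the tab
        return prefix + " " * (4 - len(prefix))  # pad up to the next multiple of 4
    return TAB_TO_SPACE.sub(rewrite, text)

sample = "\tmov ax,bx\t; copy\nab\tc\t\td"
converted = tabs_to_spaces(sample)
print(repr(converted))
print(converted == sample.expandtabs(4))
```

    Like the regex itself, the sketch counts every character as one column, so it makes no attempt to handle wide or combining characters.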


    • Space to TAB (All) :

      • SEARCH (?-s)(?|([^ \t\r\n])\x20(?:\x20[\x20\t]|\t)|([^ \t\r\n]{2})\x20[\x20\t]|([^ \t\r\n]{3})\x20)|(\x20{0,3}\t|\x20{4})|([^ \t\r\n]{1,3}\t|....)

      • REPLACE (?1\1\t)(?2\t)(?3$0)

    Again, here is a summary of all the cases, with their appropriate replacements :

    Let C = [^ \t\r\n] = Any single STANDARD character, other than a SPACE or a TABULATION
    
    a 	<   =   (C)    + 1 sp + \t  ->   \1\t   Group 1
    a  	<   =   (C)    + 2 sp + \t  ->   \1\t   Group 1
    a   <   =   (C)    + 3 sp       ->   \1\t   Group 1
    
    ab 	<   =   (CC)   + 1 sp + \t  ->   \1\t   Group 1
    ab  <   =   (CC)   + 2 sp       ->   \1\t   Group 1
    
    abc <   =   (CCC)  + 1 sp       ->   \1\t   Group 1
    
    	<   =   (0 sp + \t)         ->   \t     Group 2
     	<   =   (1 sp + \t)         ->   \t     Group 2
      	<   =   (2 sp + \t)         ->   \t     Group 2
       	<   =   (3 sp + \t)         ->   \t     Group 2
        <   =   (4 sp)              ->   \t     Group 2
    
    a	<   =   (C    + \t)         ->   $0     Group 3
    ab	<   =   (CC   + \t)         ->   $0     Group 3
    abc	<   =   (CCC  + \t)         ->   $0     Group 3
    abcd<   =   (CCCC)              ->   $0     Group 3
    

    Notes :

    • In short, all the cases are divided into 3 main parts, with the appropriate replacements :

      • Standard characters followed by a mix of spaces/tabulations must be rewritten, followed by a tabulation ( (?1\1\t) )

      • A mix of spaces/tabulations, only, must be replaced by a single tabulation ( (?2\t) )

      • Standard characters followed by a single tabulation simply have to be rewritten ( (?3$0) )

    • Note, in the search regex, the special branch reset construct (?|......), which resets the sub-expression count at the start of each | alternative of this construct. So, whichever branch matches, in our example, the matched standard characters are always stored in group 1 and will be replaced according to the conditional form (?1\1\t)


    • Space to TAB (Leading) :

      • SEARCH ^(?:\x20|\t)+ and REPLACE $0#

      • SEARCH (?:\x20{4}|\x20{0,3}\t)(?=.*#)#? and REPLACE \t

    Here are all the combinations of blank characters which may occur at the beginning of lines :

    	<   =   0 sp + \t   ->   \t
     	<   =   1 sp + \t   ->   \t
      	<   =   2 sp + \t   ->   \t
       	<   =   3 sp + \t   ->   \t
        <   =   4 sp        ->   \t
    

    Notes :

    • In the first S/R, we look, at the beginning of each line, for any range of space or tabulation character(s) and simply rewrite this range, $0 , followed by a # symbol. Note that you may use any symbol absent from your file, which will be used as a mark

    • In the second S/R, every combination of leading blank characters which is followed, further on, by the # character is changed, along with the mark character, into a single tabulation character !
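
    Outside Notepad++, the leading-blanks case can also be handled in a single pass without any mark character. The Python sketch below is an alternative technique, not a port of the two S/R above. It assumes a default tab size of 4, and it assumes that leftover leading spaces (fewer than a full tab stop) should simply be kept as spaces. The function name is made up.

```python
def leading_blanks_to_tabs(line, tab_size=4):
    body = line.lstrip(" \t")
    lead = line[: len(line) - len(body)]
    # Number of columns covered by the leading spaces/tabs
    width = len(lead.expandtabs(tab_size))
    tabs, rest = divmod(width, tab_size)
    return "\t" * tabs + " " * rest + body

for sample in ("    x", "  \tx", "     x", " \t  x y"):
    print(repr(sample), "->", repr(leading_blanks_to_tabs(sample)))
```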

    Best Regards,

    guy038

    PS :

    Let’s consider the special branch reset construct, below :

    (a)(?|x(y)z|(p(q)r)|(t)u(v))(z)

    We have :

    • Group 1 = a

    • Group 2 = y or p(q)r or t

    • Group 3 = None or q or v

    • Group 4 = z

    With a classical list of alternatives, as below :

    (a)(?:x(y)z|(p(q)r)|(t)u(v))(z)

    We would get, instead :

    • Group 1 = a

    • Group 2 = None or y

    • Group 3 = None or p(q)r

    • Group 4 = None or q

    • Group 5 = None or t

    • Group 6 = None or v

    • Group 7 = z
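
    The classical numbering is easy to check in Python. The standard re module does not support the branch reset construct (?|...), so this sketch only covers the (?:...) form; the three subject strings are made up so that each alternative matches once.

```python
import re

# Classical alternation: every pair of parentheses gets its own group number.
pattern = re.compile(r"(a)(?:x(y)z|(p(q)r)|(t)u(v))(z)")

results = {}
for subject in ("axyzz", "apqrz", "atuvz"):
    results[subject] = pattern.fullmatch(subject).groups()
    print(subject, results[subject])
```

    Groups belonging to the alternatives that did not take part in the match come back as None, which corresponds to the “None or …” entries in the list above.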



  • @guy038 said:

    the physical length of the tabulation character depends on its position

    When I said “Replace \t with however many spaces you want a tab character to be” I was considering leading tab characters only, which is really the only consistent way to use tab characters (IMO). I tend to stay away from tab characters in general, but when I work on projects where they are used, it is always in the “tab-indent, space-align” style. This means that tab characters are the only whitespace allowed before the first non-whitespace character on a line, and spaces are the only valid whitespace after it. With this usage, my original Replace operation is valid: moving to the next tab-stop always takes the full # of spaces (if the conversion were to be done).

    due to the tabulation’s behaviour, which always stops at column 4*n whatever n > 0

    Can you explain this? I don’t understand what it means.



  • Hi, @scott-sumner,

    Indeed, using leading tabulation characters, and space characters everywhere else in lines, is the sensible approach while coding :-D And, therefore, your solution for leading tabs is quite correct !

    As for my assertion :

    due to the tabulation’s behaviour, which always stops at column 4*n whatever n > 0

    I pointed out the fact that a tabulation character always ends at column 4, 8, 12, 16,…, that is to say, on a 4*n position !

    Moreover, if c is the column, > 0, where the tabulation character begins, its physical length l, between 1 and 4, is given by the formula l = 4 - ((c-1) % 4)

    BTW, I just realized that all this works only if the tab size value, in Settings > Preferences… > Language > Tab Settings, is 4. In the general case, if tab size = s, the physical length l of a tabulation beginning at column c is : l = s - ((c-1) % s), with % standing for the modulo operation !

    Cheers,

    guy038

    P.S : I forgot to give an example :

    Let’s suppose the tab size is 7 and a tabulation character begins at column c = 157. Its length is then l = 7 - ((157-1) % 7) = 5 and it ends at column 157 + 5 - 1 = 161, which is, indeed, a multiple of 7 ( 161 = 7 * 23 )
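
    The formula is short enough to check in a few lines of Python. The function name is made up for this sketch, and columns are counted from 1, as in the post.

```python
def tab_length(column, tab_size=4):
    """Physical length of a tab starting at `column` (columns counted from 1)."""
    return tab_size - ((column - 1) % tab_size)

# The example above: tab size 7, tabulation beginning at column 157
length = tab_length(157, 7)
end_column = 157 + length - 1
print(length, end_column, end_column % 7 == 0)

# With the default tab size of 4, the length cycles through 4, 3, 2, 1
first_columns = [tab_length(c) for c in range(1, 6)]
print(first_columns)
```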



  • @guy038

    Thanks…what confused me was that it sounded like you were saying that tabstops were always 4 columns…your most-recent post clears that up.

