PythonScript to replace TextFX Rewrap function?
-
@Ekopalypse said in PythonScript to replace TextFX Rewrap function?:
oopsss - too late :-D
Sorry. ;-)
I think it’s a “good enough” first implementation; if it gets refined, I will just update the link.
-
So I spun up an 8.3.3-32bit with TextFX to see what it did.
With input
this is a group of short lines that will be merged into a line that's around 72 char long
the TextFX “rewrap” will turn it into
this is a group of short lines that will be merged into a line that's
around 72 char long
but your script will do
this is a group of short lines that will be merged into a line that's around 72 char long
When I looked at the wrap function, I thought maybe changing to replace_whitespace=True … but that appears to do 1 space for each newline character, so CRLF becomes two spaces:
123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x
this is a group of short lines that will be merged into a line that's around 72 char long
(number line added to make the double space obvious)
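The double space is easy to reproduce outside Notepad++. A minimal sketch, assuming only the standard-library textwrap module (the sample strings are made up for illustration): with replace_whitespace=True every whitespace character becomes one space, so a two-character CRLF ends up as two spaces while a lone LF ends up as one.

```python
from textwrap import wrap

# Made-up sample text; only the line ending differs between the two strings.
crlf_text = "short line one\r\nshort line two"
lf_text = "short line one\nshort line two"

# replace_whitespace=True maps each whitespace character to a single space,
# so "\r\n" turns into two spaces and "\n" into one.
crlf_lines = wrap(crlf_text, 72, replace_whitespace=True)
lf_lines = wrap(lf_text, 72, replace_whitespace=True)

print(repr(crlf_lines))
print(repr(lf_lines))
```

textwrap does not collapse runs of spaces in the middle of a line, which is why the extra space survives the wrap.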
Is there an option that will collapse \h*\R to a single space rather than n spaces for each horizontal or vertical character?
-
@PeterJones said in PythonScript to replace TextFX Rewrap function?:
this is a group of short lines that will be merged into a line that’s
around 72 char long
ahh - you see, I always “split” the text and never actually “join” lines.
There is a fill method that does the “joining” - let’s see if that needs to be called in advance.
-
It may be worth pointing out that, if you always want to hard-wrap lines at the same column, THIS THREAD treats the topic fairly exhaustively and requires no external scripting.
-
@Ekopalypse said in PythonScript to replace TextFX Rewrap function?:
There is a fill method that does the “joining” - let’s see if that needs to be called in advance.
It might not have been the way you thought of, but I used an re.sub() in the rewrap function which first merges the equivalent of \h*\R into a single space. With that, and using replace_whitespace=True, the behavior of the script seems to match the essential nature of the Rewrap command from TextFX:

    # encoding=utf-8
    """
    PythonScript replacement of TextFX>Edit>Rewrap
    https://community.notepad-plus-plus.org/post/78161
    Author: @Ekopalypse , with input from @PeterJones
    """
    from Npp import editor, notepad
    from textwrap import wrap
    import re

    def rewrap(text, pos):
        joined = re.sub(r'\h*(\r\n|\r|\n)', " ", text)
        return wrap(joined, pos, expand_tabs=False, replace_whitespace=True, break_on_hyphens=False)

    def main():
        pos = int(notepad.prompt('Wrap at position:', 'ReWrap', '72'))
        if pos < 8 or pos > 2048:
            pos = 72

        start, end = editor.getUserLineSelection()
        start_pos = editor.positionFromLine(start)
        end_pos = editor.getLineEndPosition(end)

        rewrapped = rewrap(editor.getRangePointer(start_pos, end_pos-start_pos), pos)
        eol = {0:'\r\n', 1:'\r', 2:'\n'}[editor.getEOLMode()]
        editor.setTarget(start_pos, end_pos)
        editor.beginUndoAction()
        editor.replaceTarget(eol.join(rewrapped))
        editor.endUndoAction()

    main()
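The join-then-wrap idea can also be tried as plain Python, without the Npp module. This is only a sketch under assumptions: the helper name join_then_wrap and the sample text are invented here, and horizontal whitespace is spelled [ \t]* because Python's re module has no \h class (so this covers spaces and tabs only).

```python
import re
from textwrap import wrap

# Assumption: "[ \t]*" stands in for "\h*", and the alternation for "\R".
EOL_WITH_TRAILING_BLANKS = re.compile(r'[ \t]*(?:\r\n|\r|\n)')

def join_then_wrap(text, width=72):
    # First merge each line break (plus blanks before it) into one space...
    joined = EOL_WITH_TRAILING_BLANKS.sub(' ', text)
    # ...then let textwrap split the single long line at the wanted width.
    return wrap(joined, width, expand_tabs=False,
                replace_whitespace=True, break_on_hyphens=False)

# Made-up input: short CRLF-terminated lines, one with trailing spaces.
sample = ("this is a group\r\nof short lines  \r\nthat will be merged\r\n"
          "into a line that's\r\naround 72 char long")

for line in join_then_wrap(sample):
    print(line)
```

If a single string is preferred over a list, textwrap.fill returns the same lines already joined with "\n".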
-
@Ekopalypse My use case is between the two. I have texts that consist of long and short lines, sometimes with excessive white space, that I want to normalize to 78 character text.
-
@PeterJones said in PythonScript to replace TextFX Rewrap function?:

    # encoding=utf-8
    """
    PythonScript replacement of TextFX>Edit>Rewrap
    https://community.notepad-plus-plus.org/post/78161
    Author: @Ekopalypse , with input from @PeterJones
    """
    from Npp import editor, notepad
    from textwrap import wrap
    import re

    def rewrap(text, pos):
        joined = re.sub(r'\h*(\r\n|\r|\n)', " ", text)
        return wrap(joined, pos, expand_tabs=False, replace_whitespace=True, break_on_hyphens=False)

    def main():
        pos = int(notepad.prompt('Wrap at position:', 'ReWrap', '72'))
        if pos < 8 or pos > 2048:
            pos = 72

        start, end = editor.getUserLineSelection()
        start_pos = editor.positionFromLine(start)
        end_pos = editor.getLineEndPosition(end)

        rewrapped = rewrap(editor.getRangePointer(start_pos, end_pos-start_pos), pos)
        eol = {0:'\r\n', 1:'\r', 2:'\n'}[editor.getEOLMode()]
        editor.setTarget(start_pos, end_pos)
        editor.beginUndoAction()
        editor.replaceTarget(eol.join(rewrapped))
        editor.endUndoAction()

    main()

I tried that code, but it does not keep the paragraphs separated. In TextFX, Rewrap only joined two lines if there was no blank line between them. One or more blank lines were treated as a paragraph break and converted to a single blank line to keep the paragraphs intact. Can the code be changed to do that?
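The requested behavior (blank lines mark paragraph breaks, each break becomes exactly one blank line, everything else is joined and rewrapped) can be sketched in plain Python for testing outside Notepad++. The function name rewrap_paragraphs, its defaults, and the choice to collapse every run of whitespace inside a paragraph are assumptions made for illustration; none of this is taken from the TextFX source.

```python
import re
from textwrap import wrap

def rewrap_paragraphs(text, width=72, eol='\n'):
    # Normalise CRLF and CR to LF so blank lines are easy to detect.
    normalised = re.sub(r'\r\n|\r', '\n', text)
    # One or more blank (or blank-looking) lines separate paragraphs.
    paragraphs = re.split(r'\n(?:[ \t]*\n)+', normalised.strip('\n'))
    wrapped = []
    for paragraph in paragraphs:
        # Assumption: collapse all whitespace runs, including single
        # line breaks, to one space before wrapping.
        flat = ' '.join(paragraph.split())
        wrapped.append(eol.join(wrap(flat, width, break_on_hyphens=False)))
    # Exactly one blank line between paragraphs in the result.
    return (eol + eol).join(wrapped)

# Made-up sample: two paragraphs separated by two blank lines.
sample = "one two\r\nthree\r\n\r\n\r\nfour   five\r\nsix"
print(repr(rewrap_paragraphs(sample, 72, '\r\n')))
```

Because the sketch works on whole paragraphs, it also normalizes texts that mix long and short lines.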
-
I will try to understand what the TextFX code does and port it to Python accordingly. I will post updated code later today or tomorrow at the latest.
-
I think this accomplishes your goal:

    # encoding=utf-8
    """
    PythonScript replacement of TextFX>Edit>Rewrap
    https://community.notepad-plus-plus.org/post/78177
    Author: @Ekopalypse , with input from @PeterJones
    """
    from Npp import editor, notepad
    from textwrap import wrap
    import re

    def rewrap(text, pos, eol):
        paragraphed = re.sub(eol+eol, u'\u00B6', text)
        joined = re.sub(r'\h*(\r\n|\r|\n)', " ", paragraphed)
        unparagraphed = re.sub(u'\u00B6', eol+eol, joined)
        retlist = []
        for linetext in unparagraphed.splitlines():
            if linetext == '':
                retlist.append('')
            for partial in wrap(linetext, pos, expand_tabs=False, replace_whitespace=False, break_on_hyphens=False):
                retlist.append(partial)
        return retlist

    def main():
        pos = int(notepad.prompt('Wrap at position:', 'ReWrap', '72'))
        if pos < 8 or pos > 2048:
            pos = 72

        start, end = editor.getUserLineSelection()
        start_pos = editor.positionFromLine(start)
        end_pos = editor.getLineEndPosition(end)

        eol = {0:'\r\n', 1:'\r', 2:'\n'}[editor.getEOLMode()]
        rewrapped = rewrap(editor.getRangePointer(start_pos, end_pos-start_pos), pos, eol)
        editor.setTarget(start_pos, end_pos)
        editor.beginUndoAction()
        editor.replaceTarget(eol.join(rewrapped))
        editor.endUndoAction()

    main()

The input text selection

    these are two really long paragraphs that have lots and lots and lots
    and lots and lots and lots and lots and lots of words

    this is the second of the really long paragraphs that have lots and lots
    and lots and lots and lots and lots and lots and lots of words

when run with a value of 16 will end up like:

    these are two
    really long
    paragraphs that
    have lots and
    lots and lots
    and lots and
    lots and lots
    and lots and
    lots of words

    this is the
    second of the
    really long
    paragraphs that
    have lots and
    lots and lots
    and lots and
    lots and lots
    and lots and
    lots of words

and that output, when selected and run again with 72, will end up back as the original.
-
@PeterJones the second version of the code does not work if I select a whole text. If I select only one paragraph, it does work, but only for English texts. As soon as I run it against a German text with umlauts (ä, ö, ü, etc.) (UTF-8 or UTF-8-BOM), it converts the umlauts to strange characters and paragraph breaks.
I tried it with this example:
Um die sechste Morgenstunde des 3. Juli dieses Jahres war ich gerade, nichts Böses ahnend, dabei, meine Petunien zu begießen, als ich einen großen, bartlosen, blonden jungen Mann bei mir eintreten sah, geschmückt mit einer goldenen Brille, das Haupt bedeckt mit einer deutschen Schirmmütze. Trübselig, wie ein Segel längs des Mastes, wenn der Wind sich gelegt hat, baumelte ein weiter Überzieher aus einem sehr dauerhaften englischen Stoff um seine Person. Handschuhe trug er nicht; seine rohledernen Schuhe hatten derartig mächtige, breite Sohlen, daß deren Rand den Fuß mit einer Art Trottoir umgab. In einer Seitentasche, ungefähr über dem Herzen, machte sich, unter dem glänzenden Stoff vage ihre Form abzeichnend, eine große Porzellanpfeife bemerkbar. Nicht einmal im Traume wäre ich darauf verfallen, den Unbekannten zu fragen, ob er an einer der deutschen Universitäten studiert habe. Ich setzte meine Gießkanne hin und begrüßte ihn sofort auf deutsch mit einem schönen »Guten Morgen!«. »Mein Herr«, erwiderte er französisch, wenn auch mit einem erbärmlichen Akzent, »ich heiße Hermann Schultz; ich habe gerade einige Monate in Griechenland verbracht, wo übrigens Ihr Buch mein ständiger Reisebegleiter war.«
-
An updated version, tested with UTF-8 and ANSI:

    def rewrap(text, pos, eol):
        paragraphed = re.sub(eol+eol, '\0xB6', text)
        joined = re.sub(r'\s*(\r\n|\r|\n)', " ", paragraphed)
        unparagraphed = re.sub('\0xB6', eol+eol, joined)
        retlist = []
        for linetext in unparagraphed.splitlines():
            if linetext == '':
                retlist.append('')
            for partial in wrap(linetext, pos, expand_tabs=False, replace_whitespace=False, break_on_hyphens=False):
                retlist.append(partial.lstrip())
        return retlist

    def main():
        _prompt = "Wrap at position:"
        _title = 'ReWrap'
        default_pos = 78
        if editor.getSelectionEmpty():
            _prompt = "ATTENTION!! - Since nothing is selected, the WHOLE document is rewrapped.\n" + _prompt
            _title = "ATTENTION!! - " + _title

        pos = notepad.prompt(_prompt, _title, default_pos)
        if pos is None:
            return  # cancelled
        else:
            pos = int(pos)

        start, end = editor.getUserLineSelection()
        start_pos = editor.positionFromLine(start)
        end_pos = editor.getLineEndPosition(end)

        eol = {0:'\r\n', 1:'\r', 2:'\n'}[editor.getEOLMode()]
        rewrapped = rewrap(editor.getRangePointer(start_pos, end_pos-start_pos), pos, eol)
        editor.setTarget(start_pos, end_pos)
        editor.beginUndoAction()
        editor.replaceTarget(eol.join(rewrapped))
        editor.endUndoAction()

    main()

The use of the Unicode B6 seemed to confuse the Python 2 re engine; furthermore, Python's re does not know \h, so \s had to be used instead.
-
Sorry - the complete script:

    # encoding=utf-8
    """
    PythonScript replacement of TextFX>Edit>Rewrap
    https://community.notepad-plus-plus.org/post/78262
    Author: @Ekopalypse, @PeterJones
    """
    from Npp import editor, notepad
    from textwrap import wrap
    import re

    def rewrap(text, pos, eol):
        # Mark paragraph breaks (two consecutive EOLs) with a placeholder.
        # '\0xB6' is a NUL character followed by the literal text "xB6".
        paragraphed = re.sub(eol+eol, '\0xB6', text)
        # Merge every remaining line break, plus whitespace before it, into one space.
        joined = re.sub(r'\s*(\r\n|\r|\n)', " ", paragraphed)
        # Restore the paragraph breaks.
        unparagraphed = re.sub('\0xB6', eol+eol, joined)
        retlist = []
        for linetext in unparagraphed.splitlines():
            if linetext == '':
                # Keep the blank line between paragraphs.
                retlist.append('')
            for partial in wrap(linetext, pos, expand_tabs=False, replace_whitespace=False, break_on_hyphens=False):
                retlist.append(partial.lstrip())
        return retlist

    def main():
        _prompt = "Wrap at position:"
        _title = 'ReWrap'
        default_pos = 78
        if editor.getSelectionEmpty():
            _prompt = "ATTENTION!! - Since nothing is selected, the WHOLE document is rewrapped.\n" + _prompt
            _title = "ATTENTION!! - " + _title

        pos = notepad.prompt(_prompt, _title, default_pos)
        if pos is None:
            return  # cancelled
        else:
            pos = int(pos)

        # Work on whole lines, from the first to the last selected line.
        start, end = editor.getUserLineSelection()
        start_pos = editor.positionFromLine(start)
        end_pos = editor.getLineEndPosition(end)

        # Use the document's own EOL style when joining the wrapped lines.
        eol = {0:'\r\n', 1:'\r', 2:'\n'}[editor.getEOLMode()]
        rewrapped = rewrap(editor.getRangePointer(start_pos, end_pos-start_pos), pos, eol)
        editor.setTarget(start_pos, end_pos)
        editor.beginUndoAction()
        editor.replaceTarget(eol.join(rewrapped))
        editor.endUndoAction()

    main()
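Two details of the script can be checked in isolation with plain Python: what the '\0xB6' placeholder actually contains, and how the re module reacts to \h. The helper names below are invented for illustration, and the comment about Python 2 reflects my understanding of its documented handling of unknown escapes rather than anything stated in the thread.

```python
import re

# Copied from the script: \0 is the NUL escape, followed by the text "xB6".
SENTINEL = '\0xB6'

def sentinel_codepoints():
    # List the code point of each character in the placeholder.
    return [hex(ord(ch)) for ch in SENTINEL]

def re_accepts_backslash_h():
    # Python 2's re treats the unknown escape \h as a plain "h";
    # current Python 3 releases raise re.error instead.
    try:
        re.compile(r'\h')
    except re.error:
        return False
    return True

print(sentinel_codepoints())
print(re_accepts_backslash_h())
```

So the placeholder is a four-character marker starting with NUL, not the pilcrow sign, which works because that sequence is very unlikely to occur in real text. If Python 2 does read \h as a literal h, the earlier \h*(\r\n|\r|\n) pattern would presumably also have swallowed any h characters directly before a line break, which is another reason to prefer \s or [ \t].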
-
@Ekopalypse Yes, this version seems to work fine! Many thanks!