@Doctor-Rashir said in How to Print Pretty with missing close tags.:
I am looking at a Quicken QFX log file that is in a sort of XML type format. The format has many missing End tags so this causes the XML Tools - Pretty Print to indent nearly forever.
Is there a way to align the Start and End tags that are present?
XML Tools is designed to work with well-formed XML. If the input is not well-formed (i.e., it has unclosed tags), it’s just too much of an edge case. It’s doubtful there’s any toolmaker out there who could figure out a way to “pretty print” a seemingly-random mixture of closed and unclosed tags in any meaningful way.
If you were to unindent everything (Ctrl+A, then Shift+TAB until the indentation is gone, or search for ^\h+ and replace with nothing), and if you knew in advance which tags (like <SONRQ>) had closing pairs, you could use the zone-of-text regex formula from our FAQ, as:
FIND = (?-si:<SONRQ\b|(?!\A)\G)(?s-i:(?!</SONRQ\b).)*?\K(?-si:^(?!\h*</SONRQ))
REPLACE = \t
REPLACE ALL
If I do three steps: unindent, formula(SONRQ) and formula(SIGNONMSGSRQV1), then with your example data, I get
<OFX>
<SIGNONMSGSRQV1>
    <SONRQ>
        <DTCLIENT>20250520104016.123[-7:MST]
        <USERID>anonymous00000000000000000000000
        <USERPASS>X
        <GENUSERKEY>N
        <LANGUAGE>ENG
        <APPID>QWIN
        <APPVER>2700
    </SONRQ>
</SIGNONMSGSRQV1>
<INTU.BRANDMSGSRQV1>
<INTU.BRANDTRNRQ>
<TRNUID>19FFC8F0-7EF9-1000-BC8D-909811990026
<INTU.BRANDRQ>

I don’t know how many other closed tags there are in your file, so I don’t know whether that’s practical for you or not. But it’s the best I can come up with for now without invoking a full-on programming language. At that point, it could be done on the contents of the Notepad++ window using a plugin like PythonScript, or it could just be done at the command line with whatever programming language you wanted to use, without needing the file to be open in Notepad++ (which would make it off-topic here).
I did try to make use of a numbered or named capture group in the BSR section and use a backreference to make the BSR and FR invoke those (see the FAQ for the meaning of BSR / ESR / FR), rather than having to know in advance the names of all the tags… but I couldn’t get those backreference versions to work.
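For anyone who does go the programming-language route, here is a minimal Python sketch of the same idea that does not need the tag names in advance. It is not the FAQ regex (Python's re module has no \G or \K), just a rough equivalent under some assumptions: each line holds at most one tag at its start, and any tag name that has a closing tag somewhere in the file is treated as a container everywhere. The function name indent_closed_tags is made up for this illustration.

```python
import re

# Assumed tag-name shape: letters, digits, underscore, dot (e.g. INTU.BRANDRQ).
OPEN_TAG = re.compile(r"<([A-Za-z0-9_.]+)>")
CLOSE_TAG = re.compile(r"</([A-Za-z0-9_.]+)>")


def indent_closed_tags(text, indent="\t"):
    """Indent only the content of tags that have a closing tag in the text."""
    # Step 1: unindent everything and drop blank lines.
    lines = [line.strip() for line in text.splitlines() if line.strip()]

    # Step 2: collect the names of every tag that is closed somewhere.
    closed = {m.group(1) for line in lines for m in CLOSE_TAG.finditer(line)}

    depth = 0
    out = []
    for line in lines:
        # A line that starts with a closing tag steps back out one level.
        if CLOSE_TAG.match(line):
            depth = max(depth - 1, 0)
        out.append(indent * depth + line)

        # A line that starts with an opening tag known to be closed later
        # (and not closed on this same line) steps in one level.
        opened = OPEN_TAG.match(line)
        if opened and opened.group(1) in closed and not CLOSE_TAG.search(line):
            depth += 1
    return "\n".join(out)


if __name__ == "__main__":
    sample = """<OFX>
<SIGNONMSGSRQV1>
<SONRQ>
<DTCLIENT>20250520104016.123[-7:MST]
<USERPASS>X
</SONRQ>
</SIGNONMSGSRQV1>
<INTU.BRANDMSGSRQV1>
<TRNUID>19FFC8F0-7EF9-1000-BC8D-909811990026"""
    print(indent_closed_tags(sample))
```

Run from the command line, this prints the sample with only the SIGNONMSGSRQV1 and SONRQ contents indented. Inside Notepad++ with PythonScript, something along the lines of editor.setText(indent_closed_tags(editor.getText())) should apply it to the current document, but I haven't tested that, so treat it as an assumption.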