Remove field of information in delimited text
-
Thank god for this forum.
I have an exported report that I’m turning into a delimited load file. I couldn’t limit the fields exported on the front end so I am having to do it on the back end.
Essentially, I want to delete anything between <field id="“1033806” and <field>. It’s a long text field and not consistent in length so I can’t just block and remove. It’s wreaking havoc on my load file.
Any easy way to do this?
All the best,
-Bowe. -
Hello @Bowe-Kurowski,
may I ask if you are 100% sure you want it that way?
The reason of my surprise is that it looks like xml content on the first,
but on the other hand I’m missing something.
Are your double quotes correct?<field id="“1033806”
and I would assume that it would look like (… represent the not needed text)
<field id="1033806" ... ></field>
It could be solved by using a regular expression find and replace but
therefore the data you provided need to be confirmed to be correct
before providing a possible solution.Cheers
Claudia -
Hi Claudia… Here’s the actual cut and paste -
<field id=““1003668"” type=”“4"” name=““Extracted Text”” formatstring=“”“”><oldValue></oldValue></field>
Typically there is something between the “oldValue” markers. But I have like 12,000 records, so I can’t do it by hand.
This will help a TON! Thank you so much Claudia. I feel like such an idiot with this and can’t figure out the actual syntax to what I know is a very simple solution.
-
From :
<field id="“1003668"” type="“4"” name="“Extracted Text”984849498484" formatstring="984984984984984"><oldValue></oldValue></field>i want this<field id="“1003668"” type="“4"” name="“Extracted Text”787557" formatstring="894984984984984"><oldValue></oldValue></field>and that? <field id="“1003668"” type="“4"” name="“Extracted Text”984849498484" formatstring="984984984984984"><oldValue></oldValue></field>also this<field id="“1003668"” type="“4"” name="“Extracted Text”787557" formatstring="894984984984984"><oldValue></oldValue></field>and more of that
To :
i want thisand that? also thisand more of that
Ctrl+F > Find what :
<field id="“1003668"”.*?</field>
, Replace with :leave empty
Dont forget to check Regular expression in Search mode, click replace all -
Hi Bowe,
I would do it like this
given that string is
<field id="“1003668"” type="“4"” name="“Extracted Text”" formatstring=""""><oldValue></oldValue></field>
search and replace with regular expression
find what:(<field id="“1003668")(.*)(<\/field>) replace with:\1\3
would result in
<field id="“1003668"</field>
but, as said, this results in an incorrect formatted xml string as the > is missing.
To solve this,
find what:(<field id="“1003668")(.*)(>)(.+)(<\/field>) replace with:\1\3\5
which results in
<field id="“1003668"></field>
Ths id value looks still strange but …
Cheers
Claudia -
Thanks… let me try playing around with this. I’ve tried to use both Claudia’s as well as Greenzest’s initial string in Notepad++, but it’s not finding it. Maybe I need to play around with the syntax. But Greenzest… you are spot on with what I’m trying to do. I just want to be able to cut and remove entire fields!
-Bowe.
-
Hello Bowe,
I’m confused first you said you want to delete everything between A and B,
which from my point of view means you wanna keep A and B and now you’re
saying you wanna delete the whole field ??What did I miss?
Cheers
Claudia -
Hi Claudia,
You are correct. In re-reading my initial question, I did ask specifically for that. However, I meant that I was looking for the front part of the string and the last part of the string, and then deleting the entire string altogether. I didn’t do a great job on communicating what I was looking for. So your answer was exactly what the question asked. You did not miss anything. It was completely my error in communicating :)
-
:-) thought that I didn’t understand the question, again - but then greenzest
solution should work.Cheers
Claudia