Search multiple words in xml
-
Hi all,
I have an XML file with 2,600,000 lines that describes a number of products. Each product is described as follows:
<PRODUCT mode="new">
<SUPPLIER_PID>285129</SUPPLIER_PID>
<PRODUCT_DETAILS>
<DESCRIPTION_SHORT lang="pol"></DESCRIPTION_SHORT>
<DESCRIPTION_LONG lang="pol"></DESCRIPTION_LONG>
<EAN></EAN>
<SUPPLIER_ALT_PID></SUPPLIER_ALT_PID>
<MANUFACTURER_PID></MANUFACTURER_PID>
<MANUFACTURER_NAME></MANUFACTURER_NAME>
<MANUFACTURER_TYPE_DESCR></MANUFACTURER_TYPE_DESCR>
<SPECIAL_TREATMENT_CLASS type="NOT_RELEVANT">NONE</SPECIAL_TREATMENT_CLASS>
<KEYWORD lang="pol"></KEYWORD>
</PRODUCT_DETAILS>
<PRODUCT_ORDER_DETAILS>
<ORDER_UNIT>C62</ORDER_UNIT>
<CONTENT_UNIT>C62</CONTENT_UNIT>
<NO_CU_PER_OU>1</NO_CU_PER_OU>
<PRICE_QUANTITY>1</PRICE_QUANTITY>
<QUANTITY_MIN>1</QUANTITY_MIN>
<QUANTITY_INTERVAL>1</QUANTITY_INTERVAL>
</PRODUCT_ORDER_DETAILS>
<PRODUCT_PRICE_DETAILS>
<DATETIME>
<DATE>2016-01-26</DATE>
</DATETIME>
<PRODUCT_PRICE>
<PRICE_AMOUNT></PRICE_AMOUNT>
<PRICE_CURRENCY>EUR</PRICE_CURRENCY>
<TAX>0.19</TAX>
<LOWER_BOUND>1</LOWER_BOUND>
</PRODUCT_PRICE>
</PRODUCT_PRICE_DETAILS>
</PRODUCT>
The line <SUPPLIER_PID>285129</SUPPLIER_PID> gives the product number. I need an easy way to find several hundred product numbers in this file and remove everything that belongs to each of them (everything between <PRODUCT mode="new"> and </PRODUCT>). Product numbers are never repeated in my XML file, so I would like to do this automatically.
Is there any way of doing this?
-
Not sure if I got this right: you are trying to remove the PRODUCT elements for specific SUPPLIER_PID values? If so, try this:
- Go to Search->Replace
- Search for
<PRODUCT mode="new">\R<SUPPLIER_PID>285129</SUPPLIER_PID>.*?</PRODUCT>
- Replace with nothing (leave the replacement field empty)
- Select Regular Expressions. Make sure “. matches \r and \n” is checked
- Hit “Replace all”
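The steps above handle one product number per run. For several hundred numbers, the same search expression can carry an alternation such as (?:285129|285130|...). Below is a minimal Python sketch that builds that expression from a list of numbers so it can be pasted into the search field. The function name build_search_pattern and the extra numbers 285130 and 285131 are made up for illustration, and the sketch assumes the product numbers are purely numeric.

```python
def build_search_pattern(product_ids):
    """Return the search expression from the steps above, extended to many IDs."""
    ids = [pid.strip() for pid in product_ids if pid.strip()]
    # Assumption: product numbers are plain digits, so no regex escaping is needed.
    if not ids or not all(pid.isdigit() for pid in ids):
        raise ValueError("expected a non-empty list of numeric product numbers")
    alternation = "|".join(ids)
    # The result is meant for the editor's search field, not for Python's re
    # module, which does not understand \R.
    return (
        r'<PRODUCT mode="new">\R<SUPPLIER_PID>(?:'
        + alternation
        + r")</SUPPLIER_PID>.*?</PRODUCT>"
    )


# 285130 and 285131 are hypothetical example numbers.
print(build_search_pattern(["285129", "285130", "285131"]))
```

If the editor limits the length of the search field, the list can be split into smaller batches and the replacement run once per batch.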
But if you have to do this kind of job on a regular basis, you may want to look for a tool made specifically for manipulating XML, for example one that selects elements by XPath.
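As one possible example of such a tool, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It assumes the PRODUCT elements carry no XML namespace and that the whole file fits in memory. The names remove_products and clean_file are invented for this sketch, as are any file paths passed to them.

```python
import xml.etree.ElementTree as ET


def remove_products(root, unwanted_ids):
    """Remove every PRODUCT whose SUPPLIER_PID is in unwanted_ids; return the count."""
    removed = 0
    # ElementTree elements do not know their parent, so visit every element
    # as a possible parent and inspect its direct PRODUCT children.
    for parent in list(root.iter()):
        for product in parent.findall("PRODUCT"):
            pid = (product.findtext("SUPPLIER_PID") or "").strip()
            if pid in unwanted_ids:
                parent.remove(product)
                removed += 1
    return removed


def clean_file(source_path, target_path, ids_path):
    """File-based wrapper; all three paths are placeholders chosen by the caller."""
    # The ID file is assumed to hold one product number per line.
    with open(ids_path, encoding="utf-8") as handle:
        unwanted = {line.strip() for line in handle if line.strip()}
    tree = ET.parse(source_path)
    removed = remove_products(tree.getroot(), unwanted)
    tree.write(target_path, encoding="utf-8", xml_declaration=True)
    return removed
```

Calling clean_file with the catalogue, an output path, and a text file of product numbers writes a cleaned copy and leaves the original untouched. Note that ElementTree does not preserve comments or a DOCTYPE declaration when it writes the file back, so the output is worth checking before it replaces the original.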