Random line search
- 
 I am using data from OpenStreetMap, USFS, and BLM to create KML files of roads, trails, and POIs for Google Earth Pro. Google Earth tends to choke as the Myplaces.kml file gets very large, so much of what I need to do is deleting or revising sections to make the files smaller. Listed below is a sample of my problem. In this case I was using the line <SimpleData name="surface">asphalt</SimpleData> to change the color and width of the line based on the surface type being “asphalt”. The problem for me is that the “surface” designation is not always on the same line, and the number of lines between the <Placemark> statements can vary considerably. It is not unusual for me to have to write 4-8 different regexes to handle this problem. Another scenario is that I delete a lot of the Placemark groups based on a test of a data field within the section. I have been working with Notepad++ and regexes for less than 6 months, so I would be eternally grateful if someone can provide me some tips on how to handle this scenario.
 <Placemark>
 <Style><LineStyle><color>ff95fcff</color><width>2</width></LineStyle></Style>
 <ExtendedData><SchemaData schemaUrl="#Hwy_Primary__UT_">
 <SimpleData name="highway">primary</SimpleData>
 <SimpleData name="maxspeed">55 mph</SimpleData>
 <SimpleData name="surface">asphalt</SimpleData>
 <SimpleData name="ref">UT 21</SimpleData>
 </SchemaData></ExtendedData>
 <LineString><coordinates>-113.0009,38.3766 -113.00039,38.3758 -112.99967,38.37472 -112.99803,38.37212</coordinates></LineString>
 </Placemark><Placemark> 
 <name>South 100 East Street</name>
 <Style><LineStyle><color>ff95fcff</color><width>2</width></LineStyle></Style>
 <ExtendedData><SchemaData schemaUrl="#Hwy_Primary__UT_">
 <SimpleData name="highway">primary</SimpleData>
 <SimpleData name="tiger:name_type">St</SimpleData>
 <SimpleData name="tiger:name_base">State Route 21;100 East;100 East;100 East</SimpleData>
 <SimpleData name="tiger:county">Beaver, UT</SimpleData>
 <SimpleData name="maxspeed">55 mph</SimpleData>
 <SimpleData name="surface">asphalt</SimpleData>
 <SimpleData name="smoothness">excellent</SimpleData>
 <SimpleData name="ref">UT 21</SimpleData>
 </SchemaData></ExtendedData>
 <LineString><coordinates>-113.00662,38.38557 -113.00613,38.38482 -113.00467,38.38253</coordinates></LineString>
 </Placemark>
- 
 The Generic Regex Formulas => Replacing Text in a Specific Zone formula may help you. For changing color/width based on surface being asphalt, you might want to use a combo of positive and negative lookaheads that say that after the <LineStyle> section, it must contain "surface">asphalt< before it hits </Placemark>. My first attempt (not tried) would be something like
 (?s)<LineStyle>.*?</LineStyle>(?=((?!</Placemark).)*"surface">asphalt<)
 … actually, that will probably extend the first group a lot farther than expected, so might want to use
 (?s)<LineStyle>((?!</Placemark).)*?</LineStyle>(?=((?!</Placemark).)*"surface">asphalt<)
 instead.
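 To try the second expression outside Notepad++, here is a minimal Python sketch. It assumes Python's re module treats these lookaheads the same way the Notepad++ engine does for this particular pattern. The sample is trimmed from the question; the second Placemark ("Gravel spur", surface gravel) and the replacement color and width are made up for illustration.

```python
import re

# Two Placemarks: the first is trimmed from the question, the second
# (gravel) is a hypothetical record that must NOT be restyled.
KML = '''<Placemark>
<Style><LineStyle><color>ff95fcff</color><width>2</width></LineStyle></Style>
<ExtendedData><SchemaData schemaUrl="#Hwy_Primary__UT_">
<SimpleData name="highway">primary</SimpleData>
<SimpleData name="surface">asphalt</SimpleData>
</SchemaData></ExtendedData>
<LineString><coordinates>-113.0009,38.3766 -113.00039,38.3758</coordinates></LineString>
</Placemark><Placemark>
<name>Gravel spur</name>
<Style><LineStyle><color>ff95fcff</color><width>2</width></LineStyle></Style>
<ExtendedData><SchemaData schemaUrl="#Hwy_Primary__UT_">
<SimpleData name="highway">primary</SimpleData>
<SimpleData name="surface">gravel</SimpleData>
</SchemaData></ExtendedData>
<LineString><coordinates>-113.00662,38.38557 -113.00613,38.38482</coordinates></LineString>
</Placemark>
'''

# The second expression from the post, split over two lines for readability.
PATTERN = re.compile(
    r'(?s)<LineStyle>((?!</Placemark).)*?</LineStyle>'   # the LineStyle section
    r'(?=((?!</Placemark).)*"surface">asphalt<)'         # asphalt must follow before </Placemark
)

# Hypothetical replacement style (color and width chosen arbitrarily).
NEW_STYLE = '<LineStyle><color>ff0000ff</color><width>4</width></LineStyle>'

restyled = PATTERN.sub(NEW_STYLE, KML)
print(restyled)
```

 Only the asphalt Placemark should pick up the new style, because the lookahead is not allowed to read past </Placemark into the next record.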
- 
 @PeterJones Thank You sooo Much! I will be able to use this for so much of the stuff that I am doing. 
- 
 Just a regex tip inspired by Peter’s discussion: It is almost always a good idea to use the construct x*? (x could be a literal x or some other regex) right out of the gate when crafting a new regex, rather than following most people’s first impulse to use the shorter-to-type x*. The *? version will match minimally, whereas the * version will match maximally, to satisfy its craving for a match. If *? isn’t meeting the need, then consider a potential switch as you refine the regex.
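 A quick way to see the difference, sketched in Python (the sample text is invented; the same minimal versus maximal behavior applies to * and *? in Notepad++):

```python
import re

# Invented sample containing two <width> elements on one line.
text = '<width>2</width> some text <width>4</width>'

# .* is greedy: it runs to the LAST </width> it can reach.
greedy = re.search(r'<width>.*</width>', text).group(0)

# .*? is lazy: it stops at the FIRST </width> it finds.
lazy = re.search(r'<width>.*?</width>', text).group(0)

print(greedy)  # the whole sample string
print(lazy)    # <width>2</width>
```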
- 
 @Alan-Kilborn Thanks for the help Alan! It’s so amazing that there are people like you and Peter who are willing to share your time and expertise to help people! I spent quite a bit of time trying to find an easier way to accomplish the above, and didn’t get close. 
- 
 @Richard-Lohr said in Random line search:
 “amazing that there are people like you and Peter who are willing to share your time and expertise to help people”
 Ha, well, not everyone likes us and our replies. :-) I have, and maybe it is the same for Peter, free cycles during the day while I do other things that don’t keep me 100% utilized. Doing N++ stuff, including helping out here, is a nice diversion and time-filler. And why not use your free cycles to maybe do some good, right?
- 
 When I initially received the solution, I tried it, and it worked. Then I attempted to use the code to finalize my files and found that it will not process all of the records to the EOF. Once I invoke the search, it usually processes 5-11 records and then returns an error message: “find: invalid regular expression”. If I use the Find function to find the next occurrence of “asphalt” and attempt the search and replace again, it processes one or two records and then returns that same error message. I did study Guy’s writeup, and I’ve subsequently done a lot of reading about lookaheads, but I have found absolutely nothing that points me in the right direction.
 Additional info:
- Using N++ v8.4.8
- The file was originally 3.1 million lines; I reduced it to 9500 lines, and the symptoms are exactly the same.
- There were a total of 941 occurrences of “asphalt” in the file.
- “ALL” of the records in the files are in exactly the same format as I initially provided.
 
- 
 I have no reason to doubt you, but I’ve never seen a situation where “invalid regular expression” is returned only sometimes for the same Find what data. I mean, if an expression is valid once, it should be valid for all time; it doesn’t depend upon data, AFAIK. I’m talking about this specific error message, which seems to be what you’re getting as well:  When you obtain that, can you hover over the little 3-dot speech balloon and indicate what that says? For me and my simple bad regex, I obtained:  
- 
 Darn, I have never hovered over that. The additional info says: 
 Ran out of stack space trying to match the regular expression.
- 
 OK, that error message IS data-dependent, in conjunction with the constant Find what regex. What it means is that the combination of data and expression is “too much” for the regex engine. It’s a big hint to refactor your expression so that the engine doesn’t overflow. Not an easy task for a newbie, however. 
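 One way to keep the regex work small, sketched here in Python rather than inside Notepad++: cut the text into one chunk per Placemark, test each chunk with a plain substring check, and only then run a short, simple regex on that chunk. The function name restyle_asphalt and the sample data are hypothetical, and this is only an illustration of the refactoring idea, not a drop-in replacement for the Find what expression.

```python
import re


def restyle_asphalt(kml: str, new_style: str) -> str:
    """Replace the first LineStyle of every Placemark whose surface is asphalt."""
    # Split on the closing tag; the capture group keeps the tag in the list,
    # so joining the pieces again reproduces the untouched parts exactly.
    chunks = re.split(r'(</Placemark>)', kml)
    out = []
    for chunk in chunks:
        # Plain substring test: no lookahead has to scan the whole record.
        if '"surface">asphalt<' in chunk:
            chunk = re.sub(r'<LineStyle>.*?</LineStyle>', new_style,
                           chunk, count=1, flags=re.S)
        out.append(chunk)
    return ''.join(out)


# Invented two-record sample: one asphalt, one gravel.
sample = (
    '<Placemark><Style><LineStyle><color>a</color></LineStyle></Style>'
    '<SimpleData name="surface">asphalt</SimpleData></Placemark>'
    '<Placemark><Style><LineStyle><color>b</color></LineStyle></Style>'
    '<SimpleData name="surface">gravel</SimpleData></Placemark>'
)
print(restyle_asphalt(sample, '<LineStyle>NEW</LineStyle>'))
```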
- 
 @Alan-Kilborn Not just a newbie, a 66-year-old who wishes for the days when he still had half a brain. ;-) 
- 
 OK, well, you’ve got 10 years on me, but we ain’t dead yet as they say… The problem with posting here with a general question is that respondents can only “take their best shot” at the solution; they don’t have your exact data situation. And it is very likely that something in your specific data is causing part of the regex to “go haywire” and do some catastrophic (as far as computer memory usage) backtracking. On one hand, it is nice that this can be detected, rather than just Notepad++ crashing. On the other hand, it’s nice to have things just “work”. 
- 
 @Alan-Kilborn I just changed the “</Placemark” statement to “</SchemaData></ExtendedData”, and the expression works. Don’t know exactly why. I wonder why it is some people lose their mental acuity, while others don’t? I am an extremely serious hiker and have remained extremely strong with high endurance. If I had to make a choice, I think that I would rather lose the mental part; people understand it when you tell them that it’s because you’re old. Again, thank you! 
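 A plausible but unverified explanation: the ((?!stop).)* part walks forward one character at a time until it reaches the stop text. With </Placemark as the stop it has to walk through the whole <coordinates> list, which can be very long for some roads; with </SchemaData></ExtendedData it stops before the coordinates. The Python sketch below only measures that distance on an invented Placemark with 500 repeated coordinate pairs. It says nothing definite about how Notepad++ uses its stack, and scan_length is a hypothetical helper.

```python
import re

# Invented Placemark with a long coordinate list (500 repeated pairs).
coords = '-113.0009,38.3766 ' * 500
placemark = (
    '<Placemark><Style><LineStyle><color>ff95fcff</color>'
    '<width>2</width></LineStyle></Style>'
    '<ExtendedData><SchemaData schemaUrl="#Hwy_Primary__UT_">'
    '<SimpleData name="surface">asphalt</SimpleData>'
    '</SchemaData></ExtendedData>'
    '<LineString><coordinates>' + coords + '</coordinates></LineString>'
    '</Placemark>'
)


def scan_length(stop: str, text: str) -> int:
    """Count the characters ((?!stop).)* consumes right after </LineStyle>."""
    start = text.index('</LineStyle>') + len('</LineStyle>')
    # Non-capturing version of the token used in the thread's expression.
    token = re.compile(r'(?s)(?:(?!' + re.escape(stop) + r').)*')
    return len(token.match(text, start).group(0))


long_scan = scan_length('</Placemark', placemark)
short_scan = scan_length('</SchemaData></ExtendedData', placemark)
print(long_scan, short_scan)
```

 The second number should come out far smaller than the first, since the shorter stop text ends the walk before the coordinates begin.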
