Parse html file - Ideas wanted
-
Maybe it is possible to select a folder containing multiple files and combine everything into a new Notepad++ file, or to use all currently open Notepad++ files.
But that is optional. -
OK, I hope I can finish it within my lunch break; otherwise, unless someone else
steps in, I will have to follow up later today. -
Thank you, but take your time. Tomorrow is perfect.
Do I need to install something to get this script working? -
With a Python script like this (which means you have to use the PythonScript plugin),
it should create a new file with your text.
Lunch break over; as said, I will follow up later today.

    matches = []
    regex = "lat:\s*(\d+\.\d+?),\R\s*lon:\s*(\d+\.\d+?),\R\s*title:\s*'(.+)',$"

    # Callback: collect the text of every capture group of one match.
    def extract_matches(match):
        if match.lastindex > 0:
            text = [editor1.getTextRange(*match.span(i)) for i in range(1, match.lastindex + 1)]
            matches.append(text)

    editor1.research(regex, extract_matches)

    # One comma-separated line per match.
    new_text = ''
    for match in matches:
        new_text += ','.join(match)
        new_text += '\r\n'

    notepad.new()
    editor.setText(new_text)
-
What’s up with using editor1 instead of simply editor in a couple of places? -
Do i need to install something to get this script working?
Yes, you need to install the PythonScript plugin.
If you are on Notepad++ 7.6.3 or above, you can use this:
Guide: How to install the PythonScript plugin on Notepad++ 7.6.3, 7.6.4 and above
On 7.5.9 and below, you can install it using the old plugin manager. -
Thank you very much for your help. Python is now installed. How can I see if it is working?
Now I am struggling a little with the script above.
I placed it in the scripts folder as gpx.py.
Then I have to open the source file and use the script to extract the data? How?
Please excuse my questions. If somebody is in the same situation later, he or she can find a solution in this thread.
Kind regards
Peter -
@peterelli said:
Then i have to open the source file and use the script to extract the data? How?
With the “source file” active, do this:
Plugins (menu) > Pythonscript > Scripts > gpx
I would think that would do it.
-
Thank you to you both for jumping in.
Those things happen if you are in a hurry.
To quickly test scripts, I use editor1 for the data and editor2 for the scripts.
@peterelli - here is a slightly modified version, which is faster if huge files need to be scanned.

    matches = []
    regex = "lat:\s*(\d+\.\d+?),\R\s*lon:\s*(\d+\.\d+?),\R\s*title:\s*'(.+)',$"

    # Callback: read the capture groups directly from the match object.
    def extract_matches(match):
        text = [match.group(i) for i in range(1, match.lastindex + 1)]
        matches.append(text)

    editor.research(regex, extract_matches)

    # One comma-separated line per match, joined in a single pass.
    new_text = '\r\n'.join([','.join(match) for match in matches])
    notepad.new()
    editor.setText(new_text)
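
For anyone who wants to see what the pattern captures without running it inside Notepad++, here is a minimal sketch using Python's standard `re` module instead of the PythonScript `editor` object. Several things are assumptions: the sample text is invented to match the layout the regex expects (lat, lon and title on separate lines), `\R` is swapped for `\r?\n` because the standard `re` module does not support `\R`, and `re.MULTILINE` is added so that `$` matches at the end of each line.

```python
import re

# Hypothetical sample input; the layout is inferred from the regex,
# it is not copied from the real page.
sample = (
    "lat: 41.7552493,\n"
    "lon: 13.9920420,\n"
    "title: 'Barrea',\n"
    "lat: 42.3566311,\n"
    "lon: 13.6883765,\n"
    "title: 'Calascio',\n"
)

# Same three capture groups as in the script above: latitude, longitude, title.
pattern = re.compile(
    r"lat:\s*(\d+\.\d+?),\r?\n\s*lon:\s*(\d+\.\d+?),\r?\n\s*title:\s*'(.+)',$",
    re.MULTILINE,
)

# One comma-separated row per match, joined the same way as in the script.
rows = [",".join(m.groups()) for m in pattern.finditer(sample)]
csv_text = "\r\n".join(rows)
print(csv_text)
```

Each match contributes one line of the form lat,lon,title, which is the text the script places into the new document.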
-
Thanks to everyone, it works :-)
Unfortunately, I tried only the first region and then saw that the HTML files for the other regions are formatted a little differently. I have extracted the geocode part. I would love to change the script myself to learn this, as I have to do this kind of job quite often.
So at the moment I am trying to understand -> regex = "lat:\s*(\d+\.\d+?),\R\s*lon:\s*(\d+\.\d+?),\R\s*title:\s*'(.+)',$"
Where can I find some information about the formatting options to separate the needed data?
Big thanks to @Ekopalypse for your time and effort.
You can find the full source under [Abruzzo](https://www.ilturista.info/ch/borghi_castelli/abruzzo/) -> then right-click on the map and show source.
{
lat:41.7552493,lon:13.9920420,title:‘Barrea’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1301683347_1301683347.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=41&lan=ita” class=“link_m”><strong>Barrea »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Barrea (Abruzzo): il lago e la visita al borgo fortificato</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.3566311,lon:13.6883765,title:‘Calascio’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/shutterstock_424811986_1497286893.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=53&lan=ita” class=“link_m”><strong>Calascio »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Calascio (Abruzzo): la Rocca e il borgo sul Gran Sasso</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.267972,lon:13.7508313,title:‘Capestrano’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1509447988_1509447988.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=61&lan=ita” class=“link_m”><strong>Capestrano »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Capestrano (Abruzzo): il Castello Piccolomini e la visita al borgo</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.1572699,lon:14.0029290,title:‘Caramanico Terme’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1514559944_1514559944.jpg”></td><td style=“line-height:15px;text-align:left”><a 
href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=10&lan=ita” class=“link_m”><strong>Caramanico Terme »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Caramanico Terme, cure termali nel parco della Majella</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.5192407,lon:14.0601666,title:‘Città Sant’Angelo’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/citta_1277753213.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=27&lan=ita” class=“link_m”><strong>Città Sant’Angelo »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Città Sant’Angelo (Abruzzo), visita al borgo lungo le sue rue</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:41.7664259,lon:13.9437777,title:‘Civitella Alfedena’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1302766235_1302766235.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=42&lan=ita” class=“link_m”><strong>Civitella Alfedena »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Civitella Alfedena, il borgo medievale e le piste del Passo Godi</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:41.9136454,lon:13.4286243,title:‘Civitella Roveto’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/municipio_di_civitella_roveto_1539852884.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=72&lan=ita” class=“link_m”><strong>Civitella Roveto »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Civitella Roveto (Abruzzo): visita al borgo della 
Marsica</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.8722360,lon:13.8672728,title:‘Colonnella’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/colo_1277753065.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=28&lan=ita” class=“link_m”><strong>Colonnella »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Colonnella (Abruzzo), week end nel borgo</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.5195408,lon:13.9724141,title:‘Elice’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1501575607_1501575607.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=56&lan=ita” class=“link_m”><strong>Elice »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Elice (Abruzzo): visita al borgo della provincia di Pescara</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.1898146,lon:14.2199462,title:‘Guardiagrele’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/la_chiesa_di_santa_maria_del_carmine_nel_borgo_antico_di_guardiagrele_in_abruzzo_1545901096.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=73&lan=ita” class=“link_m”><strong>Guardiagrele »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Guardiagrele (Abruzzo): visita al centro del borgo sulla Maiella</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.5821386,lon:13.6272978,title:‘Montorio al Vomano’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img 
src=“/Image/Image/Allegati/1449150693_1449150694.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=49&lan=ita” class=“link_m”><strong>Montorio al Vomano »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Montorio al Vomano, il borgo in Abruzzo ai piedi del Gran Sasso</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.2359938,lon:13.7284432,title:‘Navelli’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/civitaretengatorre_1501170918.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=55&lan=ita” class=“link_m”><strong>Navelli »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Navelli (Abruzzo): visita al borgo e all’altopiano dello zafferano</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.2849815,lon:13.4741662,title:‘Ocre’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1510151889_1510151889.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=59&lan=ita” class=“link_m”><strong>Ocre »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Ocre (Abruzzo): il Monastero fortezza di Santo Spirito e la visita al borgo</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.3569090,lon:14.4052100,title:‘Ortona’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/shutterstock_191542847_1453274500.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=32&lan=ita” class=“link_m”><strong>Ortona »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Ortona 
(Abruzzo), vacanza nel suo mare e sulla sua spiaggia</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.0514154,lon:13.9929105,title:‘Pacentro’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/shutterstock_129558167_1448378638.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=48&lan=ita” class=“link_m”><strong>Pacentro »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Pacentro (Abruzzo): alla scoperta del borgo medievale di Madonna</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:41.8885916,lon:14.0642726,title:‘Pescocostanzo’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/peco1_1262275054.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=13&lan=ita” class=“link_m”><strong>Pescocostanzo »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Pescocostanzo: visita al Borgo Abruzzese, tra sci e natura</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.5230484,lon:13.5533933,title:‘Pietracamela’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/pietra_ico_1266522273.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=21&lan=ita” class=“link_m”><strong>Pietracamela »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Pietracamela, visita al borgo e al Gran Sasso d’Italia</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:41.9344729,lon:13.9774741,title:‘Rocca Pia’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img 
src=“/Image/Image/Allegati/1301594978_1301594978.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=40&lan=ita” class=“link_m”><strong>Rocca Pia »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Rocca Pia (Abruzzo), visita al borgo nel Parco della Majella</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.2155397,lon:14.0244141,title:‘Roccamorice’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1428385583_1428385583.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=23&lan=ita” class=“link_m”><strong>Roccamorice »</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Roccamorice, visita al borgo e all’Eremo di San Bartolomeo in Legio</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’}, -
@peterelli said:
Where can i find some informations about the formatting options to separate the needed data?
This is a good starting point: https://notepad-plus-plus.org/community/topic/15765/faq-desk-where-to-find-regex-documentation
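
As a possible next step, here is a rough sketch of how the pattern might be adapted to the single-line layout quoted above, where lat, lon and title sit on one line and are followed by an html field. It uses Python's standard `re` module so it can be tried anywhere. Assumptions: the real page source uses straight quotes (the curly quotes above look like forum rendering), the html field is abbreviated to `<table>...</table>`, and the title always ends right before `,html:`.

```python
import re

# Shortened sample in the single-line layout; straight quotes are assumed
# and the html field is abbreviated.
sample = (
    "{lat:41.7552493,lon:13.9920420,title:'Barrea',html:'<table>...</table>',zoom:11},"
    "{lat:42.267972,lon:13.7508313,title:'Capestrano',html:'<table>...</table>',zoom:11},"
    "{lat:42.5192407,lon:14.0601666,title:'Città Sant'Angelo',html:'<table>...</table>',zoom:11},"
)

# \s* between the fields tolerates "no whitespace" as well as line breaks.
# The title is matched lazily up to the quote that precedes ",html:",
# so an apostrophe inside a title does not end the capture early.
pattern = re.compile(
    r"lat:\s*(\d+\.\d+),\s*lon:\s*(\d+\.\d+),\s*title:\s*'(.+?)',\s*html:"
)

rows = [",".join(m.groups()) for m in pattern.finditer(sample)]
print("\r\n".join(rows))
```

Because `\s` also matches line breaks, the same idea should in principle cover both layouts; whether this exact pattern behaves identically in `editor.research` has not been checked here.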