    Parse html file - Ideas wanted

    17 Posts 4 Posters 5.4k Views
    • Ekopalypse @peterelli

      @peterelli

      OK, I hope I can finish it within my lunch break; otherwise, if no one else
      steps in, I will follow up later today.

      • peterelli

        Thank you, but take your time. Tomorrow is perfect.
        Do I need to install something to get this script working?

        • Ekopalypse @peterelli

          @peterelli

          With a Python script like this one, which requires the PythonScript plugin,
          you should get a new file containing your text.
          Lunch break is over; as I said, I will follow up later today.

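          # Collected capture groups, one list per match
          matches = []
          # Raw string so the backslashes reach the regex engine unchanged
          regex = r"lat:\s*(\d+\.\d+?),\R\s*lon:\s*(\d+\.\d+?),\R\s*title:\s*'(.+)',$"

          def extract_matches(match):
              # Read the text of every capture group from the document
              if match.lastindex > 0:
                  text = [editor1.getTextRange(*match.span(i)) for i in range(1, match.lastindex + 1)]
                  matches.append(text)

          editor1.research(regex, extract_matches)

          # Build one comma-separated line per match
          new_text = ''
          for match in matches:
              new_text += ','.join(match)
              new_text += '\r\n'

          # Open a new document and write the result into it
          notepad.new()
          editor.setText(new_text)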
          
          • Alan Kilborn @Ekopalypse

            @Ekopalypse

            What’s up with using editor1 instead of simply editor in a couple of places?

            • Meta Chuh moderator @peterelli

              @peterelli

              Do I need to install something to get this script working?

              Yes, you need to install the PythonScript plugin.

              If you are on Notepad++ 7.6.3 or above, you can use this:
              Guide: How to install the PythonScript plugin on Notepad++ 7.6.3, 7.6.4 and above

              On 7.5.9 and below, you can install it using the old Plugin Manager.

              • peterelli

                Thank you very much for your help. Python is now installed. How can I see if it is working?

                Now I am struggling a little with the script above.
                I placed it in the scripts folder as gpx.py.
                Then I have to open the source file and use the script to extract the data? How?

                Please excuse my questions. If somebody runs into the same situation later, he or she can find a solution in this thread.

                Kind regards
                Peter

                • Alan Kilborn @peterelli

                  @peterelli said:

                  Then I have to open the source file and use the script to extract the data? How?

                  With the “source file” active, do this:

                  Plugins (menu) > Pythonscript > Scripts > gpx

                  I would think that would do it.

                  • Ekopalypse @Alan Kilborn

                    Thank you to you both for jumping in.

                    @Alan-Kilborn

                    Those things happen when you are in a hurry.
                    To quickly test scripts, I use editor1 for the data and editor2 for the scripts.
                    Like here

                    @peterelli - here is a slightly modified version, which is faster if huge files need to be scanned.

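                    # Collected capture groups, one list per match
                    matches = []
                    # Raw string so the backslashes reach the regex engine unchanged
                    regex = r"lat:\s*(\d+\.\d+?),\R\s*lon:\s*(\d+\.\d+?),\R\s*title:\s*'(.+)',$"


                    def extract_matches(match):
                        # Take the group text straight from the match object
                        text = [match.group(i) for i in range(1, match.lastindex + 1)]
                        matches.append(text)


                    editor.research(regex, extract_matches)
                    # One comma-separated line per match, joined with Windows line endings
                    new_text = '\r\n'.join([','.join(match) for match in matches])

                    # Open a new document and write the result into it
                    notepad.new()
                    editor.setText(new_text)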
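
For experimenting with the pattern outside Notepad++, the same extraction can be sketched with Python's standard re module. This is only an illustration, not the script from the post: the sample text and the names SAMPLE, PATTERN and extract_rows are made up, the input layout is inferred from the regex itself, and \R (which Python's re does not support) is swapped for \r?\n.

```python
import re

# Hypothetical sample; the layout is inferred from the regex in the post.
SAMPLE = (
    "lat: 41.7552493,\r\n"
    "lon: 13.9920420,\r\n"
    "title: 'Barrea',\r\n"
    "zoom: 11\r\n"
    "lat: 42.3566311,\r\n"
    "lon: 13.6883765,\r\n"
    "title: 'Calascio',\r\n"
)

# \R is replaced by \r?\n here. re.MULTILINE lets $ match at each line end,
# and the extra \r? absorbs the carriage return of Windows line endings.
PATTERN = re.compile(
    r"lat:\s*(\d+\.\d+?),\r?\n\s*lon:\s*(\d+\.\d+?),\r?\n\s*title:\s*'(.+)',\r?$",
    re.MULTILINE,
)


def extract_rows(text):
    """Return one 'lat,lon,title' string per match found in text."""
    return [",".join(m.groups()) for m in PATTERN.finditer(text)]


print("\r\n".join(extract_rows(SAMPLE)))
```

Inside Notepad++ the PythonScript version above remains the one to use; this variant is only meant for trying out changes to the pattern on a small snippet.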
                    
                    • peterelli

                      Thanks to everyone, it works :-)
                      Unfortunately, I had tried only the first region, and now I see that the HTML files for the other regions are formatted a little differently. I have extracted the geocode part. I would love to change the script myself to learn this, as I have this kind of job to do quite often.
                      So at the moment I am trying to understand -> regex = "lat:\s*(\d+\.\d+?),\R\s*lon:\s*(\d+\.\d+?),\R\s*title:\s*'(.+)',$"
                      Where can I find some information about the formatting options to separate the needed data?

                      Big thanks to @Ekopalypse for your time and effort.

                      You can find the full source under [Abruzzo](https://www.ilturista.info/ch/borghi_castelli/abruzzo/) -> then right-click on the map and show the source.

                      {
                      lat:41.7552493,lon:13.9920420,title:‘Barrea’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1301683347_1301683347.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=41&lan=ita” class=“link_m”><strong>Barrea &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Barrea (Abruzzo): il lago e la visita al borgo fortificato</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.3566311,lon:13.6883765,title:‘Calascio’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/shutterstock_424811986_1497286893.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=53&lan=ita” class=“link_m”><strong>Calascio &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Calascio (Abruzzo): la Rocca e il borgo sul Gran Sasso</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.267972,lon:13.7508313,title:‘Capestrano’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1509447988_1509447988.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=61&lan=ita” class=“link_m”><strong>Capestrano &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Capestrano (Abruzzo): il Castello Piccolomini e la visita al borgo</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.1572699,lon:14.0029290,title:‘Caramanico Terme’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1514559944_1514559944.jpg”></td><td 
style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=10&lan=ita” class=“link_m”><strong>Caramanico Terme &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Caramanico Terme, cure termali nel parco della Majella</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.5192407,lon:14.0601666,title:‘Città Sant’Angelo’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/citta_1277753213.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=27&lan=ita” class=“link_m”><strong>Città Sant’Angelo &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Città Sant’Angelo (Abruzzo), visita al borgo lungo le sue rue</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:41.7664259,lon:13.9437777,title:‘Civitella Alfedena’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1302766235_1302766235.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=42&lan=ita” class=“link_m”><strong>Civitella Alfedena &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Civitella Alfedena, il borgo medievale e le piste del Passo Godi</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:41.9136454,lon:13.4286243,title:‘Civitella Roveto’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/municipio_di_civitella_roveto_1539852884.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=72&lan=ita” class=“link_m”><strong>Civitella Roveto &nbsp;&nbsp; &raquo;</strong> Leggi la 
guida</a><br><span style=“font-size:12px; color:#60615e”>Civitella Roveto (Abruzzo): visita al borgo della Marsica</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.8722360,lon:13.8672728,title:‘Colonnella’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/colo_1277753065.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=28&lan=ita” class=“link_m”><strong>Colonnella &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Colonnella (Abruzzo), week end nel borgo</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.5195408,lon:13.9724141,title:‘Elice’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1501575607_1501575607.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=56&lan=ita” class=“link_m”><strong>Elice &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Elice (Abruzzo): visita al borgo della provincia di Pescara</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.1898146,lon:14.2199462,title:‘Guardiagrele’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/la_chiesa_di_santa_maria_del_carmine_nel_borgo_antico_di_guardiagrele_in_abruzzo_1545901096.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=73&lan=ita” class=“link_m”><strong>Guardiagrele &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Guardiagrele (Abruzzo): visita al centro del borgo sulla 
Maiella</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.5821386,lon:13.6272978,title:‘Montorio al Vomano’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1449150693_1449150694.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=49&lan=ita” class=“link_m”><strong>Montorio al Vomano &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Montorio al Vomano, il borgo in Abruzzo ai piedi del Gran Sasso</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.2359938,lon:13.7284432,title:‘Navelli’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/civitaretengatorre_1501170918.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=55&lan=ita” class=“link_m”><strong>Navelli &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Navelli (Abruzzo): visita al borgo e all’altopiano dello zafferano</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.2849815,lon:13.4741662,title:‘Ocre’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1510151889_1510151889.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=59&lan=ita” class=“link_m”><strong>Ocre &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Ocre (Abruzzo): il Monastero fortezza di Santo Spirito e la visita al borgo</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.3569090,lon:14.4052100,title:‘Ortona’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td 
style=“width:54px”><img src=“/Image/Image/Allegati/shutterstock_191542847_1453274500.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=32&lan=ita” class=“link_m”><strong>Ortona &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Ortona (Abruzzo), vacanza nel suo mare e sulla sua spiaggia</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.0514154,lon:13.9929105,title:‘Pacentro’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/shutterstock_129558167_1448378638.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=48&lan=ita” class=“link_m”><strong>Pacentro &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Pacentro (Abruzzo): alla scoperta del borgo medievale di Madonna</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:41.8885916,lon:14.0642726,title:‘Pescocostanzo’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/peco1_1262275054.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=13&lan=ita” class=“link_m”><strong>Pescocostanzo &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Pescocostanzo: visita al Borgo Abruzzese, tra sci e natura</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.5230484,lon:13.5533933,title:‘Pietracamela’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/pietra_ico_1266522273.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=21&lan=ita” class=“link_m”><strong>Pietracamela 
&nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Pietracamela, visita al borgo e al Gran Sasso d’Italia</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:41.9344729,lon:13.9774741,title:‘Rocca Pia’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1301594978_1301594978.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=40&lan=ita” class=“link_m”><strong>Rocca Pia &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Rocca Pia (Abruzzo), visita al borgo nel Parco della Majella</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.2155397,lon:14.0244141,title:‘Roccamorice’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1428385583_1428385583.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=23&lan=ita” class=“link_m”><strong>Roccamorice &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Roccamorice, visita al borgo e all’Eremo di San Bartolomeo in Legio</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},

                      • Alan Kilborn @peterelli

                        @peterelli said:

                        Where can I find some information about the formatting options to separate the needed data?

                        This is a good starting point: https://notepad-plus-plus.org/community/topic/15765/faq-desk-where-to-find-regex-documentation
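
As a rough sketch of how such a pattern might be adapted to the single-line layout in the posted sample, the following uses Python's standard re module rather than the Notepad++ search engine. Several things here are assumptions: the real page source is taken to use straight single quotes (the forum shows them as curly quotes), the html values are shortened placeholders, and SAMPLE, PATTERN and rows are hypothetical names. Titles containing an apostrophe may need extra care depending on how the page escapes them.

```python
import re

# Hypothetical excerpt in the single-line layout; straight quotes are assumed
# and the html values are shortened placeholders.
SAMPLE = (
    "{lat:41.7552493,lon:13.9920420,title:'Barrea',html:'<table></table>',"
    "zoom:11,icon:'/2014/images/ico_borghi_castelli.png'},"
    "{lat:42.267972,lon:13.7508313,title:'Capestrano',html:'<table></table>',"
    "zoom:11,icon:'/2014/images/ico_borghi_castelli.png'},"
)

# lat:(\d+\.\d+)   digits, a literal dot, digits          -> group 1
# lon:(\d+\.\d+)   same shape                             -> group 2
# title:'(.*?)'    shortest text up to the quote that is
#                  followed by ,html:                     -> group 3
PATTERN = re.compile(r"lat:(\d+\.\d+),lon:(\d+\.\d+),title:'(.*?)',html:")

# One comma-separated line per record
rows = [",".join(m.groups()) for m in PATTERN.finditer(SAMPLE)]
print("\n".join(rows))
```

Compared with the multi-line pattern, the \R and \s* parts are gone because all fields sit on one line without spaces, and the end-of-line anchor $ is replaced by the literal ,html: that follows each title.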
