    Parse html file - Ideas wanted

    • peterelli
      peterelli last edited by

      Hello,

      I have many HTML files and need to extract the data marked in bold below into a new text file, with the three fields comma-separated on one row (lat,lon,name).
      This is my starting point for generating a POI GPX file of interesting POIs for my next holiday :-)

      So I am looking for a way to do this with Notepad++.
      Thank you very much for any steps towards such a solution.

      Peter

      				{ 
      								lat: **45.929866**,
      								lon: **10.8264056**,
      								title: '**Canale di Tenno**',
      								html: '<table width="300" border="0" cellspacing="1" cellpadding="3"><tr><td style="width:54px"><img src="/Image/Image/Allegati/una_piazza_del_borgo_medievale_di_canale_di_tenno_in_trentino_1545119769.jpg"></td><td style="line-height:15px;text-align:left"><a href="/guide.php?cat1=4&cat2=8&cat3=12&cat4=151&lan=ita" class="link_m"><strong>Canale di Tenno &nbsp;&nbsp; &raquo;</strong> Leggi la guida<\/a><br><span style="font-size:12px; color:#60615e">Canale di Tenno (Trentino-Alto Adige): il lago e la vista al borgo</span></td></tr></table>',
      								zoom: 11,icon: '/2014/images/ico_borghi_castelli.png'},{ 
      								lat: 46.3756037,
      								lon: 11.0769321,
      								title: 'Casez',
      								html: '<table width="300" border="0" cellspacing="1" cellpadding="3"><tr><td style="width:54px"><img src="/Image/Image/Allegati/s_1487912811.jpg"></td><td style="line-height:15px;text-align:left"><a href="/guide.php?cat1=4&cat2=8&cat3=12&cat4=129&lan=ita" class="link_m"><strong>Casez &nbsp;&nbsp; &raquo;</strong> Leggi la guida<\/a><br><span style="font-size:12px; color:#60615e">Casez (Trentino): visita al borgo e al castello della Val di Non</span></td></tr></table>',
      								zoom: 11,icon: '/2014/images/ico_borghi_castelli.png'},{
      
      • Ekopalypse
        Ekopalypse @peterelli last edited by

        @peterelli

        This looks like it might be a JSON file, doesn't it?
        I see lat and lon, but where is name? Is that the title?
        Could you give a complete example of what you have and what you want, based on the sample?

        • Ekopalypse
          Ekopalypse last edited by

          I see, the markdown didn't come through; the ** mark the fields you are looking for.
          But is it really an HTML file, or something converted from JSON?

          • peterelli
            peterelli last edited by

            It's the source text from the map: link text
            In this case the name is the title. Excuse me.

            Thank you for your help.

            • Ekopalypse
              Ekopalypse @peterelli last edited by

              @peterelli

              And the result you want looks like this

              45.929866,10.8264056,Canale di Tenno,
              46.3756037,11.0769321,Casez,
              

              without the single quotes surrounding the place name, like Casez?

              • peterelli
                peterelli last edited by

                Yes, and without the trailing comma on each row.
                It looks beautiful.

                • peterelli
                  peterelli last edited by

                  Maybe it is possible to select a folder containing multiple files and combine everything into one new Notepad++ file, or to use all currently open Notepad++ files.
                  But that is optional.
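For the optional folder idea, a minimal sketch in plain Python (run outside Notepad++) could look like the following. It is not part of the scripts posted in this thread: the function names `collect_pois` and `write_pois`, the `*.html` file mask, the UTF-8 encoding, and the pattern that stops the title at `',` followed by `html:` are all assumptions for illustration.

```python
import glob
import io
import os
import re

# Assumed pattern: lat, lon and title in this order, with the title ending
# at the quote that directly precedes ",html:" (spaces/line breaks allowed).
POI_PATTERN = re.compile(
    r"lat:\s*(\d+\.\d+),\s*lon:\s*(\d+\.\d+),\s*title:\s*'(.+?)',\s*html:"
)


def collect_pois(folder, mask="*.html"):
    """Return 'lat,lon,title' rows from every matching file in folder."""
    rows = []
    for path in sorted(glob.glob(os.path.join(folder, mask))):
        # UTF-8 is an assumption; adjust if the saved pages use another encoding.
        with io.open(path, "r", encoding="utf-8") as handle:
            text = handle.read()
        rows.extend(",".join(m.groups()) for m in POI_PATTERN.finditer(text))
    return rows


def write_pois(rows, out_path):
    """Write one row per line, without a trailing comma."""
    with io.open(out_path, "w", encoding="utf-8") as handle:
        handle.write(u"\n".join(rows))
```

Calling `write_pois(collect_pois(folder), out_path)` with your own folder and output file would then combine all files into one list.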

                  • Ekopalypse
                    Ekopalypse @peterelli last edited by

                    @peterelli

                    OK, I hope I can finish it within my lunch break. Otherwise, unless someone else
                    steps in, I will follow up later today.

                    • peterelli
                      peterelli last edited by

                      Thank you, but take your time. Tomorrow is perfect.
                      Do I need to install something to get this script working?

                      • Ekopalypse
                        Ekopalypse @peterelli last edited by Ekopalypse

                        @peterelli

                        With a Python script like this, which means you have to use the PythonScript plugin,
                        it should create a new file with your text.
                        Lunch break over; as said, I will follow up later today.

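                        # Collected [lat, lon, title] lists, one per matched POI
                        matches = []
                        # Raw string so the backslashes reach the regex engine unchanged
                        regex = r"lat:\s*(\d+\.\d+?),\R\s*lon:\s*(\d+\.\d+?),\R\s*title:\s*'(.+)',$"

                        def extract_matches(match):
                            # Called once per match; reads each captured group's text from the document
                            if match.lastindex > 0:
                                text = [editor1.getTextRange(*match.span(i)) for i in range(1, match.lastindex + 1)]
                                matches.append(text)

                        editor1.research(regex, extract_matches)

                        # Build one comma-separated row per POI
                        new_text = ''
                        for match in matches:
                            new_text += ','.join(match)
                            new_text += '\r\n'

                        # Open a new tab and put the result there
                        notepad.new()
                        editor.setText(new_text)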
                        
                        • Alan Kilborn
                          Alan Kilborn @Ekopalypse last edited by

                          @Ekopalypse

                          What’s up with using editor1 instead of simply editor in a couple of places?

                          • Meta Chuh
                            Meta Chuh @peterelli last edited by

                            @peterelli

                            Do i need to install something to get this script working?

                            Yes, you need to install the PythonScript plugin.

                            If you are on Notepad++ 7.6.3 or above, you can use this:
                            Guide: How to install the PythonScript plugin on Notepad++ 7.6.3, 7.6.4 and above

                            On 7.5.9 and below you can install it using the old Plugin Manager.

                            • peterelli
                              peterelli last edited by

                              Thank you very much for your help. Python is now installed. How can I see whether it is working?

                              Now I am struggling a little with the script above.
                              I placed it in the scripts folder as gpx.py.
                              Then I have to open the source file and use the script to extract the data? How?

                              Please excuse my questions. If somebody is in the same situation later, he or she can find a solution in this thread.

                              Kind Regards
                              Peter

                              • Alan Kilborn
                                Alan Kilborn @peterelli last edited by Alan Kilborn

                                @peterelli said:

                                Then i have to open the source file and use the script to extract the data? How?

                                With the “source file” active, do this:

                                Plugins (menu) > Pythonscript > Scripts > gpx

                                I would think that would do it.

                                • Ekopalypse
                                  Ekopalypse @Alan Kilborn last edited by Ekopalypse

                                  Thank you to you both for jumping in.

                                  @Alan-Kilborn

                                  Those things happen if you are in a hurry.
                                  To quickly test scripts I use editor1 for the data and editor2 for the scripts.
                                  Like here

                                  @peterelli - here is a slightly modified version, faster if huge files need to be scanned.

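                                  # Collected [lat, lon, title] lists, one per matched POI
                                  matches = []
                                  # Raw string so the backslashes reach the regex engine unchanged
                                  regex = r"lat:\s*(\d+\.\d+?),\R\s*lon:\s*(\d+\.\d+?),\R\s*title:\s*'(.+)',$"


                                  def extract_matches(match):
                                      # Take the captured groups straight from the match object
                                      text = [match.group(i) for i in range(1, match.lastindex + 1)]
                                      matches.append(text)


                                  editor.research(regex, extract_matches)
                                  # One comma-separated row per POI, no trailing comma or line break
                                  new_text = '\r\n'.join([','.join(match) for match in matches])

                                  # Open a new tab and put the result there
                                  notepad.new()
                                  editor.setText(new_text)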
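To experiment with the pattern outside Notepad++, here is a rough standalone equivalent using Python's own `re` module. Two things differ from the script above and are assumptions of this sketch: Python's `re` has no `\R`, so line breaks are written as `\r?\n`, and `\r?$` is used so the pattern also ends cleanly on files with Windows line endings. The names `POI_REGEX`, `extract_rows` and `SAMPLE` are made up for the example.

```python
import re

# Python's re has no \R, so each line break is spelled \r?\n instead.
# "\r?$" lets the pattern end cleanly on both LF and CRLF files.
POI_REGEX = re.compile(
    r"lat:\s*(\d+\.\d+?),\r?\n\s*lon:\s*(\d+\.\d+?),\r?\n\s*title:\s*'(.+)',\r?$",
    re.MULTILINE,
)


def extract_rows(text):
    """Return one 'lat,lon,title' string per POI found in text."""
    return [",".join(m.groups()) for m in POI_REGEX.finditer(text)]


# Two records shaped like the first sample in this thread (html field shortened).
SAMPLE = (
    "{\n"
    "\tlat: 45.929866,\n"
    "\tlon: 10.8264056,\n"
    "\ttitle: 'Canale di Tenno',\n"
    "\thtml: '<table></table>',\n"
    "\tzoom: 11,icon: '/2014/images/ico_borghi_castelli.png'},{\n"
    "\tlat: 46.3756037,\n"
    "\tlon: 11.0769321,\n"
    "\ttitle: 'Casez',\n"
    "\thtml: '<table></table>',\n"
    "\tzoom: 11,icon: '/2014/images/ico_borghi_castelli.png'},{\n"
)

if __name__ == "__main__":
    print("\n".join(extract_rows(SAMPLE)))
```

Because the pattern expects lat, lon and title on three consecutive lines, it will find nothing in a file where everything sits on one line.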
                                  
                                  • peterelli
                                    peterelli last edited by peterelli

                                    Thanks to everyone, it works :-)
                                    Unfortunately I had only tried the first region, and now I see that the HTML files for the other regions are formatted a little differently. I have extracted the geocode part. I would love to change the script myself to learn this, as I have jobs like this to do fairly often.
                                    So at the moment I am trying to understand -> regex = "lat:\s*(\d+\.\d+?),\R\s*lon:\s*(\d+\.\d+?),\R\s*title:\s*'(.+)',$"
                                    Where can I find some information about the formatting options to separate the needed data?

                                    Big thanks to @Ekopalypse for your time and effort.

                                    You can find the full source under Abruzzo (https://www.ilturista.info/ch/borghi_castelli/abruzzo/) -> then right-click on the map and show source.

                                    {
                                    lat:41.7552493,lon:13.9920420,title:‘Barrea’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1301683347_1301683347.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=41&lan=ita” class=“link_m”><strong>Barrea &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Barrea (Abruzzo): il lago e la visita al borgo fortificato</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.3566311,lon:13.6883765,title:‘Calascio’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/shutterstock_424811986_1497286893.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=53&lan=ita” class=“link_m”><strong>Calascio &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Calascio (Abruzzo): la Rocca e il borgo sul Gran Sasso</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.267972,lon:13.7508313,title:‘Capestrano’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1509447988_1509447988.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=61&lan=ita” class=“link_m”><strong>Capestrano &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Capestrano (Abruzzo): il Castello Piccolomini e la visita al borgo</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.1572699,lon:14.0029290,title:‘Caramanico Terme’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1514559944_1514559944.jpg”></td><td 
style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=10&lan=ita” class=“link_m”><strong>Caramanico Terme &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Caramanico Terme, cure termali nel parco della Majella</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.5192407,lon:14.0601666,title:‘Città Sant’Angelo’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/citta_1277753213.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=27&lan=ita” class=“link_m”><strong>Città Sant’Angelo &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Città Sant’Angelo (Abruzzo), visita al borgo lungo le sue rue</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:41.7664259,lon:13.9437777,title:‘Civitella Alfedena’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1302766235_1302766235.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=42&lan=ita” class=“link_m”><strong>Civitella Alfedena &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Civitella Alfedena, il borgo medievale e le piste del Passo Godi</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:41.9136454,lon:13.4286243,title:‘Civitella Roveto’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/municipio_di_civitella_roveto_1539852884.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=72&lan=ita” class=“link_m”><strong>Civitella Roveto &nbsp;&nbsp; &raquo;</strong> Leggi la 
guida</a><br><span style=“font-size:12px; color:#60615e”>Civitella Roveto (Abruzzo): visita al borgo della Marsica</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.8722360,lon:13.8672728,title:‘Colonnella’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/colo_1277753065.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=28&lan=ita” class=“link_m”><strong>Colonnella &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Colonnella (Abruzzo), week end nel borgo</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.5195408,lon:13.9724141,title:‘Elice’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1501575607_1501575607.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=56&lan=ita” class=“link_m”><strong>Elice &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Elice (Abruzzo): visita al borgo della provincia di Pescara</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.1898146,lon:14.2199462,title:‘Guardiagrele’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/la_chiesa_di_santa_maria_del_carmine_nel_borgo_antico_di_guardiagrele_in_abruzzo_1545901096.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=73&lan=ita” class=“link_m”><strong>Guardiagrele &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Guardiagrele (Abruzzo): visita al centro del borgo sulla 
Maiella</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.5821386,lon:13.6272978,title:‘Montorio al Vomano’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1449150693_1449150694.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=49&lan=ita” class=“link_m”><strong>Montorio al Vomano &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Montorio al Vomano, il borgo in Abruzzo ai piedi del Gran Sasso</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.2359938,lon:13.7284432,title:‘Navelli’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/civitaretengatorre_1501170918.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=55&lan=ita” class=“link_m”><strong>Navelli &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Navelli (Abruzzo): visita al borgo e all’altopiano dello zafferano</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.2849815,lon:13.4741662,title:‘Ocre’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1510151889_1510151889.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=59&lan=ita” class=“link_m”><strong>Ocre &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Ocre (Abruzzo): il Monastero fortezza di Santo Spirito e la visita al borgo</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.3569090,lon:14.4052100,title:‘Ortona’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td 
style=“width:54px”><img src=“/Image/Image/Allegati/shutterstock_191542847_1453274500.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=32&lan=ita” class=“link_m”><strong>Ortona &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Ortona (Abruzzo), vacanza nel suo mare e sulla sua spiaggia</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.0514154,lon:13.9929105,title:‘Pacentro’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/shutterstock_129558167_1448378638.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=48&lan=ita” class=“link_m”><strong>Pacentro &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Pacentro (Abruzzo): alla scoperta del borgo medievale di Madonna</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:41.8885916,lon:14.0642726,title:‘Pescocostanzo’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/peco1_1262275054.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=13&lan=ita” class=“link_m”><strong>Pescocostanzo &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Pescocostanzo: visita al Borgo Abruzzese, tra sci e natura</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.5230484,lon:13.5533933,title:‘Pietracamela’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/pietra_ico_1266522273.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=21&lan=ita” class=“link_m”><strong>Pietracamela 
&nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Pietracamela, visita al borgo e al Gran Sasso d’Italia</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:41.9344729,lon:13.9774741,title:‘Rocca Pia’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1301594978_1301594978.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=40&lan=ita” class=“link_m”><strong>Rocca Pia &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Rocca Pia (Abruzzo), visita al borgo nel Parco della Majella</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},{lat:42.2155397,lon:14.0244141,title:‘Roccamorice’,html:‘<table width=“300” border=“0” cellspacing=“1” cellpadding=“3”><tr><td style=“width:54px”><img src=“/Image/Image/Allegati/1428385583_1428385583.jpg”></td><td style=“line-height:15px;text-align:left”><a href=“/guide.php?cat1=4&cat2=8&cat3=11&cat4=23&lan=ita” class=“link_m”><strong>Roccamorice &nbsp;&nbsp; &raquo;</strong> Leggi la guida</a><br><span style=“font-size:12px; color:#60615e”>Roccamorice, visita al borgo e all’Eremo di San Bartolomeo in Legio</span></td></tr></table>’,zoom:11,icon:‘/2014/images/ico_borghi_castelli.png’},

                                    • Alan Kilborn
                                      Alan Kilborn @peterelli last edited by

                                      @peterelli said:

                                      Where can i find some informations about the formatting options to separate the needed data?

                                      This is a good starting point: https://notepad-plus-plus.org/community/topic/15765/faq-desk-where-to-find-regex-documentation
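As a possible starting point for adapting the pattern to the one-line layout, here is an annotated sketch using Python's `re` module in verbose mode, where each part of the pattern carries a comment. It is only one way to do it and makes some assumptions: the real page source uses straight single quotes (the curly quotes above look like forum formatting), every record has `html:` directly after the title, and the names `POI` and `rows_from` are invented for the example.

```python
import re

# re.VERBOSE ignores whitespace inside the pattern and allows # comments,
# which makes it easier to see what each piece does.
POI = re.compile(r"""
    lat:\s*(\d+\.\d+)      # group 1: latitude, digits.digits
    ,\s*                   # comma, then optional spaces or line breaks
    lon:\s*(\d+\.\d+)      # group 2: longitude
    ,\s*
    title:\s*'(.+?)'       # group 3: title, as short as possible ...
    ,\s*html:              # ... but it must be followed by ,html:
""", re.VERBOSE)


def rows_from(text):
    """Return one 'lat,lon,title' string per record found in text."""
    return [",".join(m.groups()) for m in POI.finditer(text)]
```

Since `\s*` also matches line breaks, the same pattern should cover the multi-line layout from the first region as well. Anchoring the title on the following `html:` keeps names containing an apostrophe, such as Città Sant'Angelo, in one piece.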
