Community
    • Login

    Replacing A unique ID value with another based on a reference table in multiple text files

    Scheduled Pinned Locked Moved Help wanted · · · – – – · · ·
    3 Posts 3 Posters 397 Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • Michael MadisonM
      Michael Madison
      last edited by

      Hi,

      I am sure this has been covered, in one way or another, but the searches come up with very code heavy solution that leave me clueless, and I have zero idea what i am doing.

      I am trying to edit a few thousand files like the below. The Value WELL,“XXXXXX” is unique to each file, and I want to use a reference table to replace that unique 6 digit identifier with a 14 digit unique identifier. Can the find in files and replace function do this?

      d14badde-8c65-44e2-a0d0-7deb1a9b5013-image.png

      Mark OlsonM mkupperM 2 Replies Last reply Reply Quote 0
      • Mark OlsonM
        Mark Olson @Michael Madison
        last edited by Mark Olson

        @Michael-Madison said in Replacing A unique ID value with another based on a reference table in multiple text files:

        I want to use a reference table to replace that unique 6 digit identifier with a 14 digit unique identifier. Can the find in files and replace function do this?

        Yes, technically you could use the find in files functionality for this. If your lookup table was relatively small, it wouldn’t be too bad. However, I suspect that if you had something like 12-15 (or more) rows in your reference table, a scripting solution would be far more efficient.

        I’m not going to describe the scripting solution because that is a generic programming problem that does not have any Notepad++ - specific features. StackOverflow (and honestly probably ChatGPT) are both perfectly capable of supplying solutions to this problem. I don’t use ChatGPT personally, but I’m sure that if you asked it for tips on how to learn Python and then asked it for a script on how to solve this specific problem, that would be a pretty robust long-term solution to your problem, since ChatGPT is good at solving problems that have been solved hundreds of times before on the internet.

        Instead, I’m going to focus on how you could solve this using Notepad++, even though it might not be the best solution.

        For example, to perform the mapping (foo -> R_ONE, bar -> R_TWO, baz -> R_THREE) you would do this:
        Find what: (?-i)(foo)|(bar)|(baz)
        Replace with: R_(?1ONE)(?2TWO)(?3THREE)
        Search Mode: Regular expression

        Hopefully you see the pattern here; here is some relevant documentation.

        It is possible to use regular expressions and Notepad++ menu functions to convert a list of key-value pairs (like foo,ONE bar,TWO baz,THREE) to a find/replace regex pair that will do what I described above, but the fact is that if you have so many key-value pairs that you would need to generate the regular expressions programmatically, the regex-based solution I described would be too inefficient and you’re back to needing a scripting solution.

        1 Reply Last reply Reply Quote 4
        • mkupperM
          mkupper @Michael Madison
          last edited by

          @Michael-Madison said in Replacing A unique ID value with another based on a reference table in multiple text files:

          I am sure this has been covered, in one way or another, but the searches come up with very code heavy solution that leave me clueless, and I have zero idea what i am doing.

          Unfortunately, the problem you described is best dealt with using either code or a data management tool. Notepad++ is a text editor. While you can use Notepad++'s search/replace to do some crude data management tasks you end up with stuff that is both hard to create and harder to maintain.

          I’ll use the idea proposed by @Mark-Olson and show this example. Let’s say we have a list of ten 6-digit WELL numbers and their corresponding 14-digit numbers.

          729647	01816277784739
          832291	05203379882010
          323016	08529698086864
          852429	25301745193483
          613474	43535571764114
          421123	69712418942707
          495790	62415119153289
          901250	16906577703160
          214509	29859837478695
          936045	11344788176363
          

          A search/replace on multiple files of:
          Search: (?-i)(?<=^WELL,")(?:(729647)|(832291)|(323016)|(852429)|(613474)|(421123)|(495790)|(901250)|(214509)|(936045))(?="$)
          Replace: (?1(01816277784739))(?2(05203379882010))(?3(08529698086864))(?4(25301745193483))(?5(43535571764114))(?6(69712418942707))(?7(62415119153289))(?8(16906577703160))(?9(29859837478695))(?10(11344788176363))

          will translate files containing

          WELL,"729647"
          WELL,"832291"
          WELL,"323016"
          WELL,"852429"
          WELL,"613474"
          WELL,"421123"
          WELL,"495790"
          WELL,"901250"
          WELL,"214509"
          WELL,"936045"
          

          into files containing

          WELL,"01816277784739"
          WELL,"05203379882010"
          WELL,"08529698086864"
          WELL,"25301745193483"
          WELL,"43535571764114"
          WELL,"69712418942707"
          WELL,"62415119153289"
          WELL,"16906577703160"
          WELL,"29859837478695"
          WELL,"11344788176363"
          

          You can expand that search/replace to deal with 20, maybe 30, translations at a time. The issue is that there’s a 1000 to 2000 character limit how long search/replace expressions can be. My example works for dealing with ten values at a time. I did not push the concept to see what the limit is. You said you had thousands of values meaning you would need to construct dozens, and possibly hundreds, of search/replace operations and to run them on all the data files.

          This is an area where code shines vs dealing with it in Notepad++ alone as code can easily deal with translating a value such as 421123 into the desired 69712418942707. You run that on all your files and you will have the 14-digit WELL values.

          1 Reply Last reply Reply Quote 3
          • First post
            Last post
          The Community of users of the Notepad++ text editor.
          Powered by NodeBB | Contributors