replace / move numbers from one row to another with regular expression (in parentheses)
-
@Robin-Cruise said in replace / move numbers from one row to another with regular expression (in parentheses):
The problem is, how do I copy the parentheses and numbers back to their place between tags?
Not sure where you are heading but you still haven’t answered my question.
I’m going to take a punt since it does seem a bit obvious from the examples:
- both files contain the same number of lines and are in the same order, line 1 of file 1 corresponds with line 1 of file 2.
- I’d prefix each line in each file with an ascending number 1,2,3. I’d also add (behind the number) an “a” for file 1 and a “b” for file 2.
- I’d then combine both files and sort lexicographically.
- I’d use a regex to copy the number from #a line and replace the equivalent number in the #b line, at the same time removing the #a line (since we don’t need the file 1 line anymore).
- I’d then remove the prefix from the lines (#b) leaving the changed file 2 content.
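For anyone who would rather script it, the steps above can be sketched in Python. This is only an illustration under assumptions: the function name `copy_counts` is made up, every line in both files is assumed to contain one number in parentheses, and line N of file 1 is assumed to pair with line N of file 2. The line numbers are zero-padded so that the lexicographic sort keeps 2 before 10.

```python
import re

def copy_counts(file1_lines, file2_lines):
    """Copy the "(number)" of each file 1 line into the matching file 2 line."""
    width = len(str(max(len(file1_lines), len(file2_lines))))
    # Prefix every line with a zero-padded line number plus "a" (file 1) or "b" (file 2).
    tagged = [f"{i:0{width}d}a{line}" for i, line in enumerate(file1_lines, 1)]
    tagged += [f"{i:0{width}d}b{line}" for i, line in enumerate(file2_lines, 1)]
    # A lexicographic sort interleaves the two files: 1a, 1b, 2a, 2b, ...
    tagged.sort()
    merged = "\n".join(tagged) + "\n"
    # Take the number from the "a" line, put it into the "b" line, drop the "a" line.
    pair = re.compile(
        r"^(\d+)a[^\n]*\((\d+)\)[^\n]*\n"   # the "a" line and its number
        r"\1b([^\n]*)\(\d+\)([^\n]*)\n",     # the "b" line with the same prefix
        re.MULTILINE,
    )
    merged = pair.sub(r"\g<1>b\g<3>(\g<2>)\g<4>\n", merged)
    # Remove the remaining "b" prefixes, leaving the changed file 2 content.
    merged = re.sub(r"^\d+b", "", merged, flags=re.MULTILINE)
    return merged.splitlines()
```

Calling `copy_counts` with the lines of both files returns the file 2 lines carrying the file 1 numbers.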
Terry
-
ok, I made a regex, it is kind of a step forward. I managed to move the numbers and parentheses from the first 3 rows to the next 3, but not in the correct order.
SEARCH:
(?s)(<li><a href=)(.*?)(\(\d+\))(<\/a><\/li>).*?\K(\w+)
REPLACE BY:
\3
If I could do that, it’s a sign that it’s somehow possible, but I am not very good at regex. Maybe @guy038 will improve my regex. :)
-
Hi, @robin-cruise, @terry-r and All,
Sorry, but a significant amount of information is missing for a good understanding of your goal:
- How many lines <li><a href="page-##.html" title="Page ##">Page ## (##)</a></li> does your File 1.html file contain?
- Are all these lines consecutive?
- Even if some of these lines are consecutive, are there other similar sections containing this same type of line?
- Does the File 2.html file contain the same number of lines <li><a href="page-##.html" title="Page ##">Page ## (##)</a></li> as the File 1.html file?
- If the File 2.html file also contains some sections of these lines, is the layout identical between the two files?
In short, could you provide a larger part of your files to get a more precise idea of the changes to do ?
Best Regards,
guy038
-
-
-
both of them, File 1.html just like File 2.html contains 40 lines. both files have the same structure, except the numbers in parentheses.
-
The numbers (in parentheses) are different in the two files, but I want them to be the same. Right now, they are not consecutive numbers, but random ones.
-
page-1.html" title="Page 1">Page 1 is a short version, just an example.
The real pages are like: <li><a href="Love-Master-A-Manga-Volume.html" title="Love Master A Manga Volume">Love Master A Manga Volume (22)</a></li>
Basically, it’s about a menu on a website translated into 2 languages, and that number in parentheses, e.g. (22), is the number of articles in that section. The numbers should be the same in both languages.
-
-
@Robin-Cruise said in replace / move numbers from one row to another with regular expression (in parentheses):
both of them, File 1.html just like File 2.html contains 40 lines.
Did you ever read my questions? Specifically, I asked whether there are the same number of lines in each. Also, is line 1 in file 1 equivalent to line 1 in file 2, so that the line 1 number (##) is copied across to line 1 in file 2?
If so then I have already provided the solution in words (just needs translating into code), which I wrote out for you. Since you seem to have a good idea on how to create regexes, did you not try to follow my instructions?
Terry
-
@Robin-Cruise said in replace / move numbers from one row to another with regular expression (in parentheses):
- both of them, File 1.html just like File 2.html contains 40 lines. both files have the same structure, except the numbers in parentheses.
The real pages are like: <li><a href="Love-Master-A-Manga-Volume.html" title="Love Master A Manga Volume">Love Master A Manga Volume (22)</a></li>
Given these conditions, and if you are allowed to install the LuaScript plugin, there is a script that can select and copy the numbers in parentheses from one page and paste them into the other.
Please confirm and I will provide further details.
Take care and have fun!
-
Sorry for my objection, but it looks easier to just change the wording of file 2 to make it fit file 1 and keep the numbers untouched. So file 1 would get replaced and translated by file 2, which is the right one concerning the numbers. idk
-
this solved the problem.
In short, I copy all the text before the parentheses from File 1 into column A in Excel. Then I copy the parentheses with numbers from File 2 into column B, and put the rest after the parentheses into column C. Then I copy all the Excel columns back into File 2, replacing the original lines. Now the parentheses are identical.
Step 1 - Use regex to select all parentheses (with numbers), then copy them to Excel in column 2
SEARCH:
.*?(\(.*?\)).*
REPLACE BY:
\1
Step 2 - Use regex to keep everything on each line before the parentheses:
SEARCH:
\(.*\).*
REPLACE BY:
(leave empty)
Step 3 - Copy the resulting lines to the Excel file in column 1
Step 4 - Copy directly to column 3 of Excel what is after the parentheses:
</a></li>
or use regex to obtain this result, keeping everything after the round brackets:
SEARCH:
^.*\)(.*)
REPLACE BY:
\1
Step 5 - Copy all the Excel content to a new Notepad++ file.
If there are too many empty spaces, search and replace 2 spaces with one space
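The same three-column idea can be tried without Excel. The following is a rough Python sketch, not the procedure actually used above: the names `PARTS` and `merge_columns` are invented for the example, and it assumes each line holds at most one "(number)" group and that the two files line up one to one. The text before and after the parentheses comes from File 1 and the number comes from File 2, as in the summary.

```python
import re

# Splits a line into: text before "(number)", the "(number)" itself, text after it.
PARTS = re.compile(r"^(.*)(\(\d+\))(.*)$")

def merge_columns(file1_lines, file2_lines):
    """Rebuild each line from file 1's text and file 2's "(number)"."""
    merged = []
    for line1, line2 in zip(file1_lines, file2_lines):
        m1, m2 = PARTS.match(line1), PARTS.match(line2)
        if m1 is None or m2 is None:
            merged.append(line1)   # no "(number)" on one side: keep the line as is
            continue
        before, _, after = m1.groups()   # columns A and C, taken from file 1
        count = m2.group(2)              # column B, taken from file 2
        merged.append(before + count + after)
    return merged
```

Swapping the two arguments gives the opposite direction (File 2 text with File 1 numbers).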
-
Hello, @robin-cruise, @terry-R, @astrosofista, @carypt and All,
Sorry to be so late; I have answered many posts recently!
Here is my method :
- Open your File 1.html and File 2.html files in N++
- At the end of the File 1.html contents, insert, for instance, a new line =====
- Append the File 2.html contents, right after that new line
Note that we’ll need two specific characters, which are not used yet in your HTML files:
- One char to separate the contents of the two files, in File 1.html. I chose the = sign. Hence the line of five = signs
- One char used by the regex S/R in order to mark the numbers between parentheses already processed. I chose the # character
- Of course, you may choose any character for these two specific chars. Just modify the regex accordingly
- Preferably, avoid the true regex symbols \ ^ $ . | ? * + ( ) [ ] { }
- For instance, after merging the File 2.html contents into File 1.html, we would obtain this tiny text, with the ===== separation:

<li><a href="Love-Master-A-Manga-Volume.html" title="Love Master A Manga Volume">Love Master A Manga Volume (22)</a></li>
bla bla bla bla bla bla bla bla
<li><a href="Love-Master-A-Manga-Volume.html" title="Love Master A Manga Volume">Love Master A Manga Volume (18)</a></li>
bla bla
<li><a href="Love-Master-A-Manga-Volume.html" title="Love Master A Manga Volume">Love Master A Manga Volume (23)</a></li>
=====
<li><a href="Love-Master-A-Manga-Volume.html" title="Love Master A Manga Volume">Love Master A Manga Volume (2)</a></li>
bla bla
<li><a href="Love-Master-A-Manga-Volume.html" title="Love Master A Manga Volume">Love Master A Manga Volume (12)</a></li>
bla bla bla bla bla bla bla bla
<li><a href="Love-Master-A-Manga-Volume.html" title="Love Master A Manga Volume">Love Master A Manga Volume (10)</a></li>

- Move to the very beginning of File 1.html
- Open the Replace dialog (Ctrl + H)
  - SEARCH (?s)\((\d+)\)(.+=====.+?)\((\d+)\)|^=====.+|#(?!.*^===)
  - REPLACE ?1\(\3#\)\2\(\3#\)
  - Untick, if necessary, the Wrap around option
  - Select the Regular expression search mode
- Now, keeping the Replace dialog opened, click on the Replace All button (or preferably hit the Alt + A shortcut) repeatedly, until the message Replace All: 0 occurrences were replaced from caret to end-of-file is displayed!

And you’ll get the expected File 1.html contents:

<li><a href="Love-Master-A-Manga-Volume.html" title="Love Master A Manga Volume">Love Master A Manga Volume (2)</a></li>
bla bla bla bla bla bla bla bla
<li><a href="Love-Master-A-Manga-Volume.html" title="Love Master A Manga Volume">Love Master A Manga Volume (12)</a></li>
bla bla
<li><a href="Love-Master-A-Manga-Volume.html" title="Love Master A Manga Volume">Love Master A Manga Volume (10)</a></li>

- Save the new File 1.html contents, with all the updated numbers between parentheses!
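Outside Notepad++, the same end result can be reached in a single pass. The sketch below is a different technique from the repeated Replace All (it pairs the numbers by their order of appearance with a replacement callback), and the names `NUMBER` and `transfer_numbers` are made up for the illustration. It assumes the k-th number in parentheses of File 1 corresponds to the k-th one of File 2.

```python
import re

NUMBER = re.compile(r"\((\d+)\)")

def transfer_numbers(file1_text, file2_text):
    """Replace the k-th "(number)" of file 1 with the k-th "(number)" of file 2."""
    donors = iter(NUMBER.findall(file2_text))

    def swap(match):
        # If file 2 runs out of numbers, keep file 1's own number.
        return f"({next(donors, match.group(1))})"

    return NUMBER.sub(swap, file1_text)
```

Unlike the merged-file method, this needs no separator line and no marker character, since the two texts are never joined.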
Notes :
- For N Replace All operations processed, in total:
  - The N - 2 first operations:
    - Replace the numbers of File 1.html with the corresponding numbers of File 2.html, located after the ===== line
    - Add a # marker to the two numbers processed
  - The N - 1 operation deletes from the ===== line till the very end of file, in order to suppress the temporary appended part
  - The N operation deletes all the existing # markers of File 1.html
Best Regards,
guy038
-
-
@guy038 said in replace / move numbers from one row to another with regular expression (in parentheses):
=====
brilliant. You really are very good @guy038
THANK YOU !
-
@guy038 said in replace / move numbers from one row to another with regular expression (in parentheses):
?1\(\3#\)\2\(\3#\)
By the way, in the replacement, what does ?1\(\3#\)\2\(\3#\) mean (step by step, please)?
-
@Robin-Cruise said in replace / move numbers from one row to another with regular expression (in parentheses):
by the way, on replace, what does it mean ?1\(\3#\)\2\(\3#\) (step by step, please) ?
It’s fairly “easy”: :-)
?1 controls the rest of it:
- If capture group #1 was NOT matched, the replacement is “nothing” (aka deletion)
- If capture group #1 WAS matched, then the replacement consists of:
  - opening parens: (
  - what was matched with capture group #3
  - a literal #
  - closing parens: )
  - what was matched with capture group #2
  - opening parens: (
  - what was matched with capture group #3
  - a literal #
  - closing parens: )
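To watch that logic at work outside Notepad++, here is a small Python emulation. It is a sketch under assumptions, not how Notepad++ runs it internally: Python has no `?1` conditional in replacement strings, so a replacement function (here the made-up name `replacement`) stands in for it, and `(?m)` is added to the search pattern because Python's `^` only matches at line starts in multiline mode. The loop plays the role of clicking Replace All until nothing is replaced.

```python
import re

# The SEARCH pattern from the post above, with the m flag added for Python.
SEARCH = re.compile(r"(?sm)\((\d+)\)(.+=====.+?)\((\d+)\)|^=====.+|#(?!.*^===)")

def replacement(match):
    # Stand-in for the conditional replacement: group 1 decides what happens.
    if match.group(1) is None:
        return ""                      # group 1 not matched: delete the match
    new = match.group(3)               # the number found below the ===== line
    # "(" + group 3 + "#" + ")" + group 2 + "(" + group 3 + "#" + ")"
    return f"({new}#){match.group(2)}({new}#)"

def replace_all_until_done(text):
    """Repeat the replacement until a pass changes nothing; return text and pass count."""
    passes = 0
    while True:
        text, count = SEARCH.subn(replacement, text)
        if count == 0:
            return text, passes
        passes += 1
```

On a merged text with three numbers on each side of the ===== line, this takes five effective passes: three that copy and mark one number each, one that deletes everything from ===== onward, and one that removes the # markers.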