How to copy content from Web page, and paste text and links only into notepad++



  • Hi,
    How to copy content from Web page, and paste text and links only into notepad++

    I want to for example copy mid portion of the page at https://docs.microsoft.com/en-us/azure/devops/organizations/settings/work/customize-process-field?view=azure-devops

    And paste that text into Notepad++; however, the links for the text are lost.
    I don’t want the images,
    only the text and links.

    I am trying to prepare notes that contain the text along with the links it points to, so that others can verify them.

    Please help.



  • @manoharreddyporeddy
    Go to the browser console (F12) and select the necessary code block!




  • Hello
    @andrecool-68

    thanks for the reply
    I already tried that thing

    I want “only text and links”.
    This means don’t want HTML tags either.
    Hope you understood the question better now.

    Thanks for trying to help me and community, this is a basic thing that we would be able to do in NPP.



  • @manoharreddyporeddy said in How to copy content from Web page, and paste text and links only into notepad++:

    this is a basic thing that we would be able to do in NPP

    Why do you think this is a “basic thing”?
    To Notepad++, HTML tags are text.
    So, separating may not be as easy as you think.
    Why don’t you directly show some before text and your desired after text; that is, put forth some effort?
    Then maybe some kind people can make suggestions for you.



  • @manoharreddyporeddy Why aren’t you using HTML? Text formatting is very beautiful!



  • @manoharreddyporeddy said in How to copy content from Web page, and paste text and links only into notepad++:

    I want “only text and links”.

    What follows is my assumption of what you’re asking for - it isn’t quite clear to me either what you want.

    Sounds like you want to paste the “links” as clickable URLs with the href name instead of the URL. Like so maybe:

    e590818b-e042-48ef-8942-9ab4cf91cbb7-image.png

    Note this is Microsoft Word pasting “rich text”. Notepad++ is a text editor not capable of translating URL names to hypertext links you can click on. Notepad++ can highlight and make clickable hypertext links if it detects the “http[s]://”-like pattern, but rich copy / paste from a web browser to Notepad++ with some magic URL transformation isn’t possible (unless maybe with a plugin).

    Cheers.



  • Hello, @manoharreddyporeddy, and All,

    May be this road-map could be OK :

    => We’ve been redirected to MicrosoftDocs /azure-devops-docs, on GitHub site

    • Then click on the Raw button, on the right of screen

    • Select all this raw page

    • Copy and paste it in a new N++ tab

    Voila !

    Best Regards,

    guy038





  • @Alan-Kilborn

    Thanks for the reply :)

    We both love NPP :)

    I want “only text and links”
    I asked above because - both are supported by NPP :)



  • @guy038

    Thanks for the reply :)

    But the result also contains HTML like the below:

    <a id="open-process-wit"> </a>
    <a id="add-field"> </a>
    <a id="add-custom-field"> </a>



  • @Michael-Vincent

    You are close to what I asked :)

    Below is an example.
    I felt there should be a way because both text & URL are supported.

    43a87a8f-3456-438e-9733-08691b38b9da-image.png



  • @andrecool-68

    :) not sure I understand
    but may be because you could not understand the way I wrote

    Non-technical people won’t understand the HTML.

    Please see this post:
    https://community.notepad-plus-plus.org/topic/19530/how-to-copy-content-from-web-page-and-paste-text-and-links-only-into-notepad/11



  • @manoharreddyporeddy said in How to copy content from Web page, and paste text and links only into notepad++:

    I felt there should be a way because both text & URL are supported.

    Notepad++ is a text editor, not an HTML parsing engine. “URL are supported” in that when displaying pure text, Notepad++ runs a regex looking for things that are mostly URL-like, and highlights them and provides a clickable interface to launch an external process (your browser) to get a page. But Notepad++'s “URL” interface has nothing to do with HTML.

    What you are asking for is to somehow make your text editor a general purpose rendering engine that takes incoming HTML, processes it, performs all the background tasks that web servers and/or browsers perform to provide the full webpage (not just the raw HTML that the server originally sent), and get it displayed in your preferred format.

    Just a few things to note:

    • the page you referenced has internal and external javascript tags. Javascript can change the content of the webpage in the browser. Good luck getting that to work in a non-browser situation.
    • the paragraph you referenced never actually used that exact URL. Specifically, the code for that and the next paragraph is
    <p>For a list of all fields defined for your organization—which includes all fields defined for system and inherited processes—see <a href="#review-fields" data-linktype="self-bookmark">Review fields</a>.</p>
    
    <p>Once you've added a custom field, you can create <a href="../../../boards/queries/using-queries?view=azure-devops" data-linktype="relative-path">queries</a>, <a href="../../../report/dashboards/charts?view=azure-devops" data-linktype="relative-path">charts</a>, or <a href="../../../report/powerbi/create-quick-report?view=azure-devops" data-linktype="relative-path">Analytics views and Power BI reports</a> to track data related to it.</p>
    
    

    Notice that the URL of the link you showed is #review-fields. How do you expect Notepad++, which is not a web browser, to be able to magically know what the main URL is, and properly transform that relative/partial URL into a full URL? You would have to know beforehand that this page happens to use a <link ... rel="canonical"> tag somewhere else in the document, rather than an equally-valid <base href=...>, which is the HTML I learned for the same concept many years ago.

    And then it would have to not only know when to prepend it to URLs like #review-fields, but also how to fix URLs like the ../../../boards/queries/using-queries?view=azure-devops in the next paragraph, which requires changing directories while processing.
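
    To make that resolution step concrete, here is a minimal Python sketch (outside Notepad++) that resolves the two kinds of href quoted above with the standard library's urllib.parse.urljoin. The names PAGE_URL and resolve are made up for illustration, and it assumes the page address from the original question is the right base, which would not hold if the page declared a different <base href=...>.

```python
from urllib.parse import urljoin

# Assumption: the page address from the original question is the base URL.
PAGE_URL = (
    "https://docs.microsoft.com/en-us/azure/devops/organizations/"
    "settings/work/customize-process-field?view=azure-devops"
)

def resolve(href: str, base: str = PAGE_URL) -> str:
    """Turn a fragment-only or relative href into an absolute URL."""
    return urljoin(base, href)

# A self-bookmark keeps the page path and query, and only adds the fragment.
print(resolve("#review-fields"))
# Each "../" climbs one directory above the page's own folder.
print(resolve("../../../boards/queries/using-queries?view=azure-devops"))
```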

    What you are asking for in your one clarification post is that we write for you a whole HTML web-browser that happens to render in your preferred format using regex inside Notepad++. I don’t think so.

    If you wanted something sane, like

    <p>For a list of all fields defined for your organization—which includes all fields defined for system and inherited processes—see <a href="#review-fields" data-linktype="self-bookmark">Review fields</a>.</p>
    

    becomes

    For a list of all fields defined for your organization—which includes all fields defined for system and inherited processes—see Review fields (#review-fields).
    

    That’s barely doable, using a lot of assumptions. But anything more than that is not going to happen in a way that satisfies you.
    For this, I would do it as the following, with the huge caveat that there are many assumptions (explicit and hidden) which must be satisfied for this to work. It will be a two-step process:

    1. Convert <a ... href="url"...>link text</a> to link text (url)
      • FIND = (?si)<a.*?href="(.*?)".*?>(.*?)</a>
      • REPLACE = $2 \($1\)
      • SEARCH MODE = regular expression
      • EXPLICIT ASSUMPTIONS = no <a ...> has a > inside. all href="url" use double quote " rather than single quote '. no links are manipulated by or require javascript
    2. Convert any other tag opening or tag closing into nothingness
      • FIND = (?is)</?\w+.*?>
      • REPLACE = empty / no value
      • SEARCH MODE = regular expression
      • EXPLICIT ASSUMPTIONS = no tag has > inside. no tags are manipulated by or require javascript

    With the two paragraphs I quoted above:

    <p>For a list of all fields defined for your organization—which includes all fields defined for system and inherited processes—see <a href="#review-fields" data-linktype="self-bookmark">Review fields</a>.</p> 
    <p>Once you've added a custom field, you can create <a href="../../../boards/queries/using-queries?view=azure-devops" data-linktype="relative-path">queries</a>, <a href="../../../report/dashboards/charts?view=azure-devops" data-linktype="relative-path">charts</a>, or <a href="../../../report/powerbi/create-quick-report?view=azure-devops" data-linktype="relative-path">Analytics views and Power BI reports</a> to track data related to it.</p>
    

    they would be transformed into

    For a list of all fields defined for your organization—which includes all fields defined for system and inherited processes—see Review fields (#review-fields). 
    Once you've added a custom field, you can create queries (../../../boards/queries/using-queries?view=azure-devops), charts (../../../report/dashboards/charts?view=azure-devops), or Analytics views and Power BI reports (../../../report/powerbi/create-quick-report?view=azure-devops) to track data related to it.
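
    For anyone who would rather script this than run it in the Replace dialog, the same two steps can be sketched in Python. This is only an illustration under the same explicit assumptions as above (double-quoted href, no > inside a tag, no javascript); the function name html_to_text_with_links is hypothetical, and Python's re module stands in for the Notepad++ regex engine.

```python
import re

# Step 1: <a ... href="url" ...>link text</a>  ->  link text (url)
ANCHOR = re.compile(r'<a.*?href="(.*?)".*?>(.*?)</a>', re.S | re.I)
# Step 2: any remaining opening or closing tag  ->  nothing
ANY_TAG = re.compile(r'</?\w+.*?>', re.S | re.I)

def html_to_text_with_links(html: str) -> str:
    """Apply step 1, then step 2, and return the resulting plain text."""
    text = ANCHOR.sub(r'\2 (\1)', html)
    return ANY_TAG.sub('', text)

sample = ('<p>For a list of all fields, see <a href="#review-fields" '
          'data-linktype="self-bookmark">Review fields</a>.</p>')
print(html_to_text_with_links(sample))
```

    The order matters: step 2 would delete the <a ...> tags, and the URLs with them, if it ran first.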
    

    In the old days, I might have suggested “user stylesheets”, which is a feature some browsers implemented that allowed the web-surfer to apply their own stylesheets to a given page. A quick search suggests that Chrome doesn’t support them, but maybe Firefox still does. If you do have access to a browser with “user stylesheet” support, you might be able to apply something like

    a:link:after, a:visited:after {content:" (" attr(href) ") "; font-size:75%; font-family: monospace; }
    

    I embed this in some wiki-like pages that I write (for my own reference) on my company’s intranet, using an @media print, so that when I print pages to PDF, it includes the URL for the links. But I cannot guarantee that there is any “user stylesheet” support, or, if there is, that it would honor a:link:after and similar. However, if you can make that work, it might then show the text you want in the browser, so that you can then copy/paste the rendered text into Notepad++.



  • @PeterJones said in How to copy content from Web page, and paste text and links only into notepad++:

    Convert <a ... href="url"...>link text</a> to link text (url)

    FIND = (?si)<a.*?href="(.*?)".*?>(.*?)</a>
    REPLACE = $2 \($1\)
    SEARCH MODE = regular expression
    EXPLICIT ASSUMPTIONS = no <a ...> has a > inside. all href="url" use double quote " rather than single quote '. no links are manipulated by or require javascript

    Convert any other tag opening or tag closing into nothingness

    FIND = (?is)</?\w+.*?>

    Yes, you are close:

    1. Convert <a ... href="url"...>link text</a> to link text (url)
    2. Convert any other tag opening or tag closing into nothingness

    A. The above two steps should be part of “Paste special”, available from the menu or a toolbar icon. (Ideal, and doable)

    B. For A to happen, where do we get the html from?
    I am not sure whether copying content from an HTML page puts all of these details on the clipboard.
    May be it does copy all details to clipboard, that is why, when you paste in Word, it provides links linked to text as hyperlinks. NPP only needs to put the URLs aside the text.

    So, since B is possible, A should be possible.

    In 2020, we should be able to do this as part of Paste Special.

    I know this is just how a user like me feels about NPP.
    I can write a plugin, unfortunately I don’t know how to write for NPP, and too much tied up with other works.
    I am not expecting some solution immediately, at least we were able to discuss objectively for most part, leaving small tech issues in either of our discussions.

    Thanks, hopefully someone will help in Paste special in future.
    Yes, it looks like basic functionality to me, even though it looks like an HTML parser to others. Assuming the clipboard holds the information as HTML, it mostly comes down to capturing the URL parts using (), placing them beside the text using \1, and removing all tags.



  • @manoharreddyporeddy ,

    should be part of

    In your opinion. Not everyone wants or needs your use case, and some might be really annoyed if Paste Special extracted URLs from HTML, a feature useless to them, instead of doing something that is important to them.

    hopefully someone will help in Paste special in future.

    This forum is for discussion, and helping fellow users figure out how to accomplish their goals. But feature requests are not tracked here, and asking for something here does not notify the development team that the feature is desired.
    If you think strongly that it’s a feature Notepad++ should have, then the appropriate place to make feature requests is described in this FAQ.

    Yes, you are close:

    If those two steps work for you, then you can record them as a macro, and then assign a keyboard shortcut to that macro, thus allowing you to use that feature to your heart’s content.

    For A to happen, where do we get the html from?

    Every modern browser I know of has a view-source feature (often Ctrl+U) or a save-as feature (often Ctrl+S or Ctrl+Shift+S).

    May be it does copy all details to clipboard, that is why, when you paste in Word, it provides links linked to text as hyperlinks. NPP only needs to put the URLs aside the text.

    In Windows, when you copy from certain sources, it places multiple formats of the copied content on the clipboard. Notepad++, being a text editor, requests the plaintext version. To get something different, you could write a plugin which requests a different clipboard format from Windows, and does something with it. Since you are unwilling to do that, that seems an awful lot of work to ask someone else to go through (for free) to save you from having to type Ctrl+U before copying.



  • Hello, @manoharreddyporeddy, and All,

    Assuming your example link, below :

    https://docs.microsoft.com/en-us/azure/devops/organizations/settings/work/customize-process-field?view=azure-devops

    And the raw GitHub page, got by the method described in my previous post and pasted in a new N++ tab,

    Here are some regex S/R in order to get a fairly neat text, close to what you expect !


    • The process consists of 20 S/Rs, which must be carried out in order (only steps 2 to 7 are interchangeable !)

    • For all the subsequent regex S/R to perform, in the Replace dialog :

      • Tick the Wrap around and Regular expression options

      • Click on the Replace All button, exclusively


    First, we run some trivial S/R to get rid of some unwanted ranges of text and modify some other parts.

    Note, at step 7, that any zone <a id="xxxxxx"> </a> just represents a bookmark which is replaced with the simple text ◊xxxxxx, with the character ( \x{25CA} ), beginning current line. You may change this character as you like !

    In this regard, note that all headers ( #..# ), in the articles, are, also, absolute self-bookmarks, by default

    Look at the comments column for further explanations on each S/R :

    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
    | ST |                                    SEARCH                            |       REPLACEMENT        |                               COMMENTS                               |     NUMBER      |
    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
    |  0 |  (?s)<!--.+?-->                                                      |  Leave EMPTY             |  SUPPRESSION of any XML COMMENT block / line                         |  2 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    |  1 |  (?s)\A^.+(?=^#\x20)                                                 |  Leave EMPTY             |  SUPPRESSION of anything BEFORE the FIRST header text                |  1 Occurrence   |
    |    |                                                                      |                          |                                                                      |                 |
    |  2 |  (?-s)^\h*:::.+\R                                                    |  Leave EMPTY             |  NON-INFORMATIVE zone, to be DELETED                                 |  4 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    |  3 |  (?-si)>\h*\[!div\x20class=".+\R                                     |  Leave EMPTY             |  NON-INFORMATIVE zone, to be DELETED                                 |  7 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    |  4 |  (?-i)</?strong>                                                     |  **                      |  REPLACEMENT of the STRING <Strong> or </Strong> with the STRING **  | 30 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    |  5 |  (?-i)&(mdash|#8212|#[xX]2014);                                      |  \x{2014}                |  REPLACEMENT of HTML syntax of EM DASH char with the char ITSELF  —  |  4 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    |  6 |  (?-i)\[!(NOTE|TIP|IMPORTANT|WARNING)]                               |  \1:                     |  CHANGE of [!NOTE], [!TIP], [!IMPORTANT], [!WARNING] with XXXXXX:    |  5 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    |  7 |  (?-si)^\h*<a\x20id="(.+?)".+(</a>|/>)                               |  ◊\x20\1                 |  ◊BOOKMARK_Name   ( Char ◊ = char LOZENGE [ \x{25CA} ] )             | 18 Occurrences  |
    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
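
    As a rough illustration only, steps 4 to 7 of this table could be scripted in Python as below. This is a sketch under assumptions, not the Notepad++ procedure itself: the names STEPS and clean are made up, [ \t] replaces \h (which Python's re does not support), and the opening square bracket of step 6 is written escaped as \[ so that it is read as a literal character.

```python
import re

# (pattern, replacement) pairs mirroring steps 4 to 7 of the table.
STEPS = [
    (r'</?strong>', '**'),                                   # step 4: bold markers
    (r'&(mdash|#8212|#[xX]2014);', '\u2014'),                # step 5: EM DASH entity
    (r'\[!(NOTE|TIP|IMPORTANT|WARNING)\]', r'\1:'),          # step 6: [!NOTE] -> NOTE:
    (r'(?m)^[ \t]*<a id="(.+?)".+(</a>|/>)', '\u25ca \\1'),  # step 7: bookmark line
]

def clean(text: str) -> str:
    """Run every search/replace pair in order, like clicking Replace All."""
    for pattern, replacement in STEPS:
        text = re.sub(pattern, replacement, text)
    return text

print(clean('<a id="add-field"> </a>'))
print(clean('> [!NOTE]'))
```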
    

    Now, we have to solve the tricky problem of hyper-links, and, specifically, the relative links !!

    Regarding your example, all these links are, fundamentally, a combination of the main link ( <Addr> ), below, with the text embedded between the parentheses (.....)

    • <Addr> = https://github.com/MicrosoftDocs/azure-devops-docs/blob/master/docs/organizations/settings/work/customize-process-field.md

    Remark : The last part of the path ( customize-process-field.md ), is only used in self-bookmark links ( See below )

    There are 3 general types of links :

    • Self-bookmarks links [aaaa](#zzzz), which represent the real link [aaaa](<Addr>#zzzz)

    • Pictures links ![aaaa](xxxx/yyyy/zzzz/name.png), which represent the real link ![aaaa](<Addr>xxxx/yyyy/zzzz/name.png)

    • Articles links [aaaa](xxxx/yyyy/zzzz/name.md), which represent the real link [aaaa](<Addr>xxxx/yyyy/zzzz/name.md)


    In addition, regarding the last two types, above, some relative-link operands, as ../ may occur, embedded in parentheses, giving the syntax (../../../xxxxxx/yyyyyy/zzzzzz/name.ext). What does this mean ?

    In fact, these links can be considered as relative links ! To explain the usefulness of this syntax, here is a very basic example :

    Let’s take an initial link https://company.com/aaaa/bbbb/cccc/dddd/eeee/ffff/gggg/hhhh/name.md. If, for instance, a relative link is :

    • (../../../xxxx/yyyy/zzzz/name.png), its real link is https://company.com/aaaa/bbbb/cccc/dddd/eeee/xxxx/yyyy/zzzz/name.png

    • (../../xxxx/yyyy/zzzz/name.png), its real link is https://company.com/aaaa/bbbb/cccc/dddd/eeee/ffff/xxxx/yyyy/zzzz/name.png

    • (../xxxx/yyyy/zzzz/name.png), its real link is https://company.com/aaaa/bbbb/cccc/dddd/eeee/ffff/gggg/xxxx/yyyy/zzzz/name.png

    • (xxxx/yyyy/zzzz/name.png), its real link is https://company.com/aaaa/bbbb/cccc/dddd/eeee/ffff/gggg/hhhh/xxxx/yyyy/zzzz/name.png

    As you can see, in order to re-build up the correct absolute link, you need to subtract, from the end of the address, as many sub-folders as there are ../ sections in the relative path, before adding the final part of the path !!

    For example, if the relative address is (../../../xxxx/yyyy/zzzz/name.png), you take the entire link, omit the 3 last sub-folders ( ffff/gggg/hhhh/ ) and, then, add the part xxxx/yyyy/zzzz/name.png, giving the absolute path https://company.com/aaaa/bbbb/cccc/dddd/eeee/xxxx/yyyy/zzzz/name.png
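
    The rule just described can be written down as a few lines of Python. This is only a sketch of the "drop one folder per ../" arithmetic, with a made-up function name (resolve_relative); it assumes the base address ends with a file name and that the relative path never climbs above the host.

```python
def resolve_relative(base_url: str, relative: str) -> str:
    """Drop one trailing folder of the base for every leading '../'."""
    # Keep only the folder part of the base (the final file name is dropped).
    folders = base_url.rsplit("/", 1)[0].split("/")
    while relative.startswith("../"):
        relative = relative[3:]   # consume one "../"
        folders.pop()             # and give up one trailing folder
    return "/".join(folders) + "/" + relative

BASE = "https://company.com/aaaa/bbbb/cccc/dddd/eeee/ffff/gggg/hhhh/name.md"
for rel in ("../../../xxxx/yyyy/zzzz/name.png",
            "../xxxx/yyyy/zzzz/name.png",
            "xxxx/yyyy/zzzz/name.png"):
    print(resolve_relative(BASE, rel))
```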


    So :

    • First, in step 8, any absolute link is simply rewritten with its final syntax [....] ( http..../.... )

    • Then, in step 9, we normalize the tag <img src="xxxx/yyyy/zzzz/name.png" alt="aaaa" ... /> to the form ![aaaa](xxxx/yyyy/zzzz/name.png)

    • Now, in step 10, we insert the link <Addr> in the special [![.....](.....)](.....) syntax, before each ( symbol

    • Again, in step 11, we insert the link <Addr> in the special [!INCLUDE [.....](.....)] syntax, before the ( symbol

    • And, in step 12, we insert the link <Addr> in all the other relative links ![...](....), [....](....) and [....](....#....), before the ( symbol

    • Finally, in step 13 , we can already change the specific self-bookmark links to their final absolute links !

    Hence, the six subsequent S/Rs, below :

    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
    | ST |                                SEARCH                                |       REPLACEMENT        |                               COMMENTS                               |     NUMBER      |
    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
    |  8 |  (?-si)(\[.+?])\((http.+?)\)                                         |  \1\x20\(\x20\2\x20\)    |  ABSOLUTE link [....](http..../....)  =>  [....] ( http..../.... )   |  0 Occurrence   |
    |    |                                                                      |                          |                                                                      |                 |
    |  9 |  (?-si)^\h*<img src="(.+?)" alt="(.+?)".+                            |  ![\2]\(\1\)             |  REFORMATING of the TAG <img src="......../>  as  ![.....](.....)    |  7 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    | 10 |  (?-s)(\[!\[.+?])(\(.+?\)])(\(.+?\))                                 |  \1<Addr>\2<Addr>\3      |  INSERTION of |ADDR|  between  ![...] or [![...](...)]  and  (...)   |  0 Occurrence   |
    |    |                                                                      |                          |                                                                      |                 |
    | 11 |  (?-si)\[!(INCLUDE).+?](\(.+?\))]                                    |  [\1]<Addr>\2            |  INSERTION of |ADDR|, between      [INCLUDE]       and  (.....)      |  6 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    | 12 |  (?-s)(!?\[.+?])(\((?!/).+?\))                                       |  \1<Addr>\2              |  INSERTION of |ADDR|, between  ![.....] or [.....]  and  (.....)     | 56 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    | 13 |  (?-si)(\[.+?])(http.+?)\((#.+?)\)                                   |  \1\x20\(\x20\2\3\x20\)  |  Change of SELF-BOOKMARKS (#.......) into their ABSOLUTE link        |  7 Occurrences  |
    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
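
    To see what steps 12 and 13 do to a self-bookmark link, here is a small Python sketch. It is an approximation under assumptions: the function names are made up, the opening square brackets are written escaped as \[, and ADDR is the <Addr> value given above.

```python
import re

# <Addr> as given in the post: the GitHub address of the article.
ADDR = ("https://github.com/MicrosoftDocs/azure-devops-docs/blob/master/"
        "docs/organizations/settings/work/customize-process-field.md")

def insert_addr(text: str) -> str:
    """Step 12: put <Addr> between ![...] or [...] and (...), except when
    the link starts with "/" right after the opening parenthesis."""
    return re.sub(r'(!?\[.+?])(\((?!/).+?\))',
                  lambda m: m.group(1) + ADDR + m.group(2), text)

def absolutize_bookmarks(text: str) -> str:
    """Step 13: [text]<Addr>(#name)  ->  [text] ( <Addr>#name )"""
    return re.sub(r'(\[.+?])(http.+?)\((#.+?)\)', r'\1 ( \2\3 )', text)

line = 'see [Review fields](#review-fields).'
print(absolutize_bookmarks(insert_addr(line)))
```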
    

    So, after step 13, the remaining relative links are temporarily written, in one of these 3 forms :

    • ![alternate text]<Addr>(relative_link.png), for the pictures

    • [alternate text or INCLUDE]<Addr>(relative_link.md), for the articles

    • [![alternate text]<Addr>(relative_link.png)]<Addr>(relative_link.md) for the composites

    However, note that, at step 12, any relative link, beginning with a /, right after the ( delimiter, is not modified


    Continued in the next post !

    BR

    guy038



  • Hi, @manoharreddyporeddy, and All,

    Continuation of the previous post !

    Now, we must determine all the levels of these relative paths, found in a specific text. In other words, how many ../ parts each of them contains !

    We’ll use easy regexes, below, and simply click on the Count button, in the Find dialog

    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
    | ST |                                SEARCH                                |       REPLACEMENT        |                               COMMENTS                               |     NUMBER      |
    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
    |    |  \((\.\./){4,}\w                                                     |                          |   0 zone  (../../../       /../xxxx/xxxx/xxxxxxx)                    |  0 Occurrence   |
    |    |                                                                      |                          |                                                                      |                 |
    |    |  \((\.\./){3}\w                                                      |                          |  16 zones (../../../xxxx/xxxxxxxx) =>  Future use of QUANTIFIER {3}  | 16 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    | 14 |  \((\.\./){2}\w                                                      |                          |   0 zone  (../../xxxx/xxxx/xxxxxx) =>                                |  0 Occurrence   |
    |    |                                                                      |                          |                                                                      |                 |
    |    |  \((\.\./){1}\w                                                      |                          |   6 zones (../xxxx/xxxx/xxxxxxxxx) =>  Future use of QUANTIFIER {1}  |  6 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    |    |  \.md\(\w                                                            |                          |  33 zones (xxxx/xxxx/xxxxxxxxxxxx) =>  Future use of QUANTIFIER {0}  | 33 Occurrences  |
    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
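
    The same census can be taken in one pass with a short script instead of several Count clicks. The sketch below is an assumption-laden alternative, not part of the procedure: count_depths is a made-up name, and it only looks at paths that start with at least one ../ right after an opening parenthesis.

```python
import re
from collections import Counter

def count_depths(text: str) -> Counter:
    """Count relative links by their number of leading '../' parts."""
    depths = Counter()
    for m in re.finditer(r'\(((?:\.\./)+)\w', text):
        depths[len(m.group(1)) // 3] += 1   # each "../" is 3 characters long
    return depths

sample = '[a]X(../../../boards/a.md) [b]X(../media/b.png) [c]X(../c.md) [d]X(d.md)'
print(count_depths(sample))
```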
    

    From the table, above, we deduce that, in your text, only relative paths beginning with (../../../ or (../ and, of course, some without any part ../, occur

    So, we’re going to reconstitute all the absolute paths, in reverse order. That is to say, beginning with relative paths containing (../../../xxxx/yyyy/zzzz/name.ext), then with (../xxxx/yyyy/zzzz/name.ext) and, finally, simple forms (xxxx/yyyy/zzzz/name.ext)

    The number of sections ../ are indicated in the quantifier {#}, at two locations for each regex

    Note that each regex dot char ( . ), below, has been replaced with the [^()[] syntax, which represents any single char, different from any parenthesis character ( and ) and from an opening square bracket [. Indeed, one must plan the possibility of several links in a single line and/or possible other block(s) of parentheses, eventually nested !

    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
    | ST |                                SEARCH                                |       REPLACEMENT        |                               COMMENTS                               |     NUMBER      |
    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
    | 15 |  (?-si)](http[^()[]+/)([^()[]+?/){3}[^()[]*\((\.\./){3}([^()[]+)\)   |  ]\x20\(\x20\1\4\x20\)   |  RELATIVE links (../../../xxxx/xxxxx) are changed as ABSOLUTE links  | 16 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    | 16 |  (?-si)](http[^()[]+/)([^()[]+?/){2}[^()[]*\((\.\./){2}([^()[]+)\)   |  ]\x20\(\x20\1\4\x20\)   |  RELATIVE links (../../xxxxx/xxxxxxx) are changed as ABSOLUTE links  |  0 Occurrence   |
    |    |                                                                      |                          |                                                                      |                 |
    | 17 |  (?-si)](http[^()[]+/)([^()[]+?/){1}[^()[]*\((\.\./){1}([^()[]+)\)   |  ]\x20\(\x20\1\4\x20\)   |  RELATIVE links (../xxxx/xxxx/xxxxxx) are changed as ABSOLUTE links  |  6 Occurrences  |
    |    |                                                                      |                          |                                                                      |                 |
    | 18 |  (?-si)](http[^()[]+/)([^()[]+?/){0}[^()[]*\((\.\./){0}([^()[]+)\)   |  ]\x20\(\x20\1\4\x20\)   |  RELATIVE links (xxxx/xxxx/xxxxxxxxx) are changed as ABSOLUTE links  | 33 Occurrences  |
    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
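
    Since steps 15 to 18 differ only by the quantifier, they can be sketched as one loop in Python. This is an illustration under assumptions: absolutize and max_depth are made-up names, the [ inside the character class is written as \[ to keep Python's re module quiet, and the highest depth must be tried first, as explained above.

```python
import re

def absolutize(text: str, max_depth: int = 3) -> str:
    """Apply the step 15 to 18 pattern for n = max_depth down to 0."""
    for n in range(max_depth, -1, -1):
        pattern = (r'\](http[^()\[]+/)([^()\[]+?/){%d}[^()\[]*'
                   r'\((\.\./){%d}([^()\[]+)\)' % (n, n))
        # Keep the address minus n trailing folders, then add the relative part.
        text = re.sub(pattern, r'] ( \1\4 )', text)
    return text

ADDR = ("https://github.com/MicrosoftDocs/azure-devops-docs/blob/master/"
        "docs/organizations/settings/work/customize-process-field.md")
print(absolutize('[queries]' + ADDR + '(../../../boards/queries/using-queries.md)'))
print(absolutize('![pic]' + ADDR + '(media/pic.png)'))
```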
    

    Finally, if occurrences were found, at step 10, we slightly modify the output of the composite links [![.....] ( ..... )] ( ..... ), again, for readability :

    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
    | ST |                                SEARCH                                |       REPLACEMENT        |                               COMMENTS                               |     NUMBER      |
    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
    | 19 |  (?-si)\[(!\[.+?]\x20\(\x20http.+?\))]                               |  \1\x20-                 |  CHANGE [![....] ( .... )] ( .... )  as  ![....] ( .... ) - ( .... ) |  0 Occurrence   |
    •----•----------------------------------------------------------------------•--------------------------•----------------------------------------------------------------------•-----------------•
    

    To end, some simple tasks remain to do :

    • You may need to remove or add a few blank lines for a better presentation

    • You may delete all trailing blank characters

    • In few cases, some numbered markdown lists are still shown as :

    1. bla blah
    
    1. bla blah
    
    1. bla blah
    

    Simply, renumber these lines, as usual :

    1. bla blah
    
    2. bla blah
    
    3. bla blah
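
    If there are many such lists, the renumbering could also be scripted. The following Python sketch is one possible way, with a made-up function name (renumber); it assumes a list ends at the first non-blank line that is not a numbered item, and it ignores nested lists.

```python
import re

def renumber(text: str) -> str:
    """Renumber runs of '1.' items as 1., 2., 3., ..."""
    out, counter = [], 0
    for line in text.splitlines():
        m = re.match(r'^(\s*)\d+\.(\s+.*)$', line)
        if m:
            counter += 1
            line = f"{m.group(1)}{counter}.{m.group(2)}"
        elif line.strip():
            counter = 0   # any other non-blank line ends the current list
        out.append(line)
    return "\n".join(out)

print(renumber("1. bla blah\n\n1. bla blah\n\n1. bla blah"))
```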
    

    So, to recapitulate the links’ management :

    • The syntax [Text](Absolute address) has been changed into [Text] ( Absolute address )

    • The syntax ![Text](Address to xxxxx.png) has been changed into ![Text] ( Absolute address to xxxxx.png )

    • The syntax [Text](Address to xxxxx.md) has been changed into [Text] ( Absolute Address to xxxxx.md )

    • The syntax [![Text](Address to xxxx.png)](Address to xxxxx.md) has been changed into ![Text] ( Absolute address to xxxxx.png ) - ( Absolute address to xxxxx.md )

    • The syntax [!INCLUDE [Text](Address to xxxxx.md)] has been changed into [INCLUDE] ( Absolute address to xxxxx.md )

    IMPORTANT :

    • I preferred to keep the alternate text of the links between square brackets for readability => [alternate text]

    • Note that all the links, related to a picture, have an ! symbol before the alternate text

    • I also kept all the absolute paths between parentheses, separated from the [alternate text] part by one space character

    • However, after tests, I realized that in order that a link, between parentheses, is fully functional, you must surround the address with space chars. So, the final syntax used is ( https://....../name.ext )

    • All links are functional, except the two links beginning with (/xxx, at lines 43 and 271 of the initial text


    May be, these links, below, about Markdown syntax, could be valuable :

    Best Regards,

    guy038

    P.S. : I also tested the raw pages, from these two links. This helped me to identify some special cases ;-))

