Saturday, April 18, 2015

UltraEdit macro to select HTML/XML tag

In a previous post from 2010, UltraEdit macro to select HTML/XML tag, I detailed two UltraEdit macros to select HTML/XML tags backwards and forwards. Those macros had a couple of problems, such as not being able to distinguish between PRE and P when you start selecting P tags, so this version fixes that.

Here are the macros. The first is used to select the previous tag: I have it mapped to control+shift+,.

InsertMode
ColumnModeOff
HexOff
UltraEditReOn
Clipboard 2
IfSel
Find RegExp Up Select "</++^c^{>^}^{[ ^p^r^n^t]+[~>]++>^}"
Else
Find Up "<"
Find RegExp "[A-Za-z]"
SelectWord
Copy
Find Up "<"
Key LEFT ARROW
Find RegExp Select "</++^c^{>^}^{[ ^p^r^n^t]+[~>]++>^}"
EndIf
Clipboard 0

The second is used to select the next tag: I have it mapped to control+shift+..

InsertMode
ColumnModeOff
HexOff
UltraEditReOn
Clipboard 2
IfSel
Find RegExp Select "</++^c^{>^}^{[ ^p^r^n^t]+[~>]++>^}"
Else
Find "<"
Find RegExp "[A-Za-z]"
SelectWord
Copy
Find Up "<"
Key LEFT ARROW
Find RegExp Select "</++^c^{>^}^{[ ^p^r^n^t]+[~>]++>^}"
EndIf
Clipboard 0

A few notes about the macros.

  • Select previous tag.
    1. Use it by leaving the cursor within an opening tag (<p>), a closing tag (</p>) or a unary tag (<br>), or within the text content of a tag. Do not select any text.
    2. Press the shortcut (control+shift+,).
    3. The macro will begin running the Else part of the IfSel condition (because no text was selected).
      1. Find Up "<"
        • Looks for the first left angle bracket before the cursor.
      2. Find RegExp "[A-Za-z]"
        • Find the next letter after the left angle bracket - which will be the start of the tag name.
      3. SelectWord
        • Select the tag name.
      4. Copy
        • Copy it - to the second clipboard, which was selected earlier in the macro by the command Clipboard 2.
      5. Find Up "<"
        • Select the first left angle bracket before the cursor (again).
      6. Key LEFT ARROW
        • Make sure cursor is to the left of that angle bracket so the next command (a Find) will have that character in scope.
      7. Find RegExp Select "</++^c^{>^}^{[ ^p^r^n^t]+[~>]++>^}"
        1. Select the entire open/close/unary tag.
        2. Find - because we had previously moved to the left of the opening left angle bracket of the tag, the search will take this tag into account.
        3. RegExp - use regular expressions. An earlier macro command (UltraEditReOn) specified that UltraEdit regular expressions are turned on (as opposed to Perl or Unix ones).
        4. Select - whatever we find with the next expression will be selected in UltraEdit.
        5. A breakdown of the expression: </++^c^{>^}^{[ ^p^r^n^t]+[~>]++>^}
          1. </++
            • Find left angle bracket and zero or more forward slashes: matches < or </.
          2. ^c
            • Find text in clipboard 2 (which we selected previously).
          3. ^{>^}^{[ ^p^r^n^t]+[~>]++>^}
            • This is an OR expression. ^{A^}^{B^} says find A or B. So this expression says to find either one of:
              • >
                • The right angle bracket that closes a tag. This covers the simple cases, e.g. <p>.
              • [ ^p^r^n^t]+[~>]++>
                • [ ^p^r^n^t]+ one or more of: space or newline (DOS, Mac or Unix) or tab.
                • [~>]++ zero or more of any character other than the right angle bracket.
                • > the right angle bracket.
                • This covers tags with attributes, e.g. <p style=""> which may or may not be spread across multiple lines.
    4. Run the macro again with the shortcut (control+shift+,).
    5. The macro will now run the IfSel part of the condition because there is text selected from the previous run.
    6. It will run the exact same Find as was described above except for one difference.
      1. Find RegExp Up Select "</++^c^{>^}^{[ ^p^r^n^t]+[~>]++>^}"
      2. The Up part means that we will look for the next complete tag to the left of what we already have selected from the previous run.
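For readers who do not know UltraEdit's regular expression syntax, the breakdown above can be approximated in Python. This is only a sketch: the function name tag_pattern and the sample string are my own, the tag name is passed as a parameter instead of being read from clipboard 2, and Python's re module is not guaranteed to behave identically to UltraEdit's engine (case sensitivity, for example, may differ).

```python
import re

def tag_pattern(tag_name: str) -> re.Pattern:
    """Approximate </++^c^{>^}^{[ ^p^r^n^t]+[~>]++>^} with ^c = tag_name."""
    return re.compile(
        r"</*"                         # '<' then zero or more '/'   (</++)
        + re.escape(tag_name)          # the tag name from clipboard 2 (^c)
        + r"(?:>|[ \r\n\t]+[^>]*>)"    # '>' alone, or whitespace, attributes, '>'
    )

# Hypothetical sample, loosely based on the test HTML in this post.
sample = (
    '<div style="color: red;"\n'
    '      id="divWithId">Nested <span>div</span>.\n'
    '   <pre>\n'
    '      monospaced\n'
    '   </pre>\n'
    '</div>\n'
    '<p>text</p>'
)

div_matches = [m.group(0) for m in tag_pattern("div").finditer(sample)]
p_matches = [m.group(0) for m in tag_pattern("p").finditer(sample)]

print(div_matches)  # the multi-line opening DIV tag and its closing tag
print(p_matches)    # only the P tags; <pre> and </pre> are skipped
```

The second alternative in the group is what lets the opening DIV tag match across the line break, and the requirement that the name be followed by > or whitespace is what keeps <pre> out of the P results.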

Here is a sample of HTML that I used to test this on.

<html>
   <head>
      <title>Some Title</title>
   </head>
   <body>
      <div>
         <div style="color: red;"
               id="divWithId">Nested <span>div</span>.
            <pre>
               monospaced
            </pre>
         </div>
      </div>
   </body>
</html>

For the first run, I place the cursor as indicated below by | (either within the DIV open tag or within the DIV content).

<html>
   <head>
      <title>Some Title</title>
   </head>
   <body>
      <div>
         <div style="color: red;"
               id="divWit|hId">Nested| <span>div</span>.
            <pre>
               monospaced
            </pre>
         </div>
      </div>
   </body>
</html>

Run the select previous tag macro (I have it mapped to control+shift+,) and text will be selected as indicated below.

<html>
   <head>
      <title>Some Title</title>
   </head>
   <body>
      <div>
         <div style="color: red;"
            id="divWithId">Nested <span>div</span>.
            <pre>
               monospaced
            </pre>
         </div>
      </div>
   </body>
</html>

Now run the select next tag macro (I have it mapped to control+shift+.) and text will be selected as indicated below.

<html>
   <head>
      <title>Some Title</title>
   </head>
   <body>
      <div>
         <div style="color: red;"
            id="divWithId">Nested <span>div</span>.
            <pre>
               monospaced
            </pre>
         </div>
      </div>
   </body>
</html>

Run the select previous tag macro again and text will be selected as indicated below.

<html>
   <head>
      <title>Some Title</title>
   </head>
   <body>
      <div>
         <div style="color: red;"
            id="divWithId">Nested <span>div</span>.
            <pre>
               monospaced
            </pre>
         </div>
      </div>
   </body>
</html>

Run the select next tag macro again and text will be selected as indicated below.

<html>
   <head>
      <title>Some Title</title>
   </head>
   <body>
      <div>
         <div style="color: red;"
            id="divWithId">Nested <span>div</span>.
            <pre>
               monospaced
            </pre>
         </div>
      </div>
   </body>
</html>

If you use either macro again at this point, nothing will happen because there are no more DIV elements in the document not already selected.

Importantly, these macros work correctly on similarly named tags such as zip and zipfileset (which I have used in XML for Ant build files). If I am selecting zip tags, it skips nested zipfileset elements.
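The reason the zip search skips zipfileset can be shown with a small Python comparison. This is an illustration only: the Ant fragment is made up, and the two patterns are Python approximations, not the UltraEdit expressions themselves.

```python
import re

# Hypothetical Ant build fragment with a zipfileset nested inside a zip.
ant_xml = (
    '<zip destfile="out.zip">\n'
    '   <zipfileset dir="lib" prefix="lib"/>\n'
    '</zip>'
)

# Matching on the name prefix alone also hits the start of <zipfileset.
naive = re.findall(r"</*zip", ant_xml)

# Requiring '>' or whitespace straight after the name rejects <zipfileset.
strict = re.findall(r"</*zip(?:>|[ \r\n\t]+[^>]*>)", ant_xml)

print(len(naive))  # three hits, one of them inside <zipfileset
print(strict)      # just the opening and closing zip tags
```

The same check on what follows the tag name is what separates P from PRE.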

Two final notes.

  1. Something this macro cannot do is to select an entire element that contains nested elements of the same tag. For example, consider the HTML below.
    <div id="outer">
       <div id="inner">
          Inner DIV.
       </div>
    </div>
    
    Macros in UltraEdit cannot be used to select the entire outer DIV because you can't store state in a macro, which you would need to do in order to count nested elements to make sure you select the entire outer one. My workaround for this situation is to just make it easier to keep selecting next/previous DIV tag so that you can achieve the same effect with a bit of repetition.
  2. UltraEdit Find commands in macros can use Perl regular expressions, which are very powerful too. One thing they can do much more easily is to treat newline characters as part of the wildcard. In a Perl regex, (?s) tells the regular expression to include newline characters when matching a . wildcard. You can also use backreferences in Find and Replace expressions. However, backreferences don't persist between macro calls. So, these two macros store the tag name in clipboard 2 so that each time you call one of the macros afterwards, it "remembers" what tag you were searching for by looking at clipboard 2.

    I wrote this up on the UltraEdit forum here: macro to select HTML tag v2.
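To make the first note concrete, here is a minimal Python sketch of the state a macro would need: a depth counter that goes up on each opening tag of the chosen name and down on each closing tag. The function select_element and the sample HTML are hypothetical, and this runs outside UltraEdit, so it only illustrates the counting idea. It assumes the start index points at an opening tag that has a matching close (a unary tag such as <br> would raise the error).

```python
import re

def select_element(text: str, start: int, tag_name: str) -> str:
    """Return the whole element whose opening tag begins at index start,
    counting nested tags of the same name to find the matching close."""
    pattern = re.compile(
        r"<(/?)" + re.escape(tag_name) + r"(?:>|[ \r\n\t]+[^>]*>)"
    )
    depth = 0
    for m in pattern.finditer(text, start):
        if m.group(1):                        # closing tag, e.g. </div>
            depth -= 1
        elif not m.group(0).endswith("/>"):   # opening tag that is not self-closed
            depth += 1
        if depth == 0:                        # back at the level we started on
            return text[start:m.end()]
    raise ValueError("no matching close tag found")

html = (
    '<div id="outer">\n'
    '   <div id="inner">\n'
    '      Inner DIV.\n'
    '   </div>\n'
    '</div>\n'
    '<div id="after"></div>'
)

outer = select_element(html, 0, "div")
inner = select_element(html, html.index('<div id="inner"'), "div")
print(outer)
print(inner)
```

The depth variable is exactly the piece of state that the macros cannot hold between Find commands, which is why they fall back on repeated next/previous selection.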