How do I get rid of HTML formatting in Word?

Aaron has a document that contains a number of HTML tags, and he would like to remove the tags but maintain the formatting they represent. For instance, if he has a phrase that appears as <i>a phrase</i>, he would like to remove the tags (<i> and </i>) but have "a phrase" appear in italics. Aaron is pretty sure this can be done with Find and Replace, but he's not quite sure how to go about it.

You are right, Aaron: you can use Find and Replace to accomplish the removal. Follow these steps:

  1. Press Ctrl+H. Word displays the Replace tab of the Find and Replace dialog box.
  2. Click the More button, if it is available. (See Figure 1.)

    Figure 1. The Replace tab of the Find and Replace dialog box.

  3. Make sure the Use Wildcards check box is selected.
  4. In the Find What box, enter the following: \<i\>([!<]@)\</i\>
  5. In the Replace With box, enter the following: \1
  6. With the insertion point still in the Replace With box, press Ctrl+I once. The text "Italic" should appear just below the Replace With box.
  7. Click Replace All.

The code that you enter in the Find What box may look a little daunting. All you are telling Word to do is to find the beginning HTML tag (<i>), followed by one or more characters that are not a < character, and ending with the closing HTML tag (</i>). The backslashes are needed because < and > have special meanings in a wildcard search. The very short entry in the Replace With box simply says to replace whatever is found with the contents of the first element of the Find What box that is surrounded by parentheses, which just happens to be the text between the two HTML tags.

If you want to eliminate the need to remember (or look up) the contents of the Find What box all the time, you can place the Find and Replace operation into a macro:

Sub ConvertItalicTags()
    ' Clear any leftover search formatting, then request italic for the replacement.
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.Italic = True
    With Selection.Find
        ' Match <i>, then one or more characters other than "<", then </i>.
        .Text = "\<i\>([!<]@)\</i\>"
        ' Keep only the text captured by the parentheses.
        .Replacement.Text = "\1"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub

Assign the macro to a shortcut key, and you can remove the italic HTML tags anytime you need. You could also expand the macro to make similar changes relative to other HTML tags you may need to remove. You may even want to make sure that alternate tags are dealt with. For instance, HTML uses both <i> and <em> tags to display information in italic, which means you should account for the possibility of both sets of tags in your macro.
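One way such an expanded macro might look is sketched below. It is only an illustration: the names ConvertHtmlTags and ConvertTag are made up for this example, and it assumes simple, lowercase tags with no attributes and no nesting (a wildcard search is case sensitive, so <I> or <EM> would need their own calls).

```vba
Sub ConvertHtmlTags()
    ' Italic tags: <i> and <em>
    ConvertTag "i", True, False
    ConvertTag "em", True, False
    ' Bold tags: <b> and <strong>
    ConvertTag "b", False, True
    ConvertTag "strong", False, True
End Sub

' Hypothetical helper: replaces <tagName>text</tagName> with just the text,
' formatted as italic and/or bold.
Private Sub ConvertTag(tagName As String, makeItalic As Boolean, makeBold As Boolean)
    Dim rng As Range
    Set rng = ActiveDocument.Content   ' main text of the active document

    With rng.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        If makeItalic Then .Replacement.Font.Italic = True
        If makeBold Then .Replacement.Font.Bold = True
        ' Same pattern as in the steps above, built for the requested tag.
        .Text = "\<" & tagName & "\>([!<]@)\</" & tagName & "\>"
        .Replacement.Text = "\1"
        .Forward = True
        .Wrap = wdFindStop
        .Format = True
        .MatchWildcards = True
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Running ConvertHtmlTags works through each tag in turn. Because the sketch searches ActiveDocument.Content rather than the selection, it should leave the insertion point where it was.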

Of course, there is an entirely different approach you could use to get rid of the HTML tags and still retain the formatting associated with those tags. That would be to save the HTML-encoded text into a plain text file with an .html extension, open it in your browser, copy the text within the browser window, and paste it directly into a Word document. If all goes well, you would have the desired formatted text in your finished document.
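A variation on this idea, offered only as a sketch, is to let Word itself open the saved HTML file instead of going through a browser, and then copy the result into your document. The file path below is hypothetical, and the macro name is made up for this example; it assumes the HTML-encoded text has already been saved to that file.

```vba
Sub PasteRenderedHtmlFile()
    Dim htmlPath As String
    Dim target As Document
    Dim src As Document

    ' Hypothetical location of the saved HTML file; change as needed.
    htmlPath = "C:\Temp\snippet.html"
    If Dir(htmlPath) = "" Then
        MsgBox "File not found: " & htmlPath
        Exit Sub
    End If

    ' Remember the document that should receive the formatted text.
    Set target = ActiveDocument

    ' Open the file as a web page so the tags are interpreted as formatting.
    Set src = Documents.Open(FileName:=htmlPath, ConfirmConversions:=False, _
        ReadOnly:=True, Format:=wdOpenFormatWebPages)
    src.Content.Copy

    ' Paste at the insertion point of the original document.
    target.Activate
    Selection.Paste

    src.Close SaveChanges:=wdDoNotSaveChanges
End Sub
```

As with the browser route, how closely the pasted text matches what you want depends on how Word interprets the HTML, so check the result before relying on it.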

WordTips is your source for cost-effective Microsoft Word training. (Microsoft Word is the most popular word processing software in the world.) This tip (10308) applies to Microsoft Word 2007, 2010, 2013, 2016, 2019, and Word in Microsoft 365.

Author Bio

With more than 50 non-fiction books and numerous magazine articles to his credit, Allen Wyatt is an internationally recognized author. He is president of Sharon Parq Associates, a computer and publishing services company.

If you use Microsoft Word, you have almost certainly struggled with formatting issues, especially when working with documents created by others and edited by many people.

Pro Tip: If you have been struggling with formatting for more than a few minutes, it is usually best to clear out the old formatting and then properly format the resulting clean document. Use Ctrl+A to select all text in a document and then click the Clear All Formatting button to remove the formatting from the text (also known as character-level formatting).

You can also select just a few paragraphs and use the same method to remove formatting from part of a document.
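If you clear formatting often, the same action can be put into a macro. The following is a minimal sketch; the macro name is made up for this example, and it assumes you want to clear the main text of the active document.

```vba
Sub ClearAllFormattingInDocument()
    ' Select the main text of the document, much like pressing Ctrl+A.
    ActiveDocument.Content.Select
    ' Remove text and paragraph formatting from the selection.
    Selection.ClearFormatting
End Sub
```

To clear only part of a document, select the paragraphs first and run just the Selection.ClearFormatting line.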

To quickly remove styles, expand Quick Styles to display the list of available styles. Among the options is Clear All, which clears all formatting and styles from the selected text.

Power users can learn these "Big 3" format removal keyboard shortcuts:

Ctrl+Space removes manually applied character-level formatting from the selected text (fonts, italics/bold, font size, etc.) but leaves paragraph formatting (indents, line spacing, etc.) intact.

Ctrl+Q leaves fonts and other character formatting intact but reverts manually applied paragraph-level formatting to what the paragraph's current style defines.

Ctrl+Shift+N applies the Normal style to the selected text, returning it to Normal formatting at both the character and paragraph level.
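For reference, the three shortcuts have rough counterparts in VBA. The macros below are a sketch with made-up names, each acting on the current selection; the pairing with the shortcuts is an approximation, so results may differ slightly from pressing the keys.

```vba
' Comparable to Ctrl+Space: remove manually applied character formatting.
Sub ResetCharacterFormatting()
    Selection.Font.Reset
End Sub

' Comparable to Ctrl+Q: remove manually applied paragraph formatting.
Sub ResetParagraphFormatting()
    Selection.ParagraphFormat.Reset
End Sub

' Comparable to Ctrl+Shift+N: apply the Normal style.
Sub ApplyNormalStyle()
    Selection.Style = ActiveDocument.Styles(wdStyleNormal)
End Sub
```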

How do I return to normal style in Word?

Change the default layout: Open the template, or a document based on the template, whose default settings you want to change. On the Format menu, click Document, and then click the Layout tab. Make any changes that you want, and then click Default.

How do you change HTML format in Word?

What to know:
File > Save As. Select a location. Name the file, and select .html as the type. Press Save.
Editors like Dreamweaver can convert a Word document to HTML.
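The Save As route can also be scripted. The sketch below uses a hypothetical file path and a made-up macro name; it saves the active document as filtered HTML, which omits most of the Word-specific markup.

```vba
Sub SaveAsFilteredHtml()
    Dim htmlPath As String

    ' Hypothetical destination; change the folder and name as needed.
    htmlPath = "C:\Temp\MyDocument.html"

    ' wdFormatFilteredHTML writes leaner HTML than wdFormatHTML.
    ActiveDocument.SaveAs2 FileName:=htmlPath, FileFormat:=wdFormatFilteredHTML
End Sub
```

After the macro runs, the open document is the HTML file rather than the original, so save the original first. SaveAs2 is available in Word 2010 and later; earlier versions use SaveAs with the same two arguments.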