How to Strip out HTML Tags from a String in JavaScript

Here are two ways to strip out HTML tags from a string in JavaScript.

  1. Using the replace() method with a regular expression
  2. Using DOMParser API and the textContent property

Method 1: Using the replace() method with a regular expression

To strip out HTML tags from a string in JavaScript, you can use the replace() method with a regular expression that matches anything that looks like a tag and replaces it with an empty string. This approach works on the string alone and does not need the DOM.

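<body>
  <script>
    // Remove anything that looks like an HTML tag, including an
    // unterminated tag at the end of the string.
    function stripHTML(input) {
      return input.replace(/<\/?[^>]+(>|$)/g, "");
    }

    // Usage:
    var htmlString = "<div>Hello <span>World</span>!</div>";
    console.log(stripHTML(htmlString));
  </script>
</body>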

Output

Hello World!

Method 2: Using DOMParser API and the textContent property

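To strip out HTML tags from a string in JavaScript, you can also use a combination of the DOMParser API and the textContent property. DOMParser parses the string into a separate document, and textContent returns the text of that document without any markup.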

<body>
  <script>
    // Parse the string into a separate document, then read its text.
    function stripHTML(html) {
      var doc = new DOMParser().parseFromString(html, 'text/html');
      return doc.body.textContent || "";
    }

    // Usage:
    var htmlString = "<div>Hello <span>World</span>!</div>";
    console.log(stripHTML(htmlString));
  </script>
</body>

Output

Hello World!

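However, it's important to note that using the DOM to parse arbitrary HTML can be a security risk, especially if the HTML content comes from an untrusted source. The text you get back is also not automatically safe to insert into a page as HTML: character entities such as &lt; are decoded during parsing, so the result can still contain < and > characters.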

Always sanitize and validate data to prevent potential security issues. If you only deal with trusted content, these methods should work fine.
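One way to act on this advice is to treat the stripped result as plain text and escape it before placing it into markup. The following is a minimal sketch, not a complete sanitizer; the escapeHTML name is hypothetical and not part of any built-in API.

```javascript
// Hypothetical helper: escape the characters that are significant in HTML
// so the text is displayed literally instead of being parsed as markup.
function escapeHTML(text) {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Usage:
var untrusted = '<img src=x onerror="alert(1)">';
console.log(escapeHTML(untrusted));
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

When the text only needs to be displayed, assigning it to an element's textContent avoids the need for escaping altogether.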

Regular expressions provide a concise way to remove HTML tags, but they do not actually parse HTML, so they can mishandle edge cases such as a > character inside an attribute value, a bare < in ordinary text, or malformed markup.
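To make the edge cases concrete, here is a short sketch that runs the same regular expression from Method 1 against a few inputs chosen for illustration. The sample strings are assumptions, not taken from any real page.

```javascript
// Same regular expression as in Method 1.
function stripHTML(input) {
  return input.replace(/<\/?[^>]+(>|$)/g, "");
}

// A ">" inside an attribute value ends the match too early.
console.log(stripHTML('<a title="a > b">link</a>'));
//  b">link

// A bare "<" in ordinary text is treated as the start of a tag.
console.log(stripHTML("1 < 2 and 3 > 2"));
// 1  2

// Only the tags are removed; the contents of <style> remain.
console.log(stripHTML("<style>p{color:red}</style>Hi"));
// p{color:red}Hi

// Character entities are left as they are, not decoded.
console.log(stripHTML("&lt;b&gt;bold&lt;/b&gt;"));
// &lt;b&gt;bold&lt;/b&gt;
```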

If you know the structure and source of your HTML content and are sure it is consistent, the regular-expression method can be very effective. Otherwise, for more robust parsing, the DOM-based method is generally the safer choice.
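If the same code has to run both in a browser and in an environment without a DOM, one possible arrangement is to prefer DOMParser when it exists and fall back to the regular expression otherwise. This is only a sketch under that assumption; stripHTMLAuto is a hypothetical name, and the two branches can give different results on the edge cases described above.

```javascript
// Hypothetical wrapper: use the DOM-based method when DOMParser is
// available, otherwise fall back to the regular-expression method.
function stripHTMLAuto(html) {
  if (typeof DOMParser !== "undefined") {
    var doc = new DOMParser().parseFromString(html, "text/html");
    return doc.body.textContent || "";
  }
  // No DOM available: remove tag-like substrings instead.
  return html.replace(/<\/?[^>]+(>|$)/g, "");
}

// Usage:
var htmlString = "<div>Hello <span>World</span>!</div>";
console.log(stripHTMLAuto(htmlString));
// Hello World!
```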
