Parsing the DOM

The parse() function from the html-react-parser package converts an HTML string into React elements, so you can render HTML as if it were JSX. This is particularly useful when content arrives as HTML from an external source (such as a CMS) and you want to include it in your React components. During parsing you can also filter or modify the resulting elements.

🚀 TL;DR Show me the code. Look at the 25-dom-parsing branch. This site is deployed here.

Setup

Before we integrate this into our demo site, let’s run some simple tests to see how it works. Start by importing two functions from the html-react-parser package.

import parse, { domToReact } from 'html-react-parser';

Here’s what those functions do:

  • parse() — converts HTML strings into React elements; and
  • domToReact() — converts parsed HTML or DOM nodes into React elements.

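These two functions do similar yet subtly different things: parse() starts from an HTML string, whereas domToReact() starts from nodes that have already been parsed. As we will see below, domToReact() is often called from inside the replace option of parse() to convert the children of a node.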

Parsing

Create a simple HTML string.

const html = '<p class="remove">Remove this!</p> <strong id="hello" >Hello World!</strong>';

Now run parse() on the string.

const parsed = parse(html);

The stringified result looks like this:

JSON.stringify(parsed)
[
  {
    "type": "p",
    "key": "0",
    "ref": null,
    "props": {
      "className": "remove",
      "children": "Remove this!"
    },
    "_owner": null
  },
  " ",
  {
    "type": "strong",
    "key": "2",
    "ref": null,
    "props": {
      "id": "hello",
      "children": "Hello World!"
    },
    "_owner": null
  }
]

Each HTML tag has been converted into a React element, with its attributes and content carried over as props (note that class has become className). The whitespace between the two tags in the original string is preserved as a separate text entry.
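Because the result is an ordinary array, the whitespace entry can be spotted (or dropped) with plain array methods. Here is a minimal sketch that uses abbreviated plain-object stand-ins for the output above rather than real React elements; the names parsedLike and withoutWhitespace are only for illustration.

```javascript
// Abbreviated stand-ins for the entries in the stringified output above.
const parsedLike = [
  { type: 'p', key: '0', props: { className: 'remove' } },
  ' ',
  { type: 'strong', key: '2', props: { id: 'hello' } },
];

// Keep everything except strings that contain only whitespace.
const withoutWhitespace = parsedLike.filter(
  (node) => !(typeof node === 'string' && node.trim() === '')
);

console.log(withoutWhitespace.map((node) => node.type)); // [ 'p', 'strong' ]
```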

Parsing with Options

We can also transform the content during parsing by passing a second argument to parse(). Define an options object whose replace property is a function that is called for each node in the parsed HTML.

const options = {
  trim: true, // Trim whitespace.
  replace: (domNode, index) => {
    if (domNode.attribs && domNode.attribs.class === "remove") {
      return <></>;
    }
    if (domNode.type === "tag" && domNode.name === "strong") {
      return <span className="strong">{domToReact(domNode.children, options)}</span>;
    }
    // Returning null leaves all other nodes unchanged.
    return null;
  }
};
const parsed = parse(html, options);

This will do two things:

  1. replace any element whose class attribute is remove with an empty fragment, effectively removing it (this matches the <p> tag in the HTML); and
  2. convert <strong> tags into <span class="strong">. The domToReact() function converts the children of the original <strong> tag.
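One caveat: the comparison domNode.attribs.class === "remove" only matches when the class attribute is exactly remove, so an element with class="note remove" would slip through. Below is a minimal sketch of a more tolerant check. It assumes nodes have the type, name and attribs shape used in the replace function above; hasClass is a hypothetical helper, not part of html-react-parser.

```javascript
// Hypothetical helper (not part of html-react-parser): true when the node's
// class attribute contains `name` as one of its whitespace-separated tokens.
function hasClass(domNode, name) {
  const classAttr = (domNode.attribs && domNode.attribs.class) || '';
  return classAttr.split(/\s+/).includes(name);
}

// Hand-built stand-ins for the nodes that replace() receives.
const exactNode = { type: 'tag', name: 'p', attribs: { class: 'remove' } };
const multiNode = { type: 'tag', name: 'p', attribs: { class: 'note remove' } };
const textNode = { type: 'text', data: 'Hello World!' };

console.log(hasClass(exactNode, 'remove')); // true
console.log(hasClass(multiNode, 'remove')); // true
console.log(hasClass(textNode, 'remove')); // false
```

Inside replace, the first condition could then be written as hasClass(domNode, "remove").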

Here’s the stringified result of parse(html, options):

[
  {
    "key": "0",
    "ref": null,
    "props": {},
    "_owner": null
  },
  {
    "type": "span",
    "key": "2",
    "ref": null,
    "props": {
      "className": "strong",
      "children": "Hello World!"
    },
    "_owner": null
  }
]

Things to notice:

  • the whitespace has been removed because of the trim: true option;
  • the tag with class remove has been replaced by an empty fragment, which renders nothing; and
  • the <strong> tag has been replaced by a <span> with class strong.
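The first entry in the output has no "type" field. The likely reason is that the type of a fragment element is a symbol rather than a string, and JSON.stringify() omits properties whose values are symbols. Here is a small sketch using a plain-object stand-in for the fragment element; fragmentLike is a made-up name, and the use of Symbol.for('react.fragment') is an assumption about how React identifies fragments.

```javascript
// Stand-in for a fragment element: its type is a symbol, not a tag name.
const fragmentLike = {
  type: Symbol.for('react.fragment'),
  key: '0',
  ref: null,
  props: {},
};

// Symbol-valued properties are skipped during serialization.
const json = JSON.stringify(fragmentLike);

console.log(json); // {"key":"0","ref":null,"props":{}}
```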

For more details on the replace option (and the other available options), take a look at the package documentation.

Demo

In the demo site, the src/templates/article.js file applies the parse() function, replacing <p> tags with <div class="text">. To confirm, use Developer Tools to inspect the source of the What is Gatsby? page.
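The replacement itself might look something like the sketch below. This is not the actual contents of src/templates/article.js. The makeParagraphReplace factory is hypothetical, and it takes createElement and domToReact as arguments only so that the sketch can run with stubs and without React installed. In the demo site those would be React.createElement and the domToReact() function from html-react-parser.

```javascript
// Hypothetical factory: builds parse() options that turn <p> into
// <div class="text">, keeping the paragraph's children.
function makeParagraphReplace(createElement, domToReact) {
  const demoOptions = {
    replace(domNode) {
      if (domNode.type === 'tag' && domNode.name === 'p') {
        const children = domToReact(domNode.children, demoOptions);
        return createElement('div', { className: 'text' }, children);
      }
      // Any other node is left for parse() to convert as usual.
      return undefined;
    },
  };
  return demoOptions;
}

// Stubs standing in for React.createElement and domToReact.
const stubCreateElement = (type, props, children) => ({
  type,
  props: { ...props, children },
});
const stubDomToReact = (nodes) => nodes.map((node) => node.data).join('');

const demoOptions = makeParagraphReplace(stubCreateElement, stubDomToReact);
const paragraph = {
  type: 'tag',
  name: 'p',
  attribs: {},
  children: [{ type: 'text', data: 'Sample text' }],
};
const replaced = demoOptions.replace(paragraph);

console.log(replaced.type, replaced.props.className, replaced.props.children);
// div text Sample text
```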