I would really like to parse the HTML we produce from the library to
ensure that we don't generate malformed HTML. This is unfortunately
hard because we want both fairly strict parsing and the ability to
parse HTML5 fragments. For now, we just do a basic sanity check.
We may also want to switch to Google's diff-match-patch, as it can
clean up the resulting diffs.
(imported from commit 3772f92135cfd7423c335335f861f2c11462a8db)