check-templates: Avoid duplicate tokenizing step.

Now we only tokenize the file once, and we pass
**validated** tokens to the pretty printer.

There are a few reasons for this:

    * It saves computation, since each file is
      now tokenized once instead of twice.

    * It allows our validator to add fields
      to the Token objects that help the pretty
      printer.
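
As a rough illustration of the single-pass flow described above, here is
a minimal sketch. It is not the actual tools code: the Token fields, the
toy tokenizer, and the `depth` annotation are all hypothetical, chosen
only to show a validator returning annotated tokens that a pretty
printer can consume without re-tokenizing.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    kind: str       # "open", "close", or "text" (simplified for this sketch)
    s: str          # raw source text of the token
    tag: str = ""   # tag name for open/close tokens
    depth: int = 0  # hypothetical field filled in by validate()


def tokenize(text: str) -> List[Token]:
    # Toy tokenizer: only handles attribute-free tags like <div> and </div>.
    tokens: List[Token] = []
    for part in re.split(r"(</?\w+>)", text):
        if not part.strip():
            continue
        m = re.fullmatch(r"<(/?)(\w+)>", part)
        if m is None:
            tokens.append(Token("text", part.strip()))
        elif m.group(1):
            tokens.append(Token("close", part, m.group(2)))
        else:
            tokens.append(Token("open", part, m.group(2)))
    return tokens


def validate(text: str) -> List[Token]:
    # Tokenize exactly once, check tag nesting, and annotate each token
    # with its nesting depth so later stages need not recompute it.
    tokens = tokenize(text)
    stack: List[Token] = []
    for token in tokens:
        if token.kind == "open":
            token.depth = len(stack)
            stack.append(token)
        elif token.kind == "close":
            if not stack or stack[-1].tag != token.tag:
                raise ValueError(f"mismatched closing tag {token.s}")
            stack.pop()
            token.depth = len(stack)
        else:
            token.depth = len(stack)
    if stack:
        raise ValueError(f"unclosed tag {stack[-1].s}")
    return tokens


def pretty_print(tokens: List[Token]) -> str:
    # Consumes already-validated tokens; no second tokenizing pass.
    return "\n".join("    " * token.depth + token.s for token in tokens)


if __name__ == "__main__":
    print(pretty_print(validate("<div><p>hi</p></div>")))
```

Because the printer only ever sees tokens that passed validation, it can
assume balanced tags and rely on the extra fields being set.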

I also removed/tweaked a lot of legacy tests for
pretty_print.py that were exercising bizarrely
formatted HTML that we now simply ban during the
validation phase.
Steve Howell
2021-12-02 12:19:19 +00:00
committed by Tim Abbott
parent 0decfa8da0
commit c0d72ba236
4 changed files with 68 additions and 236 deletions

@@ -316,7 +316,7 @@ def tag_flavor(token: Token) -> Optional[str]:
raise AssertionError(f"tools programmer neglected to handle {kind} tokens")
-def validate(fn: Optional[str] = None, text: Optional[str] = None) -> None:
+def validate(fn: Optional[str] = None, text: Optional[str] = None) -> List[Token]:
assert fn or text
if fn is None:
@@ -445,6 +445,8 @@ def validate(fn: Optional[str] = None, text: Optional[str] = None) -> None:
ensure_matching_indentation(fn, tokens, lines)
+return tokens
def ensure_matching_indentation(fn: str, tokens: List[Token], lines: List[str]) -> None:
for token in tokens: