The change in lock logic for `partialCached` in 0927cf739f was naive as it didn't consider cached partials calling other cached partials.
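To illustrate the nesting problem, here is a minimal sketch, not the code in this changeset. The names `partialCache`, `entry` and `getOrRender` are hypothetical. It assumes a design where the map lock is held only for the lookup and each key gets its own `sync.Once`, so a cached partial can render another cached partial without blocking on a lock its caller already holds.

```go
package main

import (
	"fmt"
	"sync"
)

// entry holds the result for one cache key; once ensures a single render.
type entry struct {
	once  sync.Once
	value string
}

// partialCache is a hypothetical stand-in, not Hugo's actual type.
type partialCache struct {
	mu      sync.Mutex
	entries map[string]*entry
}

func newPartialCache() *partialCache {
	return &partialCache{entries: make(map[string]*entry)}
}

// getOrRender returns the cached value for key, rendering it at most once.
// The map lock is released before render runs, so render may itself call
// getOrRender for a different key.
func (c *partialCache) getOrRender(key string, render func() string) string {
	c.mu.Lock()
	e, ok := c.entries[key]
	if !ok {
		e = &entry{}
		c.entries[key] = e
	}
	c.mu.Unlock()

	e.once.Do(func() { e.value = render() })
	return e.value
}

func main() {
	c := newPartialCache()
	renders := 0

	footer := func() string {
		return c.getOrRender("footer", func() string {
			renders++
			return "footer"
		})
	}
	page := func() string {
		// A cached partial that calls another cached partial.
		return c.getOrRender("page", func() string {
			return "page[" + footer() + "]"
		})
	}

	fmt.Println(page(), page(), renders)
}
```

A partial that recursively calls itself with the same key would still block in this sketch, since `sync.Once` is not reentrant.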
This changeset may look on the large side for this particular issue, but it pulls in part of a working branch, introducing `context.Context` in the template execution.
Note that the context is only partially implemented in this PR. Upcoming use cases include, for example, access to the top-level "dot" (e.g. `Page`) all the way down into partials and shortcodes.
The earlier benchmarks rerun against master:
```bash
name              old time/op    new time/op    delta
IncludeCached-10    13.6ms ± 2%    13.8ms ± 1%    ~     (p=0.343 n=4+4)

name              old alloc/op   new alloc/op   delta
IncludeCached-10    5.30MB ± 0%    5.35MB ± 0%  +0.96%  (p=0.029 n=4+4)

name              old allocs/op  new allocs/op  delta
IncludeCached-10     74.7k ± 0%     75.3k ± 0%  +0.77%  (p=0.029 n=4+4)
```
Fixes #9519
This commit also:
* Revises the change detection for templates used by content files in server mode.
* Adds a `Page.RenderString` method.
Fixes #6545
Fixes #4663
Closes #6043
This is a big commit, but it deletes lots of code and simplifies a lot.
* Resolving the template funcs at execution time means we don't have to create template clones per site
* Having a custom map resolver means that we can remove the AST lower case transformation for the special lower case Params map
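The map resolver idea in the second bullet can be sketched as follows. This is only an illustration under assumptions: `params` and `resolve` are hypothetical names, and the sketch assumes keys are stored lower-cased so the lookup key can be normalized at execution time instead of rewriting the template AST.

```go
package main

import (
	"fmt"
	"strings"
)

// params is a hypothetical map type whose keys are stored lower-cased.
type params map[string]any

// resolve looks up a dotted path such as "Author.Name", lower-casing each
// segment at lookup time rather than transforming the template beforehand.
func (p params) resolve(path string) (any, bool) {
	var cur any = p
	for _, part := range strings.Split(path, ".") {
		m, ok := cur.(params)
		if !ok {
			return nil, false
		}
		cur, ok = m[strings.ToLower(part)]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

func main() {
	p := params{"author": params{"name": "bep"}}
	v, ok := p.resolve("Author.Name")
	fmt.Println(v, ok)
}
```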
Not only is the above easier to reason about, it's also faster, especially if you have more than one language, as in the benchmark below:
```
name                          old time/op    new time/op    delta
SiteNew/Deep_content_tree-16    53.7ms ± 0%    48.1ms ± 2%  -10.38%  (p=0.029 n=4+4)

name                          old alloc/op   new alloc/op   delta
SiteNew/Deep_content_tree-16    41.0MB ± 0%    36.8MB ± 0%  -10.26%  (p=0.029 n=4+4)

name                          old allocs/op  new allocs/op  delta
SiteNew/Deep_content_tree-16      481k ± 0%      410k ± 0%  -14.66%  (p=0.029 n=4+4)
```
This should be even better if you also have lots of templates.
Closes #6594