Move SummaryLength into the ContentSpec struct and refactor the
relevant summary functions to be methods of ContentSpec. The new
summaryLength struct member is configurable by the summaryLength config
value, and the default remains 70. Also updates hugolib/page to use the
refactored methods.
Resolves #3734
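A minimal sketch of the shape this refactor describes, assuming a simplified `ContentSpec`. The constructor `NewContentSpec`, the method body, and the fallback rule for non-positive values are illustrative assumptions, not Hugo's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// ContentSpec is a hypothetical stand-in for the struct named in the commit.
// Only the summaryLength member is taken from the commit message.
type ContentSpec struct {
	summaryLength int
}

// NewContentSpec is an assumed constructor. It falls back to the default of
// 70 when the configured value is not positive (the fallback rule is a guess).
func NewContentSpec(configured int) *ContentSpec {
	if configured <= 0 {
		configured = 70
	}
	return &ContentSpec{summaryLength: configured}
}

// TruncateWords returns at most summaryLength words of s and reports
// whether anything was cut off. It is a method, so it reads the limit
// from the spec instead of from a package-level variable.
func (c *ContentSpec) TruncateWords(s string) (string, bool) {
	words := strings.Fields(s)
	if len(words) <= c.summaryLength {
		return strings.Join(words, " "), false
	}
	return strings.Join(words[:c.summaryLength], " "), true
}

func main() {
	c := NewContentSpec(3)
	summary, truncated := c.TruncateWords("one two three four five")
	fmt.Println(summary, truncated)
}
```

Keeping the limit on the struct means two sites (or two tests) can use different summary lengths in the same process.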
This avoids having to execute these expensive operations for sites not using these values.
This commit sums up a set of word-counting and auto-summary related performance improvements.
The effect depends on which features your site uses, but a benchmark from 4 Hugo sites in the wild shows promise:
```
benchmark           old ns/op       new ns/op       delta
BenchmarkHugo-4     21293005843     20032857342     -5.92%

benchmark           old allocs      new allocs      delta
BenchmarkHugo-4     65290922        65186032        -0.16%

benchmark           old bytes       new bytes       delta
BenchmarkHugo-4     9771213416      9681866464      -0.91%
```
Closes #2378
It is obviously more efficient when we do not care about the actual words.
```
BenchmarkTotalWords-4        100000    18795 ns/op       0 B/op    0 allocs/op
BenchmarkTotalWordsOld-4      30000    46751 ns/op    6400 B/op    1 allocs/op
```
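As a rough illustration of why counting can avoid allocations, the sketch below walks the string once and counts word starts instead of building a slice of words. The function name `totalWords` and the whitespace-only word boundary are assumptions made for this example. They are not necessarily what the commit implements:

```go
package main

import (
	"fmt"
	"unicode"
)

// totalWords counts whitespace-separated words in s without allocating.
// A word starts at every non-space rune that follows a space (or the
// start of the string).
func totalWords(s string) int {
	n := 0
	inWord := false
	for _, r := range s {
		if unicode.IsSpace(r) {
			inWord = false
		} else if !inWord {
			inWord = true
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(totalWords("  Hugo is  a static\nsite generator "))
}
```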
For people using autogenerated summaries, this is one of the hot spots in the memory department.
We don't need to split all the content into words to do proper summary truncation.
This is obviously more efficient:
```
BenchmarkTestTruncateWordsToWholeSentence-4       300000     4720 ns/op       0 B/op    0 allocs/op
BenchmarkTestTruncateWordsToWholeSentenceOld-4    100000    17699 ns/op    3072 B/op    3 allocs/op
```
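One way to truncate at a sentence boundary without first splitting the content into words is to scan it once, count word starts, and cut at the first sentence-ending punctuation mark seen once the word limit has been reached. The sketch below assumes that approach. The name `truncateToWholeSentence`, the set of terminators (`.`, `?`, `!`), and the return shape are illustrative rather than taken from Hugo:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// truncateToWholeSentence returns s cut after the first sentence terminator
// found at or beyond the max-th word, and reports whether any non-space
// content was dropped. The result is a substring of s, so nothing is copied.
func truncateToWholeSentence(s string, max int) (string, bool) {
	count := 0
	inWord := false
	for i, r := range s {
		if unicode.IsSpace(r) {
			inWord = false
			continue
		}
		if !inWord {
			inWord = true
			count++
		}
		if count >= max && strings.ContainsRune(".?!", r) {
			end := i + 1 // the terminators are single-byte ASCII
			return s[:end], strings.TrimSpace(s[end:]) != ""
		}
	}
	return s, false
}

func main() {
	summary, truncated := truncateToWholeSentence("One two. Three four five. Six.", 3)
	fmt.Println(summary, truncated)
}
```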
Started to increase coverage in helpers package, now at 74.9% of statements.
In the process, a few minor changes have also been applied to content.go:
* content.go has undergone a formatting refactor regarding comments
* Unused function TruncateWords has been removed
* RenderingContext's "mmark" has been changed to use MmarkRender
* content_test.go added to cover content.go's functionality
* Add global `hasCJKLanguage` flag; if true, turn on auto-detection of CJK languages
* Add `isCJKLanguage` front matter to explicitly specify whether the content is in a CJK language
* For .Summary: If isCJKLanguage is true, use the runes as basis for truncation, else keep as today.
* For WordCount: If isCJKLanguage is true, use the runes as basis for calculation, else keep as today.
* Unexport RuneCount
Fixes #1377
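The list above can be sketched roughly as follows. The helper names `hasCJK` and `wordCount`, the choice of Unicode scripts, and the rule of counting every non-space rune are assumptions for illustration. The commit's actual detection and counting logic may differ:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// hasCJK reports whether s contains at least one Han, Hiragana, Katakana
// or Hangul rune. This is one plausible form of auto-detection.
func hasCJK(s string) bool {
	for _, r := range s {
		if unicode.In(r, unicode.Han, unicode.Hiragana, unicode.Katakana, unicode.Hangul) {
			return true
		}
	}
	return false
}

// wordCount uses runes as the basis when isCJK is true, since CJK text is
// not separated by spaces, and whitespace-separated words otherwise.
func wordCount(s string, isCJK bool) int {
	if !isCJK {
		return len(strings.Fields(s))
	}
	n := 0
	for _, r := range s {
		if !unicode.IsSpace(r) {
			n++
		}
	}
	return n
}

func main() {
	content := "你好 世界"
	fmt.Println(wordCount(content, hasCJK(content)))
}
```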
First step to use initialisms that golint suggests,
for example:
Line 116: func GetHtmlRenderer should be GetHTMLRenderer
as seen on http://goreportcard.com/report/spf13/hugo
Thanks to @bep for the idea!
Note that command-line flags (cobra and pflag)
as well as struct fields like .BaseUrl and .Url
that are used in Go HTML templates need more work
to maintain backward-compatibility, and thus
are NOT yet dealt with in this commit.
First step in fixing #959.
go test -test.run=NONE -bench=".*" -test.benchmem=true ./helpers
Old vs new impl (strings.Replace vs strings.Replacer):
benchmark              old ns/op     new ns/op     delta
BenchmarkStripHTML     10210         6572          -35.63%

benchmark              old allocs    new allocs    delta
BenchmarkStripHTML     6             5             -16.67%

benchmark              old bytes     new bytes     delta
BenchmarkStripHTML     1456          848           -41.76%
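To illustrate the difference being benchmarked, the sketch below contrasts chained `strings.Replace` calls, each of which scans the whole input and may allocate a new string, with a single `strings.Replacer` built once and applied in one pass. The tag list and function names are assumptions for this example and are not necessarily those used in `StripHTML`:

```go
package main

import (
	"fmt"
	"strings"
)

// tagReplacer is built once and reused. A Replacer is safe for concurrent use.
var tagReplacer = strings.NewReplacer(
	"<br>", "\n",
	"<br />", "\n",
	"</p>", "\n",
)

// replaceSequentially mirrors the "old" style: one full pass per pattern.
func replaceSequentially(s string) string {
	s = strings.Replace(s, "<br>", "\n", -1)
	s = strings.Replace(s, "<br />", "\n", -1)
	s = strings.Replace(s, "</p>", "\n", -1)
	return s
}

// replaceWithReplacer mirrors the "new" style: all patterns in one pass.
func replaceWithReplacer(s string) string {
	return tagReplacer.Replace(s)
}

func main() {
	in := "a<br>b</p>c<br />d"
	fmt.Println(replaceSequentially(in) == replaceWithReplacer(in))
}
```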