1
0
Fork 0
mirror of https://github.com/gohugoio/hugo.git synced 2025-05-03 20:10:37 +00:00
Commit graph

11 commits

Author SHA1 Message Date
Bjørn Erik Pedersen
0221ddb39e content adapter: Handle <!--more--> separator in content.value
Closes 
2024-06-01 12:04:05 +02:00
Bjørn Erik Pedersen
7285e74090
all: Rework page store, add a dynacache, improve partial rebuilds, and some general spring cleaning
There are some breaking changes in this commit, see .

Closes 
Closes 

This fixes a set of bugs (see issue list) and it is also paying some technical debt accumulated over the years. We now build with Staticcheck enabled in the CI build.

The performance should be about the same as before for regular sized Hugo sites, but it should perform and scale much better to larger data sets, as objects that uses lots of memory (e.g. rendered Markdown, big JSON files read into maps with transform.Unmarshal etc.) will now get automatically garbage collected if needed. Performance on partial rebuilds when running the server in fast render mode should be the same, but the change detection should be much more accurate.

A list of the notable new features:

* A new dependency tracker that covers (almost) all of Hugo's API and is used to do fine grained partial rebuilds when running the server.
* A new and simpler tree document store which allows fast lookups and prefix-walking in all dimensions (e.g. language) concurrently.
* You can now configure an upper memory limit allowing for much larger data sets and/or running on lower specced PCs.
We have lifted the "no resources in sub folders" restriction for branch bundles (e.g. sections).
Memory Limit
* Hugos will, by default, set aside a quarter of the total system memory, but you can set this via the OS environment variable HUGO_MEMORYLIMIT (in gigabytes). This is backed by a partitioned LRU cache used throughout Hugo. A cache that gets dynamically resized in low memory situations, allowing Go's Garbage Collector to free the memory.

New Dependency Tracker: Hugo has had a rule based coarse grained approach to server rebuilds that has worked mostly pretty well, but there have been some surprises (e.g. stale content). This is now revamped with a new dependency tracker that can quickly calculate the delta given a changed resource (e.g. a content file, template, JS file etc.). This handles transitive relations, e.g. $page -> js.Build -> JS import, or $page1.Content -> render hook -> site.GetPage -> $page2.Title, or $page1.Content -> shortcode -> partial -> site.RegularPages -> $page2.Content -> shortcode ..., and should also handle changes to aggregated values (e.g. site.Lastmod) effectively.

This covers all of Hugo's API with 2 known exceptions (a list that may not be fully exhaustive):

Changes to files loaded with template func os.ReadFile may not be handled correctly. We recommend loading resources with resources.Get
Changes to Hugo objects (e.g. Page) passed in the template context to lang.Translate may not be detected correctly. We recommend having simple i18n templates without too much data context passed in other than simple types such as strings and numbers.
Note that the cachebuster configuration (when A changes then rebuild B) works well with the above, but we recommend that you revise that configuration, as it in most situations should not be needed. One example where it is still needed is with TailwindCSS and using changes to hugo_stats.json to trigger new CSS rebuilds.

Document Store: Previously, a little simplified, we split the document store (where we store pages and resources) in a tree per language. This worked pretty well, but the structure made some operations harder than they needed to be. We have now restructured it into one Radix tree for all languages. Internally the language is considered to be a dimension of that tree, and the tree can be viewed in all dimensions concurrently. This makes some operations re. language simpler (e.g. finding translations is just a slice range), but the idea is that it should also be relatively inexpensive to add more dimensions if needed (e.g. role).

Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
2024-01-27 16:28:14 +01:00
Joe Mooring
272484f8bf
markdown: Pass emoji codes to yuin/goldmark-emoji
Removes emoji code conversion from the page and shortcode parsers. Emoji
codes in markdown are now passed to Goldmark, where the goldmark-emoji
extension converts them to decimal numeric character references.

This disables emoji rendering for the alternate content formats: html,
asciidoc, org, pandoc, and rst.

Fixes 
Fixes 
Closes 
2023-10-24 12:04:13 +02:00
Bjørn Erik Pedersen
168858331f Fix shortcode detection in RenderString
Fixes 
2023-01-26 11:41:07 +01:00
Bjørn Erik Pedersen
223bf28004 parser/pageparser: Don't store the byte slices
On its own this change doesn't do any magic, but this is part of a bigger picture about making Hugo leaner in the
memory usage department.
2022-07-09 16:03:11 +02:00
Bjørn Erik Pedersen
eada236f87
Introduce a tree map for all content
This commit introduces a new data structure to store pages and their resources.

This data structure is backed by radix trees.

This simplies tree operations, makes all pages a bundle,  and paves the way for .

It also solves a set of annoying issues (see list below).

Not a motivation behind this, but this commit also makes Hugo in general a little bit faster and more memory effective (see benchmarks). Especially for partial rebuilds on content edits, but also when taxonomies is in use.

```
name                                   old time/op    new time/op    delta
SiteNew/Bundle_with_image/Edit-16        1.32ms ± 8%    1.00ms ± 9%  -24.42%  (p=0.029 n=4+4)
SiteNew/Bundle_with_JSON_file/Edit-16    1.28ms ± 0%    0.94ms ± 0%  -26.26%  (p=0.029 n=4+4)
SiteNew/Tags_and_categories/Edit-16      33.9ms ± 2%    21.8ms ± 1%  -35.67%  (p=0.029 n=4+4)
SiteNew/Canonify_URLs/Edit-16            40.6ms ± 1%    37.7ms ± 3%   -7.20%  (p=0.029 n=4+4)
SiteNew/Deep_content_tree/Edit-16        56.7ms ± 0%    51.7ms ± 1%   -8.82%  (p=0.029 n=4+4)
SiteNew/Many_HTML_templates/Edit-16      19.9ms ± 2%    18.3ms ± 3%   -7.64%  (p=0.029 n=4+4)
SiteNew/Page_collections/Edit-16         37.9ms ± 4%    34.0ms ± 2%  -10.28%  (p=0.029 n=4+4)
SiteNew/Bundle_with_image-16             10.7ms ± 0%    10.6ms ± 0%   -1.15%  (p=0.029 n=4+4)
SiteNew/Bundle_with_JSON_file-16         10.8ms ± 0%    10.7ms ± 0%   -1.05%  (p=0.029 n=4+4)
SiteNew/Tags_and_categories-16           43.2ms ± 1%    39.6ms ± 1%   -8.35%  (p=0.029 n=4+4)
SiteNew/Canonify_URLs-16                 47.6ms ± 1%    47.3ms ± 0%     ~     (p=0.057 n=4+4)
SiteNew/Deep_content_tree-16             73.0ms ± 1%    74.2ms ± 1%     ~     (p=0.114 n=4+4)
SiteNew/Many_HTML_templates-16           37.9ms ± 0%    38.1ms ± 1%     ~     (p=0.114 n=4+4)
SiteNew/Page_collections-16              53.6ms ± 1%    54.7ms ± 1%   +2.09%  (p=0.029 n=4+4)

name                                   old alloc/op   new alloc/op   delta
SiteNew/Bundle_with_image/Edit-16         486kB ± 0%     430kB ± 0%  -11.47%  (p=0.029 n=4+4)
SiteNew/Bundle_with_JSON_file/Edit-16     265kB ± 0%     209kB ± 0%  -21.06%  (p=0.029 n=4+4)
SiteNew/Tags_and_categories/Edit-16      13.6MB ± 0%     8.8MB ± 0%  -34.93%  (p=0.029 n=4+4)
SiteNew/Canonify_URLs/Edit-16            66.5MB ± 0%    63.9MB ± 0%   -3.95%  (p=0.029 n=4+4)
SiteNew/Deep_content_tree/Edit-16        28.8MB ± 0%    25.8MB ± 0%  -10.55%  (p=0.029 n=4+4)
SiteNew/Many_HTML_templates/Edit-16      6.16MB ± 0%    5.56MB ± 0%   -9.86%  (p=0.029 n=4+4)
SiteNew/Page_collections/Edit-16         16.9MB ± 0%    16.0MB ± 0%   -5.19%  (p=0.029 n=4+4)
SiteNew/Bundle_with_image-16             2.28MB ± 0%    2.29MB ± 0%   +0.35%  (p=0.029 n=4+4)
SiteNew/Bundle_with_JSON_file-16         2.07MB ± 0%    2.07MB ± 0%     ~     (p=0.114 n=4+4)
SiteNew/Tags_and_categories-16           14.3MB ± 0%    13.2MB ± 0%   -7.30%  (p=0.029 n=4+4)
SiteNew/Canonify_URLs-16                 69.1MB ± 0%    69.0MB ± 0%     ~     (p=0.343 n=4+4)
SiteNew/Deep_content_tree-16             31.3MB ± 0%    31.8MB ± 0%   +1.49%  (p=0.029 n=4+4)
SiteNew/Many_HTML_templates-16           10.8MB ± 0%    10.9MB ± 0%   +1.11%  (p=0.029 n=4+4)
SiteNew/Page_collections-16              21.4MB ± 0%    21.6MB ± 0%   +1.15%  (p=0.029 n=4+4)

name                                   old allocs/op  new allocs/op  delta
SiteNew/Bundle_with_image/Edit-16         4.74k ± 0%     3.86k ± 0%  -18.57%  (p=0.029 n=4+4)
SiteNew/Bundle_with_JSON_file/Edit-16     4.73k ± 0%     3.85k ± 0%  -18.58%  (p=0.029 n=4+4)
SiteNew/Tags_and_categories/Edit-16        301k ± 0%      198k ± 0%  -34.14%  (p=0.029 n=4+4)
SiteNew/Canonify_URLs/Edit-16              389k ± 0%      373k ± 0%   -4.07%  (p=0.029 n=4+4)
SiteNew/Deep_content_tree/Edit-16          338k ± 0%      262k ± 0%  -22.63%  (p=0.029 n=4+4)
SiteNew/Many_HTML_templates/Edit-16        102k ± 0%       88k ± 0%  -13.81%  (p=0.029 n=4+4)
SiteNew/Page_collections/Edit-16           176k ± 0%      152k ± 0%  -13.32%  (p=0.029 n=4+4)
SiteNew/Bundle_with_image-16              26.8k ± 0%     26.8k ± 0%   +0.05%  (p=0.029 n=4+4)
SiteNew/Bundle_with_JSON_file-16          26.8k ± 0%     26.8k ± 0%   +0.05%  (p=0.029 n=4+4)
SiteNew/Tags_and_categories-16             273k ± 0%      245k ± 0%  -10.36%  (p=0.029 n=4+4)
SiteNew/Canonify_URLs-16                   396k ± 0%      398k ± 0%   +0.39%  (p=0.029 n=4+4)
SiteNew/Deep_content_tree-16               317k ± 0%      325k ± 0%   +2.53%  (p=0.029 n=4+4)
SiteNew/Many_HTML_templates-16             146k ± 0%      147k ± 0%   +0.98%  (p=0.029 n=4+4)
SiteNew/Page_collections-16                210k ± 0%      215k ± 0%   +2.44%  (p=0.029 n=4+4)
```

Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
2020-02-18 09:49:42 +01:00
Bjørn Erik Pedersen
597e418cb0
Make Page an interface
The main motivation of this commit is to add a `page.Page` interface to replace the very file-oriented `hugolib.Page` struct.
This is all a preparation step for issue  , "pages from other data sources".

But this also fixes a set of annoying limitations, especially related to custom output formats, and shortcodes.

Most notable changes:

* The inner content of shortcodes using the `{{%` as the outer-most delimiter will now be sent to the content renderer, e.g. Blackfriday.
  This means that any markdown will partake in the global ToC and footnote context etc.
* The Custom Output formats are now "fully virtualized". This removes many of the current limitations.
* The taxonomy list type now has a reference to the `Page` object.
  This improves the taxonomy template `.Title` situation and make common template constructs much simpler.

See 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
Fixes 
2019-03-23 18:51:22 +01:00
Bjørn Erik Pedersen
9cd54cab20 Move the emoji parsing to pageparser
This avoids double parsing the page content when `enableEmoji=true`.

This commit also adds some general improvements to the parser, making it in general much faster:

```bash
benchmark                     old ns/op     new ns/op     delta
BenchmarkShortcodeLexer-4     90258         101730        +12.71%
BenchmarkParse-4              148940        15037         -89.90%

benchmark                     old allocs     new allocs     delta
BenchmarkShortcodeLexer-4     456            700            +53.51%
BenchmarkParse-4              28             33             +17.86%

benchmark                     old bytes     new bytes     delta
BenchmarkShortcodeLexer-4     69875         81014         +15.94%
BenchmarkParse-4              8128          8304          +2.17%
```

Running some site benchmarks with Emoji support turned on:

```bash
benchmark                                                                                     old ns/op     new ns/op     delta
BenchmarkSiteBuilding/TOML,num_langs=3,num_pages=5000,tags_per_page=5,shortcodes,render-4     924556797     818115620     -11.51%

benchmark                                                                                     old allocs     new allocs     delta
BenchmarkSiteBuilding/TOML,num_langs=3,num_pages=5000,tags_per_page=5,shortcodes,render-4     4112613        4133787        +0.51%

benchmark                                                                                     old bytes     new bytes     delta
BenchmarkSiteBuilding/TOML,num_langs=3,num_pages=5000,tags_per_page=5,shortcodes,render-4     426982864     424363832     -0.61%
```

Fixes 
2018-12-20 20:08:01 +01:00
Bjørn Erik Pedersen
f2167de834
parser/pageparser: Add a benchmark 2018-12-19 20:08:22 +01:00
Bjørn Erik Pedersen
2fdc4a24d5
parser/pageparser: Add front matter etc. support
See 
2018-10-22 19:57:43 +02:00
Bjørn Erik Pedersen
f6863e1ef7
parser/pageparser: File renames and splitting
See 
2018-10-22 19:57:43 +02:00
Renamed from parser/pageparser/shortcodeparser_test.go (Browse further)