This ommmit contains some security hardening measures for the Hugo build runtime.
There are some rarely used features in Hugo that would be good to have disabled by default. One example would be the "external helpers".
For `asciidoctor` and some others we use Go's `os/exec` package to start a new process.
These are a predefined set of binary names, all loaded from `PATH` and with a predefined set of arguments. Still, if you don't use `asciidoctor` in your project, you might as well have it turned off.
You can configure your own in the new `security` configuration section, but the defaults are configured to create a minimal amount of site breakage. And if that do happen, you will get clear instructions in the loa about what to do.
The default configuration is listed below. Note that almost all of these options are regular expression _whitelists_ (a string or a slice); the value `none` will block all.
```toml
[security]
enableInlineShortcodes = false
[security.exec]
allow = ['^dart-sass-embedded$', '^go$', '^npx$', '^postcss$']
osEnv = ['(?i)^(PATH|PATHEXT|APPDATA|TMP|TEMP|TERM)$']
[security.funcs]
getenv = ['^HUGO_']
[security.http]
methods = ['(?i)GET|POST']
urls = ['.*']
```
This commit started out investigating a `concurrent map read write` issue, ending by replacing the map with a struct.
This is easier to reason about, and it's more effective:
```
name old time/op new time/op delta
SiteNew/Regular_Deep_content_tree-16 71.5ms ± 3% 69.4ms ± 5% ~ (p=0.200 n=4+4)
name old alloc/op new alloc/op delta
SiteNew/Regular_Deep_content_tree-16 29.7MB ± 0% 27.9MB ± 0% -5.82% (p=0.029 n=4+4)
name old allocs/op new allocs/op delta
SiteNew/Regular_Deep_content_tree-16 313k ± 0% 303k ± 0% -3.35% (p=0.029 n=4+4)
```
See #8749
Re-add the additional environment checks to determine if its Netlify. Seems that Cloudflare also sets `NETLIFY=true`.
This makes it look, basically, like a variant of the conditional we had before we started fixing this, but I have checked this logic on Netlify now and it should work.
Fixes#8714
The main motivation behind this is simplicity and correctnes, but the new small config library is also faster:
```
BenchmarkDefaultConfigProvider/Viper-16 252418 4546 ns/op 2720 B/op 30 allocs/op
BenchmarkDefaultConfigProvider/Custom-16 450756 2651 ns/op 1008 B/op 6 allocs/op
```
Fixes#8633Fixes#8618Fixes#8630
Updates #8591Closes#6680Closes#5192
This change is mostly motivated to get a more stable CI build (we're building the Hugo site there, with Instagram and Twitter shortcodes sometimes failing).
Fixes#7866
This commit solves the relative path problem with asciidoctor tooling. An include will resolve relatively, so you can refer easily to files in the same folder.
Also `asciidoctor-diagram` and PlantUML rendering works now, because the created temporary files will be placed in the correct folder.
This patch covers just the Ruby version of asciidoctor. The old AsciiDoc CLI EOLs in Jan 2020, so this variant is removed from code.
The configuration is completely rewritten and now available in `config.toml` under the key `[markup.asciidocext]`:
```toml
[markup.asciidocext]
extensions = ["asciidoctor-html5s", "asciidoctor-diagram"]
workingFolderCurrent = true
trace = true
[markup.asciidocext.attributes]
my-base-url = "https://example.com/"
my-attribute-name = "my value"
```
- backends, safe-modes, and extensions are now whitelisted to the popular (ruby) extensions and valid values.
- the default for extensions is to not enable any, because they're all external dependencies so the build would break if the user didn't install them beforehand.
- the default backend is html5 because html5s is an external gem dependency.
- the default safe-mode is safe, explanations of the modes: https://asciidoctor.org/man/asciidoctor/
- the config is namespaced under asciidocext_config and the parser looks at asciidocext to allow a future native Go asciidoc.
- `uglyUrls=true` option and `--source` flag are supported
- `--destination` flag is required
Follow the updated documentation under `docs/content/en/content-management/formats.md`.
This patch would be a breaking change, because you need to correct all your absolute include pathes to relative paths, so using relative paths must be configured explicitly by setting `workingFolderCurrent = true`.
In the internal Radix we stored the directory based nodes without a traling slash, e.g. `/blog`.
The original motivation was probably to make it easy to do prefix searching: Give me all ancestors.
This, however have lead to some ambigouty with overlapping directory names.
This particular problem was, however, not possible to work around in an easy way, so from now we store these as `/blog/`.
Fixes#7301
You can turn off this behaviour:
```toml
[markup]
[markup.goldmark]
[markup.goldmark.parser]
autoHeadingIDAsciiOnly = true
```
Note that the `anchorize` now adapts its behaviour depending on the default Markdown handler.
Fixes#6616
This commit also
* revises the change detection for templates used by content files in server mode.
* Adds a Page.RenderString method
Fixes#6545Fixes#4663Closes#6043
This commit adds the fast and CommonMark compliant Goldmark as the new default markdown handler in Hugo.
If you want to continue using BlackFriday as the default for md/markdown extensions, you can use this configuration:
```toml
[markup]
defaultMarkdownHandler="blackfriday"
```
Fixes#5963Fixes#1778Fixes#6355
This commmit prepares for the addition of Goldmark as the new Markdown renderer in Hugo.
This introduces a new `markup` package with some common interfaces and each implementation in its own package.
See #5963
The org mode renderer supports including other files [1]. We don't want to
allow reading of arbitrary files (go-org defaults to ioutil.ReadFile [2]) but want
to make use of the FileSystem abstractions hugo provides. For starters we will
allow reading from the content directory only
[1]: e.g. `#+INCLUDE: ./foo.py src python` includes `foo.py` as a python source
block.
This is in line with how it behaved before, but it was lifted a little for the project mount for Hugo Modules,
but that could create hard-to-detect loops.
This commit implements Hugo Modules.
This is a broad subject, but some keywords include:
* A new `module` configuration section where you can import almost anything. You can configure both your own file mounts nd the file mounts of the modules you import. This is the new recommended way of configuring what you earlier put in `configDir`, `staticDir` etc. And it also allows you to mount folders in non-Hugo-projects, e.g. the `SCSS` folder in the Bootstrap GitHub project.
* A module consists of a set of mounts to the standard 7 component types in Hugo: `static`, `content`, `layouts`, `data`, `assets`, `i18n`, and `archetypes`. Yes, Theme Components can now include content, which should be very useful, especially in bigger multilingual projects.
* Modules not in your local file cache will be downloaded automatically and even "hot replaced" while the server is running.
* Hugo Modules supports and encourages semver versioned modules, and uses the minimal version selection algorithm to resolve versions.
* A new set of CLI commands are provided to manage all of this: `hugo mod init`, `hugo mod get`, `hugo mod graph`, `hugo mod tidy`, and `hugo mod vendor`.
All of the above is backed by Go Modules.
Fixes#5973Fixes#5996Fixes#6010Fixes#5911Fixes#5940Fixes#6074Fixes#6082Fixes#6092
Sadly, goorgeous has not been updated in over a year and still has a lot of
open issues (e.g. no support for nested lists).
go-org fixes most of those issues and supports a larger subset of Org mode
syntax.
Add the ability to have a `summary` page variable that overrides
the auto-generated summary. Logic for obtaining summary becomes:
* if summary divider is present in content, use the text above it
* if summary variables is present in page metadata, use that
* auto-generate summary from first _x_ words of the content
Fixes#5800
The main motivation of this commit is to add a `page.Page` interface to replace the very file-oriented `hugolib.Page` struct.
This is all a preparation step for issue #5074, "pages from other data sources".
But this also fixes a set of annoying limitations, especially related to custom output formats, and shortcodes.
Most notable changes:
* The inner content of shortcodes using the `{{%` as the outer-most delimiter will now be sent to the content renderer, e.g. Blackfriday.
This means that any markdown will partake in the global ToC and footnote context etc.
* The Custom Output formats are now "fully virtualized". This removes many of the current limitations.
* The taxonomy list type now has a reference to the `Page` object.
This improves the taxonomy template `.Title` situation and make common template constructs much simpler.
See #5074Fixes#5763Fixes#5758Fixes#5090Fixes#5204Fixes#4695Fixes#5607Fixes#5707Fixes#5719Fixes#3113Fixes#5706Fixes#5767Fixes#5723Fixes#5769Fixes#5770Fixes#5771Fixes#5759Fixes#5776Fixes#5777Fixes#5778
This avoids double parsing the page content when `enableEmoji=true`.
This commit also adds some general improvements to the parser, making it in general much faster:
```bash
benchmark old ns/op new ns/op delta
BenchmarkShortcodeLexer-4 90258 101730 +12.71%
BenchmarkParse-4 148940 15037 -89.90%
benchmark old allocs new allocs delta
BenchmarkShortcodeLexer-4 456 700 +53.51%
BenchmarkParse-4 28 33 +17.86%
benchmark old bytes new bytes delta
BenchmarkShortcodeLexer-4 69875 81014 +15.94%
BenchmarkParse-4 8128 8304 +2.17%
```
Running some site benchmarks with Emoji support turned on:
```bash
benchmark old ns/op new ns/op delta
BenchmarkSiteBuilding/TOML,num_langs=3,num_pages=5000,tags_per_page=5,shortcodes,render-4 924556797 818115620 -11.51%
benchmark old allocs new allocs delta
BenchmarkSiteBuilding/TOML,num_langs=3,num_pages=5000,tags_per_page=5,shortcodes,render-4 4112613 4133787 +0.51%
benchmark old bytes new bytes delta
BenchmarkSiteBuilding/TOML,num_langs=3,num_pages=5000,tags_per_page=5,shortcodes,render-4 426982864 424363832 -0.61%
```
Fixes#5534
This commit adds support for a configuration directory (default `config`). The different pieces in this puzzle are:
* A new `--environment` (or `-e`) flag. This can also be set with the `HUGO_ENVIRONMENT` OS environment variable. The value for `environment` defaults to `production` when running `hugo` and `development` when running `hugo server`. You can set it to any value you want (e.g. `hugo server -e "Sensible Environment"`), but as it is used to load configuration from the file system, the letter case may be important. You can get this value in your templates with `{{ hugo.Environment }}`.
* A new `--configDir` flag (defaults to `config` below your project). This can also be set with `HUGO_CONFIGDIR` OS environment variable.
If the `configDir` exists, the configuration files will be read and merged on top of each other from left to right; the right-most value will win on duplicates.
Given the example tree below:
If `environment` is `production`, the left-most `config.toml` would be the one directly below the project (this can now be omitted if you want), and then `_default/config.toml` and finally `production/config.toml`. And since these will be merged, you can just provide the environment specific configuration setting in you production config, e.g. `enableGitInfo = true`. The order within the directories will be lexical (`config.toml` and then `params.toml`).
```bash
config
├── _default
│ ├── config.toml
│ ├── languages.toml
│ ├── menus
│ │ ├── menus.en.toml
│ │ └── menus.zh.toml
│ └── params.toml
├── development
│ └── params.toml
└── production
├── config.toml
└── params.toml
```
Some configuration maps support the language code in the filename (e.g. `menus.en.toml`): `menus` (`menu` also works) and `params`.
Also note that the only folders with "a meaning" in the above listing is the top level directories below `config`. The `menus` sub folder is just added for better organization.
We use `TOML` in the example above, but Hugo also supports `JSON` and `YAML` as configuration formats. These can be mixed.
Fixes#5422
This means that the current `.Site` and ´.Hugo` is available as a globals, so you can do `site.IsServer`, `hugo.Version` etc.
Fixes#5470Fixes#5467Fixes#5503
This commits reworks how file caching is performed in Hugo. Now there is only one way, and it can be configured.
This is the default configuration:
```toml
[caches]
[caches.getjson]
dir = ":cacheDir"
maxAge = -1
[caches.getcsv]
dir = ":cacheDir"
maxAge = -1
[caches.images]
dir = ":resourceDir/_gen"
maxAge = -1
[caches.assets]
dir = ":resourceDir/_gen"
maxAge = -1
```
You can override any of these cache setting in your own `config.toml`.
The placeholders explained:
`:cacheDir`: This is the value of the `cacheDir` config option if set (can also be set via OS env variable `HUGO_CACHEDIR`). It will fall back to `/opt/build/cache/hugo_cache/` on Netlify, or a `hugo_cache` directory below the OS temp dir for the others.
`:resourceDir`: This is the value of the `resourceDir` config option.
`maxAge` is the time in seconds before a cache entry will be evicted, -1 means forever and 0 effectively turns that particular cache off.
This means that if you run your builds on Netlify, all caches configured with `:cacheDir` will be saved and restored on the next build. For other CI vendors, please read their documentation. For an CircleCI example, see 6c3960a8f4/.circleci/config.ymlFixes#5404
The main item in this commit is showing of errors with a file context when running `hugo server`.
This can be turned off: `hugo server --disableBrowserError` (can also be set in `config.toml`).
But to get there, the error handling in Hugo needed a revision. There are some items left TODO for commits soon to follow, most notable errors in content and config files.
Fixes#5284Fixes#5290
See #5325
See #5324
Initially, rst2html was called via the python interpreter which would
fail if the script was wrapped in a launcher as on NixOS.
Ideally, on *nix, binaries should be invoked directly to ensure that
shebangs work properly as is being done now.
Handle the case of windows as it doesn't do shebangs.
In short:
* Avoid double tolower in MakeSegment
* Use MakePathSanitized for taxonomies in pageToPermalinkTitle; this matches what MakeSegment does.
* Move the "double hyphen and space" logic into UnicodeSanitize
The last bullet may be slightly breaking for some that now does not get the "--" in some URLs, but we need to reduce the amount of URL logic.
See #4926
helpers/general.go:448:1: comment on exported function DiffStrings should be of the form "DiffStrings ..."
helpers/hugo.go:42:6: exported type HugoVersionString should have comment or be unexported
helpers/hugo.go:48:1: exported method HugoVersion.Version should have comment or be unexported
helpers/hugo.go:56:1: comment on exported method HugoVersionString.Compare should be of the form "Compare ..."
helpers/hugo.go:62:1: comment on exported method HugoVersionString.Eq should be of the form "Eq ..."
helpers/path.go:548:1: comment on exported function OpenFilesForWriting should be of the form "OpenFilesForWriting ..."
helpers/processing_stats.go:24:6: exported type ProcessingStats should have comment or be unexported
helpers/processing_stats.go:55:1: exported function NewProcessingStats should have comment or be unexported
helpers/processing_stats.go:59:1: exported method ProcessingStats.Incr should have comment or be unexported
helpers/processing_stats.go:63:1: exported method ProcessingStats.Add should have comment or be unexported
helpers/processing_stats.go:67:1: exported method ProcessingStats.Table should have comment or be unexported
helpers/processing_stats.go:83:1: exported function ProcessingStatsTable should have comment or be unexported
In Hugo 0.46 we made the output of what you get from resources.Get and similar static, i.e. language agnostic. This makes total sense, as it is wasteful and time-consuming to do SASS/SCSS/PostCSS processing for lots of languages when the output is lots of duplicates with different filenames.
But since we now output the result once only, this had a negative side effect for multihost setups: We publish the resource once only to the root folder (i.e. not to the language "domain folder").
This commit removes the language code from the processed image keys. This creates less duplication in the file cache, but it means that you should do a `hugo --gc` to clean up stale files.
Fixes#5058
Hugo Pipes added minification support for resources fetched via ´resources.Get` and similar.
This also adds support for minification of the final output for supported output formats: HTML, XML, SVG, CSS, JavaScript, JSON.
To enable, run Hugo with the `--minify` flag:
```bash
hugo --minify
```
This commit is also a major spring cleaning of the `transform` package to allow the new minification step fit into that processing chain.
Fixes#1251
Before this commit, you would have to use page bundles to do image processing etc. in Hugo.
This commit adds
* A new `/assets` top-level project or theme dir (configurable via `assetDir`)
* A new template func, `resources.Get` which can be used to "get a resource" that can be further processed.
This means that you can now do this in your templates (or shortcodes):
```bash
{{ $sunset := (resources.Get "images/sunset.jpg").Fill "300x200" }}
```
This also adds a new `extended` build tag that enables powerful SCSS/SASS support with source maps. To compile this from source, you will also need a C compiler installed:
```
HUGO_BUILD_TAGS=extended mage install
```
Note that you can use output of the SCSS processing later in a non-SCSSS-enabled Hugo.
The `SCSS` processor is a _Resource transformation step_ and it can be chained with the many others in a pipeline:
```bash
{{ $css := resources.Get "styles.scss" | resources.ToCSS | resources.PostCSS | resources.Minify | resources.Fingerprint }}
<link rel="stylesheet" href="{{ $styles.RelPermalink }}" integrity="{{ $styles.Data.Digest }}" media="screen">
```
The transformation funcs above have aliases, so it can be shortened to:
```bash
{{ $css := resources.Get "styles.scss" | toCSS | postCSS | minify | fingerprint }}
<link rel="stylesheet" href="{{ $styles.RelPermalink }}" integrity="{{ $styles.Data.Digest }}" media="screen">
```
A quick tip would be to avoid the fingerprinting part, and possibly also the not-superfast `postCSS` when you're doing development, as it allows Hugo to be smarter about the rebuilding.
Documentation will follow, but have a look at the demo repo in https://github.com/bep/hugo-sass-test
New functions to create `Resource` objects:
* `resources.Get` (see above)
* `resources.FromString`: Create a Resource from a string.
New `Resource` transformation funcs:
* `resources.ToCSS`: Compile `SCSS` or `SASS` into `CSS`.
* `resources.PostCSS`: Process your CSS with PostCSS. Config file support (project or theme or passed as an option).
* `resources.Minify`: Currently supports `css`, `js`, `json`, `html`, `svg`, `xml`.
* `resources.Fingerprint`: Creates a fingerprinted version of the given Resource with Subresource Integrity..
* `resources.Concat`: Concatenates a list of Resource objects. Think of this as a poor man's bundler.
* `resources.ExecuteAsTemplate`: Parses and executes the given Resource and data context (e.g. .Site) as a Go template.
Fixes#4381Fixes#4903Fixes#4858
This commit adds support for theme composition and inheritance in Hugo.
With this, it helps thinking about a theme as a set of ordered components:
```toml
theme = ["my-shortcodes", "base-theme", "hyde"]
```
The theme definition example above in `config.toml` creates a theme with the 3 components with presedence from left to right.
So, Hugo will, for any given file, data entry etc., look first in the project, and then in `my-shortcode`, `base-theme` and lastly `hyde`.
Hugo uses two different algorithms to merge the filesystems, depending on the file type:
* For `i18n` and `data` files, Hugo merges deeply using the translation id and data key inside the files.
* For `static`, `layouts` (templates) and `archetypes` files, these are merged on file level. So the left-most file will be chosen.
The name used in the `theme` definition above must match a folder in `/your-site/themes`, e.g. `/your-site/themes/my-shortcodes`. There are plans to improve on this and get a URL scheme so this can be resolved automatically.
Also note that a component that is part of a theme can have its own configuration file, e.g. `config.toml`. There are currently some restrictions to what a theme component can configure:
* `params` (global and per language)
* `menu` (global and per language)
* `outputformats` and `mediatypes`
The same rules apply here: The left-most param/menu etc. with the same ID will win. There are some hidden and experimental namespace support in the above, which we will work to improve in the future, but theme authors are encouraged to create their own namespaces to avoid naming conflicts.
A final note: Themes/components can also have a `theme` definition in their `config.toml` and similar, which is the "inheritance" part of this commit's title. This is currently not supported by the Hugo theme site. We will have to wait for some "auto dependency" feature to be implemented for that to happen, but this can be a powerful feature if you want to create your own theme-variant based on others.
Fixes#4460Fixes#4450
Add a configuration option "noreferrerLinks". When set to "true" the "HTML_NOREFERRER_LINKS" flag is being passed to Blackfriday. Thereby all *absolute* links will get a "noreferrer" value for their "rel" attribute.
See #4722
Add a configuration option "nofollowLinks". When set to "true" the "HTML_NOFOLLOW_LINKS" flag is being passed to Blackfriday. Thereby all *absolute* links will get a "nofollow" value for the "rel" attribute.
Fixes#4722
This bug was introduced in Hugo 0.40. It is when you use the `<!--more-->` summary marker.
Note that this affects the word stats only. The related `PlainWords`, `Plain`, `Content` all return correct values.
Fixes#4675Fixes#4682
This resolves some surprising behaviour when reading other pages' content from shortcodes. Before this commit, that behaviour was undefined. Note that this has never been an issue from regular templates.
It will still not be possible to get **the current shortcode's page's rendered content**. That would have impressed Einstein.
The new and well defined rules are:
* `.Page.Content` from a shortcode will be empty. The related `.Page.Truncated` `.Page.Summary`, `.Page.WordCount`, `.Page.ReadingTime`, `.Page.Plain` and `.Page.PlainWords` will also have empty values.
* For _other pages_ (retrieved via `.Page.Site.GetPage`, `.Site.Pages` etc.) the `.Content` is there to use as you please as long as you don't have infinite content recursion in your shortcode/content setup. See below.
* `.Page.TableOfContents` is good to go (but does not support shortcodes in headlines; this is unchanged)
If you get into a situation of infinite recursion, the `.Content` will be empty. Run `hugo -v` for more information.
Fixes#4632Fixes#4653Fixes#4655