---
title: "Mirroring or Archiving an Entire Website"
date: 2019-08-02T22:42:16-04:00
draft: false
tags: [ "Archive" ]
medium_enabled: true
---
I have several old WordPress sites lying around that I would like to archive but no longer maintain. Since I don't intend to create any more content on these sites, I can use a tool like `wget` to scrape an existing site and produce a somewhat *read-only* copy of it. I say read-only not because the copy can't be edited, but because it's no longer in the website's original source format.

There have been several approaches to this problem:

- https://stackoverflow.com/questions/538865/how-do-you-archive-an-entire-website-for-offline-viewing#538878
- [https://letswp.io/download-an-entire-website-wget-windows/](https://web.archive.org/web/20190915143432/https://letswp.io/download-an-entire-website-wget-windows/)

After consulting these resources, I arrived at the following command:

```bash
wget --mirror \
--convert-links \
--adjust-extension \
--page-requisites \
--no-verbose \
https://url/of/web/site
```
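
Since a backslash-continued command leaves no room for inline comments, here is a rough sketch of the same flags kept in a bash array so that each one can be annotated. The helper names `mirror_command` and `mirror_site` are made up for illustration, and the flag descriptions are my short paraphrase of the `wget` manual rather than its exact wording.

```bash
#!/usr/bin/env bash
# Sketch: the same wget flags, stored in an array so each can carry a comment.

wget_args=(
  --mirror            # recursive download with timestamping, unlimited depth
  --convert-links     # rewrite links so the copy can be browsed locally
  --adjust-extension  # add suffixes such as .html where the URL lacks one
  --page-requisites   # also fetch images, stylesheets, and other page assets
  --no-verbose        # keep output terse: errors and basic information only
)

# Hypothetical helper: print the full command for a URL without running it.
mirror_command() {
  echo "wget ${wget_args[*]} $1"
}

# Hypothetical helper: actually run the mirror against a URL.
mirror_site() {
  wget "${wget_args[@]}" "$1"
}

# Dry run with the placeholder URL from the command above.
mirror_command "https://url/of/web/site"
```

Calling `mirror_site` with a real URL in place of the placeholder would start the download. Printing the command first is a cheap way to double-check the flags before crawling anything.
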

There were other solutions in that Stack Overflow post, but something about the simplicity of `wget` appealed to me.

[Example site I archived with this.](https://sentenceworthy.com)