New post

2026-01-09 07:40:25 +00:00 · 2022-05-21 00:14:44 -04:00 · 2022-05-21 00:14:44 -04:00 · 12b9738f0b
commit 12b9738f0b
parent a8ecad7bba
1 changed files with 106 additions and 0 deletions
--- a/content/blog/archiving-toots.md
+++ b/content/blog/archiving-toots.md
@@ -0,0 +1,106 @@
 ---
 title: "Archiving Toots"
 date: 2022-05-20T22:47:48-04:00
 draft: false
 tags: ["Hugo", "Mastodon"]
 math: false
 ---
 In the spirit of [syndicating Mastodon toots](https://brandonrozek.com/blog/why-i-pesos-from-mastodon/)
 to my own site, I wrote a Python script that turns toots into Hugo markdown
 files.
 In this post we'll go over:
 - [Mastodon API](#mastodon-api)
 - [Reformatting toot](#reformatting-toot)
 - [Creating the Markdown files](#creating-the-markdown-files)
 - [Conclusion](#conclusion)
 ## Mastodon API
 Before we can retrieve our toots, we need to know the user id of our account.
 James Cahill wrote a very clean [web tool](https://prouser123.me/mastodon-userid-lookup/)
 to grab your user id. For the sake of example, we'll use mine, which
 is 108219415927856966.
 To grab the statuses, we then need to access the following URL:
 ```
 https://SERVER/api/v1/accounts/USERID/statuses
 ```
 For my specific user:
 ```
 https://fosstodon.org/api/v1/accounts/108219415927856966/statuses
 ```
 By default, this will return 20 statuses in an array.
 To see how to parse each individual status, check out my
 post on [displaying a single toot](https://brandonrozek.com/blog/displaying-a-toot-hugo/).
 You can use the `limit` parameter to set how many statuses you wish to see.
 The maximum number you can set it to is 40.
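As a minimal sketch of a single request, the snippet below builds the statuses URL with a clamped `limit` and fetches one page using only the standard library. The function names `statuses_url` and `fetch_statuses` are hypothetical, and `server` is assumed to include the scheme (for example `https://fosstodon.org`).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

MAX_LIMIT = 40  # the largest value the API accepts for `limit`


def statuses_url(server, user_id, limit=20):
    """Build the statuses URL, clamping `limit` to the API maximum."""
    params = {"limit": min(limit, MAX_LIMIT)}
    return f"{server}/api/v1/accounts/{user_id}/statuses?{urlencode(params)}"


def fetch_statuses(server, user_id, limit=20):
    """Return one page of statuses as a list of dicts (makes a network request)."""
    with urlopen(statuses_url(server, user_id, limit)) as response:
        return json.load(response)


print(statuses_url("https://fosstodon.org", "108219415927856966", limit=100))
```

Asking for 100 statuses here still produces a URL with `limit=40`, since the server would not return more than that anyway.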
 To see more than the last 40 toots, we can use another
 parameter called `max_id`, which restricts the response to toots
 with an id smaller than the given value.
 You can then use this parameter to grab all your toots
 with the following algorithm:
 - Make an initial query to the API
 - Find the smallest toot id in the response
 - While the response is not empty
   - Send a query with the `max_id` set to the smallest toot id known
   - Update the smallest toot id known
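The steps above can be sketched around a pluggable page fetcher, which makes the loop easy to try without contacting a server. Everything here is illustrative: `fetch_all`, `fake_page`, and `FAKE_TOOTS` are hypothetical names, and the sketch assumes toot ids are numeric strings so that they can be compared as integers.

```python
def fetch_all(fetch_page):
    """Collect every toot by paging backwards with max_id.

    fetch_page(max_id) must return a list of toot dicts whose ids are
    strictly below max_id (or the newest toots when max_id is None).
    """
    toots = []
    max_id = None
    while True:
        page = fetch_page(max_id)
        if not page:
            break
        toots.extend(page)
        # Assumption: ids are numeric strings, so compare them as integers.
        max_id = min(int(toot["id"]) for toot in page)
    return toots


# Stand-in for the API: five fake toots served two at a time, newest first.
FAKE_TOOTS = [{"id": str(i)} for i in range(5, 0, -1)]


def fake_page(max_id, limit=2):
    older = [t for t in FAKE_TOOTS if max_id is None or int(t["id"]) < max_id]
    return older[:limit]


all_toots = fetch_all(fake_page)
print([toot["id"] for toot in all_toots])
```

Against a real server, `fetch_page` would wrap a request to the statuses endpoint with `max_id` added to the query string.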
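 Together, `limit` and `max_id` let us grab any specified number of toots.
 Here's that loop in Python, where `SERVER` (including the `https://` scheme), `UID`,
 `RETRIEVE_NUM_TOOTS`, and `MAX_TOOTS_PER_QUERY` are configuration constants:
 ```python
 import json
 import math
 from urllib.request import urlopen
 max_id = None
 limit = min(MAX_TOOTS_PER_QUERY, 40)  # the API caps limit at 40
 for _ in range(math.ceil(RETRIEVE_NUM_TOOTS / MAX_TOOTS_PER_QUERY)):
     url = f"{SERVER}/api/v1/accounts/{UID}/statuses?limit={limit}"
     if max_id is not None:
         url += f"&max_id={max_id}"
     with urlopen(url) as r:
         response = json.load(r)
     if len(response) == 0:
         break
     # Remember the smallest id seen (assumes ids are numeric strings)
     max_id = min(int(toot["id"]) for toot in response)
     # Process response...
 ```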
 ## Reformatting Toot
 Rather than storing the JSON of each toot verbatim, I make some changes
 to it for the following reasons:
 - By default, every toot carries a lot of account information. This is an issue because
 whenever one of those values changes, such as my follower count, I would need to update all my toots.
 - Hugo expects certain field names to exist, for example `date`.
 I delete the following account information from my toot archive:
 - Lock status
 - Bot
 - Discoverable
 - Group
 - Created_at
 - Note
 - Follower count
 - Following count
 - Status count
 - Last status at
 I create a `date` field based on the toot's `created_at`.
 The `url` field in Mastodon conflicts with Hugo's own `url` front matter field,
 so I rename it to `syndication`.
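A minimal sketch of these changes might look like the following. The account field names are the ones the Mastodon API uses for the items listed above. The function name `reformat_toot` and the sample toot (including the `example.social` URL) are made up for illustration. Keeping the original `created_at` alongside `date` is an assumption, not necessarily what the full script does.

```python
from copy import deepcopy

# Account fields dropped from the archive (names as used by the Mastodon API).
ACCOUNT_FIELDS_TO_DROP = [
    "locked", "bot", "discoverable", "group", "created_at", "note",
    "followers_count", "following_count", "statuses_count", "last_status_at",
]


def reformat_toot(toot):
    """Return a copy of the toot adjusted for use as Hugo front matter."""
    result = deepcopy(toot)
    for field in ACCOUNT_FIELDS_TO_DROP:
        result["account"].pop(field, None)
    result["date"] = result["created_at"]  # Hugo reads `date`
    result["syndication"] = result.pop("url")  # avoid clashing with Hugo's `url`
    return result


# Hypothetical, heavily trimmed toot for demonstration.
sample = {
    "id": "1",
    "created_at": "2022-05-20T22:47:48.000Z",
    "url": "https://example.social/@me/1",
    "content": "<p>Hello</p>",
    "account": {"username": "me", "followers_count": 10, "bot": False},
}
print(reformat_toot(sample))
```

Working on a deep copy leaves the API response untouched, which helps if the raw data is needed later in the same run.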
 ## Creating the Markdown files
 From here I construct a markdown file for each toot
 based on this template:
 ```
 ---
 {Toot JSON}
 ---
 JSON.content
 ```
 I save each toot in an individual markdown file under `content/toots`.
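As a rough sketch, rendering the template and saving it could be done as below. Naming each file after the toot id is an assumption, as are the function names `toot_to_markdown` and `save_toot`.

```python
import json
from pathlib import Path


def toot_to_markdown(toot):
    """Render a toot as front matter followed by its HTML content."""
    front_matter = json.dumps(toot, indent=2)
    return f"---\n{front_matter}\n---\n{toot['content']}\n"


def save_toot(toot, directory="content/toots"):
    """Write the toot to <directory>/<toot id>.md and return the path."""
    path = Path(directory) / f"{toot['id']}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(toot_to_markdown(toot), encoding="utf-8")
    return path
```

Calling `save_toot(toot)` for every reformatted toot would fill `content/toots` with one file per toot.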
 ## Conclusion
 My full [script](https://github.com/Brandon-Rozek/website/blob/master/refreshtoots_v2.py)
 is on GitHub.
 The script will let you know of any toot IDs that are created
 and/or updated. I then add these toots to Git for version control
 just like my posts.
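One way to report created and updated toots is to compare the rendered text with what is already on disk before writing. The sketch below is not taken from the full script, and `write_if_changed` is a hypothetical helper name.

```python
from pathlib import Path


def write_if_changed(path, text):
    """Write text to path; return "created", "updated", or "unchanged"."""
    path = Path(path)
    if not path.exists():
        status = "created"
    elif path.read_text(encoding="utf-8") != text:
        status = "updated"
    else:
        return "unchanged"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return status
```

Skipping unchanged files keeps their modification times stable, so `git status` only shows toots that actually changed.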