OPML to Markdown to create a blogroll page | Bacardi55's Web Cave

“Yet another post about OPML / blogroll? But you already automated your blogroll page via your build process” you may ask… And indeed I did.

But after some discussion about whether to blogroll or not, Steve mentioned on Mastodon that he created his blogroll page manually and that it was a painful process. As I had just written my automation script, I thought I could help him automate his process a bit.

After a few emails, the goal was simple: take the OPML export file from Feedly (his feed reader) and convert it into a Markdown file (the format used to generate his blog).

His OPML file uses the standard format and looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<opml version="1.0">
  <head>
    <title>[…] subscriptions in feedly cloud</title>
  </head>
  <body>
    <outline text="Friends">
      <outline title="title of the feed" text="description of the feed" xmlUrl="<url to the feed>" htmlUrl="<url of the web page>" type="rss"></outline>
      <outline title="title of the feed" text="description of the feed" xmlUrl="<url to the feed>" htmlUrl="<url of the web page>" type="rss"></outline>
      […]
    </outline>
    <outline text="Interesting People">
      […]
    </outline>
    […]
  </body>
</opml>
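
To make the nesting concrete, here is a minimal sketch that parses a small OPML document from a string and maps each category to its feeds. The sample data (feed names and example.com URLs) and the `categories` helper are made up for illustration and are not taken from Steve's file.

```python
import xml.etree.ElementTree as ET

# Made-up sample following the structure above (not real feed data).
SAMPLE_OPML = """<?xml version="1.0" encoding="UTF-8"?>
<opml version="1.0">
  <head><title>sample subscriptions</title></head>
  <body>
    <outline text="Friends">
      <outline title="Alice" text="Alice" xmlUrl="https://example.com/a.xml" htmlUrl="https://example.com/a" type="rss"/>
    </outline>
    <outline text="Interesting People">
      <outline title="Bob" text="Bob" xmlUrl="https://example.com/b.xml" htmlUrl="https://example.com/b" type="rss"/>
    </outline>
  </body>
</opml>
"""


def categories(opml_text):
    """Map each category name to a list of (title, htmlUrl) pairs."""
    # Passing bytes lets the parser honor the XML encoding declaration.
    root = ET.fromstring(opml_text.encode("utf-8"))
    result = {}
    # Top-level outlines under <body> are categories; their children are feeds.
    for category in root.find("body"):
        result[category.get("text")] = [
            (feed.get("title"), feed.get("htmlUrl")) for feed in category
        ]
    return result


if __name__ == "__main__":
    for name, feeds in categories(SAMPLE_OPML).items():
        print(name, feeds)
```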

And the Markdown file looked like this:

Some introduction text.

**Last updated: <Month> <Year>**

### Friends
- [Blogger Name](https://example.com/)
- [Blogger Name 2](https://example.com/)
- […]
### Interesting People
- [Name 3](https://example.com/)
- […]
[…]

So I created a small (24 loc) Python script (opml_to_md.py) to do that:

import xml.etree.ElementTree as ET
from datetime import datetime

# Path to the original OPML file:
opml_file = "./feeds.opml"
header = "Some Introduction text."

tree = ET.parse(opml_file)
root = tree.getroot()

md = header + "\n\n"
md += "**Last Updated: " + datetime.today().strftime("%B %Y") + "**\n\n"

# Each top-level outline under <body> is a category; its children are the feeds.
for o in root.find("body"):
    md += "### " + o.get("text") + "\n"
    for child in o:
        md += "- [" + child.get("title") + "](" + child.get("htmlUrl") + ")\n"

    md += "\n"

with open("blogroll.md", "w", encoding="utf-8") as f:
    f.write(md)

Now he can export the OPML file from Feedly, rename it feeds.opml, save it in the same directory as the script, and run python opml_to_md.py. This creates the blogroll.md file, with a configurable introduction text and the date set automatically.
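
One caveat: the script assumes every feed entry has both a title and an htmlUrl attribute. `get()` returns `None` for a missing attribute, and concatenating `None` with a string raises a `TypeError`. Below is a minimal sketch of a more defensive variant. The function name `opml_to_markdown` and the fallback rules (use `text` when `title` is missing, use `xmlUrl` when `htmlUrl` is missing, skip entries that still lack a title or a URL) are assumptions for illustration, not part of the original script.

```python
import xml.etree.ElementTree as ET


def opml_to_markdown(opml_text):
    """Render OPML categories as Markdown sections, tolerating missing attributes."""
    root = ET.fromstring(opml_text.encode("utf-8"))
    body = root.find("body")
    if body is None:
        return ""
    lines = []
    for category in body:
        lines.append("### " + (category.get("text") or "Uncategorized"))
        for feed in category:
            # Assumed fallbacks: title -> text, htmlUrl -> xmlUrl.
            title = feed.get("title") or feed.get("text")
            url = feed.get("htmlUrl") or feed.get("xmlUrl")
            if not title or not url:
                continue  # Skip entries that cannot form a link.
            lines.append("- [" + title + "](" + url + ")")
        lines.append("")  # Blank line after each category.
    return "\n".join(lines)
```

The header and the "Last Updated" line from the script can be prepended to the returned string before writing it to blogroll.md.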

It seems to be working now, which is great! One thing I didn’t do, though, is manage the order of the categories: the script renders them in the order found in the OPML file. I asked Steve about it, but he was OK with just copy/pasting categories into the right order, so I didn’t add this to the script.
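
For anyone who would rather have the script handle the ordering, here is one possible sketch. The `PREFERRED_ORDER` list and the `sort_categories` helper are hypothetical names, and the rule that unlisted categories come after the listed ones, in their original OPML order, is just one reasonable choice.

```python
import xml.etree.ElementTree as ET

# Hypothetical preferred order; categories not listed here come last.
PREFERRED_ORDER = ["Interesting People", "Friends"]


def sort_categories(body, preferred=PREFERRED_ORDER):
    """Return the category elements of <body> sorted by the preferred order."""
    rank = {name: index for index, name in enumerate(preferred)}
    # sorted() is stable, so categories with the same rank keep the OPML order.
    return sorted(body, key=lambda c: rank.get(c.get("text"), len(preferred)))
```

In the script, the outer loop would then read `for o in sort_categories(root.find("body")):`.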

Feel free to use it to generate your own blogroll page too, as I love those! And reach out if it doesn’t work or if you need help.
