Marty McGuire

Archive for May 2019

Fri May 31
🔖 Bookmarked Changing my Mind about AI, Universal Basic Income, and the Value of Data – The Art of Research https://theartofresearch.org/ai-ubi-and-data/

“I am not worried that the lost jobs of the AI-pocalypse will lead to a mass of people with no way to meaningfully contribute to society, however I am worried about the AI-pocalypse leading to people’s real, meaningful, contributions going unacknowledged and uncompensated.”

Wed May 29
📗 Want to read Space Opera by Catherynne M. Valente ISBN: 9781472115072
📗 Want to read What the Hell Did I Just Read by David Wong ISBN: 9781250040206
📗 Want to read This Book Is Full of Spiders by David Wong ISBN: 9780312546342
Tue May 21
🔖 Bookmarked https://www.newyorker.com/tech/annals-of-technology/can-indie-social-media-save-us
Fri May 10
☑ RSVP'd to an event https://indieweb.org/events/2019-05-11-homebrew-website-club-nyc
post 🗽 Homebrew Website Club NYC
Join us for an afternoon of IndieWeb personal site demos and discussions!
I'm going!

Let’s try a Saturday afternoon!

From 1-3pm, join us for an IndieWeb Meetup at Think Coffee on 8th Av at 14th St in Manhattan!

Come work on your personal website, whether it exists yet or not!

This post from Calum finally brought out the FOMO I had been suppressing for IndieWebCamp Berlin.

Really looking forward to the 2019 IndieWeb Summit June 29-30th in Portland!

https://calumryan.com/blog/indiewebcamp-berlin-2019/

post from
IndieWebCamp was back in Berlin again this month for a weekend of talks, discussion and making, along with a meeting for IndieWeb organisers the day before.

Archiving rooms from a Matrix.org Homeserver (including end-to-end encrypted rooms)

I'm in the middle of a Forever Project, migrating stuff and services off of an old server in my closet at home onto a new (smaller, better, faster!) server in my closet at home.

One such service is a Matrix.org Synapse homeserver that was used as a private Slack-alternative chat for my household, as well as a bridge to some IRC channels. I set it up by hand in haste some years ago and made some not-super-sustainable choices about it, including leaving the database in SQLite (2.2GB and feelin' fine), not documenting my DNS and port-forwarding setup very well, and a few other "oopsies".

I had been keeping the code up to date via "pip install" and the latest "master" tarballs, but when the announcement came about needing valid TLS for federation starting in 0.99.X, I wasn't sure if I was good to upgrade. (I later found out that I was okay, ha!)

I found some docs on the most recent ways to set up Matrix on a new server, and even on how to migrate from SQLite to PostgreSQL. However, I don't know if I'll be able to set aside the time to do it all at once, or if it'll be easier just to set it up fresh, or even if I need a homeserver right now. So, I decided to figure out how to make archives of the rooms I cared about, which included household conversations, recipes, and photos from around the house and on travels.

Overview

The process turned out to be pretty involved, which is why it gets a blog post! It boils down to needing these three things:

  • osteele/matrix-archive - Export a Matrix room message archive and photos.
  • matrix-org/pantalaimon - A proxy to handle end-to-end encrypted (E2EE) room content for matrix-archive
  • matrix-org/Olm - C library to handle the actual E2EE processing. Pantalaimon relies on this library and its Python extensions.

Getting all the tools built required a pretty recent system, which my old server ain't. I ended up building and running them on my personal laptop, running Ubuntu 19.04.

Since both matrix-archive and pantalaimon are Python-based, I created a Python 3.7 virtualenv to keep everything in, rather than installing everything system-wide.

Olm

The Olm docs recommend building with CMake. As someone unfamiliar with CMake, I could get it to build and run tests, but could not actually get it installed on my system.

I ended up installing the main lib with:

  make && sudo make install

The Python extensions were a challenge and I am not sure that I remember all the details to properly document them here. I spent a good amount of time trying to follow the Olm instructions to get them installed into my Python virtualenv.

In the end, the pantalaimon install built its own version of the Python Olm extensions, so I'm going to guess this was enough for now.

Pantalaimon

The pantalaimon README was pretty straightforward, once I installed Olm system-wide. I activated my virtualenv and ran:

  python setup.py install

That resulted in a "pantalaimon" script installed in my virtualenv's bin dir, so I could (in theory) run it on the command line, pointing it at my running Synapse server:

  pantalaimon https://matrix.example.com:8448

That started a service on http://127.0.0.1:8009/ which matrix-archive would connect to, with pantalaimon handling all the E2EE decryption transparently.
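A quick way to confirm the proxy is answering before pointing matrix-archive at it, sketched in Python. This assumes pantalaimon forwards the unauthenticated `/_matrix/client/versions` request to the homeserver behind it; the function names `versions_url` and `check_proxy` are made up for this sketch.

```python
import json
from urllib.request import urlopen


def versions_url(base_url):
    """Build the URL of the client-server 'versions' endpoint."""
    return base_url.rstrip("/") + "/_matrix/client/versions"


def check_proxy(base_url="http://127.0.0.1:8009", timeout=5):
    """Return the spec versions reported through the proxy.

    Raises urllib.error.URLError if nothing is listening at base_url.
    """
    with urlopen(versions_url(base_url), timeout=timeout) as resp:
        return json.load(resp).get("versions", [])
```

Calling `check_proxy()` with pantalaimon running should return a list of version strings; a connection error means the proxy is not up or is listening somewhere else.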

matrix-archive

The matrix-archive setup instructions suggest using a dependency manager called "Pipenv" that I was not familiar with. I installed it in my virtualenv, then ran it to set up and install matrix-archive:

  pip install pipenv
  pipenv install

Pipenv "noticed" it was running in a virtualenv, and said so. This didn't seem to be much of a problem, but any command I tried to run with "pipenv run" would fail. I worked around this by looking in the "Pipfile" to see what commands were actually being run, and it turns out it was just calling specific Python scripts in the matrix-archive directory. So, I resolved to run those by hand.

MongoDB

matrix-archive requires MongoDB. I don't use it for anything else, so I had to "sudo apt install mongodb-server".

Running the Import

First, I set the environment variables needed by matrix-archive:

  export MATRIX_USER=<my username>
  export MATRIX_PASSWORD=<my password>
  export MATRIX_HOST=http://127.0.0.1:8009

Then confirmed it was working by getting a list of rooms with IDs:

  python list_rooms.py

I set up the list of room IDs in an environment variable:

  export MATRIX_ROOM_IDS='!room:server,!room2:server,...'

And slurped in all the messages with:

  python import_messages.py

At the end, it said it had a bunch of messages. Hooray!
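Since the scripts depend entirely on those environment variables, a small pre-flight check can catch a typo before a long import run. This is only a sketch: `load_settings` is a hypothetical helper, not part of matrix-archive, and it assumes the room IDs are a comma-separated list as in the export above.

```python
import os

REQUIRED = ("MATRIX_USER", "MATRIX_PASSWORD", "MATRIX_HOST", "MATRIX_ROOM_IDS")


def load_settings(env=None):
    """Collect the variables set above; raise if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Split the comma-separated room list, ignoring stray whitespace.
    room_ids = [r.strip() for r in env["MATRIX_ROOM_IDS"].split(",") if r.strip()]
    return {
        "user": env["MATRIX_USER"],
        "host": env["MATRIX_HOST"],
        "room_ids": room_ids,
    }
```

Running `load_settings()` in the same shell session either returns the parsed settings or names exactly which variables are unset.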

Running the Export

This is where things kind of went off the rails. In trying to export messages I kept seeing Python KeyErrors about a missing 'info' key. It seems like maybe the Matrix protocol was updated to make this an optional key. The upshot was that matrix-archive seemed to assume that every message with an image attached would have an 'info' object describing a thumbnail for that image.
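For illustration, here is a defensive way to read an image message in Python. It is not matrix-archive's actual code: `describe_image` and the shape of its return value are invented for this sketch, and it assumes the message content is a dict laid out like a Matrix `m.image` event, where 'info' and everything inside it may be absent.

```python
import mimetypes


def describe_image(content):
    """Return the URL, thumbnail URL, and best-guess MIME type of an image.

    Hypothetical helper: 'info' and all of its fields are treated as optional.
    """
    info = content.get("info") or {}
    mimetype = info.get("mimetype")
    if mimetype is None:
        # Fall back to guessing from the filename stored in 'body', if any.
        mimetype, _ = mimetypes.guess_type(content.get("body", ""))
    return {
        "url": content.get("url"),
        "thumbnail_url": info.get("thumbnail_url"),
        "mimetype": mimetype or "application/octet-stream",
    }
```

Using `.get()` with fallbacks means a message without 'info' yields a missing thumbnail rather than a KeyError halfway through an export.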

Additionally, the script to download images had some naive handling for turning attachment URLs like "mxc://example.com/..." into downloadable URLs. Matrix supports DNS-based delegation, so you can say "the Matrix server for example.com is matrix.example.com:8448", and this script didn't handle that.

I did some nasty hacks to only get full-sized images, and from the right host:

  • updated the schema to return the full image URL instead of digging in for a thumbnail
  • added handling to export_messages.py to handle missing 'info', which was used to guess image mimetypes
  • added some hardcoding to map converted "mxc://" URLs to the right host.
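The hardcoded host mapping in the last bullet can be sketched roughly like this. It is an illustration rather than the actual patch: `MEDIA_HOSTS` and `mxc_to_http` are hypothetical names, the hosts are the example ones from this post, and the path uses the `r0` media download endpoint that homeservers exposed at the time (newer servers may expect a different path).

```python
from urllib.parse import urlparse

# Hypothetical mapping from the server name inside an mxc:// URL to the
# host that actually serves media for it (hardcoded, as in the hack above).
MEDIA_HOSTS = {"example.com": "matrix.example.com:8448"}


def mxc_to_http(mxc_url, media_hosts=MEDIA_HOSTS):
    """Convert mxc://<server>/<media_id> into an HTTPS download URL."""
    parsed = urlparse(mxc_url)
    media_id = parsed.path.strip("/")
    if parsed.scheme != "mxc" or not parsed.netloc or not media_id:
        raise ValueError(f"not a valid mxc URL: {mxc_url!r}")
    server_name = parsed.netloc
    # Use the delegated host if one is known, else the server name itself.
    host = media_hosts.get(server_name, server_name)
    return f"https://{host}/_matrix/media/r0/download/{server_name}/{media_id}"
```

The server name stays in the path even when the request goes to a different host, because it identifies which homeserver the media belongs to.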

Afterwards I was able to do an export of alllllll the images to an "images/" folder:

  python download_images.py --no-thumbnails

And could then export a particular room's history with:

  python export_messages.py --room-id ROOM-NAME --local-images --filename ROOM-NAME.html

Note that the "--room-id" flag above actually wants the human-readable room name, unless it's actually a room on the main matrix.org server.

Afterwards, I could open ROOM-NAME.html in my browser, and see the very important messages and images I worked so hard to archive.

a message exchange. marty asks maktrobot to 'pug me'. maktrobot responds with an image of a pug.
Screenshot from the (very minimal) HTML export, including an image (of a pug, sourced by a chat bot).

What's Next?

For now, I'll be putting these files and images in a safe backup and not worrying about them too much, because I have them. I've already stopped my old Synapse server, and can tackle setting up the new one at my leisure. We've moved our house chats to Signal, and I've moved my IRC usage over to bridged Slack channels.

Running a Matrix Synapse homeserver for the past couple of years has been quite interesting! I really appreciate the hard-working community (especially in light of their recent infrastructure troubles), and I recognize that it's a ton of work to build a federating network of real-time, private communication. I enjoyed the freedom of having my own chat service to run bots, share images, and discuss private moments without worrying about who might be reading the messages now or down the road.

That said, there are still some major usability kinks to work out. The end-to-end encryption rollout across homeservers and clients hasn't been the smoothest, and it can be an issue juggling E2EE keys across devices. I look forward to seeing how the community addresses issues like these in the future!

TL;DR - saving an archive of a room's history and files should not be this hard.

Mon May 6

🗓️ True Colors - A Diverse Variety Show

Magnet Theater 254 W 29th St. New York, NY 10001

True Colors is a diverse variety show that celebrates the Magnet community! It will showcase improv, character pieces, stand up, and storytelling and be performed by teachers, students, and performers in the community. Come celebrate the comedy of our community with us!

Sun May 5

🗓️ UCB Improv 201 Showcase!

UCB Theatre, NY 555 w 42nd, New York, NY 10036

We’ve been deep in the lab the past 8 weeks and are excited to share our findings with you!

Come on out for a super-fun improv grad show!

Sat May 4

Cheers! 🍻

📍 Checked in at Kopitiam, New York, NY.
Thu May 2
🔖 Bookmarked Component frameworks and web standards https://hiddedevries.nl/en/blog/2019-02-28-component-frameworks-and-web-standards

“There are lots of organisations out there using Angular 1 in projects, and I see them struggle to find developers willing to do the work. They’ve since moved on to newer frameworks. The orgs are stuck with outdated paradigms that are incompatible with the now.”

Sometimes you boop the snoot, and sometimes the snoot boops you.

👉🐈

https://autoboop.com/