Marty McGuire

Posts Tagged flask

2017
Thu Jan 26

Spano - a minimum-viable Micropub Media Endpoint

Micropub is an open API standard for creating posts on one's own domain using third-party clients, and is currently a W3C Candidate Recommendation. One of the (semi-)recent additions is the idea of a Micropub Media Endpoint. The Media Endpoint provides a way for Micropub clients to upload media files to a Micropub service, receiving a URL that is sent along in place of the file contents when the post is published.

Some of the things I like about Micropub media endpoints include:

  • The spec allows the media endpoint to be on a completely separate domain from the "full" micropub endpoint.
  • The spec doesn't specify anything about how the files are stored or their final URLs or filenames.
  • They make it easy to separate the handling of (large) media files from the (presumably much smaller) content and metadata of a post.
  • They enable Micropub clients to upload multiple files without creating multiple posts. This makes it simpler to create posts that contain multiple images, like a gallery.

Personally, I wanted a Micropub media endpoint server with a few extra properties:

  • It should be able to run completely separately from, and therefore work in conjunction with, any other micropub server implementation.
  • It should not store duplicate files. If the same file is uploaded twice, the same URL should be returned both times.
  • It should not allow overwriting files. If two images of the same name are uploaded, both are kept and receive different URLs.

Enter HashFS

My extra features above essentially describe a content-addressable storage (CAS) system. CAS is a way of storing and accessing data based on some property of the actual content, rather than (potentially arbitrary) files and folders.

HashFS is a Python implementation of a content-addressable file management system. You give it files, and it puts them in a directory structure based on a cryptographic hash of each file's contents. In other words, HashFS can take any file and give back a unique path to that file which will never change (if you later upload a new version of the file, it gets a different path).
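
As a rough illustration of the idea (this is not HashFS's own code or API), the sketch below uses only the Python standard library. The function names content_path and store, the choice of SHA-256, and the four two-character directory levels are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


def content_path(data: bytes, extension: str = "", depth: int = 4, width: int = 2) -> str:
    """Derive a relative path from the SHA-256 hex digest of the data (hypothetical helper)."""
    digest = hashlib.sha256(data).hexdigest()
    # The first depth*width hex characters become nested directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(depth)]
    # The rest of the digest, plus the extension, becomes the file name.
    parts.append(digest[depth * width:] + extension)
    return "/".join(parts)


def store(root: str, data: bytes, extension: str = "") -> str:
    """Write data under root at its content-derived path, skipping the write if it already exists."""
    relpath = content_path(data, extension)
    abspath = os.path.join(root, *relpath.split("/"))
    if not os.path.exists(abspath):
        os.makedirs(os.path.dirname(abspath), exist_ok=True)
        with open(abspath, "wb") as f:
            f.write(data)
    return relpath


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        first = store(root, b"same bytes", ".jpg")
        second = store(root, b"same bytes", ".jpg")
        other = store(root, b"different bytes", ".jpg")
        print(first == second, first == other)
```

Because the path depends only on the bytes, uploading the same content twice yields the same path, and two different files that happen to share a name never collide.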

To add to the fun of HashFS, there is a Flask extension called Flask-HashFS which makes it easy to expose a HashFS file store on the web via the Python Flask framework.

Introducing Spano

Spano is a Micropub Media Endpoint server written in Python via the Flask framework which combines Flask-HashFS for file storage with Flask-IndieAuth (introduced earlier) to handle authentication and authorization.

Spano is a server-side web app that basically does one thing: it accepts HTTP POST requests with a valid IndieAuth token and a file named "file", stores that file, and returns a URL to that file. The task of serving uploaded files is left to a dedicated web server like nginx or Apache.

Using Spano

Once Spano has been set up and configured for your domain, uploading is a matter of getting a valid IndieAuth token. IndieAuth-enabled Micropub clients will do this automatically. For testing by hand I like to log in to Quill and copy the access token from the Quill settings page. With token in hand, uploads are as easy as:

curl -D - -F "file=@myfile.jpg" \
  -H "Authorization: Bearer xxxx..." \
  https://media.example.com/micropub/

Which should output a response like:

HTTP/1.1 100 Continue

HTTP/1.0 201 CREATED
Content-Type: text/html; charset=utf-8
Content-Length: 108
Location: https://media.example.com/cc/a5/97/7c/2004..2cb.jpg
Server: Werkzeug/0.11.4 Python/2.7.11
Date: Thu, 26 Jan 2017 02:40:05 GMT

File created: https://media.example.com/cc/a5/97/7c/2004..2cb.jpg

Integrating Spano with your Micropub Endpoint

If you want Micropub clients to use Spano as your Media Endpoint, you need to advertise it. This is handled by your "main" Micropub server using discovery. Essentially, a client will make a configuration request to your server like so:

https://example.com/micropub?q=config

And your server's response should be a JSON-formatted object specifying the "media-endpoint". A bare minimum example:

{
  "media-endpoint": "https://media.example.com/micropub/"
}
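
A minimal sketch of how a "main" Micropub server might answer that query, written as a plain function so it does not depend on any particular web framework. The name handle_query, the (status, headers, body) return shape, and the error body are assumptions for illustration, not part of Spano.

```python
import json

# Assumed value, copied from the example response above.
MEDIA_ENDPOINT = "https://media.example.com/micropub/"


def handle_query(args: dict) -> tuple:
    """Return (status, headers, body) for a Micropub GET query (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    if args.get("q") == "config":
        # Advertise the media endpoint to clients that ask for configuration.
        return 200, headers, json.dumps({"media-endpoint": MEDIA_ENDPOINT})
    # Any other query is rejected in this sketch.
    return 400, headers, json.dumps({"error": "invalid_request"})
```

In a Flask app, args would typically be request.args, and the tuple would be turned into a response object.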

In addition to advertising the media-endpoint, your Micropub server must be able to handle lists of URLs in places where it would normally expect a file.

For example, when posting a photo from Quill without a media endpoint, your Micropub server will receive a multipart/form-data encoded file named "photo". When posting from Quill with a media endpoint, your Micropub server will instead receive a list of URLs represented as "photo[]=https://media.example.com/cc/...2cb.jpg". Presumably this pattern would hold for other media types such as video and audio, if you are using Micropub clients that support them.
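
One way to pull those URLs out of a form-encoded request body is sketched below, using only the standard library. The function name photo_urls and the sample URLs in the usage are made up for the example; a framework like Flask would normally do the parsing for you.

```python
from urllib.parse import parse_qs


def photo_urls(form_body: str) -> list:
    """Collect photo URLs from a form-encoded Micropub request body (hypothetical helper)."""
    fields = parse_qs(form_body)
    # Accept both the list form ("photo[]") and a single "photo" value.
    candidates = fields.get("photo[]", []) + fields.get("photo", [])
    # Keep only values that look like URLs rather than other content.
    return [value for value in candidates if value.startswith(("http://", "https://"))]


if __name__ == "__main__":
    body = (
        "h=entry&content=Hello"
        "&photo[]=https%3A%2F%2Fmedia.example.com%2Fa.jpg"
        "&photo[]=https%3A%2F%2Fmedia.example.com%2Fb.jpg"
    )
    print(photo_urls(body))
```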

This particular step has been an interesting challenge for my site, which is a static site generated by Jekyll. My previous Micropub file-handling implementation expected all uploaded assets to live on disk next to the post files, and updating my Jekyll theme and plugins to handle the change is a work in progress. I eventually plan to move all my uploads out of the source for my project in favor of storing them with Spano.

Feedback Welcome!

Spano is probably my second public Python project, so I'd love feedback! If you try it out and run into issues, please drop me a line on GitHub. Or you can find me in the #indieweb chat on freenode IRC.

I'd also like to thank Kyle Mahan for his Woodwind Flask server application, which inspired the structure of Spano.

Flask-IndieAuth - A Python Library for Micropub Servers

One of the things I like about the IndieWeb community is that while they are building tools for themselves, they also tend to release useful parts under Free Software licenses. This helps other developers join the community more quickly, but it also tends to help improve the quality and feature sets of these projects as others use and add to the source.

One of my favorite things to come from the IndieWeb folks is the Micropub API standard, which defines some simple protocols for clients to send post data (the kinds of things you'd share on a blog or social media: images, short plain text, long articles, tags, and more) to servers for posting. One upshot is that if your server accepts Micropub, you can use one of many clients to put content on your site. I'm using a dedicated editor from Aaron Parecki's Quill to write this post, but there are lots of alternatives that are aimed at special use cases. For example, Kyle Mahan's Woodwind is an IndieWeb reader app that happens to include functionality for posting replies, favorites, reposts, and even RSVPs directly to my site via Micropub.

Another favorite is the idea of IndieAuth for web sign-in. At a high level, the idea is that you create two-way links between your website and your user profile on some other silo. For example, on your homepage you add a link to your Twitter profile and on your Twitter profile you link back to your homepage. For a client that supports IndieAuth, I can log in using my homepage URL by verifying that I can log in to my Twitter account.

My own personal Micropub implementation is a little pile of spaghetti Python code making use of the Flask framework. I use IndieAuth to handle authentication (i.e., proving that a post comes from an app that I've logged into) and authorization (i.e., proving that I gave that app permission to post to my site). As I've started improving my Micropub implementation, I found it useful to extract that portion of my code into a library that can be used with other Flask applications.

Introducing Flask-IndieAuth

Flask-IndieAuth is a Flask extension that adds the ability to require a client to send a valid IndieAuth token when making requests to any route. For example:

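from flask import Flask
from flask_indieauth import requires_indieauth

app = Flask(__name__)
# ME and TOKEN_ENDPOINT must also be set in app.config (see the README)

@app.route('/micropub', methods=['GET', 'POST'])
@requires_indieauth
def handle_micropub():
    # ... handle the request
    return 'OK'  # placeholder response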

The @requires_indieauth decorator runs before the code for the route. It currently looks for an IndieAuth token in one of three places, in order:

  • HTTP Header (e.g. "Authorization: Bearer xxxx...xx")
  • HTTP form data or query string (e.g. "?access_token=xxxx...xx")
  • The body of a JSON-encoded POST (e.g. {"access_token": "xxxx...xx"})
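
The lookup order above can be sketched as a plain function. This is an illustration of the ordering only, not the extension's actual code; the name find_token and its three arguments are assumptions.

```python
from typing import Optional


def find_token(headers: dict, params: dict, json_body: Optional[dict]) -> Optional[str]:
    """Return the first access token found, checking the three places in order (hypothetical helper)."""
    # 1. HTTP header: "Authorization: Bearer xxxx"
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        token = auth[len("Bearer "):].strip()
        if token:
            return token
    # 2. Form data or query string: access_token=xxxx
    if params.get("access_token"):
        return params["access_token"]
    # 3. JSON-encoded body: {"access_token": "xxxx"}
    if isinstance(json_body, dict) and json_body.get("access_token"):
        return json_body["access_token"]
    return None
```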

If a token is found, it will be verified against the configured Token Endpoint to confirm that it is a valid token issued for your server's configured homepage with a sufficient scope.
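
As a sketch of what that check involves, assume the token endpoint replies with a form-encoded body containing "me" and "scope" fields. That response format, the function names, and the URL normalization below are all assumptions for illustration; the extension's real logic may differ.

```python
from urllib.parse import parse_qs, urlparse


def _normalize(url: str) -> tuple:
    """Reduce a URL to (scheme, host, path) so a trailing slash or host case does not matter."""
    parsed = urlparse(url)
    return (parsed.scheme.lower(), parsed.netloc.lower(), parsed.path.rstrip("/"))


def token_is_valid(response_body: str, expected_me: str, required_scope: str = "post") -> bool:
    """Check a token endpoint's form-encoded reply against the configured homepage and scope."""
    fields = parse_qs(response_body)
    me = fields.get("me", [""])[0]
    # Scopes are assumed to arrive as a space-separated list.
    scopes = fields.get("scope", [""])[0].split()
    if not me:
        return False
    return _normalize(me) == _normalize(expected_me) and required_scope in scopes
```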

For more information on how to install, configure, and use Flask-IndieAuth, please check out the README on GitHub.

Next Steps

I'll be using this extension to build my Micropub media endpoint (coming up in a future post) and so far it is working just fine. That said, I know there is a lot of room for improvement. Some things on my list:

  • "scope" can have many values, but only "post" is supported for now. It should probably be passed as an argument to @requires_indieauth so different routes can have different requirements.
  • The configured homepage ("ME") is currently expected in the Flask app's config. I'm not sure if that's "standard".
  • "TOKEN_ENDPOINT" is currently expected in the Flask app's configuration, but since it must be advertised either in the homepage's HTTP headers or as a <link> element in its content, the server could fetch it instead.
  • Error handling isn't great - all failure conditions currently return HTTP 400 (Bad Request) but should probably be diversified a bit.
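
To make the first and last items concrete, here is one possible shape for a decorator that takes the required scope as an argument and returns different status codes for different failures. This is a hypothetical design, not something Flask-IndieAuth provides today: the name requires_scope, the get_token_scopes callback, and the 401/403 split (borrowed from the OAuth 2.0 bearer token convention) are all assumptions.

```python
import functools


def requires_scope(scope, get_token_scopes):
    """Build a decorator that only runs the view if the token carries the given scope.

    get_token_scopes is a hypothetical callback that returns the verified token's
    scopes as a list, or None if no valid token was presented.
    """
    def decorator(view):
        @functools.wraps(view)
        def wrapper(*args, **kwargs):
            scopes = get_token_scopes()
            if scopes is None:
                # No token, or the token failed verification.
                return ("unauthorized", 401)
            if scope not in scopes:
                # Valid token, but not allowed to do this.
                return ("insufficient_scope", 403)
            return view(*args, **kwargs)
        return wrapper
    return decorator
```

With something like this, a media upload route and a posting route could each state the scope they need.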

Feedback Welcome!

This is my first published Flask extension (heck, it's my first public Python package on PyPI), and I'd really appreciate comments, questions, pull requests, etc. Feel free to reach out on GitHub, or you can find me in the #indieweb chat on freenode IRC.