This Week in the IndieWeb Audio Edition • April 28th - May 4th, 2018


Syndication setbacks, WordPress wisdom, and GDPR wrangling. It’s the audio edition of This Week in the IndieWeb for April 29th - May 5th, 2018.

You can find all of my audio editions and subscribe with your favorite podcast app here: martymcgui.re/podcasts/indieweb/.

Music from Aaron Parecki’s 100DaysOfMusic project: Day 85 - Suit, Day 48 - Glitch, Day 49 - Floating, Day 9, and Day 11

Thanks to everyone in the IndieWeb chat for their feedback and suggestions. Please drop me a note if there are any changes you’d like to see for this audio edition!

★ Favorited http://kimberlyhirsh.com/doing-my-part-to-fix-the-internet-a-follow-up/
post from Doing My Part to Fix the Internet: A Follow-up
A little over a year ago, I wrote about how a post by Vicki Boykis and a comment by Chris Aldrich had inspired me to do my part to fix the internet. Since that time, I’ve worked hard to get my WordPress site set up so that I can write content here, send it out to other places where people...
★ Favorited https://eddiehinkle.com/2018/05/02/16/article/
Indigenous Development Log #1
I was inspired by reading Chris Hannah’s Slate Development Log that I should track the progress on my own iOS app, Indigenous. I apologize for the length of this first post, but as this is the first log I want to cover what Indigenous is and what it can currently do. In future development logs I will focus more on...
Reacted 👍 to http://30andcounting.me/2018/05/04/185149.html
post from 30 and Counting, Episode 5: Leaving Facebook... and replying over email?
In this episode, I talk about my plans to leave Facebook and how I plan to in some ways replace it with a monthly newsletter. Then I brainstorm about how to receive replies and reactions from it. Links and Show Notes Monthly Newsletter Sign Up Leaving Facebook article Leaving Facebook August Announcement Notes from last Baltimore Homebrew Website Club You...

Leaving Netflix (and taking my data with me)

Netflix has been a staple in my life for years, from the early days of mailing (and neglecting) mostly-unscratched DVDs through the first Netflix original series and films. With Netflix as my catalog, I felt free to rid myself of countless DVDs and series box sets. Through Netflix I caught up on "must-see" films and shows that I missed the first time around, and discovered unexpected (and wonderfully strange) things I would never have seen otherwise. Countless conversations have hinged on something that I've seen / am binging / must add to my list.

At times, this has been a problem. It's so easy to start a show on Netflix and simply let it run. At home we frequently spend a whole evening grinding through the show du jour. Sometimes whole days and weekends disappear. This can be true of more and more streaming services but, in my house, it is most true for Netflix. We want to better use our time, and avoid the temptation to put up our stocking'd feet, settle in, and drop out.

It's easy enough to cancel a subscription, and even easier to start up again later if we change our minds. However, Netflix has one, even bigger, hook into my life: my data. Literal years of viewing history, ratings, and the list of films and shows that I (probably) want to watch some day. I wanted to take that data with me, in case we don't come back and they delete it.

Netflix once had an API, but they shut it down in 2014. It's so thoroughly dead that even the blog post announcing the shutdown is now gone from the internet.

Despite no longer having a formal API, Netflix is really into the single-page application style of development for their website. Typically this is a batch of HTML, CSS, and JavaScript that runs in your browser and uses internal APIs to fetch data from their service. Even better, they are fans of infinite scrolling, so you can open up a page like your "My List", and it loads more data as you scroll, until you've got it all loaded in your browser.

Once you've got all that information in your browser, you can script it right out of there!

After some brief Googling, I found a promising result about using the Developer Console in your browser to grab your My List data. That gave me the inspiration I needed to write some JavaScript snippets to grab the three main lists of data that I care about:

  • my "My List" of shows and films to watch
  • my ratings
  • my watch history

Each of these pages displays slightly different data, in HTML that can be extracted with very little JavaScript, and each loads more data as you scroll the page until it's all loaded. To extract it, I needed some code to walk the entries on the page, extract the info I want, store it in a list, turn the list into a JSON string, and then copy that JSON data to the clipboard. From there I can paste the JSON into a data file to mess with later, if I want.
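Scrolling each page to the bottom by hand gets tedious, so here's a sketch of a console snippet to do it automatically. This is my own helper, not anything Netflix provides, and the one-second delay is a guess at how long the next batch takes to load:

```javascript
// Pure check: the page is done loading when its height stops
// growing between scrolls.
function doneLoading(prevHeight, currHeight) {
  return currHeight === prevHeight;
}

// Paste into the DevTools console and call loadAllEntries(); it
// scrolls to the bottom once a second until the page stops getting
// taller, then logs a message.
function loadAllEntries() {
  var lastHeight = 0;
  var timer = setInterval(function () {
    window.scrollTo(0, document.body.scrollHeight);
    if (doneLoading(lastHeight, document.body.scrollHeight)) {
      clearInterval(timer);
      console.log('All entries loaded.');
    }
    lastHeight = document.body.scrollHeight;
  }, 1000);
}
```

Once the "All entries loaded." message appears, the extraction snippets below should see every entry.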

Extracting my Netflix Watch List

A screenshot of my watch list

The My List page has a handful of useful pieces of data that we can extract:

  • Name of the show / film
  • URL to that show on Netflix
  • A thumbnail of art for the show!

After eyeballing the HTML, I came up with this snippet of code to pull out the data and copy it to the clipboard:

(function(list){
  // Each entry on the My List page is a link with this class
  document.querySelectorAll('a.slider-refocus')
    .forEach(item => {
      list.push({
        title: item.getAttribute('aria-label'),
        url: item.getAttribute('href'),
        // backgroundImage looks like url("…"); slice(5, -2) strips the wrapper
        art: item.querySelector('.video-artwork').style.backgroundImage.slice(5,-2)
      })
    });
  // copy() is a DevTools console helper that puts the string on the clipboard
  copy(JSON.stringify(list, null, 2));
}([]));

The resulting data in the clipboard is an array of JSON objects like:

[
  {
    "title": "The Magicians",
    "url": "/watch/80092801?tctx=1%2C1%2C%2C%2C",
    "art": "https://occ-0-2433-2430.1.nflxso.net/art/b1eff/9548aa8d5237b3977aba4bddba257e94ee7b1eff.webp"
  },
  ...
]

I like this very much! I probably won't end up using the Netflix URLs or the art, since it belongs to Netflix and not me, but a list of show titles will make a nice TODO list, at least for the shows I want to watch that are not on Netflix.
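For example, the titles can be peeled out of that JSON into a plain-text TODO list, leaving Netflix's URLs and art behind. A small sketch, using just the sample entry from above:

```javascript
// The scraped My List data (trimmed to the one sample entry above).
var myList = [
  { title: "The Magicians", url: "/watch/80092801", art: "https://…" }
];

// Keep only the titles, formatted as a checklist.
var todo = myList
  .map(function (item) { return "- [ ] " + item.title; })
  .join("\n");

// todo is now "- [ ] The Magicians"
```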

Extracting my Netflix Ratings

A screenshot of my ratings list

More important to me than my to-watch list were my literal years of rating data. This page is very different from the image-heavy watch list page, and is a little harder to find. It's under the account settings section of the website, and is a largely text-based page consisting of:

  • date of rating (as "month/day/year")
  • title
  • URL (on Netflix)
  • rating (as a row of star elements; the lit-up stars indicate the rating I gave, and counting them yields a numerical value)

The code I used to extract this info looks like this:

(function(list){
  // Each rating is a row in a largely text-based table
  document.querySelectorAll('li.retableRow')
    .forEach(function(item){
      list.push({
        date: item.querySelector('.date').innerText,
        title: item.querySelector('.title a').innerText,
        url: item.querySelector('.title a').getAttribute('href'),
        // count only the lit-up ("personal") stars to get my rating
        rating: item.querySelectorAll('.rating .star.personal').length
      });
    });
  // copy() puts the JSON string on the clipboard
  copy(JSON.stringify(list, null, 2));
}([]));

The resulting data looks like:

[
  {
    "date": "9/27/14",
    "title": "Print the Legend",
    "url": "/title/80005444",
    "rating": 5
  },
  ...
]

While the URL probably isn't that useful, I find it super interesting to have the date that I actually rated each show!
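Since I may want to sort or chart those dates later, a tiny helper of my own (not part of the scraped data) can normalize Netflix's month/day/year strings to ISO 8601. It assumes all two-digit years are 20xx, which holds for my Netflix history:

```javascript
// Convert Netflix's "M/D/YY" date strings (e.g. "9/27/14") to ISO 8601.
function netflixDateToISO(dateStr) {
  var parts = dateStr.split('/'); // [month, day, two-digit year]
  var month = ('0' + parts[0]).slice(-2); // zero-pad to two digits
  var day = ('0' + parts[1]).slice(-2);
  return '20' + parts[2] + '-' + month + '-' + day;
}

// netflixDateToISO("9/27/14") → "2014-09-27"
```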

One thing to note: although I wasn't affected by it, Netflix has replaced their 0-to-5 star rating system with a thumbs up / down system. You'd have to tweak the script a bit to extract those values.
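A hedged sketch of what that tweak might look like. I haven't inspected the thumbs-era markup, so the class names here are pure guesses to show the shape of the change:

```javascript
// Hypothetical replacement for the star-counting logic. The
// 'thumb-up' / 'thumb-down' class names are assumptions; check the
// real markup in the Elements panel before relying on them.
function thumbValue(classNames) {
  if (classNames.indexOf('thumb-up') !== -1) { return 'up'; }
  if (classNames.indexOf('thumb-down') !== -1) { return 'down'; }
  return null; // unrated
}
```

Inside the extraction loop, the `rating:` line would then call something like `thumbValue(...)` on the rating cell's class list instead of counting stars.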

Extracting my Netflix Watch History

A screenshot of my watch history page

One type of data I was surprised and delighted to find available was my watch history. This was another text-heavy page, with:

  • date watched
  • name of show (including episode for many series)
  • URL (on Netflix)

The code I used to extract it looked like this:

(function(list){
  // The history page reuses the same row markup as the ratings page
  document.querySelectorAll('li.retableRow')
    .forEach(function(item){
      list.push({
        date: item.querySelector('.date').innerText,
        title: item.querySelector('.title a').innerText,
        url: item.querySelector('.title a').getAttribute('href')
      });
    });
  // copy() puts the JSON string on the clipboard
  copy(JSON.stringify(list, null, 2));
}([]));

The resulting data looks like:

[
  {
    "date": "3/20/18",
    "title": "Marvel's Jessica Jones: Season 2: \"AKA The Octopus\"",
    "url": "/title/80130318"
  },
  ...
]
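Series entries pack the show, season, and episode into one title string. A small parser of my own can split them apart; it assumes the "Show: Season N: Episode" shape seen above, so films and oddly-titled shows just come back whole:

```javascript
// Split a history title like
//   Marvel's Jessica Jones: Season 2: "AKA The Octopus"
// into show / season / episode. Anything that doesn't match the
// pattern is treated as a plain (film) title.
function parseHistoryTitle(title) {
  var match = title.match(/^(.*): (Season \d+): (.*)$/);
  if (match) {
    return { show: match[1], season: match[2], episode: match[3] };
  }
  return { show: title };
}

// parseHistoryTitle("Some Film") → { show: "Some Film" }
```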

Moving forward

One thing I didn't grab was my Netflix Reviews, which are visible to other Netflix users. I never used this feature, so I didn't have anything to extract. If you are leaving and want that data, I hope that it's similarly easy to extract.

With all this data in hand, I felt much safer going through the steps to deactivate my account. Netflix makes this easy enough, though they also make it extremely tempting to reactivate. Not only do they let you keep watching until the end of the current paid period – they tell you clearly that if you reactivate within 10 months, your data won't be deleted.

That particular temptation won't be a problem for me.

HWC Baltimore 2018-05-01 Wrap-Up

It's been a while!

Baltimore's first Homebrew Website Club of May met at the Digital Harbor Foundation Tech Center on May 1st!

In celebration of 5/01, aka HTTP 501 Not Implemented, we talked about things we wish our websites did, but that they don't yet do.

Here are some notes from the "broadcast" portion of the meetup:

jonathanprozzi.net – Been working on lots of other projects. Did two work projects with GatsbyJS. One is deployed but not public. Learned a lot about GraphQL. Working on a handbook for youth training and trying to get a Netlify CMS hooked up to it. Also did a small VueJS project to learn a bit more about it. Wants to use the WordPress API with some of these technologies on his site. 501 desire: going headless for his WordPress site because he is obsessed with PageSpeed.

maryreisenwitz.com – Been working on sites and content for work. Trying to capture FAQs about working at DHF in preparation for a couple of dozen new youth to start working here. Finds that good explanations uncover the need for more good explanations and lots of branching docs, as different youth employees will have different responsibilities. Excited about having this resource be a website. Wants to include a youth "face book" of names and faces so the new folks can recognize one another and existing staff. 501 desire: wants a web store on her main site, because Etsy is becoming frustrating.

bouhmad.com – Set up SSL via LetsEncrypt and loves it, the easiest SSL setup he has ever done. Started a blog post about intrusion detection, kept adding to it, and pushed it out last week. Working on a piece about a bug bounty he recently collected, working with the company in question. 501 desire: a mailing list signup and a Hugo-driven RSS feed to a Mailchimp mailing list.

grant.codes – Visiting as he drives across the US! Restructuring his site's data on the backend. Was using something like mf2 data, but now moving to pure mf2. Broke a bunch of features doing that, so going through to fix those now. 501 desire: homepage mentions! He accepts but doesn't store or display them.

eddiehinkle.com – Working on leaving Facebook! Has made a sign-up form for friends/family to sign up for a monthly (for now) newsletter. Has a complex (too complex?) tagging scheme for tech, personal, and family posts to generate three RSS feeds. These can be subscribed to in any combination (so 9 possible feeds), and the emails will combine all posts in the desired feeds. The feeds themselves reuse markup that he wrote to make posts look good on micro.blog. Just posted his monthly review for March and hopes to keep doing summaries. Uses the "last month" view on his site for the raw data. 501 desire: automated webmentions! His site is Jekyll-based, so that's a can of worms. Loves using Indigenous for the quick responses from the indie reader, but then has to go back to his site and manually send webmentions.

martymcgui.re – Traveled recently and checked in everywhere using Swarm, which feeds back to his site (sorry anyone following feeds)! Really enjoyed it, but slightly regrets giving Swarm all that data. Thinks an app could use the Swarm venue API to do Micropub and skip creating the checkins on the server. 501 desire: unlisted posts! Really wants to make photo gallery posts where each photo has a permalink but only the gallery shows up in feeds. Eventually private posts, too.

Other discussion:

  • Grant's automated year-in-review feature. Cities visited, hours of TV, distance traveled (tracks GPS constantly).
  • Ways to do hidden posts. Categories. Unlisted or private as a property.
  • Email lists vs "followers" on social media and the feeling of reach.
  • Facebook's reaction when you start mass deleting friends. Mary once deleted close to 700 people and found that the interface started rearranging itself, putting people back in the list where it's easy to mis-click and re-add them as a friend.
  • Deleting your posts from Facebook. Does it affect the algorithm? What threats does it eliminate?
  • Family signing up for email blasts: would they reply? What if those replies went to an address that turned them into a comment on your site? If their email address is in your nickname cache, you can show their name and photo and url.
Left-to-right: grant.codes, eddiehinkle.com, bouhmad.com, martymcgui.re, maryreisenwitz.com, jonathanprozzi.net. Photo courtesy grant.codes.

Thanks to everybody who came out! We hope to see you all again at our next meeting on May 15th!

🔖 Bookmarked https://www.theverge.com/2018/4/28/17293056/facebook-deletefacebook-social-network-monopoly
I tried leaving Facebook. I couldn’t - The Verge

“We have a hard time figuring out what Facebook actually is because we have a hard time admitting that at least part of what it supplanted is emotional labor — hard and valuable work that no one wants to admit was work to begin with.”

h/t to Colin Walker