Marty McGuire

Posts Tagged indieweb

2017
Sun Oct 29

This Week in the IndieWeb Audio Edition • October 21st - 27th, 2017

Audio edition for This Week in the IndieWeb for October 21st - 27th, 2017.

This week features a brief interview with Tantek Çelik recorded at IndieWebCamp NYC 2017.

You can find all of my audio editions and subscribe with your favorite podcast app here: martymcgui.re/podcasts/indieweb/.

Music from Aaron Parecki’s 100DaysOfMusic project: Day 85 - Suit, Day 48 - Glitch, Day 49 - Floating, Day 9, and Day 11

Thanks to everyone in the IndieWeb chat for their feedback and suggestions. Please drop me a note if there are any changes you’d like to see for this audio edition!

Tue Oct 24

Here’s a brief interview with Oliver Baptiste from the most recent This Week in the IndieWeb Podcast.

🎧 Full Episode: https://martymcgui.re/2017/10/22/094640/

Sun Oct 22

This Week in the IndieWeb Audio Edition • October 14th - 20th, 2017

Audio edition for This Week in the IndieWeb for October 14th - 20th, 2017.

This week features a brief interview with Oliver Baptiste recorded at IndieWebCamp NYC 2017.

You can find all of my audio editions and subscribe with your favorite podcast app here: martymcgui.re/podcasts/indieweb/.

Music from Aaron Parecki’s 100DaysOfMusic project: Day 85 - Suit, Day 48 - Glitch, Day 49 - Floating, Day 9, and Day 11

Thanks to everyone in the IndieWeb chat for their feedback and suggestions. Please drop me a note if there are any changes you’d like to see for this audio edition!

Wed Oct 18

HWC Baltimore 2017-10-18 Wrap-Up

Baltimore's second October 2017 meetup for Homebrew Website Club met at the Digital Harbor Foundation Tech Center on October 18th.

Below are notes from the "broadcast" portion of the meetup.

martymcgui.re — Recently posted about his IWC NYC hack day project - showing subtitle and caption tracks for audio files in the browser by marking them up as a video element. Showed off his new personal site, a note-taking and planning site with an audience of one. It lives on his laptop and on a protected Tor hidden service so it can be read from his phone. Also been working on porting his podcast website to Hugo, learning how fragile his micropub server is about mapping URLs to flat files on the server.

bouhmad.com — Started a new job recently, so been working to find time to work on his new site. Wrote an outline and a list of resources about what the site will be about and what content will be on it. Hoping to publish in the next two weeks!

maryreisenwitz.com — Our venue host for the evening! Been writing a lot of stuff and working to organize it. Outlining story points for a narrative piece that consists of lots of dream pieces, which she has been logging for a few years. Been organizing those in Google Keep, cataloging and tagging them by time. Found it really interesting to look over her dream notes from this time last year. Also finding search incredibly useful. "When did I dream about a horse? There are two dreams!"

rhearamakrishnan.com — Has a set of projects she wants to finish before updating her website. One of them is a podcast that will require lots of collaborative elements, so been planning that.

angelosresu.me — Has been on the job search and realized he needed a website. It's currently a programming demo of a CAD app that uses Paper.js and supports boolean operations over primitives.

Other things:

  • We talked about aligning audio/video content and content in a web page, like this course on O'Reilly. Would be useful for DHF's learning system courses. Could also be fun to have This Week in the IndieWeb Audio Edition show previews of the pages being discussed as they are being discussed.
  • We discussed capturing notes with the lowest friction tools then moving them into more useful/durable systems later.
  • We talked about podcasting tools. Dedicated recorders for in-the-field recording, Audacity for editing.
  • We talked about the recent announcement that Adafruit's founder/owner/engineer ladyada found herself locked out of her Facebook account with no apparent recourse. This led to general discussion of silos and monopolies, systems that are secretly bad for you because they stalk you or share your or your contacts' data, and systems that are directly bad for you like Candy Crush and other addictive apps.
  • Talked about some decentralized systems, like MaidSafe (decentralized p2p filestore, incentivized w/ a cryptocurrency), and Beaker's new decentralized Twitter-alike Rotonde (decentralized p2p websites, host it yourself or pay someone to mirror).

Left-to-right: martymcgui.re, bouhmad.com, maryreisenwitz.com, rhearamakrishnan.com, angelosresu.me

Thanks to everybody who came out! We look forward to seeing you at the Digital Harbor Foundation Tech Center for the next one! Look for the announcement soon!

🗓️ Homebrew Website Club Baltimore October 18th, 2017

Join us for an evening of quiet writing, IndieWeb demos, and discussions!

  • Create or update your personal web site!
  • Finish that blog post you’ve been writing, edit the wiki!
  • Demos of recent IndieWeb breakthroughs, share what you’ve gotten working!
  • Join a community with like-minded interests. Bring friends that want a personal site!

Any questions? Join the #indieweb chat!

Optional quiet writing hour starts at 6:30pm. Meetup begins at 7:30pm.

More information: https://indieweb.org/events/2017-10-18-homebrew-website-club

Facebook events: https://www.facebook.com/events/174931906396754/

Tue Oct 17

Native HTML5 captions and titles for audio content with WebVTT

This is a write-up of my Sunday hack day project from IndieWebCamp NYC 2017!
You can see my portion of the IWC NYC demos here.

Prelude: Videos for audio content

Feel free to skip this intro if you are just here for the HTML how-to!

I've been doing a short ~10 minute podcast about the IndieWeb community since February, an audio edition of the This Week in the IndieWeb weekly newsletter cleverly titled This Week in the IndieWeb Audio Edition.

After the 2017 IndieWeb Summit, each episode of the podcast also featured a brief ~1 minute interview with one of the participants there. As a way of highlighting these interviews outside the podcast itself, I became interested in the idea of "audiograms" – videos that are primarily audio content for sharing on platforms like Twitter and Facebook. I wrote up my first steps into audiograms using WNYC's audiogram generator.

While these audiograms were able to show visually interesting dynamic elements like waveforms or graphic equalizer data, I thought it would be more interesting to include subtitles from the interviews in the videos. I learned that Facebook supports captions in a common format called SRT. However, Twitter's video offerings have no support for captions.

Thankfully, I discovered the BBC's open source fork of audiogram, which supports subtitles and captioning, including the ability to "bake in" subtitles by encoding the words directly into the video frames. However, it relies heavily on BBC web infrastructure, and it required quite a bit of hacking to work with what I had available.

The BBC Audiogram captioning interface.

In the end, my process looked like this:

  • Export the audio of the ~1 minute interview to an mp3.
  • Type up a text transcript of the audio. Using VLC's playback controls and turning the speed down to 0.33 made this pretty easy.
  • Use a "forced alignment" tool called gentle to create a big JSON data file containing all the utterances and their timestamps.
  • Use the jq command line tool to munge that JSON data into a format that my hacked-up version of the BBC audiogram generator can understand.
  • Use the BBC audiogram generator to edit the timings and word groupings for the subtitles and generate the final video.
  • Bonus: the BBC audiogram generator can output subtitles in SRT format - but if I've already "baked them in" this feels redundant.
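
As a rough illustration of the JSON munging step, here is a minimal Python sketch that pulls word timings out of gentle-style output. It assumes the JSON has a top-level `words` list whose entries carry `word`, `start`, and `end`, plus a `case` field equal to `success` for words that were aligned. The function name `aligned_words` and the sample data are made up for this example, and the format that the hacked-up audiogram generator expects is not covered here.

```python
import json

def aligned_words(gentle_json):
    """Return (word, start, end) tuples for the words that were aligned."""
    data = json.loads(gentle_json)
    return [
        (w["word"], w["start"], w["end"])
        for w in data.get("words", [])
        # Assumption: unaligned words have no timings and a different "case".
        if w.get("case") == "success"
    ]

# Hypothetical sample, shaped like the assumed gentle output.
sample = json.dumps({
    "words": [
        {"word": "While", "case": "success", "start": 2.24, "end": 2.5},
        {"word": "um", "case": "not-found-in-audio"},
        {"word": "at", "case": "success", "start": 2.5, "end": 2.61},
    ]
})

for word, start, end in aligned_words(sample):
    print(f"{start:.2f} {end:.2f} {word}")
```

Skipping unaligned words keeps the sketch short; a real tool would probably want to interpolate timings for them instead.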

You can see an early example here. I liked these posts and found them easy to post to my site as well as Facebook, Twitter, Mastodon, etc. Over time I evolved them a bit to include more info about the interviewee. Here's a later example.

One thing that has stuck with me is the idea that Facebook could be displaying these subtitles, if only I were exporting them in the SRT format. Additionally, I had done some research into subtitles for HTML5 video with WebVTT and the <track> element and wondered if it could work for audio content with some "tricks".

TL;DR - Browsers will show captions for audio if you pretend it is a video

Let's skip to the end and see what we're talking about. I wanted to make a version of my podcast where the entire ~10 minutes could be listened to along with timed subtitles, without creating a 10-minute long video. And I did!

Here is a sample from my example post of an audio track inside an HTML5 <video> element with a subtitle track. You will probably have to click the "CC" button to enable the captioning.

How does it work? Well, browsers aren't actually too picky about the data types of the <source> elements inside a <video> element. You can absolutely give them an audio source.

Add in a poster attribute to the <video> element, and you can give the appearance of a "real" video.

And finally, add in the <track> element with your subtitle track and you are good to go.

The relevant source for my example post looks something like this:

<video controls poster="poster.png" crossorigin="anonymous" style="width: 100%" src="audio.mp3">
  <source class="u-audio" type="audio/mpeg" src="audio.mp3">
  <track label="English" kind="subtitles" srclang="en" src="https://media.martymcgui.re/.../subtitles.vtt">
</video>

So, basically:

  • Use a <video> element
  • Give it a poster attribute for a nice background
  • Use an audio file for the <source> inside
  • Use the <track> element for your captions/subtitles/etc.

But is that the whole story? Sadly, no.

Creating Subtitles/Captions in WebVTT Format

In some ways, This Week in the IndieWeb Audio Edition is perfectly suited for automated captioning. In order to keep it short, I spend a good amount of time summarizing the newsletter into a concise script, which I read almost verbatim. I typically end up including the transcript when I post the podcast, hidden inside a <details> element.

This script can be fed into gentle, along with the audio, to find all the alignments - but then I have a bunch of JSON data that is not particularly useful to the browser or even Facebook's player.

Thankfully, as I mentioned above, the BBC audiogram generator can output a Facebook-flavored SRT file, and that is pretty close.

After reading the quite expressive WebVTT spec, playing with an SRT to WebVTT converter tool, and finding an in-browser WebVTT validator, I found a quick way of converting these files in my favorite text editor, which basically boils down to changing something like this:

00:00:02,24 --> 00:00:04,77
While at the 2017 IndieWeb Summit,

00:00:04,84 --> 00:00:07,07
I sat down with some of the
participants to ask:

Into this:

WEBVTT

00:00:02.240 --> 00:00:04.770
While at the 2017 IndieWeb Summit,

00:00:04.840 --> 00:00:07.070
I sat down with some of the
participants to ask:

Yep. When stripped down to the minimum, the only real difference between these formats (apart from the WEBVTT header line) is the time format: periods delimit subsecond time offsets instead of commas, and there are three digits of precision instead of two. Ha!
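
The same text-editor conversion can be sketched in a few lines of Python. This is only an illustration under some assumptions: the input is the stripped-down SRT shown above (no styling or positioning), and a two-digit fraction means hundredths of a second, so it is right-padded with a zero. The name `srt_to_vtt` is made up for this example.

```python
import re

# Matches an SRT-style timestamp: HH:MM:SS, a comma, then 1-3 fractional digits.
TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{1,3})")

def srt_to_vtt(srt_text):
    """Convert stripped-down SRT cues to WebVTT text."""
    def fix(match):
        # Swap the comma for a period and right-pad the fraction to milliseconds.
        return f"{match.group(1)}.{match.group(2).ljust(3, '0')}"

    lines = []
    for line in srt_text.strip().splitlines():
        # Only rewrite timing lines, so commas in caption text are untouched.
        if "-->" in line:
            line = TIMESTAMP.sub(fix, line)
        lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"

srt = """00:00:02,24 --> 00:00:04,77
While at the 2017 IndieWeb Summit,

00:00:04,84 --> 00:00:07,07
I sat down with some of the
participants to ask:
"""

print(srt_to_vtt(srt))
```

Numeric cue index lines from a standard SRT file pass through unchanged, and timestamps that already have three fractional digits are left as they are apart from the comma.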

The Future

If you've been following the podcast, you may have noticed that I have not started doing this for every episode.

The primary reason is that the BBC audiogram tool becomes verrrrry sluggish when working with a 10-minute long transcript. Editing the timings for my test post took the better part of an hour before I had an SRT file I was happy with. I think I could streamline the process by editing the existing text transcript into "caption-sized" chunks, and write a bit of code that will use the pre-chunked text file and the word-timings from gentle to directly create SRT and WebVTT files.
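
To make that idea a little more concrete, here is a hypothetical sketch of what such a bit of code could look like. It assumes the transcript has already been split into caption-sized chunks, and that there is exactly one (start, end) timing in seconds for every whitespace-separated word, in order. The names `format_time` and `chunks_to_vtt` and the sample timings are invented for illustration, and it makes no attempt to handle words that the aligner could not place.

```python
def format_time(seconds):
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"

def chunks_to_vtt(chunks, timings):
    """Build WebVTT text from caption chunks and per-word (start, end) timings.

    Assumes timings has one entry per whitespace-separated word, in order.
    """
    cues = []
    position = 0
    for chunk in chunks:
        count = len(chunk.split())
        if count == 0:
            continue  # Skip blank chunks.
        words = timings[position:position + count]
        position += count
        # A cue runs from the start of its first word to the end of its last.
        start, end = words[0][0], words[-1][1]
        cues.append(f"{format_time(start)} --> {format_time(end)}\n{chunk}")
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"

# Hypothetical data: two caption chunks and made-up timings for their words.
chunks = ["While at the Summit,", "I sat down"]
timings = [(2.24, 2.5), (2.5, 2.61), (2.61, 2.7), (2.7, 4.77),
           (4.84, 4.9), (4.9, 5.2), (5.2, 5.6)]

print(chunks_to_vtt(chunks, timings))
```

Producing SRT from the same data would mostly be a matter of changing the timestamp format and dropping the header.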

Additionally, I'd like to make these tools more widely available to other folks. My current workflow to get gentle's output into the BBC audiogram tool is an ugly hack, but I believe I could make it as "easy" as making sure that gentle is running in the background when you run the audiogram generator.

Beyond the technical aspects, I am excited about this as a way to add extra visual interest to, and potentially increase listener comprehension for, these short audio posts. There are folks doing lots of interesting things with audio, such as the folks at Gretta, who are doing "live transcripts" with a sort of dual navigation mode where you can click on a paragraph to jump the audio around and click on the audio timeline and the transcript highlights the right spot. Here's an example of what I mean.

I don't know what I'll end up doing with this next, but I'm interested in feedback! Let me know what you think!

🔖 Bookmarked Social Media Systems and Democracy | Vi Hart http://vihart.com/social-media-systems-and-democracy/

“In practice, inspiring and satisfying pieces of content are dead ends for user actions. Thoughtful pieces of content that take twenty minutes to read get one vote in the time it takes for pretty pictures and amusing memes to get dozens.”

Fri Oct 13

This Week in the IndieWeb Audio Edition • October 7th - 13th, 2017

Audio edition for This Week in the IndieWeb for October 7th - 13th, 2017.

This week features a brief interview with David Shanske recorded at IndieWebCamp NYC 2017.

You can find all of my audio editions and subscribe with your favorite podcast app here: martymcgui.re/podcasts/indieweb/.

Music from Aaron Parecki’s 100DaysOfMusic project: Day 85 - Suit, Day 48 - Glitch, Day 49 - Floating, Day 9, and Day 11

Thanks to everyone in the IndieWeb chat for their feedback and suggestions. Please drop me a note if there are any changes you’d like to see for this audio edition!

Sun Oct 8

This Week in the IndieWeb Audio Edition • September 30th - October 6th, 2017

Audio edition for This Week in the IndieWeb for September 30th - October 6th, 2017.

This week features a brief interview with Emma Hodge recorded at IndieWebCamp NYC 2017.

You can find all of my audio editions and subscribe with your favorite podcast app here: martymcgui.re/podcasts/indieweb/.

Music from Aaron Parecki’s 100DaysOfMusic project: Day 85 - Suit, Day 48 - Glitch, Day 49 - Floating, Day 9, and Day 11

Thanks to everyone in the IndieWeb chat for their feedback and suggestions. Please drop me a note if there are any changes you’d like to see for this audio edition!

Wed Oct 4

HWC Baltimore 2017-10-04 Wrap-Up

Baltimore's first October 2017 meetup for Homebrew Website Club met at the Digital Harbor Foundation Tech Center on October 4th.

Below are notes from the "broadcast" portion of the meetup.

martymcgui.re — Went to IndieWebCamp NYC last weekend! Had a really great time (that he really needs to write up). Figured out how to show closed captions / subtitles on audio content (and needs to write that up). Recently decided that Jekyll was slowing him down too much and decided to jump to Hugo. First steps there - use a sacrificial website to learn on that is much simpler, in this case the We Have to Ask Podcast. Also showed off Rob Weychert's website as one that impressed him from IWC NYC due to the really nice typography, spacing, layout.

djfalcon23.github.io — Added a new slideshow feature. Can now show a modal slideshow of past projects. In true HWC fashion, he pushed this feature live during the demo. Will be adding similar modal displays for PDF documents and videos.

lizboren.art — Been changing her art portfolio site. It's hosted on ArtStation which has a pretty affordable "pro" level with good editing tools. She's been happy with it for now. Slightly more problematic is that her .art domain was registered on her behalf by her school and now she doesn't know how to get access to manage it. We tried to use the WHOIS info to track down who to contact at the controlling registrar.

jonathanprozzi.net — Been working on a site for work at Digital Harbor Foundation. They are relaunching the blueprint.digitalharbor.org educator resource portal. They've been working on a clear, structured landing page for people who are not registered for it, as well as cleaning up navigation for users who are registered. It's a WordPress site and they've been moving their content into "Sensei", a WordPress add-on for education content from WooCommerce.

Other things:

  • We talked about doing design research and taking inspiration from sites that are similar to what you're working on.
  • Talked about the different approaches needed when working on content and structure versus working on making something attractive.
  • Went around talking about pet peeves about the web: bad graphic design, not having an obvious login button, sites that use social logins (e.g. GitHub or Twitter) when they don't work, crucial interactions in modals that aren't clickable on mobile, surveillance and adware crap.
  • Talked about the recent "alternative to ads" where the page runs a JavaScript bitcoin miner, and how wasteful this is in terms of energy vs. coins earned. Maybe all machines need mining ASICs? Talked about other alternatives like Brave and Flattr, compulsory licensing models and Doctorow's Eastern Standard Tribe, paywalls, paying for things with Bitcoin in general, money as an abstract concept.

Left-to-right: lizboren.art, martymcgui.re, jonathanprozzi.net, djfalcon23.github.io

Thanks to everybody who came out! We look forward to seeing you on October 18th at the Digital Harbor Foundation Tech Center!