Taproot

Taproot is Barnaby Walters’ publishing software, written in PHP 5.4 and driving most of waterpigs.co.uk. It is not currently released to the public. However, various parts of it have been extracted into libraries:

php-mf2 for parsing microformats2 data
- also php-mf2-shim for getting mf2 data out of silos
taproot/authentication library concisely implementing indieauth client app + resource server logic
taproot/subscriptions for easily subscribing+crawling any web content
taproot/archive your personal opinionated indieweb/microformats-oriented HTML archiver
original-post-discovery] for determining the canonical URL from POSSE copies
mf-cleaner for handling microformats2 array structures better
diaspora-export for exporting public posts from Diaspora
php-abc for parsing headers out of ABC notation
~~php-helpers~~: static helper functions, some utility classes and POSSE stuff including the truncenator. Slowly having the useful parts extracted into smaller packages.
~~librarian~~ for DB indexing of flat files, no longer used/maintained
~~php-activitystreams~~ activitystreams implementation, no longer used/maintained

Most of the Taproot design artifacts are up on github too in taproot/design.

All content-creation UIs in taproot are publicly viewable — take a peek!

Design Principles

In addition to the indiewebcamp principles, these are some principles I have discovered through building taproot.

Build software which feels safe, all other concerns (privacy, security, technically exciting) are secondary [1] — this is based on personal experience through selfdogfooding taproot
Easy things should be easy, remove friction so hard things are only has hard as they need to be [2]
Use event listener patterns to build upon the same data in intelligent ways [3]
Build reusable, framework-agnostic components and release them to the public, not monolithic products.

Taproot was named on 2014-03-24.

IndieWeb Support

Taproot supports the following:

All content is marked up using microformats2 so parse at your leisure.
Supports posting of notes, articles and music — other post types are faked with HTML inside notes
Sends pingbacks and webmentions to all the links on every post
Receives webmentions and pingacks and displays them as comments, likes, reposts or mentions
- Hooked up to Bridgy to display silo interactions in the same place
reply notes parse the URL they’re in-reply-to and display a reply-context if microformats are present
Notes have webactions with twitter intent fallback if POSSEd to twitter
ATOM fallback is provided for all feeds
Notes and articles are POSSEd to twitter automatically and manually/indirectly to facebook
Logging in using indieauth and distributed-indieauth
New post UI is publicly viewable and can be used to create new posts on other people’s micropub-enabled sites

Itches

Currently working on:

Using Taproot every day to post content
Shrewdness

Next up, in approximate order of priority:

Publishing events and RSVPs, handling incoming RSVPs, figuring out invitations
Publishing recipes
- Particularly I’m interested in presenting them in a way optimised for use in the kitchen from a laptop or tablet
Audio posts, potentially with POSSE to Soundcloud
Publishing likes and reposts
DOM templating for all templates
Extract more of the code behind my listeners and services into reusable open-source libraries

Done (latest first):

Implemented anti-DDOS measures (tried client validation, ditched and went with expiring webmention endpoints)
Publishing venues with reviews, inspired by Tom Morris
Implemented a basic micropub endpoint, capable of creating notes
Made it easier to edit note and article published datetimes and lists of syndication URLs
Timezones now included in all datetimes in feeds and posts
Showing mentions+mention counts on articles as well as notes, made templates and logic more reusable
Composite feed on homepage instead of just notes
Implemented distributed Indieauth and micropub, allowing people to use my new-note UI to create posts on their own site
Integrated with Bridgy instead of manually shimming backfeed
Added file upload control to note uploading UI
Improved new note location UI with interactive map preview for manual adjustments
Removed dedicated ATOM endpoints, replaced with generic mf2 to ATOM conversion endpoint
Basic phone UI on homepage
- works nicely on desktop, needs improvement and Tropo involvement for good mobile UX
Mobile-oriented homepage design as per previous brainstorming
Allowed notes to have names, which display as a small title at the top of the note. For shorter named articles which don’t quite fit into /articles, which is styled for long-form “serious” essays
Improved check in/location note UI, location now shown on interactive Leaflet map before posting, can be corrected by dragging on map
Added last-seen location+map, last 3 articles to homepage
Added a homepage feed after feed reader research demonstrated need for URL -> latest posts
Deploying new rewritten version
Rewriting the whole thing to remove frameworkey nonsense code
- using silex as app library
- no DB — faking it with CSV indexes like the idiot I am (it’s actually surprisingly effective)
Accepting webmentions as well as pingbacks
Federated comments
Post-by-email
Major migration away from DB storage, toward flat files, with lots of codebase cleanup in the process
Import of old posts from Diaspora see them here
Frontend js using Backbone and requirejs

Structure

I got sick of all the complicated frameworkey nonsense which was accumulating in taproot. Symptoms:

not immediately knowing where to look to change any one particular bit of functionality
having to change things in different packages+deploy+update to do simple functionality changes
too many layers of abstraction+indirection — specifically, mixing abstraction layers
no longer fun to work on, simple additions take too much time

And as such I am rewriting taproot to be smaller and flatter and saner.

Documentation of the aforementioned frameworkey nonsense is retained for posterity at the bottom of the page.

Storage

Mentions

Mentions are stored separately from the content they refer to, for ease of editing and moderation. They’re stored with a flattened h-entry-like representation of them which can be easily used in various contexts.

The HTML and HTTP headers from cachable (i.e. public) URLs which are fetched (e.g. webmentioned or webmentioners) are stored in an archive, which also acts as a cache.

Incoming webmentions are processed by a mentions module, which does the following things:

fetches and archives (using taproot/archive) the webmention source URL
resolves the target URL and gets it’s path
parses microformats on the page, looks for the first h-entry. if found, it’s flattened, normalised and purified to avoid XSS attacks
from rel values/u-* values in the h-entry, determine the primary mention relationship type, which is one of reply, rsvp, like, repost or mention — mention is used as a fallback if no more specific one is given
gives the mention an ID based on the source URI, target URI and current datetime
stores the mention in a file containing the source, target, resolved target path and normalised h-entry
indexes that file for querying later

Then, when looking for mentions of a particular URL (e.g. http://waterpigs.co.uk/notes/1000/), the following steps are taken:

Query the index for rows where the resolved path matches the path of the URL we’re looking for
use the rows given to either determine mention type counts for the URL, or fetch the associated files and parse to get the h-entry data for displaying

Posts

Modules all store their data in flat files (usually YAML) which I index in CSV file indexes, read/written using PHP’s fgetcsv and fputcsv functions, which are surprisingly speedy.

For content which has an intrinsic name (like pieces of music, contacts or blog posts) I tend to use an ASCII-fied slug of the name for the ID, generated using Helpers::toAscii. Otherwise (for notes) I create a <= six-character ID from the newbase60 representation of epoch days + number of seconds into that day.

Schema

The data is all stored in a microformats-2 JSON variant structure which is substantially flatter than mf2 JSON — all properties are properties of the parent object, not a properties key, there is no “children” key and only plural properties map to arrays. I keep most of the semantics the same except on the occasions where I prefer a different term (tag over category) or want to aggregate the properties from what would be multiple different mf properties into one (e.g. combining geo, adr and venue into “location”).

Originally I intended on using the activitystreams schema to store and represent all my content, the idea being that it would allow me to make listeners which could work seamlessly on all sorts of content types. In actuality, I have nowhere near as many content types as I anticipated, and want to treat each one very differently. In addition to this, I found some of the semantics within activitystreams irritating (e.g. displayName vs title, both horrible names) and the activity ~= object ambiguity confusing. Coupled with the fact that there is no activitystreams client, I decided to drop activitystreams as the basis for representing my data. --Waterpigs.co.uk 12:33, 14 June 2013 (PDT)

Previous, abandoned structure

Taproot has a highly modular structure. It is made up of an Application class, which handles event dispatching, simple dependency injection, routing and module loading. There are two types of modules — full modules, which expose URLs, and listeners, which listen to events dispatched within taproot.

Currently there are modules for Index, Notes, Articles, Contacts, Tunes, Tags and Mentions.

There are a large number of listeners which handle everything from authentication to POSSE and content transformation. Typically if I want to add a new piece of functionality I’ll do it in a listener, if I want to add a new content type I’ll create a new module.

Notes

(out of date as of 2013-12)

The Notes module is the mainstay of my Taproot usage. It handles the creation and display of short notes — kinda like tweets, but with a potentially much larger scope through the excessive use of tagging.

As in most of Taproot, the notes module handles basic CRUDL of data, and various listeners do everything else, including converting markdown, interpreting hashtags, autolinking @names, syndication of content.

Posting a note looks like this:

A still of the posting UI just after the note has saved:

Recent & Upcoming

Resources

Taproot

Contents

Taproot

Design Principles

IndieWeb Support

Itches

Structure

Storage

Mentions

Posts

Schema

Previous, abandoned structure

Notes

See Also