MutationObservers as Client-side Pseudo-PHP

25Jun14

One of my favourite pieces of technology is WordPress. My main project with it is Computer Science Circles, but I’ve also used it for online course notes, and I first learned of it while helping my colleague Douglas with a student organization website. It was my head-first introduction to the language PHP.

PHP is capable of a lot of nifty stuff. PHP pages act like little programs that run and whose output is an HTML page that is displayed to the user. This lets you do things like interact with databases or client state (e.g. shopping baskets, logins, etc). But even with stateless pages, one interesting thing about using PHP in place of HTML is that you can define new commands to shorten or clarify repetitive structures.

Here’s some repetitive structure as an example: imagine a page with a lot of links to Wikipedia articles, like this.

<a href='http://en.wikipedia.org/wiki/Cheese'>Cheese</a>
is made from
<a href='http://en.wikipedia.org/wiki/milk'>milk</a>
which comes from
<a href='http://en.wikipedia.org/wiki/mammals'>mammals</a>

Displays as: Cheese is made from milk which comes from mammals

There’s a common pattern repeated 3 times in that HTML fragment: the pattern

<a href='http://en.wikipedia.org/wiki/X'>X</a>

occurs once with X=Cheese, once with X=milk, and once with X=mammals. Were HTML a real programming language, it would be good programming practice to define a function that could handle that pattern, and to call it three times. This can be done in PHP like so:

<?php
function linkish($X) {
  echo "<a href='http://en.wikipedia.org/wiki/$X'>$X</a>";}
?>
<?php linkish("Cheese"); ?> is made from <?php linkish("milk"); ?>
 which comes from <?php linkish("mammals"); ?>

In WordPress, which first made me think of this kind of design pattern, you are allowed to do an essentially identical thing called a “shortcode”. It looks a lot like the above PHP, where you define and register a shortcode inside of a plugin, and you write [linkish]Cheese[/linkish] to call it. Note that it is almost like you are defining a new HTML element in terms of old ones.

However, it seems like a little bit of overkill to ask the server to do so much thinking, whether with plain PHP or with shortcodes. It’s not serving a truly dynamic page: this content will be the same every time a user looks at it. On top of this, served webpages are allowed to contain function calls and definitions, as long as they are written in JavaScript. Can we get JavaScript to do this work for us?

One approach would be to signify these elements and arguments in some way, to wait until the client has the page, and then to have the client do all of the substitutions.

<a class='linkish'>Cheese</a> is made from 
 <a class='linkish'>milk</a> which comes from
 <a class='linkish'>mammals</a>
<script type='text/javascript'>
// find every placeholder link that is already on the page
var elts = document.querySelectorAll('a.linkish');
for (var i=0; i<elts.length; i++)
  // use the link text as the article name
  elts[i].setAttribute('href',
    'http://en.wikipedia.org/wiki/'+elts[i].innerHTML);
</script>

Because the script sits at the end of the page, it runs only after everything above it has been parsed, then goes through and retroactively modifies the <a class='linkish'>X</a> elements as desired.

This is similar to how MathJax is able to insert LaTeX formulas like $\sin^2(x)+\cos^2(x)=1$ into virtually any webpage, including this one! After the page is loaded, it searches the whole page for expressions like $latex \sin^2(x)+\cos^2(x)=1$ and then re-renders the contents appropriately.

However, this approach still has one downside compared to PHP. Until the full HTML page has been received by the human browsing the site, no replacements will take place. (You can confirm this by trying out MathJax on a long page: no LaTeX appears until everything is loaded.) This is not ideal, since it’s jarring for the user to see weird placeholders sit around for so long, and it prevents them from starting to read the page until everything is totally done. This is the opposite of progressive HTML rendering, which refers to the fact that your browser will try to show you the first parts of the page while it’s still receiving and rendering the parts further down off-screen.

You could periodically re-check the page for new <a class='linkish'> or ... elements, but this is a pretty hacky solution that repeats work and so does more computation than necessary. What we’d really like is to transform the incoming data on-the-fly. You could imagine it as a filter: as the HTML data is coming in, we would like to have some kind of stream-replacer that would search for something and replace it appropriately. I don’t think there is any way to literally do stream replacement on incoming HTML, but what I learned this week is that you can use a JavaScript MutationObserver to achieve basically the same effect.

A MutationObserver is a device that lets your code get notified whenever HTML elements on your page are changed, including while the page is still being built as it loads. For example, here’s how it gives a new solution to the task mentioned above.

<script type='text/javascript'>
// construct observer _before_ anything is rendered
var mo = new MutationObserver(
 // constructor argument: callback on MutationRecord[]
 function (events) {
  // for each record,
  for (var i=0; i<events.length; i++)
   // MutationRecord has Node "target" and Node[] "addedNodes"
   for (var j=0; j<events[i].addedNodes.length; j++)
    // we'll define "process(parent, child)" below
    process(events[i].target, events[i].addedNodes[j]);
 }
);

// what should we observe, and with what options?
mo.observe(document, {childList: true, subtree: true});

// as promised, the helper that the callback invokes
var process = function (parent, child) {
 // only act on <a class='linkish'> elements; skip text nodes and the rest
 if (child.nodeName=="A" && child.classList.contains("linkish"))
  child.setAttribute('href',
   'http://en.wikipedia.org/wiki/'+child.innerHTML);
};
</script>
<!-- now the same page as before -->
<a class='linkish'>Cheese</a> is made from <a class='linkish'>milk</a>
 which comes from <a class='linkish'>mammals</a>

I will admit this is no longer shorter or easier to read, although it is just a toy example. I managed to find two different uses for this technique last week while writing the manual for my websheets Java coding exercise tool. One use was similar to the above: rendering links automatically (in fact, rendering two parallel links at once), since it was very cumbersome and error-prone to re-type the same structure over and over. The other use was to automatically build a table of contents: every time an H2 or H3 header element showed up, it automatically added a new bulleted item to the table of contents. (In fact, if your internet connection is slow enough, you can see the table of contents of the manual extending bit by bit as the H2 and H3 elements further below arrive.) See the manual source code for details. And, for a better list of options and explanations, consult the full MutationObserver API.
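The manual's actual code is not reproduced here, but the table-of-contents idea can be sketched roughly as follows. This is only an illustration under some assumptions: the page is assumed to contain a <ul id='toc'> above the headings, and the names tocEntryFor, tocObserver, and the toc-level-N classes are made up for this example. It also assumes the heading's text has already arrived by the time the callback runs, which is what I observed in practice but is not something to rely on blindly.

```javascript
// Pure helper: decide whether an added node is a heading we care about.
// Returns null for anything else (text nodes, other elements).
function tocEntryFor(node) {
  if (node.nodeName !== 'H2' && node.nodeName !== 'H3') return null;
  return {
    level: node.nodeName === 'H2' ? 2 : 3,
    text: (node.textContent || '').trim()
  };
}

// Browser-only wiring; skipped where there is no DOM.
if (typeof MutationObserver !== 'undefined' && typeof document !== 'undefined') {
  var tocObserver = new MutationObserver(function (records) {
    // 'toc' is a hypothetical id for a <ul> placed above the headings
    var toc = document.getElementById('toc');
    if (!toc) return;
    for (var i = 0; i < records.length; i++) {
      for (var j = 0; j < records[i].addedNodes.length; j++) {
        var entry = tocEntryFor(records[i].addedNodes[j]);
        if (!entry) continue;
        var li = document.createElement('li');
        // hypothetical class name, so CSS can indent H3 entries
        li.className = 'toc-level-' + entry.level;
        li.textContent = entry.text;
        toc.appendChild(li);
      }
    }
  });
  tocObserver.observe(document, {childList: true, subtree: true});
}
```

Appending the new li elements also triggers the observer, but those records are ignored because an LI is neither an H2 nor an H3.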

I was curious to explore a few of the finer details of this technique. What order will elements be added in? HTML has a tree-like structure, but it’s not clear if parents will be added before children or vice-versa. So I built a small test, replacing process() with a simple logging function, and using the content

<div>This is <b>bold</b><i> and italically <span>nested</span></i></div>

(See source of this page, and its console.) The order of events happening was:

  1. The div was added to its parent
  2. “This is” was added to the div
  3. The <b> was added to the div
  4. “bold” was added to the b
  5. The <i> was added to the div
  6. “and italically” was added to the <i>
  7. The span was added to the <i>
  8. “nested” was added to the span.

So, this would be a tree preorder. But because the MutationObserver sends you updates in batches, there was one more surprise: in step 3, by the time I looked at the <b> element, it already contained “bold”, and similarly for the other grandchildren.
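A logging function along these lines would reproduce the experiment. It is a sketch rather than the exact code used for the test, and describeAddition is a name invented for this example. It prints what was added, where, and how many child nodes the new node already has, which is what exposes the batching surprise.

```javascript
// Describe one (parent, child) pair from a MutationRecord as a log line.
function describeAddition(parent, child) {
  // nodeType 3 is a text node; show its text rather than "#text"
  var what = child.nodeType === 3
    ? JSON.stringify(child.nodeValue)
    : '<' + child.nodeName.toLowerCase() + '>';
  // childNodes.length reveals whether the child already has content
  var count = child.childNodes ? child.childNodes.length : 0;
  return what + ' added to <' + parent.nodeName.toLowerCase() + '>' +
    ' (already has ' + count + ' child node(s))';
}

// Drop-in replacement for process() in the observer script.
var process = function (parent, child) {
  console.log(describeAddition(parent, child));
};
```

A nonzero count on an element such as the <b> means its children arrived before the callback got to look at it.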

This general design goal, of defining a new kind of element in terms of old ones, is something that HTML feels like it sorely needs. If you are making an awesome code editing widget or database slice displayer or simply trying to add “click to pop up” or “click to hide/show” functionality, this is the most natural approach. In fact, it looks like the new Custom Elements API lets us do precisely that!
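As a rough sketch of what that could look like for the Wikipedia links, here is a version written against the customElements.define form of the API that browsers later standardized. The tag name wiki-link, the page attribute, and the helper wikiHref are all hypothetical choices for this example. The page name is passed as an attribute rather than as the element's text, because the element's children may not have been parsed yet when connectedCallback runs.

```javascript
// Pure helper: build the article URL for a page name.
function wikiHref(page) {
  // Wikipedia article URLs use underscores in place of spaces
  return 'http://en.wikipedia.org/wiki/' +
    encodeURIComponent(page.trim().replace(/\s+/g, '_'));
}

// Browser-only: register <wiki-link page="Cheese"></wiki-link>.
// Custom element names must contain a hyphen.
if (typeof customElements !== 'undefined') {
  customElements.define('wiki-link', class extends HTMLElement {
    // called each time the element is attached to the document
    connectedCallback() {
      if (this.firstChild) return; // already rendered once
      var page = this.getAttribute('page') || '';
      var a = document.createElement('a');
      a.setAttribute('href', wikiHref(page));
      a.textContent = page;
      this.appendChild(a);
    }
  });
}
```

With this in place, the cheese example would be written as <wiki-link page='Cheese'></wiki-link> is made from <wiki-link page='milk'></wiki-link>, and so on.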

One Response to “MutationObservers as Client-side Pseudo-PHP”

  1. 1 Anonymous

    Why not just use a PHP accelerator?

