Unexpected Lessons from 100% Test Coverage

The conventional wisdom of the software engineering community is that striving for 100% test coverage is a fool’s errand. It won’t necessarily help you catch all bugs, and it might lead you down questionable paths when writing your code.

My recent attempts at 100% test coverage showed me the answer is much more subtle. At times I was tempted to make questionable code changes just for the sake of coverage, and sometimes I succumbed. Yet I found that often there is an enlightened way to both cover a branch and make the code better for it. Blind 100% coverage can cause us to make unacceptable compromises. If we constrain ourselves to only making the codebase better, however, then thinking about 100% coverage can change the way we think about a codebase. The story of my 100% test coverage attempt is a story of both the good and the bad.


Last year I came across a thread from NPM creator Isaac Z. Schlueter advocating for 100% test coverage:

Schlueter alluded to a mentality shift that a developer achieves that piqued my interest:

Road to 100

Coveralls.io screen grab for schema-dts showing how it got to 100% Test Coverage

I decided that schema-dts would be the perfect candidate for the 100% test coverage experiment. Given N-Triples as an input, schema-dts generates TypeScript types describing valid JSON-LD literals for that ontology. I’ve been more and more interested recently in getting it to stability, and understanding where there’s headroom in the codebase.

The Setup

To start, I didn’t have a way to compute test coverage of my project in its current state. I ended up switching my test runner from Jasmine to Mocha for its lcov support. This being a TypeScript project, though, I had to enable source maps and use ts-node to get coverage numbers for my actual .ts source (PR #56). I used Istanbul’s nyc to get coverage runs working locally. Coveralls integrates with nyc nicely to host online tracking of code coverage over time. Coveralls also integrates seamlessly with Travis CI and gates all PRs by their ΔCoverage (PR #57).
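For reference, this kind of setup can be captured in an nyc config file. Here is a minimal sketch (hypothetical contents; the actual PRs may differ):

```json
{
  "extension": [".ts"],
  "require": ["ts-node/register"],
  "reporter": ["text", "lcov"],
  "all": true
}
```

With something like this in place, running mocha under nyc produces the lcov report that Coveralls consumes.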

My first real run after setting things up showed 78.72% test coverage. That’s not too bad, I thought. My existing tests fell broadly into two general categories:

These baseline tests definitely covered a lot of lines of code that they didn’t really exercise, which is part of why that number was high. That itself can be an argument that 100% test coverage is a meaningless number. Schlueter’s promise of 100% test coverage, however, is that the act of getting through that long tail can have transformative effects on how I think about my own code. I wanted to try my luck at that firsthand. (If we wanted to be more confident that our covered lines are truly being tested, mutation testing might serve us better than test coverage.)

Happy Times: The Low Hanging Fruit

A Schema.org-like ontology can declare a certain class, property, or enum value as deprecated by marking it with a supersededBy predicate. schema-dts handles this in one of two ways: either marking it with @deprecated JSDoc comments in the code, or stripping those declarations entirely.
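As a sketch of those two modes (the shapes below are hypothetical and heavily simplified, not the actual schema-dts internals):

```typescript
// Hypothetical, simplified model of a declaration in the ontology:
interface Decl { name: string; supersededBy?: string; }

// Emit declarations, either tagging superseded ones with @deprecated or
// stripping them entirely, per the two strategies described above.
function emit(decls: Decl[], stripDeprecated: boolean): string[] {
  return decls
      .filter(d => !d.supersededBy || !stripDeprecated)
      .map(d => d.supersededBy
          ? `/** @deprecated Use ${d.supersededBy} instead. */ type ${d.name};`
          : `type ${d.name};`);
}

const decls: Decl[] = [{name: 'Thing'}, {name: 'OldThing', supersededBy: 'Thing'}];
console.log(emit(decls, false)); // both emitted; OldThing carries @deprecated
console.log(emit(decls, true));  // OldThing stripped entirely
```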

Looking at my coverage report, swaths of untested code became apparent. For example, I never attempted to generate a @deprecated class. OK, let’s fix that. In doing so, I caught a real bug that my few unit tests hadn’t caught. I increased coverage by 9.8%, added some baseline tests of deprecation, and added some N-Triple parsing unit tests that I had never gotten around to.

Testing my Argparse setup showed me one of my default flag values was wrong (albeit in a harmless way).

Questionable Times

Log Tests?

A lot of uncovered lines I was seeing had to do with logging statements for things we skip or drop, or recoverable conditions we handle. A lot of these logs are warnings that happen in real Schema.org N-Triple files. For example, we never handle the sameAs or inverseOf triples describing properties. And if we see two comments describing the same class or property, the newer one wins.

Intuition would have it that a log statement should not be tested. But for good and bad reasons, I decided some baseline tests on log output might be desirable.

And I used that to add a log test for the multiple comments case when a unit test would have sufficed. That’s pretty questionable. But I was caught up in my zeal.

Some level of logging-based test might be defensible insofar as (1) we’re observing changes to our user interactions, and (2) it’s documenting the warnings/limitations of our code. Maybe one can view some logging-based tests as an equivalent to a screenshot diffing UI test? Or maybe I’m really trying to explain myself. Still, I felt less bad about adding a test showing a warning when parsing a common triple than I did about that comment test.
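A minimal sketch of what such a log-based baseline test can look like, assuming warnings go through console.error (the helper and messages here are hypothetical, not the actual schema-dts test setup):

```typescript
// Capture everything written via console.error while fn runs, so a test can
// snapshot-match the warnings a code path emits. (Hypothetical helper.)
function captureErrors(fn: () => void): string[] {
  const logged: string[] = [];
  const original = console.error;
  console.error = (...args: unknown[]) => { logged.push(args.join(' ')); };
  try {
    fn();
  } finally {
    console.error = original;  // always restore the real logger
  }
  return logged;
}

const logs = captureErrors(() => {
  console.error('Duplicate comment for term; newer one wins.');
});
console.log(logs); // ['Duplicate comment for term; newer one wins.']
```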

Code Golf?

Another way to get around some of these if (case) { Log(...); } cases is to instead write warnIf(case, ...). I think most people would agree a change like this is unhelpful at best. One might make the Machiavellian argument that such code golf is justified by the ends: once you get to 100% test coverage, you’ll stay there, and you’ll think critically about all your future diffs.
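For concreteness, here is the shape of that transformation (illustrative only; Log is a stand-in, not an actual schema-dts function):

```typescript
const warnings: string[] = [];
const Log = (message: string) => { warnings.push(message); };

// The branch moves inside warnIf: one test of warnIf covers it, and every
// call site collapses into a single, always-executed line.
function warnIf(condition: boolean, message: string): void {
  if (condition) Log(message);
}

warnIf(false, 'skipping sameAs triple');    // line runs; branch body doesn't
warnIf(true, 'skipping inverseOf triple');
console.log(warnings); // ['skipping inverseOf triple']
```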

Constraining myself to Neutral and Positive ‘Code Golf’

One thing I tried doing halfway through is to make sure I don’t engage in any ‘code golf’ that makes the codebase worse in the name of test coverage. (Depending on how you define code golf, it might tautologically mean it makes the codebase worse; here, I really just mean it as any creative rearranging of code for the sake of coverage.)

What I found, though, is some code golf actually helped me think more clearly about the code. Even in cases where the codebase itself looked about the same, I now had a new vocabulary to talk about error conditions. In some other cases, covering lines of code drove me from making run-time guarantees in the code to making compile-time guarantees in the code (definitely a positive).

Let’s walk through some examples.

Neutral: Clarifying Assertions versus Errors

I had transformation functions that took RDF triples and converted them to an intermediate representation of classes. These functions had some issues:

A: Intra-function Impossibility

There was a line in my code that my tests never covered. It looked something like this:

// Top-level module:
const wellKnownTypes = [ ... ];  // Top-level var with more than one "Type".
const dataType = ...;

function ForwardDeclareClasses(topics: ReadonlyArray<TypedTopic>): ClassMap {
  const classes = new Map<string, Class>();
  for (const wk of wellKnownTypes) {
    classes.set(wk.subject.toString(), wk);
  }
  classes.set(dataType.subject.toString(), dataType);
  for (const topic of topics) {
    // ...
  }

  if (classes.size === 0) {
    throw new Error('Expected Class topics to exist.');
  }

  return classes;
}

As you can see from just reading this function, (classes.size === 0) will never happen. For one, there’s a classes.set(K, V) a few lines above. And we set a few other key-value pairs from this wellKnownTypes array, which is hard-coded to always have a set number of elements.

One can try to understand the point of this error: It could be to show that none of the RDF triples we got are turned into classes (in which case we might want to compare classes.size with wellKnownTypes.length + 1 instead). Alternatively, it can be a poorly-placed verification for when we had less confidence that we were building classes properly, and had no clear concept of such “well known” types.

In my case, creating a map with just the well-known types seemed fine. If the ontology is empty or missing data, we’ll likely find out at earlier steps or later ones. And the error gives no clear context as to what’s going wrong. So in my case, the answer was to kill it:

-  if (classes.size === 0) {
-    throw new Error('Expected Class topics to exist.');
-  }
B: Inter-Function Assertion

Another error I saw looked like this:

function ForwardDeclareClasses(topics: ReadonlyArray<TypedTopic>): ClassMap {
  // ...
  for (const topic of topics) {
    if (!IsClass(topic)) continue;
    // ...
    classes.set(
        topic.Subject.toString(), new Class(topic.Subject, /* ... */));
  }
  // ...
  return classes;
}

function BuildClasses(topics: ReadonlyArray<TypedTopic>, classes: ClassMap) {
  for (const topic of topics) {
    if (!IsClass(topic)) continue;

    const cls = classes.get(topic.Subject.toString());
    if (!cls) {
      throw new Error(/**... class should have been forward declared */);
    }
    toClass(cls, topic, classes);
  }
}

In this case, the throw never happened (the (!cls) condition was always false). This should make sense: ForwardDeclareClasses literally checks if a TypedTopic satisfies IsClass(), and, if so, unconditionally adds it to the map. BuildClasses asserts that any topic matching IsClass already exists in the map.

One way to get test coverage for this line is to export BuildClasses and test it directly. But that seems to go against the spirit of making the codebase better. A better approach is to ask what this line is trying to do:

Interlude: Expectations, Errors, and Assertions

Sometimes, we assert things because they either…

  1. are error conditions that can happen in the wild due to poor data or input,
  2. shouldn’t happen, and if they did it’s a sign of a bug in our code, or
  3. shouldn’t happen, and if they did it’s a sign of cosmic radiation.

I decided to differentiate these. If my test coverage report complains about an uncovered…

  1. error condition, I should test it. If I can’t, I should refactor my code to make it testable;
  2. assertion that might indicate a bug, some code golf to make these always run might be in order (more on that in a bit);
  3. assertion that is totally impossible, maybe I should delete it.

I refer to #1 as an error condition. Test these. For assertions, I found that the line between #2 and #3 is often the function boundary (though this isn’t always true). Intra-function assertions (like case study A above) tend to be so useless that we’re better off removing them. Inter-function assertions (like this case) seem useful enough to stay.

The Fix

I found that this distinction is not just helpful to split hairs: it’s also very helpful for someone reading the code: Is this error something that can happen in normal operation? or, is it a sign of a bug? I decided to make this clear:

  1. Normal error conditions: if + throw, or similar.
  2. Bug assertions: assert and assertXyz variants.

With that, I ended up with this change:

+import {ok} from 'assert';
+
+function assert<T>(item: T|undefined): asserts item is T {
+  ok(item);
+}

 function BuildClasses(topics: ReadonlyArray<TypedTopic>, classes: ClassMap) {
   for (const topic of topics) {
     if (!IsClass(topic)) continue;
 
     const cls = classes.get(topic.Subject.toString());
+    assert(cls);
-    if (!cls) {
-      throw new Error(/**... class should have been forward declared */);
-    }
     toClass(cls, topic, classes);
   }
 }
This fix increased my code coverage totals by 1.3%, to 99.053% at the time

Here, thinking about covering a line of code fundamentally helped me communicate what my code does more effectively. A lot of the “moving things around” that I had to do is very much semantic code golf (that happily happens to give a better test coverage score), but I’d like to think it’s net positive.

Positive: Restructure code to achieve compile-time guarantees rather than run-time guarantees

I already showed that some lines never covered by test runs are assertions that should never fire. Sometimes, we can restructure our code to make compile-time claims about our code’s structure, rather than asserting them at run time.

I’ll be more specific: my code has a parseComment function that uses htmlparser2 to turn HTML comments into JSDoc-tagged comments. In that code, we define a new htmlparser2.Parser that handles known tags and throws on unknown tags. It looks something like this:

function parseComment(comment: string): string {
  const result: string[] = [];
  const parser = new Parser({
    ontext: text => result.push(replace(text)),
    onopentag: (tag, attrs) => {
      switch (tag) {
        case 'a': result.push(`{@link ${attrs['href']} `); break;
        case 'em': case 'i': result.push('_'); break;
        case 'strong': case 'b': result.push('__'); break;
        // ...
        default: throw new Error(`Unknown tag "${tag}".`);
      }
    },
    onclosetag: tag => {
      switch (tag) {
        case 'a': result.push('}'); break;
        case 'em': case 'i': result.push('_'); break;
        case 'strong': case 'b': result.push('__'); break;
        // ...
        default: throw new Error(`Unknown tag "${tag}".`);
      }
    }
  });
  parser.write(comment);
  parser.end();
  // ... turn result into 'lines'
  return lines.length === 1 ? `* ${lines[0]} ` :
                              ('*\n * ' + lines.join('\n * ') + '\n ');
}

Initially, the default: throw cases in both onopentag and onclosetag in the above snippet were uncovered. The one in onopentag is easy: I added a test for an unknown tag and saw it fail. Great. Our test coverage now includes this line.

I couldn’t get the onclosetag throw covered, though: if an unknown opening tag exists, the parser throws on open. A self-closing tag counts as an open tag. And closing a never-opened tag is bad HTML that doesn’t actually register as a closed tag: per the HTML spec, an end tag token with no matching start tag is not a valid token.

In other words, the onclosetag throw will never trigger.

The Fix

Should we remove it? Well… right now, with the code as stated, this line actually has some utility: if the developer adds some handling for a start tag (e.g. <table> or <td>), we’ll notice a run-time error if we omit adding the end tag handler as well.

That’s kind of useful. But the nice thing about using TypeScript is that we can structure our code so that run-time guarantees are turned into compile-time guarantees.

In our case, this change made things much better. Here’s a summarized version:

// Tag Management: Define functions describing what to do with HTML tags in our
// comments.
interface OnTag {
  open?(attrs: {[key: string]: string}): string;
  close?(): string;
}

// Some handlers for behaviors that apply to multiple tags:
const em: OnTag = { open: () => '_', close: () => '_' };
const strong: OnTag = { open: () => '__', close: () => '__' };
// ...

// Our top-level tag handler.
const onTag = new Map<string, OnTag>([
  ['a', {open: (attrs) => `{@link ${attrs['href']} `, close: () => '}'}],
  ['em', em], ['i', em],
  ['strong', strong], ['b', strong],
  // ...
]);

function parseComment(comment: string): string {
  const result: string[] = [];
  const parser = new Parser({
    ontext: (text: string) => result.push(replace(text)),
    onopentag: (tag: string, attrs: {[key: string]: string}) => {
      const handler = onTag.get(tag);
      if (!handler) {
        throw new Error(`Unknown tag "${tag}".`);
      }

      if (handler.open) {
        result.push(handler.open(attrs));
      }
    },
    onclosetag: (tag: string) => {
      const handler = onTag.get(tag);
      assert(handler);

      if (handler.close) {
        result.push(handler.close());
      }
    }
  });
  parser.write(comment);
  parser.end();

  // ... turn result into 'lines'
  return lines.length === 1 ? `* ${lines[0]} ` :
                              ('*\n * ' + lines.join('\n * ') + '\n ');
}

By unifying tag handlers between open and close tags, a developer can’t add an open-tag handler and forget the corresponding close-tag handler (or vice versa), as the compiler defines a unified interface for what “handling a tag” looks like.

In general, assertions and guarantees about parallel code structures (e.g. parallel switch cases) and parallel data structures (e.g. parallel lists or maps) are tenuous. If you have two structures (code, or a data structure) that need to agree about what elements they have, you might often benefit from rearranging them structurally. Google’s Testing Blog covered the data-structure version of this in its Code Health: Obsessed With Primitives? episode.

Uncovering Historical Choices and Fixes

Some of the lines we never covered very clearly looked like edge cases/fixes for something. Yet, it often was not clear what the edge case was or why we did it.

Ideally, as one introduces fixes for edge cases, they will also introduce the tests that exercise them. If you ever skip introducing a test, however, you might never get around to it. Until you look at the results of your test coverage run, that is.

Once we see an uncovered line, it becomes a great opportunity to track why it was added, and retroactively add the tests for it. Tools like git blame and git log are great resources to pinpoint the edge case in time, and find the relevant commit message.
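As a toy, self-contained illustration of that workflow (a throwaway repo stands in for the real project; in practice you’d point these commands at your own files and strings):

```shell
set -e
# Build a throwaway demo repo with one "edge-case fix" commit.
dir="${TMPDIR:-/tmp}/gitlog-demo"
rm -rf "$dir" && mkdir -p "$dir" && cd "$dir"
git init -q
echo 'if (term === "nodeID") continue; // skip blank-node terms' > parse.ts
git add parse.ts
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m 'Fix: skip nodeID terms seen in Schema.org 3.4'

# Pinpoint when and why a surprising line was added:
git blame -L 1,1 parse.ts       # who last touched this line, and in which commit
git log -S 'nodeID' --oneline   # commits whose diffs add or remove the string
```

The -S flag (the "pickaxe") is handy here: it narrows history to only those commits that changed the number of occurrences of the string, which usually lands you directly on the fix and its commit message.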

I ran into two of these cases: a parser-level term that was being skipped and some specific enum handling. Both of these looked like big issues in Schema.org 3.4 that were silently fixed after. In both cases, I added comments explaining the triple that causes this, and where to find it, and added tests exercising this case. Great!

Opportunity to Simplify Code

As test coverage approaches 100%, it starts to become a proxy for code unreachability analysis. While sometimes it will simply point out shortcomings in your testing (also good), at others, it will point to unreachable code that your traditional static analysis tools won’t detect. This is because test coverage is calculated on code that actually runs, rather than some inspection of the control flow graph.

Here’s an example:

export class Class {
  // ...
  private baseNode(skipDeprecatedProperties: boolean, context: Context):
      TypeNode {
    // ...
    const parentNode = /* ... */;
    const isRoot = parentNode === null;
    const propLiteral = createTypeLiteralNode([
      // Add an '@id' property for the root.
      ...(isRoot ? [IdPropertyNode()] : []),
      // ... then everything else.
      ...this.properties()
          .filter(property => !property.deprecated || !skipDeprecatedProperties)
          .map(prop => prop.toNode(context))
    ]);

    if (parentNode && propLiteral.members.length > 0) {
      return createIntersectionTypeNode([parentNode, propLiteral]);
    } else if (parentNode) {
      return parentNode;
    } else if (propLiteral.members.length > 0) {
      return propLiteral;
    } else {
      return createTypeLiteralNode([]);
    }
  }
}

Here, the tests never covered the final else branch. This makes sense: that branch only runs if parentNode was null and propLiteral had 0 members. Except propLiteral will always have at least one member (@id) when there is no parent.

The other interesting part about this branch, by the way, is that propLiteral is itself a type literal node. Therefore, if propLiteral.members.length === 0, returning it would be equivalent to the empty type literal node returned in the final else.

Here, the fix was simple:

-    } else if (propLiteral.members.length > 0) {
-      return propLiteral;
     } else {
-      return createTypeLiteralNode([]);
+      return propLiteral;
     }

Reflections on 100

While getting to 100 involved making some sacrifices, it also allowed me to:

  • fix a lot of bugs,
  • catch and fix redundancies in my code,
  • reason about my unreachable code,
  • delineate thoughtfully between error cases and assertions, and
  • thoughtfully shift run-time assertions into compile-time guarantees.

Was it worth it? One way to take a crack at this is to quantify the harm of the downsides and compare it to the gain from the upsides. Using this method, for me, getting to 100 is clearly a net positive.

The real question, however, is not only whether the downsides are worth the upsides: but whether the upsides could have been achieved without the downsides. This, I think, is where the real opposition to 100% test coverage comes from.

I think I could have achieved some of these upsides by looking at test coverage runs without obsessing over getting to 100%. Maybe most of them; I’m not sure. Yet, I find it true that:

  • obsessing over every single uncovered line helped me think about reachability in a way I wouldn’t have if I just glanced at an unreached line;
  • some of the interesting cases outlined above might not have been caught in a sea of red.

To me, a lot of the interesting reflections I have had over the code, and a lot of the interesting changes I made to the code, seem like they would have been hard to spot and truly think about if I wasn’t already at 95% coverage.

Ultimately, my sense is that 100% test coverage is definitely a useful exercise to do at least once. It’s likely also a useful exercise to do more than once. For me and schema-dts, I’ll be keeping it at 100% test coverage. How can I not, after all? “100” is just a really, really nice number.

Use trackBy in Angular ngFor Loops and MatTables

A missing trackBy in an ngFor block or a data table can often result in hard-to-track and seemingly glitchy behaviors in your web app. Today, I’ll discuss the signs that you need to use trackBy. But first—some context:

More often than not, you’ll want to render some repeated element in Angular. You’ll see code that looks like this:

<ng-container *ngFor="let taskItem of getTasks(category)">

In cases where the ngFor is looping over the results of a function that are created anew each time (e.g. an array being constructed using .map and .filter), you’ll run into some issues.

Every time the template is re-rendered, a new array is created with new elements. While newly-created array elements might be equivalent to the previous ones, Angular uses strict equality on each element to determine how to handle it.
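A quick illustration of that distinction (Task is a hypothetical model type; the function simulates a template calling getTasks() on every change-detection pass):

```typescript
interface Task { id: number; label: string; }

// Each call builds a brand-new array of brand-new objects:
const getTasks = (): Task[] => [{id: 1, label: 'Write tests'}];

const first = getTasks();
const second = getTasks();

console.log(first[0].id === second[0].id); // true:  same data...
console.log(first[0] === second[0]);       // false: ...different identity
```

Angular’s default differ sees that `false` and concludes the old element is gone and a new one has arrived.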

In cases where the elements are an object type, strict equality will show that each element of the array is new. This means that a re-render would have a few side-effects:

  • Angular determines all the old elements are no longer a part of the block, and
    • destroys their components recursively,
    • unsubscribes from all Observables accessed through an | async pipe from within the ngFor body.
  • Angular finds newly-added elements, and
    • creates their components from scratch,
    • subscribing to new Observables (i.e. by making a new HTTP request) to each Observable it accesses via an | async pipe.

This also leads to a bunch of state being lost:

  • selection state inside the ngFor is lost on re-render,
  • state like a link being in focus, or a text-box having filled-in values, would go away.
  • if you have side-effects in your Observable pipes, you’ll see those happen again.

The Solution

trackBy gives you the ability to define custom equality operators for the values you’re looping over. This allows Angular to better track insertions, deletions, and reordering of elements and components within an ngFor block.

<ng-container *ngFor="let taskItem of getTasks(category); trackBy: trackTask">

… where trackTask is a TrackByFunction<Task>, such as:

  trackTask(index: number, item: Task): string {
    return `${item.id}`;
  }

If you run into situations where Observables are being subscribed to more often than you expect, seemingly duplicate HTTP calls are being made, or DOM elements lose interaction and selection state sporadically, you might be missing a trackBy somewhere.

It’s not just For Loops

Any kind of data source that corresponds to repeated rows or items, especially ones that are fetched via Observables, should ideally allow you to use trackBy-style APIs. Angular’s MatTable (and the more general CdkTable) support their own version of trackBy for that purpose.

Since a table’s dataSource will often be an Observable or Observable-like source of periodically-updating data, understanding row sameness across updates is very important.

Symptoms of not specifying trackBy in data tables are similar to those in ngFor loops: lost selections and interaction states when items are reloaded, and nested components destroyed and re-created. The experience of trackBy-less tables might be even worse, in some cases: changing a table’s sort or filter will often be implemented at the data-source level, causing a whole new array of data to render once more, with all the side effects that entails.

For a table of tasks fetched as Observables, we can have:

<table mat-table [dataSource]="category.tasksObs" [trackBy]="trackTask">

where trackTask is implemented as the same TrackByFunction<Task> as before.

The Joys and Happy Accidents of Branching Out

How helping on a film set led me down a serendipitous path, publishing a new open source library, and getting an IMDB mention.

About a year ago, a friend asked me—along with some others—to help as extra hands on set filming the second season of an absurdist comedy mini-series she was working on called Look it Up.

Helping film anything wasn’t something I thought I would ever do, so I was excited to try it out. We weren’t necessarily trusted with anything difficult: carrying equipment around, slating shots, clearing the set, and moving things around in general. It was interesting to see how much thought and effort goes into composing a shot, or storyboarding a scene.

Filming took place over a weekend. Five episodes, each under 2 minutes, took about 2 days to film (including make-up, breaks, and lots of setup in between).

One of the cool things I saw in the filming process was how sound was treated. In the first season, the show’s creators enlisted the help of another friend, Elizabeth McClanahan, to be their re-recording mixer. Her help salvaging a lot of the audio, mixing some sound effects for them, and working with them to re-record some of the dialog was eye-opening, and they entered their second season hoping to avoid some of the same mistakes (and to learn how to do the sound themselves). So when I was there for the second season, we heard how a boom operator moving the mic too fast is enough to ruin the shot with whooshing air, and made a point to be silent, slow, and deliberate. I don’t think I had been as still for as long in quite a while.

Beyond being interesting, though, going out of my comfort zone for a weekend helped in a few fun ways:

  1. Being physically present, I was tasked with playing an extra for an opening credit. For some reason, that came with an IMDB credit. Kind of cute to cross off an item that wasn’t even on your bucket list to begin with.
  2. Caring about a friend’s artistic endeavor (wanting it to do well, get traction, etc.) caused me to think about practical tech solutions to problems I hadn’t occupied myself with.
Tara and Alex, in position, between takes.

You see, while I tend to think of myself as a creative problem solver… if I know what the problem is. I often say I don’t see myself as starting a new company or idea, but rather as a lot of friends’ go-to person to make an existing idea a reality.

One of the privileges of working in software engineering at a large company is that you don’t run into life or career problems too often (except for all those problems at work you get to solve). Much of my perspective and creativity (as much as I try otherwise) is siloed within work, such that I find my externally visible contribution to the world hard to quantify.

Wanting Someone to Succeed

It turns out that “Look it Up” is a hilariously bad name for a web series, if you want to get discovered. Ashton Shepherd has a song with that name with 3.5M views on YouTube. The phrase is also quite a common idiom, with lots of content about the phrase. Things get worse when you consider that the “it” in the keyword is such a common word that it often goes ignored, resulting in a lot of “look up”-related content as well (who knows, Google’s new BERT natural language model might make this better in the future, as phrases are understood contextually).

They’ll need to market it. Part of this includes tapping into their networks, e-mailing their friends, and trying to generate some attention for what they have. As a fan or ally, I also tried to do the same with my own friends.

But I was still thinking about that Look it Up song knowledge panel. That got me reading about how structured data and Schema.org are still all the rage in the SEO world. Interesting. I remember being excited about the semantic web way back when, but I had forgotten about it and assumed it had faded into obscurity. Turns out, in the SEO corner of the web, Schema.org is cool.

And, as often happens, my chain of thought transitively morphs into reading and research largely unrelated to the topic at hand. “How do I get Look it Up (the web series) to be noticed” morphs into “can Look it Up be represented as a factual entity that is picked up by search and knowledge engines,” which itself becomes “can authors of the web influence this,” then “so how do I write JSON-LD” and “how do I validate JSON-LD”… but then, the chain of questions hit a wall.

The Upside of Finding Bad Things

You see, the answer to the question “how do I validate JSON-LD” is bad enough that I:

  1. had to look for another answer in disbelief, and, when I failed,
  2. knew I had to make it better.

And reader, let me tell you this, if I am the one who needs to make it better, it’s a really sad indictment of the state of affairs.

Let’s start with the answer to “How do I validate JSON-LD” (or, more specifically, Schema.org JSON-LD that search engines find useful). The state-of-the-art way to do this, it turns out, is through validators: online tools basically made up of a web form that takes your data in and spits out a verdict: your data is bueno or no bueno. The most used is Google’s structured data testing tool, followed by Bing’s markup validator and Yandex’s microformat validator.

So to craft a static piece of JSON-LD structured data, you’d need to:

  1. find the appropriate specific type on Schema.org, reading its docs,
  2. use Schema.org as a reference to write in the properties you care about,
  3. transitively look up the types of each of those properties on Schema.org, repeating 2. and 3. recursively as needed (and consulting examples on the site as necessary),
  4. paste the resulting JSON in your favorite validator.

And voila! Unless the validator tells you something is wrong. Then you go back and fix it, potentially consulting steps 2. and 3. again. You do that a few times, and then, for real, voila! You’re done. And you got some structured data to show for it.

Maybe this is fine and dandy if you’re crafting one hard-coded piece of structured data. But what about programmatically generated data (e.g. say you’re IMDB, or an e-commerce company)?

This whole thing sounded like compiler driven development. You have this buffer of text that represents something (a program, or structured data) and you need to periodically feed it to a compiler or validator to know that it’s correct. In development, people use IDEs, or code editors with at least syntax checking, if not also language services, code completions, and the works. Why can’t we do the same for Schema.org?

Hello, schema-dts

That was the birth of schema-dts, an open source TypeScript project released under Google. I’ll talk some more at a later point about the experience of releasing open source code at Google. For this post, I’d like to focus a bit on the why and how components of that project.

I’ve always been a huge fan of TypeScript. Its ability to specify shapes of plain JavaScript objects (through types), its ability to express discriminated unions (think of Schema.org JSON-LD as @type-tagged unions), and the great ecosystem of language services, completions, and general tooling around it made it jump out as a clear fit for writing good structured JSON.

So I Googled: “Schema.org TypeScript”. Nada. “JSON-LD Typescript”. “TypeScript structured data”. Wow. This thing actually doesn’t exist. And it felt so obvious.

So, I had to make it.

That Sunday night, I sat down for some 9ish hours, from around 8 PM till 5 AM or so, coding up my first proof of concept. It worked! I created a program that downloads a .nt N-Triples file representing an ontology (I mostly tested it on the Schema.org files available), parses it, and transforms it multiple times, generating a TypeScript file representing all the class types defined in that ontology.

schema-dts has been a microcosm of branching out in itself. It’s given me many unique experiences: Navigating open source at Google; having an OSS project people actually use; an excuse to write multiple technical articles about the topic; and an excuse to be additionally self-promotional.

Perhaps seeing myself as a solver of existing known problems, rather than new ones, is not an indictment of my creativity, just an indication of how little I branch out.

Learning by Implementing: Observables

Sometimes, the best way to learn a new concept is to try to implement it. In my journey with reactive programming, my attempts at implementing Observables were key to my ability to intuit how best to use them. In this post, we’ll try various strategies of implementing an Observable and see if we can get to a working solution.

I’ll be using TypeScript and working to implement something similar to RxJS in these examples, but the intuition should be broadly applicable.

First things first, though: what are we trying to implement? My favorite way of motivating Observables is by analogy. If you have some type, T, you might represent it in asynchronous programming as Future<T> or Promise<T>. Just as futures and promises are the asynchronous analog of a plain type, an Observable<T> is the asynchronous construct representing a collection of T.

The basic API for Observable is a subscribe method that takes a bunch of callbacks, each triggering on a certain event:

interface ObservableLike<T> {
  subscribe(
      onNext?: (item: T) => void,
      onError?: (error: unknown) => void,
      onDone?: () => void): Subscription;
}

interface Subscription {
  unsubscribe(): void;
}

With that, let’s get to work!

First Attempt: Mutable Observables

One way of implementing an Observable is to have it keep track of its subscribers (in an array) and send events to listeners as they happen.

For the purpose of this and other implementations, we’ll define an internal representation of a Subscription as follows:

interface SubscriptionInternal<T> {
  onNext?: (item: T) => void;
  onError?: (error: unknown) => void;
  onDone?: () => void;
}

Therefore, we could define an Observable as such:

class Observable<T> implements ObservableLike<T> {
  private readonly subscribers: Array<SubscriptionInternal<T>> = [];

  triggerNext(item: T) {
    this.subscribers.forEach(sub => sub.onNext && sub.onNext(item));
  }

  triggerError(err: unknown) {
    this.subscribers.forEach(sub => sub.onError && sub.onError(err));
  }

  triggerDone() {
    this.subscribers.forEach(sub => sub.onDone && sub.onDone());
    this.subscribers.splice(0, this.subscribers.length);
  }

  subscribe(
    onNext?: (item: T) => void,
    onError?: (error: unknown) => void,
    onDone?: () => void
  ): Subscription {
    const subInternal: SubscriptionInternal<T> = {
      onNext,
      onError,
      onDone
    };

    this.subscribers.push(subInternal);
    return {
      unsubscribe: () => {
        const index = this.subscribers.indexOf(subInternal);
        if (index !== -1) {
          onDone && onDone(); // Maybe???
          this.subscribers.splice(index, 1);
        }
      }
    };
  }
}

This would be used as follows:

// Someone creates an observable:
const obs = new Observable<number>();
obs.triggerNext(5);
obs.triggerDone();

// Someone uses an observable
obs.subscribe(next => alert(`I got ${next}`), undefined, () => alert("done"));

There are a few fundamental problems going on here:

  1. The implementer doesn’t know when subscribers will start listening, and thus won’t know if triggering an event will be heard by no one,
  2. Related to the above, this implementation always creates hot observables; the Observable can start triggering events immediately after creation, depending on the creator, and
  3. Mutable: Anyone who receives the Observable can call triggerNext, triggerError, and triggerDone on it, which would interfere with everyone else.

There are some limitations of the current implementation: an Observable can error multiple times, a “done” Observable can trigger again, and an Observable can move back and forth between “done”, triggering, and “errored” states. But state tracking here wouldn’t be fundamentally more complicated. We also need to think more about errors thrown inside the callbacks, and what effect those should have on other subscribers.

Second Attempt: Hot Immutable Observables

Let’s first solve the mutability problem. One approach is to pass around a ReadonlyObservable interface that hides the mutating methods. But any downstream user up-casting the Observable could wreak havoc, never mind plain JS users who just see these methods on an object.

A cleaner approach in JavaScript is to borrow from the Promise constructor’s executor pattern, where the constructor must be passed a user-defined function that determines when the Observable triggers:

class Observable<T> implements ObservableLike<T> {
  private readonly subscribers: Array<SubscriptionInternal<T>> = [];

  constructor(
    executor: (
      next: (item: T) => void,
      error: (err: unknown) => void,
      done: () => void
    ) => void
  ) {
    const next = (item: T) => {
      this.subscribers.forEach(sub => sub.onNext && sub.onNext(item));
    };

    const error = (err: unknown) => {
      this.subscribers.forEach(sub => sub.onError && sub.onError(err));
    };

    const done = () => {
      this.subscribers.forEach(sub => sub.onDone && sub.onDone());
      this.subscribers.splice(0, this.subscribers.length);
    };

    executor(next, error, done);
  }

  subscribe(
    onNext?: (item: T) => void,
    onError?: (error: unknown) => void,
    onDone?: () => void
  ): Subscription {
    const subInternal: SubscriptionInternal<T> = {
      onNext,
      onError,
      onDone
    };

    this.subscribers.push(subInternal);
    return {
      unsubscribe: () => {
        const index = this.subscribers.indexOf(subInternal);
        if (index !== -1) {
          onDone && onDone(); // Maybe???
          this.subscribers.splice(index, 1);
        }
      }
    };
  }
}

Much better! We can use this as such:

// Someone creates an observable:
const obs = new Observable<number>((next, error, done) => {
  next(5);
  done();
});

// Someone uses an observable
obs.subscribe(next => alert(`I got ${next}`), undefined, () => alert("done"));

This cleans up the API quite a bit. But calling the code in this order will still cause the subscriber to see no events.

Good Examples

We can already use this type of code to create helpful Observables:

// Create an Observable of a specific event in the DOM.
function fromEvent<K extends keyof HTMLElementEventMap>(
  element: HTMLElement,
  event: K
): Observable<HTMLElementEventMap[K]> {
  return new Observable<HTMLElementEventMap[K]>((next, error, done) => {
    element.addEventListener(event, next);
    // Never Done.
  });
}

const clicks: Observable<MouseEvent> = fromEvent(document.body, "click");

Or an event stream from a timed counter:

function timer(millis: number): Observable<number> {
  return new Observable<number>((next, error, done) => {
    let count = 0;
    setInterval(() => {
      next(count);
      ++count;
    }, millis);
  });
}

Even these examples have some issues: they keep running even when no one is listening. That’s sometimes fine, if we know we’ll only have one Observable, or we’re sure callers are listening and so tracking that state is unnecessary overhead, but it’s starting to point to certain smells.

Bad Examples

One common Observable factory is of, which creates an Observable that emits a single item. The assumption is that:

const obs: Observable<number> = of(42);
obs.subscribe(next => alert(`The answer is ${next}`)); 

… would work, and result in “The answer is 42” being alerted. But a naive implementation, such as:

function of<T>(item: T): Observable<T> {
  return new Observable<T>((next, error, done) => {
    next(item);
    done();
  });
}

… would result in the event firing before anyone has a chance to subscribe. Tricks like setTimeout work for code that subscribes immediately after, but the approach is fundamentally broken if we want to generalize to someone who subscribes at a later point.
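The setTimeout trick might look something like the following sketch (ofHot is a hypothetical helper, pared down to just an onNext callback): deferring the emission by one macrotask lets same-tick subscribers attach in time, but anyone subscribing later still misses the event.

```typescript
// A sketch of the setTimeout trick: defer emission by one macrotask
// so subscribers that attach in the same tick as creation catch it.
// Subscribers that attach after the timeout fires still see nothing.
type Listener<T> = (item: T) => void;

function ofHot<T>(item: T) {
  const listeners: Array<Listener<T>> = [];
  // Deferred: runs after the current tick's synchronous code.
  setTimeout(() => listeners.forEach(listener => listener(item)), 0);
  return {
    subscribe(onNext: Listener<T>) {
      listeners.push(onNext);
      return {
        unsubscribe() {
          const index = listeners.indexOf(onNext);
          if (index !== -1) listeners.splice(index, 1);
        }
      };
    }
  };
}

// Works only because the subscription happens in the same tick:
ofHot(42).subscribe(n => console.log(`The answer is ${n}`));
```

A subscriber added a second later would find an already-fired Observable, which is exactly why this doesn’t generalize.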

The case for Cold Observables

We can try to make our Observables lazy, meaning they only start acting on the world once subscribed to. Note that by lazy I don’t just mean that a shared Observable will only start triggering once someone subscribes to it — I mean something stronger: an Observable will trigger for each subscriber.

For example, we’d like this to work properly:

const obs: Observable<number> = of(42);
obs.subscribe(next => alert(`The answer is ${next}`));
obs.subscribe(next => alert(`The second answer is ${next}`)); 
setTimeout(() => {
  obs.subscribe(next => alert(`The third answer is ${next}`)); 
}, 1000);

Where we get 3 alert messages with the contents of the event.

Third Attempt: Cold Observables (v1)

type UnsubscribeCallback = (() => void) | void;

class Observable<T> implements ObservableLike<T> {
  constructor(
    private readonly executor: (
      next: (item: T) => void,
      error: (err: unknown) => void,
      done: () => void
    ) => UnsubscribeCallback
  ) {}

  subscribe(
    onNext?: (item: T) => void,
    onError?: (error: unknown) => void,
    onDone?: () => void
  ): Subscription {
    const noop = () => {};
    const unsubscribe = this.executor(
      onNext || noop,
      onError || noop,
      onDone || noop
    );

    return {
      unsubscribe: unsubscribe || noop
    };
  }
}

In this attempt, each subscription runs the executor separately, triggering onNext, onError, and onDone for each subscriber as needed. This is pretty cool! The naive implementation of of works just fine. I also snuck in a pretty simple mechanism to let us add cleanup logic to our executors.
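To see the naive of working, here’s a standalone sketch: a minimal version of the cold Observable above (trimmed to just what the example needs) with of on top. Because the executor re-runs on every subscribe call, even late subscribers get the value.

```typescript
// Minimal cold Observable: the executor re-runs per subscriber.
type UnsubscribeCallback = (() => void) | void;

class Observable<T> {
  constructor(
    private readonly executor: (
      next: (item: T) => void,
      error: (err: unknown) => void,
      done: () => void
    ) => UnsubscribeCallback
  ) {}

  subscribe(
    onNext?: (item: T) => void,
    onError?: (error: unknown) => void,
    onDone?: () => void
  ) {
    const noop = () => {};
    const unsubscribe = this.executor(onNext || noop, onError || noop, onDone || noop);
    return { unsubscribe: unsubscribe || noop };
  }
}

// The naive `of` from before, unchanged:
function of<T>(item: T): Observable<T> {
  return new Observable<T>((next, error, done) => {
    next(item);
    done();
  });
}

const seen: number[] = [];
const obs = of(42);
obs.subscribe(n => seen.push(n));     // first subscriber sees 42
obs.subscribe(n => seen.push(n + 1)); // a later subscriber re-runs the executor
// seen is now [42, 43]
```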

fromEvent would benefit from that, for example:

// Create an Observable of a specific event in the DOM.
function fromEvent<K extends keyof HTMLElementEventMap>(
  element: HTMLElement,
  event: K
): Observable<HTMLElementEventMap[K]> {
  return new Observable<HTMLElementEventMap[K]>((next, error, done) => {
    element.addEventListener(event, next);
    // Never Done.

    return () => {
      element.removeEventListener(event, next);
    };
  });
}

The nice thing about this is that we remove our listeners when a particular subscriber unsubscribes. Except now, we open as many listeners as subscribers. That might be okay for this one case, but we’ll want to figure out how to let users “multicast” (reuse underlying events, etc.) when they want to.
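The leaky timer from earlier benefits the same way. Here’s a sketch of a revised timer whose executor returns a disposer that clears the interval; the Observable below is a pared-down version of the cold implementation (onNext only) so the example runs standalone.

```typescript
// Minimal cold Observable, trimmed to just onNext for this sketch.
type UnsubscribeCallback = (() => void) | void;

class Observable<T> {
  constructor(
    private readonly executor: (next: (item: T) => void) => UnsubscribeCallback
  ) {}

  subscribe(onNext: (item: T) => void) {
    const dispose = this.executor(onNext);
    return { unsubscribe: dispose || (() => {}) };
  }
}

function timer(millis: number): Observable<number> {
  return new Observable<number>(next => {
    let count = 0;
    const id = setInterval(() => next(count++), millis);
    // Cleanup: stop ticking once the subscriber unsubscribes.
    return () => clearInterval(id);
  });
}

const sub = timer(1000).subscribe(n => console.log(`tick ${n}`));
// Later: stop the interval entirely.
setTimeout(() => sub.unsubscribe(), 3500);
```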

We still haven’t figured out proper cleanup and error handling. For example:

  1. It is generally understood that a subscription that errors is closed (just like throwing an error while iterating in a for loop terminates that loop)
  2. When a subscriber unsubscribes, we should probably get that “onDone” event.
  3. When there’s an error, we should probably do some cleanup.

Better Cold Observables

Here’s a re-implementation of subscribe that might satisfy these conditions:

class Observable<T> implements ObservableLike<T> {
  constructor(
    private readonly executor: (
      next: (item: T) => void,
      error: (err: unknown) => void,
      done: () => void
    ) => UnsubscribeCallback
  ) {}

  subscribe(
    onNext?: (item: T) => void,
    onError?: (error: unknown) => void,
    onDone?: () => void
  ): Subscription {
    let dispose: UnsubscribeCallback;
    let running = true;
    const unsubscribe = () => {
      // Do not allow retriggering:
      onNext = onError = undefined;

      onDone && onDone();
      // Don't notify someone of "done" again if we unsubscribe.
      onDone = undefined;

      if (dispose) {
        dispose();
        // Don't dispose twice if we unsubscribe.
        dispose = undefined;
      }

      running = false;
    };

    const error = (err: unknown) => {
      onError && onError(err);
      unsubscribe();
    };

    const done = () => {
      unsubscribe();
    };

    const next = (item: T) => {
      try {
        onNext && onNext(item);
      } catch (e) {
        error(e); // error() notifies onError and unsubscribes
      }
    };

    dispose = this.executor(next, error, done);

    // We just assigned dispose. If the executor itself already
    // triggered done() or error(), then unsubscribe() ran before
    // dispose was assigned. To guard against that case, call
    // dispose here.
    if (!running) {
      dispose && dispose();
    }

    return {
      unsubscribe: () => unsubscribe()
    };
  }
}

Using Observables

Taking the “Better Cold Observables” example, let’s see how we can use Observables:

Useful Factories

We already discussed fromEvent and of, which work with the new form of Observable. A few others we can create:

// Throws an error immediately
function throwError(err: unknown): Observable<never> {
  return new Observable((next, error) => {
    error(err);
  });
}

// Combines two Observables into one.
function zip<T1, T2>(
  o1: Observable<T1>,
  o2: Observable<T2>
): Observable<[T1, T2]> {
  return new Observable<[T1, T2]>((next, error, done) => {
    const last1: T1[] = [];
    const last2: T2[] = [];

    const sub1 = o1.subscribe(
      item => {
        last1.push(item);
        if (last1.length > 0 && last2.length > 0) {
          next([last1.shift()!, last2.shift()!]);
        }
      },
      err => error(err),
      () => done()
    );

    const sub2 = o2.subscribe(
      item => {
        last2.push(item);
        if (last2.length > 0 && last1.length > 0) {
          next([last1.shift()!, last2.shift()!]);
        }
      },
      err => error(err),
      () => done()
    );

    return () => {
      sub1.unsubscribe();
      sub2.unsubscribe();
    };
  });
}
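Using zip with two synchronous cold sources would look something like the following sketch. The minimal Observable, zip, and a hypothetical fromArray helper are all inlined so the example runs standalone; the buffering in zip is what pairs the items up even though the two sources emit at different times.

```typescript
// Minimal cold Observable for this sketch.
type UnsubscribeCallback = (() => void) | void;

class Observable<T> {
  constructor(
    private readonly executor: (
      next: (item: T) => void,
      error: (err: unknown) => void,
      done: () => void
    ) => UnsubscribeCallback
  ) {}

  subscribe(
    onNext?: (item: T) => void,
    onError?: (error: unknown) => void,
    onDone?: () => void
  ) {
    const noop = () => {};
    const dispose = this.executor(onNext || noop, onError || noop, onDone || noop);
    return { unsubscribe: dispose || noop };
  }
}

// Hypothetical helper: emits each array item, then completes.
function fromArray<T>(items: T[]): Observable<T> {
  return new Observable<T>((next, error, done) => {
    items.forEach(next);
    done();
  });
}

function zip<T1, T2>(o1: Observable<T1>, o2: Observable<T2>): Observable<[T1, T2]> {
  return new Observable<[T1, T2]>((next, error, done) => {
    const last1: T1[] = [];
    const last2: T2[] = [];
    // Emit a pair whenever both buffers have an item waiting.
    const emit = () => {
      if (last1.length > 0 && last2.length > 0) {
        next([last1.shift()!, last2.shift()!]);
      }
    };
    const sub1 = o1.subscribe(item => { last1.push(item); emit(); }, error, done);
    const sub2 = o2.subscribe(item => { last2.push(item); emit(); }, error, done);
    return () => { sub1.unsubscribe(); sub2.unsubscribe(); };
  });
}

const pairs: Array<[number, string]> = [];
zip(fromArray([1, 2]), fromArray(["a", "b"])).subscribe(p => pairs.push(p));
// pairs is now [[1, "a"], [2, "b"]]
```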

Useful Operators

Another nice thing about Observables is that they’re nicely composable. Take map for instance:

function map<T, R>(observable: Observable<T>, mapper: (item: T) => R) {
  return new Observable<R>((next, fail, done) => {
    const sub = observable.subscribe(item => next(mapper(item)), fail, done);
    return () => {
      sub.unsubscribe();
    }
  });
}

This allows us to do things like:

function doubled(input: Observable<number>): Observable<number> {
  return map(input, n => n * 2);
}

Or we could define filter:

function filter<T>(observable: Observable<T>, predicate: (item: T) => boolean) {
  return new Observable<T>((next, fail, done) => {
    const sub = observable.subscribe(
      item => {
        if (predicate(item)) next(item);
      },
      fail,
      done
    );
    return () => {
      sub.unsubscribe();
    };
  });
}

Which allows us to do:

function primeOnly(input: Observable<number>): Observable<number> {
  return filter(input, isPrime);
}
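The composability is easiest to appreciate when the operators are chained. Here’s a standalone sketch chaining filter and map over a synchronous cold source; fromArray is a hypothetical helper, and the minimal Observable mirrors the cold implementation above.

```typescript
// Minimal cold Observable for this sketch.
type UnsubscribeCallback = (() => void) | void;

class Observable<T> {
  constructor(
    private readonly executor: (
      next: (item: T) => void,
      error: (err: unknown) => void,
      done: () => void
    ) => UnsubscribeCallback
  ) {}

  subscribe(
    onNext?: (item: T) => void,
    onError?: (error: unknown) => void,
    onDone?: () => void
  ) {
    const noop = () => {};
    const dispose = this.executor(onNext || noop, onError || noop, onDone || noop);
    return { unsubscribe: dispose || noop };
  }
}

// Hypothetical helper: emits each array item, then completes.
function fromArray<T>(items: T[]): Observable<T> {
  return new Observable<T>((next, error, done) => {
    items.forEach(next);
    done();
  });
}

function map<T, R>(observable: Observable<T>, mapper: (item: T) => R) {
  return new Observable<R>((next, fail, done) => {
    const sub = observable.subscribe(item => next(mapper(item)), fail, done);
    return () => sub.unsubscribe();
  });
}

function filter<T>(observable: Observable<T>, predicate: (item: T) => boolean) {
  return new Observable<T>((next, fail, done) => {
    const sub = observable.subscribe(
      item => { if (predicate(item)) next(item); }, fail, done);
    return () => sub.unsubscribe();
  });
}

// Evens, doubled:
const out: number[] = [];
map(filter(fromArray([1, 2, 3, 4]), n => n % 2 === 0), n => n * 2)
  .subscribe(n => out.push(n));
// out is now [4, 8]
```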

Conclusion

I didn’t really try to sell you, dear reader, on why you should use Observables as helpful tools in your repertoire. Some of my other writing showing their use cases (here, here, and here) might be helpful. But really, what I wanted to demonstrate is some of the intuition for how Observables work. The implementation I shared isn’t a complete one; for that, you’d better consult Observable.ts in the RxJS implementation. This implementation is notably missing a few things:

  • We could still do much better on error handling (especially in my operators)
  • RxJS observables include the pipe() method, which makes applying one or more of those operators to transform an Observable much more ergonomic
  • Lots of things here and there

Schema.org Classes in TypeScript: Properties and Special Cases

In our quest to model Schema.org classes in TypeScript, we’ve so far managed to model the type hierarchy, scalar DataType values, and enums. The big piece that remains, however, is representing what’s actually inside of the class: its properties.

After all, what it means for a JSON-LD literal to have "@type" equal to "Person" is that certain properties — e.g. "birthPlace" or "birthDate", among others — can be expected to be present on the literal. More than their potential presence, Schema.org defines a meaning for these properties, and the range of types their values could hold.

The easy case: Simple Properties

You can download the entire vocabulary specification of Schema.org, most of which describes properties on these classes. For each property, Schema.org tells us its domain (what classes have this property) and range (what types its values can be). For example, the name property specification shows that it is available on the class Thing, and has type Text. One might represent this knowledge as follows:

interface ThingBase {
  "name": Text;
}

Linked Data, it turns out, is a bit richer than that, allowing us to express situations where a property has multiple values. In JSON-LD, this is represented by an array as the value of the property. Therefore:

interface ThingBase {
  "name": Text | Text[];
}

Multiple Property Types

Oftentimes, however, the range of a particular property is any one of a number of types. For example, the property image on Thing can be an ImageObject or URL. Note, also, that nothing in JSON-LD necessitates that all potential values of image have the same type.

In other words, if we want to represent image on ThingBase, we have:

interface ThingBase {
  "name": Text | Text[];
  "image": ImageObject | URL | (ImageObject | URL)[];
}

Properties are Optional

In JSON-LD, all properties are optional. In practice, Schema.org cares about "@type" being defined for all classes, but does not otherwise define any properties as required. This is sometimes complicated by specific search engines requiring some set of properties on a class.

interface ThingBase {
  "name"?: Text | Text[];
  "image"?: ImageObject | URL | (ImageObject | URL)[];
}
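A couple of hypothetical literals checked against this shape illustrate the optionality. Text, URL, and ImageObject are stand-ins here (the real generated types are much larger), with Text narrowed to a bare string alias for brevity.

```typescript
// Stand-ins for the generated Schema.org types:
type Text = string;
type URL = string;
interface ImageObject { "@type": "ImageObject"; "contentUrl"?: URL; }

interface ThingBase {
  "name"?: Text | Text[];
  "image"?: ImageObject | URL | (ImageObject | URL)[];
}

const minimal: ThingBase = {}; // fine: every property is optional
const fuller: ThingBase = {
  name: "Thing One",
  // A mixed array: nothing requires all values to share one type.
  image: ["https://example.com/a.png", { "@type": "ImageObject" }]
};
```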

Properties Can Supersede Others in the Vocabulary

As Schema.org matures, its vocabulary changes. Not all of these changes are additive (adding a new type, or a new type on an existing property). Some involve adding a new type or property intended to replace another.

For example, area was a property on BroadcastService describing a Place the service applies to. Turns out, a lot of other businesses also apply to a specific area. serviceArea replaced area, and instead of applying to BroadcastService, it applied to its parent, Service. In addition, serviceArea can also apply to Organization and ContactPoint (something area never did). In addition to being just a Place, serviceArea can be an AdministrativeArea or an arbitrary GeoShape.

Later on, serviceArea was replaced by areaServed, which also included a freeform Text as a possible value, and applied to a few more objects.

When a property replaces another, it supersedes it (inversely, the other property is superseded by the new one). These changes keep existing Schema.org JSON-LD backwards-compatible. A property p2 superseding p1 will generally imply:

  1. p2 is available on all types p1 was available on. (p2’s domain is strictly wider).
    This includes (a) additional types in the domain, or (b) the domain changing to a parent class, for example.
  2. p2 includes all possible types of p1 (p2’s range is strictly wider).

Typically, new data will be written with p2, but the intention is that any old data written using p1 continues to be valid.

In TypeScript, we can use the @deprecated JSDoc annotation to recommend using a new property instead. We can go further and simply skip all deprecated properties (properties that are superseded by one or more properties) if we wanted to.

The story of area, serviceArea, and areaServed can be partially summarized as follows:

interface BroadcastServiceBase extends OrganizationBase {
  /** @deprecated Superseded by serviceArea */
  "area"?: Place | Place[];
}

interface OrganizationBase {
  /** @deprecated Superseded by areaServed */
  "serviceArea"?: AdministrativeArea | GeoShape | Place |
                  (AdministrativeArea | GeoShape | Place)[];

  "areaServed"?: AdministrativeArea | GeoShape | Place | Text |
                 (AdministrativeArea | GeoShape | Place | Text)[];
}

Things Fall Apart

Multiple Types

"@type" is just another property (albeit one with special meaning).

JSON-LD permits a node to have multiple "@type"s as well, and search engines are happy to accept multiple types (at least for some nodes). In practice, a node having two types means that it can have properties on both types. For example, this is valid:

{
  "@type": ["Organization", "Person"],
  "birthDate": "1980-01-01",
  "foundingDate": "2000-01-01"
}

In TypeScript, discriminating a union on an array seems to be hard, and it becomes a bit clunky to define. For now, our TypeScript definitions will not allow multiple @type values.

Sub-Properties

Schema.org takes advantage of the RDF concept of a sub-property:

If a property P is a subproperty of property P’, then all pairs of resources which are related by P are also related by P’

RDF Schema 1.1

Simply put, a sub-property is a more specific version of a property.

For example, image exists on Thing, but has two sub-properties: logo, which exists on Brand, Organization, and a few other types, and photo, which exists on a Place.

One thing I expected was that you wouldn’t be able to specify a super-property on a node whose type has the sub-property available. I.e., if I’m describing a Brand, its logo sufficiently describes image, leaving no reason to specify image.

That’s not quite true, though: a sub-property implying a property still leaves room for the property itself to be available (an Organization can have multiple images, only one of which is its logo).

And while that should be true (by the RDF specification), it turns out even that isn’t true in Schema.org. Some sub-properties have more general types than their super-properties, e.g. photo can be a Photograph, but its super-property, image, cannot.

So here, we simply punt.

Special Cases

Reading the Schema.org documentation, you might expect, as I did, that there are two disjoint hierarchies of data: Thing (aka classes/node types) and DataType (aka values/scalars/primitives). That’s definitely not true in JSON-LD in general, where many values are untyped to begin with, specified using an "@id" reference, or a string. Schema.org implies it imposes a tighter requirement, describing these hierarchies disjointly, but that turns out not to be true.

Turns out, some types, like Distance, are in the Thing hierarchy but expect string values (in the case of Distance, those take the form "5 in" or "2.3 cm", etc.).

We might consider having our typings include string (or Text?) for all of our classes. To encourage semantically specifying properties, however, I decided to only allow string on a subset of our nodes.

type Distance = DistanceLeaf | string;
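A sketch of how that union plays out: DistanceLeaf here is a hypothetical stand-in for the generated node type, and the string half of the union admits the "5 in"-style values Schema.org actually expects.

```typescript
// Stand-in for the generated Distance node type:
interface DistanceLeaf {
  "@type": "Distance";
  "name"?: string;
}

// The escape hatch: a Distance is either a node or a plain string.
type Distance = DistanceLeaf | string;

const asNode: Distance = { "@type": "Distance", name: "five inches" };
const asText: Distance = "5 in"; // string values type-check too
```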

Conclusion

Schema.org is a vocabulary designed in an inherently human way. Sometimes, that shows up as thoughtfulness. Yet, just as often, it means that the semantics have evolved in a way that is inconsistent. The result is often dissatisfying: relations that are defined but don’t hold in practice, objects that are described with textual comments but have no formal relations specifying them, distances that are described as nodes, and many others. These inconsistencies often lead to hacks when trying to represent the vocabulary in TypeScript.

Yet, it’s important not to lose track of why we’re modeling Schema.org in TypeScript to begin with. The lack of tooling around Schema.org (specifically in IDEs when writing out a specific piece of data) is precisely the need we’re filling. But ultimately, adding structure to an ontology that is largely decided by a loose set of guidelines will be lossy.

The question remains: is the trade-off worth it?

For my purposes, schema-dts has helped me tremendously over the past several months.
