Categories
Software

Unexpected Lessons from 100% Test Coverage

The conventional wisdom of the software engineering community is that striving for 100% test coverage is a fool’s errand. It won’t necessarily help you catch all bugs, and it might lead you down questionable paths when writing your code.

My recent attempts at 100% test coverage showed me the answer is much more subtle. At times I was tempted to make questionable code changes just for the sake of coverage. Sometimes, I did succumb. Yet I found that there is often an enlightened way to both cover a branch and make the code better for it. Blind 100% coverage can cause us to make unacceptable compromises. If we constrain ourselves to only making the codebase better, however, then aiming for 100% coverage can change the way we think about a codebase. The story of my 100% test coverage attempt is a story of both the good and the bad.


Last year I came across a thread from NPM creator Isaac Z. Schlueter advocating for 100% test coverage:

Schlueter alluded to a mentality shift that a developer achieves that piqued my interest:

Road to 100

Coveralls.io screen grab for schema-dts showing how it got to 100% Test Coverage

I decided that schema-dts would be the perfect candidate for the 100% test coverage experiment. Given N-Triples as an input, schema-dts generates TypeScript types describing valid JSON-LD literals for that ontology. I’ve been more and more interested recently in getting it to stability, and understanding where there’s headroom in the codebase.

The Setup

To start, I didn’t have a way to compute test coverage of my project in its current state. I ended up switching my test runner from Jasmine to Mocha for its lcov support. This being a TypeScript project, though, I had to enable source maps and use ts-node to get coverage numbers for my actual .ts source (PR #56). I used Istanbul’s nyc to get coverage runs working locally. Coveralls integrates nicely with nyc to host online tracking of code coverage over time. Coveralls also integrates seamlessly with Travis CI and gates all PRs by their ΔCoverage (PR #57).

My first real run after setting things up had 78.72% test coverage. That’s not too bad, I thought. My existing tests broadly belonged to two categories:

These baseline tests definitely covered a lot of lines of code that they didn’t really exercise, which is part of why that number was high. That itself can be an argument that 100% test coverage is a meaningless number. Schlueter’s promise of 100% test coverage, however, is that the act of getting to that long tail can have transformative effects on how I think about my own code. I wanted to try my luck at that firsthand. If we wanted to be more confident that our covered lines are truly being tested, mutation testing might do more wonders than test coverage.

Happy Times: The Low Hanging Fruit

A Schema.org-like ontology can declare a certain class, property, or enum value as deprecated by marking it with a supersededBy predicate. schema-dts handles this in one of two ways: either marking it with @deprecated JSDoc comments in the code, or stripping those declarations entirely.
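To make the two strategies concrete, here is a minimal sketch of what they could look like. The Declaration shape, the DeprecationMode names, and the emitProperties function are all hypothetical and invented for illustration; they are not schema-dts’s actual internals or output format.

```typescript
// Hypothetical, simplified model of a declaration read from the ontology.
interface Declaration {
  name: string;
  // Name of the replacement, present only when the declaration is deprecated.
  supersededBy?: string;
}

// Assumed names for the two strategies described above.
type DeprecationMode = 'annotate' | 'strip';

// Emits one property line per declaration, handling deprecated ones
// according to the chosen mode.
function emitProperties(
    decls: readonly Declaration[], mode: DeprecationMode): string[] {
  const lines: string[] = [];
  for (const decl of decls) {
    if (decl.supersededBy !== undefined) {
      // 'strip' drops the deprecated declaration entirely.
      if (mode === 'strip') continue;
      // 'annotate' keeps it, preceded by a JSDoc comment.
      lines.push(`/** @deprecated Superseded by ${decl.supersededBy}. */`);
    }
    lines.push(`"${decl.name}"?: unknown;`);
  }
  return lines;
}

const sampleDecls: Declaration[] = [
  {name: 'newProp'},
  {name: 'oldProp', supersededBy: 'newProp'},
];
const annotatedLines = emitProperties(sampleDecls, 'annotate');
const strippedLines = emitProperties(sampleDecls, 'strip');
```

Each mode is a separate branch, so a test suite that only ever feeds in non-deprecated declarations leaves both of them uncovered.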

Looking at my coverage report, swaths of untested code became apparent. For example, I never attempted to generate a @deprecated class. OK, let’s fix that. And I caught a real bug that my few unit tests hadn’t caught. I increased coverage by 9.8%, added some baseline tests of deprecation, and added some N-Triple parsing unit tests that I had never gotten around to.


Learning by Implementing: Observables

Sometimes, the best way to learn a new concept is to try to implement it. In my journey with reactive programming, my attempts at implementing Observables were key to my ability to intuit how best to use them. In this post, we’ll be trying various strategies for implementing an Observable and see if we can get to a working solution.

I’ll be using TypeScript and working to implement something similar to RxJS in these examples, but the intuition should be broadly applicable.

First things first, though: what are we trying to implement? My favorite way of motivating Observables is by analogy. If you have some type, T, you might represent it in asynchronous programming as Future<T> or Promise<T>. Just as futures and promises are the asynchronous analog of a plain type, an Observable<T> is the asynchronous construct representing a collection of T.

The basic API for Observable is a subscribe method that takes a bunch of callbacks, each triggering on a certain event:

interface ObservableLike<T> {
  subscribe(
      onNext?: (item: T) => void,
      onError?: (error: unknown) => void,
      onDone?: () => void): Subscription;
}

interface Subscription {
  unsubscribe(): void;
}

With that, let’s get to work!

First Attempt: Mutable Observables

One way of implementing an Observable is to make sure it keeps track of its subscribers (in an array) and have the object send events to listeners as they happen.
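As a rough illustration of that idea, here is a minimal sketch of an Observable that stores its subscribers in an array. The class name MutableObservable, the Listener record, and the next, error, and done methods are assumptions made for this sketch; they are not necessarily the design this post ends up with.

```typescript
interface Subscription {
  unsubscribe(): void;
}

// Hypothetical record holding the callbacks of one subscriber.
interface Listener<T> {
  onNext: ((item: T) => void) | undefined;
  onError: ((error: unknown) => void) | undefined;
  onDone: (() => void) | undefined;
}

class MutableObservable<T> {
  // Everyone currently subscribed, in subscription order.
  private listeners: Array<Listener<T>> = [];

  subscribe(
      onNext?: (item: T) => void,
      onError?: (error: unknown) => void,
      onDone?: () => void): Subscription {
    const listener: Listener<T> = {onNext, onError, onDone};
    this.listeners.push(listener);
    return {
      // Unsubscribing removes this listener from the array.
      unsubscribe: () => {
        this.listeners = this.listeners.filter(l => l !== listener);
      },
    };
  }

  // Sends an item to every current subscriber.
  next(item: T): void {
    for (const listener of this.listeners) listener.onNext?.(item);
  }

  // Reports an error to every current subscriber, then drops them all.
  error(error: unknown): void {
    const listeners = this.listeners;
    this.listeners = [];
    for (const listener of listeners) listener.onError?.(error);
  }

  // Signals completion to every current subscriber, then drops them all.
  done(): void {
    const listeners = this.listeners;
    this.listeners = [];
    for (const listener of listeners) listener.onDone?.();
  }
}

const numbers = new MutableObservable<number>();
const seen: number[] = [];
const subscription = numbers.subscribe(n => seen.push(n));
numbers.next(1);
numbers.next(2);
subscription.unsubscribe();
numbers.next(3);  // Not delivered: the only subscriber has left.
```

Note that a subscriber in this sketch only sees events emitted after it subscribes; anything sent earlier is lost.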

For the purpose of this and other implementations, we’ll define an internal representation of a Subscription as follows:


Schema.org Classes in TypeScript: Properties and Special Cases

In our quest to model Schema.org classes in TypeScript, we’ve so far managed to model the type hierarchy, scalar DataType values, and enums. The big piece that remains, however, is representing what’s actually inside of the class: its properties.

After all, what it means for a JSON-LD literal to have "@type" equal to "Person" is that certain properties (e.g. "birthPlace" or "birthDate", among others) can be expected to be present on the literal. More than their potential presence, Schema.org defines a meaning for these properties, and the range of types their values could hold.

The easy case: Simple Properties

You can download the entire vocabulary specification of Schema.org, most of which describes properties on these classes. For each property, Schema.org will tell us its domain (what classes have this property) and range (what types its values can be). For example, the name property specification shows that it is available on the class Thing, and has type Text. One might represent this knowledge as follows:

interface ThingBase {
  "name": Text;
}

Linked Data, it turns out, is a bit richer than that, allowing us to express situations where a property has multiple values. In JSON-LD, this is represented by an array as the value of the property. Therefore:

interface ThingBase {
  "name": Text | Text[];
}

Multiple Property Types

Often, however, the range of a particular property is any one of a number of types. For example, the property image on Thing can be an ImageObject or URL. Note, also, that nothing in JSON-LD necessitates that all potential values of image have the same type.

In other words, if we want to represent image on ThingBase, we have:

interface ThingBase {
  "name": Text | Text[];
  "image": ImageObject | URL | (ImageObject | URL)[];
}

Properties are Optional

In JSON-LD, all properties are optional. In practice Schema.org cares about "@type" being defined for all classes, but does not otherwise define any other properties as being required. This is sometimes complicated as specific search engines require some set of properties on a class.

interface ThingBase {
  "name"?: Text | Text[];
  "image"?: ImageObject | URL | (ImageObject | URL)[];
}
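The "one value or an array of values" pattern repeats for every property, so it can be factored into a helper type. The sketch below is one way to do that, under assumptions: OneOrMany, asArray, and the stand-in types TextValue, UrlValue, ImageObjectLike, and ThingBaseSketch are all hypothetical names. The stand-ins are renamed from Text and URL only so the snippet does not clash with built-in DOM globals.

```typescript
// Stand-ins for the Text, URL and ImageObject types used above.
type TextValue = string;
type UrlValue = string;
interface ImageObjectLike {
  '@type': 'ImageObject';
  contentUrl?: UrlValue;
}

// Hypothetical helper: a single value, or an array of such values.
type OneOrMany<T> = T | T[];

interface ThingBaseSketch {
  name?: OneOrMany<TextValue>;
  image?: OneOrMany<ImageObjectLike | UrlValue>;
}

// Normalizes an optional one-or-many property into an array, so that
// consumers do not need to handle each shape separately.
function asArray<T>(value?: T | T[]): T[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

const thing: ThingBaseSketch = {
  name: 'Ada Lovelace',
  // Mixed types inside one array are allowed.
  image: ['https://example.com/ada.png', {'@type': 'ImageObject'}],
};

const thingNames = asArray(thing.name);
const thingImages = asArray(thing.image);
```

Because the union sits inside the array type, a single image array may mix URLs and ImageObject literals, matching the observation that values need not share a type.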

Properties Can Supersede Others in the Vocabulary

As Schema.org matures, its vocabulary changes. Not all of these changes will be additive (adding a new type, or a new type on an existing property). Some will involve adding a new type or property intended to replace another.


Schema.org DataType in TypeScript: Structural Typing Doesn’t Cut It

Schema.org has a concept of a DataType, things like Text, Number, Date, etc. In JSON-LD, we represent these as strings or numbers, rather than array or object literals. This data could describe the name of a Person, a check-in date and time for a LodgingReservation, a URL of a Corporation, publication date of an Article, etc. As we’ll see, the Schema.org DataType hierarchy is far richer than TypeScript’s type system can accommodate. In this article, we’ll go over the DataType hierarchy and explore how much type checking we can provide.


We saw in the first installment how TypeScript’s type system makes expressing JSON-LD describing Schema.org class structure very elegant. The story got slightly more clouded when we introduced Schema.org Enumerations.

Schema.org Data Types

Let’s take a look at the full DataType tree according to Schema.org:

Booleans look quite similar to enums, with http://schema.org/True and http://schema.org/False (or their HTTPS equivalents) as the two possible IRI values. Depending on @context, those can of course be represented as relative IRIs instead.

Number and its descendants are just JSON / JavaScript numbers. Float indicates the JSON number will have floating point precision, whereas Integer tells us to expect a whole number. On its own, JavaScript does not distinguish floats and integers as separate types, and neither does TypeScript. While TypeScript supports the idea of literal types, a type covering all possible integers or all possible floating point numbers isn’t expressible.
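A small sketch makes the limitation visible. The alias names below (SchemaBoolean, SchemaNumber, SchemaFloat, SchemaInteger) are hypothetical and chosen only for illustration, and the Boolean union covers just the absolute IRIs mentioned above.

```typescript
// Boolean can be modeled like an enum: a union of its possible IRIs.
type SchemaBoolean =
    | 'http://schema.org/True'
    | 'http://schema.org/False'
    | 'https://schema.org/True'
    | 'https://schema.org/False';

// Number and its descendants all collapse to the same structural type.
type SchemaNumber = number;
type SchemaFloat = number;
type SchemaInteger = number;

const isAccessible: SchemaBoolean = 'http://schema.org/True';

// This compiles even though 2.5 is not a whole number: to the type
// checker, SchemaInteger is just another name for number.
const guestCount: SchemaInteger = 2.5;

// Telling the two apart is only possible at runtime.
const guestCountIsWhole = Number.isInteger(guestCount);
```

Type aliases give the reader a hint about intent, but the compiler treats all three numeric aliases as interchangeable.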


Schema.org Enumerations in TypeScript

Last time, we talked about modeling the Schema.org class hierarchy in TypeScript. We ended up with an elegant, recursive solution that treats any type Thing as a "@type"-discriminated union of ThingLeaf and all the direct sub-classes of the type. The next challenge in the journey of building TypeScript typings for the Schema.org vocabulary is modeling Enumerations.

Learning from Examples

Let’s look at a few examples from the Schema.org website to get a better sense of what Enumerations look like.

First up, I looked at PaymentStatusType, which can take any one of these values: PaymentAutomaticallyApplied, PaymentComplete, PaymentDeclined, PaymentDue, or PaymentPastDue. PaymentStatusType is used in the paymentStatus property on the Invoice class.

Here’s an excerpt from an example of an invoice:

{
    "@context": "http://schema.org/",
    "@type": "Invoice",
    // ...
    "paymentStatus": "http://schema.org/PaymentComplete",
    "referencesOrder": [
      // ...
    ]
}

Here, the value of an Enumeration appears as an absolute IRI.
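Taking only this first example at face value, a naive model would be a union of string literal types, one per absolute IRI. This is a sketch under that assumption; the names PaymentStatusTypeSketch and InvoiceSketch are hypothetical, and the model covers only the absolute-IRI form seen in this one example.

```typescript
// One string literal per enumeration member, as an absolute IRI.
type PaymentStatusTypeSketch =
    | 'http://schema.org/PaymentAutomaticallyApplied'
    | 'http://schema.org/PaymentComplete'
    | 'http://schema.org/PaymentDeclined'
    | 'http://schema.org/PaymentDue'
    | 'http://schema.org/PaymentPastDue';

// A pared-down Invoice with only the fields needed for this example.
interface InvoiceSketch {
  '@context': 'http://schema.org/';
  '@type': 'Invoice';
  paymentStatus?: PaymentStatusTypeSketch;
}

const invoice: InvoiceSketch = {
  '@context': 'http://schema.org/',
  '@type': 'Invoice',
  paymentStatus: 'http://schema.org/PaymentComplete',
};

// A misspelled member, e.g. 'http://schema.org/PaymentCompleted',
// would be rejected at compile time.
```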

Other examples, however, such as GamePlayMode, which appears in playMode on VideoGame, show up differently: