Categories
Software

Schema.org Classes in TypeScript: Properties and Special Cases

In our quest to model Schema.org classes in TypeScript, we’ve so far managed to model the type hierarchy, scalar DataType values, and enums. The big piece that remains, however, is representing what’s actually inside of the class: it’s properties.

After all, what it means for a JSON-LD literal to have "@type" equal to "Person" is that certain properties — e.g. "birthPlace" or "birthDate", among others — can be expected to be present on the literal. More than their potential presence, Schema.org defines a meaning for these properties, and the range of types their values could hold.

The easy case: Simple Properties

You can download the entire vocabulary specification of Schema.org, most of which describes properties on these classes. For each property, Schema.org will tell us it’s domain (what classes have this property) and range (what types can its values be). For example, the name property specification shows that it is available on the class Thing, and has type Text. One might represent this knowledge as follows:

interface ThingBase {
  "name": Text;
}

Linked Data, it turns out, is a bit richer than that, allowing us to express situations where a property has multiple values. In JSON-LD, this is represented by an array as the value of the property. Therefore:

interface ThingBase {
  "name": Text | Text[];
}

Multiple Property Types

Often times, however, the range of a particular property is any one of a number of types. For example, the property image on Thing can be an ImageObject or URL. Note, also, that nothing in JSON-LD necessitates that all potential values of image have the same type.

In other words, if we want to represent image on ThingBase, we have:

interface ThingBase {
  "name": Text | Text[];
  "image": ImageObject | URL | (ImageObject | URL)[];
}

Properties are Optional

In JSON-LD, all properties are optional. In practice Schema.org cares about "@type" being defined for all classes, but does not otherwise define any other properties as being required. This is sometimes complicated as specific search engines require some set of properties on a class.

interface ThingBase {
  "name"?: Text | Text[];
  "image"?: ImageObject | URL | (ImageObject | URL)[];
}

Properties Can Supersede Others in the Vocabulary

As Schema.org matures, it’s vocabulary changes. Not all of these changes will be additive (adding a new type, or a new type on an existing property). Some will involve adding a new type or property intended to replace another.

Categories
Software

Schema.org DataType in TypeScript: Structural Typing Doesn’t Cut It

Schema.org has a concept of a DataType, things like Text, Number, Date, etc. In JSON-LD, we represent these as strings or numbers, rather than array or object literals. This data could describe the name of a Person, a check-in date and time for a LodgingReservation, a URL of a Corporation, publication date of an Article, etc. As we’ll see, the Schema.org DataType hierarchy is far richer than TypeScript’s type system can accommodate. In this article, we’ll go over the DataType hierarchy and explore how much type checking we can provide.


We saw in the first installment how TypeScript’s type system makes expressing JSON-LD describing Schema.org class structure very elegant. The story got slightly more clouded when we introduced Schema.org Enumerations.

Schema.org Data Types

Let’s take a look at the full DataType tree according Schema.org:

Boolean’s look quite similar to enums, with http://schema.org/True and http://schema.org/False as it’s two possible IRI values (depending on @context, those can of course be represented as relative IRIs instead) or their HTTPS equivalents.

Number and descendants are just JSON / JavaScript numbers. Float indicates the JSON number will have a floating point precision, whereas Integer tells us to expect a whole number. On its own right, JavaScript does not distinguish floats and integers as separate types, and neither does TypeScript. While TypeScript supports the idea of literal types, specifying a type as all possible integers or all possible floating point numbers isn’t expressible.

Categories
Software

Schema.org Enumerations in TypeScript

Last time, we talked about modeling the Schema.org class hierarchy in TypeScript. We ended up with an elegant, recursive solution that treats any type Thing as a "@type"-discriminated union of ThingLeaf and all the direct sub-classes of the type. The next challenge in the journey of building TypeScript typings for the Schema.org vocabulary is modeling Enumerations.

Learning from Examples

Let’s look at a few examples from the Schema.org website to get a better sense of what Enumerations look like.

First up, I looked at PaymentStatusType, which can take any one of these values: PaymentAutomaticallyApplied, PaymentComplete, PaymentDeclined, PaymentDue, or PaymentPastDue. PaymentStatusType is used in the paymentStatus property on the Invoice class.

Here’s an excerpt from an example of an invoice:

{
    "@context": "http://schema.org/",
    "@type": "Invoice",
    // ...
    "paymentStatus": "http://schema.org/PaymentComplete",
    "referencesOrder": [
      // ...
    ]
}

Here, the value of an Enumeration appears as an absolute IRI.

Looking at other examples, however, such as GamePlayMode which appears in playMode on VideoGame shows up differently:

Categories
Software

Modeling Schema.org Schema with TypeScript: The Power and Limitations of the TypeScript Type System

Recently, I published schema-dts (npm, GitHub), an open source library that models JSON-LD Schema.org in TypeScript. A big reason I wanted to do this project is because I knew some TypeScript type system features, such as discriminated type unions, powerful type inference, nullability checking, and type intersections, present an opportunity to both model what Schema.org-conformant JSON-LD looks like, while also providing ergonomic completions to the developer.

In a series of posts, I’ll go over some of the Structured Data concepts that lent themselves well to TypeScript’s type system, and those concepts that didn’t. First up: the type hierarchy of JSON-LD Schema.org Schema, and how can be represented in TypeScript.

Note: I’ll be describing JSON-LD in general in very broad strokes and will spend more time discussing how Schema.org JSON-LD looks like in particular. For those who are familiar with the JSON-LD spec, you’ll see I took a few liberties. This is because schema-dts makes a few assumptions, such as the @context being a known constant, etc. schema-dts also foregoes some features, such as specifying multiple types of a node object, etc.

Modeling the Schema.org class structure with the TypeScript Type System

Schema.org JSON-LD node objects are always typed (that is, they have a @type property that points to some IRI–a string–describing it). Given a @type you know all the properties that are defined on a particular object. Object types inherit from each other. For example, Thing in Schema.org has a property called name, and Person is a subclass of Thing that defines additional properties such as birthDate, and inherits all the properties of Thing such as name. Thing has other sub-classes, like Organization, with it’s own properties, like logo.

Let’s use this minimal example to try a few approaches:

1. Modeling each with inheritance

interface Thing {
  "name": string;
}
interface Person extends Thing {
  "@type": "Person";
  "birthDate": string;
}
interface Organization extends
    Thing {
  "@type": "Organization";
  "logo": string;
}

If we had a const something: Thing , then we could assign it to a Thing, Person, or Organization. So that’s a start! But there are a few problems: