WHAT MAKES A GOOD DATA MODEL? — POINT 7:

Murray Callander
2 min readAug 21, 2020

--

BE PRAGMATIC WHEN IT COMES TO DECIDING WHAT IS AN OBJECT AND WHAT IS A PROPERTY

Now we are potentially getting right in amongst the weeds and many a data model gets stuck in the weeds because this issue is not understood.

Now we are potentially getting right in amongst the weeds and many a data model gets stuck in the weeds because this issue is not understood.

So what is it?? Well, when you are modelling your data you will be making choices about what is an object and what is a property of that object. For example, let’s say you are modelling a piece of equipment and one of the features of this model is the manufacturer. Do you do it like this:

Object: Heater
ID: F102
Area: Utility Area
Manufacturer: ACME Heaters Inc.

Or like this

What’s the big deal here, who cares? In reality almost every property could be split out as a separate object* (for example “Area” definitely could). But traversing relationships more than 1 step is an expensive operation in graphs and can really impact performance. However, finding everything that is connected by 1 step is very efficient. So if you have a use case where you want to find all the equipment first then filter by manufacturer, having the manufacturer as a property is best. But if you want to find all the equipment made by a certain manufacturer, then filter by area then having the manufacturer as a separate object is more performant.

POINT 7: BE PRAGMATIC WHEN IT COMES TO DECIDING WHAT IS AN OBJECT AND WHAT IS A PROPERTY

The example above is very simple and in reality the performance difference would not be noticeable to the user, but it does become an issue as you scale this up.

How we solve this in Eigen is to be pragmatic and not worry about duplicating some information as both a property and an object. There’s no right or wrong answer and ofter there are several ways to achieve the objective. What you need is experience at building data models to understand what works.

*there is an approach called “triples” that reduces everything to a set of A-relationship->B statements (subject, predicate and object) but these perform terribly and don’t scale well.

Do get in touch if you want to talk more about data models and class libraries

To find out more:

  • Read the rest of the series
  • To join the conversation join us over on LinkedIn

--

--

Murray Callander
Murray Callander

Written by Murray Callander

Co-Founder & CEO @eigenltd — How can we help industrial companies become more efficient? And how do we make sure we do a great job?

No responses yet