What makes a good datamodel? – Point 4: Validated

Google search is not a validated datamodel. Nor is it comprehensive, but it’s good enough for most of us.

Ordnance Survey maps are validated and they make something like 15,000 updates a day! 

In most industrial use cases I have encountered, accuracy is very important.  If I want a list of all my production wells, it needs to be all of them, not “most” of them.  If users feel that the data model is not accurate they won’t use it or trust anything built on it. 

This is blindingly obvious, Murray, why are you saying this??  Well, it’s because the most common questions I get asked are “How quickly can you build it?” and “How long does it take?”. 

The true answer is probably something like “it depends, how accurate do you want it?”.  If 80% accurate is OK, I can get it to you in a couple of weeks.  If you want it 100% accurate, then it’ll probably take 6-9 months.  “69 months!?” (to quote the penguins in Madagascar 🙂).  No, 6 to 9 months, or maybe 12 depending on how many data sources you want to integrate, but most of that time will be spent untangling the source data, decoding edge cases and testing to make sure everything is working.  

Ooh, and that’s another point – how do you actually test it?  A typical offshore installation might have ~120,000 “tags” in the master equipment database, 30,000 documents and drawings, and 20,000 timeseries points (it could be up to 100,000), and these are all interrelated, so a typical model might have 250,000 objects and 4 million+ relationships.  You can’t realistically check all of these by hand.  What you have to do is spot checks, scripted pattern matching and real-life use cases. 
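To make “scripted pattern matching” concrete, here is a minimal sketch of the kind of automated checks you might run over a model like this. The tag naming convention, the object types and the in-memory structures are all illustrative assumptions, not any particular vendor’s implementation – real checks would run against the actual model store.

```python
import re

# Hypothetical in-memory model: objects keyed by tag, plus
# (source, relation, target) triples. Purely illustrative data.
objects = {
    "20-PT-1234": {"type": "instrument"},
    "20-PV-1234": {"type": "valve"},
    "BADTAG":     {"type": "instrument"},
}
relationships = [
    ("20-PT-1234", "measures", "20-PV-1234"),
]

# Assumed naming convention: <area>-<code>-<sequence>, e.g. 20-PT-1234.
TAG_PATTERN = re.compile(r"^\d{2}-[A-Z]{2,4}-\d{3,5}$")

def check_tag_names(objects):
    """Pattern check: flag tags that break the naming convention."""
    return [tag for tag in objects if not TAG_PATTERN.match(tag)]

def check_orphans(objects, relationships):
    """Structural spot check: flag objects with no relationships at all."""
    linked = {src for src, _, _ in relationships}
    linked |= {dst for _, _, dst in relationships}
    return [tag for tag in objects if tag not in linked]

print(check_tag_names(objects))   # tags breaking the convention
print(check_orphans(objects, relationships))  # unconnected objects
```

Checks like these won’t prove the model is 100% accurate, but run across a few hundred thousand objects they surface the systematic errors – bad naming, broken links – that hand spot checks would miss, leaving the manual effort for the edge cases and real-life use cases.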

And finally you need a mechanism for keeping the model current.  This means responding quickly to user feedback and keeping it in sync with the source data. 

Written by Murray Callander, posted on June 25, 2020
