Knowledge graphs for the IT Crowd
Addressing the challenges IT partners raise about knowledge graph technology
“We need this technology.” It’s a common refrain from the business when they stumble on the latest buzzword or silver bullet. And IT often comes across as the bad cop, having to explain why it won’t work or how it doesn’t conform to company standards.
As a responsible IT professional, you might be thinking the same about knowledge graph
technology. Or maybe you haven’t been able to get past the hype to get your questions answered properly.
Here are a few questions I get from the IT crowd that you might have too:
- What’s the underlying architecture?
- What do we need to have to make this work?
- How can you get away without defining a schema?
- Is it really just a big database?
- How does it handle unstructured, scattered data?
- Where is the knowledge graph hosted?
- What kind of support is there?
- How will it comply with Company security standards?
- How is data governance managed?
- What do we need to do in IT, to make this work?
Over the past few years, we have tried hard to listen to the legitimate challenges and concerns IT departments have raised. We have tried to see knowledge graph technology from their perspective, faced with their objectives and needs, so we can improve our offering and help our clients.
Here are some of the questions I often get asked, and the answers I give:
What’s the underlying architecture?
We know there’s no such thing as typical in oil and gas, but here’s a pretty simple diagram that shows how the knowledge graph fits into an oil and gas systems domain. It’s based on a real architecture we worked with connecting in multiple data sources, internal and external, enabling monitoring, reporting, alerts and visualisation for different user groups.
We can deploy knowledge graph technology to any Cloud supporting Kubernetes or Virtual Machines. The containers that comprise the solution can be deployed to a VM (or VM cluster) within a resource group assigned to Eigen.
Eigen knowledge graphs are built using Neo4j. They can be accessed using native Cypher queries or via the Eigen Python Library from user-generated Python code.
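As a sketch of what access looks like in practice, here is a parameterised Cypher query composed in Python. The Eigen Python Library’s own API is not shown (its specifics are not public here); the node labels, relationship type, connection URI and credentials are all illustrative placeholders, and the commented section shows how the query would be executed with the official open-source `neo4j` driver.

```python
# Sketch: querying a Neo4j-backed knowledge graph with plain Cypher.
# Labels, relationship types and connection details below are
# hypothetical placeholders, not Eigen's actual data model.

# Example Cypher: find all sensors attached to a piece of equipment.
CYPHER_QUERY = """
MATCH (e:Equipment {tag: $tag})-[:HAS_SENSOR]->(s:Sensor)
RETURN s.name AS sensor, s.source_system AS source
"""

def query_params(tag: str) -> dict:
    """Build the parameter map passed alongside the Cypher query.

    Parameterised queries keep user input out of the query text itself.
    """
    return {"tag": tag}

# With the official open-source `neo4j` Python driver this would run as:
#
#   from neo4j import GraphDatabase
#   driver = GraphDatabase.driver("bolt://graph.example.com:7687",
#                                 auth=("user", "password"))  # placeholders
#   with driver.session() as session:
#       records = session.run(CYPHER_QUERY, query_params("P-101"))

print(query_params("P-101"))  # → {'tag': 'P-101'}
```

The parameter map (`$tag`) rather than string concatenation is the idiomatic Cypher pattern: the driver handles escaping, and Neo4j can cache the query plan.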
How can you get away without defining a schema?
The biggest thing that sets the knowledge graph apart from the database or data lake is that it points to source data; it doesn’t ingest or copy data.
So, there’s no need to define a schema or the boundaries, as required by a database or data lake.
Building use-case by use-case allows us to continually evolve the knowledge graph based on the most important business needs – rather than needing to predict these potentially years in advance. And by following an Agile methodology the graph can evolve by improving on previous graph structures.
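A minimal sketch of why no up-front schema is needed: when nodes and relationships carry arbitrary properties, a new use case can introduce new labels and edge types without migrating anything that already exists. The labels and properties below are illustrative, not Eigen’s actual model.

```python
# Toy property graph: nodes and edges with free-form properties,
# so new use cases need no schema migration. Names are illustrative.

graph = {"nodes": {}, "edges": []}

def add_node(node_id, label, **props):
    graph["nodes"][node_id] = {"label": label, **props}

def add_edge(src, rel, dst):
    graph["edges"].append((src, rel, dst))

# Use case 1: equipment monitoring.
add_node("pump-1", "Equipment", name="Export Pump")
add_node("ts-42", "Timeseries", source="historian")
add_edge("pump-1", "HAS_MEASUREMENT", "ts-42")

# Use case 2, added months later: maintenance records.
# No schema change needed -- just a new label and a new edge type.
add_node("wo-7", "WorkOrder", status="open")
add_edge("wo-7", "CONCERNS", "pump-1")

print(len(graph["nodes"]), len(graph["edges"]))  # → 3 2
```

Each use case only adds to the graph; nothing built earlier has to be redefined, which is what makes the Agile, use-case-by-use-case evolution practical.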
Is it really just a big database?
Not really, because a database is by its nature predefined. If business requirements change,
modifying the underlying database is both complex and costly. Databases also require a high level of standardisation in nomenclature and format, which often creates bottlenecks and silos as new systems, tools and even assets are onboarded.
Knowledge graph technology, by contrast, is more flexible and dynamic, and can handle, for example, a single asset with multiple names and multiple asset hierarchies.
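To make that concrete, here is a hedged sketch of one asset known by several names and sitting in two independent hierarchies at once. The tag formats and hierarchy names are invented for illustration.

```python
# Sketch: one asset node, many names, many hierarchies.
# All identifiers below are illustrative.

asset = {
    "id": "pump-1",
    # The same physical pump under different naming conventions:
    "aliases": ["P-101", "10-PA-101", "Export Pump A"],
}

# Two independent hierarchies both point at the same node,
# so no single "true" rollup has to be chosen.
functional_hierarchy = {"Export System": ["pump-1"]}
maintenance_hierarchy = {"Rotating Equipment": ["pump-1"]}

def resolve(name, assets):
    """Find an asset by any of its known names."""
    return next(a for a in assets if name == a["id"] or name in a["aliases"])

print(resolve("10-PA-101", [asset])["id"])  # → pump-1
```

In a relational design, each naming convention and each hierarchy would typically need its own tables and join logic; in the graph they are just extra properties and relationships on the same node.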
How does it handle unstructured, scattered data?
Moving to a new database or data lake often requires considerable data cleansing during migration to fit with the new structure.
Knowledge graphs point to source data, rather than moving a copy of the source data. They do not require the data to be structured, and they do not care where the data sits.
Knowledge graphs can point to documents, blueprints, text files, and the analytics applications that use the graph know how to interpret each reference, for example trending a time series or opening a file.
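A small sketch of that idea: graph nodes hold typed references to source data rather than copies, and a consuming application dispatches on the reference type to decide how to present it. The reference types, URIs and handlers here are illustrative assumptions, not Eigen’s actual scheme.

```python
# Sketch: nodes store *references* to source data, not copies.
# The consuming application dispatches on the reference type.
# Types, URIs and handlers below are illustrative.

references = [
    {"type": "timeseries", "uri": "historian://PI/10-PT-101"},
    {"type": "document",   "uri": "sharepoint://docs/PID-0042.pdf"},
]

def open_reference(ref):
    """Interpret a reference according to its type."""
    handlers = {
        "timeseries": lambda r: f"trend {r['uri']}",
        "document":   lambda r: f"open {r['uri']}",
    }
    return handlers[ref["type"]](ref)

for ref in references:
    print(open_reference(ref))
```

Because the graph only stores the pointer plus a type, adding support for a new kind of source (say, a 3D model) is a new handler in the application, not a data migration.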
Where is the knowledge graph hosted?
The knowledge graph is part of the multi-container Eigen Analytics Platform, which is hosted typically within the operator’s own cloud environment.
The knowledge graph can also be deployed on its own and can be queried for useful insights.
What kind of support is there?
The knowledge graph technology we use at Eigen is based on Neo4j, an open-source technology with a large community of users, including major corporations.
The knowledge graph and Neo4j are part of the Eigen framework and are supported by Eigen as part of its application support contract. Different SLAs can be offered based on the operator’s need and budget. Eigen provides first-level support for several of our clients, while others have chosen to keep first-level support in-house.
How will it comply with Company security standards?
We recommend thorough investigation of knowledge graph technology, like any external application, to assure digital security. Neo4j’s client list of major banks, insurers and other IT providers may offer some reassurance about its security standards and practices.
At Eigen, we have worked hard to achieve accreditation in ISO 9001 (Quality) and ISO 27001
(Information Security) to ensure we design, build and operate solutions that comply with standards at the highest level for quality and digital security.
How is data governance managed?
The honest truth is that, like any poorly run IT project, knowledge graph projects can go wrong, and they can get out of control. Knowledge graphs require rules, processes and standards. Eigen has the experience and tools to help operators define and operate such governance, but ultimately these are the responsibility of the operator.
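As a flavour of what such rules can look like in practice, here is a hedged sketch of an automated governance check: every node must use an approved label and carry an owner. The specific rules and label names are illustrative, not Eigen’s.

```python
# Sketch of a governance rule a build pipeline might enforce:
# approved labels only, and every node must have an owner.
# Rules and label names below are illustrative.

APPROVED_LABELS = {"Equipment", "Sensor", "Document", "WorkOrder"}

def validate_node(node: dict) -> list:
    """Return a list of governance violations for one node."""
    problems = []
    if node.get("label") not in APPROVED_LABELS:
        problems.append(f"unapproved label: {node.get('label')!r}")
    if not node.get("owner"):
        problems.append("missing owner")
    return problems

good = {"label": "Sensor", "owner": "ops-team"}
bad = {"label": "Thing", "owner": ""}
print(validate_node(good))  # → []
print(validate_node(bad))
```

Running checks like this on every change keeps the graph from drifting into the unowned, inconsistently labelled state that makes IT projects spiral.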
If you would like a demo of an Eigen knowledge graph, tailored to your needs as an IT specialist, please book a convenient time here.
What do we need to do in IT, to make this work?
At Eigen, we always engage with our customers’ IT departments to define the steps required to deploy our technology. A secure cloud infrastructure design is generally the first step.
Once that is approved, all the IT department needs to do is assign appropriate resources with its cloud provider and implement the secure channels and firewall rules agreed in the design. For example, in Microsoft Azure this would be a Standard_D8s_v3 VM, or an AKS cluster of three Standard_D8s_v3 VMs for full container orchestration through Kubernetes.
Alternatives are always possible and Eigen experts are available to answer any questions.