Boost your Knowledge Graph with Events to gain Untapped Insights

This article explores another use case that extends the knowledge graph of the CM Baseline. We will look not only at the product data to create a kind of product 360, but also at all the process and event data related to that product data. By adding events as nodes to a knowledge graph, you can show the events of a specific product data node as well as the interactions between product data nodes from an event perspective. This creates patterns of data that give insights that neither classic process mining nor product data analysis alone could provide.

This post was inspired by the work of Dirk Fahland and Kadir Marangoz on Event Knowledge Graphs. But if you have not yet read the previous articles from the CM Baseline series, please take a look at them first:

“Understanding the Impact of Changes” introduced the need for the CM Baseline.
“Connections tell the Story” introduced the first component of the CM Baseline, the Business Object Graph.
“Intentions are the desire for new Stories” introduced the second component of the CM Baseline, the Impact Matrix.
“Dependencies limit the possible sequences of events” introduced the third component of the CM Baseline, Dependencies, a.k.a. Change Dependencies.
“Timing is Everything” introduced the fourth and last component of the CM Baseline, Implementation Plans.
“5 Ways a CM Baseline brings value” explores 5 scenarios that bring value.
“One way to organize information for the CM Baseline” proposes a way to model the knowledge graph of the CM Baseline.

This article dives into the concept of event knowledge graphs. According to D. Fahland of the Eindhoven University of Technology, an Event Knowledge Graph is a:

“data structure that allows to naturally model behavior over multiple entities as a network of events”

How to Introduce Events into a Knowledge Graph

Let’s start with the following example of a regular knowledge graph. Both Jane and John are employed by ACME, and Jane is John’s manager. Jane, John, and ACME are represented as nodes. The employed_by relation is directed from both Jane and John to ACME, and the manager_of relation is directed from Jane to John. This way the relations tell the story that Jane is the manager of John and that both Jane and John are employed by ACME.
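To make this concrete, here is a minimal sketch of that small graph in plain Python. The Relation class and the two helper functions are names I made up for illustration; a real knowledge graph would normally live in a graph database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    source: str  # node the relation starts from
    kind: str    # relation type, e.g. employed_by
    target: str  # node the relation points to


nodes = {"Jane", "John", "ACME"}

# Directed relations exactly as described in the text.
relations = [
    Relation("Jane", "employed_by", "ACME"),
    Relation("John", "employed_by", "ACME"),
    Relation("Jane", "manager_of", "John"),
]


def targets(source: str, kind: str) -> list:
    """Nodes reached from `source` over relations of type `kind`."""
    return [r.target for r in relations if r.source == source and r.kind == kind]


def sources(kind: str, target: str) -> list:
    """Nodes that point to `target` with relations of type `kind`."""
    return [r.source for r in relations if r.kind == kind and r.target == target]


print(targets("Jane", "manager_of"))   # who does Jane manage?
print(sources("employed_by", "ACME"))  # who is employed by ACME?
```

Because the relations are directed, asking who Jane manages and asking who is employed by ACME are two different traversals over the same small set of relations.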

But to be employed by ACME, Jane had to be hired at some point in time. That is an event, and even before that she must have been born. And for Jane to be hired at ACME, ACME must first have been founded. All these events have something to do with one another. By adding these events as nodes, you can show that Jane was Born on t0 and Hired at ACME on t3, and that ACME was Founded on t1.

When you add all (relevant) events to this knowledge graph, you can connect them based on their timing using a directly-follows relation. To zoom in, we can look at Jane’s career. Jane was first hired at SKYNET on t1, resigned there on t2, and got hired at ACME on t3. There is now a chain of events that you can connect with a Directly-Follows (DF) relation.
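The chain of events above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the Event class, the event ids e1 to e5, and the choice to derive one DF chain per node are mine, and the timestamps t0 to t3 are simply written as integers.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str
    activity: str
    time: int        # t0 -> 0, t1 -> 1, ...
    entities: tuple  # nodes this event is linked to


# Events taken from the example; the ids are hypothetical.
events = [
    Event("e1", "Born", 0, ("Jane",)),
    Event("e2", "Founded", 1, ("ACME",)),
    Event("e3", "Hired", 1, ("Jane", "SKYNET")),
    Event("e4", "Resigned", 2, ("Jane", "SKYNET")),
    Event("e5", "Hired", 3, ("Jane", "ACME")),
]


def directly_follows(events):
    """Return (earlier_id, later_id, entity) for each DF relation per node."""
    per_entity = defaultdict(list)
    for event in events:
        for entity in event.entities:
            per_entity[entity].append(event)

    relations = []
    for entity, timeline in per_entity.items():
        timeline.sort(key=lambda e: e.time)
        # Connect each event to the next one on the same node's timeline.
        for earlier, later in zip(timeline, timeline[1:]):
            relations.append((earlier.event_id, later.event_id, entity))
    return relations


for relation in directly_follows(events):
    print(relation)
```

Note that the Hired event at ACME (e5) ends up on both Jane's chain and ACME's chain. That is the kind of shared event where two timelines meet.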

The interesting thing is that events are linked to specific nodes like Jane or SKYNET, but these nodes do not all share the same events. So you can generate a timeline for each node, and some events will appear on more than one timeline. At those shared events the timelines of the nodes collide. This can be a good thing or a bad thing depending on the situation. Before we start applying this to product data, we need to address one other important aspect.

Events vs States

An event is ‘something that happens’ whereas a state is ‘a condition something or someone is in at a specific time’. Events precede states. A product data model already represents a lot of states: the states of a dataset are often ‘work in progress’, ‘under review’, or ‘released’. In such a model, a state can sometimes be used to represent the event that precedes it. In the end, the state is the reflection of an event that occurred.

Below you can see a change that is submitted by a change owner and approved by a Change Review Board (CRB). In the first version, both the events and the states are modeled as nodes. In the second, only the states are modeled as nodes and they are labeled as events. This is a way to minimize the number of nodes you need in your model. Knowledge graphs without event nodes can already grow to millions of nodes and billions of relations. If you add events, the number of nodes can easily be multiplied by a factor of 10 or more, depending on the granularity of the events you need.

If you only need events like submit change or approve change you might end up with 10 events per change object. However, if you model every change to an attribute as an event, this can easily grow to 100+ events per change object.
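A quick back-of-the-envelope calculation shows how fast this adds up. The one million object nodes below is a number I made up for illustration, and I assume for simplicity that every object gets the same number of events.

```python
def total_nodes(object_nodes: int, events_per_object: int) -> int:
    """Object nodes plus one extra node for every event."""
    return object_nodes + object_nodes * events_per_object


def growth_factor(object_nodes: int, events_per_object: int) -> float:
    """How many times larger the graph gets once events are added."""
    return total_nodes(object_nodes, events_per_object) / object_nodes


OBJECT_NODES = 1_000_000  # hypothetical graph size

for label, events_per_object in [("coarse events", 10), ("attribute-level events", 100)]:
    nodes = total_nodes(OBJECT_NODES, events_per_object)
    factor = growth_factor(OBJECT_NODES, events_per_object)
    print(f"{label}: {nodes:,} nodes, factor {factor:.0f}")
```

Under these assumptions, 10 events per object already makes the graph 11 times larger, and 100 events per object makes it 101 times larger, so the choice of event granularity matters a lot.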

The caveat here is that it is sometimes useful to separate the events from the states, namely when you want to model which events result in which states. As you can see below, the Update and EndReview events are not mapped to a change of the state of the BoM revision. Secondly, the Work in Progress (WIP) state of the BoM revision is first triggered by the Revise event and later by the PauseReview event. In line with this, the Under Review (UR) state of the BoM revision is triggered by both the StartReview and Resume events. Also, the Revise event points to revision 01, but the related state WIP points to the new revision 02. Note that this is not about the exact example, as each business might have its own events and states defined, but about the concept that events and states do not always have a one-to-one mapping.
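The mapping described above can be written down as a small lookup table. This is a sketch of this specific example only; the dictionary and function names are my own, and your business will likely have different events and states.

```python
from collections import defaultdict

# Which state of the BoM revision each event triggers (None = no state change).
# The event and state names come from the example in the text.
EVENT_TO_STATE = {
    "Revise": "WIP",
    "Update": None,
    "StartReview": "UR",
    "PauseReview": "WIP",
    "Resume": "UR",
    "EndReview": None,
}


def triggers_per_state(event_to_state):
    """Invert the mapping: for each state, list the events that trigger it."""
    result = defaultdict(list)
    for event, state in event_to_state.items():
        if state is not None:
            result[state].append(event)
    return dict(result)


def events_without_state_change(event_to_state):
    """Events that happen but leave the state untouched."""
    return [event for event, state in event_to_state.items() if state is None]


print(triggers_per_state(EVENT_TO_STATE))
print(events_without_state_change(EVENT_TO_STATE))
```

Each state has two triggering events and two events trigger no state at all, so in this example you cannot reconstruct the events from the states alone.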

Depending on what you want to learn from your event knowledge graph, you have to choose whether to show both the events and the states or only one of them. In one of the examples below, you will see that I used revisions of an Impact Matrix as events. So there are multiple ways of looking at events depending on your use case.

Collisions/Intersections and Predictions

Product nodes can share the same events, for instance when a set of objects is released together using a release package. Because these chains of events are often predictable, you can also create placeholder events based on patterns learned from historical data. When you create these placeholders and connect them, you see the chain of events going into the future. This allows you, for instance, to predict that you will run into issues if you do not take any action.
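As a rough sketch of the placeholder idea: assume we have learned that a BoM revision typically goes through a fixed sequence of events. The sequence, the one-time-unit step, and the target name below are all hypothetical, and a real implementation would learn them from historical data rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    activity: str
    time: int               # t4 -> 4
    target: str
    predicted: bool = False  # True for placeholder events


# Hypothetical pattern learned from historical data.
EXPECTED_SEQUENCE = ["Revise", "StartReview", "EndReview", "Release"]
STEP = 1  # hypothetical time between two consecutive events


def add_placeholders(observed, now):
    """Extend the observed chain with predicted events after `now`."""
    last = max(observed, key=lambda e: e.time)
    position = EXPECTED_SEQUENCE.index(last.activity)
    remaining = EXPECTED_SEQUENCE[position + 1:]
    start = max(now, last.time)
    placeholders = [
        Event(activity, start + STEP * (i + 1), last.target, predicted=True)
        for i, activity in enumerate(remaining)
    ]
    return list(observed) + placeholders


observed = [Event("Revise", 3, "BoM of Part X")]  # hypothetical target
chain = add_placeholders(observed, now=4)
for event in chain:
    print(event)
```

Everything in the chain with a time after the current time is flagged as predicted, so you can always tell real events from placeholders when you query the graph.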

In the graph below, the current time is t4. That means that all the events after t4 are predicted, either from historical data or from some pre-defined logic. It also means that Part 1 BoM revision 01 and Part 2 BoM revision 02 are just placeholders, as these do not exist yet. There is a collision happening on the BoM of Part 2.

[Figure: Two release packages that impact the same BoM]

This becomes even clearer when you merge the two event chains into one directly-follows chain of events. Besides the fact that it looks like a cool spaceship, there are a couple of interesting patterns that we can identify from this example.

[Figure: Two release packages that impact the same BoM, merged event chain]

First, we have a StartReview event at t10 that is directly followed by another StartReview event at t11, both having Part 2 BoM rev 02 as target. The same goes for the EndReview events. But there is also an event collision: at t13 an EndReview event and a Release event occur at the same time with the same target. I know that in practice they will become different revisions of the same BoM, but having two revisions under review at the same time is probably not a good idea. For the event knowledge graph, it is now unclear how the directly-follows relations should be modeled based on the available data.
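Both patterns can be detected with very little code once the events are in one merged chain. In the sketch below, the package names RP1 and RP2, the assignment of events to packages, and the first EndReview at t12 are my assumptions to fill in the example; the t10, t11, and t13 events follow the description above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    activity: str
    time: int
    target: str
    package: str  # hypothetical release package the event belongs to


TARGET = "Part 2 BoM rev 02"
events = [
    Event("StartReview", 10, TARGET, "RP1"),
    Event("StartReview", 11, TARGET, "RP2"),
    Event("EndReview", 12, TARGET, "RP1"),  # time t12 is an assumption
    Event("EndReview", 13, TARGET, "RP2"),
    Event("Release", 13, TARGET, "RP1"),
]


def merged_chain(events):
    """One chain ordered by time; ties are broken by activity name."""
    return sorted(events, key=lambda e: (e.time, e.activity))


def repeated_activities(events):
    """Pairs where an activity is directly followed by the same activity on the same target."""
    chain = merged_chain(events)
    return [
        (a, b)
        for a, b in zip(chain, chain[1:])
        if a.activity == b.activity and a.target == b.target
    ]


def simultaneous_events(events):
    """Activities that hit the same target at the same time."""
    groups = defaultdict(list)
    for event in merged_chain(events):
        groups[(event.time, event.target)].append(event.activity)
    return {key: acts for key, acts in groups.items() if len(acts) > 1}


for a, b in repeated_activities(events):
    print(f"{a.activity} at t{a.time} directly followed by {b.activity} at t{b.time}")
print(simultaneous_events(events))
```

Notice that merged_chain needs an arbitrary tie-breaker for the two events at t13. That is the modeling problem in code form: the data alone does not say which of the two events directly follows the other.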

Each of these patterns could indicate that something might go wrong, or at least that extra work and delays are likely. By modeling this in your knowledge graph, you can develop the means to predict these situations and make your project teams aware of them. Naturally, the above examples depend heavily on how you are organized and how your development tools are implemented.

Behavior patterns

Events can also show behavior patterns of people in a process. For instance, during impact analysis, people might add an impacted part to the impact matrix, remove it later, add it again a while later, and finally remove it. Or the engineer extends the scope of a change late during its implementation. When this occurs, it is interesting to see not only that it happens but also which product data is involved and how often it happens. It could be that the same type of part is often added to the scope of a change, or that it only happens when multiple projects impact the same part.

[Figure: Behavioral patterns in an event knowledge graph]

In the above example, revision 01 of the Impact Matrix of Change 1 contains Parts P1, P2, and P3 as impacted parts. In revision 02, Part P1 was removed, so the Impact Matrix contains only Parts P2 and P3. Later, in revision 03, Part P1 is added again. Note that in this event knowledge graph, I decided to use the revisions of the Impact Matrix as events, as that is the granularity I was looking for. I could have made an event of every Add or Remove action, but that would not match how people work and iterate on their Impact Matrix.
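With revisions as events, the add/remove behavior falls out of a simple comparison between consecutive revisions. The sketch below uses the three revisions from the example; I assume P2 and P3 are still present in revision 03, and the function names are my own.

```python
# Impacted parts per Impact Matrix revision of Change 1.
# Assumption: revision 03 still contains P2 and P3 next to the re-added P1.
revisions = [
    ("01", {"P1", "P2", "P3"}),
    ("02", {"P2", "P3"}),
    ("03", {"P1", "P2", "P3"}),
]


def diff_revisions(revisions):
    """For each revision after the first: (revision, added parts, removed parts)."""
    changes = []
    for (_, before), (revision, after) in zip(revisions, revisions[1:]):
        changes.append((revision, sorted(after - before), sorted(before - after)))
    return changes


def readded_parts(revisions):
    """Parts that were removed in one revision and added back in a later one."""
    removed_so_far = set()
    readded = set()
    for _, added, removed in diff_revisions(revisions):
        readded |= removed_so_far & set(added)
        removed_so_far |= set(removed)
    return sorted(readded)


print(diff_revisions(revisions))
print(readded_parts(revisions))
```

Run over many changes, a count of how often each part or part type shows up in readded_parts would point you to where the impact analysis tends to go back and forth.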

Understanding these behavioral patterns can give better insight into why we work the way we work. If you only do process mining, you might not catch this, or you might not fully understand why it is happening, because you miss the context of the product data.

Conclusions

By bringing product data and process data together, interesting patterns emerge that could help us further improve the way we develop our products, prevent corrective actions and delays, and improve the overall quality of our product data. While this field of research is relatively new, we can already see various opportunities for using this approach to start tapping into insights currently untapped. It will require a different way of thinking about product data, process data, process mining, and data analytics, but from where I stand that will be well worth it.

Another advantage of modeling events in a knowledge graph is that Large Language Models (LLMs) can be trained on them and used to find patterns. This can also help you improve the quality of your impact analysis and the implementation planning of changes. For more information on the use of LLMs in configuration management, check these articles:

If you are interested in making use of event knowledge graphs you should connect with Dirk Fahland, as this is his field of research.

Header Photo by Uriel SC on Unsplash