
Less than 20% of the data generated today in industrial plants is actually used. This is about to change through better data mining and analysis, giving actionable insight and benefits on multiple levels.
As an engineer, operator or manager in any industrial operation, two things are typically the primary focus of my attention: the equipment and the process. In addition, sustainability has become increasingly important as a third topic.
When considering a piece of equipment like a pump or heat exchanger, I may want to know its physical condition, how it’s performing, whether it can take a higher load, how frequently it needs maintenance, whether it will fail soon, and so on. My aim is to know if the asset can perform better and, if so, how.
Then, when a series of equipment is put together in sequence, it becomes the process. Now I want to know how well it is all working together to make my products and, ultimately, my profits. So we look at outcomes such as yield from raw material input, total production rate, product quality and process stability.
The combined performances of equipment and processes, when viewed through the lens of sustainability, are an additional concern. This covers areas such as environmental emissions, energy use, raw material consumption, and risks and safety hazards for both people and assets.
Data needs to be put in context
Each of these three topics – equipment, process and sustainability – has a crucial impact on success for anyone operating an industrial plant. Thanks to advances in data extraction and analysis, process engineers today have powerful new tools to optimise all three. Specifically, valuable new insight can be gained from data that is already there but, until now, has not been sorted, put into context and analysed to find causality.
Typically, a plant’s process control system produces extremely large volumes of data at high frequency. Because the control system is the key element of operational technology, we call this OT data. Yet, according to industry experts such as the ARC Advisory Group, the average industrial plant uses less than 20% of the data it generates.
Normally we look at high-frequency data to see what is happening right now, or very recently, and not much more. If you do want to go back further, enormous quantities of data have most likely been dumped, at least partially, into a historian. What today’s technology helps us do is use all of that data and extract value from it.
For pumps, you may get some information about flow and temperature from sensors. But that OT data is not contextualised with other information that is stored, for example, in your maintenance management system to tell you that the pump was serviced just three weeks ago and there is a pending work order for additional service.
Your Enterprise Resource Planning (ERP) system might have further information on when the pump was bought and installed. However, as a process engineer, you typically don’t directly see that data because it’s in different systems, the IT data systems.
Then for this same pump you surely have detailed engineering information like the equipment specifications, maximum loads and capacities, CAD drawings, and maybe even a simulation model. This is the Engineering Technology (ET) data.
OT, IT and ET correlations help find root causes
So, you have tremendous amounts of data, probably more than you ever realised, but it is stored in separate silos which we call OT, IT and ET data. If we can bring it all together and see how things behave, following specific patterns, we can correlate these events and pieces of equipment to gain a valuable new tool: the power to predict process and equipment behaviour better.
Perhaps you know that you had a great production run two months ago, with high production rates, perfect quality and stable operation with no breakdowns. A real golden batch, with no hassles. But you wonder why you can’t reproduce it. What is different, you may ask.
Three key steps in data detective work
To find the answer and be able to recreate those ideal operating conditions and production results, we need to go through three steps so we can find root causes – not only of problems, but also of positive production runs and operations.
First, we bring together the OT, IT and ET source data from their separate silos, to allow cross-referencing. So, we extract it and put it all together in a ‘data lake’. But this in itself is not sufficient.
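As a rough illustration of that first step, the sketch below simply pulls exports from the three silos into one common store. It is a minimal example under assumptions: the file names, formats and columns are purely illustrative and do not refer to any specific product or interface.

```python
from pathlib import Path
import pandas as pd

# Illustrative exports from the three silos (file names and columns are assumptions)
ot = pd.read_parquet("historian_export.parquet")   # OT: time-series tags from the control system
it = pd.read_csv("maintenance_work_orders.csv")    # IT: work orders from the maintenance system
et = pd.read_csv("equipment_register.csv")         # ET: engineering specifications per asset

# The "data lake" here is simply a common store that later steps can read from,
# with each dataset kept in its raw form until it is contextualised.
lake = Path("lake")
lake.mkdir(exist_ok=True)
ot.to_parquet(lake / "ot.parquet")
it.to_parquet(lake / "it.parquet")
et.to_parquet(lake / "et.parquet")
```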
The second step is to build a model that represents the reality of your operation by bringing together the information from the multiple data sources that relate to the same equipment or process area. This is called contextualising: you no longer have just bits and bytes, zeros and ones, but a context that intelligently models how they are related.
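To make the idea of contextualisation concrete, here is a minimal sketch of how one asset could be modelled with its OT, IT and ET data linked together. All class names, fields and values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SensorTag:          # OT: a live tag from the control system or historian
    tag: str              # e.g. "P-101.FLOW"
    unit: str

@dataclass
class WorkOrder:          # IT: an entry from the maintenance management system
    order_id: str
    description: str
    status: str           # e.g. "completed" or "pending"

@dataclass
class EquipmentSpec:      # ET: engineering data such as rated capacity
    max_flow_m3h: float
    rated_power_kw: float

@dataclass
class ContextualisedAsset:
    """One asset (e.g. the pump above) with its OT, IT and ET data linked together."""
    asset_id: str
    ot_tags: List[SensorTag] = field(default_factory=list)
    it_work_orders: List[WorkOrder] = field(default_factory=list)
    et_spec: Optional[EquipmentSpec] = None

# The pump from the earlier example, now seen as one object instead of three silos
pump = ContextualisedAsset(
    asset_id="P-101",
    ot_tags=[SensorTag("P-101.FLOW", "m3/h"), SensorTag("P-101.TEMP", "degC")],
    it_work_orders=[WorkOrder("WO-4711", "Bearing service", "completed"),
                    WorkOrder("WO-4790", "Seal inspection", "pending")],
    et_spec=EquipmentSpec(max_flow_m3h=120.0, rated_power_kw=45.0),
)
```

The value is not in the classes themselves but in the fact that a question about the pump can now be answered from one place rather than from three separate systems.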
Then the third step is to find correlations. After gathering all the data and organising it through modelling, we can apply the newest analytic tools and machine learning algorithms to identify relations you never knew existed. The relations were there, but not visible. This detective work allows us to reveal potentially meaningful and important correlations. But again, we are not fully there yet.
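Before turning to what is still missing, here is a simple sketch of what a first pass at that third step might look like: screening a merged, contextualised dataset for the process variables most strongly correlated with an outcome of interest. The file and column names are hypothetical, and a real analysis would use far richer methods than a plain correlation matrix.

```python
import pandas as pd

# Hypothetical merged dataset: one row per hour, with columns drawn from OT, IT
# and ET sources after contextualisation (all names are illustrative only).
df = pd.read_csv("contextualised_plant_data.csv", parse_dates=["timestamp"])

features = ["pump_flow_m3h", "pump_temp_c", "days_since_service",
            "feed_quality_index", "line_speed"]
target = "product_quality"

# Rank candidate relationships by absolute correlation with the outcome we care
# about. This only surfaces candidates; whether a correlation reflects a root
# cause is judged later, with domain expertise.
correlations = (
    df[features + [target]]
    .corr(numeric_only=True)[target]
    .drop(target)
    .abs()
    .sort_values(ascending=False)
)
print(correlations)
```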

Domain expertise is indispensable
Simply having correlations does not automatically give you causality. For example, if the data shows a correlation between a piece of equipment being hot and its failure, that may very well be a side effect, not the original cause.
You might also have a huge number of correlations, and finding the relevant ones is like searching for a needle in a haystack. To make sound and correct judgments about which correlations are meaningful and which are not, we need another important piece of the puzzle integrated into the third step: deep domain expertise.
Domain expertise is the meaningful understanding that a person or company has about a specific manufacturing process or piece of equipment. This could involve expertise in the production of paper, or steel, or oil and gas, or dozens of other products. Having deep understanding about specific sectors, or domains, gives you a much higher probability of making successful decisions about correlations and root causes.
Genix platform was created for this purpose
While researching ways that ABB might accomplish better data mining and correlation discovery, we realised we are in a unique position. Our process control platforms are automating many of the world’s plants in a wide variety of industries, and our equipment is generating much of the data customers can use for more impactful business decisions.
In addition, the actual motion and other interactions in many of these processes are driven by ABB equipment – motors, drives, switchgear, generators, robots and more – facilitated by world-class data processing capabilities. We have deep domain knowledge about many industries, and also about the equipment being used.
Putting this all together under a common platform, starting with legacy applications to which we added powerful new functionality, we created ABB Ability™ Genix. The Genix platform gathers all the power required to use big data for big gains under one roof, pulling together OT, IT, and ET data, contextualising it and revealing correlations that lead to process improvements.
Thinking way beyond condition monitoring
If we look at condition monitoring of equipment, and the growing area of predictive maintenance, much progress is being made by a lot of companies. And it’s an area that can certainly benefit from the contextualisation and correlation tasks that Genix performs.
But looking at assets/equipment is only a starting point. Remember the other two items you care about: process operations and sustainability.
Soft sensors give computed measurements and confident predictions
An example of what we can now do might concern emissions from a power plant. You obviously want to stay within the legal limits, but when the emissions are measured at the end of the process and found to be too high, it’s too late. The result is created much earlier in the plant and process.
To avoid this, we can now correlate emissions measurement with various sensors that we already have earlier in the process and create what we call a “soft sensor” measurement. This will predict what the emissions will be, based on historical correlations between process parameters and final emissions.
It’s not a real measurement, it’s a computed one – and it has been shown to be a good predictor of the emissions you will have further downstream. So, based on our domain expertise and algorithms, we can predict with high confidence what the emissions will be from these upstream parameters.
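To illustrate the idea (not any specific product implementation), the sketch below trains a simple soft sensor: a regression model that learns, from historical data, how upstream process parameters relate to the emissions eventually measured downstream. All file names, column names and the choice of model are assumptions made for the example.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Hypothetical historical data: upstream process parameters plus the emissions
# actually measured downstream (all column names are illustrative only).
df = pd.read_csv("historical_process_data.csv")
upstream = ["fuel_flow", "air_flow", "furnace_temp", "o2_excess", "load_mw"]
target = "nox_mg_per_nm3"

# Keep the time order when splitting, so the model is validated on later data
X_train, X_test, y_train, y_test = train_test_split(
    df[upstream], df[target], test_size=0.2, shuffle=False)

# Fit the soft sensor: a regression model that infers emissions from
# measurements taken earlier in the process.
soft_sensor = GradientBoostingRegressor().fit(X_train, y_train)
print("Hold-out R^2:", r2_score(y_test, soft_sensor.predict(X_test)))

# In operation, the model gives a predicted emissions value well before the
# analyser at the end of the process would report a problem.
predicted_now = soft_sensor.predict(df[upstream].tail(1))
```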
Don’t be afraid, just get started
You might be asking yourself: “Do I have to install lots of additional sensors – that themselves may require maintenance – to get the necessary data?” Actually, by using data from equipment and sensors that are already in place measuring parameters like voltage, current, temperature, and flow rate, you may be able to get some key benefits. Later, if you decide additional data is needed, you can do a targeted investment to get it.
No matter where you are on your digital data journey – and it’s different for absolutely every company – don’t be afraid to start applying these tools, based on signals already being generated in your process control systems.
It’s like running ten miles: You will never get to the finish line if you don’t take the first steps. And if you choose the right buddy to run with, it makes everything a lot easier.