The role of Big Data in creating Industry 4.0

Industry 4.0

In the last 20 years the stakeholders of industry have been able to reduce waste and improve product quality and yield. This was possible because they implemented lean and Six Sigma methods in their processes. The processes of today's companies work so well that decision-makers may even think they can barely be improved any further. Recently, however, some have stated that we are on the threshold of the fourth industrial revolution. The latest IT technologies will have a major impact on industry: they will completely change it, and change how we think about it. The change will be so profound that the mix of these new technologies has been named Industry 4.0.

Industrial revolutions

Why 4.0? Industry 4.0 is the consequence of the fourth industrial revolution, which is happening right now. Over the course of history, mankind has lived through three industrial revolutions. So what about the other three?

The essence of every industrial revolution is the introduction of a whole new technology that makes processes faster and safer, makes products better and increases yield. During the first, in the 18th century, the big change was the invention of mechanical production facilities driven by water and steam. The next big stage came in the late 1800s with the application of electrical energy and new labor concepts, when the first production lines appeared on the scene. The third was not so long ago: about forty years ago everything was about automation, after the first PLC was invented. And now comes the fourth, with IoT and Big Data…


Why Big Data? 

According to McKinsey, more data is created in modern manufacturing than in any other industry. Most of this data is not only unused, it is not even collected. And even where it is collected, there is often no capacity to process it.

As can be seen in other fields such as IT and business, data-driven decision making is steadily gaining ground, because those companies have realized that investing in data science can greatly improve their operations and production. As MIT professor Erik Brynjolfsson told The New York Times back in 2011, "companies adopting data-driven decision making achieved productivity gains that were 5 percent to 6 percent higher than other factors could explain." In recent years the only problem was that the amount of data around us began to grow very rapidly. Old-fashioned data management software could no longer process the volumes we had started to collect. Today we collect more data in a short span than humanity had collected from the beginning of history until 2003, and the volume doubles roughly every 40 months.

Then came Hadoop, and with it a host of new so-called distributed data processing components of the Hadoop framework. It has unarguably changed the way we think about data. Implementing a Hadoop environment can still cost a lot, even though Hadoop itself and almost all of its components are open source. This is because setting up a system like Hadoop is not simple: it has many components, and each fits a specific use case. Few best practices have been established in the Hadoop world so far, so each company has to experiment and do its own research and development with the technologies on the market for its own use cases. A further complication is that Big Data is evolving very fast, with new technologies appearing on every corner. Companies therefore also have to implement quickly if they want to stay on the wave.
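To make the distributed processing idea behind Hadoop concrete, here is a minimal single-machine sketch of the MapReduce pattern in Python. The records, function names and topics are invented for the illustration; in a real Hadoop cluster the map and reduce phases would run in parallel across many nodes.

```python
from collections import defaultdict

# Toy records, e.g. status lines reported by factory machines (illustrative data).
records = [
    "machine-A temperature high",
    "machine-B temperature normal",
    "machine-A vibration high",
]

# Map phase: emit (key, value) pairs from each record independently.
# In Hadoop this step is distributed across worker nodes.
def map_phase(record):
    for word in record.split():
        yield (word, 1)

# Shuffle: group all emitted values by key.
grouped = defaultdict(list)
for record in records:
    for key, value in map_phase(record):
        grouped[key].append(value)

# Reduce phase: aggregate the values belonging to each key.
counts = {key: sum(values) for key, values in grouped.items()}

print(counts["high"])       # 2
print(counts["machine-A"])  # 2
```

The point of the pattern is that both the map and the reduce steps can be split across machines, which is what lets Hadoop scale to data volumes a single server cannot handle.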

Until recently, Big Data technologies were typically applied only in IT, business and a few scientific fields. In industry the adoption is happening right now, and a lot of development will become visible in the future. The biggest companies in the world, such as GE and Bosch, have just started to build their Industry 4.0 systems. This clearly shows that the change is inevitable and that sooner or later every company will need to follow.

How can we use it in industry? 

According to GE, in one of their customers' plants a single machine produces 13 billion samples per day, which corresponds to the same number of records in a database table. From this we can imagine how much data an ordinary factory produces.

In most cases data science serves some business purpose, and industry is no different. Its most important goals are zero unplanned downtime, highly reliable assets, and the most economically efficient use of those assets. Big Data can help reach these goals by predicting and preventing crisis situations, and it can also enable real-time interventions within the systems. The real power of Big Data is the speed at which it processes data; with that speed we can establish a closed loop in which the system is controlled by its own data.
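Such a closed loop can be sketched in a few lines. Everything below is a hypothetical stand-in: the sensor read, the safe limit and the actuator command are invented for the example, not taken from any real plant API.

```python
# Minimal closed-loop sketch: read a sensor, decide, intervene in real time.
# All names and values are illustrative.

def read_temperature(t):
    """Stand-in for a streaming sensor read (synthetic drifting signal)."""
    return 60 + 5 * t  # temperature rising over time

def throttle_machine():
    """Stand-in for a real-time intervention (actuator command)."""
    return "throttled"

TEMP_LIMIT = 80.0  # assumed safe operating limit

actions = []
for t in range(10):  # ten time steps of the control loop
    temp = read_temperature(t)
    if temp > TEMP_LIMIT:
        # Intervene immediately instead of waiting for an offline report.
        actions.append((t, throttle_machine()))

print(actions[0])  # (5, 'throttled') - the first step where the limit is exceeded
```

The key property is that the decision is taken inside the loop, on live data, rather than after the fact from a report.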

So far, the most common method of trying to eliminate errors in industry has been sampling. Sampling is not very effective, can cost a lot, and is an invasive method. Installing sensors and processing their data, on the other hand, is non-invasive and can be very effective if the right methods are applied. The know-how of processing industrial data naturally differs from processing IT or business data. Data scientists, and the algorithms they develop, have to deeply understand the physical processes that play out in an industrial environment. They have to work with the laws of physics, including thermodynamics, mechanics and electronics, and with any other field connected to the specific industrial environment.
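As a toy illustration of the sensor-based approach, the sketch below flags anomalous readings using a z-score against a baseline of normal operation. The readings and the threshold are invented for the example; a real deployment would embed physical models of the process rather than a bare statistical rule.

```python
import statistics

# Synthetic vibration readings; the last value is an injected fault.
readings = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0, 3.5]

# Baseline statistics computed from the known-normal portion.
mean = statistics.mean(readings[:-1])
stdev = statistics.stdev(readings[:-1])

Z_THRESHOLD = 3.0  # assumed alert threshold

# Flag readings that deviate too far from the baseline.
anomalies = [
    (i, x) for i, x in enumerate(readings)
    if abs(x - mean) / stdev > Z_THRESHOLD
]

print(anomalies)  # [(9, 3.5)] - only the faulty reading is flagged
```

Unlike pulling physical samples off the line, this kind of check runs continuously on every reading without touching the process.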

Analyzing data and creating BI reports and visualizations is a good way to support management decision-making. But it is still a slow way to change anything in the process when the reports suggest something could be improved. As mentioned before, the real power lies in real-time interventions, and that requires another technology which is becoming more and more popular in data science these days: Machine Learning.

Machine Learning algorithms have to be fed with data in order to work and to build the appropriate intelligent models. Generally, the more data they get, the more precise their predictions become. And in the last few years we have become able to process huge amounts of data.
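The "more data, more precision" effect can be shown with a deliberately simple model. The sketch below fits a least-squares slope to a hypothetical tool-wear process with a built-in measurement disturbance; all numbers are invented for the illustration.

```python
# Hypothetical process: tool wear grows linearly with run hours
# (true slope 0.5), plus an alternating measurement disturbance.
def observe(hours):
    return 0.5 * hours + (0.5 if hours % 2 == 0 else -0.5)

def fit_slope(n_samples):
    """Least-squares slope through the origin from n observations."""
    data = [(h, observe(h)) for h in range(1, n_samples + 1)]
    num = sum(h * y for h, y in data)
    den = sum(h * h for h, _ in data)
    return num / den

small_err = abs(fit_slope(10) - 0.5)    # estimation error with little data
large_err = abs(fit_slope(1000) - 0.5)  # estimation error with much more data

print(small_err > large_err)  # True: more data yields a more precise model
```

With ten observations the disturbance still visibly biases the estimate; with a thousand it is averaged almost entirely away, which is exactly why the new capacity to process huge datasets matters for Machine Learning.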

Now put these two things together.

IoT and takeaways 

Industry 4.0 is built on the technology of the Internet of Things. IoT makes it possible for the components of an industrial environment to communicate with each other and to make decisions in real time, in order to improve production quality and prevent failures.

IoT does not only mean that machines communicate with each other: human interactions will be equally important, and humans will be part of the IoT system. Clearly, building such a large IoT system would not be possible without the power of Big Data technologies.
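One common way to picture components communicating with each other is a publish/subscribe bus. The sketch below is an in-memory toy: a real IoT deployment would use a message broker such as an MQTT broker, and every class, topic and message here is invented for the example.

```python
from collections import defaultdict

# Toy in-memory publish/subscribe bus. A real IoT system would use a
# network message broker (e.g. MQTT), not this class.
class Bus:
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self.subscribers[topic]:
            callback(message)

bus = Bus()
log = []

# A cooling unit reacts to temperature alerts from any machine.
bus.subscribe("temperature/alert", lambda msg: log.append(f"cooling on: {msg}"))
# An operator dashboard listens to the same topic.
bus.subscribe("temperature/alert", lambda msg: log.append(f"notified: {msg}"))

# A sensor on machine A publishes an alert; both subscribers react at once.
bus.publish("temperature/alert", "machine-A at 92C")

print(len(log))  # 2
```

The decoupling is the point: the sensor does not need to know who listens, so machines, actuators and human-facing dashboards can all join the same conversation.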

What's more, future wireless communication technologies will be really fast, such as 5G, which is currently being developed by several telecommunications companies, including Ericsson, Nokia and Huawei. So fast and so reliable that they will be applicable in environments more critical than mobile networks, such as traffic control or industry. In industry this does not just mean that the sensors and machines of a single factory will be connected to each other.

It means that multiple industrial sites can be connected to each other. For instance, an event happening right now in a plant in Europe will be able to modify and improve the process of another plant in America.

In real-time.

In every second.