The keys to being data-driven

Data-driven organization

Being data-driven has become a must for companies that want to stay competitive. However, achieving it requires some important capabilities, and a company that has not implemented them cannot claim to be data-driven. Some companies make the claim anyway, even though they do not hold all the keys. This can be very dangerous, and it is often caused by a lack of understanding. Managers have to be prudent when they decide to transform an organization into a data-driven one. Creating such an organization is not easy, especially in a big enterprise.

One of the most frequently cited studies on the use of Big Data in companies is the research of Erik Brynjolfsson (a professor at MIT), published in 2011 (Brynjolfsson, E., L. M. Hitt, and H. H. Kim, "Strength in Numbers: How Does Data-Driven Decision making Affect Firm Performance?"). As The New York Times reported, companies that adopted data-driven decision making achieved 5-6% higher productivity gains. This research was done just as Big Data technologies were becoming a hype all over the globe. Now, in 2015, Forbes has confirmed that Big Data is no longer just hype but an established field of technology. In partnership with Teradata and McKinsey, they asked 316 executives of large global companies about the state of their Big Data implementations. 90% of them reported medium to high levels of investment in these kinds of technologies and in data analytics. It is a clear trend that data-driven decision making is becoming a standard methodology in business. But what are the keys to applying it?


The most important key is implementing a data-oriented culture. A company can have hundreds of use cases for data analytics. Thanks to Big Data technologies, which give us very flexible tools and methods, the goal of an analysis no longer has to be fixed in advance in every case. We have the possibility and the resources to run experiments, through which we may even discover entirely new KPIs. It should not be forgotten that this is one of the biggest advantages of these technologies.

However, since there can be many use cases, there have to be standards and protocols for using the different types of data. Protocols are intended to guide everyone in the data management chain through the processes step by step. It is not enough to have separate standards for each data source and for each section or division of a company. These standards have to be company-wide in order to fit the data-driven strategy, which involves the data lake philosophy. In short, a data lake means that all data sources should be joinable in one environment, regardless of their differences, for instance whether they are structured or unstructured.
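To make the idea of "joinable in one environment" more concrete, here is a minimal sketch in Python. It assumes two hypothetical sources that share a customer_id key: a structured CSV export and a set of semi-structured JSON event records. The source names, fields, and sample values are illustrative assumptions only; a real data lake would do this at scale with dedicated tooling.

```python
import csv
import io
import json

# Hypothetical structured source: a CSV export from a CRM system.
CRM_CSV = """customer_id,name,segment
1,Alice,enterprise
2,Bob,smb
"""

# Hypothetical semi-structured source: JSON-lines events with optional fields.
EVENTS_JSONL = """{"customer_id": 1, "event": "login", "meta": {"device": "mobile"}}
{"customer_id": 1, "event": "purchase", "meta": {"amount": 40}}
{"customer_id": 3, "event": "login"}
"""


def load_customers(text):
    """Parse CSV text into a dict keyed by integer customer_id."""
    reader = csv.DictReader(io.StringIO(text))
    return {int(row["customer_id"]): row for row in reader}


def load_events(text):
    """Parse JSON-lines text into a list of event dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def join_events_with_customers(customers, events):
    """Left join: keep every event and attach the customer segment if known."""
    joined = []
    for event in events:
        customer = customers.get(event["customer_id"])
        joined.append({
            "customer_id": event["customer_id"],
            "event": event["event"],
            "segment": customer["segment"] if customer else None,
        })
    return joined


customers = load_customers(CRM_CSV)
events = load_events(EVENTS_JSONL)
joined = join_events_with_customers(customers, events)
```

The join only works because both sources agree on the name and type of the key, which is exactly what company-wide standards are meant to guarantee.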

Another key point of the culture is data sharing. The main idea is that, if there is no legal obstacle, data should be shared with as many employees as possible. Of course, because of possible legal obstacles and top business secrets, there has to be a proper authorization system on the data lake.
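The "share by default, restrict where necessary" idea could be sketched as follows. This is only an illustration under stated assumptions: the dataset names, role names, and the can_read helper are hypothetical, and a real data lake would rely on the access control features of its own platform.

```python
# Hypothetical list of restricted datasets and the roles allowed to read them.
# Any dataset not listed here is considered open to every employee.
RESTRICTED_DATASETS = {
    "payroll": {"hr", "finance"},
    "contracts": {"legal"},
}


def can_read(role, dataset):
    """Return True if the role may read the dataset (open unless restricted)."""
    allowed_roles = RESTRICTED_DATASETS.get(dataset)
    return allowed_roles is None or role in allowed_roles
```

For example, can_read("analyst", "web_clicks") is True because that dataset is not restricted, while can_read("analyst", "payroll") is False.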


It is not necessarily true that the more data you collect and process, the more data-driven you are. However, if your data has the 4V characteristics (volume, velocity, variety, veracity), then it is clear that you need to invest in Big Data technologies. Of course, the proper architecture and tools strongly depend on the actual types of the data sources, on how the data is used and processed, and on the goal of the processing. That is why Big Data environments are so varied. There are no universal "best solutions" off the shelf. Building such an architecture is more like playing with building blocks.

So the takeaway is that a careful investigation of data sources and use cases has to precede the building of the architecture and data pipelines.


There have to be well-defined roles inside a company. There has to be a role that defines the goals of the analytics and asks the right questions. A right question is one whose answer can be used to create business value regardless of the outcome of the analysis, whether it is positive or negative.

On the other hand, we need people who speak the language of data. However, we have to separate the data engineering role from the data analyst and data scientist roles. Despite the separation, these roles are strongly dependent on each other. We have to find a good balance in our data team between engineers and scientists/analysts, and that balance depends on the use cases and on the tools used in the environment.

Nowadays, in the data business, top-down management is not always the best method. In some cases, bottom-up ideas for new KPIs or actions are more than welcome. The people who are closest to the data are also closest to the fire; they are the first to understand the whole story.


Now we have an architecture, an implemented data-driven culture, and a data team. Are we data-driven now? Not yet, because up to this point we have not made a single dollar of profit.

Making dashboards and predictions is nice, but action has to be taken based on them before we can rightly claim to be data-driven. Managers and others who act and make decisions have to trust their data, and have to stop deciding based on habit and intuition. This is hard sometimes, when the data does not show what we expected. Of course, there are cases when we have to double-check our data and make sure everything works well in our data-driven system. But since we have invested in it in the hope that it will make a profit, we have to trust our data. And, as mentioned earlier, this whole thing was built because things may be happening that we did not expect.