The Data Knowledge Initiative
Let’s explore how people and organizations perceive, gather and manage data and how they gain a sustainable data-driven advantage.
What is data?
A system collects raw data: recorded signal readings and discrete facts, usually coded and in electronic, binary form
Through processing, analysis and interpretation, data is turned into meaningful information whose implications can serve as inputs for decisions
Through experience, information is turned into knowledge, which generally resides with the human user: (re)cognition (know-what), capacity to act (know-how) and understanding (know-why)
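The three steps above can be sketched in a few lines of Python. This is only an illustration: the temperature readings, the comfort threshold and the `recommend` function are hypothetical and do not come from the source.

```python
from statistics import mean

# Data: raw, coded readings as a system might record them (hypothetical values).
readings_celsius = [21.4, 21.9, 22.3, 22.8, 23.1]

# Information: processing and interpreting the readings gives them meaning.
average = mean(readings_celsius)
rising = readings_celsius[-1] > readings_celsius[0]

# Knowledge: a rule of thumb learned from experience (hypothetical threshold).
COMFORT_LIMIT_CELSIUS = 22.0


def recommend(average_temp: float, is_rising: bool) -> str:
    """Turn information into a decision using the experience-based rule."""
    if average_temp > COMFORT_LIMIT_CELSIUS and is_rising:
        return "start cooling"
    return "no action"


decision = recommend(average, rising)
print(f"average={average:.1f} C, rising={rising}, decision={decision}")
```

The readings alone say nothing about what to do. Only the interpreted values, combined with a rule the user already holds, lead to a decision.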
Big Data
What is “new” and “big” about big data?
Volume
For example, Walmart is estimated to collect more than 2.5 petabytes of data every hour from its customer transactions (1 petabyte = 1 million gigabytes)
Velocity
Speed of data creation: real-time or nearly real-time data
Variety
Different forms (images, text, location data) and different sources (social networks, mobile phones)
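To put the volume figure quoted for Walmart in perspective, the arithmetic can be worked through as below. This is a minimal sketch that assumes decimal (SI) units, matching the conversion given in the text; the function name is illustrative.

```python
GB_PER_PB = 1_000_000  # decimal (SI) units: 1 petabyte = 1 million gigabytes


def petabytes_to_gigabytes(petabytes: float) -> float:
    """Convert petabytes to gigabytes using decimal units."""
    return petabytes * GB_PER_PB


hourly_pb = 2.5  # estimate quoted in the text
hourly_gb = petabytes_to_gigabytes(hourly_pb)
daily_pb = hourly_pb * 24  # assumes the hourly rate holds around the clock

print(f"{hourly_pb} PB per hour = {hourly_gb:,.0f} GB per hour")
print(f"At a constant rate: {daily_pb:.0f} PB per day")
```

The daily figure is an extrapolation that assumes a constant hourly rate, which real transaction volumes are unlikely to follow exactly.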
Unique characteristics
Context-specific
The value of data is difficult to quantify. Benefits are context- and use-specific. Costs include investments (IT, data centres) and talent acquisition. Markets for data are not as developed as markets for knowledge and information. Trade in data is hindered by its non-rival nature.
Decreasing marginal returns
Data typically exhibits decreasing marginal returns to scale. For instance, accuracy of an AI system tends to improve as the quantity of data points increases, but it does so at a decreasing rate.
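A simple statistical stand-in shows the shape of this effect. The sketch below does not model an AI system. It uses the standard error of a sample mean, which shrinks in proportion to 1/sqrt(n), as an assumed proxy for estimation error; the noise level and sample sizes are hypothetical.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of a sample mean from n independent observations."""
    return sigma / math.sqrt(n)


sigma = 1.0  # hypothetical noise level of a single observation
sizes = [100, 1_000, 10_000, 100_000]
errors = [standard_error(sigma, n) for n in sizes]

# Improvement gained from each tenfold increase in the amount of data.
gains = [errors[i] - errors[i + 1] for i in range(len(errors) - 1)]

for n, err in zip(sizes, errors):
    print(f"n={n:>7,}  error={err:.4f}")
for i, gain in enumerate(gains):
    print(f"{sizes[i]:,} -> {sizes[i + 1]:,} points: error falls by {gain:.4f}")
```

Each tenfold increase in data still reduces the error, but by a smaller amount than the previous one, which is the pattern of decreasing marginal returns described above.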
Non-rival
The Economist wrote “data is the new oil”. Indeed, both need to be refined before use. However, oil is rival (use by one person reduces availability for others), whereas data is non-rival.
Granular
The focus has shifted from the size (“big-ness”) of data to its granularity. For example, a participant in a Formula 1 race generates 20 gigabytes of data from the 150 sensors on the car. This data helps analyse not only the technical performance of the car’s components but also the driver reactions, pit stop delays, and communication between crew and driver that contribute to overall performance.
Subjective
While data is often thought of as a set of objective facts, it is actively created, not passively recorded. Data is situated, partial and constitutive. -> Pay attention to which data you choose to collect.
Time-specific
Data may lose relevance over time. Moreover, data takes time to collect, and you cannot go back in time to reproduce a competitor’s data. Likewise, one year of data collection may not replicate what five years of collection would yield. -> Pay attention to when you (need to) collect the data.
When data creates competitive advantage…
and when it does not
Value-add of customer data to product or service
Do customers value the data-driven improvements? In order to gain a competitive advantage, data must improve the product or service in the eyes of the customer.
Durability of data-enabled learning
If the marginal value of new data is small (e.g. thermostats learn users’ preferences within a few days), then other firms can easily achieve the same learnings.
Speed of depreciation
If data becomes obsolete quickly, an incumbent’s accumulated data offers little protection, because new rivals can acquire equally relevant data.
Uniqueness
Proprietary data: If data can easily be purchased or copied, then the firm will not gain an advantage.
Imitability of data-driven product improvements
Even if the data itself is difficult to imitate, the product improvements may be easy to imitate, leading only to a temporary advantage.