The age of data
In the early 2000s, a quiet revolution began. With the rise of social media, smartphones, and the advent of the Internet of Things (IoT), we entered the age of data—a moment when the volume of data generation truly started to soar.
By some estimates, a person connected to the internet generates about 146 GB of data per day on average. To put this into perspective, that is roughly 45 hours of high-quality video.
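As a rough sanity check on those two figures, the short Python sketch below converts them into per-second and per-hour rates. It assumes decimal gigabytes (1 GB = 10^9 bytes), which the estimate itself does not specify, so the results are approximations, not precise values.

```python
# Assumption: decimal gigabytes; the estimate does not say which unit is meant.
GB = 10**9

daily_bytes = 146 * GB   # data generated per person per day (figure from the text)
video_hours = 45         # equivalent hours of video (figure from the text)

# How much data that is per second, spread evenly over a day.
bytes_per_second = daily_bytes / (24 * 60 * 60)

# What the comparison implies about the video itself.
video_gb_per_hour = 146 / video_hours
video_mbit_per_second = daily_bytes * 8 / (video_hours * 3600) / 10**6

print(f"{bytes_per_second / 10**6:.2f} MB generated per second")  # 1.69
print(f"{video_gb_per_hour:.2f} GB per hour of video")            # 3.24
print(f"{video_mbit_per_second:.2f} Mbit/s video bitrate")        # 7.21
```

In other words, the comparison implies video at a little over 3 GB per hour, or roughly 7 Mbit/s.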
This data comes from everywhere: your morning Google search for the weather, performed alongside approximately 40,000 other searches in the same second; the tweet you posted; the Netflix show you streamed in the evening. Even the steps you take during your afternoon walk are recorded by your smartwatch.
But as we become more digital, so do the businesses, institutions, and governments around us. Individuals are not the only contributors to this vast amount of data: nearly everything around us generates it too.
What is data, really?
At its core, data is information. It's the digital breadcrumbs we leave behind as we navigate through our increasingly interconnected world—a massive, ever-growing diary of behavior.
Nowadays, your smart fridge knows more about your eating habits than you do, your fitness tracker understands your health better than you, and each social media platform knows your personal interests precisely.
The modern factory is filled with systems and devices that measure each part of the process more accurately than any human could. Meanwhile, computers can crunch in a day the research data that a team of scientists would spend weeks analyzing. Smart cities collect data about traffic patterns and energy usage, all to make urban life more efficient and sustainable.
All of this information, whether generated by you or by businesses, governments, and devices, from every click to every measurement, is stored, processed, and analyzed.
The digital library of Alexandria
In ancient times, the Library of Alexandria was considered the pinnacle of human knowledge. Today, it's as if we're collectively writing the most detailed library ever created. All that data is stored in massive data centers around the world, far surpassing the knowledge that was once housed in Alexandria's library. It is the universe of human knowledge.
However, unlike the neatly categorized scrolls and books of the ancient library, our modern digital repository is far more complex and diverse in its organization.
Structured vs. unstructured data
Just as the Library of Alexandria had various types of documents, from carefully indexed scholarly works to loosely organized personal letters, our digital data also comes in different forms. In the world of big data, we primarily distinguish between two types: structured and unstructured data.
Most of the data generated today, an estimated 80% to 90%, is unstructured; the remainder is structured. Each type has its own characteristics and challenges.
Structured data is organized in a predefined format, such as an Excel spreadsheet with rows and columns. Each piece of information is stored in a specific location defined by the table's structure, for example one column for your first name and another for your surname. This makes the data easy to search and retrieve, and makes it straightforward to define relationships between data stored in different locations.
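To make this concrete, here is a minimal sketch using Python's built-in sqlite3 module. The two tables, people and orders, and all of their contents are invented for illustration. Because every value sits in a known column, looking something up across both tables takes a single query.

```python
import sqlite3

# In-memory database; the tables and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, first_name TEXT, surname TEXT)"
)
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
    "person_id INTEGER REFERENCES people(id), item TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Ada", "Lovelace"), (2, "Alan", "Turing")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, "notebook"), (2, 2, "chess set"), (3, 1, "pen")],
)


def items_for_surname(surname):
    """Return the items ordered by everyone with the given surname."""
    rows = conn.execute(
        "SELECT o.item FROM orders o "
        "JOIN people p ON p.id = o.person_id "
        "WHERE p.surname = ? ORDER BY o.id",
        (surname,),
    ).fetchall()
    return [item for (item,) in rows]


print(items_for_surname("Lovelace"))  # ['notebook', 'pen']
```

The person_id column is what links an order to a person; that explicit link is the kind of relationship a fixed structure makes easy to define.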
Unstructured data, on the other hand, does not fit a predefined structure. It comes in many formats, such as text documents and multimedia files like audio, video, or images. Because there is no fixed layout, it is harder to search and harder to define relationships between pieces of data stored in different locations.
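By contrast, here is a small Python sketch of what searching unstructured text can look like. The notes are made up for illustration, and the keyword match is only one simple approach, assumed here for the sake of the example.

```python
import re

# Hypothetical free-text notes; nothing marks which words are names or items.
notes = [
    "Ada Lovelace called about the notebook she ordered last week.",
    "Spoke with A. Lovelace again, she also wants a pen.",
    "Alan Turing picked up his chess set this morning.",
]


def notes_mentioning(term, documents):
    """Return the documents containing `term` as a whole word, ignoring case."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return [doc for doc in documents if pattern.search(doc)]


matches = notes_mentioning("lovelace", notes)
print(len(matches))  # 2
```

The search finds the notes that mention a name, but it cannot say which item belongs to which person. That connection still has to be interpreted from the text, by a human or by more advanced analysis.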
Because data is information, it is vital. Like oxygen, it is something we can no longer go without in our digital age.
Information drives innovation, decision-making, and economic value. Medical data is fueling breakthroughs in healthcare, businesses use data to make more informed decisions, and the insights derived from data are worth billions to companies worldwide.
Understanding the value and implications of our data is the first step in navigating this digital landscape. In today's world, data literacy is becoming as important as reading and writing. This applies to businesses as well as individuals.
Data tells a story—the question is, who's able to read the story?