An overview on Big Data

Article by Johann Mifsud, Executive at eSkills Malta Foundation, published in The Sunday Times of Malta on 14.11.2021

Big Data technology has been working in the background in a wide range of sectors for more than a decade. Practically all activities of our daily lives leave some sort of digital footprint, and organisations, both public and private, are finding ways to extract wealth from this data gold mine.

Big data projects are applied across a wide range of industries, with healthcare, financial services, transport, logistics and customer service being typical examples.

But what does Big data mean?

In essence, Big data means large sets of data originating from different sources and processed in separate storage for the sole purpose of analysis. The ultimate goal is that of reaping valuable and actionable information from this data that translates into value. To visualise, one might take a simplistic example, that of a supermarket chain. The chain managers use analytics on the purchase data of thousands of customers across hundreds of stores, possibly uncovering hidden buying patterns. In turn, actions such as product placement on shelves might result in more sales.
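The supermarket example can be made concrete with a minimal sketch. It assumes the purchase data is available as a list of baskets, one per customer visit; the product names, the `baskets` list and the `count_pairs` helper are all hypothetical and serve only as illustration. Real analytics at the scale of thousands of customers would rely on dedicated tools, but the underlying idea of counting which products are bought together is the same.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase data: each inner list is one customer's basket.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "cereal"],
    ["bread", "butter", "cereal"],
    ["milk", "cereal", "bread"],
]


def count_pairs(baskets):
    """Count how often each pair of products appears in the same basket."""
    pair_counts = Counter()
    for basket in baskets:
        # De-duplicate and sort so that each pair is always counted
        # under the same key, regardless of the order of purchase.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return pair_counts


pair_counts = count_pairs(baskets)
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```

In this toy data set, bread and butter turn out to be the most frequent pair, which is the kind of pattern that might prompt a manager to place the two products closer together.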

As we shall see shortly, much of the data is generated in real-time and on a vast scale, as dictated by the sources involved. These may be multinational eCommerce websites with thousands of transactions per minute, or social media platforms with millions of users, with an aggregate of billions of posts a day.

Over the years, the concept of big data has evolved to include more characteristics beyond the three with which it was originally defined, resulting in five, eight or even more. For the purpose of this article, however, we shall focus on five characteristics, also known as the “Vs” or the 5V model. The 5V big data model is often depicted as a pyramid with Volume at its base and Value at its apex.

The first V is Volume. As individuals and businesses alike use digital tools in their activities, the amount of data generated has risen exponentially. It is easy to see that the data collected adds up to billions of data points a day.

The second V is Velocity: information has to flow as quickly as possible. One can visualise the importance of velocity by considering a traffic map that uses big data to recommend alternative routes in real time as the situation evolves. An app with a delay of a few minutes would be practically useless.

The third V is Variety, referring to the many types of data streams and their respective sources. Depending on the area of activity, many different sources are correlated to yield a complete picture. For instance, some countries adopted a mobile app to track and flag individuals' potential exposure to COVID-19 by combining medical status with GPS location and other data from their smartphones.

The fourth V is Veracity, referring to the quality of the data, that is, how accurately it represents the particular real-life context.

The fifth V, Value, sits at the apex of the pyramid. Here the insights are converted into decisions that in turn translate into tangible benefits. In addition to business intelligence analytics, the marriage of big data with AI technologies comes into play. These provide deviation detection, prediction of the probability of future outcomes and pattern recognition on volumes of data that would simply be impossible for humans to process unaided.
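To give a flavour of what deviation detection means in practice, here is a minimal sketch that flags values lying unusually far from the average. The sales figures, the `find_deviations` helper and the threshold of two standard deviations are assumptions chosen purely for illustration; production systems typically use far more sophisticated statistical or machine learning methods.

```python
from statistics import mean, stdev

# Hypothetical daily sales totals for one store; the last value is unusually high.
daily_sales = [102.0, 98.0, 101.0, 97.0, 103.0, 99.0, 100.0, 180.0]


def find_deviations(values, threshold=2.0):
    """Return the values lying more than `threshold` sample standard
    deviations away from the mean of the data."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # All values are identical, so nothing deviates.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


outliers = find_deviations(daily_sales)
print(outliers)
```

With these figures only the final day stands out, and a business might then investigate whether it reflects a promotion, a data entry error or something else entirely.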

Let us consider applications that already exist, as well as those we can expect in the near future, in two sectors: financial services and healthcare.

Financial services firms have been able to leverage information about their customers, including transaction data, banking records, credit histories, spending patterns and financial assets, to expand their businesses and build new targeted services based on the knowledge acquired from big data.

In healthcare, examples are treatment improvement and the tracking of outbreak progression. In the first, big data translates to more effective treatment at a lower cost. In the second, it helps predict outbreaks of future pandemics and identify clusters, reducing the impact of disease and human suffering.

The trend shows we can expect more organisations to recognise the value of the data they generate and thus embark on projects geared toward leveraging it. The resulting demand for experts in big data and associated fields can only be expected to grow.

This article was prepared by collating various publicly available online sources.