Data Science Applications

Article by Johann Mifsud, Executive at eSkills Malta Foundation

Data Science is a branch of information technology that deals with the analysis and processing of large volumes of data, which may be structured, unstructured or a mix of both, in order to find hidden patterns and derive meaningful information. Reasons for using data science include identifying market opportunities, optimising processes, reducing costs and detecting abnormal financial transactions, among others.

Data Science draws on several disciplines, including statistics, mathematics, software engineering, predictive analytics, data modelling and the development of machine learning algorithms. A typical project requires expertise from several of these areas, in proportions that vary from one project to another.

As technology finds its way into all our daily activities, the digital data trails we leave behind grow with it, be it at banks, retail outlets, hospitals and many other places.
Many organisations have realised that they are sitting on a veritable gold mine of data and that they can put it to good use.

The process of putting data to use in this way is quite intricate, but in general the following key steps are required:

Identification and evaluation of a business opportunity in data,
Collection and preparation of data for analysis,
Evaluation of alternative analytical models,
Selection and testing of an analytical model on a test data set,
Presentation of findings to decision makers,
Launch of the system into service with live data.
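
To make the model evaluation and testing steps above more concrete, here is a minimal Python sketch. It is an illustration only and is not taken from any real project: the data is synthetic, and the two candidate models (a mean baseline and a straight-line fit), the 80/20 split and all function names are assumptions chosen for brevity.

```python
import random
from statistics import mean


def fit_mean_model(train):
    """Baseline candidate: always predict the average target seen in training."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_linear_model(train):
    """Second candidate: least-squares line y = intercept + slope * x."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in train)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mean_absolute_error(model, rows):
    """Average size of the prediction error on the given rows."""
    return mean(abs(model(x) - y) for x, y in rows)


# Synthetic data (made up for this sketch): y is roughly 3x + 5 plus noise.
rng = random.Random(0)
data = [(x, 3 * x + 5 + rng.gauss(0, 2)) for x in range(100)]

# Hold back 20% of the rows as a test set the models never see in training.
rng.shuffle(data)
train, test = data[:80], data[80:]

# Fit every candidate on the training rows, then score it on the test rows.
candidates = {"mean baseline": fit_mean_model, "linear": fit_linear_model}
scores = {
    name: mean_absolute_error(fit(train), test)
    for name, fit in candidates.items()
}

# Select the candidate with the lowest error on the unseen test data.
best = min(scores, key=scores.get)
print(scores, "selected:", best)
```

The point of scoring on rows held back from training is that a model is judged on data it has not already seen, which is closer to how it will behave once the system goes live.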

Once the system is up and running, what comes out may sometimes be surprising. Let us illustrate such outcomes with the following examples.

In retail outlets such as supermarkets, planning managers put a lot of effort into careful product placement on shelves, as they know that placing pasta and sauces close to each other, for instance, entices customers to buy more products. Uncovering such placement opportunities therefore offers ample room to drive more sales and profits.

Based on this, a well-known retail chain decided to launch a data mining project on the retail data it held, covering several hundred stores and millions of retail transactions.
Surprisingly, among other insights, the data revealed a purchasing pattern: a statistically significant portion of the receipts that included diapers for newborns also included packs of beer.
It turned out that when fathers made the purchase following the birth of a child, they also included a treat for themselves!

Now imagine if managers could detect many such patterns and take steps to place products more strategically: for the chain, this would imply significant gains.
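
One common way to look for patterns like the diapers-and-beer one is to count how often pairs of products appear on the same receipt. The sketch below is a simplified illustration rather than the method the retail chain actually used: the receipts are invented, the function name `pair_rules` and the `min_support` threshold are assumptions, and only pairs of items are considered. For each pair it reports the support (share of receipts containing both items), the confidence (share of receipts with the first item that also contain the second) and the lift (how much more often the pair occurs than if the two items were bought independently).

```python
from collections import Counter
from itertools import combinations


def pair_rules(receipts, min_support=0.2):
    """Return (lhs, rhs, support, confidence, lift) tuples for item pairs."""
    n = len(receipts)
    item_counts = Counter()
    pair_counts = Counter()
    for receipt in receipts:
        items = sorted(set(receipt))  # ignore duplicates within one receipt
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # too rare to be worth acting on
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            lift = confidence / (item_counts[rhs] / n)
            rules.append((lhs, rhs, support, confidence, lift))
    # Strongest associations first.
    return sorted(rules, key=lambda r: r[4], reverse=True)


# Invented receipts, for illustration only.
receipts = [
    ["diapers", "beer", "milk"],
    ["diapers", "beer"],
    ["diapers", "beer", "bread"],
    ["milk", "bread"],
    ["pasta", "sauce", "bread"],
    ["pasta", "sauce"],
    ["milk", "bread", "pasta"],
    ["diapers", "milk"],
]

rules = pair_rules(receipts)
for lhs, rhs, support, confidence, lift in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift well above 1 suggests that two products are bought together more often than chance alone would explain, which is the kind of signal a planning manager could then test with a change in shelf placement.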

In another case, an auto insurance company applied data science techniques to curb insurance fraud while still delivering excellent service to honest customers. As one might expect, this is not easy: introducing lengthy customer checks can create a significant workload for the company and also annoy customers, the vast majority of whom are legitimate.

Thus, the insurance firm sought to use data science techniques to extract the key indicators of fraudulent claims.
Among the key findings was that a high percentage of the fraudulent claims occurred on recently opened policies and, more surprisingly, that such claims clustered around holiday weeks! Armed with this knowledge, the company could change its internal procedures to be in a better position to stop such claims from being paid, while keeping administrative overheads contained and legitimate customers happy.
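
A first look for indicators of this kind can be as simple as comparing the share of fraudulent claims across groups. The following sketch assumes a small set of already-investigated claims with made-up values. The field names (`policy_age_days`, `holiday_week`, `fraud`), the 90-day cut-off for a "new" policy and the helper `fraud_rate_by` are all hypothetical and are not drawn from the insurer's actual system.

```python
from collections import defaultdict


def fraud_rate_by(claims, key):
    """Share of fraudulent claims within each group returned by key(claim)."""
    totals = defaultdict(int)
    frauds = defaultdict(int)
    for claim in claims:
        group = key(claim)
        totals[group] += 1
        frauds[group] += int(claim["fraud"])
    return {group: frauds[group] / totals[group] for group in totals}


# Invented, already-labelled claims (illustrative values only).
claims = [
    {"policy_age_days": 20, "holiday_week": True, "fraud": True},
    {"policy_age_days": 45, "holiday_week": True, "fraud": True},
    {"policy_age_days": 60, "holiday_week": False, "fraud": False},
    {"policy_age_days": 30, "holiday_week": False, "fraud": True},
    {"policy_age_days": 400, "holiday_week": False, "fraud": False},
    {"policy_age_days": 800, "holiday_week": True, "fraud": False},
    {"policy_age_days": 1200, "holiday_week": False, "fraud": False},
    {"policy_age_days": 365, "holiday_week": False, "fraud": True},
]

# Assumption: a policy younger than 90 days counts as recently opened.
by_policy_age = fraud_rate_by(
    claims,
    lambda c: "new" if c["policy_age_days"] < 90 else "established",
)
by_holiday = fraud_rate_by(claims, lambda c: c["holiday_week"])

print("Fraud rate by policy age:", by_policy_age)
print("Fraud rate by holiday week:", by_holiday)
```

Groups with a markedly higher fraud rate are candidates for extra checks, which lets the company concentrate its scrutiny on a small share of claims instead of slowing down every customer.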

There is no shortage of examples such as the above, ranging from facilitating background checks for customers seeking credit from banks to healthcare applications that use data from wearable devices to monitor patients with chronic illnesses and prevent further health problems.

Data-driven decisions pay substantial dividends to organisations, so much so that many have teams entirely dedicated to this work. In turn, as these teams grow, those who either possess or are able to acquire data science skills are highly sought after, with demand for them increasing by the day.

This article was prepared by collating various publicly available online sources.