Early Warning Systems

May 31, 2020

How Daisho took a 600-page dashboard and automatically extracted the most important information for business users to act upon.

The Problem Statement

Which product in which market is likely to be a problem for us? What factors are driving this judgement? And what should we be doing?

One of India’s largest CPG companies captures data from 14 different sources, internal as well as syndicated. At its deepest, the data is at SKU+State level. All of this data was being dashboarded and color-coded. However, it turned out to be so much information that the team was flooded and unable to do much analysis.

The Challenges

The 14 data sources arrive at varying frequencies and with varying lags. For example, Nielsen is monthly, while internal sales data is daily. Some syndicated data comes in 15 days after the month-end, while other sources come in 45 days later. Internal data, on the other hand, is near real-time.

In addition, there’s the question of resolution. Internal metrics track at SKU level with a great degree of accuracy, while most external data sources are at Brand or Variant level.
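To make the resolution gap concrete, here is a minimal sketch of one common way to bring daily SKU-level data down to the coarser Brand+State monthly grain that most external sources use. It is an illustration, not a description of Daisho's pipeline: the `SKU_TO_BRAND` mapping, the row layout, and all the figures are made up for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical SKU -> brand mapping; real master data would come from the client.
SKU_TO_BRAND = {"SKU-001": "BrandA", "SKU-002": "BrandA", "SKU-003": "BrandB"}


def roll_up_to_brand_month(daily_rows):
    """Aggregate (day, sku, state, units) rows to (year, month, brand, state) totals."""
    totals = defaultdict(float)
    for day, sku, state, units in daily_rows:
        key = (day.year, day.month, SKU_TO_BRAND[sku], state)
        totals[key] += units
    return dict(totals)


# Made-up daily internal sales at SKU+State level.
daily_sales = [
    (date(2020, 3, 1), "SKU-001", "Kerala", 120.0),
    (date(2020, 3, 2), "SKU-002", "Kerala", 80.0),
    (date(2020, 3, 2), "SKU-003", "Kerala", 50.0),
    (date(2020, 4, 1), "SKU-001", "Kerala", 130.0),
]

monthly_brand_sales = roll_up_to_brand_month(daily_sales)
```

Once both internal and syndicated series sit on the same (month, brand, state) key, they can be compared directly, while the SKU-level detail stays available for drill-down.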

The big asks:

1. Can we get a progressive system, which can throw out alerts as soon as the first data comes in, and these alerts can get corroborated/refined as other data comes in?

2. How does one correlate across these 14 data sources when they track at vastly varying resolutions and frequencies? Can a single system take care of this?

One Rule DOES NOT Fit all

The next set of challenges was to figure out how one defines a problem. 10% growth could be great for a 10-year-old brand, but very bad for a 6-month-old brand. Given the number of brands and variants, it was next to impossible for the team to formulate a problem definition for each variant.
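A small worked example shows why a single threshold fails. The sketch below scores the same 10% growth figure against each variant's own history, using a plain z-score. The growth histories are invented for illustration, and the z-score is only a stand-in for whatever baseline a real system would learn.

```python
from statistics import mean, stdev


def growth_z_score(history, latest):
    """How many standard deviations `latest` sits from the variant's own past growth."""
    return (latest - mean(history)) / stdev(history)


# Illustrative, made-up growth rates (in %).
mature_brand = [2.0, 3.0, 1.0, 2.0, 3.0, 1.0]        # 10-year-old brand
young_brand = [38.0, 42.0, 40.0, 44.0, 36.0, 40.0]   # 6-month-old brand

# The same 10% growth reads very differently against each history.
z_mature = growth_z_score(mature_brand, 10.0)  # far above its usual range
z_young = growth_z_score(young_brand, 10.0)    # far below its usual range
```

The mature brand's score is strongly positive and the young brand's is strongly negative, so the same number is good news for one and a warning for the other.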

The asks:

3. The solution should automatically also discover problem definitions. The solution should be able to automatically differentiate between fast-growth and stable variants, and propose actions accordingly.

4. The solution should also handle seasonality and cyclicality automatically. It should be smart enough to not just compare month-on-month, but also take into account seasonal patterns.

OK. Metric X is bad. What next? What is the implication?

The third set of challenges was two-fold: tie up all anomalies into coherent patterns, and then make playbooks for each pattern that help management take decisions quickly. Patterns come from data, and playbooks come from management’s understanding of the business, which translates patterns into actions. But businesses always evolve, and so static rules and playbooks lose relevance. We have seen clients lose opportunities to react because changes and their impacts were identified as late as nine months after the fact!

The asks:

5. What anomalies normally occur together, both in the same time frame and across time frames? The solution needs to be able to group anomalies into coherent patterns.

6. However, the solution should be flexible enough to keep learning new patterns as the business evolves.

I got the data a month back. I am getting analysis results today. It’s TOO LATE!!!

The last set of challenges was on speed. You can never wait too long to act in a fast-moving and evolving market.

The ask:

7. We need results ASAP.

THE SOLUTION

How Daisho Did it

Daisho tackled all these questions by framing and solving them as two tough machine learning problems:

1. A flexible, scalable, automatic alert-identification algorithm for all metrics at all granularities (running into more than 750K separate time series!).

Daisho’s modelling API built more than 750,000 time series models in less than an hour, providing robust baselines after adjusting for product/SKU and market characteristics -- completely automatically:

  • Is there a 3-month seasonality, a 6-month seasonality, or a combination of the two? This was identified automatically using Fourier components.
  • Is the time series driven by a short-term 3-month trend, or is it more correlated with just last month’s data? This was identified automatically by running multiple models and choosing the best one.
  • Once a time series was modelled, the model did not just predict an expected future value but also the uncertainty around it, since the models are backed by fairly sophisticated Bayesian methods. Any time a probabilistically rare value is observed, i.e. a value outside the range the model considers plausible, it is flagged as an alert.
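The first two checks can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than Daisho's modelling API: `dominant_periods` uses a simple discrete Fourier transform to rank cycle lengths, and `pick_best_model` compares two toy forecasters (last month's value versus a 3-month average) by one-step-ahead error. The function names, the candidate models, and the synthetic series are all hypothetical.

```python
import cmath
import math


def dominant_periods(series, top_k=2):
    """Return the top_k cycle lengths (in samples) with the largest DFT power."""
    n = len(series)
    avg = sum(series) / n
    centered = [x - avg for x in series]
    power = {}
    for k in range(1, n // 2 + 1):  # k = 0 is the mean, so skip it
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power[n / k] = abs(coeff) ** 2
    return sorted(power, key=power.get, reverse=True)[:top_k]


def last_value(history):
    """Forecast next month as equal to last month."""
    return history[-1]


def three_month_mean(history):
    """Forecast next month as the average of the last three months."""
    return sum(history[-3:]) / 3


def pick_best_model(series, candidates, warmup=3):
    """Choose the candidate with the lowest one-step-ahead mean absolute error."""
    errors = {}
    for name, forecast in candidates.items():
        abs_errs = [abs(forecast(series[:t]) - series[t])
                    for t in range(warmup, len(series))]
        errors[name] = sum(abs_errs) / len(abs_errs)
    return min(errors, key=errors.get)


# Synthetic monthly series: strong 6-month cycle plus a weaker 3-month cycle.
seasonal = [100 + 10 * math.sin(2 * math.pi * t / 6) + 4 * math.sin(2 * math.pi * t / 3)
            for t in range(36)]
periods = dominant_periods(seasonal)

candidates = {"last_value": last_value, "three_month_mean": three_month_mean}
best_for_trend = pick_best_model(list(range(20)), candidates)   # steady trend
best_for_noise = pick_best_model([10, 12] * 10, candidates)     # noisy, no trend
```

On the synthetic series the two strongest periods are 6 and 3 months. The steadily trending series is best served by last month's value, while the noisy flat series is better served by the 3-month average, which is the kind of per-series choice the text describes.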

When new data comes in, all the models re-update themselves, completely automatically!
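The alert-and-update loop can be illustrated with the simplest possible Bayesian model: a Normal baseline with known observation noise. This is a deliberately minimal sketch, not the production model, which per the text also handles seasonality and trend. The class name, the prior, the noise variance, and the 99% threshold (`z=2.58`) are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class NormalBaseline:
    mean: float       # current belief about the baseline level
    var: float        # uncertainty about that level
    noise_var: float  # assumed (known) observation noise variance

    def predictive_interval(self, z=2.58):
        """Range the next observation should fall in (~99% for z=2.58)."""
        half_width = z * math.sqrt(self.var + self.noise_var)
        return self.mean - half_width, self.mean + half_width

    def is_alert(self, value, z=2.58):
        """Flag values outside the predictive interval as probabilistically rare."""
        low, high = self.predictive_interval(z)
        return value < low or value > high

    def update(self, value):
        """Conjugate Normal update of the baseline after seeing a new value."""
        precision = 1 / self.var + 1 / self.noise_var
        self.mean = (self.mean / self.var + value / self.noise_var) / precision
        self.var = 1 / precision


baseline = NormalBaseline(mean=100.0, var=25.0, noise_var=4.0)
alerts = []
for observed in [101.0, 99.0, 102.0, 100.0, 85.0]:  # made-up metric values
    alerts.append(baseline.is_alert(observed))  # check against current belief
    baseline.update(observed)                   # then fold the new data in
```

Each new observation first gets checked against the current predictive range and then tightens the baseline, so the first four values pass and the drop to 85 is flagged. A real deployment would also need to decide whether flagged values should be allowed to move the baseline.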

2. An automatic way of combining two or more historical alerts into patterns, and ranking them by their quality, i.e. the level of confidence with which one can act on them.

Daisho’s pattern detection API looks through all combinations of alerts, current and recent, and ties them to extreme movements in metrics of business interest, e.g. market share or sales. It even measures how much more likely an extreme market share or sales movement becomes given the current situation. These patterns are the problems the business wants to focus on.

For example, one pattern picked up cases where stock at distributors is going down, which is potentially a sign of supply-chain disruption at local levels. Another picked up cases where items with less free grammage are dumped on distributors and are not getting picked up by retailers.
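One simple way to express "how much more likely" is lift: the rate of extreme movements when a combination of alerts is present, divided by the overall rate. The sketch below ranks single alerts and pairs by lift. It is a hedged illustration only; the alert names, the history, and the `min_support` cutoff are invented, and the actual pattern detection API may score patterns quite differently.

```python
from itertools import combinations


def pattern_lift(periods, min_support=2):
    """Rank alert combinations by lift.

    periods: list of (set_of_alert_names, extreme_move_followed) tuples.
    Returns (combo, support, lift) tuples, highest lift first.
    """
    base_rate = sum(extreme for _, extreme in periods) / len(periods)
    if base_rate == 0:
        return []  # no extreme moves observed, so lift is undefined
    alert_names = sorted(set().union(*(alerts for alerts, _ in periods)))
    results = []
    for size in (1, 2):  # single alerts and pairs
        for combo in combinations(alert_names, size):
            hits = [extreme for alerts, extreme in periods if set(combo) <= alerts]
            if len(hits) < min_support:
                continue  # too few occurrences to trust
            rate = sum(hits) / len(hits)
            results.append((combo, len(hits), rate / base_rate))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Made-up history: which alerts fired, and whether market share then moved sharply.
history = [
    ({"distributor_stock_down", "retail_offtake_down"}, True),
    ({"distributor_stock_down", "retail_offtake_down"}, True),
    ({"distributor_stock_down"}, False),
    ({"retail_offtake_down"}, False),
    ({"distributor_stock_down"}, True),
    (set(), False),
    (set(), False),
    (set(), False),
]

ranked = pattern_lift(history)
```

In this toy history the pair of alerts together has a higher lift than either alert alone, which is why grouping alerts into patterns can be more informative than reading them one at a time.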

The deployed solution answers ALL 7 asks listed above. Results come in within ONE hour of the data flowing in. Problems are automatically surfaced, interpreted, and presented to brand managers.