Anomaly detection helped a client predict anomalies 90 minutes ahead of time

Authored by Ameex Technologies on 06 Nov 2019

Our client is a leading cloud service provider for enterprise customers in domains such as banking, pharmaceuticals, and aviation. As enterprises demand 24/7 access to their services and data, reliability remains a challenge for cloud service providers everywhere. It’s not a matter of whether an outage occurs; it’s strictly a matter of when. The client’s objective was to predict outages in advance and identify their causes, so that corrective steps could be taken proactively.

Challenges

  • Need to predict outages:
  1. The prediction should be neither too alarmist nor too lax, i.e., it should be accurate and produce minimal false positives.
  2. The prediction should be made far enough ahead of time to allow corrective action.
  • Because the data consists of per-minute log files, the volume of data to handle is enormous.
  • Need a scalable model, as the client has many infrastructure devices, each customized to their business partners.
  • Each IP address has different parameters, regardless of the type of device. In other words, the parameters for servers might vary by IP address and client.
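
The two prediction requirements above (few false positives, enough warning time) can be made measurable. The following is a minimal sketch, not the client's actual evaluation: the function name `score_alerts`, the `AlertScore` fields, the matching rule, and the sample timestamps are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AlertScore:
    precision: float          # share of alerts followed by a real outage
    recall: float             # share of outages preceded by an alert
    mean_lead_minutes: float  # average warning time for the outages caught


def score_alerts(alert_minutes, outage_minutes, horizon=90):
    """Score alerts against outages; times are minutes since a common start.

    Assumed matching rule: an alert is a true positive if an outage
    starts within `horizon` minutes after it.
    """
    true_alerts = 0
    for a in alert_minutes:
        if any(0 < o - a <= horizon for o in outage_minutes):
            true_alerts += 1

    lead_times = []
    for o in outage_minutes:
        # The earliest alert inside the window gives the longest warning.
        leads = [o - a for a in alert_minutes if 0 < o - a <= horizon]
        if leads:
            lead_times.append(max(leads))

    precision = true_alerts / len(alert_minutes) if alert_minutes else 0.0
    recall = len(lead_times) / len(outage_minutes) if outage_minutes else 0.0
    mean_lead = sum(lead_times) / len(lead_times) if lead_times else 0.0
    return AlertScore(precision, recall, mean_lead)


# Made-up example: three alerts, two outages.
score = score_alerts(alert_minutes=[100, 400, 900], outage_minutes=[180, 950])
```

Low precision corresponds to an "alarmist" model, low recall to a "lax" one, and the mean lead time shows whether alerts arrive early enough to act on.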

Solution Methodology

  • Used exploratory analysis to identify the parameters that are significant in determining the performance of a device.
  • Historical data was used to obtain reliable input, with a data window large enough to limit the effects of seasonality and outliers.
  • Because the volume of data is enormous, deep learning models were used.
  • Predictive models were built for each device type to predict parameter values over time.
  • A notification system has been implemented to alert the client regarding the anomalies so that corrective actions can be taken.
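
The forecast-then-alert flow described above can be sketched as follows. This is an illustration under stated assumptions only: a simple least-squares trend line stands in for the deep learning models used in the project, and the names `fit_trend`, `minutes_until_breach`, `notify`, the device name, the limit, and the synthetic CPU series are all hypothetical.

```python
def fit_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept).

    Stand-in forecaster for illustration; needs at least two values.
    """
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def minutes_until_breach(values, limit, horizon=90):
    """Minutes until the forecast first exceeds `limit`, or None if it
    stays at or below the limit for the whole horizon."""
    slope, intercept = fit_trend(values)
    last = len(values) - 1
    for step in range(1, horizon + 1):
        if slope * (last + step) + intercept > limit:
            return step
    return None


def notify(device, parameter, minutes):
    """Build the alert text; a real system would send it to an on-call channel."""
    return f"ALERT {device}: {parameter} forecast to breach limit in {minutes} min"


# Synthetic per-minute CPU readings that climb steadily over one hour.
cpu = [50 + 0.5 * m for m in range(60)]
eta = minutes_until_breach(cpu, limit=95.2)
message = notify("server-01", "cpu_percent", eta) if eta is not None else None
```

Keeping the forecaster behind a single function means a per-device-type model could replace `fit_trend` without changing the alerting logic.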

Business Impact

  • The model was able to predict the parameter values with 81.6% accuracy.
  • The training time was reduced to a minimum by using transfer learning.
  • The model can predict anomalies 90 minutes ahead of time, enabling the client to take corrective actions.

Want to learn more about how we have helped our clients with their analytics journey? Let’s connect!