Part Human, Part Machine Learning, All Powerful Analyst

Lab Overview

In this lab we will show you how to take advantage of some of the analysis tools within Adobe Analytics to better understand and draw insights from your data. We will focus on Anomaly Detection, Contribution Analysis, and Intelligent Alerts to identify unexpected changes in your data, understand what may be causing those changes, and configure alerts so that you can be notified quickly about the trends you most care about.

Key Takeaways

  • How to run Anomaly Detection, which will automatically identify departures from your usual patterns
  • How to run Contribution Analysis, which can point to likely causes of anomalies in your data
  • How to configure Intelligent Alerts, which can notify you when there are significant changes in your data

Prerequisites

  • All of the tools we will be exploring are capabilities within Adobe Analytics
  • Terminology
    • Metric: a numeric value, such as Page Views, Visits, or Average Time Spent
    • Dimension: non-numeric values such as Region, Browser, or Channel. Each individual value of a Dimension (e.g., Chrome vs. Safari) is referred to as an Element, or Dimension-element. The Contribution Analysis portion of this lab will focus on how Metrics such as Visits break down across different Dimension-elements.
    • Segment: a set of rules based on Metrics and/or Dimensions used to identify a sub-group. For example, Region = US and Time Spent > 1 minute.

Lesson 1 - Anomaly Detection

Objectives

  1. Learn when Anomaly Detection is run automatically and how to access detailed results
  2. Understand the output generated by Anomaly Detection
  3. Understand how the training data (time window selection) affects Anomaly Detection results

Lesson Context

The inspiration for Anomaly Detection came from one of our product managers (John) while he was working in Adobe's consulting group. A client called and reported that the conversion rate on their website had fallen by 0.8%. John first needed to know whether that kind of change was actually unusual for the client. After many hours of pulling together data and writing code for analysis, he concluded that the change was statistically (and practically) significant. The intent with Anomaly Detection is for everything John did to now happen automatically for you whenever you're looking at data in Analysis Workspace.

When you create a new table in Analysis Workspace, it can be difficult to quickly digest the data and notice where the biggest changes occur. When you do see an uptick or downturn in the data, your first question is usually something like "Does this happen every week? Every year?" The Anomaly Detection algorithm learns the seasonal patterns in your data and will automatically account for them when identifying anomalies.

Algorithm Background

Anomaly Detection fits several time series models to the 35 days preceding your selected time window. It then compares how well each model represents the data within these 35 days and selects the model with the best fit. That model is used to produce a one-step-ahead forecast range for each point within your window. When a point falls outside its forecasted range, it is flagged as an anomaly.

  • The algorithm fits separate models for weekdays and weekends, enabling it to capture large differences in weekend behavior without triggering anomalies.
  • Each model will also use data from the previous year, if available. Annual data is shifted by one day (or two for leap years) so that Friday of this year is compared to Friday of last year.
  • The default confidence for forecasting is 99% for daily data, to reduce false positives. This can't be changed for automatic jobs, but we will see how to configure the confidence level for automated alerts later.
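
To make the band-and-flag step concrete, here is a minimal Python sketch. This is not Adobe's implementation - the production service fits and compares several time series models, splits weekdays from weekends, and folds in prior-year data - but a simplified stand-in that forecasts each point as the training-window mean and flags anything outside an approximate 99% band.

```python
import numpy as np

def flag_anomalies(training, window, z=2.576):
    """Flag points in `window` that fall outside a forecast band built
    from `training` (e.g., the 35 days preceding the window).

    Simplified stand-in: the real service selects among several fitted
    time-series models, while here the forecast for every point is just
    the training mean. z = 2.576 approximates a 99% two-sided band
    under a normality assumption.
    """
    training = np.asarray(training, dtype=float)
    center = training.mean()
    halfwidth = z * training.std(ddof=1)
    return [(i, x) for i, x in enumerate(window) if abs(x - center) > halfwidth]

# 35 days of stable traffic, then a window containing one clear spike.
rng = np.random.default_rng(0)
train = rng.normal(1000, 50, size=35)
window = [980, 1020, 1500, 990]
print(flag_anomalies(train, window))   # [(2, 1500)]
```

Because the band is built entirely from the training window, the same point can be flagged or not depending on the dates you select - exactly the effect Exercise 1.2 below demonstrates.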

Data Background

In our examples we will take the role of an analyst on our first day at a brand-new clothing company called Luma. We know the company is young so we don't expect to see huge volumes of data yet, but we want to get in and start to get familiar with what's available.

Exercise 1.1

  1. Log in to your lab machine with the username provided and password Adobe2019!

  2. Start Google Chrome and click the Experience Cloud bookmark

    Figure 1: Bookmark

  3. Click Sign In with an Adobe ID and log in with the credentials provided.

  4. Click the grid in the top right, then click Analytics

    Figure 2: Grid

    Figure 3: Analytics

  5. Click Workspace in the top left banner then Create New Project > Create

  6. You should now have a blank project with an empty Freeform Table

    Figure 4: Freeform

  7. Change the date range in the top right to December 1, 2018 through January 31, 2019.

    Figure 5: Date Range

  8. Drag the metric Visits from the left bar into the metric drop zone of the Freeform Table

    Figure 6: Visits

    Figure 7: MetricDrop

  9. In the top right, you should briefly see Searching for anomalies...

    Figure 8: Searching

    Note: Anomaly Detection will only run when the Dimension in the table is a time dimension: Hour, Day, Week, or Month

  10. If there are no anomalies, this will soon change to No anomalies found

    Figure 9: NoAnomalies

  11. If any anomalies are found, they will be marked in the corner of the cell

    Figure 10: Anomaly

  12. To see details of the Anomaly Detection results, let's add a line visualization to our project. Click the bar graph icon in the top left to bring up visualization options, then drag Line toward the top of the table.

    Figure 11: BarGraphIcon

  13. You will get a line graph with anomalies highlighted, as well as a green shaded confidence region showing where the data was expected to be during this time window.

    Figure 12: LineGraph

Exercise 1.2

It is important to understand how the date selection of your table affects the results of Anomaly Detection. The algorithm uses the 35 days before your selection (as well as the previous year, if available) to train a model. Because of this, a data point that is anomalous under one selection may not be under another.

  1. To see an extreme example of this, change the time window to November 1 through January 31. We now have a lot of anomalies in our report, but most of the data doesn't look erratic.

    Figure 13: Anomaly Detection report for Nov 2018 - Jan 2019 with many anomalies

  2. If you see anomalies that aren't intuitive to you, especially many at once, it can help to take a look at the data Anomaly Detection is using to set its expectations. Let's now include October in the time range so we can see what happened that resulted in this graph.

    Figure 14: Anomaly Detection report for Oct 2018 - Jan 2019 showing empty training data

  3. Now we can see what the problem is - there is no data for October. With a flat line of zeroes in October, the activity in November looks anomalous in comparison.

    Note: We are releasing an update to Anomaly Detection in a few weeks that will help it recover more quickly from situations like this, but the results will still be dependent on the time window selected.

  4. Note the anomalies in early January (before the big spike), and let's return the window to just December and January.

    Figure 15: Anomaly Detection report for Dec 2018 - Jan 2019 showing only one anomaly in January

  5. Now that the November data is used as the training period, a more appropriate model is created that finds only one anomaly during January.

Lesson 2 - Contribution Analysis

Objectives

  1. Learn how to run Contribution Analysis on an anomaly of interest
  2. Learn how to interpret Contribution Analysis results
  3. Learn common pitfalls of Contribution Analysis reports and how to avoid them

Lesson Context

Returning to our example with John in consulting, you can guess what the client's next question was after John reported that the drop in their conversion rate was significant: "What caused the decrease?" This sent John back to his computer for an exhaustive search through the data to determine which specific products or areas of the website were responsible for the overall decline seen in the aggregate numbers. He eventually found one product in particular whose conversion rate had plummeted after a change in its product page.

Performing the kind of analysis John did for his client can be difficult, time-consuming, or both, as there are commonly hundreds or thousands of candidate variables in your data. Contribution Analysis scans across each element of every Dimension to find which of them most deviated from expected behavior on the same day as the anomaly. The list of dimensions at the top of a Contribution Analysis report can quickly narrow down your search by highlighting the characteristics most associated with the unexpected change in your metric.

Note: Contribution Analysis is not enabled for Hourly data. We will only be using the Daily granularity for this exercise.

Algorithm Background

Contribution Analysis performs a fairly simple statistical test for each dimension-element included in the report. The power of the Contribution Analysis service is that it can perform this test across tens of thousands of dimension-elements efficiently. The exhaustive search that Contribution Analysis provides can give you confidence that there is not some aspect of the data that you've overlooked.

Without going into the details of the math performed by Contribution Analysis (see the Contribution Analysis documentation if you want them), the service creates a simple table for each dimension that compares the day of the anomaly to the 30 days prior. Any elements with large changes in their representation are likely causes of the anomaly. As a simple example, consider the following table of website visitors:

| Age | Control Period | Anomaly Day |
| --- | --- | --- |
| < 40 | 1,000 | 100 |
| 40 - 60 | 800 | 80 |
| > 60 | 200 | 120 |
| Total | 2,000 | 300 |

It may be clear from the raw numbers that we have more visitors from the > 60 group than expected on the anomaly day (or fewer from the other groups). To quantify this, let's rewrite the table in terms of proportions:

| Age | Control Period | Anomaly Day | Difference | Contribution Score |
| --- | --- | --- | --- | --- |
| < 40 | 50% | 33% | -17% | - |
| 40 - 60 | 40% | 27% | -13% | - |
| > 60 | 10% | 40% | +30% | + |
| Total | 100% | 100% | 0% | |

We can now clearly see that the percentage of the population above 60 increases dramatically on the day of the anomaly, while the other percentages decrease proportionally. This results in a positive Contribution Score for the > 60 element, while the < 40 and 40 - 60 elements receive negative Contribution Scores.
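
The proportion comparison above is easy to reproduce. The Python sketch below is illustrative only - it is not the exact statistic Adobe computes - but it shows how the shift in each element's share of the metric carries the sign of its Contribution Score.

```python
# Shares of the metric per age group in the control period vs. the anomaly
# day, using the counts from the first table above.
control = {"< 40": 1000, "40 - 60": 800, "> 60": 200}
anomaly = {"< 40": 100, "40 - 60": 80, "> 60": 120}

control_total = sum(control.values())   # 2,000
anomaly_total = sum(anomaly.values())   # 300

for age in control:
    control_share = control[age] / control_total
    anomaly_share = anomaly[age] / anomaly_total
    shift = anomaly_share - control_share   # sign matches the Contribution Score
    print(f"{age:>7}: {control_share:4.0%} -> {anomaly_share:4.0%} (shift {shift:+.0%})")
```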

What we don't see in this table is how the total of 300 on the day of the anomaly compares to what was expected. The sign of the score only indicates whether the element made up a larger or smaller percentage of the anomaly day than expected. To complete the picture and shape our understanding of what happened on the anomaly day, we need the context of the expected total.

If the expected amount was 250, for example, following the 50-40-10 distribution of the Control Period, we would expect only 25 visitors from the > 60 group, but we had 120. This seems like the main cause for our increase in traffic, even offsetting lower than expected traffic from the other age groups.

| Age | Control Period | Anomaly Day | Expected on Anomaly Day | Difference from Expected |
| --- | --- | --- | --- | --- |
| < 40 | 1,000 | 100 | 125 | -25 |
| 40 - 60 | 800 | 80 | 100 | -20 |
| > 60 | 200 | 120 | 25 | +95 |
| Total | 2,000 | 300 | 250 | +50 |

If we had expected 1,000 visitors, however, we now have a drop of 700 visitors to explain, and we'll interpret the table much differently. The > 60 group is only slightly above expectation (120 actual versus 100 expected, following the 10% control proportion), while the other two groups fall substantially below theirs. The anomaly is then due to a large drop in the < 40 and 40 - 60 categories.

| Age | Control Period | Anomaly Day | Expected on Anomaly Day | Difference from Expected |
| --- | --- | --- | --- | --- |
| < 40 | 1,000 | 100 | 500 | -400 |
| 40 - 60 | 800 | 80 | 400 | -320 |
| > 60 | 200 | 120 | 100 | +20 |
| Total | 2,000 | 300 | 1,000 | -700 |
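
The arithmetic behind both tables is the same: apply the control period's 50-40-10 split to the expected total, then compare with the actual anomaly-day counts. Here is a short sketch of that calculation, run for both hypothetical expected totals:

```python
# Expected counts under the control period's 50-40-10 split, compared with
# the actual anomaly-day counts, for both hypothetical expected totals.
control_share = {"< 40": 0.50, "40 - 60": 0.40, "> 60": 0.10}
anomaly = {"< 40": 100, "40 - 60": 80, "> 60": 120}

for expected_total in (250, 1000):
    print(f"expected total = {expected_total}")
    for age, share in control_share.items():
        expected = share * expected_total
        diff = anomaly[age] - expected
        print(f"  {age:>7}: expected {expected:5.0f}, actual {anomaly[age]:4d}, diff {diff:+5.0f}")
```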

One intent of this example is to illustrate that whether your anomaly was above expected (spike) or below expected (dip), you may see both positive and negative Contribution Scores in the output. Your interpretation of the scores may change when you are looking at a spike versus a dip, however. For spikes, the largest positive score should usually be your starting point, while for dips you will usually want to start on the negative side.

The main thing to remember from this example is that a positive score means an element was responsible for a larger percentage of your metric on the anomaly day than during the control period, while a negative score means the percentage on the anomaly day is lower.

Exercise 2.1

Now let's run a Contribution Analysis report. You're curious what happened with Luma's site on December 10 that resulted in such a relatively large spike in the number of visits.

  1. Select the anomaly on December 10 in the line chart and click Analyze, then Run Contribution Analysis.

    Figure 16: Analyze

    Note: You can also run Contribution Analysis on days not identified as an anomaly by right-clicking on any row in the table.

    Figure 17: RunCANoAnomaly

  2. You should see the Contribution Analysis panel appear above your Freeform Table. Click Run Contribution Analysis.

    Figure 18: Run CA

  3. A lot of elements are being examined in Contribution Analysis, so we need a minute or two (occasionally much longer for really large data sets) to run the job. You should see a progress bar after starting the job.

  4. When the job is complete, you will see the anomaly metric in the top left and the trend of the metric in the top right. These help us remember the context of the anomaly we're trying to understand in the report.

    Figure 19: CA Report

  5. Below the anomaly and trend, we see a list of Top Items. These are the dimension elements most associated with the anomaly, based on a comparison of their usual contribution to the metric with their contribution on the anomaly day.

    Figure 20: Top Items from the Contribution Analysis report

    Note: Contribution Analysis looks at the 30 days prior to the day being analyzed, not the date range selected when the anomaly was identified.

  6. Contribution Scores are normalized so that the highest score is always 1.00 (or -1.00). The intent of normalization is to let you quickly scan the list of top items and see how the other elements compare to the strongest relationship found. When the source of an anomaly can be clearly isolated, there should be a rapid decline in the scores. In this report, however, the top 50 scores decrease smoothly from 1.00 to 0.50. When so many entries in the table are this close to 1.00, it usually means that none of the contributions found were particularly strong: all of these variables made weak contributions to the anomaly, so their scores are roughly the same. It's rare for multiple distinct variables to make significant contributions of approximately the same magnitude.

    Note: The exception to this is when several elements are strongly correlated, or even the same. We have this to some extent in this table - there are several variables that seem to have the same data: Marketing Channel: Email and Marketing Channel: emailLaunch, for example.

  7. Since we aren't seeing much to help us understand the spike, we ask around at Luma and find out that the web traffic so far is all just testing of the new site, and the increase on December 10 was a group of engineers making sure the site could handle future expected loads.

  8. Let's look for an anomaly where we can get more interesting results. Close the current Contribution Analysis Report by clicking the 'X' in the top right corner, then go to the anomaly on January 20, click Analyze, and run Contribution Analysis again.

    Top Items for Visits Anomaly on January 20 Figure 18: January 20

  9. For this anomaly, the story is a bit clearer. We see Virginia as the top two elements, followed by four items related to the women's section of our site. After these, the scores start to decrease fairly quickly, with a steep drop-off after Countries: United States. It looks like something happened in Virginia related to our women's product line. After asking around, you find out that the company held a promotional event near Washington DC focused on women's clothing. Without knowing the schedule, Contribution Analysis pointed us right to the explanation for the spike in our data, including information about where on our site the activity occurred and where the traffic was coming from. Depending on the array of dimensions available in your data, Contribution Analysis can highlight many aspects of the drivers of an anomaly.

  10. Let's look at the table in a little more detail. Because we were looking at the metric Visits in our Freeform Table, the Top Items table automatically shows us the number of visits associated with each of the elements. Virginia, for example, was responsible for 257 visits out of the 268 total on the anomaly day. The table also includes the number of Unique Visitors, to help us understand how much duplication we had across the visits recorded. In this case there was little overlap - our 257 visits were performed by 228 unique visitors. An interesting example in this table is Marketing Channel: Direct, which has more associated visits than Virginia, yet has a lower Contribution Score because it was already a common element in our data before the anomaly.

  11. The totals at the top of the table for Visits and Unique Visitors are not the number of visits on the anomaly day (or over any other period of time). They follow the pattern of Freeform Tables and show the column sums. The percentages in each row are relative to the column total, not to the number of visits or visitors on the anomaly day.

  12. For this report, let's also look at the Generated Segments table. The report attempts to automatically create segments that are responsible for the anomaly by combining elements from the Top Items table. You can see details of each segment by hovering over the segment name (Contribution Segment 1) and clicking the info icon.

    Figure 22: CA Segments

    Figure 23: CA Segment Info

  13. Clicking the info for a segment brings up a small window showing the variables used to create the segment. In addition, you can see how many unique visitors, visits, and page views this segment accounts for from the last 90 days. For example, our first segment captures 257 unique visitors out of the total 1,337 we've had on the site over the last three months.

    Figure 24: CA Segment Details

  14. Let's make our segment public and take a look at activity in the segment over December and January. Click Make public at the bottom of the segment detail window, then scroll down to your Freeform Table. You should now see Contribution Segment 1 on the side panel at the top of your Segments.

    Figure 25: Public Segment

  15. Drag Contribution Segment 1 to the Segment drop zone of your Freeform Table to get a visualization of hits belonging to the segment in December and January.

    Figure 26: Segment Drop

  16. We can see no activity for this segment before the January 20 spike, although there are some residual hits in the few days following the event.

    Figure 27: Segment Only

Exercise 2.2

  1. There are a few reasons you may want to exclude a variable from Contribution Analysis:

    • Jobs are taking a long time and you'd like to speed them up. Especially if you have a set of the usual suspects, you can exclude all other variables to have Contribution Analysis quickly scan over those dimensions you're most interested in.
    • There are variables you know are not legitimate contributors but are still likely to have a high contribution score. Examples are time-related variables that are related to the anomaly period by definition, but don't provide any insight into the cause of the anomaly.
    • You have previously run a report that contained several variables that were essentially the same (as in one of our examples), and you'd like to see the report without duplication. In this case, you can manually exclude all but one dimension from each group you identify to get a cleaner report.

    Note: Your company may have a limit on the number of Contribution Analysis jobs you can run each month. Be sure to understand your limit, and be careful about running multiple versions of the same report if you have one.

  2. To see the list of currently excluded dimensions, start a new Contribution Analysis report and click Dimensions.

    Figure 28: Excluded List

  3. We default to an exclusion list that covers dimensions that are problematic for most customers, which should be a useful starting point. To re-include a dimension in the analysis, click the X next to its name. To exclude a new dimension, click out of the excluded list, then drag a dimension from the left side bar (make sure the Components icon is selected) into the dotted box around Excluded Dimensions. After changing the list, you can also set it as a new default.

    Figure 29: Exclude Dimension

    Note: For those with experience in statistical modeling, it is useful to understand that while Contribution Analysis is capable of looking at many dimensions, the calculations it performs are univariate. This means that excluding (or including) one dimension will have no impact on the Contribution Score of another dimension, unless the highest-scoring dimension is changed in the result. Another way of saying this is that the ratio between the scores of any two dimensions will always be the same, regardless of which other dimensions are included in the report (the sketch at the end of this exercise illustrates this).

  4. To see why these variables are excluded, let's look at a specific example. Open the list of excluded variables and re-include Day, then run the report again. You will now see January 20 at the top of the list, because that day is, by definition, unique to the anomaly on January 20. The default exclusion list covers time dimensions like this that jump unfairly to the top of the list, as well as dimensions that we assume will be uninteresting to most users when looking for the causes of anomalies (e.g., Monitor Resolution).

    Figure 30: Contribution Analysis report with the Day dimension included

    Note: One drawback of the normalized scores in Contribution Analysis is that it can be difficult to compare scores between reports. One way around this is to intentionally include Day as a dimension to give you a common benchmark across your reports. The Day element will generate the highest contribution score possible, so you can anchor your other scores to that reference point.
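
Here is a small Python sketch of the univariate point from the note above, using made-up raw scores (Adobe does not expose raw scores; the numbers and element names below are purely illustrative). Normalization divides every score by the largest magnitude, so removing a dimension rescales the rest without changing the ratios between them, unless the removed dimension was the one anchoring the score of 1.00.

```python
# Hypothetical raw scores; only their relative sizes matter here.
raw = {"Cities: Virginia": 8.0, "Site Section: Women": 6.0,
       "Marketing Channel: Direct": 2.0, "Day: Jan 20": 20.0}

def normalize(scores):
    top = max(abs(v) for v in scores.values())
    return {k: round(v / top, 2) for k, v in scores.items()}

print(normalize(raw))
# Day anchors at 1.00; Virginia scores 0.40, Site Section 0.30, Direct 0.10.

without_day = {k: v for k, v in raw.items() if not k.startswith("Day")}
print(normalize(without_day))
# Virginia now anchors at 1.00, the others rescale to 0.75 and 0.25,
# but every ratio (e.g., Virginia vs. Site Section) is unchanged.
```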

Lesson 3 - Intelligent Alerts

Objectives

  1. Learn how to create an Alert
  2. Understand the alerting options available for simple alerts
  3. Work through an example of a stacked alert

Lesson Context

Back to our consultant John: after he helped his client identify the cause of their problem, they immediately asked whether they could receive notifications when something similar happened in the future. The Alerts feature now offers flexible options for configuring notifications specific to your interests.

Exercise 3.1

  1. Hover over Components in the top rail and click Alerts.

    Figure 31: Alerts

  2. Click Create new alert. You will first have several options for configuring the alert. Make sure to enter a title for the alert, or you will not be able to save it.

    Figure 32: Alert Config

  3. The alert itself is defined below, in the Send an Alert When box. Begin by dropping the metric Visits into the Metrics drop zone. You will then have options to determine when an alert on Visits should fire.

    Figure 33: Alert Options

  4. The options are intended to be as descriptive as possible in a few words, but some additional detail for each is provided below.

    | Alert Type | Description | Option to Specify |
    | --- | --- | --- |
    | anomaly exists | alerts will fire for all anomalies | confidence level used to determine anomalies |
    | anomaly is above expected | alerts will fire only for anomalies higher than usual | confidence level used to determine anomalies |
    | anomaly is below expected | alerts will fire only for anomalies lower than usual | confidence level used to determine anomalies |
    | is above or equals | the data point is greater than or equal to a certain value | the threshold to exceed |
    | is below or equals | the data point is less than or equal to a certain value | the threshold to be below |
    | changes by % | the data point is a certain percentage above or below the previous data point | the percentage change |
  5. In addition to these options, alerts can be filtered by any number of Segments. To continue with our example from Contribution Analysis, let's say we're interested in knowing about unusual traffic from Virginia to the women's section of our site again. Drag Contribution Segment 1 to the Segment drop zone of the alert.

    Figure 34: Alert Segment

  6. After configuring your Alert, click Save and you should see your new Alert appear in a list.

    Figure 35: Alert List

Exercise 3.2

You probably noticed when creating your Alert that the Preview pane shows how many times your Alert would have triggered recently.

Figure 36: Alert Preview

The Preview pane is useful for avoiding an overload of alerts, and for setting up alerts where you hope to be notified roughly once a week or once a month. Experimenting with the confidence threshold can often get you to the alerting frequency you are aiming for.

Note: If you need a preview over a longer time horizon, go back to Analysis Workspace and visualize the time period you're interested in, as we did in the first exercise.

What do you do for Alerts that you hope only happen rarely? Often what you want to see in the Preview pane is 0 triggers, because you only want to be alerted when something is really wrong (or different). Even at 99% confidence, however, anomalies are not especially rare. Large percentage changes can also happen frequently as part of a 24-hour or weekly cycle. To find truly rare events, you can combine these two alerts.
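
The logic of stacking the two rules can be sketched outside the product. Using the same simplified anomaly test as the Lesson 1 sketch (a mean-and-band stand-in, not the production algorithm), the following Python flags a day only when it is both an anomaly at roughly 99% confidence and more than 100% away from the previous day:

```python
import numpy as np

def stacked_alert_days(series, z=2.576, min_change=1.0, lookback=35):
    """Indices where BOTH rules fire: the point sits outside a z-sigma band
    around the previous `lookback` days AND it changed by more than
    `min_change` (here 100%) versus the previous point."""
    fired = []
    for i in range(lookback, len(series)):
        hist = np.asarray(series[i - lookback:i], dtype=float)
        is_anomaly = abs(series[i] - hist.mean()) > z * hist.std(ddof=1)
        prev = series[i - 1]
        big_change = prev != 0 and abs(series[i] - prev) / abs(prev) > min_change
        if is_anomaly and big_change:
            fired.append(i)
    return fired

rng = np.random.default_rng(1)
visits = list(rng.normal(1000, 80, size=90))
visits[60] = 2600   # a genuine event: anomalous AND a >100% jump
# The anomaly rule alone fires on roughly 1% of ordinary days by chance;
# requiring a 100% day-over-day jump as well filters those out.
print(stacked_alert_days(visits))   # [60]
```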

  1. Start a new alert. If you saved the prior Alert, you can now do this with the + Add button above your Alert list. Enter a new title, then drag Visits into the Alert and select anomaly exists with a 99% confidence threshold.

  2. Click the gear just outside the alert definition box and click Add Rule.

    Figure 37: Add Rule

  3. Between the two rules, select AND.

    Figure 38: And

  4. Drag Visits into the second pane as well. This time, choose changes by % with a value of 100.

    Figure 39: Dual Alert

This Alert will only fire when both conditions are met, so when you are alerted you can be more confident that something truly unexpected has happened.

Note: For this example we intentionally selected Visits for both sides of the rule. You can instead combine different Metrics to get the exact combination of events you want to watch out for. You can also add any number of rules, but it's not possible to mix AND and OR rules.

Next Steps

Thanks so much for attending. Please go try the services that were new for you and get more familiar with them. Here are a few ideas for where you can start:

  1. If you haven't noticed it before, make sure Anomaly Detection runs for you. If you don't see it start automatically in a Freeform Table while logged in under your own account, ask your admin whether they can enable it for you.

  2. Check on your access to Contribution Analysis tokens and run some Contribution Analysis jobs, if you're able. Try to start with an event in the past you've already analyzed so you can see how the Contribution Analysis results compare to your findings.

  3. Set up an Alert on one of the metrics you follow. Create your initial alert to intentionally fire soon so you can verify that you've created it correctly and have confidence that Alerts is working for you.

In general, try all of these services on your own data and see where they're useful for you. Reach out to me directly at challis@adobe.com with any questions or with recommendations for types of analysis you wish you could do with Analytics. I would love to hear directly from you the kinds of problems you're facing and capabilities we could provide that would solve them.

Additional Resources

Experience League: a new enablement program with guided learning, one-to-one expert support, and a thriving community of fellow professionals designed to help you get the most out of Adobe Experience Cloud.

Community Forums: Get answers to questions, discuss topics with your peers, and vote on features you want to see in the future.

Appendix

The Virtual Analyst documentation provides further details on all of the services covered today.