Averaging Error in Xtraction Dashboard Time Component Series During Average Calculation

Document ID : KB000046158
Last Modified Date : 14/02/2018

Question:

How should an average of averages be interpreted for a calculated series in Xtraction?

Answer:

The explanation below compares an average of averages with an overall average.

Consider the following screenshot as an example:

[Screenshot: Bug Averaging.jpg]

(1)          Shows January’s average and February’s average averaged together.  This is an average of averages and is, in most cases, an inappropriate calculation.

(2)          Shows the overall average calculated from both months’ numbers combined.  This is, most likely, your desired calculation.

It may be hard to see the fallacy in the logic because the numbers involved are so close to one another.  So, let’s look at the same example, but with more extreme numbers so that the problem is more clearly visible.

Assume the numbers looked like this:

 

 

Period                Group starts with X (S1)    Group does not start with X (S2)
Jan                   1                           1
Feb                   1                           1000
Combined (Jan+Feb)    2                           1001

 

Using the formula from the screenshot above, S2/(S1+S2)*100:

                January = 1/(1+1)*100 = 50 (S2 represents 50% of the month’s tickets)

                February = 1000/(1 + 1000) * 100 = 99.9 (S2 represents virtually all of the month’s tickets, so it is close to 100%)

Calculation (1) averages the two monthly percentages = (50+99.9)/2 = 74.95 (roughly 75)

Calculation (2) applies the formula to the combined totals for both months = 1001/(2+1001)*100 = 99.8

And, obviously, 75 <> 99.8.
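
The arithmetic above can be reproduced with a short Python sketch. This is only an illustration of the two calculations using the example numbers from the table; the names `counts`, `pct_s2`, `avg_of_avgs`, and `overall` are made up for this sketch and are not part of Xtraction.

```python
# Example ticket counts per month as (S1, S2), taken from the table above.
counts = {"Jan": (1, 1), "Feb": (1, 1000)}


def pct_s2(s1, s2):
    """Return S2 as a percentage of all tickets: S2/(S1+S2)*100."""
    return s2 / (s1 + s2) * 100


# Percentage for each month on its own.
monthly = {month: pct_s2(s1, s2) for month, (s1, s2) in counts.items()}

# Calculation (1): average the monthly percentages (average of averages).
avg_of_avgs = sum(monthly.values()) / len(monthly)

# Calculation (2): add up the raw counts first, then apply the formula once.
total_s1 = sum(s1 for s1, _ in counts.values())
total_s2 = sum(s2 for _, s2 in counts.values())
overall = pct_s2(total_s1, total_s2)

print(f"(1) average of averages: {avg_of_avgs:.2f}")  # 74.95
print(f"(2) overall average:     {overall:.2f}")      # 99.80
```

The difference comes from the order of operations: calculation (1) gives January and February equal weight even though February has far more tickets, while calculation (2) lets every ticket count once.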

In most cases, calculation (2) is the one you want.  Avoid the average of averages (calculation (1)) unless every month should carry equal weight regardless of its ticket volume.
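
To see why the two calculations differ, it may help to write them out in general form. The symbols here are introduced only for this sketch and do not appear in Xtraction: p_m is the percentage for month m, n_m is that month's total ticket count (S1 + S2), and M is the number of months.

```latex
% Per-month percentage and ticket volume (notation assumed for illustration)
p_m = 100 \cdot \frac{S2_m}{S1_m + S2_m}, \qquad n_m = S1_m + S2_m

% Calculation (1): every month gets the same weight
\text{average of averages} = \frac{1}{M} \sum_{m=1}^{M} p_m

% Calculation (2): every month is weighted by its ticket volume
\text{overall average}
  = 100 \cdot \frac{\sum_{m} S2_m}{\sum_{m} n_m}
  = \frac{\sum_{m} n_m \, p_m}{\sum_{m} n_m}

% Example values: n_Jan = 2, n_Feb = 1001
\frac{2 \cdot 50 + 1001 \cdot 99.9}{2 + 1001} \approx 99.8
```

The two results agree when every month has the same ticket volume; the more the volumes differ, the further the average of averages can drift from the overall average.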


Additional Information:

For more reading on this issue, which is closely related to Simpson’s Paradox, search online for “average of averages” or “Simpson’s Paradox”.