At AppSignal, we want to give you a picture of the performance of your application and alert you to anomalies, like high response times. We apply several calculations to your metric data to help you visually interpret and access this information.
In this article, we'll give you a crash course in mean and median calculations and then explain how AppSignal calculates percentiles so the next time you look at a performance dashboard, you can better interpret the data represented.
The mean is the average of a dataset. It is calculated by adding all of the numbers in the dataset together and dividing the sum by the number of values. Let's imagine we have an array of the following integers:
[10, 15, 25, 30, 30, 60, 75]
To calculate the mean, we sum our integers and divide the sum by their count, which is 7:
sum = 10 + 15 + 25 + 30 + 30 + 60 + 75
# => 245
mean = sum / 7
# => 35
This gives us a mean of 35.
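As a quick check, the same calculation can be sketched in Python. The variable names here are illustrative only, and `statistics.mean` comes from Python's standard library.

```python
import statistics

values = [10, 15, 25, 30, 30, 60, 75]

# Manual calculation: sum the values, then divide by how many there are.
total = sum(values)
manual_mean = total / len(values)

# The standard library performs the same calculation.
library_mean = statistics.mean(values)

print(total, manual_mean, library_mean)
```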
The mean is a useful statistic as it represents a benchmark of what is typical and makes it easier to identify atypical values, which would be far lower or higher than our mean value.
However, the mean is not foolproof. A few extremely high or low values can skew it, and in a large dataset the mean may not become noticeably high or low until most of the data points used to calculate it are high or low, which defeats the point of proactively monitoring your application.
The median is a data set's midpoint. How you calculate the median depends on whether the count of values in the dataset is odd or even:
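For an odd count of values, we order the data set and take the middle value. Our data set:
[25, 10, 15, 30, 60, 75, 30]
Sorted from lowest to highest, the middle (fourth) value, 30, is our median:
[10, 15, 25, 30, 30, 60, 75]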
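For an even count of values, we order the data set and take the mean of the two middle values. Our data set:
[25, 10, 15, 30, 60, 75]
Sorted from lowest to highest, the two middle values are 25 and 30:
[10, 15, 25, 30, 60, 75]
sum = 25 + 30
# => 55
median = sum / 2
# => 27.5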
This calculation gives us a median of 27.5.
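Both cases can be sketched in a few lines of Python. The `median` function below is a hypothetical helper written for illustration and assumes a non-empty list; Python's standard library offers `statistics.median` for the same purpose.

```python
import statistics

def median(values):
    """Return the middle value for an odd count, or the mean of the
    two middle values for an even count. Assumes a non-empty list."""
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_median = median([25, 10, 15, 30, 60, 75, 30])
even_median = median([25, 10, 15, 30, 60, 75])
print(odd_median, even_median)
```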
Like the mean, the median is a useful statistic as it represents a benchmark of what is typical and makes it easier to identify atypical values, which would be far lower or higher than our median value. As the median is calculated using the position of data in an ordered data set, it is less sensitive to extreme values and less easily skewed than the mean.
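To see the difference in sensitivity, here is a small sketch that compares both statistics before and after an extreme value is introduced. The `with_outlier` list is a hypothetical variant of our earlier dataset, made up purely for illustration.

```python
import statistics

typical = [10, 15, 25, 30, 30, 60, 75]
# Hypothetical variant: the largest value is replaced by an extreme outlier.
with_outlier = [10, 15, 25, 30, 30, 60, 750]

typical_mean = statistics.mean(typical)
typical_median = statistics.median(typical)
outlier_mean = statistics.mean(with_outlier)
outlier_median = statistics.median(with_outlier)

# The mean moves a long way; the median does not move at all.
print(typical_mean, typical_median)
print(round(outlier_mean, 2), outlier_median)
```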
While the mean and median can provide insights into your application's performance, a single summary value can fail to show you the times your application is not performing optimally.
To illustrate this point, let's visualize the response times of an API that is queried ten times per hour. First, the data:
# Response times for each request, made ten times per hour
# 13:00 - 14:00
[100, 100, 200, 200, 300, 300, 400, 400, 500, 1000]
# 14:00 - 15:00
[200, 2000, 200, 200, 300, 100, 100, 200, 400, 200]
# 15:00 - 16:00
[100, 100, 100, 200, 300, 100, 100, 200, 300, 200]
# 16:00 - 17:00
[100, 100, 100, 200, 300, 200, 200, 500, 1000, 500]
# 17:00 - 18:00
[1000, 100, 2000, 200, 100, 200, 100, 300, 100, 200]
# 18:00 - 19:00
[1000, 1000, 2000, 200, 100, 100, 100, 300, 100, 200]
# 19:00 - 20:00
[300, 100, 200, 200, 100, 400, 100, 200, 100, 100]
Now let's chart this data:
As you can see, if we only provided the mean as a statistic, we'd not notice some anomalous spikes where our API endpoint took significantly longer to respond.
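The same point can be checked numerically. The sketch below is illustrative only: it computes the hourly mean alongside the slowest response in each hour, using the data listed earlier. The dictionary layout and variable names are assumptions made for this example.

```python
import statistics

# Response times per hour, copied from the data listed earlier.
response_times = {
    "13:00": [100, 100, 200, 200, 300, 300, 400, 400, 500, 1000],
    "14:00": [200, 2000, 200, 200, 300, 100, 100, 200, 400, 200],
    "15:00": [100, 100, 100, 200, 300, 100, 100, 200, 300, 200],
    "16:00": [100, 100, 100, 200, 300, 200, 200, 500, 1000, 500],
    "17:00": [1000, 100, 2000, 200, 100, 200, 100, 300, 100, 200],
    "18:00": [1000, 1000, 2000, 200, 100, 100, 100, 300, 100, 200],
    "19:00": [300, 100, 200, 200, 100, 400, 100, 200, 100, 100],
}

hourly_mean = {hour: statistics.mean(times) for hour, times in response_times.items()}
hourly_max = {hour: max(times) for hour, times in response_times.items()}

for hour in response_times:
    print(f"{hour}  mean={hourly_mean[hour]}  max={hourly_max[hour]}")
```

Between 14:00 and 15:00, for instance, the mean is 390 while one request took 2000, a spike that the mean alone does not reveal.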
That's why AppSignal uses percentiles to provide more thorough insights into your application's behavior.
While the median acts as the halfway marker of a dataset, percentiles allow us to be much more targeted in what metric data we wish to show.
For example, if we look at the 95th percentile, 95% of the values in our dataset fall at or below it. Using response times as an example, if we track the 95th percentile, we can see our slowest response times, which we can then investigate.
The graph below shows the mean, 90th and 95th percentile for our API response times:
If we only tracked the mean of our application's response time, we'd have a very narrow view of our application's actual performance. This is why we want to measure percentiles; with percentiles, we get a far more comprehensive understanding of how our application is behaving and can spot problematic anomalies, such as unusually high response times, far quicker than we can by purely relying on the mean.
Percentiles are calculated similarly to the median. If we wanted to know the 90th percentile, we'd first determine the percentile's index in the ordered dataset using the calculation below:
(percentile / 100) * number_of_data_points - 1
The next steps vary depending on whether we are working with an odd or even dataset.
We have nine values in our dataset, so first, we'll calculate the index:
index = (90 / 100) * 9 - 1
# => 7.1
This calculation gives us a value of 7.1. Because this is not a whole number, we round it up to the next whole number, in this case, 8. We then take the number at index 8 (counting from zero) of our dataset:
array = [50, 50, 100, 250, 250, 250, 300, 300, 350]
array[8]
# => 350
So, in this case, our 90th percentile is 350.
We have ten values in our dataset, so first, we'll calculate the index:
(90 / 100) * 10 - 1 = 8
Our calculation gives us a whole number, so we take the value at index 8 (counting from zero) in our dataset.
However, because we are dealing with an even data set, we need a second value that is positioned one index higher than our calculated index. We'll use both these values to calculate a median with this formula:
(array[index] + array[index + 1]) / 2
Let's apply this to our data:
array = [50, 50, 100, 250, 250, 250, 300, 300, 350, 450]
index = 8
percentile_90 = (array[index] + array[index + 1]) / 2
# => 400
This means our 90th percentile is 400.
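The two worked examples can be combined into one function. This is a minimal sketch, not AppSignal's actual implementation. The `percentile` function name is hypothetical, and the sketch makes two assumptions: a fractional index is rounded up to the next whole number, and two neighbouring values are averaged only when the index is a whole number. Under those assumptions it reproduces both results above. It uses integer arithmetic to avoid floating-point rounding surprises and expects `pct` to be a whole number.

```python
def percentile(values, pct):
    """Return the pct-th percentile (0 < pct <= 100) of a non-empty list."""
    if not values or not 0 < pct <= 100:
        raise ValueError("need a non-empty list and 0 < pct <= 100")
    ordered = sorted(values)
    # (pct / 100) * count, split into its whole part and remainder.
    whole, remainder = divmod(pct * len(ordered), 100)
    if remainder:
        # Fractional index: subtracting 1 and rounding up lands on `whole`.
        return ordered[whole]
    index = whole - 1
    if index + 1 == len(ordered):
        # 100th percentile: there is no higher neighbour to average with.
        return ordered[index]
    return (ordered[index] + ordered[index + 1]) / 2

nine_values = [50, 50, 100, 250, 250, 250, 300, 300, 350]
ten_values = [50, 50, 100, 250, 250, 250, 300, 300, 350, 450]
print(percentile(nine_values, 90), percentile(ten_values, 90))
```

Other tools may use different percentile definitions (for example, linear interpolation between neighbours), so results can differ slightly from one library to another.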
Because of how means and percentiles are calculated, it's possible to have a scenario where the 90th percentile is lower than the mean, for example, if our response time graph contained the following data:
[100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 2000]
We'd have a mean of roughly 272.73 and a 90th percentile of 100, which, when charted, would still signal that your metric data contains a large outlier:
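These figures can be verified with a short sketch. It applies the index formula from earlier and assumes a fractional index is rounded up; the variable names are illustrative.

```python
import math
import statistics

data = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 2000]

mean = statistics.mean(data)

# Index from the formula (percentile / 100) * number_of_data_points - 1,
# rounding a fractional result up (an assumption of this sketch).
index = math.ceil((90 / 100) * len(data) - 1)
p90 = sorted(data)[index]

print(round(mean, 2), p90)
```

The single 2000 value pulls the mean well above the 90th percentile, even though ten of the eleven responses took 100.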
AppSignal is the developer-driven APM that offers all its customers an awesome, always above-average service. We've designed AppSignal to help developers of all abilities monitor applications of all sizes intuitively with fantastic features like anomaly alerts that let you know when we detect anomalies in your application's performance.
Ready to spot performance anomalies with AppSignal? Be sure to sign up for a free trial.