For this discussion we will use the following set of measurements: 45.66, 45.66, 45.68, 45.65
Some terms we’ll encounter:
a. Discrepancy
The difference between any two measurements of the same quantity.
Example: the discrepancy between the highest and lowest measurements is 45.68 - 45.65 = 0.03
b. Most Probable Value (MPV)
This is a fancy way of saying “average,” a.k.a. “mean.” In its simplest form, it's the sum of the measurements divided by the number of measurements.
$$\text{MPV} = \frac{\sum \text{measurements}}{n}$$
MPV: most probable value
n: number of measurements
Example: MPV = (45.66+45.66+45.68+45.65)/4 = 45.662
In some analyses where measurements of different accuracies are mixed you’ll see reference to the “weighted average.” This takes into account the measurement quality variations. For our purposes, we’ll stick with a straight (“unweighted”) average.
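As a quick check of the arithmetic, here is a minimal Python sketch of the unweighted MPV and the largest discrepancy for the four measurements in this discussion. The variable names are illustrative only.

```python
# The four measurements used throughout this discussion.
measurements = [45.66, 45.66, 45.68, 45.65]

# Unweighted MPV: sum of the measurements divided by their count.
mpv = sum(measurements) / len(measurements)

# Largest discrepancy in the set: highest minus lowest measurement.
max_discrepancy = max(measurements) - min(measurements)

print(f"MPV = {mpv:.4f}")                              # MPV = 45.6625
print(f"Largest discrepancy = {max_discrepancy:.2f}")  # Largest discrepancy = 0.03
```

The unrounded MPV is 45.6625; the example above carries it to three decimals.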
c. Residual (v)
The difference between a measurement and the MPV. This is like a discrepancy except it’s always compared against the MPV.
Example: Residual of the first measurement is 45.66-45.662 = -0.002
It doesn't really matter if the MPV is subtracted from the measurement or vice versa since we will ultimately square each so the mathematical sign goes away. Just be consistent: (measurement-MPV) or (MPV - measurement).
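The sign convention can be checked with a short sketch. It computes the residuals both ways and confirms the sum of squares is the same. It uses the unrounded MPV (45.6625), so the first residual comes out as -0.0025 rather than the -0.002 obtained with the rounded MPV. Variable names are illustrative.

```python
import math

measurements = [45.66, 45.66, 45.68, 45.65]
mpv = sum(measurements) / len(measurements)

# Convention A: residual = measurement - MPV
residuals = [m - mpv for m in measurements]
# Convention B: residual = MPV - measurement (every sign flips)
flipped = [mpv - m for m in measurements]

# Squaring removes the sign, so both conventions give the same total.
sum_sq = sum(v ** 2 for v in residuals)
sum_sq_flipped = sum(v ** 2 for v in flipped)

for m, v in zip(measurements, residuals):
    print(f"{m:.2f}  v = {v:+.4f}")
print(f"Sum of squared residuals: {sum_sq:.6f}")
print("Same under either convention:", math.isclose(sum_sq, sum_sq_flipped))
```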
d. Normal Distribution Curve
If we compute all the residuals and graph their frequency we would get the traditional bell-shaped curve, Figure D-1. This curve is symmetric about the y-axis since positive and negative residuals of the same size are equally likely. The graph is asymptotic: it approaches the x-axis but never touches it.
Figure D-1
Normal Distribution Curve
Of course with only four measurements in this example, ours would be a pretty sorry looking bell. Just like flipping the coin, as you approach an infinite number of measurements your graph becomes a smooth curve.
e. Standard Deviation
On both sides of the curve is an inflection point: where the curve changes from convex to concave. These occur at ±σ along the x-axis, Figure D-2.
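To compute the standard deviation:
$$\sigma = \sqrt{\frac{\sum v^2}{n-1}} \qquad \text{Equation D-2}$$
σ: standard deviation
v: residual
n: number of measurements
You’ll also see another form of the equation:
$$\sigma = \sqrt{\frac{\sum v^2}{n}} \qquad \text{Equation D-3}$$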
What's the difference? This is a little over-simplified, but basically:
Equation D-2 is used with a subset of all data.
Equation D-3 is used for an entire set of data.
Equation D-3 is sometimes referred to as the Standard Error because the data set is finite and can be used in its entirety to determine error present. For example: consider a factory making Do-dads. In the course of one day a finite number of Do-dads are produced as part of a batch. If we weigh each Do-dad, we would know the total weight error in the batch.
In measurement science, the only way we can quantify error present in an unknown quantity is to measure an infinite number of times. Since this is not practical, we instead settle for a limited number of measurements, a subset of infinity. So we use Equation D-2.
Note that as n increases, the difference between Equations D-2 and D-3 decreases. When n hits infinity, division by infinity or (infinity-1) makes little difference.
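To see how much the two forms differ for a small data set, here is a hedged sketch applied to our four measurements. It assumes Equation D-2 divides the sum of squared residuals by (n-1) and Equation D-3 divides by n, as the (infinity-1) remark above suggests. Python's `statistics.stdev` and `statistics.pstdev` are used only as cross-checks.

```python
import math
import statistics

measurements = [45.66, 45.66, 45.68, 45.65]
n = len(measurements)
mpv = sum(measurements) / n

# Sum of the squared residuals.
sum_sq = sum((m - mpv) ** 2 for m in measurements)

# Assumed form of Equation D-2: divide by (n - 1), for a subset of all data.
sigma_subset = math.sqrt(sum_sq / (n - 1))
# Assumed form of Equation D-3: divide by n, for an entire data set.
sigma_entire = math.sqrt(sum_sq / n)

print(f"Divide by n-1: {sigma_subset:.4f}")
print(f"Divide by n:   {sigma_entire:.4f}")

# Cross-check against the standard library equivalents.
print(math.isclose(sigma_subset, statistics.stdev(measurements), abs_tol=1e-9))
print(math.isclose(sigma_entire, statistics.pstdev(measurements), abs_tol=1e-9))
```

With only four measurements the two results differ noticeably (about 0.0126 versus 0.0109). The gap shrinks as n grows.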
Anyway, back to Standard Deviation, Equation D-2...
The area bounded by ±σ and the curve is ~68.3% of the total area under the entire normal distribution curve. What that means is ~68% of our measurements are expected to fall within ±σ of the MPV.
Standard deviation is an indicator of the measurement set precision: the smaller the standard deviation, the smaller the data spread, the better the precision. Figure D-3 demonstrates this for two different measurement sets.
Figure D-3
Standard Deviation and Precision
f. Confidence Interval (CI)
A confidence interval is a range within which a value is expected to fall with a stated degree of certainty. The range ±σ represents a 68.3% confidence interval: continuing measurements under similar conditions using similar equipment and procedures, we’re 68.3% confident that our results will fall within ±σ of our MPV.
Other common CIs are 90% and 95%:
The 90% CI covers 90% of the area under the normal distribution curve. It’s roughly equal to 1.645 times the standard deviation. Measurements are expected to fall within ±1.645σ of the MPV 90% of the time.
The 95% CI, which includes 95% of the area under the curve, is roughly equal to 2 times the standard deviation (1.96, to be more precise) so is often referred to as the 2 sigma (2σ) confidence. Measurements are expected to fall within ±2σ of the MPV about 95% of the time. For comparison, 3 sigma (3σ) covers about 99.7% of the area.
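The relationship between a sigma multiplier and the area it encloses can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8 or later). The helper names `coverage` and `multiplier` are made up for this illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def coverage(k):
    """Fraction of the area under the curve within ±k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

def multiplier(confidence):
    """Number of standard deviations enclosing the given central area."""
    return std_normal.inv_cdf(0.5 + confidence / 2)

for k in (1, 2, 3):
    print(f"±{k}σ covers {coverage(k):.1%} of the area")

for c in (0.90, 0.95):
    print(f"{c:.0%} confidence is ±{multiplier(c):.3f}σ")
```

To turn a multiplier into a range for a particular measurement set, multiply it by that set's standard deviation.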
Hey, what about 100% confidence? After all, we spent all this money on expensive accurate equipment so we should be able to state 100% confidence, right? Well, you can. Since the normal distribution curve is asymptotic you can state with 100% confidence that your measurement will fall within ±infinity of the MPV.
g. Standard Error of the Mean
This is the expected error in the MPV. Since we don’t know the exact error present, this is only an indicator. It’s computed from:
$$E_{MPV} = \frac{\sigma}{\sqrt{n}}$$
EMPV: error of the mean
σ: standard deviation
n: number of measurements
Note that it’s a function of the standard deviation and number of measurements. Theoretically, as the number of measurements increases, EMPV decreases: less error with more measurements since the random errors get more chances to cancel.
Standard error of the mean is an indicator of accuracy.
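As an illustration, here is a minimal sketch for our four measurements. It assumes the usual form of the standard error of the mean, the standard deviation divided by the square root of n, with the standard deviation computed using the (n-1) divisor. The loop at the end holds the standard deviation fixed purely to show the effect of n; in practice the standard deviation would be recomputed from the larger measurement set.

```python
import math
import statistics

measurements = [45.66, 45.66, 45.68, 45.65]
n = len(measurements)

# Standard deviation with the (n - 1) divisor.
sigma = statistics.stdev(measurements)

# Assumed form: standard error of the mean = sigma / sqrt(n).
e_mpv = sigma / math.sqrt(n)
print(f"sigma = {sigma:.4f}, EMPV = {e_mpv:.4f}")

# Hypothetical: same sigma, more measurements. Quadrupling n halves EMPV.
for count in (4, 16, 64):
    print(f"n = {count:2d}: EMPV = {sigma / math.sqrt(count):.4f}")
```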