## 1. Terms

For this discussion we will use the following set of measurements: 45.66, 45.66, 45.68, 45.65

Some terms we’ll encounter:

### a. Discrepancy

The difference between any two measurements of the same quantity.

Example: the discrepancy between the highest and lowest measurements is 45.68 - 45.65 = 0.03

### b. Most Probable Value (MPV)

This is a fancy way of saying “average,” a.k.a. “mean.” In simplest form, it's the measurement sum divided by the number of measurements.

$$\text{MPV} = \frac{\sum \text{meas}}{n}$$

Equation D-1

where MPV: most probable value; meas: individual measurement; n: number of measurements.

Example: MPV = (45.66+45.66+45.68+45.65)/4 = 45.662
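As a quick check of the arithmetic, here is a minimal Python sketch of the unweighted average. The function name `most_probable_value` is just an illustrative label, not something from the text. Note that the code keeps the unrounded result, while the example above carries three decimal places.

```python
# The four measurements used throughout this discussion.
measurements = [45.66, 45.66, 45.68, 45.65]


def most_probable_value(values):
    """Unweighted mean: the measurement sum divided by the number of measurements."""
    return sum(values) / len(values)


mpv = most_probable_value(measurements)
print(f"MPV = {mpv:.4f}")
```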

In some analyses where measurements of different accuracies are mixed you’ll see reference to the “weighted average.” This takes into account the measurement quality variations. For our purposes, we’ll stick with a straight (“unweighted”) average.

### c. Residual (v)

The difference between a measurement and the MPV. This is like a discrepancy except it’s always compared against the MPV.

Example: Residual of the first measurement is 45.66-45.662 = -0.002

It doesn't really matter if the MPV is subtracted from the measurement or vice versa since we will ultimately square each so the mathematical sign goes away. Just be consistent: (measurement-MPV) or (MPV - measurement).
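The sign convention point can be checked numerically. The sketch below (variable names are illustrative) computes the residuals both ways using the unrounded MPV and shows that the sums of squares agree. It also shows that residuals taken against the mean sum to essentially zero.

```python
import math

measurements = [45.66, 45.66, 45.68, 45.65]
mpv = sum(measurements) / len(measurements)

# Convention 1: measurement - MPV
residuals = [m - mpv for m in measurements]
# Convention 2: MPV - measurement (same magnitudes, opposite signs)
residuals_flipped = [mpv - m for m in measurements]

for m, v in zip(measurements, residuals):
    print(f"{m:.2f}  v = {v:+.4f}")

# Squaring removes the sign, so both conventions give the same sum.
sum_sq = sum(v ** 2 for v in residuals)
sum_sq_flipped = sum(v ** 2 for v in residuals_flipped)
print(math.isclose(sum_sq, sum_sq_flipped))
```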

### d. Normal Distribution Curve

If we compute all the residuals and graph their frequency we would get the traditional bell-shaped curve, Figure D-1. This curve is symmetric about the y-axis since some of the residuals will be positive and some negative. The graph is asymptotic: it approaches the x-axis but never touches it.

Figure D-1 Normal Distribution Curve

Of course with only four measurements in this example, ours would be a pretty sorry looking bell. Just like flipping the coin, as you approach an infinite number of measurements your graph becomes a smooth curve.

### e. Standard Deviation

On both sides of the curve is an inflection point: where the curve changes from convex to concave. These occur at ±σ along the x-axis, Figure D-2.

Figure D-2 Standard Deviation

To compute the standard deviation:

$$\sigma = \sqrt{\frac{\sum v^2}{n}}$$

Equation D-2

where σ: standard deviation; v: residual; n: number of measurements.

You’ll also see another form of the equation:

$$\sigma = \sqrt{\frac{\sum v^2}{n-1}}$$

Equation D-3

What's the difference? This is a little over-simplified, but basically:

Equation D-2 is used with an entire data set.

Equation D-3 is used for a subset of all data.

Equation D-3 is sometimes referred to as the Standard Error. It's used when it's not practical to analyze all the data. For example: consider a factory making Do-dads. Instead of measuring each Do-dad (thus slowing production and, in the case of destruction testing, immediately driving itself out of business), a random sampling is taken and its statistics used to describe the entire production run.

In measurement science, the only way we can quantify error present in an unknown quantity is to measure an infinite number of times. Since this is not practical, we instead settle for a limited number of measurements and base our statistical analysis on the complete measurement set. So we use Equation D-2.

Note that as n increases, the difference between Equations D-2 and D-3 decreases. When n hits infinity, division by infinity or (infinity-1) makes little difference.
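To see how the two forms compare, here is a rough Python sketch. It assumes Equation D-2 divides the sum of squared residuals by n and Equation D-3 divides by n-1, which is what the discussion above implies. The function `std_dev` and its `subset` flag are illustrative names only. The results are cross-checked against `pstdev` and `stdev` from Python's standard `statistics` module.

```python
import math
import statistics

measurements = [45.66, 45.66, 45.68, 45.65]


def std_dev(values, subset=False):
    """Standard deviation from residuals.

    subset=False divides by n (assumed form of Equation D-2);
    subset=True divides by n - 1 (assumed form of Equation D-3).
    """
    n = len(values)
    mpv = sum(values) / n
    sum_v2 = sum((x - mpv) ** 2 for x in values)
    return math.sqrt(sum_v2 / (n - 1 if subset else n))


sigma_d2 = std_dev(measurements)
sigma_d3 = std_dev(measurements, subset=True)
print(f"D-2 form: {sigma_d2:.4f}   D-3 form: {sigma_d3:.4f}")

# Cross-check against the standard library equivalents.
assert math.isclose(sigma_d2, statistics.pstdev(measurements))
assert math.isclose(sigma_d3, statistics.stdev(measurements))

# For the same sum of squares, the two forms differ by sqrt(n / (n - 1)),
# which approaches 1 as n grows.
for n in (4, 10, 100, 1000):
    print(n, f"{math.sqrt(n / (n - 1)):.4f}")
```

With only four measurements the two forms differ noticeably; by a few hundred measurements the ratio is within a fraction of a percent of 1.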

Anyway, back to Standard Deviation, Equation D-2...

The area bounded by ±σ and the curve is ~68.3% of the total area under the entire normal distribution curve. What that means is ~68% of our measurements fall within ±σ of the MPV.
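The ~68.3% figure can be verified with the error function: for a normal distribution, the fraction of the area within ±kσ is erf(k/√2). A minimal sketch, where `area_within` is an illustrative name:

```python
import math


def area_within(k):
    """Fraction of a normal curve's area lying within ±k standard deviations."""
    return math.erf(k / math.sqrt(2.0))


print(f"Area within ±1σ: {area_within(1):.1%}")
```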

Standard deviation is an indicator of the measurement set precision: the smaller the standard deviation, the smaller the data spread, the better the precision. Figure D-3 demonstrates this for two different measurement sets.

Figure D-3 Standard Deviation and Precision

### f. Confidence Interval (CI)

A confidence interval is the degree of certainty that a value falls within a specific range. The standard deviation represents a 68.3% confidence interval: continuing measurements under similar conditions using similar equipment and procedures, we’re 68.3% confident that our results will fall within ±σ of our MPV.

Other common CIs are 90% and 95%:

The 90% CI covers 90% of the area under the normal distribution curve. It’s roughly equal to 1.645 times the standard deviation. Measurements are expected to fall between ±1.645σ of the MPV 90% of the time.

The 95% CI, which includes 95% of the area under the curve, is roughly equal to 1.96 times the standard deviation, so it is often rounded to 2 and referred to as the 2 sigma (2σ) confidence. Measurements are expected to fall between ±1.96σ of the MPV 95% of the time. For comparison, ±3σ (3 sigma) covers about 99.7% of the area.
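The multiplier that goes with a given confidence level can be looked up from the standard normal distribution. The sketch below uses `NormalDist.inv_cdf` from Python's standard `statistics` module; the helper name `sigma_multiplier` is illustrative. It assumes the errors are normally distributed and that the interval is symmetric about the MPV.

```python
from statistics import NormalDist


def sigma_multiplier(confidence):
    """Return k such that ±k·σ encloses `confidence` of a normal curve's area."""
    # The interval is two-sided, so half of the excluded area sits in each tail.
    return NormalDist().inv_cdf((1.0 + confidence) / 2.0)


for confidence in (0.683, 0.90, 0.95, 0.997):
    k = sigma_multiplier(confidence)
    print(f"{confidence:.1%} confidence -> ±{k:.3f}σ")
```

To turn a multiplier into a range for a particular measurement set, multiply it by that set's standard deviation.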

Hey, what about 100% confidence? After all, we spent all this money on expensive accurate equipment so we should be able to state 100% confidence, right? Well, you can. Since the normal distribution curve is asymptotic you can state with 100% confidence that your measurement will fall within ±infinity of the MPV.

### g. Standard Error of the Mean

This is the expected error in the MPV. Since we don’t know the exact error present, this is only an indicator. It’s computed from:

$$E_{MPV} = \frac{\sigma}{\sqrt{n}}$$

Equation D-4

where $E_{MPV}$: error of the mean; σ: standard deviation; n: number of measurements.

Note that it’s a function of the standard deviation and number of measurements. Theoretically, as the number of measurements increases, $E_{MPV}$ decreases: less error with more measurements since the random errors get more chances to cancel.
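A short sketch of this behavior, assuming the standard error of the mean is σ/√n and that σ comes from the divide-by-n form of the standard deviation (Equation D-2 as used in this discussion). The function name `error_of_mean` is illustrative.

```python
import math

measurements = [45.66, 45.66, 45.68, 45.65]
n = len(measurements)
mpv = sum(measurements) / n

# Standard deviation, dividing by n (assumed form of Equation D-2).
sigma = math.sqrt(sum((m - mpv) ** 2 for m in measurements) / n)


def error_of_mean(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)


e_mpv = error_of_mean(sigma, n)
print(f"sigma = {sigma:.4f}, E_MPV = {e_mpv:.4f}")

# Holding sigma fixed, quadrupling the number of measurements halves E_MPV.
for count in (4, 16, 64):
    print(count, f"{error_of_mean(sigma, count):.4f}")
```

The loop holds σ constant purely for illustration; in practice σ would be recomputed from the larger measurement set.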

Standard error of the mean is an indicator of accuracy.