D.3 Test Design

3GPP TS 37.571-1, Release 16: User Equipment (UE) conformance specification for UE positioning; Part 1: Conformance test specification

A statistical test is characterised by its test time, selectivity and confidence level.

D.3.1 Confidence level

The outcome of a statistical test is a decision, which may be correct or incorrect. The confidence level CL is the probability that the decision is correct. Its complement is the wrong decision probability (risk): D = 1 - CL.

D.3.2 Introduction: Supplier Risk versus Customer Risk

There are two targets of decision:

a) A measurement on the pass-limit shows that the DUT has the specified quality or is better, with probability CL (e.g. CL = 95%). This shall lead to a “pass decision”.

The pass-limit is on the good side of the specified DUT-quality. A more stringent CL (e.g. CL = 99%) shifts the pass-limit further in the good direction. Given that the quality of the DUTs is distributed, a greater CL passes fewer and better DUTs.

A measurement on the bad side of the pass-limit is simply “not pass” (undecided).

aa) Complementary:

A measurement on the fail-limit shows that the DUT is worse than the specified quality, with probability CL.

The fail-limit is on the bad side of the specified DUT-quality. A more stringent CL shifts the fail-limit further in the bad direction. Given that the quality of the DUTs is distributed, a greater CL fails fewer and worse DUTs.

A measurement on the good side of the fail-limit is simply “not fail”.

b) A DUT known to have the specified quality shall be measured and decided pass with probability CL. This leads to the pass-limit.

For e.g. CL = 95%, the pass-limit is on the bad side of the specified DUT-quality. CL = 99% shifts the pass-limit further in the bad direction. Given that the DUT-quality is distributed, a greater CL passes more and worse DUTs.

bb) A DUT known to be an amount ε (ε → 0) beyond the specified quality shall be measured and decided fail with probability CL.

For e.g. CL = 95%, the fail-limit is on the good side of the specified DUT-quality.

Note the different sense for CL in (a), (aa) versus (b), (bb).

NOTE: For constant CL in all 4 bullets, (a) is equivalent to (bb) and (aa) is equivalent to (b).

D.3.3 Supplier Risk versus Customer Risk

The table below summarizes the different targets of decision.

Table D.3.3: Equivalent statements

Equivalent statements, using different cause-to-effect directions, and assuming CL = constant > 0.5

| Cause-to-effect direction | Known measurement result → estimation of the DUT’s quality | Known DUT’s quality → estimation of the measurement’s outcome |
|---|---|---|
| Supplier Risk | A measurement on the pass-limit shows that the DUT has the specified quality or is better (a) | A DUT known to be an amount ε (ε → 0) beyond the specified DUT-quality shall be measured and decided fail (bb) |
| Customer Risk | A measurement on the fail-limit shows that the DUT is worse than the specified quality (aa) | A DUT known to have the specified quality shall be measured and decided pass (b) |

NOTE: The bold text shows the obvious interpretation of Supplier Risk and Customer Risk.
The same statements can be based on other DUT-quality-definitions.

D.3.4 Introduction: Standard test versus early decision concept

In standard statistical tests, a certain number of results (ns) is predefined in advance of the test. After ns results the number of bad results (ne) is counted and the error ratio (ER) is calculated as ne/ns.

Applying statistical theory, a decision limit can be designed, against which the calculated ER is compared to derive the decision. Such a limit is one decision point and is characterised by:

– D: the wrong decision probability (a predefined parameter)

– ns: the number of results (a fixed predefined parameter)

– ne: the number of bad results (the limit based on just ns)

In the formula for the limit, D and ns are parameters and ne is the variable. In the standard test ns and D are constant. The property of such a test is: It discriminates between two states only, depending on the test design:

– pass (with CL) / undecided (undecided in the sense: finally undecided)

– fail (with CL) / undecided (undecided in the sense: finally undecided)

– pass (with CL) / fail (with CL) (however against two limits).
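As a minimal sketch of such a single decision point, the limit ne can be searched for once ns and D are fixed. The sketch assumes that the results are independent, so that ne follows a binomial distribution; this assumption, the function names and the example value ns = 200 are illustrative and are not taken from this annex.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); returns 0.0 for k < 0."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def pass_limit(ns, er_limit, d):
    """Largest ne that a DUT exactly at er_limit reaches (ne or fewer bad
    results) with probability <= d. Returns -1 if no such ne exists."""
    limit = -1
    for ne in range(ns + 1):
        if binom_cdf(ne, ns, er_limit) <= d:
            limit = ne
        else:
            break
    return limit

def fail_limit(ns, er_limit, d):
    """Smallest ne that a DUT exactly at er_limit reaches (ne or more bad
    results) with probability <= d. Returns ns + 1 if no such ne exists."""
    for ne in range(ns + 1):
        # P(X >= ne) = 1 - P(X <= ne - 1)
        if 1.0 - binom_cdf(ne - 1, ns, er_limit) <= d:
            return ne
    return ns + 1

# Hypothetical example: ns = 200 results, ER limit 0.05, D = 0.05 (CL = 95%).
ns, er, d = 200, 0.05, 0.05
print("pass if ne <=", pass_limit(ns, er, d))
print("fail if ne >=", fail_limit(ns, er, d))
```

Any count ne that falls between the two printed limits corresponds to the "finally undecided" outcome of a test against a single limit.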

In contrast to the standard statistical tests, the early decision concept predefines a set of (ne, ns) co-ordinates, representing the limit-curve for decision. After each result a preliminary ER is calculated and compared against the limit-curve. After each result one may make the decision or not (undecided for later decision). The parameters and variables in the limit-curve for the early decision concept have a similar but not equal meaning:

– D: the wrong decision probability (a predefined parameter)

– ns: the number of results (a variable parameter)

– ne: the number of bad results (the limit. It varies together with ns)

To avoid a “final undecided” in the standard test, a second limit must be introduced and the single decision co-ordinate (ne, ns) needs a high ne, leading to a fixed (high) test time. In the early decision concept, with the same selectivity and the same confidence level, an “undecided” does not need to be avoided, as it can be decided later. A perfect DUT will hit the decision co-ordinate (ne, ns) with ne = 0, so its test time is short.
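The per-result checking described above can be sketched as follows. The limit-curve used here is hypothetical: it is built from binomial tail probabilities with an assumed per-point wrong decision probability `d_point`, and it is not the set of (ne, ns) co-ordinates used by the conformance tests. The error ratios 0.05 and 0.075 correspond to the A-GNSS parameters in clause D.3.7; all function and parameter names are illustrative.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); returns 0.0 for k < 0."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def early_decision(results, er_spec, er_bad, d_point, ns_max):
    """Evaluate results one by one (1 = bad result, 0 = good result).

    Returns (decision, ns_used), where decision is 'pass', 'fail' or
    'undecided' (no decision reached within the available results)."""
    ne = 0
    ns = 0
    for ns, bad in enumerate(results, start=1):
        ne += bad
        # Early pass: a Bad DUT would rarely show so few bad results.
        if binom_cdf(ne, ns, er_bad) <= d_point:
            return "pass", ns
        # Early fail: a DUT of specified quality would rarely show so many.
        if 1.0 - binom_cdf(ne - 1, ns, er_spec) <= d_point:
            return "fail", ns
        if ns >= ns_max:
            break
    return "undecided", ns

# A perfect DUT (no bad results) and a DUT producing only bad results.
print(early_decision([0] * 500, 0.05, 0.075, 0.002, 500))
print(early_decision([1] * 500, 0.05, 0.075, 0.002, 500))
```

With these assumed values the perfect DUT is decided at the first co-ordinate with ne = 0, which is the shortest possible pass.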

D.3.5 Standard test versus early decision concept

For Supplier Risk:
The wrong decision probability D in the standard test is the probability of deciding a DUT incorrectly at the single decision point. In the early decision concept there is a probability d of an incorrect decision at each point of the limit-curve. These wrong decision probabilities accumulate to D. Hence d < D.

For Customer Risk:
The correct decision probability CL in the standard test is the probability of deciding a DUT correctly at the single decision point. In the early decision concept there is a probability cl of a correct decision at each point of the limit-curve. These correct decision probabilities accumulate to CL. Hence cl < CL, or d > D.
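One way to read the accumulation argument for the Supplier Risk case, as a sketch only: assume the limit-curve has k decision points and write d_i for the probability that the wrong decision is made for the first time at point i. The symbols k and d_i are introduced here for illustration and do not appear in the specification. The events are then mutually exclusive and their probabilities add:

```latex
D = \sum_{i=1}^{k} d_i ,
\qquad
d_i > 0,\ k > 1 \;\Longrightarrow\; d_i < D \quad \text{for every } i .
```

Under the further assumption that all points carry the same value d_i = d, this reduces to d = D / k.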

D.3.6 Selectivity

There is no statistical test which can discriminate, in finite time and with confidence level CL > 1/2, between a limit-DUT-quality and a DUT-quality which is an amount ε (ε → 0) apart from the limit. Either the test discriminates against one limit with the results pass (with CL)/undecided or fail (with CL)/undecided, or the test ends in a result pass (with CL)/fail (with CL), but this requires a second limit.

For CL > 0.5, a measurement result equal to the specified DUT-quality generates undecided in the test “supplier risk against pass limit” (a in clause D.3.2) and also in the equivalent test against the fail limit (aa in clause D.3.2).

For CL>0.5, a DUT, known to be on the limit, will be decided pass for the test “customer risk against pass limit” (b in clause D.3.2) and also in the equivalent test against fail limit (bb in clause D.3.2).

This overlap or undecided area is not a fault or a contradiction, however it can be avoided by introducing a Bad or a Good DUT quality according to:

– Bad DUT quality: specified DUT-quality * M (M>1)

– Good DUT quality: specified DUT-quality * m (m<1)

Using e.g. M > 1 and CL = 95%, the test yields different pass probabilities for different DUT qualities:

Figure D.3.6: Pass probability versus DUT quality
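The general shape of such a curve can be sketched numerically. The example below computes the pass probability of a fixed-length test with a single pass limit, assuming independent results (binomial model). The values NS = 1000 and NE_LIMIT = 60 are hypothetical and were chosen only so that the curve is high near the specified quality and low near the Bad DUT quality; they are not taken from the figure or from the test design.

```python
from math import comb

def pass_probability(er, ns, ne_limit):
    """P(ne <= ne_limit) after ns independent results with true error ratio er."""
    return sum(comb(ns, i) * er**i * (1 - er)**(ns - i)
               for i in range(ne_limit + 1))

ER_SPEC = 0.05            # specified DUT quality (A-GNSS value, clause D.3.7)
M = 1.5                   # Bad DUT factor (clause D.3.7)
NS, NE_LIMIT = 1000, 60   # hypothetical single decision co-ordinate

# Sweep the DUT quality from better than specified to worse than Bad DUT.
for factor in (0.5, 1.0, 1.25, M, 2.0):
    er = ER_SPEC * factor
    p = pass_probability(er, NS, NE_LIMIT)
    print(f"ER = {er:.4f} ({factor:.2f} x specified): pass probability = {p:.3f}")
```

The pass probability falls monotonically as the true error ratio increases; the factor M controls how far apart the high and low ends of the curve are.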

D.3.7 Design of the test

The test is defined according to the following design principles:

1. The early decision concept is applied.

2. A second limit is introduced: Bad DUT factor M>1

3. To decide the test pass: Supplier Risk is applied based on the Bad DUT quality.
To decide the test fail: Customer Risk is applied based on the specified DUT quality.

The A-GNSS test cases are defined using the following parameters:

1. Specified DUT quality: ER = 0.05

2. Bad DUT quality: M=1.5 (selectivity)

3. Confidence level CL = 95% (for specified DUT and Bad DUT-quality)

The ECID and OTDOA test cases are defined using the following parameters:

1. Specified DUT quality: ER = 0.1

2. Bad DUT quality: M=1.5 (selectivity)

3. Confidence level CL = 95% (for specified DUT and Bad DUT-quality)
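Applying the definition of the Bad DUT quality from clause D.3.6 (specified DUT-quality multiplied by M) to the two parameter sets gives the error ratios below. The symbols ER_spec and ER_bad are shorthand introduced here for the specified and the Bad DUT quality.

```latex
\begin{align*}
ER_{bad} &= M \cdot ER_{spec} \\
\text{A-GNSS:}\quad ER_{bad} &= 1.5 \times 0.05 = 0.075 \\
\text{ECID and OTDOA:}\quad ER_{bad} &= 1.5 \times 0.1 = 0.15
\end{align*}
```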

This has the following consequences:

a) A measurement on the fail limit is connected with 2 equivalent statements:

| A measurement on the fail-limit shows that the DUT is worse than the specified DUT-quality | A DUT known to have the specified quality shall be measured and decided pass |

A measurement on the pass limit is connected with the complementary statements:

| A measurement on the pass limit shows that the DUT is better than the Bad DUT-quality | A DUT known to have the Bad DUT quality shall be measured and decided fail |

The left column is used to decide the measurement.

The right column is used to verify the design of the test by simulation.

The simulation is based only on the two fulcrums A and B in Figure D.3.6. There is freedom to shape the remainder of the function.

b) Test time

1. The minimum and maximum test times are fixed.

2. The average test time is a function of the DUT’s quality.

3. The individual test time is not predictable (except for an ideal DUT).

c) The number of decision co-ordinates (ne, ns) in the early decision concept is responsible for the selectivity of the test and the maximum test time. Having fixed the number of decision co-ordinates there is still freedom to select the individual decision co-ordinates in many combinations, all leading to the same confidence level.