Review Article
Two-stage Multiple Hypotheses LAO Test of Distributed Detection System for Many Families of Distributions
Department of Statistics and Mathematics, Islamic Azad University, Ahvaz Branch, Iran
There is a considerable literature on distributed detection and decision problems in engineering contexts, for example Kreidl et al. (2011), Tsitsiklis (1988) and Tsitsiklis and Athans (1985). The problem is important because the components of a distributed detection system may amass more data than they can transmit to a fusion node, so each component must summarize its data by choosing a message from a small set. The decentralized, or distributed, detection problem was first formulated and studied by Tenney and Sandell (1981), who considered a parallel configuration in which each sensor makes an observation and sends a quantized version of that observation to a fusion center. The goal is to decide among the possible hypotheses on the basis of the messages received at the fusion center.
The most common architecture in distributed detection is the parallel system depicted in Fig. 1. It consists of N geographically dispersed sensors, one-way communication links and a fusion center. Each sensor makes an observation Xi of a random source, quantizes Xi into an M-ary message Ui = gi(Xi) and then transmits Ui to the fusion center. Upon receipt of U1, U2, ..., UN, the fusion center makes a global decision U0 = D(U1, U2, ..., UN) about the nature of the random source.
The optimal design of the system entails choosing quantizers g1, g2, ..., gN and a global decision rule D so as to optimize the reliabilities.
Fig. 1: Multiple hypotheses parallel distributed detection system, H: Hypothesis, P: Probability distribution, X: Observation, g: Quantizer, U: Message
The messages U1, U2, ..., UN are all transmitted to the fusion center, which applies a decision rule D and declares one of the hypotheses to be true.
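The data flow of the parallel system can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the design studied in the paper: the threshold quantizer, the sample observations, the two message distributions and the maximum-likelihood fusion rule are all hypothetical choices made only to show how Ui = g(Xi) and U0 = D(U1, ..., UN) fit together.

```python
import bisect
import math


def make_threshold_quantizer(thresholds):
    """Return g mapping a real observation to a message in {1, ..., M},
    where M = len(thresholds) + 1 (assumed quantizer form)."""
    def g(x):
        return bisect.bisect_right(thresholds, x) + 1
    return g


def fusion_decision(messages, message_dists):
    """Hypothetical decision rule D: return the 1-based index of the
    hypothesis whose message distribution gives the received messages
    the largest log-likelihood."""
    scores = [sum(math.log(dist[u - 1]) for u in messages)
              for dist in message_dists]
    return scores.index(max(scores)) + 1


# Identical quantizer at every sensor, M = 3 messages.
g = make_threshold_quantizer([0.0, 1.0])

# Illustrative observations X_1, ..., X_N with N = 5.
observations = [-0.7, 0.2, 1.5, 0.4, -0.1]
messages = [g(x) for x in observations]  # U_i = g(X_i)

# Assumed message distributions under two hypotheses.
message_dists = [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]
u0 = fusion_decision(messages, message_dists)  # global decision U_0
```

The fusion center sees only the messages, never the raw observations, which is the constraint that makes the choice of quantizer matter.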
As was shown by Tsitsiklis (1988), when N tends to infinity, the error exponents of the absolutely optimal system coincide with those achieved by the best identical-quantizer system. In an N-sensor identical-quantizer system, the quantizer outputs Ui = g(Xi) are i.i.d. The optimal error exponents are then obtained by choosing the mapping g so as to maximize an appropriate functional, such as the reliabilities matrix. In the case of two hypotheses, the reliabilities corresponding to the two possible error probabilities cannot both be increased simultaneously; the accepted approach is to fix the value of one reliability and seek the test sequence that gives the greatest value of the other.
The need to test more than two hypotheses has grown considerably in many scientific and applied fields. Models of optimal multiple hypotheses testing are studied in several works, such as Ahlswede and Haroutunian (2006), Hoeffding (1965), Haroutunian (1990) and Tusnady (1977). Two-stage logarithmically asymptotically optimal (LAO) testing of multiple hypotheses for a pair of families and for many families of Probability Distributions (PDs) is investigated by Hormozi Nejad and Haroutunian (2012a) and Hormozi Nejad et al. (2011), and the two-stage LAO test of distributed detection for a pair of families of PDs is investigated by Hormozi Nejad and Haroutunian (2012b). This paper studies distributed detection by two-stage multiple hypotheses LAO testing when the hypotheses consist of many families of PDs. The matrices of optimal asymptotic interdependencies of all pairs of error probability exponents are studied.
Some preliminaries follow. Each sensor observation x takes values in the set X. A deterministic M-ary quantizer is a measurable mapping g from the observation space X to the message space U = {1, 2, ..., M}.
The Random Variable (RV) X characterizing the studied object takes values in the set X, and P(X) is the space of all distributions on X. The random source has S hypothetical PDs, which are divided into K disjoint families. The first family includes R1 PDs P1, P2, ..., PR1, the second family consists of R2 PDs PR1+1, PR1+2, ..., PR1+R2, and so on; the K-th family has RK PDs. The distribution of X under hypothesis Hs is denoted by Ps, s = 1, 2, ..., S. The distribution of the messages produced by g under Pi is denoted by Pi(g) and is obtained from Pi and g.
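For a finite observation alphabet, the message distribution Pi(g) is the image of Pi under g: the probability of message u is the total probability of the observations that g maps to u. A minimal sketch, assuming the PD is given as a dictionary from observation values to probabilities (the function name and the example values are hypothetical):

```python
def message_distribution(p, g, M):
    """Push the PD p (dict: observation -> probability) through the
    quantizer g. Returns a list q with q[u - 1] = P(g(X) = u)."""
    q = [0.0] * M
    for x, px in p.items():
        q[g(x) - 1] += px
    return q


# Illustrative PD on {0, 1, 2, 3} and a binary quantizer (M = 2).
p_example = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
g_example = lambda x: 1 if x < 2 else 2
q_example = message_distribution(p_example, g_example, 2)
```

Here message 1 collects the mass of observations 0 and 1, and message 2 collects the mass of observations 2 and 3.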
Let the N-sample x = (x1, x2, ..., xN) be a vector of results of N independent observations of the RV X and let u = (u1, u2, ..., uN) be the vector of the N messages transmitted to the fusion center. The purpose of the test is to use the sample u to detect the actual distribution from the given list. The divergence (Kullback-Leibler distance) of PDs P and Q is defined by Cover and Thomas (1991) and Haroutunian et al. (2008) as follows:

$$D(P\|Q) = \sum_{x \in \mathcal{X}} P(x) \log \frac{P(x)}{Q(x)}.$$
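On a finite alphabet the Kullback-Leibler divergence D(P||Q) is the sum over symbols of P(x) log(P(x)/Q(x)). A minimal sketch, assuming natural logarithms, PDs given as equal-length lists, and the usual conventions that terms with P(x) = 0 contribute zero and that the divergence is infinite when Q(x) = 0 while P(x) > 0:

```python
import math


def kl_divergence(p, q):
    """Return D(p||q) in nats for two PDs on the same finite alphabet."""
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue          # convention: 0 * log(0 / q) = 0
        if qx == 0:
            return math.inf   # p puts mass where q has none
        total += px * math.log(px / qx)
    return total


d_same = kl_divergence([0.5, 0.5], [0.5, 0.5])
d_diff = kl_divergence([0.5, 0.5], [0.25, 0.75])
```

The divergence is zero only when the two PDs coincide, and it is not symmetric in its arguments, so the order of P and Q matters in every formula that uses it.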
ONE-STAGE MULTIPLE HYPOTHESES LAO TEST OF DISTRIBUTED DETECTION
The procedure of making a decision on the basis of the N-sample is called the test φN. The statistician must detect one among S hypotheses. The answer is determined from the vector of N messages u = (u1, u2, ..., uN).
The probabilities of the erroneous acceptance of hypothesis Hl provided that Hs is true, are defined as follows:
(1) |
If the hypothesis Hs is true but it is not accepted, then the probability of error is:
(2) |
The corresponding reliabilities are defined as follows for an infinite sequence of tests φ:
(3) |
It follows from Eq. 1-3 that for every test sequence φ:
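In the standard LAO formulation (Haroutunian, 1990), the quantities introduced above take the following form. This is a sketch under assumptions: the acceptance region $\mathcal{A}_l^N$ (the set of message vectors for which the test accepts $H_l$) and the product measure $P_s^N(g)$ on message vectors are notation introduced here for illustration.

```latex
\begin{align}
\alpha_{l|s}(\varphi_N) &= P_s^N(g)\bigl(\mathcal{A}_l^N\bigr), \qquad l \neq s, \\
\alpha_{s|s}(\varphi_N) &= \sum_{l \neq s} \alpha_{l|s}(\varphi_N), \\
E_{l|s}(\varphi) &= \limsup_{N \to \infty} \, -\frac{1}{N} \log \alpha_{l|s}(\varphi_N), \\
E_{s|s}(\varphi) &= \min_{l \neq s} E_{l|s}(\varphi).
\end{align}
```

The last line is the consequence of the first three: a sum of finitely many exponentially decaying terms decays at the rate of its slowest term.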
The following theorem solves the problem of constructing the LAO test φ* and gives the conditions under which all elements of the matrix E(φ*) are positive. To construct the required LAO test for preliminarily given positive values E1|1, E2|2, ..., ES-1|S-1, the following subsets of distributions are defined:
(4) |
(5) |
(6) |
Theorem 1: Hormozi Nejad and Haroutunian (2012b): If all distributions are different and the positive values E1|1, E2|2, ..., ES-1|S-1 are such that the following inequalities hold:
(7) |
(8) |
then there exists a LAO sequence of tests, all elements of the reliabilities matrix E* = {E*l|s} of which are positive and are defined in Eq. 4-6.
When one of the inequalities Eq. 7-8 is violated, then at least one element of the matrix E* is equal to zero.
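A type-based decision rule of the kind used in LAO constructions (Haroutunian, 1990) can be sketched as follows. The sketch assumes that the fusion center accepts Hl, l = 1, ..., S-1, when the empirical distribution (type) Q of the received messages satisfies D(Q||Pl(g)) ≤ El|l, and accepts HS otherwise; this form of the acceptance regions is an assumption here, and the function names and numerical values are hypothetical.

```python
import math
from collections import Counter


def kl(q, p):
    """D(q||p) in nats for PDs given as equal-length lists."""
    total = 0.0
    for qx, px in zip(q, p):
        if qx == 0:
            continue
        if px == 0:
            return math.inf
        total += qx * math.log(qx / px)
    return total


def empirical_type(messages, M):
    """Relative frequency of each message 1, ..., M in the sample."""
    counts = Counter(messages)
    return [counts.get(u, 0) / len(messages) for u in range(1, M + 1)]


def lao_decision(messages, message_dists, diag_reliabilities):
    """Accept H_l (1-based) if the type lies within divergence E_{l|l}
    of P_l(g) for some l = 1, ..., S-1; otherwise accept H_S.
    diag_reliabilities holds the S-1 given values E_{1|1}, ..., E_{S-1|S-1}."""
    q = empirical_type(messages, len(message_dists[0]))
    # zip stops after S-1 pairs because diag_reliabilities is shorter.
    for l, (p, e) in enumerate(zip(message_dists, diag_reliabilities), start=1):
        if kl(q, p) <= e:
            return l
    return len(message_dists)


dists = [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5], [0.1, 0.1, 0.8]]
given = [0.05, 0.05]  # assumed values of E_{1|1}, E_{2|2}
near_h1 = lao_decision([1, 1, 1, 2, 2, 3, 1, 1, 2, 1], dists, given)
far_from_all = lao_decision([3] * 10, dists, given)
```

The loop returns the first matching region, which is unambiguous only when the regions do not overlap; small enough values of El|l keep the divergence balls disjoint.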
TWO-STAGE MULTIPLE HYPOTHESES LAO TEST OF DISTRIBUTED DETECTION
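Now another version of testing is discussed, in which N = N1 + N2 and the sample is split as:

$$\mathbf{x} = (\mathbf{x}_1, \mathbf{x}_2), \quad \mathbf{x}_1 = (x_1, \ldots, x_{N_1}), \quad \mathbf{x}_2 = (x_{N_1+1}, \ldots, x_N),$$

and so the vectors of messages are as follows:

$$\mathbf{u} = (\mathbf{u}_1, \mathbf{u}_2), \quad \mathbf{u}_1 = (u_1, \ldots, u_{N_1}), \quad \mathbf{u}_2 = (u_{N_1+1}, \ldots, u_N).$$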
The two-stage test based on the N-sample, denoted by ΦN, is the system depicted in Fig. 2. The first stage is the choice of a family of PDs; it is executed by a non-randomized test φ1(u1) using the first messages sample u1. The second stage is a non-randomized test φ2(u2, U') based on the second messages sample u2 and on the outcome U' of the test φ1(u1), that is, the decision of the first fusion center.
First stage of two-stage test of distributed detection: The first stage of decision making uses the first messages sample u1 to select one family of PDs and is carried out by the test φ1(u1).
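Consider for convenience the cumulative numbers:

$$C_0 = 0, \qquad C_k = R_1 + R_2 + \cdots + R_k, \quad k = 1, \ldots, K,$$

and the sets of indexes:

$$D_k = \{C_{k-1}+1, \ldots, C_k\}, \quad k = 1, \ldots, K.$$

Therefore, suppose there are K disjoint families of PDs P1, P2, ..., PK such that:

$$\mathcal{P}_k = \{P_s, \; s \in D_k\}, \quad k = 1, \ldots, K.$$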
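The bookkeeping for the families can be sketched in Python. The sketch assumes the cumulative numbers are C0 = 0 and Ck = R1 + ... + Rk and that the index set of the k-th family is Dk = {Ck-1 + 1, ..., Ck}, which is consistent with the later use of Dk and Ck in Theorem 3; the function name and the family sizes in the example are hypothetical.

```python
from itertools import accumulate


def family_index_sets(family_sizes):
    """Given the sizes R_1, ..., R_K, return the cumulative numbers
    [C_0, C_1, ..., C_K] and the index sets [D_1, ..., D_K], with
    hypotheses numbered from 1."""
    cumulative = [0] + list(accumulate(family_sizes))
    index_sets = [
        list(range(cumulative[k - 1] + 1, cumulative[k] + 1))
        for k in range(1, len(cumulative))
    ]
    return cumulative, index_sets


# Three families with R_1 = 2, R_2 = 3, R_3 = 1, so S = 6 hypotheses.
cumulative, index_sets = family_index_sets([2, 3, 1])
```

With this numbering the last index of the k-th family is Ck, and the index sets partition {1, ..., S}.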
Let α'm|k(φ1) be the probability of the erroneous acceptance of the m-th family of PDs provided that the k-th family of PDs is true (that is, the correct PD is in the k-th family):
Fig. 2: Two-stage multiple hypotheses test of distributed detection system, H: Hypothesis, P: Probability distribution, X: Observation, g: Quantizer, U: Message, U', U": Decision rules
The reliabilities of the infinite sequence of tests φ1 are considered as follows:
For preliminarily given positive values E'1|1, E'2|2, ..., E'K-1|K-1, the following subsets of distributions are defined:
(9) |
(10) |
(11) |
Theorem 2: Hormozi Nejad and Haroutunian (2012a): If all distributions are different and the positive values E'1|1, E'2|2, ..., E'K-1|K-1 are such that the following inequalities hold:
(12) |
(13) |
then there exists a LAO sequence of tests, all elements of the reliabilities matrix E'* = {E'*m|k} of which are positive and are defined in Eq. 9-11.
When one of the inequalities Eq. 12-13 is violated, then at least one element of the matrix E'* is equal to zero.
Second stage of the two-stage test of distributed detection: The test φ2(u2, U') is defined using the result U' of the first fusion center and the second messages sample u2.
The probability of the erroneous acceptance of PD Pl at the second stage of the test, when Ps is correct and the k-th family of PDs is accepted, is:
The probability to reject Ps when it is true and k-th family of PDs is accepted, is:
(14) |
Corresponding reliabilities for the second stage of test are:
(15) |
It follows from Eq. 14 and 15:
Theorem 3: Haroutunian (1990) and Hormozi Nejad and Haroutunian (2012b): If at the first stage of the test the k-th family of PDs is accepted, then for given positive and finite values E"s|s, s ∈ Dk, s ≠ Ck, of the reliabilities matrix, consider the regions:
and the following values of elements of the future reliabilities matrix E"(φ2*) of the LAO test sequence:
When the following compatibility conditions are valid:
then there exists a LAO sequence of tests φ2*, the elements of whose reliabilities matrix are defined above and are positive.
When even one of the compatibility conditions is violated, the reliabilities matrix E"(φ2*) has at least one element equal to zero.
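The two stages can be put together in a short sketch. It rests on assumptions that are not confirmed by the text: the first stage accepts family k (k = 1, ..., K-1) when the type of the first messages sample lies within divergence E'k|k of the nearest PD of that family and accepts family K otherwise, and the second stage accepts Ps (s in Dk, s ≠ Ck) when the type of the second messages sample lies within E"s|s of Ps and accepts the PD with index Ck otherwise. All function names, distributions and reliability values are hypothetical.

```python
import math
from collections import Counter


def kl(q, p):
    """D(q||p) in nats for PDs given as equal-length lists."""
    total = 0.0
    for qx, px in zip(q, p):
        if qx == 0:
            continue
        if px == 0:
            return math.inf
        total += qx * math.log(qx / px)
    return total


def empirical_type(messages, M):
    """Relative frequency of each message 1, ..., M in the sample."""
    counts = Counter(messages)
    return [counts.get(u, 0) / len(messages) for u in range(1, M + 1)]


def first_stage(u1, families, dists, fam_rel):
    """Return the accepted family (1-based). fam_rel lists the given
    values E'_{k|k} for k = 1, ..., K-1."""
    q = empirical_type(u1, len(dists[0]))
    for k, e in enumerate(fam_rel, start=1):
        if min(kl(q, dists[s - 1]) for s in families[k - 1]) <= e:
            return k
    return len(families)


def second_stage(u2, k, families, dists, pd_rel):
    """Return the accepted PD index within family k. pd_rel maps each
    s in D_k except the last index C_k to the given value E''_{s|s}."""
    q = empirical_type(u2, len(dists[0]))
    members = families[k - 1]
    for s in members[:-1]:
        if kl(q, dists[s - 1]) <= pd_rel[s]:
            return s
    return members[-1]


def two_stage_test(u, n1, families, dists, fam_rel, pd_rel):
    """Split u into u1 (first n1 messages) and u2 (the rest), then
    run the family test followed by the within-family test."""
    u1, u2 = u[:n1], u[n1:]
    k = first_stage(u1, families, dists, fam_rel)
    return k, second_stage(u2, k, families, dists, pd_rel)


# Four message PDs on M = 2 messages, grouped into K = 2 families.
dists = [[0.9, 0.1], [0.7, 0.3], [0.3, 0.7], [0.1, 0.9]]
families = [[1, 2], [3, 4]]
fam_rel = [0.05]            # assumed E'_{1|1}
pd_rel = {1: 0.02, 3: 0.02}  # assumed E''_{1|1} and E''_{3|3}

sample_a = [1] * 7 + [2] * 3 + [1] * 7 + [2] * 3
sample_b = [2] * 8 + [1] * 2 + [1] * 3 + [2] * 7
result_a = two_stage_test(sample_a, 10, families, dists, fam_rel, pd_rel)
result_b = two_stage_test(sample_b, 10, families, dists, fam_rel, pd_rel)
```

The second stage only compares the type with the PDs of the accepted family, so an error at the first stage cannot be corrected later; this is why the reliabilities of both stages enter the overall result.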
Reliabilities of the two-stage test of distributed detection: The decision procedure based on the N-sample, denoted by ΦN, is composed of the pair of LAO tests φ1* and φ2*. The error probabilities and reliabilities of the two-stage test are defined similarly, as follows:
So error probabilities can be considered as follows:
Using the lemma on types given by Cover and Thomas (1991) and Haroutunian et al. (2008), the following equality is obtained:
(16) |
From Eq. 16 and the definition of reliabilities, the following are obtained:
(17) |
(18) |
(19) |
Theorem 4: If all distributions are different and the compatibility conditions of Theorems 2 and 3 are satisfied, then the elements of the reliabilities matrix of the two-stage test are as defined in Eq. 17-19.
When one of the compatibility conditions is violated, at least one element of this matrix is equal to zero.
The reliabilities of a distributed detection system were investigated as the number of sensors tends to infinity. It is assumed that the sensor data are quantized into M-ary messages and transmitted to the fusion center for multiple hypotheses testing concerning many families of PDs. The optimal reliabilities of the two stages were characterized, the compatibility conditions under which they are attained were provided, and the characteristics of LAO hypotheses testing of distributed detection were described. The goal was to make a decision among many possible hypotheses, based on the messages received at the fusion centers.
NOTATIONS
N = Number of sensors
M = Number of messages
U = Message
U', U", U0 = Fusion center
D = Global decision rule
H = Hypothesis
g = Quantizer
X = The space of source data
U = The space of messages
S = Number of hypotheses
K = Number of families
Rk = Number of PDs in k-th family
Pk = k-th probability distribution (PD)
x = (x1, x2, ..., xN) = Vector of observations
u = (u1, u2, ..., uN) = Vector of transmitted messages
φ, φ1, φ2, Φ = Tests
α = Error probability
E = Reliability
R, D = Sets