A New Logical Measure for Quantum Information
By: David Ellerman  
Open Access | Feb 2025


1. Introduction

The purpose of this paper is to derive the relatively new notion of quantum logical entropy ([1,2,3,4]) from the relatively new logic of partitions ([5,6,7]) that is category-theoretically dual to the usual Boolean logic of subsets. The "classical" notion of logical entropy is derived as the quantitative version of the distinctions of partitions, just as probability is derived as the quantitative version of the elements of subsets [8]. The notion of logical entropy is compared and contrasted with the usual Shannon entropy. Then the notion of logical entropy is linearized using a semi-algorithmic procedure to translate set-based concepts into the corresponding vector space concepts. The concept of logical entropy linearized to Hilbert space gives the concept of quantum logical entropy. The linearization procedure also allows results proven at the set level with logical entropy to be extended in a straightforward manner to quantum logical entropy. For instance, logical entropy is a probability measure with a simple probability interpretation (i.e., the two-draw probability of getting a distinction of a partition), and that interpretation extends to quantum logical entropy. That is, given an observable and a quantum state vector, their quantum logical entropy is the probability, in two independent projective measurements of the observable on the prepared state, that different eigenvalues are obtained. Since logical entropy is a measure, the compound notions of difference (or conditional) and mutual logical entropy are immediately defined, and their relationships are illustrated in the usual Venn diagrams for measures. The derivation of quantum logical entropy allows the definitions of the difference and mutual quantum logical entropies, which satisfy the corresponding relationships. This method of deriving the concepts of quantum logical entropy makes it a logic-based and natural measure of information for quantum mechanics (QM). Our purpose is to show the naturality and fundamentality of this notion of quantum information, so the contrasts and comparisons with other notions such as the von Neumann entropy are left to the reader.

2. Subset Logic and Partition Logic

Subsets and partitions are category-theoretic dual concepts. That is, in the turn-around-the-arrows duality of category theory, a subset is also called a "part" and "The dual notion (obtained by reversing the arrows) of 'part' is the notion of partition" [9, p. 85]. A partition π = {B1, ..., Bm} on a universe set U = {u1, ..., un} (|U| ≥ 2) is a set of nonempty subsets Bj called "blocks" such that the blocks are disjoint and their union is U. The partitions on U form a lattice Π(U). The partial order (PO) for the lattice is refinement, where a partition σ = {C1, ..., Cm′} is refined by π, written σ ≾ π, if for every block Bj ∈ π, there is a block Cj′ ∈ σ such that Bj ⊆ Cj′.

At a more atomic or granular level, the elements of a subset are dual to the distinctions (dits) of a partition, which are ordered pairs of elements in different blocks of the partition. The set of distinctions or ditset of a partition π is dit(π) ⊆ U × U, and the complementary set of indistinctions, $$\mathrm{indit}(\pi) = U \times U - \mathrm{dit}(\pi) = \cup_{j=1}^{m} B_j \times B_j,$$ is the equivalence relation on U associated with π, where the equivalence classes are the blocks of π. The refinement PO on partitions is the same as the inclusion PO on ditsets: σ ≾ π if and only if (iff) dit(σ) ⊆ dit(π).

The join σ ∨ π is the partition whose blocks are the nonempty intersections Bj ∩ Cj′, and it is the least upper bound of π and σ for the refinement partial order. The ditset of the join is the union of the ditsets: dit(σ ∨ π) = dit(σ) ∪ dit(π). Since the arbitrary intersection of equivalence relations is an equivalence relation, the meet σ ∧ π can be defined as the partition whose ditset is the complement of the smallest equivalence relation containing indit(σ) ∪ indit(π), and it is the greatest lower bound of σ and π. The top of the lattice is the discrete partition 1U = {{ui}}ui∈U of singletons of the elements of U, and the bottom of the lattice is the indiscrete partition 0U = {U} whose only block is all of U.

The lattice of partitions was known in the 19th century (e.g., Dedekind and Schröder). However, throughout the 20th century "the only operations on the family of equivalence relations fully studied, understood and deployed are the binary join ∨ and meet ∧ operations" [10, p. 445]. To go from a lattice of partitions to a logic of partitions comparable to the usual Boolean logic of subsets, there needs to be at least an implication operation defined on partitions. That operation would be the parallel to the subset implication or conditional S ⊃ T = Sᶜ ∪ T for S, T ⊆ U in the powerset Boolean algebra ℘(U) of subsets of U, where the partial order is inclusion, the join and meet are union and intersection respectively, and the top and bottom are U and ∅ respectively. The implication σ ⇒ π is the partition on U that is like π except that whenever a block Bj ∈ π is contained in some block of σ, then Bj is replaced by the singletons of its elements. If we denote a block B ∈ π as 1B when it has been "discretized" and as 0B when it remains whole, then the implication σ ⇒ π functions like a characteristic or indicator function for inclusion of π-blocks in σ-blocks. Thus when they are all included, i.e., when refinement holds, the implication is the discrete partition 1U: $$\sigma \Rightarrow \pi = \mathbf{1}_U \quad\text{iff}\quad \sigma \precsim \pi,$$ which is just the partition logic version of the subset logic relation: $$S \supset T = U \quad\text{iff}\quad S \subseteq T.$$

Thus, the usual Boolean logic of subsets (often presented in only the special case of “propositional logic”) has a dual logic of partitions. The elements (Its) of subsets and the distinctions (Dits) of partitions have dual corresponding roles as illustrated in Table 1.

Table 1.

Elements-distinctions duality between the two dual logics.

         | Logic ℘(U) of subsets of U | Logic of partitions Π(U) on U
Its/Dits | Elements of subsets        | Distinctions of partitions
P.O.     | Inclusion of subsets       | Inclusion of ditsets
Join     | Union of subsets           | Union of ditsets
Meet     | Subset of common elements  | Ditset of common dits
Impl.    | S ⊃ T = U iff S ⊆ T        | σ ⇒ π = 1U iff σ ≾ π
Top      | Subset U with all elements | Partition 1U with all distinctions
Bottom   | Subset ∅ with no elements  | Partition 0U with no distinctions

For the simplest non-trivial case, the two lattices are illustrated in Figure 1 for U = {a, b, c}.

Figure 1.

The lattices of the dual subsets and partitions.

3. The New Logical Measure of Information

Gian-Carlo Rota made the crucial connection between the dual notions of subsets and partitions: "The lattice of partitions plays for information the role that the Boolean algebra of subsets plays for size or probability" [11, p. 30]. $$\frac{\text{Subsets}}{\text{Probability}} \approx \frac{\text{Partitions}}{\text{Information}}.$$

In his Fubini Lectures, Rota said “Probability is a measure on the Boolean algebra of events” that gives quantitatively the “intuitive idea of the size of a set,” so we may ask by “analogy” for some measure to capture a property for a partition like “what size is to a set.” Rota goes on to ask:

How shall we be led to such a property? We have already an inkling of what it should be: it should be a measure of information provided by a random variable. Is there a candidate for the measure of the amount of information? [12, p. 67]

We have seen the duality between elements of a subset and dits of a partition, i.e., $$\frac{\text{Elements}}{\text{Subset}} \approx \frac{\text{Distinctions}}{\text{Partition}},$$ so the "size" of a partition may be taken as the number of distinctions.

The new logical foundations for information theory [4] start with sets, not probabilities, as suggested by Andrei Kolmogorov.

Information theory must precede probability theory, and not be based on it. By the very essence of this discipline, the foundations of information theory have a finite combinatorial character. [13, p. 39]

Since logical probability theory [8] starts as the normalized size of a subset, i.e., $\Pr(S) = \frac{|S|}{|U|}$, the notion of information-as-distinctions starts with the normalized size of a partition's ditset. This gives the logical entropy (with equiprobable points of U) as: $$h(\pi) = \frac{|\mathrm{dit}(\pi)|}{|U \times U|} = \frac{|U \times U - \mathrm{indit}(\pi)|}{|U \times U|} = 1 - \frac{\left|\cup_{j=1}^{m} B_j \times B_j\right|}{|U \times U|} = 1 - \sum\nolimits_{j=1}^{m} \left(\frac{|B_j|}{|U|}\right)^2 = 1 - \sum\nolimits_{j=1}^{m} \Pr(B_j)^2 = \sum\nolimits_{j \neq j'} \Pr(B_j)\Pr(B_{j'}).$$

Given any (always positive) probability measure p : U → [0, 1] on U = {u1, ..., un}, which defines pi = p(ui) for i = 1, ..., n, the product measure p × p : U × U → [0, 1] has, for any S ⊆ U × U, the value: $$p \times p(S) = \sum\nolimits_{(u_i, u_k) \in S} p_i p_k.$$

The logical entropy of π is thus the product measure of its ditset: $$h(\pi) = p \times p(\mathrm{dit}(\pi)) = \sum\nolimits_{(u_i, u_k) \in \mathrm{dit}(\pi)} p_i p_k = \sum\nolimits_{j \neq j'} \Pr(B_j)\Pr(B_{j'})$$ where Pr(Bj) = ∑ui∈Bj pi.
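To make the definitions concrete, here is a minimal Python sketch computing a logical entropy both ways: as the product measure of the ditset and from the block probabilities. The universe, point probabilities, and partition below are hypothetical illustration values, not from the paper.

```python
from itertools import product

# Hypothetical universe, point probabilities, and partition (illustration only)
U = ['a', 'b', 'c', 'd']
p = {'a': 0.4, 'b': 0.3, 'c': 0.2, 'd': 0.1}
pi = [{'a', 'b'}, {'c'}, {'d'}]

def block_of(u, partition):
    """Return the block of the partition containing u."""
    return next(B for B in partition if u in B)

# dit(pi): ordered pairs of elements in different blocks
dits = {(u, v) for u, v in product(U, U)
        if block_of(u, pi) != block_of(v, pi)}

# h(pi) = p x p(dit(pi)): product measure of the ditset
h_ditset = sum(p[u] * p[v] for u, v in dits)

# h(pi) = 1 - sum_j Pr(B_j)^2: block-probability form
h_blocks = 1 - sum(sum(p[u] for u in B) ** 2 for B in pi)

print(h_ditset, h_blocks)  # both 0.46 = two-draw probability of a distinction
```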

Interpretation of logical entropy

The logical entropy h(π) of a partition π is the probability that in two draws from U (with replacement), one gets a distinction of the partition π.

Similarly, Pr(S) is the probability that in one draw from U, one gets an element of the subset S ⊆ U. Thus the duality between subsets and partitions in their quantitative versions gives a duality between probability theory and information theory, illustrated in Table 2.

Table 2.

Duality of quantitative subsets and partitions.

               | Logical Probability Theory | Logical Information Theory
Outcomes       | Elements of S              | Distinctions of π
Events         | Subsets S ⊆ U              | Ditsets dit(π) ⊆ U × U
pi = 1/n       | Pr(S) = |S|/|U|            | h(π) = |dit(π)|/|U × U|
Probs. p       | Pr(S) = ∑ui∈S pi           | h(π) = ∑(ui,uk)∈dit(π) pipk
Interpretation | 1-draw prob. of S-element  | 2-draw prob. of π-distinction

Given partitions π = {B1, ..., Bm} and σ = {C1, ..., Cm′} on U, the ditset for their join is: $$\mathrm{dit}(\pi \vee \sigma) = \mathrm{dit}(\pi) \cup \mathrm{dit}(\sigma) \subseteq U \times U.$$

Given probabilities p = {p1, ..., pn}, the joint logical entropy is: $$h(\pi, \sigma) = h(\pi \vee \sigma) = p \times p(\mathrm{dit}(\pi) \cup \mathrm{dit}(\sigma)) = 1 - \sum\nolimits_{j, j'} p(B_j \cap C_{j'})^2.$$

The ditset for the difference (or conditional) logical entropy h(π|σ) is the difference of the ditsets, so h(π|σ) = p × p(dit(π) − dit(σ)). The ditset for the logical mutual information m(π, σ) is the intersection of the ditsets, so m(π, σ) = p × p(dit(π) ∩ dit(σ)). Venn diagrams apply to measures, and since logical entropy is a probability measure on U × U, Figure 2 illustrates the Venn diagram for the compound notions of logical entropy.

Figure 2.

Venn diagram for compound logical entropies.

As in any Venn diagram for values of a measure, certain relationships hold, such as: $$h(\pi \vee \sigma) = h(\pi, \sigma) = h(\pi) + h(\sigma) - m(\pi, \sigma) = h(\pi|\sigma) + m(\pi, \sigma) + h(\sigma|\pi).$$
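Since the compound logical entropies are just the product measure applied to unions, differences, and intersections of ditsets, the Venn diagram identities can be checked directly in a few lines; a self-contained sketch with two hypothetical partitions follows.

```python
from itertools import product

# Hypothetical universe, probabilities, and two partitions (illustration only)
U = ['a', 'b', 'c', 'd']
p = {'a': 0.4, 'b': 0.3, 'c': 0.2, 'd': 0.1}
pi = [{'a', 'b'}, {'c'}, {'d'}]
sigma = [{'a', 'c'}, {'b', 'd'}]

def ditset(partition):
    """Ordered pairs of elements of U in different blocks of the partition."""
    block = lambda u: next(B for B in partition if u in B)
    return {(u, v) for u, v in product(U, U) if block(u) != block(v)}

def pxp(S):
    """Product measure p x p of a subset S of U x U."""
    return sum(p[u] * p[v] for u, v in S)

d_pi, d_sg = ditset(pi), ditset(sigma)
h_join = pxp(d_pi | d_sg)   # h(pi v sigma) = h(pi, sigma)

# h(pi v sigma) = h(pi) + h(sigma) - m(pi, sigma)
assert abs(h_join - (pxp(d_pi) + pxp(d_sg) - pxp(d_pi & d_sg))) < 1e-12
# h(pi v sigma) = h(pi|sigma) + m(pi, sigma) + h(sigma|pi)
assert abs(h_join - (pxp(d_pi - d_sg) + pxp(d_pi & d_sg) + pxp(d_sg - d_pi))) < 1e-12
```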

4. Deriving the Shannon Entropies from the Logical Entropies

The simple and compound definitions for Shannon entropy, $H(\pi) = \sum\nolimits_{j=1}^{m} \Pr(B_j)\log\left(\frac{1}{\Pr(B_j)}\right)$, were defined so that the Venn diagram relationships hold ([14,15]), but they are not defined in terms of a measure (in the sense of measure theory) [16]. However, all the Shannon entropies can be derived from the definitions of logical entropy (which is a measure) by a uniform monotonic transformation that preserves the Venn diagram relationships. It is easiest to work with the entropies of a probability distribution p = (p1, ..., pn) on U where: $$h(p) = h(\mathbf{1}_U) = 1 - \sum\nolimits_{i=1}^{n} p_i^2 = \sum\nolimits_{i \neq k} p_i p_k = \sum\nolimits_{i=1}^{n} p_i(1 - p_i)$$ $$H(p) = H(\mathbf{1}_U) = \sum\nolimits_{i=1}^{n} p_i \log\left(\frac{1}{p_i}\right).$$

Intuitively, if pi = 1, then there is no information in the occurrence of ui, so information is measured by the 1-complement. But there are two 1-complements: the additive 1-complement 1 − pi and the multiplicative 1-complement $\frac{1}{p_i}$.

  • The additive probability average of the additive 1-complements is the logical entropy: $h(p) = \sum\nolimits_{i=1}^{n} p_i(1 - p_i)$.

  • The multiplicative probability average of the multiplicative 1-complements is the log-free or anti-log version of Shannon entropy: $\prod\nolimits_{i=1}^{n} \left(\frac{1}{p_i}\right)^{p_i} = \log^{-1}(H(p))$.

Then the particular log is chosen according to the application, e.g., log2 in coding theory and ln in statistical mechanics. Since taking the log of the log-free version of Shannon entropy turns the multiplicative average into an additive average, we can then see how to directly transform the logical formulas into the Shannon formulas by the dit-bit transform: $$1 - p_i \rightsquigarrow \log\left(\frac{1}{p_i}\right).$$
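As a quick numerical sketch of the two averages (with an arbitrary illustrative distribution), the following computes the logical entropy, the base-2 Shannon entropy, and the log-free anti-log version, confirming that taking the log of the latter recovers H(p).

```python
import math

# A hypothetical distribution (illustration only)
probs = [0.5, 0.25, 0.125, 0.125]

h = sum(pi * (1 - pi) for pi in probs)               # additive average of 1 - p_i
H = sum(pi * math.log2(1 / pi) for pi in probs)      # Shannon entropy, base 2
antilog = math.prod((1 / pi) ** pi for pi in probs)  # multiplicative average of 1/p_i

print(h)                      # 0.65625 = logical entropy h(p)
print(H, math.log2(antilog))  # both 1.75: the log of the anti-log form is H(p)
```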

When the compound logical entropy formulas are formulated in terms of the additive 1-complements, then the dit-bit transform gives the corresponding compound formula for the Shannon entropies. This is illustrated in Table 3 for the probability distribution p : U → [0, 1] and a joint distribution p : X × Y → [0, 1].

Table 3.

The dit-bit transform from logical entropy to Shannon entropy.

The Dit-Bit Transform: $1 - p_i \rightsquigarrow \log\left(\frac{1}{p_i}\right)$
h(p) = ∑i pi(1 − pi)
H(p) = ∑i pi log(1/pi)
h(X, Y) = ∑x,y p(x, y)[1 − p(x, y)]
H(X, Y) = ∑x,y p(x, y) log(1/p(x, y))
h(X|Y) = ∑x,y p(x, y)[(1 − p(x, y)) − (1 − p(y))]
H(X|Y) = ∑x,y p(x, y)[log(1/p(x, y)) − log(1/p(y))]
m(X, Y) = ∑x,y p(x, y)[[1 − p(x)] + [1 − p(y)] − [1 − p(x, y)]]
I(X, Y) = ∑x,y p(x, y)[log(1/p(x)) + log(1/p(y)) − log(1/p(x, y))]

The dit-bit transform preserves the same Venn diagram formulas for the Shannon entropies even though they are not a measure (in the sense of measure theory), so those relationships, illustrated in Figure 3, are normally termed a "mnemonic" [16, p. 112].

Figure 3.

Venn diagram “mnemonic” for compound Shannon entropies.

5. Logical Entropy via Density Matrices

All this will carry over to the quantum version of logical entropy by using density matrices. First, the “classical” treatment of logical entropy is restated using density matrices over the reals. Then that will extend immediately to the quantum case of density matrices over the complex numbers by making the appropriate changes such as replacing the square with the absolute square.

Let’s do the density matrix version of p × p(dit(π)). The density matrix associated with each block Bj ∈ π is the projection matrix ρ(Bj) = |bj⟩⟨bj| where |bj⟩ is the n × 1 column vector with entries $\sqrt{\frac{p_i}{\Pr(B_j)}}$ if ui ∈ Bj, else 0. Thus the entries are $\rho(B_j)_{i,k} = \frac{\sqrt{p_i p_k}}{\Pr(B_j)}$ if ui, uk ∈ Bj, else 0. The density matrix for the partition is $\rho(\pi) = \sum\nolimits_{j=1}^{m} \Pr(B_j)\rho(B_j)$ where $\rho(\pi)_{ik} = \sqrt{p_i p_k}$ if (ui, uk) ∈ indit(π), else 0. To recover the logical entropy h(π) = p × p(dit(π)) using density matrices, it can be calculated as 1 − tr[ρ(π)²]. A basic result about any density matrix ρ is tr[ρ²] = ∑i,k |ρik|² [17, p. 77], so $$\mathrm{tr}[\rho(\pi)^2] = \sum\nolimits_{(u_i, u_k) \in \mathrm{indit}(\pi)} p_i p_k = 1 - \sum\nolimits_{(u_i, u_k) \in \mathrm{dit}(\pi)} p_i p_k = 1 - p \times p(\mathrm{dit}(\pi))$$ and thus: $$h(\pi) = p \times p(\mathrm{dit}(\pi)) = 1 - \mathrm{tr}[\rho(\pi)^2].$$

Example

Consider U = {a, b, c} with p : U → [0, 1] where $p_a = \frac{1}{2}$, $p_b = \frac{1}{3}$, and $p_c = \frac{1}{6}$, and π = {B1, B2} = {{a, b}, {c}}. The usual calculation of the logical entropy is $h(\pi) = 1 - \left(\frac{5}{6}\right)^2 - \left(\frac{1}{6}\right)^2 = 1 - \frac{26}{36} = \frac{5}{18}$. Then the density matrix calculation is: $$\rho(B_1) = |b_1\rangle\langle b_1| = \begin{bmatrix} \sqrt{\frac{1/2}{5/6}} \\ \sqrt{\frac{1/3}{5/6}} \\ 0 \end{bmatrix}\begin{bmatrix} \sqrt{\frac{1/2}{5/6}} & \sqrt{\frac{1/3}{5/6}} & 0 \end{bmatrix} = \begin{bmatrix} \frac{1/2}{5/6} & \frac{\sqrt{1/6}}{5/6} & 0 \\ \frac{\sqrt{1/6}}{5/6} & \frac{1/3}{5/6} & 0 \\ 0 & 0 & 0 \end{bmatrix};$$ $$\rho(B_2) = |b_2\rangle\langle b_2| = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\begin{bmatrix} 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$ so that: $$\rho(\pi) = \sum\nolimits_{j=1}^{2} \Pr(B_j)\rho(B_j) = \begin{bmatrix} 1/2 & \sqrt{1/6} & 0 \\ \sqrt{1/6} & 1/3 & 0 \\ 0 & 0 & 1/6 \end{bmatrix} \quad\text{and}\quad \rho(\pi)^2 = \begin{bmatrix} \frac{5}{12} & \frac{5\sqrt{6}}{36} & 0 \\ \frac{5\sqrt{6}}{36} & \frac{5}{18} & 0 \\ 0 & 0 & \frac{1}{36} \end{bmatrix}.$$

Then $\mathrm{tr}[\rho(\pi)^2] = \frac{15}{36} + \frac{10}{36} + \frac{1}{36} = \frac{26}{36}$, so $1 - \mathrm{tr}[\rho(\pi)^2] = \frac{5}{18} = h(\pi)$. ✓
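The example can be checked with a few lines of numpy; this sketch rebuilds ρ(π) from the block vectors |bj⟩ and recovers h(π) = 5/18.

```python
import numpy as np

# The running example: p = (1/2, 1/3, 1/6) and pi = {{a, b}, {c}}
p = np.array([1/2, 1/3, 1/6])
blocks = [[0, 1], [2]]  # indices of a, b, c

rho = np.zeros((3, 3))
for B in blocks:
    PrB = p[B].sum()
    b = np.zeros(3)
    b[B] = np.sqrt(p[B] / PrB)   # the column vector |b_j>
    rho += PrB * np.outer(b, b)  # rho(pi) = sum_j Pr(B_j)|b_j><b_j|

print(1 - np.trace(rho @ rho), 5/18)  # both 0.2777... = h(pi)
```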

Borrowing the language of QM, ρ(Bj) as a projection matrix represents a pure state, i.e., ρ(Bj)² = ρ(Bj). Then ρ(π) represents a mixed state where the pure states ρ(Bj) occur with the probabilities Pr(Bj). The dictionary giving the reformulation of set partition concepts in terms of density matrices is given in Table 4.

Table 4.

Dictionary translating set partitions into density matrices.

Set concept with probabilities    | Set-level density matrix concept
Partition π with point probs. p   | Density matrix ρ(π) = ∑j Pr(Bj)|bj⟩⟨bj|
Point probabilities {p1, ..., pn} | Values of the diagonal entries of ρ(π)
Trivial indits (ui, ui) of π      | Diagonal entries of ρ(π)
Non-trivial indits of π           | Non-zero off-diagonal entries of ρ(π)
Dits of π                         | Zero entries of ρ(π)
Sum Pr(Bj) = ∑ui∈Bj pi            | Trace tr[PBj ρ(π)]
Block probabilities Pr(Bj) in π   | Non-zero eigenvalues of ρ(π)
Block prob. 1 of U in 0U = {U}    | Non-zero eigenvalue of 1 for ρ(0U)

We also need to give the set version of the "measurement" of a state ρ(π) by an "observable" given by a real-valued numerical attribute g : U → ℝ, which defines the inverse-image partition g⁻¹ = {g⁻¹(s)}s∈g(U). In QM, the transformation of a density matrix ρ(π) in the projective measurement by an observable is given by the Lüders mixture operation [18, p. 279]. For each block g⁻¹(s) in the observable partition g⁻¹, the diagonal projection matrix Ps has the diagonal entries χg⁻¹(s)(ui), i.e., (Ps)ii = 1 if g(ui) = s, else 0. Then the Lüders mixture operation gives the post-measurement density matrix ρ̂(π) as: $$\hat\rho(\pi) = \sum\nolimits_{s \in g(U)} P_s \rho(\pi) P_s.$$

Proposition 1

ρ̂(π) = ρ(π ∨ g⁻¹).

Proof

A nonzero entry in ρ(π) has the form $\rho(\pi)_{ik} = \sqrt{p_i p_k}$ iff there is some block Bj ∈ π such that (ui, uk) ∈ Bj × Bj, i.e., iff ui, uk ∈ Bj, and otherwise the entry is 0. The matrix operation Psρ(π) will preserve the entry $\sqrt{p_i p_k}$ if ui ∈ g⁻¹(s); otherwise the entry is zeroed. And if the entry was preserved, then the further matrix operation (Psρ(π))Ps will preserve the entry $\sqrt{p_i p_k}$ if uk ∈ g⁻¹(s); otherwise it is zeroed. Hence the entries $\sqrt{p_i p_k}$ in ρ(π) that are preserved in Psρ(π)Ps are the entries where both ui, uk ∈ Bj for some Bj ∈ π and ui, uk ∈ g⁻¹(s). These are the entries in ρ(π ∨ g⁻¹) corresponding to the blocks Bj ∩ g⁻¹(s) for some Bj ∈ π, so summing over the blocks g⁻¹(s) ∈ g⁻¹ gives the result: ρ̂(π) = ∑s∈g(U) Psρ(π)Ps = ρ(π ∨ g⁻¹).

Proposition 2

(Measuring measurement). In the "projective measurement" ρ(π) ⇝ ρ(π ∨ g⁻¹), the sum of the squares of the non-zero off-diagonal entries of ρ(π) that were zeroed in ρ̂(π) = ρ(π ∨ g⁻¹) is the difference in their logical entropies: h(π ∨ g⁻¹) − h(π) = h(π ∨ g⁻¹|π).

Proof

Since for any density matrix ρ, tr[ρ²] = ∑i,k |ρik|², $$h(\pi \vee g^{-1}|\pi) = h(\pi \vee g^{-1}) - h(\pi) = \left(1 - \mathrm{tr}[\rho(\pi \vee g^{-1})^2]\right) - \left(1 - \mathrm{tr}[\rho(\pi)^2]\right) = \sum\nolimits_{i,k} |\rho(\pi)_{ik}|^2 - \sum\nolimits_{i,k} |\rho(\pi \vee g^{-1})_{ik}|^2$$ since the action of the projection operators in the Lüders mixture operation is either to zero an entry or leave it the same.

Example (continued)

Let g(a) = 1, g(b) = g(c) = 0. Then g⁻¹ = {{a}, {b, c}}, so π ∨ g⁻¹ = {{a}, {b}, {c}} = 1U, and thus $h(\pi \vee g^{-1}) = h(\mathbf{1}_U) = 1 - \left(\frac{1}{2}\right)^2 - \left(\frac{1}{3}\right)^2 - \left(\frac{1}{6}\right)^2 = 1 - \frac{9}{36} - \frac{4}{36} - \frac{1}{36} = 1 - \frac{14}{36} = \frac{11}{18}$, so that $h(\pi \vee g^{-1}) - h(\pi) = \frac{11}{18} - \frac{5}{18} = \frac{1}{3}$. The density matrix for the discrete partition is: $$\rho(\pi \vee g^{-1}) = \rho(\mathbf{1}_U) = \begin{bmatrix} 1/2 & 0 & 0 \\ 0 & 1/3 & 0 \\ 0 & 0 & 1/6 \end{bmatrix} \quad\text{and}\quad \rho(\pi) = \begin{bmatrix} 1/2 & \sqrt{1/6} & 0 \\ \sqrt{1/6} & 1/3 & 0 \\ 0 & 0 & 1/6 \end{bmatrix}$$ so the sum of the squares of the zeroed elements is $\left(\sqrt{1/6}\right)^2 + \left(\sqrt{1/6}\right)^2 = \frac{1}{3}$.
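The Lüders mixture operation and the "measuring measurement" result can be verified directly in numpy; the following self-contained sketch rebuilds ρ(π) for the running example and checks that the sum of the squares of the decohered entries equals the increase in logical entropy.

```python
import numpy as np

# The running example: rho(pi) for p = (1/2, 1/3, 1/6), pi = {{a, b}, {c}}
p = np.array([1/2, 1/3, 1/6])

def rho_of(blocks):
    """rho(partition) = sum_j Pr(B_j)|b_j><b_j| for blocks given by indices."""
    r = np.zeros((3, 3))
    for B in blocks:
        PrB = p[B].sum()
        b = np.zeros(3)
        b[B] = np.sqrt(p[B] / PrB)
        r += PrB * np.outer(b, b)
    return r

rho = rho_of([[0, 1], [2]])

# Lueders mixture for g^{-1} = {{a}, {b, c}}: rho_hat = sum_s P_s rho P_s
rho_hat = np.zeros((3, 3))
for B in [[0], [1, 2]]:
    P = np.diag([1.0 if i in B else 0.0 for i in range(3)])
    rho_hat += P @ rho @ P

delta_h = (1 - np.trace(rho_hat @ rho_hat)) - (1 - np.trace(rho @ rho))
zeroed = np.sum(rho ** 2) - np.sum(rho_hat ** 2)  # squares of decohered entries
print(delta_h, zeroed)                            # both 1/3, as in the example
```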

The measuring measurement result deals with the non-zero off-diagonal terms in a density matrix.

[T]he off-diagonal terms of a density matrix... are often called quantum coherences because they are responsible for the interference effects typical of quantum mechanics that are absent in classical dynamics. [18, p. 177]

Since a projective measurement's effect on a density matrix in QM is given by the Lüders mixture operation, the effect of the measurement is the above-described "making distinctions" by decohering or zeroing certain coherence terms in the density matrix, and the sum of the absolute squares of the coherences that were decohered is the change in the logical entropy. This is a foretaste of the results for quantum logical entropy.

In coding theory, the Hamming distance between two 0,1-vectors of length n is the number of places where they differ. The partition version of this idea is the measure of where two partitions π and σ on U differ [19], which in terms of logical entropy is the logical Hamming distance between partitions (see Figure 2): $$d(\pi, \sigma) := h(\pi|\sigma) + h(\sigma|\pi) = h(\pi \vee \sigma) - m(\pi, \sigma) = 2h(\pi \vee \sigma) - h(\pi) - h(\sigma).$$

Intuitively, it is the logical information that is in each partition but not in the other, so it is a measure of how they differ, i.e., how “far apart” they are.

Lemma 1

h(π ∨ σ) = 1 − tr[ρ(π)ρ(σ)].

Proof

The k-th diagonal entry in ρ(π)ρ(σ) is the scalar product ∑i ρ(π)ki ρ(σ)ik with $\rho(\pi)_{ki} = \sqrt{p_k p_i}$ if (uk, ui) ∈ indit(π) and otherwise 0, and similarly for ρ(σ)ik. Hence the only non-zero terms in that sum are for (uk, ui) ∈ indit(π) ∩ indit(σ) = indit(π ∨ σ). Hence tr[ρ(π)ρ(σ)] = ∑(ui,uk)∈indit(π∨σ) pipk = 1 − ∑(ui,uk)∈dit(π∨σ) pipk, so h(π ∨ σ) = 1 − tr[ρ(π)ρ(σ)], and similarly for tr[ρ(σ)ρ(π)].

The quantity tr[(ρ(π) − ρ(σ))²] is usually termed the Hilbert-Schmidt distance between two density matrices ([20,21]) (sometimes with a 1/2 coefficient). It should be noted that the Hilbert-Schmidt distance is defined quite independently of logical entropy and yet it is equal to the logical distance.

Proposition 3

tr[(ρ(π) − ρ(σ))²] = h(π|σ) + h(σ|π) = d(π, σ).

Proof

tr[(ρ(π) − ρ(σ))²] = tr[ρ(π)²] − tr[ρ(π)ρ(σ)] − tr[ρ(σ)ρ(π)] + tr[ρ(σ)²], so: $$\mathrm{tr}[(\rho(\pi) - \rho(\sigma))^2] = 2\left[1 - \mathrm{tr}[\rho(\pi)\rho(\sigma)]\right] - \left(1 - \mathrm{tr}[\rho(\pi)^2]\right) - \left(1 - \mathrm{tr}[\rho(\sigma)^2]\right) = 2h(\pi \vee \sigma) - h(\pi) - h(\sigma) = h(\sigma|\pi) + h(\pi|\sigma) = d(\pi, \sigma).$$
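A numerical check of Lemma 1 and Proposition 3 on the running example: the helper below rebuilds the two density matrices and compares the Hilbert-Schmidt distance with the logical Hamming distance.

```python
import numpy as np

p = np.array([1/2, 1/3, 1/6])

def rho_of(blocks):
    """rho(partition) = sum_j Pr(B_j)|b_j><b_j| for blocks given by indices."""
    r = np.zeros((3, 3))
    for B in blocks:
        PrB = p[B].sum()
        b = np.zeros(3)
        b[B] = np.sqrt(p[B] / PrB)
        r += PrB * np.outer(b, b)
    return r

rho_pi = rho_of([[0, 1], [2]])  # pi    = {{a, b}, {c}}
rho_sg = rho_of([[0], [1, 2]])  # sigma = g^{-1} = {{a}, {b, c}}

hs = np.trace((rho_pi - rho_sg) @ (rho_pi - rho_sg))  # Hilbert-Schmidt distance
h_join = 1 - np.trace(rho_pi @ rho_sg)                # h(pi v sigma), by Lemma 1
d = 2 * h_join - (1 - np.trace(rho_pi @ rho_pi)) \
              - (1 - np.trace(rho_sg @ rho_sg))       # logical Hamming distance

print(hs, d)  # both 4/9
```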

Corollary 1

tr[(ρ̂(π) − ρ(π))²] = h(ρ̂(π)|ρ(π)).

Proof

Taking σ = g⁻¹ as the inverse-image partition of the numerical attribute, ρ̂(π) = ρ(π ∨ σ). Since π ≾ π ∨ σ, dit(π) ⊆ dit(π ∨ σ), so dit(π) − dit(π ∨ σ) = ∅ and thus h(π|π ∨ σ) = 0.

6. Linearization from Sets to Vector Spaces

Our goal is to systematically derive the quantum logical entropy starting from the logic of partitions. We have so far developed the notion of logical entropy at the set level and given a number of results. There is a semi-algorithmic procedure or “yoga” [22, p. 271] to transform set concepts into the corresponding vector space concepts. The yoga is, in general, part of the mathematical folklore but parts have been stated explicitly [23, pp. 355–361].

Yoga of Linearization

Apply a set concept to a basis set of a vector space, and whatever is linearly generated is the corresponding vector space concept.

This yoga or procedure shows how the logical entropy concepts developed so far can be transformed into the corresponding concepts and results in the Hilbert vector spaces of quantum mechanics. Indeed, the previous results formulated using density matrices extend, mutatis mutandis (e.g., using the absolute square instead of the ordinary square), to the corresponding results about quantum logical entropy.

Hence we need to develop the dictionary to translate set concepts into vector space concepts. A subset of a basis set generates a subspace, and the cardinality of the subset is the dimension of the subspace. Without assuming any probability distribution on U, a (real-valued) numerical attribute (e.g., weight, height, or age of persons) is a function f : U → ℝ. In any vector space V over a field containing the reals where U is now a basis set, the numerical attribute generates a linear operator F : V → V with real eigenvalues by the definition F ui = f(ui)ui. If we let f↾S = rS mean that the numerical attribute f restricted to S has the constant value r, then the vector space version is the eigenvector-eigenvalue equation Fυ = rυ. This means that the set-version of an eigenvector is a constant set of f, and the constant value is a set-version of an eigenvalue. The numerical attribute's inverse-image is a partition f⁻¹ = {f⁻¹(r)}r∈f(U), and each block f⁻¹(r) generates a subspace Vr which is the eigenspace of the induced F for the eigenvalue r. Thus the partition f⁻¹ generates a set of subspaces {Vr}r∈f(U) such that every vector υ can be uniquely expressed as a sum of vectors υr ∈ Vr, i.e., the partition f⁻¹ generates a direct-sum decomposition (DSD) {Vr}r∈f(U) of the vector space. When the numerical attribute is just a characteristic function χ : U → 2 = {0, 1} of some attribute on U, then the induced operator P1 : V → V is the projection operator to the subspace generated by χ⁻¹(1). Hence for a general numerical attribute f, we have projection operators Pr to the subspaces Vr generated by f⁻¹(r). The spectral decomposition of the induced operator is F = ∑r∈f(U) rPr, which, worked backwards, gives the spectral decomposition in the set case: f = ∑r∈f(U) rχf⁻¹(r), where χf⁻¹(r) is the characteristic function for the subset f⁻¹(r). And finally, the direct product U × U of a basis set U for V will (bi)linearly generate the tensor product V ⊗ V (where the ordered pair (ui, uk) is written ui ⊗ uk). These linearizations are summarized in Table 5, followed by a short sketch in code.

Table 5.

Linearization dictionary to translate set concepts into corresponding vector space concepts.

Set concept                           | Vector-space concept
Subset S ⊆ U                          | Subspace [S] ⊆ V
Partition {f⁻¹(r)}r∈f(U)              | DSD {Vr}r∈f(U)
Disjoint union U = ⊎r∈f(U) f⁻¹(r)     | Direct sum V = ⊕r∈f(U) Vr
Numerical attribute f : U → ℝ         | Observable F ui = f(ui)ui
f↾S = rS                              | F ui = rui
Constant set S of f                   | Eigenvector ui of F
Value r on constant set S             | Eigenvalue r of eigenvector ui
Characteristic fcn. χS : U → {0, 1}   | Projection operator P[S]ui = χS(ui)ui
∑r∈f(U) χf⁻¹(r) = χU                  | ∑r∈f(U) Pr = I : V → V
Spectral Decomp. f = ∑r∈f(U) rχf⁻¹(r) | Spectral Decomp. F = ∑r∈f(U) rPr
Set of r-constant sets ℘(f⁻¹(r))      | Eigenspace Vr of r-eigenvectors
Direct product U × U                  | Tensor product V ⊗ V
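As a small sketch of the middle rows of the dictionary: a numerical attribute f on a basis set induces a diagonal observable whose eigenspace projections sum to the identity and recover F as its spectral decomposition. The attribute values below are a hypothetical illustration.

```python
import numpy as np

# Hypothetical numerical attribute on the basis {a, b, c}: f(a)=1, f(b)=f(c)=0
f = np.array([1.0, 0.0, 0.0])
F = np.diag(f)  # F u_i = f(u_i) u_i in the U-basis

# Eigenspace projections P_r generated by the blocks f^{-1}(r)
projections = {r: np.diag((f == r).astype(float)) for r in np.unique(f)}

# Completeness: sum_r P_r = I, and spectral decomposition: F = sum_r r P_r
assert np.allclose(sum(projections.values()), np.eye(3))
assert np.allclose(F, sum(r * P for r, P in projections.items()))
```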
7. Generalization to Quantum Logical Entropy

We have developed the notion of logical entropy as the quantitative version of partitions. The mathematics used was at the level of sets, e.g., numerical attributes and probability distributions on a set U . We have also outlined the semi-algorithmic yoga of linearization to translate set concepts into the corresponding vector space concepts. Hence the new concept of quantum logical entropy can be developed in a straightforward manner by linearizing the definition of logical entropy to Hilbert space.

The logical notion of information-as-distinctions generalizes to quantum information theory. A qubit is a pair of states definitely distinguishable in the sense of being orthogonal. In general, a qudit needs to be relativized to an observable, just as a dit is a dit of a partition, such as the inverse-image partition f⁻¹ of a numerical attribute f : U → ℝ. Given such a numerical attribute f defined on an orthonormal (ON) basis for a (finite-dimensional) Hilbert space V, a Hermitian (or self-adjoint) operator F : V → V is defined by F ui = f(ui)ui. The definition can be reversed: given a Hermitian operator F on V, there is an ON basis of eigenvectors U, and a real-valued numerical attribute f, the eigenvalue function, is defined on U by taking each element ui to its eigenvalue.

A qudit of an observable F is a pair (ui, uk) in the eigenbasis definitely distinguishable by F, i.e., with distinct eigenvalues f(ui) ≠ f(uk). Let qudit(F) be the set of tensor product basis elements ui ⊗ uk for f(ui) ≠ f(uk). Since the quantum version of logical entropy is a straightforward generalization from sets to vector spaces, we give the generalization in Table 6. Numerical attributes f, g on U generate commuting observables F, G, and commuting observables F, G generate eigenvalue functions f, g on the ON basis U of simultaneous eigenvectors. We follow Kolmogorov's dictum by first giving the basic machinery without probabilities.

Table 6.

Yoga of Linearization without probabilities case.

Logical entropy                              | Quantum logical entropy
U = {u1, ..., un}                            | ON basis U for Hilbert space V
f, g : U → ℝ                                 | Commuting F, G : V → V
{r}r∈f(U), {s}s∈g(U)                         | Eigenvalues of F and G
π = {f⁻¹(r)}r∈f(U), σ = {g⁻¹(s)}s∈g(U)       | DSDs of eigenspaces of F, G
Dits of π: (ui, uk), f(ui) ≠ f(uk)           | Qudits of F: ui ⊗ uk, f(ui) ≠ f(uk)
Dits of σ: (ui, uk), g(ui) ≠ g(uk)           | Qudits of G: ui ⊗ uk, g(ui) ≠ g(uk)
Ditset of π: dit(π)                          | [qudit(F)]: subspace generated in V ⊗ V
Ditset of σ: dit(σ)                          | [qudit(G)]: subspace generated in V ⊗ V
Join: dit(π) ∪ dit(σ) ⊆ U × U                | [qudit(F) ∪ qudit(G)] ⊆ V ⊗ V
Difference: dit(π) − dit(σ) ⊆ U × U          | [qudit(F) − qudit(G)] ⊆ V ⊗ V
Mutual: dit(π) ∩ dit(σ) ⊆ U × U              | [qudit(F) ∩ qudit(G)] ⊆ V ⊗ V

In quantum mechanics, the probability information is carried by the state to be measured. Hence Table 6 deals with the set and quantum versions of observables, not quantum states. The next step is to apply linearization to the set and vector space versions of the quantum state, which carries the probability information. At the set level, the universal set U is equipped with a probability distribution p : U → [0, 1]. Table 7 gives the translation dictionary that yields the quantum logical entropy.

Table 7.

Logical entropy + Linearization = quantum logical entropy.

Logical entropy                            | Quantum logical entropy
ρ(0U) = ρ(U) = ρ(U)²                       | Pure state ρ(ψ) = ρ(ψ)²
p × p on U × U                             | ρ(ψ) ⊗ ρ(ψ) on V ⊗ V
h(0U) = 1 − tr[ρ(0U)²] = 0                 | h(ρ(ψ)) = 1 − tr[ρ(ψ)²] = 0
π = f⁻¹, h(π) = p × p(dit(π))              | h(F : ψ) = tr[P[qudit(F)] ρ(ψ) ⊗ ρ(ψ)]
h(π, σ) = p × p(dit(π) ∪ dit(σ))           | tr[P[qudit(F)∪qudit(G)] ρ(ψ) ⊗ ρ(ψ)]
h(π|σ) = p × p(dit(π) − dit(σ))            | tr[P[qudit(F)−qudit(G)] ρ(ψ) ⊗ ρ(ψ)]
m(π, σ) = p × p(dit(π) ∩ dit(σ))           | tr[P[qudit(F)∩qudit(G)] ρ(ψ) ⊗ ρ(ψ)]
h(π) = h(π|σ) + m(π, σ)                    | h(F : ψ) = h(F|G : ψ) + m(F, G : ψ)
ρ(π) = ρ̂(0U) = ∑r∈f(U) Prρ(0U)Pr           | ρ̂(ψ) = ∑r∈f(U) Prρ(ψ)Pr
h(π) = 1 − tr[ρ(π)²]                       | h(F : ψ) = 1 − tr[ρ̂(ψ)²]

For an observable F, let f : U → ℝ be the F-eigenvalue function assigning the real eigenvalue f(ui) to each ui in the ON basis U = {u1, ..., un} of F-eigenvectors. The image f(U) is the set of F-eigenvalues {r1, ..., rm}. Let Pr : V → V be the projection matrix in the U-basis to the eigenspace of r. The projective F-measurement of the state ψ transforms the pure state density matrix ρ(ψ) (represented in the ON basis U of F-eigenvectors) to yield the Lüders mixture density matrix ρ̂(ψ) = ∑r∈f(U) Prρ(ψ)Pr [18, p. 279]. The off-diagonal elements of ρ(ψ) that are zeroed in ρ̂(ψ) are the coherences (quantum indistinctions or quindits) that are turned into "decoherences" (quantum distinctions or qudits of the observable being measured).

For any observable F and a pure state ψ, the quantum logical entropy was defined as h(F : ψ) = tr[P[qudit(F)] ρ(ψ) ⊗ ρ(ψ)]. That definition is the quantum generalization of the "classical" logical entropy defined as h(π) = p × p(dit(π)). When a projective F-measurement is performed on ψ, the pure state density matrix ρ(ψ) is transformed into the mixed state density matrix ρ̂(ψ) by the quantum Lüders mixture operation, which then defines the quantum logical entropy h(ρ̂(ψ)) = 1 − tr[ρ̂(ψ)²].

The first result is to show that these two entropies are the same: h(F : ψ) = h(ρ̂(ψ)). The proof proceeds by showing that they are both equal to the classical logical entropy of the partition π(F : ψ) defined on the ON basis U = {u1, ..., un} of F-eigenvectors by the F-eigenvalues, with the point probabilities $p_i = \alpha_i^* \alpha_i$ where $|\psi\rangle = \sum\nolimits_{i=1}^{n} \alpha_i |u_i\rangle$. That is, the inverse images Bj = f⁻¹(rj) for j = 1, ..., m of the eigenvalue function f : U → ℝ define the eigenvalue partition π(F : ψ) = {B1, ..., Bm} on the ON basis U = {u1, ..., un} with the point probabilities $p_i = \alpha_i^*\alpha_i = |\alpha_i|^2$ provided by the state ψ for i = 1, ..., n. The classical logical entropy of that partition is $h(\pi(F : \psi)) = 1 - \sum\nolimits_{j=1}^{m} p(B_j)^2$ where p(Bj) = ∑ui∈Bj pi. Then: $$h(F : \psi) = \mathrm{tr}[P_{[\mathrm{qudit}(F)]}\rho(\psi) \otimes \rho(\psi)] = \sum\nolimits_{i,k=1}^{n} \{p_i p_k : f(u_i) \neq f(u_k)\} = \sum\nolimits_{j \neq j'} \sum \{p_i p_k : u_i \in B_j, u_k \in B_{j'}\} = \sum\nolimits_{j \neq j'} p(B_j)p(B_{j'}) = 1 - \sum\nolimits_{j=1}^{m} p(B_j)^2 = h(\pi(F : \psi)).$$

To show that h(ρ̂(ψ)) = 1 − tr[ρ̂(ψ)²] = h(π(F : ψ)) for ρ̂(ψ) = ∑r∈f(U) Prρ(ψ)Pr, we need to compute tr[ρ̂(ψ)²]. An off-diagonal element $\rho_{ik}(\psi) = \alpha_i \alpha_k^*$ of ρ(ψ) survives the Lüders operation (i.e., is not zeroed and keeps the same value) if and only if f(ui) = f(uk). Hence, the i-th diagonal element of ρ̂(ψ)² is: $$\sum\nolimits_{k=1}^{n} \{\alpha_i^*\alpha_k\alpha_i\alpha_k^* : f(u_i) = f(u_k)\} = \sum\nolimits_{k=1}^{n} \{p_i p_k : f(u_i) = f(u_k)\} = p_i\, p(B_j)$$ where ui ∈ Bj. Then, grouping the i-th diagonal elements for ui ∈ Bj gives ∑ui∈Bj pi p(Bj) = p(Bj)². Hence, the whole trace is $\mathrm{tr}[\hat\rho(\psi)^2] = \sum\nolimits_{j=1}^{m} p(B_j)^2$, and thus: $$h(\hat\rho(\psi)) = 1 - \mathrm{tr}[\hat\rho(\psi)^2] = 1 - \sum\nolimits_{j=1}^{m} p(B_j)^2 = h(F : \psi).$$

This finishes the proof of the following proposition.

Proposition 4

h(F : ψ) = h(π(F : ψ)) = h(ρ̂(ψ)).

This shows how the quantum case is so closely related to the set case that, in many instances, we can compute results in the quantum case by converting to the set case where computations are simpler.
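As a sketch of Proposition 4 for a random complex state: the quantum logical entropy computed from the Lüders mixture agrees with the classical logical entropy of the eigenvalue partition with point probabilities |αi|². The eigenvalue function f below is a hypothetical illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = rng.normal(size=4) + 1j * rng.normal(size=4)
alpha /= np.linalg.norm(alpha)      # normalized |psi> in the F-eigenbasis
f = np.array([1.0, 1.0, 2.0, 3.0])  # hypothetical eigenvalue function

rho = np.outer(alpha, alpha.conj())  # pure state rho(psi)

# Lueders mixture: rho_hat = sum_r P_r rho P_r
rho_hat = np.zeros_like(rho)
for r in np.unique(f):
    P = np.diag((f == r).astype(float))
    rho_hat += P @ rho @ P

h_quantum = 1 - np.trace(rho_hat @ rho_hat).real

# Classical logical entropy of the eigenvalue partition, p_i = |alpha_i|^2
p = np.abs(alpha) ** 2
h_classical = 1 - sum(p[f == r].sum() ** 2 for r in np.unique(f))

print(h_quantum, h_classical)  # equal, as Proposition 4 asserts
```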

Measurement creates distinctions, i.e., turns coherences into “decoherences,” which, classically, is the operation of distinguishing elements by classifying them according to some attribute like classifying the faces of a die by their parity. The fundamental theorem about quantum logical entropy and projective measurement, in the density matrix version, shows how the quantum logical entropy created (starting with h(ρ(ψ)) = 0 for the pure state ψ) by the measurement can be computed directly from the coherences of ρ(ψ) that are decohered in ρ̂(ψ).

Proposition 5

(Measuring measurement). The increase in quantum logical entropy, h(F : ψ) = h(ρ̂(ψ)), due to the F-measurement of the pure state ψ is the sum of the absolute squares of the non-zero off-diagonal terms (coherences) in ρ(ψ) (represented in an ON basis of F-eigenvectors) that are zeroed ("decohered") in the post-measurement Lüders mixture density matrix ρ̂(ψ) = ∑r∈f(U) Prρ(ψ)Pr.

Proof

h(ρ̂(ψ)) − h(ρ(ψ)) = (1 − tr[ρ̂(ψ)²]) − (1 − tr[ρ(ψ)²]) = ∑i,k(|ρik(ψ)|² − |ρ̂ik(ψ)|²). Now (ui, uk) is a qudit of F iff the corresponding off-diagonal terms are the ones zeroed by the Lüders mixture operation ∑r∈f(U) Prρ(ψ)Pr in obtaining ρ̂(ψ) from ρ(ψ).

Since h(F : ψ) = h(π(F : ψ)) we can carry over the probability interpretation in the classical case h(π(F : ψ)) to the quantum case.

Interpretation of quantum logical entropy

The quantum logical entropy h(F : ψ) is the probability, in two independent F-measurements of a prepared pure state ψ, that different eigenvalues will be obtained, just as the logical entropy h(f⁻¹) is the probability, in two independent draws from U, that different f-values will be obtained.

Example (continued)

It might be helpful to carry out a quantum version of the numerical example. In V = ℂ³, let |ψ⟩ = α|a⟩ + β|b⟩ + γ|c⟩ be a normalized state vector so that $p_a = \alpha\alpha^* = |\alpha|^2 = \frac{1}{2}$, $p_b = \beta\beta^* = |\beta|^2 = \frac{1}{3}$, and $p_c = \gamma\gamma^* = |\gamma|^2 = \frac{1}{6}$ are the point probabilities on the ON basis U = {a, b, c} as in our running numerical example. Let F : V → V be a Hermitian operator with the eigenvalue function f : U → ℝ so that f⁻¹(r1) = {a, b} and f⁻¹(r2) = {c}. In the quantum case, the complex density matrix ρℂ(ψ) is: $$\rho_{\mathbb{C}}(\psi) = \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix}\begin{bmatrix} \alpha^* & \beta^* & \gamma^* \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & \alpha\beta^* & \alpha\gamma^* \\ \beta\alpha^* & \frac{1}{3} & \beta\gamma^* \\ \gamma\alpha^* & \gamma\beta^* & \frac{1}{6} \end{bmatrix}.$$ The corresponding real density matrix for those probabilities is: $$\rho(\psi) = \rho(\mathbf{0}_U) = \begin{bmatrix} 1/2 & \sqrt{1/6} & \sqrt{1/12} \\ \sqrt{1/6} & 1/3 & \sqrt{1/18} \\ \sqrt{1/12} & \sqrt{1/18} & 1/6 \end{bmatrix}.$$ The set partition on the ON basis set U is π = f⁻¹ = {{a, b}, {c}}, so the ditset is dit(π) = {(a, c), (b, c), ...} (where the ellipsis ... stands for the reversed cases of the previously listed ordered pairs). Hence the qudits in V ⊗ V are qudit(F) = {a ⊗ c, b ⊗ c, ...}. The quantum logical entropy resulting from the (always projective) F-measurement of ψ is h(F : ψ) = tr[P[qudit(F)] ρℂ(ψ) ⊗ ρℂ(ψ)] where ρℂ(ψ) ⊗ ρℂ(ψ) is a 9 × 9 complex matrix which could be written in shorthand (since each cell is a 3 × 3 matrix) as: $$\rho_{\mathbb{C}}(\psi) \otimes \rho_{\mathbb{C}}(\psi) = \begin{bmatrix} \frac{1}{2}\rho_{\mathbb{C}}(\psi) & \alpha\beta^*\rho_{\mathbb{C}}(\psi) & \alpha\gamma^*\rho_{\mathbb{C}}(\psi) \\ \beta\alpha^*\rho_{\mathbb{C}}(\psi) & \frac{1}{3}\rho_{\mathbb{C}}(\psi) & \beta\gamma^*\rho_{\mathbb{C}}(\psi) \\ \gamma\alpha^*\rho_{\mathbb{C}}(\psi) & \gamma\beta^*\rho_{\mathbb{C}}(\psi) & \frac{1}{6}\rho_{\mathbb{C}}(\psi) \end{bmatrix}.$$

The diagonal elements of ρℂ(ψ) ⊗ ρℂ(ψ) are real products of probabilities. The projection operator P[qudit(F)] to the subspace generated by qudit(F) in V ⊗ V is a 9 × 9 diagonal matrix whose non-zero diagonal entries are ones at the places corresponding to qudit(F) = {a ⊗ c, b ⊗ c, ...}. Thus the product P[qudit(F)] ρℂ(ψ) ⊗ ρℂ(ψ) just picks out (along its diagonal) those pairs of probabilities corresponding to dit(f⁻¹), namely {papc, pbpc, ...}, and taking the trace sums them up to yield h(F : ψ) = h(π(F : ψ)). It sums the four entries corresponding to the qudits, $\frac{1}{2}\cdot\frac{1}{6}$ for a ⊗ c, $\frac{1}{3}\cdot\frac{1}{6}$ for b ⊗ c, and the two reverse products, for a total of: $$2\left(\frac{1}{12} + \frac{1}{18}\right) = 2\left(\frac{3}{36} + \frac{2}{36}\right) = \frac{10}{36} = \frac{5}{18}. \checkmark$$
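The tensor-product computation can also be checked with numpy; real amplitudes are assumed for the phases, which is an illustrative choice since the diagonal of ρℂ(ψ) ⊗ ρℂ(ψ) only involves the |αi|².

```python
import numpy as np

# The quantum version of the running example (real phases assumed)
alpha = np.sqrt(np.array([1/2, 1/3, 1/6])).astype(complex)
f = np.array([1.0, 1.0, 2.0])  # f^{-1} = {{a, b}, {c}}

rho_C = np.outer(alpha, alpha.conj())  # 3 x 3 pure state rho_C(psi)
rho_CC = np.kron(rho_C, rho_C)         # rho_C(psi) ⊗ rho_C(psi), 9 x 9

# P_[qudit(F)]: diagonal 1s at basis elements u_i ⊗ u_k with f(u_i) != f(u_k)
P = np.diag(np.array([1.0 if f[i] != f[k] else 0.0
                      for i in range(3) for k in range(3)]))

print(np.trace(P @ rho_CC).real, 5/18)  # both 0.2777... = h(F : psi)
```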

Quantum logical entropy also has natural connections with other quantum notions such as the Hilbert-Schmidt distance tr[(ρ − τ)²] [20] between two density matrices ρ and τ. And, as usual, the quantum case is developed as the quantum version of the "classical" logical entropy. We previously defined the logical Hamming distance between two partitions: $$d(\pi, \sigma) := h(\pi|\sigma) + h(\sigma|\pi) = h(\pi \vee \sigma) - m(\pi, \sigma) = 2h(\pi \vee \sigma) - h(\pi) - h(\sigma).$$

Nielsen and Chuang are skeptical about developing the Hamming distance in the quantum context.

Unfortunately, the Hamming distance between two objects is simply a matter of labeling, and a priori there aren’t any labels in the Hilbert space arena of quantum mechanics! [24, p. 399]

We have seen how to define a density matrix ρ(π) from a set partition on U, and this provides a bridge to quantum logical entropy for any density matrix ρ: $$h(\pi) = 1 - \mathrm{tr}[\rho(\pi)^2] \quad\text{extends to}\quad h(\rho) = 1 - \mathrm{tr}[\rho^2].$$ The formula h(ρ) = 1 − tr[ρ²] is not new; only the whole development of logical entropy from partition logic is new (and hence the new name). Indeed, tr[ρ²] is usually called the purity of the density matrix, since a state ρ is pure if and only if tr[ρ²] = 1, so h(ρ) = 0; otherwise, tr[ρ²] < 1, so h(ρ) > 0. The complement 1 − tr[ρ²] has been called the "mixedness" [25, p. 5] of the state ρ. The seminal paper of Manfredi and Feix [3] arrives at the same formula 1 − tr[ρ²] (which they denote as S2) from the advanced viewpoint of Wigner functions, and they present strong arguments for this notion of quantum entropy (which resulted in Manfredi editing a special issue of the journal 4Open on logical entropy [26]). This notion of quantum logical entropy is also called by the misnomer "linear entropy" [27], even though it is quadratic in ρ, so we will not continue that usage.

Using these density matrices, there is also the notion of the logical cross-entropy of π and σ: $$h(\pi\|\sigma) = 1 - \mathrm{tr}[\rho(\pi)\rho(\sigma)]$$ where h(π‖σ) = h(π, σ) = h(π ∨ σ). Although there is no join defined for density matrices ρ and τ, we can nevertheless define the quantum logical cross-entropy of ρ and τ as (where the dagger is the conjugate transpose): $$h(\rho\|\tau) := 1 - \mathrm{tr}[\rho^{\dagger}\tau].$$ This provides the path to define the quantum logical Hamming distance between ρ and τ: $$d(\rho, \tau) := 2h(\rho\|\tau) - h(\rho) - h(\tau).$$

Since ρ and τ are also Hermitian matrices, each has an ON basis of eigenvectors and this approach to Hamming distance avoids the Nielsen-Chuang misgivings by using the amplitudes of all possible relations between the two ON bases [4, pp. 89–90] so no arbitrary labeling of the bases is involved. Then we have the theorem connecting quantum logical Hamming distance to an important existing notion in quantum information theory.

Proposition 6

(Hamming = Hilbert-Schmidt distance). tr[(ρ − τ)²] = d(ρ, τ).

Proof

tr[(ρτ)2] = tr[ρ2 + τ2 − 2ρτ] = tr[ρ2] + tr[τ2] − 2 tr[ρτ] = 2h(ρ||τ) − h(ρ) − h(τ) = d(ρ, τ).

8. Concluding Remarks

In that manner, the computation of the quantum logical entropies can be reduced to the computations in the corresponding "classical" case of logical entropies. Moreover, the definitions in Table 7 are made so that all the usual compound notions of quantum logical entropy satisfy the usual Venn diagram relationships, as illustrated in Figure 4.

Figure 4.

Venn diagram relationships for quantum logical entropy.

The overall purpose of this paper has been to develop quantum logical entropy starting from the logic of partitions at the set level, developing the quantitative version of partitions as logical entropy, and then developing the corresponding quantum notion by using the yoga of linearization to translate the set concepts into the corresponding (Hilbert) vector space concepts.

There are a number of other results in the literature ([4,20,28]) about quantum logical entropy, such as its concavity, subadditivity, non-decreasing value under projective measurement, a Holevo-type bound for the quantum logical Hamming distance, and the extension of quantum logical entropy to post-selected quantum systems. More results are sure to come as more researchers become familiar with the logical entropy concepts starting with the quantitative treatment of partitions in terms of distinctions.

We find this framework of partitions and distinction most suitable (at least conceptually) for describing the problems of quantum state discrimination, quantum cryptography and in general, for discussing quantum channel capacity. In these problems, we are basically interested in a distance measure between such sets of states, and this is exactly the kind of knowledge provided by logical entropy [Reference to [1]]. [2, p. 1]

In conventional information theory or in what Claude Shannon called the “Mathematical Theory of Communication” [14], he noted that “no concept of information itself was defined” [29, p. 458]. The extension of Shannon entropy to the quantum notion of von Neumann entropy did not solve that problem of defining quantum information or the problem of interpretation. Logical entropy as the quantification of partitions defines the notion of information-as-distinctions, and quantum logical entropy extends that notion to the quantum realm as the quantification of quantum distinctions or qudits. This answers the vision of Charles Bennett, one of the founders of quantum information theory.

So information really is a very useful abstraction. It is the notion of distinguishability abstracted away from what we are distinguishing, or from the carrier of information. ...

And we ought to develop a theory of information which generalizes the theory of distinguishability to include these quantum properties. ... [30, pp. 155–157]

Whenever possible, we ignore the ket notation |ui〉 and just write ui.

DOI: https://doi.org/10.2478/qic-2025-0005 | Journal eISSN: 3106-0544 | Journal ISSN: 1533-7146
Language: English
Page range: 82 - 96
Submitted on: Dec 20, 2024
Accepted on: Feb 11, 2025
Published on: Feb 25, 2025
Published by: Cerebration Science Publishing Co., Limited

© 2025 David Ellerman, published by Cerebration Science Publishing Co., Limited
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.