A New Logical Measure for Quantum Information
By: David Ellerman  
Open Access | Feb 2025


1. Introduction

The purpose of this paper is to derive the relatively new notion of quantum logical entropy ([1,2,3,4]) from the relatively new logic of partitions ([5,6,7]) that is category-theoretically dual to the usual Boolean logic of subsets. The "classical" notion of logical entropy is derived as the quantitative version of the distinctions of partitions, just as probability is derived as the quantitative version of the elements of subsets [8]. The notion of logical entropy is compared and contrasted with the usual Shannon entropy. Then the notion of logical entropy is linearized using a semi-algorithmic procedure to translate set-based concepts into the corresponding vector space concepts. The concept of logical entropy linearized to Hilbert space gives the concept of quantum logical entropy. The linearization procedure also allows results proven at the set level with logical entropy to be extended in a straightforward manner to quantum logical entropy. For instance, logical entropy is a probability measure with a simple probability interpretation (i.e., the two-draw probability of getting a distinction of a partition), and that interpretation extends to quantum logical entropy. That is, given an observable and a quantum state vector, their quantum logical entropy is the probability, in two independent projective measurements of the observable on the prepared state, that different eigenvalues are obtained. Since logical entropy is a measure, the compound notions of difference (or conditional) and mutual logical entropy are immediately defined, and their relationships are illustrated in the usual Venn diagrams for measures. The derivation of quantum logical entropy allows the definitions of the difference and mutual quantum logical entropies, which satisfy the corresponding relationships. This method of deriving the concepts of quantum logical entropy makes it a logic-based and natural measure of information for quantum mechanics (QM). Our purpose is to show the naturality and fundamentality of this notion of quantum information, so the contrasts and comparisons with other notions such as the von Neumann entropy are left to the reader.

2. Subset Logic and Partition Logic

Subsets and partitions are category-theoretic dual concepts. That is, in the turn-around-the-arrows duality of category theory, a subset is also called a "part" and "The dual notion (obtained by reversing the arrows) of 'part' is the notion of partition" [9, p. 85]. A partition π = {B1, ..., Bm} on a universe set U = {u1, ..., un} (|U| ≥ 2) is a set of nonempty subsets Bj called "blocks" such that the blocks are disjoint and their union is U. The partitions on U form a lattice Π(U). The partial order (PO) for the lattice is refinement, where a partition σ = {C1, ..., Cm′} is refined by π, written σ ≾ π, if for every block Bj ∈ π, there is a block Cj′ ∈ σ such that Bj ⊆ Cj′.

At a more atomic or granular level, the elements of a subset are dual to the distinctions (dits) of a partition, which are ordered pairs of elements in different blocks of the partition. The set of distinctions or ditset of a partition π is dit(π) ⊆ U × U, and the complementary set of indistinctions, $$\mathrm{indit}(\pi) = U \times U - \mathrm{dit}(\pi) = \cup_{j=1}^{m} B_j \times B_j,$$ is the equivalence relation on U associated with π, where the equivalence classes are the blocks of π. The refinement PO on partitions is the same as the inclusion PO on ditsets: σ ≾ π if and only if (iff) dit(σ) ⊆ dit(π).

The join σ ∨ π is the partition whose blocks are the nonempty intersections Bj ∩ Cj′, and it is the least upper bound of π and σ for the refinement partial order. The ditset of the join is the union of the ditsets: dit(σ ∨ π) = dit(σ) ∪ dit(π). Since the arbitrary intersection of equivalence relations is an equivalence relation, the meet σ ∧ π can be defined as the partition whose ditset is the complement of the smallest equivalence relation containing indit(σ) ∪ indit(π), and it is the greatest lower bound of σ and π. The top of the lattice is the discrete partition 1U = {{ui}}ui∈U of singletons of the elements of U, and the bottom of the lattice is the indiscrete partition 0U = {U} whose only block is all of U.

The lattice of partitions was known in the 19th century (e.g., Dedekind and Schröder). However, throughout the 20th century "the only operations on the family of equivalence relations fully studied, understood and deployed are the binary join ∨ and meet ∧ operations" [10, p. 445]. To go from a lattice of partitions to a logic of partitions comparable to the usual Boolean logic of subsets, there needs to be at least an implication operation defined on partitions. That operation would be the parallel to the subset implication or conditional S ⊃ T = Sᶜ ∪ T for S, T ⊆ U in the powerset Boolean algebra ℘(U) of subsets of U, where the partial order is inclusion, the join and meet are union and intersection respectively, and the top and bottom are U and ∅ respectively. The implication σ ⇒ π is the partition on U that is like π except that whenever a block Bj ∈ π is contained in some block of σ, then Bj is replaced by the singletons of its elements. If we denote a block B ∈ π as 1B when it has been "discretized" and as 0B when it remains whole, then the implication σ ⇒ π functions like a characteristic or indicator function for inclusion of π-blocks in σ-blocks. Thus when they are all included, i.e., when refinement holds, the implication is the discrete partition 1U: $$\sigma \Rightarrow \pi = \mathbf{1}_U \quad\text{iff}\quad \sigma \precsim \pi,$$ which is just the partition logic version of the subset logic relation: $$S \supset T = U \quad\text{iff}\quad S \subseteq T.$$

Thus, the usual Boolean logic of subsets (often presented in only the special case of “propositional logic”) has a dual logic of partitions. The elements (Its) of subsets and the distinctions (Dits) of partitions have dual corresponding roles as illustrated in Table 1.

Table 1.

Elements-distinctions duality between the two dual logics.

         | Logic ℘(U) of subsets of U | Logic of partitions Π(U) on U
Its/Dits | Elements of subsets        | Distinctions of partitions
P.O.     | Inclusion of subsets       | Inclusion of ditsets
Join     | Union of subsets           | Union of ditsets
Meet     | Subset of common elements  | Ditset of common dits
Impl.    | S ⊃ T = U iff S ⊆ T        | σ ⇒ π = 1U iff σ ≾ π
Top      | Subset U with all elements | Partition 1U with all distinctions
Bottom   | Subset ∅ with no elements  | Partition 0U with no distinctions

For the simplest non-trivial case, the two lattices are illustrated in Figure 1 for U = {a, b, c}.

Figure 1.

The lattices of the dual subsets and partitions.

3. The New Logical Measure of Information

Gian-Carlo Rota made the crucial connection between the dual notions of subsets and partitions: "The lattice of partitions plays for information the role that the Boolean algebra of subsets plays for size or probability" [11, p. 30]. $$\frac{\text{Subsets}}{\text{Probability}} \approx \frac{\text{Partitions}}{\text{Information}}.$$

In his Fubini Lectures, Rota said “Probability is a measure on the Boolean algebra of events” that gives quantitatively the “intuitive idea of the size of a set,” so we may ask by “analogy” for some measure to capture a property for a partition like “what size is to a set.” Rota goes on to ask:

How shall we be led to such a property? We have already an inkling of what it should be: it should be a measure of information provided by a random variable. Is there a candidate for the measure of the amount of information? [12, p. 67]

We have seen the duality between elements of a subset and dits of a partition, i.e., $$\frac{\text{Elements}}{\text{Subset}} \approx \frac{\text{Distinctions}}{\text{Partition}},$$ so the "size" of a partition may be taken as the number of distinctions.

The new logical foundations for information theory [4] start with sets, not probabilities, as suggested by Andrei Kolmogorov.

Information theory must precede probability theory, and not be based on it. By the very essence of this discipline, the foundations of information theory have a finite combinatorial character. [13, p. 39]

Since logical probability theory [8] starts as the normalized size of a subset, i.e., $\Pr(S) = \frac{|S|}{|U|}$, the notion of information-as-distinctions starts with the normalized size of a partition's ditset. This gives the logical entropy (with equiprobable points of U) as: $$h(\pi) = \frac{|\mathrm{dit}(\pi)|}{|U \times U|} = \frac{|U \times U - \mathrm{indit}(\pi)|}{|U \times U|} = 1 - \frac{\left|\cup_{j=1}^{m} B_j \times B_j\right|}{|U \times U|} = 1 - \sum\nolimits_{j=1}^{m} \left(\frac{|B_j|}{|U|}\right)^2 = 1 - \sum\nolimits_{j=1}^{m} \Pr(B_j)^2 = \sum\nolimits_{j \neq j'} \Pr(B_j)\Pr(B_{j'}).$$

Given any (always positive) probability measure p : U → [0, 1] on U = {u1, ..., un}, which defines pi = p(ui) for i = 1, ..., n, the product measure p × p : U × U → [0, 1] has, for any S ⊆ U × U, the value: $$p \times p(S) = \sum\nolimits_{(u_i, u_k) \in S} p_i p_k.$$

The logical entropy of π is thus the product measure of its ditset: $$h(\pi) = p \times p(\mathrm{dit}(\pi)) = \sum\nolimits_{(u_i, u_k) \in \mathrm{dit}(\pi)} p_i p_k = \sum\nolimits_{j \neq j'} \Pr(B_j)\Pr(B_{j'})$$ where Pr(Bj) = ∑ui∈Bj pi.
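To make the definitions concrete, here is a minimal Python sketch computing a logical entropy both ways: as the product measure of the ditset and from the block probabilities. The universe, point probabilities, and partition below are hypothetical illustration values, not from the paper.

```python
from itertools import product

# Hypothetical universe, point probabilities, and partition (illustration only)
U = ['a', 'b', 'c', 'd']
p = {'a': 0.4, 'b': 0.3, 'c': 0.2, 'd': 0.1}
pi = [{'a', 'b'}, {'c'}, {'d'}]

def block_of(u, partition):
    """Return the block of the partition containing u."""
    return next(B for B in partition if u in B)

# dit(pi): ordered pairs of elements in different blocks
dits = {(u, v) for u, v in product(U, U)
        if block_of(u, pi) != block_of(v, pi)}

# h(pi) = p x p(dit(pi)): product measure of the ditset
h_ditset = sum(p[u] * p[v] for u, v in dits)

# h(pi) = 1 - sum_j Pr(B_j)^2: block-probability form
h_blocks = 1 - sum(sum(p[u] for u in B) ** 2 for B in pi)

print(h_ditset, h_blocks)  # both 0.46 = two-draw probability of a distinction
```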

Interpretation of logical entropy

The logical entropy h(π) of a partition π is the probability that in two draws from U (with replacement), one gets a distinction of the partition π.

Similarly, Pr(S) is the probability that in one draw from U, one gets an element of the subset S ⊆ U. Thus the duality between subsets and partitions in their quantitative versions gives a duality between probability theory and information theory, illustrated in Table 2.

Table 2.

Duality of quantitative subsets and partitions.

               | Logical Probability Theory | Logical Information Theory
Outcomes       | Elements of S              | Distinctions of π
Events         | Subsets S ⊆ U              | Ditsets dit(π) ⊆ U × U
pi = 1/n       | Pr(S) = |S|/|U|            | h(π) = |dit(π)|/|U × U|
Probs. p       | Pr(S) = ∑ui∈S pi           | h(π) = ∑(ui,uk)∈dit(π) pipk
Interpretation | 1-draw prob. of S-element  | 2-draw prob. of π-distinction

Given partitions π = {B1, ..., Bm} and σ = {C1, ..., Cm′} on U, the ditset for their join is: $$\mathrm{dit}(\pi \vee \sigma) = \mathrm{dit}(\pi) \cup \mathrm{dit}(\sigma) \subseteq U \times U.$$

Given probabilities p = {p1, ..., pn}, the joint logical entropy is: $$h(\pi, \sigma) = h(\pi \vee \sigma) = p \times p(\mathrm{dit}(\pi) \cup \mathrm{dit}(\sigma)) = 1 - \sum\nolimits_{j, j'} p(B_j \cap C_{j'})^2.$$

The ditset for the difference (or conditional) logical entropy h(π|σ) is the difference of the ditsets, so h(π|σ) = p × p(dit(π) − dit(σ)). The ditset for the logical mutual information m(π, σ) is the intersection of the ditsets, so m(π, σ) = p × p(dit(π) ∩ dit(σ)). Venn diagrams apply to measures, and since logical entropy is a probability measure on U × U, Figure 2 illustrates the Venn diagram for the compound notions of logical entropy.

Figure 2.

Venn diagram for compound logical entropies.

As in any Venn diagram for values of a measure, certain relationships hold, such as: $$h(\pi \vee \sigma) = h(\pi, \sigma) = h(\pi) + h(\sigma) - m(\pi, \sigma) = h(\pi|\sigma) + m(\pi, \sigma) + h(\sigma|\pi).$$
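Since the compound logical entropies are just the product measure applied to unions, differences, and intersections of ditsets, the Venn diagram identities can be checked directly in a few lines; a self-contained sketch with two hypothetical partitions follows.

```python
from itertools import product

# Hypothetical universe, probabilities, and two partitions (illustration only)
U = ['a', 'b', 'c', 'd']
p = {'a': 0.4, 'b': 0.3, 'c': 0.2, 'd': 0.1}
pi = [{'a', 'b'}, {'c'}, {'d'}]
sigma = [{'a', 'c'}, {'b', 'd'}]

def ditset(partition):
    """Ordered pairs of elements of U in different blocks of the partition."""
    block = lambda u: next(B for B in partition if u in B)
    return {(u, v) for u, v in product(U, U) if block(u) != block(v)}

def pxp(S):
    """Product measure p x p of a subset S of U x U."""
    return sum(p[u] * p[v] for u, v in S)

d_pi, d_sg = ditset(pi), ditset(sigma)
h_join = pxp(d_pi | d_sg)   # h(pi v sigma) = h(pi, sigma)

# h(pi v sigma) = h(pi) + h(sigma) - m(pi, sigma)
assert abs(h_join - (pxp(d_pi) + pxp(d_sg) - pxp(d_pi & d_sg))) < 1e-12
# h(pi v sigma) = h(pi|sigma) + m(pi, sigma) + h(sigma|pi)
assert abs(h_join - (pxp(d_pi - d_sg) + pxp(d_pi & d_sg) + pxp(d_sg - d_pi))) < 1e-12
```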

4. Deriving the Shannon Entropies from the Logical Entropies

The simple and compound definitions for Shannon entropy, $H(\pi) = \sum\nolimits_{j=1}^{m} \Pr(B_j)\log\left(\frac{1}{\Pr(B_j)}\right)$, were defined so that the Venn diagram relationships hold ([14,15]), but they are not defined in terms of a measure (in the sense of measure theory) [16]. However, all the Shannon entropies can be derived from the definitions of logical entropy (which is a measure) by a uniform monotonic transformation that preserves the Venn diagram relationships. It is easiest to work with the entropies of a probability distribution p = (p1, ..., pn) on U where: $$h(p) = h(\mathbf{1}_U) = 1 - \sum\nolimits_{i=1}^{n} p_i^2 = \sum\nolimits_{i \neq k} p_i p_k = \sum\nolimits_{i=1}^{n} p_i(1 - p_i)$$ $$H(p) = H(\mathbf{1}_U) = \sum\nolimits_{i=1}^{n} p_i \log\left(\frac{1}{p_i}\right).$$

Intuitively, if pi = 1, then there is no information in the occurrence of ui, so information is measured by the 1-complement. But there are two 1-complements: the additive 1-complement 1 − pi and the multiplicative 1-complement $\frac{1}{p_i}$.

  • The additive probability average of the additive 1-complements is the logical entropy: $h(p) = \sum\nolimits_{i=1}^{n} p_i(1 - p_i)$.

  • The multiplicative probability average of the multiplicative 1-complements is the log-free or anti-log version of Shannon entropy: $\prod\nolimits_{i=1}^{n} \left(\frac{1}{p_i}\right)^{p_i} = \log^{-1}(H(p))$.

Then the particular log is chosen according to the application, e.g., log2 in coding theory and ln in statistical mechanics. Since taking the log of the log-free version of Shannon entropy turns the multiplicative average into an additive average, we can then see how to directly transform the logical formulas into the Shannon formulas by the dit-bit transform: $$1 - p_i \rightsquigarrow \log\left(\frac{1}{p_i}\right).$$
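As a quick numerical sketch of the two averages (with an arbitrary illustrative distribution), the following computes the logical entropy, the base-2 Shannon entropy, and the log-free anti-log version, confirming that taking the log of the latter recovers H(p).

```python
import math

# A hypothetical distribution (illustration only)
probs = [0.5, 0.25, 0.125, 0.125]

h = sum(pi * (1 - pi) for pi in probs)               # additive average of 1 - p_i
H = sum(pi * math.log2(1 / pi) for pi in probs)      # Shannon entropy, base 2
antilog = math.prod((1 / pi) ** pi for pi in probs)  # multiplicative average of 1/p_i

print(h)                      # 0.65625 = logical entropy h(p)
print(H, math.log2(antilog))  # both 1.75: the log of the anti-log form is H(p)
```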

When the compound logical entropy formulas are formulated in terms of the additive 1-complements, then the dit-bit transform gives the corresponding compound formula for the Shannon entropies. This is illustrated in Table 3 for the probability distribution p : U → [0, 1] and a joint distribution p : X × Y → [0, 1].

Table 3.

The dit-bit transform from logical entropy to Shannon entropy.

The Dit-Bit Transform: $1 - p_i \rightsquigarrow \log\left(\frac{1}{p_i}\right)$
h(p) = ∑i pi(1 − pi)
H(p) = ∑i pi log(1/pi)
h(X, Y) = ∑x,y p(x, y)[1 − p(x, y)]
H(X, Y) = ∑x,y p(x, y) log(1/p(x, y))
h(X|Y) = ∑x,y p(x, y)[(1 − p(x, y)) − (1 − p(y))]
H(X|Y) = ∑x,y p(x, y)[log(1/p(x, y)) − log(1/p(y))]
m(X, Y) = ∑x,y p(x, y)[[1 − p(x)] + [1 − p(y)] − [1 − p(x, y)]]
I(X, Y) = ∑x,y p(x, y)[log(1/p(x)) + log(1/p(y)) − log(1/p(x, y))]

The dit-bit transform preserves the same Venn diagram formulas for the Shannon entropies even though they are not a measure (in the sense of measure theory), so those relationships, illustrated in Figure 3, are normally termed a "mnemonic" [16, p. 112].

Figure 3.

Venn diagram “mnemonic” for compound Shannon entropies.

5. Logical Entropy via Density Matrices

All this will carry over to the quantum version of logical entropy by using density matrices. First, the “classical” treatment of logical entropy is restated using density matrices over the reals. Then that will extend immediately to the quantum case of density matrices over the complex numbers by making the appropriate changes such as replacing the square with the absolute square.

Let’s do the density matrix version of p × p(dit(π)). The density matrix associated with each block Bj ∈ π is the projection matrix ρ(Bj) = |bj⟩⟨bj| where |bj⟩ is the n × 1 column vector with entries $\sqrt{\frac{p_i}{\Pr(B_j)}}$ if ui ∈ Bj, else 0. Thus the entries are $\rho(B_j)_{i,k} = \frac{\sqrt{p_i p_k}}{\Pr(B_j)}$ if ui, uk ∈ Bj, else 0. The density matrix for the partition is $\rho(\pi) = \sum\nolimits_{j=1}^{m} \Pr(B_j)\rho(B_j)$ where $\rho(\pi)_{ik} = \sqrt{p_i p_k}$ if (ui, uk) ∈ indit(π), else 0. To recover the logical entropy h(π) = p × p(dit(π)) using density matrices, it can be calculated as 1 − tr[ρ(π)²]. A basic result about any density matrix ρ is tr[ρ²] = ∑i,k |ρik|² [17, p. 77], so $$\mathrm{tr}[\rho(\pi)^2] = \sum\nolimits_{(u_i, u_k) \in \mathrm{indit}(\pi)} p_i p_k = 1 - \sum\nolimits_{(u_i, u_k) \in \mathrm{dit}(\pi)} p_i p_k = 1 - p \times p(\mathrm{dit}(\pi))$$ and thus: $$h(\pi) = p \times p(\mathrm{dit}(\pi)) = 1 - \mathrm{tr}[\rho(\pi)^2].$$

Example

Consider U = {a, b, c} with p : U → [0, 1] where $p_a = \frac{1}{2}$, $p_b = \frac{1}{3}$, and $p_c = \frac{1}{6}$, and π = {B1, B2} = {{a, b}, {c}}. The usual calculation of the logical entropy is $h(\pi) = 1 - \left(\frac{5}{6}\right)^2 - \left(\frac{1}{6}\right)^2 = 1 - \frac{26}{36} = \frac{5}{18}$. Then the density matrix calculation is: $$\rho(B_1) = |b_1\rangle\langle b_1| = \begin{bmatrix} \sqrt{\frac{1/2}{5/6}} \\ \sqrt{\frac{1/3}{5/6}} \\ 0 \end{bmatrix}\begin{bmatrix} \sqrt{\frac{1/2}{5/6}} & \sqrt{\frac{1/3}{5/6}} & 0 \end{bmatrix} = \begin{bmatrix} \frac{1/2}{5/6} & \frac{\sqrt{1/6}}{5/6} & 0 \\ \frac{\sqrt{1/6}}{5/6} & \frac{1/3}{5/6} & 0 \\ 0 & 0 & 0 \end{bmatrix};$$ $$\rho(B_2) = |b_2\rangle\langle b_2| = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\begin{bmatrix} 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$ so that: $$\rho(\pi) = \sum\nolimits_{j=1}^{2} \Pr(B_j)\rho(B_j) = \begin{bmatrix} 1/2 & \sqrt{1/6} & 0 \\ \sqrt{1/6} & 1/3 & 0 \\ 0 & 0 & 1/6 \end{bmatrix} \quad\text{and}\quad \rho(\pi)^2 = \begin{bmatrix} \frac{5}{12} & \frac{5\sqrt{6}}{36} & 0 \\ \frac{5\sqrt{6}}{36} & \frac{5}{18} & 0 \\ 0 & 0 & \frac{1}{36} \end{bmatrix}.$$

Then $\mathrm{tr}[\rho(\pi)^2] = \frac{15}{36} + \frac{10}{36} + \frac{1}{36} = \frac{26}{36}$, so $1 - \mathrm{tr}[\rho(\pi)^2] = \frac{5}{18} = h(\pi)$. ✓
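The example can be checked with a few lines of numpy; this sketch rebuilds ρ(π) from the block vectors |bj⟩ and recovers h(π) = 5/18.

```python
import numpy as np

# The running example: p = (1/2, 1/3, 1/6) and pi = {{a, b}, {c}}
p = np.array([1/2, 1/3, 1/6])
blocks = [[0, 1], [2]]  # indices of a, b, c

rho = np.zeros((3, 3))
for B in blocks:
    PrB = p[B].sum()
    b = np.zeros(3)
    b[B] = np.sqrt(p[B] / PrB)   # the column vector |b_j>
    rho += PrB * np.outer(b, b)  # rho(pi) = sum_j Pr(B_j)|b_j><b_j|

print(1 - np.trace(rho @ rho), 5/18)  # both 0.2777... = h(pi)
```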

Borrowing the language of QM, ρ(Bj) as a projection matrix represents a pure state, i.e., ρ(Bj)² = ρ(Bj). Then ρ(π) represents a mixed state where the pure states ρ(Bj) occur with the probabilities Pr(Bj). The dictionary giving the reformulation of set partition concepts in terms of density matrices is given in Table 4.

Table 4.

Dictionary translating set partitions into density matrices.

Set concept with probabilities    | Set-level density matrix concept
Partition π with point probs. p   | Density matrix ρ(π) = ∑j Pr(Bj)|bj⟩⟨bj|
Point probabilities {p1, ..., pn} | Values of the diagonal entries of ρ(π)
Trivial indits (ui, ui) of π      | Diagonal entries of ρ(π)
Non-trivial indits of π           | Non-zero off-diagonal entries of ρ(π)
Dits of π                         | Zero entries of ρ(π)
Sum Pr(Bj) = ∑ui∈Bj pi            | Trace tr[PBj ρ(π)]
Block probabilities Pr(Bj) in π   | Non-zero eigenvalues of ρ(π)
Block prob. 1 of U in 0U = {U}    | Non-zero eigenvalue of 1 for ρ(0U)

We also need to give the set version of the "measurement" of a state ρ(π) by an "observable" given by a real-valued numerical attribute g : U → ℝ, which defines the inverse-image partition g⁻¹ = {g⁻¹(s)}s∈g(U). In QM, the transformation of a density matrix ρ(π) in the projective measurement by an observable is given by the Lüders mixture operation [18, p. 279]. For each block g⁻¹(s) in the observable partition g⁻¹, the diagonal projection matrix Ps has the diagonal entries χg⁻¹(s)(ui), i.e., (Ps)ii = 1 if g(ui) = s, else 0. Then the Lüders mixture operation gives the post-measurement density matrix ρ̂(π) as: $$\hat\rho(\pi) = \sum\nolimits_{s \in g(U)} P_s \rho(\pi) P_s.$$

Proposition 1

ρ̂(π) = ρ(π ∨ g⁻¹).

Proof

A nonzero entry in ρ(π) has the form $\rho(\pi)_{ik} = \sqrt{p_i p_k}$ iff there is some block Bj ∈ π such that (ui, uk) ∈ Bj × Bj, i.e., iff ui, uk ∈ Bj, and otherwise the entry is 0. The matrix operation Psρ(π) will preserve the entry $\sqrt{p_i p_k}$ if ui ∈ g⁻¹(s); otherwise the entry is zeroed. And if the entry was preserved, then the further matrix operation (Psρ(π))Ps will preserve the entry $\sqrt{p_i p_k}$ if uk ∈ g⁻¹(s); otherwise it is zeroed. Hence the entries $\sqrt{p_i p_k}$ in ρ(π) that are preserved in Psρ(π)Ps are the entries where both ui, uk ∈ Bj for some Bj ∈ π and ui, uk ∈ g⁻¹(s). These are the entries in ρ(π ∨ g⁻¹) corresponding to the blocks Bj ∩ g⁻¹(s) for some Bj ∈ π, so summing over the blocks g⁻¹(s) ∈ g⁻¹ gives the result: ρ̂(π) = ∑s∈g(U) Psρ(π)Ps = ρ(π ∨ g⁻¹).

Proposition 2

(Measuring measurement). In the "projective measurement" ρ(π) ⇝ ρ(π ∨ g⁻¹), the sum of the squares of the non-zero off-diagonal entries of ρ(π) that were zeroed in ρ̂(π) = ρ(π ∨ g⁻¹) is the difference in their logical entropies: h(π ∨ g⁻¹) − h(π) = h(π ∨ g⁻¹|π).

Proof

Since for any density matrix ρ, tr[ρ²] = ∑i,k |ρik|², $$h(\pi \vee g^{-1}|\pi) = h(\pi \vee g^{-1}) - h(\pi) = \left(1 - \mathrm{tr}[\rho(\pi \vee g^{-1})^2]\right) - \left(1 - \mathrm{tr}[\rho(\pi)^2]\right) = \sum\nolimits_{i,k} |\rho(\pi)_{ik}|^2 - \sum\nolimits_{i,k} |\rho(\pi \vee g^{-1})_{ik}|^2$$ since the action of the projection operators in the Lüders mixture operation is either to zero an entry or leave it the same.

Example (continued)

Let g(a) = 1, g(b) = g(c) = 0. Then g⁻¹ = {{a}, {b, c}}, so π ∨ g⁻¹ = {{a}, {b}, {c}} = 1U, and thus $h(\pi \vee g^{-1}) = h(\mathbf{1}_U) = 1 - \left(\frac{1}{2}\right)^2 - \left(\frac{1}{3}\right)^2 - \left(\frac{1}{6}\right)^2 = 1 - \frac{9}{36} - \frac{4}{36} - \frac{1}{36} = 1 - \frac{14}{36} = \frac{11}{18}$, so that $h(\pi \vee g^{-1}) - h(\pi) = \frac{11}{18} - \frac{5}{18} = \frac{1}{3}$. The density matrix for the discrete partition is: $$\rho(\pi \vee g^{-1}) = \rho(\mathbf{1}_U) = \begin{bmatrix} 1/2 & 0 & 0 \\ 0 & 1/3 & 0 \\ 0 & 0 & 1/6 \end{bmatrix} \quad\text{and}\quad \rho(\pi) = \begin{bmatrix} 1/2 & \sqrt{1/6} & 0 \\ \sqrt{1/6} & 1/3 & 0 \\ 0 & 0 & 1/6 \end{bmatrix}$$ so the sum of the squares of the zeroed elements is $\left(\sqrt{1/6}\right)^2 + \left(\sqrt{1/6}\right)^2 = \frac{1}{3}$.
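The Lüders mixture operation and the "measuring measurement" result can be verified directly in numpy; the following self-contained sketch rebuilds ρ(π) for the running example and checks that the sum of the squares of the decohered entries equals the increase in logical entropy.

```python
import numpy as np

# The running example: rho(pi) for p = (1/2, 1/3, 1/6), pi = {{a, b}, {c}}
p = np.array([1/2, 1/3, 1/6])

def rho_of(blocks):
    """rho(partition) = sum_j Pr(B_j)|b_j><b_j| for blocks given by indices."""
    r = np.zeros((3, 3))
    for B in blocks:
        PrB = p[B].sum()
        b = np.zeros(3)
        b[B] = np.sqrt(p[B] / PrB)
        r += PrB * np.outer(b, b)
    return r

rho = rho_of([[0, 1], [2]])

# Lueders mixture for g^{-1} = {{a}, {b, c}}: rho_hat = sum_s P_s rho P_s
rho_hat = np.zeros((3, 3))
for B in [[0], [1, 2]]:
    P = np.diag([1.0 if i in B else 0.0 for i in range(3)])
    rho_hat += P @ rho @ P

delta_h = (1 - np.trace(rho_hat @ rho_hat)) - (1 - np.trace(rho @ rho))
zeroed = np.sum(rho ** 2) - np.sum(rho_hat ** 2)  # squares of decohered entries
print(delta_h, zeroed)                            # both 1/3, as in the example
```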

The measuring measurement result deals with the non-zero off-diagonal terms in a density matrix.

[T]he off-diagonal terms of a density matrix... are often called quantum coherences because they are responsible for the interference effects typical of quantum mechanics that are absent in classical dynamics. [18, p. 177]

Since a projective measurement's effect on a density matrix in QM is given by the Lüders mixture operation, the effect of the measurement is the above-described "making distinctions" by decohering or zeroing certain coherence terms in the density matrix, and the sum of the absolute squares of the coherences that were decohered is the change in the logical entropy. This is a foretaste of the results for quantum logical entropy.

In coding theory, the Hamming distance between two 0,1-vectors of length n is the number of places where they differ. The partition version of this idea is the measure of where two partitions π and σ on U differ [19], which in terms of logical entropy is the logical Hamming distance between partitions (see Figure 2): $$d(\pi, \sigma) := h(\pi|\sigma) + h(\sigma|\pi) = h(\pi \vee \sigma) - m(\pi, \sigma) = 2h(\pi \vee \sigma) - h(\pi) - h(\sigma).$$

Intuitively, it is the logical information that is in each partition but not in the other, so it is a measure of how they differ, i.e., how “far apart” they are.

Lemma 1

h(π ∨ σ) = 1 − tr[ρ(π)ρ(σ)].

Proof

The k-th diagonal entry in ρ(π)ρ(σ) is the scalar product ∑i ρ(π)ki ρ(σ)ik with $\rho(\pi)_{ki} = \sqrt{p_k p_i}$ if (uk, ui) ∈ indit(π) and otherwise 0, and similarly for ρ(σ)ik. Hence the only non-zero terms in that sum are for (uk, ui) ∈ indit(π) ∩ indit(σ) = indit(π ∨ σ). Hence tr[ρ(π)ρ(σ)] = ∑(ui,uk)∈indit(π∨σ) pipk = 1 − ∑(ui,uk)∈dit(π∨σ) pipk, so h(π ∨ σ) = 1 − tr[ρ(π)ρ(σ)], and similarly for tr[ρ(σ)ρ(π)].

The quantity tr[(ρ(π) − ρ(σ))²] is usually termed the Hilbert-Schmidt distance between two density matrices ([20,21]) (sometimes with a 1/2 coefficient). It should be noted that the Hilbert-Schmidt distance is defined quite independently of logical entropy and yet it is equal to the logical distance.

Proposition 3

tr[(ρ(π) − ρ(σ))²] = h(π|σ) + h(σ|π) = d(π, σ).

Proof

tr[(ρ(π) − ρ(σ))²] = tr[ρ(π)²] − tr[ρ(π)ρ(σ)] − tr[ρ(σ)ρ(π)] + tr[ρ(σ)²], so: $$\mathrm{tr}[(\rho(\pi) - \rho(\sigma))^2] = 2\left[1 - \mathrm{tr}[\rho(\pi)\rho(\sigma)]\right] - \left(1 - \mathrm{tr}[\rho(\pi)^2]\right) - \left(1 - \mathrm{tr}[\rho(\sigma)^2]\right) = 2h(\pi \vee \sigma) - h(\pi) - h(\sigma) = h(\sigma|\pi) + h(\pi|\sigma) = d(\pi, \sigma).$$
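A numerical check of Lemma 1 and Proposition 3 on the running example: the helper below rebuilds the two density matrices and compares the Hilbert-Schmidt distance with the logical Hamming distance.

```python
import numpy as np

p = np.array([1/2, 1/3, 1/6])

def rho_of(blocks):
    """rho(partition) = sum_j Pr(B_j)|b_j><b_j| for blocks given by indices."""
    r = np.zeros((3, 3))
    for B in blocks:
        PrB = p[B].sum()
        b = np.zeros(3)
        b[B] = np.sqrt(p[B] / PrB)
        r += PrB * np.outer(b, b)
    return r

rho_pi = rho_of([[0, 1], [2]])  # pi    = {{a, b}, {c}}
rho_sg = rho_of([[0], [1, 2]])  # sigma = g^{-1} = {{a}, {b, c}}

hs = np.trace((rho_pi - rho_sg) @ (rho_pi - rho_sg))  # Hilbert-Schmidt distance
h_join = 1 - np.trace(rho_pi @ rho_sg)                # h(pi v sigma), by Lemma 1
d = 2 * h_join - (1 - np.trace(rho_pi @ rho_pi)) \
              - (1 - np.trace(rho_sg @ rho_sg))       # logical Hamming distance

print(hs, d)  # both 4/9
```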

Corollary 1

tr[(ρ̂(π) − ρ(π))²] = h(ρ̂(π)|ρ(π)).

Proof

Taking σ = g⁻¹ as the inverse-image partition of the numerical attribute, ρ̂(π) = ρ(π ∨ σ). Since π ≾ π ∨ σ, dit(π) ⊆ dit(π ∨ σ), so dit(π) − dit(π ∨ σ) = ∅ and thus h(π|π ∨ σ) = 0.

6. Linearization from Sets to Vector Spaces

Our goal is to systematically derive the quantum logical entropy starting from the logic of partitions. We have so far developed the notion of logical entropy at the set level and given a number of results. There is a semi-algorithmic procedure or “yoga” [22, p. 271] to transform set concepts into the corresponding vector space concepts. The yoga is, in general, part of the mathematical folklore but parts have been stated explicitly [23, pp. 355–361].

Yoga of Linearization

Apply a set concept to a basis set of a vector space, and whatever is linearly generated is the corresponding vector space concept.

This yoga or procedure shows how the logical entropy concepts developed so far can be transformed into the corresponding concepts and results in the Hilbert vector spaces of quantum mechanics. Indeed, the previous results formulated using density matrices extend, mutatis mutandis (e.g., using the absolute square instead of the ordinary square), to the corresponding results about quantum logical entropy.

Hence we need to develop the dictionary to translate set concepts into vector space concepts. A subset of a basis set generates a subspace, and the cardinality of the subset is the dimension of the subspace. Without assuming any probability distribution on U, a (real-valued) numerical attribute (e.g., weight, height, or age of persons) is a function f : U → ℝ. In any vector space V over a field containing the reals where U is now a basis set, the numerical attribute generates a linear operator F : V → V with real eigenvalues by the definition F ui = f(ui)ui. If we let f↾S = rS mean that the numerical attribute f restricted to S has the constant value r, then the vector space version is the eigenvector-eigenvalue equation Fυ = rυ. This means that the set-version of an eigenvector is a constant set of f, and the constant value is a set-version of an eigenvalue. The numerical attribute's inverse-image is a partition f⁻¹ = {f⁻¹(r)}r∈f(U), and each block f⁻¹(r) generates a subspace Vr which is the eigenspace of the induced F for the eigenvalue r. Thus the partition f⁻¹ generates a set of subspaces {Vr}r∈f(U) such that every vector υ can be uniquely expressed as a sum of vectors υr ∈ Vr, i.e., the partition f⁻¹ generates a direct-sum decomposition (DSD) {Vr}r∈f(U) of the vector space. When the numerical attribute is just a characteristic function χ : U → 2 = {0, 1} of some attribute on U, then the induced operator P1 : V → V is the projection operator to the subspace generated by χ⁻¹(1). Hence for a general numerical attribute f, we have projection operators Pr to the subspaces Vr generated by f⁻¹(r). The spectral decomposition of the induced operator is F = ∑r∈f(U) rPr, which, worked backwards, gives the spectral decomposition in the set case: f = ∑r∈f(U) rχf⁻¹(r), where χf⁻¹(r) is the characteristic function for the subset f⁻¹(r). And finally, the direct product U × U of a basis set U for V will (bi)linearly generate the tensor product V ⊗ V (where the ordered pair (ui, uk) is written ui ⊗ uk). These linearizations are summarized in Table 5, followed by a short sketch in code.

Table 5.

Linearization dictionary to translate set concepts into corresponding vector space concepts.

Set concept                           | Vector-space concept
Subset S ⊆ U                          | Subspace [S] ⊆ V
Partition {f⁻¹(r)}r∈f(U)              | DSD {Vr}r∈f(U)
Disjoint union U = ⊎r∈f(U) f⁻¹(r)     | Direct sum V = ⊕r∈f(U) Vr
Numerical attribute f : U → ℝ         | Observable F ui = f(ui)ui
f↾S = rS                              | F ui = rui
Constant set S of f                   | Eigenvector ui of F
Value r on constant set S             | Eigenvalue r of eigenvector ui
Characteristic fcn. χS : U → {0, 1}   | Projection operator P[S]ui = χS(ui)ui
∑r∈f(U) χf⁻¹(r) = χU                  | ∑r∈f(U) Pr = I : V → V
Spectral Decomp. f = ∑r∈f(U) rχf⁻¹(r) | Spectral Decomp. F = ∑r∈f(U) rPr
Set of r-constant sets ℘(f⁻¹(r))      | Eigenspace Vr of r-eigenvectors
Direct product U × U                  | Tensor product V ⊗ V
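As a small sketch of the middle rows of the dictionary: a numerical attribute f on a basis set induces a diagonal observable whose eigenspace projections sum to the identity and recover F as its spectral decomposition. The attribute values below are a hypothetical illustration.

```python
import numpy as np

# Hypothetical numerical attribute on the basis {a, b, c}: f(a)=1, f(b)=f(c)=0
f = np.array([1.0, 0.0, 0.0])
F = np.diag(f)  # F u_i = f(u_i) u_i in the U-basis

# Eigenspace projections P_r generated by the blocks f^{-1}(r)
projections = {r: np.diag((f == r).astype(float)) for r in np.unique(f)}

# Completeness: sum_r P_r = I, and spectral decomposition: F = sum_r r P_r
assert np.allclose(sum(projections.values()), np.eye(3))
assert np.allclose(F, sum(r * P for r, P in projections.items()))
```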
7. Generalization to Quantum Logical Entropy

We have developed the notion of logical entropy as the quantitative version of partitions. The mathematics used was at the level of sets, e.g., numerical attributes and probability distributions on a set U . We have also outlined the semi-algorithmic yoga of linearization to translate set concepts into the corresponding vector space concepts. Hence the new concept of quantum logical entropy can be developed in a straightforward manner by linearizing the definition of logical entropy to Hilbert space.

The logical notion of information-as-distinctions generalizes to quantum information theory. A qubit is a pair of states definitely distinguishable in the sense of being orthogonal. In general, a qudit needs to be relativized to an observable, just as a dit is a dit of a partition, such as the inverse-image partition f⁻¹ of a numerical attribute f : U → ℝ. Given such a numerical attribute f defined on an orthonormal (ON) basis for a (finite-dimensional) Hilbert space V, a Hermitian (or self-adjoint) operator F : V → V is defined by F ui = f(ui)ui. The definition can be reversed: given a Hermitian operator F on V, there is an ON basis of eigenvectors U, and a real-valued numerical attribute f, the eigenvalue function, is defined on U by taking each element ui to its eigenvalue.

A qudit of an observable F is a pair (ui, uk) in the eigenbasis definitely distinguishable by F, i.e., with distinct eigenvalues f(ui) ≠ f(uk). Let qudit(F) be the set of tensor product basis elements ui ⊗ uk for f(ui) ≠ f(uk). Since the quantum version of logical entropy is a straightforward generalization from sets to vector spaces, we give the generalization in Table 6. Numerical attributes f, g on U generate commuting observables F, G, and commuting observables F, G generate eigenvalue functions f, g on the ON basis U of simultaneous eigenvectors. We follow Kolmogorov's dictum by first giving the basic machinery without probabilities.

Table 6.

Yoga of Linearization without probabilities case.

Logical entropy                              | Quantum logical entropy
U = {u1, ..., un}                            | ON basis U for Hilbert space V
f, g : U → ℝ                                 | Commuting F, G : V → V
{r}r∈f(U), {s}s∈g(U)                         | Eigenvalues of F and G
π = {f⁻¹(r)}r∈f(U), σ = {g⁻¹(s)}s∈g(U)       | DSDs of eigenspaces of F, G
Dits of π: (ui, uk), f(ui) ≠ f(uk)           | Qudits of F: ui ⊗ uk, f(ui) ≠ f(uk)
Dits of σ: (ui, uk), g(ui) ≠ g(uk)           | Qudits of G: ui ⊗ uk, g(ui) ≠ g(uk)
Ditset of π: dit(π)                          | [qudit(F)]: subspace generated in V ⊗ V
Ditset of σ: dit(σ)                          | [qudit(G)]: subspace generated in V ⊗ V
Join: dit(π) ∪ dit(σ) ⊆ U × U                | [qudit(F) ∪ qudit(G)] ⊆ V ⊗ V
Difference: dit(π) − dit(σ) ⊆ U × U          | [qudit(F) − qudit(G)] ⊆ V ⊗ V
Mutual: dit(π) ∩ dit(σ) ⊆ U × U              | [qudit(F) ∩ qudit(G)] ⊆ V ⊗ V

In quantum mechanics, the probability information is carried by the state to be measured. Hence Table 6 deals with the set and quantum versions of observables, not quantum states. The next step is to apply linearization to the set and vector space versions of the quantum state, which carries the probability information. At the set level, the universal set U is equipped with a probability distribution p : U → [0, 1]. Table 7 gives the translation dictionary that yields the quantum logical entropy.

Table 7.

Logical entropy + Linearization = quantum logical entropy.

Logical entropy                            | Quantum logical entropy
ρ(0U) = ρ(U) = ρ(U)²                       | Pure state ρ(ψ) = ρ(ψ)²
p × p on U × U                             | ρ(ψ) ⊗ ρ(ψ) on V ⊗ V
h(0U) = 1 − tr[ρ(0U)²] = 0                 | h(ρ(ψ)) = 1 − tr[ρ(ψ)²] = 0
π = f⁻¹, h(π) = p × p(dit(π))              | h(F : ψ) = tr[P[qudit(F)] ρ(ψ) ⊗ ρ(ψ)]
h(π, σ) = p × p(dit(π) ∪ dit(σ))           | tr[P[qudit(F)∪qudit(G)] ρ(ψ) ⊗ ρ(ψ)]
h(π|σ) = p × p(dit(π) − dit(σ))            | tr[P[qudit(F)−qudit(G)] ρ(ψ) ⊗ ρ(ψ)]
m(π, σ) = p × p(dit(π) ∩ dit(σ))           | tr[P[qudit(F)∩qudit(G)] ρ(ψ) ⊗ ρ(ψ)]
h(π) = h(π|σ) + m(π, σ)                    | h(F : ψ) = h(F|G : ψ) + m(F, G : ψ)
ρ(π) = ρ̂(0U) = ∑r∈f(U) Prρ(0U)Pr           | ρ̂(ψ) = ∑r∈f(U) Prρ(ψ)Pr
h(π) = 1 − tr[ρ(π)²]                       | h(F : ψ) = 1 − tr[ρ̂(ψ)²]

For an observable F, let f : U → ℝ be the F-eigenvalue function assigning the real eigenvalue f(ui) to each ui in the ON basis U = {u1, ..., un} of F-eigenvectors. The image f(U) is the set of F-eigenvalues {r1, ..., rm}. Let Pr : V → V be the projection matrix in the U-basis to the eigenspace of r. The projective F-measurement of the state ψ transforms the pure state density matrix ρ(ψ) (represented in the ON basis U of F-eigenvectors) to yield the Lüders mixture density matrix ρ̂(ψ) = ∑r∈f(U) Prρ(ψ)Pr [18, p. 279]. The off-diagonal elements of ρ(ψ) that are zeroed in ρ̂(ψ) are the coherences (quantum indistinctions or quindits) that are turned into "decoherences" (quantum distinctions or qudits of the observable being measured).

For any observable F and a pure state ψ, the quantum logical entropy was defined as h(F : ψ) = tr[P[qudit(F)] ρ(ψ) ⊗ ρ(ψ)]. That definition is the quantum generalization of the "classical" logical entropy defined as h(π) = p × p(dit(π)). When a projective F-measurement is performed on ψ, the pure state density matrix ρ(ψ) is transformed into the mixed state density matrix ρ̂(ψ) by the quantum Lüders mixture operation, which then defines the quantum logical entropy h(ρ̂(ψ)) = 1 − tr[ρ̂(ψ)²].

The first result is to show that these two entropies are the same: h(F : ψ) = h(ρ̂(ψ)). The proof proceeds by showing that they are both equal to the classical logical entropy of the partition π(F : ψ) defined on the ON basis U = {u1, ..., un} of F-eigenvectors by the F-eigenvalues, with the point probabilities $p_i = \alpha_i^* \alpha_i$ where $|\psi\rangle = \sum\nolimits_{i=1}^{n} \alpha_i |u_i\rangle$. That is, the inverse images Bj = f⁻¹(rj) for j = 1, ..., m of the eigenvalue function f : U → ℝ define the eigenvalue partition π(F : ψ) = {B1, ..., Bm} on the ON basis U = {u1, ..., un} with the point probabilities $p_i = \alpha_i^*\alpha_i = |\alpha_i|^2$ provided by the state ψ for i = 1, ..., n. The classical logical entropy of that partition is $h(\pi(F : \psi)) = 1 - \sum\nolimits_{j=1}^{m} p(B_j)^2$ where p(Bj) = ∑ui∈Bj pi. Then: $$h(F : \psi) = \mathrm{tr}[P_{[\mathrm{qudit}(F)]}\rho(\psi) \otimes \rho(\psi)] = \sum\nolimits_{i,k=1}^{n} \{p_i p_k : f(u_i) \neq f(u_k)\} = \sum\nolimits_{j \neq j'} \sum \{p_i p_k : u_i \in B_j, u_k \in B_{j'}\} = \sum\nolimits_{j \neq j'} p(B_j)p(B_{j'}) = 1 - \sum\nolimits_{j=1}^{m} p(B_j)^2 = h(\pi(F : \psi)).$$

To show that h(ρ̂(ψ)) = 1 − tr[ρ̂(ψ)²] = h(π(F : ψ)) for ρ̂(ψ) = ∑r∈f(U) Prρ(ψ)Pr, we need to compute tr[ρ̂(ψ)²]. An off-diagonal element $\rho_{ik}(\psi) = \alpha_i \alpha_k^*$ of ρ(ψ) survives the Lüders operation (i.e., is not zeroed and keeps the same value) if and only if f(ui) = f(uk). Hence, the i-th diagonal element of ρ̂(ψ)² is: $$\sum\nolimits_{k=1}^{n} \{\alpha_i^*\alpha_k\alpha_i\alpha_k^* : f(u_i) = f(u_k)\} = \sum\nolimits_{k=1}^{n} \{p_i p_k : f(u_i) = f(u_k)\} = p_i\, p(B_j)$$ where ui ∈ Bj. Then, grouping the i-th diagonal elements for ui ∈ Bj gives ∑ui∈Bj pi p(Bj) = p(Bj)². Hence, the whole trace is $\mathrm{tr}[\hat\rho(\psi)^2] = \sum\nolimits_{j=1}^{m} p(B_j)^2$, and thus: $$h(\hat\rho(\psi)) = 1 - \mathrm{tr}[\hat\rho(\psi)^2] = 1 - \sum\nolimits_{j=1}^{m} p(B_j)^2 = h(F : \psi).$$

This finishes the proof of the following proposition.

Proposition 4

h(F : ψ) = h(π(F : ψ)) = h(ρ̂(ψ)).

This shows how the quantum case is so closely related to the set case that, in many instances, we can compute results in the quantum case by converting to the set case where computations are simpler.
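As a sketch of Proposition 4 for a random complex state: the quantum logical entropy computed from the Lüders mixture agrees with the classical logical entropy of the eigenvalue partition with point probabilities |αi|². The eigenvalue function f below is a hypothetical illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = rng.normal(size=4) + 1j * rng.normal(size=4)
alpha /= np.linalg.norm(alpha)      # normalized |psi> in the F-eigenbasis
f = np.array([1.0, 1.0, 2.0, 3.0])  # hypothetical eigenvalue function

rho = np.outer(alpha, alpha.conj())  # pure state rho(psi)

# Lueders mixture: rho_hat = sum_r P_r rho P_r
rho_hat = np.zeros_like(rho)
for r in np.unique(f):
    P = np.diag((f == r).astype(float))
    rho_hat += P @ rho @ P

h_quantum = 1 - np.trace(rho_hat @ rho_hat).real

# Classical logical entropy of the eigenvalue partition, p_i = |alpha_i|^2
p = np.abs(alpha) ** 2
h_classical = 1 - sum(p[f == r].sum() ** 2 for r in np.unique(f))

print(h_quantum, h_classical)  # equal, as Proposition 4 asserts
```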

Measurement creates distinctions, i.e., turns coherences into “decoherences,” which, classically, is the operation of distinguishing elements by classifying them according to some attribute like classifying the faces of a die by their parity. The fundamental theorem about quantum logical entropy and projective measurement, in the density matrix version, shows how the quantum logical entropy created (starting with h(ρ(ψ)) = 0 for the pure state ψ) by the measurement can be computed directly from the coherences of ρ(ψ) that are decohered in ρ̂(ψ).

Proposition 5

(Measuring measurement). The increase in quantum logical entropy, h(F : ψ) = h(ρ̂(ψ)), due to the F-measurement of the pure state ψ is the sum of the absolute squares of the non-zero off-diagonal terms (coherences) in ρ(ψ) (represented in an ON basis of F-eigenvectors) that are zeroed ("decohered") in the post-measurement Lüders mixture density matrix ρ̂(ψ) = ∑r∈f(U) Prρ(ψ)Pr.

Proof

h(ρ̂(ψ)) − h(ρ(ψ)) = (1 − tr[ρ̂(ψ)²]) − (1 − tr[ρ(ψ)²]) = ∑i,k(|ρik(ψ)|² − |ρ̂ik(ψ)|²). Now (ui, uk) is a qudit of F iff the corresponding off-diagonal terms are the ones zeroed by the Lüders mixture operation ∑r∈f(U) Prρ(ψ)Pr in obtaining ρ̂(ψ) from ρ(ψ).

Since h(F : ψ) = h(π(F : ψ)) we can carry over the probability interpretation in the classical case h(π(F : ψ)) to the quantum case.

Interpretation of quantum logical entropy

The quantum logical entropy h(F : ψ) is the probability, in two independent F-measurements of a prepared pure state ψ, that different eigenvalues will be obtained, just as the logical entropy h(f⁻¹) is the probability, in two independent draws from U, that different f-values will be obtained.

Example (continued)

It might be helpful to carry out a quantum version of the numerical example. In V = ℂ³, let |ψ⟩ = α|a⟩ + β|b⟩ + γ|c⟩ be a normalized state vector so that $p_a = \alpha\alpha^* = |\alpha|^2 = \frac{1}{2}$, $p_b = \beta\beta^* = |\beta|^2 = \frac{1}{3}$, and $p_c = \gamma\gamma^* = |\gamma|^2 = \frac{1}{6}$ are the point probabilities on the ON basis U = {a, b, c} as in our running numerical example. Let F : V → V be a Hermitian operator with the eigenvalue function f : U → ℝ so that f⁻¹(r1) = {a, b} and f⁻¹(r2) = {c}. In the quantum case, the complex density matrix ρℂ(ψ) is: $$\rho_{\mathbb{C}}(\psi) = \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix}\begin{bmatrix} \alpha^* & \beta^* & \gamma^* \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & \alpha\beta^* & \alpha\gamma^* \\ \beta\alpha^* & \frac{1}{3} & \beta\gamma^* \\ \gamma\alpha^* & \gamma\beta^* & \frac{1}{6} \end{bmatrix}.$$ The corresponding real density matrix for those probabilities is: $$\rho(\psi) = \rho(\mathbf{0}_U) = \begin{bmatrix} 1/2 & \sqrt{1/6} & \sqrt{1/12} \\ \sqrt{1/6} & 1/3 & \sqrt{1/18} \\ \sqrt{1/12} & \sqrt{1/18} & 1/6 \end{bmatrix}.$$ The set partition on the ON basis set U is π = f⁻¹ = {{a, b}, {c}}, so the ditset is dit(π) = {(a, c), (b, c), ...} (where the ellipsis ... stands for the reversed cases of the previously listed ordered pairs). Hence the qudits in V ⊗ V are qudit(F) = {a ⊗ c, b ⊗ c, ...}. The quantum logical entropy resulting from the (always projective) F-measurement of ψ is h(F : ψ) = tr[P[qudit(F)] ρℂ(ψ) ⊗ ρℂ(ψ)] where ρℂ(ψ) ⊗ ρℂ(ψ) is a 9 × 9 complex matrix which could be written in shorthand (since each cell is a 3 × 3 matrix) as: $$\rho_{\mathbb{C}}(\psi) \otimes \rho_{\mathbb{C}}(\psi) = \begin{bmatrix} \frac{1}{2}\rho_{\mathbb{C}}(\psi) & \alpha\beta^*\rho_{\mathbb{C}}(\psi) & \alpha\gamma^*\rho_{\mathbb{C}}(\psi) \\ \beta\alpha^*\rho_{\mathbb{C}}(\psi) & \frac{1}{3}\rho_{\mathbb{C}}(\psi) & \beta\gamma^*\rho_{\mathbb{C}}(\psi) \\ \gamma\alpha^*\rho_{\mathbb{C}}(\psi) & \gamma\beta^*\rho_{\mathbb{C}}(\psi) & \frac{1}{6}\rho_{\mathbb{C}}(\psi) \end{bmatrix}.$$

The diagonal elements of ρℂ(ψ) ⊗ ρℂ(ψ) are real products of probabilities. The projection operator P[qudit(F)] to the subspace generated by qudit(F) in V ⊗ V is a 9 × 9 diagonal matrix whose non-zero diagonal entries are ones at the places corresponding to qudit(F) = {a ⊗ c, b ⊗ c, ...}. Thus the product P[qudit(F)] ρℂ(ψ) ⊗ ρℂ(ψ) just picks out (along its diagonal) those pairs of probabilities corresponding to dit(f⁻¹), namely {papc, pbpc, ...}, and taking the trace sums them up to yield h(F : ψ) = h(π(F : ψ)). It sums the four entries corresponding to the qudits, $\frac{1}{2}\cdot\frac{1}{6}$ for a ⊗ c, $\frac{1}{3}\cdot\frac{1}{6}$ for b ⊗ c, and the two reverse products, for a total of: $$2\left(\frac{1}{12} + \frac{1}{18}\right) = 2\left(\frac{3}{36} + \frac{2}{36}\right) = \frac{10}{36} = \frac{5}{18}. \checkmark$$
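The tensor-product computation can also be checked with numpy; real amplitudes are assumed for the phases, which is an illustrative choice since the diagonal of ρℂ(ψ) ⊗ ρℂ(ψ) only involves the |αi|².

```python
import numpy as np

# The quantum version of the running example (real phases assumed)
alpha = np.sqrt(np.array([1/2, 1/3, 1/6])).astype(complex)
f = np.array([1.0, 1.0, 2.0])  # f^{-1} = {{a, b}, {c}}

rho_C = np.outer(alpha, alpha.conj())  # 3 x 3 pure state rho_C(psi)
rho_CC = np.kron(rho_C, rho_C)         # rho_C(psi) ⊗ rho_C(psi), 9 x 9

# P_[qudit(F)]: diagonal 1s at basis elements u_i ⊗ u_k with f(u_i) != f(u_k)
P = np.diag(np.array([1.0 if f[i] != f[k] else 0.0
                      for i in range(3) for k in range(3)]))

print(np.trace(P @ rho_CC).real, 5/18)  # both 0.2777... = h(F : psi)
```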

Quantum logical entropy also has natural connections with other quantum notions such as the Hilbert-Schmidt distance tr[(ρ − τ)²] [20] between two density matrices ρ and τ. And, as usual, the quantum case is developed as the quantum version of the "classical" logical entropy. We previously defined the logical Hamming distance between two partitions: $$d(\pi, \sigma) := h(\pi|\sigma) + h(\sigma|\pi) = h(\pi \vee \sigma) - m(\pi, \sigma) = 2h(\pi \vee \sigma) - h(\pi) - h(\sigma).$$

Nielsen and Chuang are skeptical about developing the Hamming distance in the quantum context.

Unfortunately, the Hamming distance between two objects is simply a matter of labeling, and a priori there aren’t any labels in the Hilbert space arena of quantum mechanics! [24, p. 399]

We have seen how to define a density matrix ρ(π) from a set partition on U, and this provides a bridge to quantum logical entropy for any density matrix ρ: $$h(\pi) = 1 - \mathrm{tr}[\rho(\pi)^2] \quad\text{extends to}\quad h(\rho) = 1 - \mathrm{tr}[\rho^2].$$ The formula h(ρ) = 1 − tr[ρ²] is not new; only the whole development of logical entropy from partition logic is new (and hence the new name). Indeed, tr[ρ²] is usually called the purity of the density matrix, since a state ρ is pure if and only if tr[ρ²] = 1, so h(ρ) = 0; otherwise, tr[ρ²] < 1, so h(ρ) > 0. The complement 1 − tr[ρ²] has been called the "mixedness" [25, p. 5] of the state ρ. The seminal paper of Manfredi and Feix [3] arrives at the same formula 1 − tr[ρ²] (which they denote as S2) from the advanced viewpoint of Wigner functions, and they present strong arguments for this notion of quantum entropy (which resulted in Manfredi editing a special issue of the journal 4Open on logical entropy [26]). This notion of quantum logical entropy is also called by the misnomer "linear entropy" [27], even though it is quadratic in ρ, so we will not continue that usage.

Using these density matrices, there is also the notion of the logical cross-entropy of π and σ: $$h(\pi\|\sigma) = 1 - \mathrm{tr}[\rho(\pi)\rho(\sigma)]$$ where h(π‖σ) = h(π, σ) = h(π ∨ σ). Although there is no join defined for density matrices ρ and τ, we can nevertheless define the quantum logical cross-entropy of ρ and τ as (where the dagger is the conjugate transpose): $$h(\rho\|\tau) := 1 - \mathrm{tr}[\rho^{\dagger}\tau].$$ This provides the path to define the quantum logical Hamming distance between ρ and τ: $$d(\rho, \tau) := 2h(\rho\|\tau) - h(\rho) - h(\tau).$$

Since ρ and τ are also Hermitian matrices, each has an ON basis of eigenvectors and this approach to Hamming distance avoids the Nielsen-Chuang misgivings by using the amplitudes of all possible relations between the two ON bases [4, pp. 89–90] so no arbitrary labeling of the bases is involved. Then we have the theorem connecting quantum logical Hamming distance to an important existing notion in quantum information theory.

Proposition 6

(Hamming = Hilbert-Schmidt distance). tr[(ρ − τ)²] = d(ρ, τ).

Proof

tr[(ρτ)2] = tr[ρ2 + τ2 − 2ρτ] = tr[ρ2] + tr[τ2] − 2 tr[ρτ] = 2h(ρ||τ) − h(ρ) − h(τ) = d(ρ, τ).

8. Concluding Remarks

In that manner, the computation of the quantum logical entropies can be reduced to the computations in the corresponding "classical" case of logical entropies. Moreover, the definitions in Table 7 are made so that all the usual compound notions of quantum logical entropy satisfy the usual Venn diagram relationships, as illustrated in Figure 4.

Figure 4.

Venn diagram relationships for quantum logical entropy.

The overall purpose of this paper has been to develop quantum logical entropy starting from the logic of partitions at the set level, developing the quantitative version of partitions as logical entropy, and then developing the corresponding quantum notion by using the yoga of linearization to translate the set concepts into the corresponding (Hilbert) vector space concepts.

There are a number of other results in the literature ([4,20,28]) about quantum logical entropy, such as its concavity, subadditivity, non-decreasing value under projective measurement, a Holevo-type bound for the quantum logical Hamming distance, and the extension of quantum logical entropy to post-selected quantum systems. More results are sure to come as more researchers become familiar with the logical entropy concepts starting with the quantitative treatment of partitions in terms of distinctions.

We find this framework of partitions and distinction most suitable (at least conceptually) for describing the problems of quantum state discrimination, quantum cryptography and in general, for discussing quantum channel capacity. In these problems, we are basically interested in a distance measure between such sets of states, and this is exactly the kind of knowledge provided by logical entropy [Reference to [1]]. [2, p. 1]

In conventional information theory or in what Claude Shannon called the “Mathematical Theory of Communication” [14], he noted that “no concept of information itself was defined” [29, p. 458]. The extension of Shannon entropy to the quantum notion of von Neumann entropy did not solve that problem of defining quantum information or the problem of interpretation. Logical entropy as the quantification of partitions defines the notion of information-as-distinctions, and quantum logical entropy extends that notion to the quantum realm as the quantification of quantum distinctions or qudits. This answers the vision of Charles Bennett, one of the founders of quantum information theory.

So information really is a very useful abstraction. It is the notion of distinguishability abstracted away from what we are distinguishing, or from the carrier of information. ...

And we ought to develop a theory of information which generalizes the theory of distinguishability to include these quantum properties. ... [30, pp. 155–157]

Whenever possible, we ignore the ket notation |ui〉 and just write ui.

DOI: https://doi.org/10.2478/qic-2025-0005 | Journal eISSN: 3106-0544 | Journal ISSN: 1533-7146
Language: English
Page range: 82 - 96
Submitted on: Dec 20, 2024
Accepted on: Feb 11, 2025
Published on: Feb 25, 2025
Published by: Cerebration Science Publishing Co., Limited

© 2025 David Ellerman, published by Cerebration Science Publishing Co., Limited
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.