The purpose of this paper is to derive the relatively new notion of quantum logical entropy ([1,2,3,4]) from the relatively new logic of partitions ([5,6,7]) that is category-theoretically dual to the usual Boolean logic of subsets. The “classical” notion of logical entropy arises as the quantitative version of the distinctions of partitions, just as probability arises as the quantitative version of the elements of subsets [8]. The notion of logical entropy is compared and contrasted with the usual Shannon entropy. Then the notion of logical entropy is linearized using a semi-algorithmic procedure that translates set-based concepts into the corresponding vector space concepts. The concept of logical entropy linearized to Hilbert space gives the concept of quantum logical entropy. The linearization procedure also allows results proven at the set level with logical entropy to be extended in a straightforward manner to quantum logical entropy. For instance, logical entropy is a probability measure with a simple probability interpretation (i.e., the two-draw probability of getting a distinction of a partition), and that interpretation extends to quantum logical entropy. That is, given an observable and a quantum state vector, their quantum logical entropy is the probability, in two independent projective measurements of the observable on the prepared state, that different eigenvalues are obtained. As a measure, the compound notions of difference (or conditional) and mutual logical entropy are immediately defined, and their relationships are illustrated in the usual Venn diagrams for measures. The derivation of quantum logical entropy allows the definitions of the difference and mutual quantum logical entropies, which satisfy the corresponding relationships. This method of deriving the concepts of quantum logical entropy makes it a logic-based and natural measure of information for quantum mechanics (QM). Our purpose is to show the naturality and fundamentality of this notion of quantum information, so the contrasts and comparisons with other notions, such as the von Neumann entropy, are left to the reader.
Subsets and partitions are category-theoretic dual concepts. That is, in the turn-around-the-arrows duality of category theory, a subset is also called a “part” and “The dual notion (obtained by reversing the arrows) of ‘part’ is the notion of partition” [9, p. 85]. A partition π = {B1, ..., Bm} on a universe set U = {u1, ..., un} (|U| ≥ 2) is a set of nonempty subsets Bj called “blocks” such that the blocks are pairwise disjoint and their union is U. The partitions on U form a lattice Π(U). The partial order (PO) for the lattice is refinement where a partition σ = {C1, ..., Cm′} is refined by π, written σ ≾ π, if for every block Bj ∈ π, there is a block Cj′ ∈ σ such that Bj ⊆ Cj′.
At a more atomic or granular level, the elements of a subset are dual to the distinctions (dits) of a partition, which are the ordered pairs of elements in different blocks of the partition. The set of distinctions or ditset of a partition π is dit(π) ⊆ U × U, and the complementary set is the inditset indit(π) = (U × U) − dit(π) of indistinctions or indits, i.e., the ordered pairs of elements in the same block of π (so indit(π) is the equivalence relation whose equivalence classes are the blocks of π).
The join σ ∨ π is the partition whose blocks are the nonempty intersections Bj ∩ Ck, and it is the least upper bound of π and σ for the refinement partial order. The ditset of the join is the union of the ditsets: dit(σ ∨ π) = dit(σ) ∪ dit(π). Since the arbitrary intersection of equivalence relations is an equivalence relation, the meet σ ∧ π can be defined as the partition whose ditset is the complement of the smallest equivalence relation containing indit(σ) ∪ indit(π), and it is the greatest lower bound of σ and π. The top of the lattice is the discrete partition 1U = {{ui}}ui∈U of singletons of the elements of U, and the bottom of the lattice is the indiscrete partition 0U = {U} whose only block is all of U.
The lattice of partitions was known in the 19th century (e.g., Dedekind and Schröder). However, throughout the 20th century “the only operations on the family of equivalence relations fully studied, understood and deployed are the binary join ∨ and meet ∧ operations” [10, p. 445]. To go from a lattice of partitions to a logic of partitions comparable to the usual Boolean logic of subsets, there needs to be at least an implication operation defined on partitions. That operation would be the parallel to the subset implication or conditional S ⊃ T = Sc ∪ T for S, T ⊆ U in the powerset Boolean algebra ℘(U) of subsets of U where the partial order is inclusion, the join and meet are union and intersection respectively, and the top and bottom are U and ∅ respectively. The implication σ ⇒ π is the partition on U that is like π except that whenever a block Bj ∈ π is contained in some block of σ, then Bj is replaced by the singletons of its elements. If we denote a block B ∈ π when it has been “discretized” as 1B and when it remains whole as 0B, then the implication σ ⇒ π functions like a characteristic or indicator function for inclusion of π-blocks in σ-blocks. Thus when they are all included, i.e., when refinement holds, then the implication is the discrete partition 1U: σ ⇒ π = 1U iff σ ≾ π.
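To make these operations concrete, here is a minimal Python sketch (the representation of partitions as sets of frozensets and all function names are our own illustrative choices, not from the original) computing ditsets, joins, refinement, and the implication operation:

```python
from itertools import product

# A partition is a set of frozensets (blocks); U is the universe set.
def ditset(partition, U):
    """dit(pi): ordered pairs of elements in different blocks."""
    block_of = {u: B for B in partition for u in B}
    return {(u, v) for u, v in product(U, U) if block_of[u] != block_of[v]}

def join(pi, sigma):
    """pi v sigma: the nonempty intersections of the blocks."""
    return {B & C for B in pi for C in sigma if B & C}

def refines(sigma, pi):
    """sigma is refined by pi (sigma <= pi): every pi-block lies in a sigma-block."""
    return all(any(B <= C for C in sigma) for B in pi)

def implication(sigma, pi):
    """sigma => pi: discretize each pi-block contained in some sigma-block."""
    result = set()
    for B in pi:
        if any(B <= C for C in sigma):
            result |= {frozenset({u}) for u in B}   # 1_B: replaced by singletons
        else:
            result.add(B)                           # 0_B: block stays whole
    return result

U = {'a', 'b', 'c'}
pi = {frozenset({'a', 'b'}), frozenset({'c'})}
sigma = {frozenset({'a'}), frozenset({'b', 'c'})}
one_U = {frozenset({u}) for u in U}
print(ditset(pi, U))                    # the 4 dits made by the block split {a,b}|{c}
print(join(pi, sigma) == one_U)         # True: the join is the discrete partition 1_U
# sigma => pi equals 1_U iff sigma is refined by pi; both sides are False here.
print(refines(sigma, pi), implication(sigma, pi) == one_U)
```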
Thus, the usual Boolean logic of subsets (often presented in only the special case of “propositional logic”) has a dual logic of partitions. The elements (Its) of subsets and the distinctions (Dits) of partitions have dual corresponding roles as illustrated in Table 1.
Table 1. Elements-distinctions duality between the two dual logics.
| | Logic ℘(U) of subsets of U | Logic of partitions Π(U) on U |
|---|---|---|
| Its/Dits | Elements of subsets | Distinctions of partitions |
| P.O. | Inclusion of subsets | Inclusion of ditsets |
| Join | Union of subsets | Union of ditsets |
| Meet | Subset of common elements | Ditset of common dits |
| Impl. | S ⊃ T = U iff S ⊆ T | σ ⇒ π = 1U iff σ ≾ π |
| Top | Subset U with all elements | Partition 1U with all distinctions |
| Bottom | Subset ∅ with no elements | Partition 0U with no distinctions |
For the simplest non-trivial case, the two lattices are illustrated in Figure 1 for U = {a, b, c}.

Figure 1. The lattices of the dual subsets and partitions.
Gian-Carlo Rota made the crucial connection between the dual notions of subsets and partitions. “The lattice of partitions plays for information the role that the Boolean algebra of subsets plays for size or probability” [11, p. 30].
In his Fubini Lectures, Rota said “Probability is a measure on the Boolean algebra of events” that gives quantitatively the “intuitive idea of the size of a set,” so we may ask by “analogy” for some measure to capture a property for a partition like “what size is to a set.” Rota goes on to ask:
How shall we be led to such a property? We have already an inkling of what it should be: it should be a measure of information provided by a random variable. Is there a candidate for the measure of the amount of information? [12, p. 67]
The new logical foundations for information theory [4] start with sets, not probabilities, as suggested by Andrei Kolmogorov.
Information theory must precede probability theory, and not be based on it. By the very essence of this discipline, the foundations of information theory have a finite combinatorial character. [13, p. 39]
Since logical probability theory [8] starts as the normalized size of a subset, i.e., Pr(S) = ∣S∣/∣U∣ in the equiprobable case, logical information theory starts as the normalized size of a ditset, i.e., h(π) = ∣dit(π)∣/∣U × U∣ in the equiprobable case.
Given any (always positive) probability measure p : U → [0, 1] on U = {u1, ..., un}, which defines pi = p(ui) for i = 1, ..., n, the product measure p × p : ℘(U × U) → [0, 1] has for any S ⊆ U × U the value: p × p(S) = ∑(ui,uk)∈S pipk.
The logical entropy of π is thus the product measure of its ditset: h(π) = p × p(dit(π)) = ∑(ui,uk)∈dit(π) pipk.
Interpretation of logical entropy
The logical entropy h(π) of a partition π is the probability that in two draws from U (with replacement), one gets a distinction of the partition π.
Similarly, Pr(S) is the probability that in one draw from U , one gets an element of the subset S ⊆ U . Thus the duality between subsets and partitions in their quantitative versions gives a duality between probability theory and information theory illustrated in Table 2.
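As a quick numerical sketch (the point probabilities and partition are our own illustrative choices), the two-draw interpretation can be checked by comparing h(π) computed from the ditset against a sampled two-draw frequency:

```python
import random
from itertools import product

U = ['a', 'b', 'c']
p = {'a': 0.5, 'b': 0.25, 'c': 0.25}     # illustrative point probabilities
pi = [{'a', 'b'}, {'c'}]                 # the partition {{a,b},{c}}
block_of = {u: i for i, B in enumerate(pi) for u in B}

# h(pi) = sum of p_i p_k over the dits (pairs in different blocks).
h = sum(p[u] * p[v] for u, v in product(U, U) if block_of[u] != block_of[v])

# Two independent draws (with replacement); count how often we get a distinction.
weights = [p[u] for u in U]
trials = 100_000
hits = sum(block_of[random.choices(U, weights)[0]]
           != block_of[random.choices(U, weights)[0]] for _ in range(trials))
print(h, hits / trials)                  # exact 0.375 vs. the sampled estimate
```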
Table 2. Duality of quantitative subsets and partitions.
| | Logical Probability Theory | Logical Information Theory |
|---|---|---|
| Outcomes | Elements of S | Distinctions of π |
| Events | Subsets S ⊆ U | Ditsets dit(π) ⊆ U × U |
| Equiprobable pts. | Pr(S) = ∣S∣/∣U∣ | h(π) = ∣dit(π)∣/∣U × U∣ |
| Probs. p | Pr(S) = ∑ui∈S pi | h(π) = ∑(ui,uk)∈dit(π) pipk |
| Interpretation | 1-draw prob. of S-element | 2-draw prob. of π-distinction |
Given partitions π = {B1, ..., Bm}, σ = {C1, ..., Cm'} on U, the ditset for their join is the union of their ditsets: dit(π ∨ σ) = dit(π) ∪ dit(σ) ⊆ U × U.
Given probabilities p = {p1, ..., pn}, the joint logical entropy is: h(π, σ) = h(π ∨ σ) = p × p(dit(π) ∪ dit(σ)).
The ditset for the difference (or conditional) logical entropy h(π|σ) is the difference of ditsets, and thus: h(π|σ) = p × p(dit(π) − dit(σ)). The ditset for the logical mutual information m(π, σ) is the intersection of ditsets, so: m(π, σ) = p × p(dit(π) ∩ dit(σ)). Since Venn diagrams apply to measures and logical entropy is a probability measure on U × U, Figure 2 illustrates the Venn diagram for the compound notions of logical entropy.

Figure 2. Venn diagram for compound logical entropies.
As in any Venn diagram for the values of a measure, certain relationships hold, such as: h(π) = h(π|σ) + m(π, σ) and h(π ∨ σ) = h(π|σ) + m(π, σ) + h(σ|π).
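Because the compound notions are just the product measure applied to unions, differences, and intersections of ditsets, the Venn relationships can be verified mechanically. A sketch with an illustrative four-element example of our own choosing:

```python
from itertools import product

def dits(pi, U):
    block_of = {u: i for i, B in enumerate(pi) for u in B}
    return {(u, v) for u, v in product(U, U) if block_of[u] != block_of[v]}

def pxp(S, p):
    """Product measure p x p of a set S of ordered pairs."""
    return sum(p[u] * p[v] for u, v in S)

U = ['a', 'b', 'c', 'd']
p = {'a': 0.4, 'b': 0.3, 'c': 0.2, 'd': 0.1}      # illustrative probabilities
pi = [{'a', 'b'}, {'c', 'd'}]
sigma = [{'a', 'c'}, {'b', 'd'}]

dp, ds = dits(pi, U), dits(sigma, U)
h_pi, h_sig = pxp(dp, p), pxp(ds, p)
h_join = pxp(dp | ds, p)                           # h(pi v sigma)
h_cond = pxp(dp - ds, p)                           # h(pi|sigma)
m = pxp(dp & ds, p)                                # m(pi, sigma)
assert abs(h_pi - (h_cond + m)) < 1e-12            # h(pi) = h(pi|sigma) + m(pi,sigma)
assert abs(h_join - (h_pi + h_sig - m)) < 1e-12    # inclusion-exclusion
print(h_pi, h_sig, h_join, h_cond, m)
```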
The simple and compound definitions for Shannon entropy
Intuitively, if pi = 1, then there is no information in the occurrence of ui, so information is measured by the 1-complement. But there are two 1-complements: the additive 1-complement 1 − pi and the multiplicative 1-complement 1/pi.
The additive probability average of the additive 1-complements is the logical entropy: h(p) = ∑i pi(1 − pi). The multiplicative probability average of the multiplicative 1-complements is the log-free or anti-log version of Shannon entropy: ∏i (1/pi)pi = log−1(H(p)).
Then the particular log is chosen according to the application, e.g., log2 in coding theory and ln in statistical mechanics. Since taking the log of the log-free version of Shannon entropy turns the multiplicative average into an additive average, we can then see how to directly transform the logical formulas into the Shannon formulas by the dit-bit transform: replace each additive 1-complement 1 − p by log(1/p).
When the compound logical entropy formulas are formulated in terms of the additive 1-complements, then the dit-bit transform gives the corresponding compound formula for the Shannon entropies. This is illustrated in Table 3 for the probability distribution p : U → [0, 1] and a joint distribution p : X × Y → [0, 1].
Table 3. The dit-bit transform from logical entropy to Shannon entropy.
| The Dit-Bit Transform: 1 − p ⇝ log(1/p) | |
|---|---|
| h(p) = | ∑i pi(1 − pi) |
| H(p) = | ∑i pi log(1/pi) |
| h(X, Y) = | ∑x,y p(x, y)[1 − p(x, y)] |
| H(X, Y) = | ∑x,y p(x, y) log(1/p(x, y)) |
| h(X∣Y) = | ∑x,y p(x, y)[(1 − p(x, y)) − (1 − p(y))] |
| H(X∣Y) = | ∑x,y p(x, y)[log(1/p(x, y)) − log(1/p(y))] |
| m(X, Y) = | ∑x,y p(x, y)[(1 − p(x)) + (1 − p(y)) − (1 − p(x, y))] |
| I(X, Y) = | ∑x,y p(x, y)[log(1/p(x)) + log(1/p(y)) − log(1/p(x, y))] |
The dit-bit transform preserves the same Venn diagram formulas for the Shannon entropies even though the Shannon entropies do not form a measure (in the sense of measure theory), so those relationships, illustrated in Figure 3, are normally termed a “mnemonic” [16, p. 112].

Figure 3. Venn diagram “mnemonic” for compound Shannon entropies.
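The dit-bit transform and the preserved Venn relationships can be checked numerically. A sketch with an illustrative 2 × 2 joint distribution of our own choosing:

```python
import numpy as np

# Illustrative joint distribution p(x, y) on a 2x2 outcome space.
p = np.array([[0.4, 0.1],
              [0.2, 0.3]])
px, py = p.sum(axis=1), p.sum(axis=0)              # marginals p(x), p(y)

# Logical entropies as probability averages of additive 1-complements.
h_joint = np.sum(p * (1 - p))                      # h(X,Y)
h_cond  = np.sum(p * ((1 - p) - (1 - py)))         # h(X|Y); py broadcasts over columns
m       = np.sum(p * ((1 - px[:, None]) + (1 - py) - (1 - p)))  # m(X,Y)

# Dit-bit transform 1 - p ~> log(1/p): the corresponding Shannon formulas.
H_joint = np.sum(p * np.log2(1 / p))               # H(X,Y)
H_cond  = np.sum(p * (np.log2(1 / p) - np.log2(1 / py)))        # H(X|Y)
I       = np.sum(p * (np.log2(1 / px[:, None]) + np.log2(1 / py) - np.log2(1 / p)))

# The same Venn relationship holds on both sides of the transform:
h_x = np.sum(px * (1 - px))                        # h(X)
H_x = np.sum(px * np.log2(1 / px))                 # H(X)
assert abs(h_x - (h_cond + m)) < 1e-12             # h(X) = h(X|Y) + m(X,Y)
assert abs(H_x - (H_cond + I)) < 1e-12             # H(X) = H(X|Y) + I(X,Y)
print(h_joint, h_cond, m, H_joint, H_cond, I)
```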
All this will carry over to the quantum version of logical entropy by using density matrices. First, the “classical” treatment of logical entropy is restated using density matrices over the reals. Then that will extend immediately to the quantum case of density matrices over the complex numbers by making the appropriate changes such as replacing the square with the absolute square.
Let’s do the density matrix version of p × p(dit(π)). The density matrix associated with each block Bj ∈ π is the projection matrix ρ(Bj) = ∣bj〉〈bj∣ where ∣bj〉 is the n × 1 column vector with entries (bj)i = √(pi/Pr(Bj)) if ui ∈ Bj and 0 otherwise, so the density matrix of the partition π is the probability mixture of the block density matrices: ρ(π) = ∑j Pr(Bj)ρ(Bj), where Pr(Bj) = ∑ui∈Bj pi.
Consider U = {a, b, c} with p : U → [0, 1] giving the point probabilities pa, pb, pc, and the partition π = {{a, b}, {c}}. Then ρ(π) is the 3 × 3 matrix with diagonal entries pa, pb, pc, with √(papb) in the (a, b) and (b, a) positions, and zeros elsewhere, so the only non-zero off-diagonal entries correspond to the non-trivial indits (a, b) and (b, a) of π.
Borrowing the language of QM, ρ(Bj) as a projection matrix represents a pure state, i.e., ρ(Bj)2 = ρ(Bj). Then ρ(π) represents a mixed state where the pure states ρ(Bj) occur with the probabilities Pr(Bj). The dictionary giving the reformulation of set partition concepts in terms of density matrices is given in Table 4.
Table 4. Dictionary translating set partitions into density matrices.
| Set concept with probabilities | Set level density matrix concept |
|---|---|
| Partition π with point probs. p | Density matrix ρ(π) = ∑j Pr(Bj)ρ(Bj) |
| Point probabilities {p1, ..., pn} | Values of diagonal entries of ρ(π) |
| Trivial indits (ui, ui) of π | Diagonal entries of ρ(π) |
| Non-trivial indits of π | Non-zero off-diagonal entries of ρ(π) |
| Dits of π | Zero entries of ρ(π) |
| Sum Pr(Bj) = ∑ui∈Bj pi | Trace tr[PBj ρ(π)] |
| Block probabilities Pr(Bj) in π | Eigenvalues ≠ 0 of ρ(π) |
| Block prob. 1 of U in 0U = {U} | Non-zero eigenvalue of 1 for ρ(0U) |
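A sketch of this dictionary in code (the probabilities are illustrative and the helper names are ours), building ρ(π) and confirming h(π) = 1 − tr[ρ(π)2]:

```python
import numpy as np

p = np.array([0.5, 0.25, 0.25])          # illustrative point probabilities for a, b, c
pi = [[0, 1], [2]]                       # the partition {{a,b},{c}} by index

def rho(pi, p):
    """rho(pi) = sum_j Pr(B_j) |b_j><b_j| with |b_j> entries sqrt(p_i / Pr(B_j))."""
    R = np.zeros((len(p), len(p)))
    for B in pi:
        pr_B = p[B].sum()                # block probability Pr(B_j)
        b = np.zeros(len(p))
        b[B] = np.sqrt(p[B] / pr_B)
        R += pr_B * np.outer(b, b)       # mixture of the pure block states
    return R

R = rho(pi, p)
print(R)                                 # off-diagonal sqrt(pa*pb) at the indit (a,b)
print(1 - np.trace(R @ R))               # 0.375 = h(pi) = p x p(dit(pi))
```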
We also need to give the set version of the “measurement” of a state ρ(π) by an “observable” given by a real-valued numerical attribute g : U → ℝ which defines the inverse-image partition g−1 = {g−1(s)}s∈g(U). In QM, the transformation of a density matrix ρ(π) in the projective measurement by an observable is given by the Lüders mixture operation [18, p. 279]. For each block g−1(s) in the observable partition g−1, the diagonal projection matrix Ps has the diagonal entries χg−1(s) (ui), i.e., (Ps)ii = 1 if g(ui) = s, else 0. Then the Lüders mixture operation gives the post-measurement density matrix ρ̂(π) as:
ρ̂(π) = ρ(π ∨ g−1).
A nonzero entry in ρ(π) has the form ρ(π)ik = √(pipk) where (ui, uk) ∈ indit(π), i.e., where ui and uk are in the same block of π.
(Measuring measurement). In the “projective measurement” ρ(π) ⇝ ρ( π ∨ g−1), the sum of the squares of the non-zero off-diagonal entries of ρ(π) that were zeroed in ρ̂(π) = ρ(π ∨ g−1) is the difference in their logical entropies h(π ∨ g−1) − h(π) = h(π ∨ g−1|π).
Since for any density matrix ρ, tr[ρ2] = ∑i,k |ρik|2, the difference h(π ∨ g−1) − h(π) = tr[ρ(π)2] − tr[ρ(π ∨ g−1)2] is precisely the sum of the squares of the off-diagonal entries √(pipk) that were zeroed by the measurement.
Let g(a) = 1, g(b) = g(c) = 0. Then g−1 = {{a}, {b, c}} so π ∨ g−1 = {{a}, {b}, {c}} = 1U, and thus h(1U) − h(π) = 2papb, the sum of the squares of the two zeroed off-diagonal entries √(papb).
The measuring measurement result deals with the non-zero off-diagonal terms in a density matrix.
[T]he off-diagonal terms of a density matrix... are often called quantum coherences because they are responsible for the interference effects typical of quantum mechanics that are absent in classical dynamics. [18, p. 177]
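Continuing the running example in code (with illustrative values for pa, pb, pc), the entropy created by the measurement equals the sum of the squares of the zeroed coherences:

```python
import numpy as np

pa, pb, pc = 0.5, 0.25, 0.25                       # illustrative values
sq = np.sqrt
rho_pi = np.array([[pa,          sq(pa * pb), 0.0],   # rho(pi) for pi = {{a,b},{c}}
                   [sq(pa * pb), pb,          0.0],
                   [0.0,         0.0,         pc ]])
P1 = np.diag([1.0, 0.0, 0.0])                      # projection for g-value 1: {a}
P0 = np.diag([0.0, 1.0, 1.0])                      # projection for g-value 0: {b,c}

rho_hat = P1 @ rho_pi @ P1 + P0 @ rho_pi @ P0      # Lüders mixture operation
gain = (1 - np.trace(rho_hat @ rho_hat)) - (1 - np.trace(rho_pi @ rho_pi))
zeroed = np.sum((rho_pi - rho_hat) ** 2)           # squares of the zeroed entries
print(gain, zeroed, 2 * pa * pb)                   # all equal 0.25
```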
In coding theory, the Hamming distance between two binary vectors of length n is the number of places where they differ. The partition version of this idea is a measure of how two partitions π and σ on U differ [19], which in terms of logical entropy is the logical Hamming distance between partitions (see Figure 2): d(π, σ) = h(π|σ) + h(σ|π).
Intuitively, it is the logical information that is in each partition but not in the other, so it is a measure of how they differ, i.e., how “far apart” they are.
h(π ∨ σ) = 1 − tr[ρ(π)ρ(σ)].
The kth diagonal entry in ρ(π)ρ(σ) is the scalar product ∑i ρ(π)kiρ(σ)ik, where ρ(π)kiρ(σ)ik = pkpi if (ui, uk) ∈ indit(π) ∩ indit(σ) and 0 otherwise. Hence tr[ρ(π)ρ(σ)] = p × p(indit(π) ∩ indit(σ)) = p × p(indit(π ∨ σ)) = 1 − h(π ∨ σ).
The quantity tr[(ρ(π) − ρ(σ))2] is usually termed the Hilbert-Schmidt distance between two density matrices ([20,21]) (sometimes with a 1/2 coefficient). It should be noted that the Hilbert-Schmidt distance is defined quite independently of the logical entropy and yet it is equal to the logical distance.
tr[(ρ(π) − ρ(σ))2] = h(π|σ) + h(σ|π) = d(π, σ).
tr[(ρ(π) − ρ(σ))2] = tr[ρ(π)2] − tr[ρ(π)ρ(σ)] − tr[ρ(σ)ρ(π)] + tr[ρ(σ)2], so: tr[(ρ(π) − ρ(σ))2] = (1 − h(π)) + (1 − h(σ)) − 2(1 − h(π ∨ σ)) = 2h(π ∨ σ) − h(π) − h(σ) = h(π|σ) + h(σ|π) = d(π, σ).
tr[(ρ̂(π) − ρ(π))2] = h(ρ̂(π)|ρ(π)).
Taking σ = g−1 as the inverse-image partition of the numerical attribute, ρ̂(π) = ρ(π ∨ σ). Since π ≾ π ∨ σ, dit(π) ⊆ dit(π ∨ σ), so dit(π) − dit(π ∨ σ) = ∅ and thus h(π|π ∨ σ) = 0. Hence the theorem gives tr[(ρ̂(π) − ρ(π))2] = h(π ∨ σ|π) + h(π|π ∨ σ) = h(π ∨ σ|π) = h(ρ̂(π)|ρ(π)).
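A sketch (with an illustrative four-point example of our own) checking that the Hilbert-Schmidt distance equals the logical Hamming distance:

```python
import numpy as np
from itertools import product

def rho(pi, p):
    """Density matrix of a partition (blocks given as index lists)."""
    R = np.zeros((len(p), len(p)))
    for B in pi:
        pr_B = p[B].sum()
        b = np.zeros(len(p))
        b[B] = np.sqrt(p[B] / pr_B)
        R += pr_B * np.outer(b, b)
    return R

def dits(pi, n):
    blk = {i: j for j, B in enumerate(pi) for i in B}
    return {(i, k) for i, k in product(range(n), range(n)) if blk[i] != blk[k]}

p = np.array([0.4, 0.3, 0.2, 0.1])                 # illustrative probabilities
pi, sigma = [[0, 1], [2, 3]], [[0, 2], [1, 3]]
dp, ds = dits(pi, 4), dits(sigma, 4)
pxp = lambda S: sum(p[i] * p[k] for i, k in S)
d_logical = pxp(dp - ds) + pxp(ds - dp)            # h(pi|sigma) + h(sigma|pi)
D = rho(pi, p) - rho(sigma, p)
print(d_logical, np.trace(D @ D))                  # equal: Hamming = Hilbert-Schmidt
```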
Our goal is to systematically derive the quantum logical entropy starting from the logic of partitions. We have so far developed the notion of logical entropy at the set level and given a number of results. There is a semi-algorithmic procedure or “yoga” [22, p. 271] to transform set concepts into the corresponding vector space concepts. The yoga is, in general, part of the mathematical folklore but parts have been stated explicitly [23, pp. 355–361].
Yoga of Linearization
Apply a set concept to a basis set of a vector space, and whatever is linearly generated is the corresponding vector space concept.
This yoga or procedure shows how the logical entropy concepts developed so far can be transformed into the corresponding concepts and results in the Hilbert vector spaces of quantum mechanics. Indeed, the previous results formulated using density matrices extend, mutatis mutandis (e.g., using the absolute square instead of the ordinary square), to the corresponding results about quantum logical entropy.
Hence we need to develop the dictionary to translate set concepts into vector space concepts. A subset of a basis set generates a subspace, and the cardinality of the subset is the dimension of the subspace. Without assuming any probability distribution on U, a (real-valued) numerical attribute (e.g., weight, height, or age of persons) is a function f : U → ℝ. In any vector space V over a field containing the reals where U is now a basis set, the numerical attribute generates a linear operator F : V → V with real eigenvalues by the definition Fui = f(ui)ui. If we let f ↾ S = rS mean that the numerical attribute f restricted to S has the constant value r, then the vector space version is the eigenvector-eigenvalue equation Fυ = rυ. This means that the set-version of an eigenvector is a constant set of f, and the constant value is the set-version of an eigenvalue. The numerical attribute’s inverse-image is a partition f−1 = {f−1(r)}r∈f(U) and each block f−1(r) generates a subspace Vr which is the eigenspace of the induced F for the eigenvalue r. Thus the partition f−1 generates a set of subspaces {Vr}r∈f(U) such that every vector υ can be uniquely expressed as a sum of vectors υr ∈ Vr, i.e., the partition f−1 generates a direct-sum decomposition (DSD) {Vr}r∈f(U) of the vector space. When the numerical attribute is just a characteristic function χ : U → 2 = {0, 1} of some attribute on U, then the induced operator P1 : V → V is the projection operator to the subspace generated by χ−1(1). Hence for a general numerical attribute f, we have projection operators Pr to the subspaces Vr generated by f−1(r). The spectral decomposition of the induced operator is F = ∑r∈f(U) rPr, which, read backwards, gives the spectral decomposition in the set case: f = ∑r∈f(U) rχf−1(r) where χf−1(r) is the characteristic function of the subset f−1(r). And finally, the direct product U × U of a basis set U for V will (bi)linearly generate the tensor product V ⊗ V (where the ordered pair (ui, uk) is written ui ⊗ uk). These linearizations are summarized in Table 5.
Table 5. Linearization dictionary to translate set concepts into corresponding vector space concepts.
| Set concept | Vector-space concept |
|---|---|
| Subset S ⊆ U | Subspace [S] ⊆ V |
| Partition {f−1(r)}r∈f(U) | DSD {Vr}r∈f(U) |
| Disjoint union U = ⊎r∈f(U) f−1(r) | Direct sum V = ⊕r∈f(U)Vr |
| Numerical attribute f : U → ℝ | Observable F ui = f(ui)ui |
| f ↾ S = rS | F ui = rui |
| Constant set S of f | Eigenvector ui of F |
| Value r on constant set S | Eigenvalue r of eigenvector ui |
| Characteristic fcn. χS : U → {0, 1} | Projection operator P[S]ui = χS (ui)ui |
| ∑r∈f(U) χf−1 (r) = χU | ∑r∈f(U) Pr = I : V → V |
| Spectral Decomp. f = ∑r∈f(U) rχf−1(r) | Spectral Decomp. F = ∑r∈f(U) rPr |
| Set of r-constant sets ℘(f−1(r)) | Eigenspace Vr of r-eigenvectors |
| Direct product U × U | Tensor product V ⊗ V |
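A small sketch of the yoga in action (the attribute values are our own illustration): a numerical attribute on a basis set induces a diagonal observable whose spectral decomposition mirrors the inverse-image partition:

```python
import numpy as np

# U = {u1, u2, u3} as the standard basis of R^3; an illustrative attribute
# f with f(u1) = f(u2) = 1 and f(u3) = 0.
f = np.array([1.0, 1.0, 0.0])
F = np.diag(f)                                     # observable: F u_i = f(u_i) u_i

# Projections P_r to the eigenspaces V_r generated by the blocks f^{-1}(r).
P = {r: np.diag((f == r).astype(float)) for r in np.unique(f)}
assert np.allclose(sum(r * Pr for r, Pr in P.items()), F)   # F = sum_r r P_r
assert np.allclose(sum(P.values()), np.eye(3))              # sum_r P_r = I
# The diagonal of P_r is the characteristic function of the block f^{-1}(r).
for r, Pr in P.items():
    print(r, Pr.diagonal())
```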
We have developed the notion of logical entropy as the quantitative version of partitions. The mathematics used was at the level of sets, e.g., numerical attributes and probability distributions on a set U . We have also outlined the semi-algorithmic yoga of linearization to translate set concepts into the corresponding vector space concepts. Hence the new concept of quantum logical entropy can be developed in a straightforward manner by linearizing the definition of logical entropy to Hilbert space.
The logical notion of information-as-distinctions generalizes to quantum information theory. A qubit is a pair of states definitely distinguishable in the sense of being orthogonal. In general, a qudit needs to be relativized to an observable—just as a dit is a dit of a partition such as the inverse-image partition f−1 of a numerical attribute f : U → ℝ. Given such a numerical attribute f defined on an orthonormal (ON) basis for a (finite-dimensional) Hilbert space V , a Hermitian (or self-adjoint) operator F : V → V is defined by F ui = f(ui)ui. The definition can be reversed. Given a Hermitian operator F on V , there is an ON basis of eigenvectors U and a real-valued numerical attribute f, the eigenvalue function, is defined on U by taking each element ui to its eigenvalue.
A qudit of an observable F is a pair (ui, uk) in the eigenbasis definitely distinguishable by F, i.e., with f(ui) ≠ f(uk), distinct eigenvalues. Let qudit(F) be the set of tensor product basis elements ui ⊗ uk with f(ui) ≠ f(uk). Since the quantum version of logical entropy is a straightforward generalization from sets to vector spaces, we give the generalization in Table 6. Numerical attributes f, g on U generate commuting observables F, G, and commuting observables F, G generate eigenvalue functions f, g on the ON basis U of simultaneous eigenvectors. We follow Kolmogorov’s dictum by first giving the basic machinery without probabilities.
Table 6. Yoga of linearization: the case without probabilities.
| Logical entropy | Quantum logical entropy |
|---|---|
| U = {u1, ..., un} | ON basis U for Hilbert space V |
| f, g :U → ℝ | Commuting F , G : V → V |
| {r}r∈f(U), {s}s∈g(U) | Eigenvalues of F and G |
| π = {f−1(r)}r∈f(U), σ = {g−1(s)}s∈g(U) | DSDs of eigenspaces of F, G |
| Dits of π : (ui, uk), f(ui) ≠ f(uk) | Qudits F : ui ⊗ uk, f(ui) ≠ f(uk) |
| Dits of σ : (ui, uk), g(ui) ≠ g(uk) | Qudits G: ui ⊗ uk, g(ui) ≠ g(uk) |
| Ditset of π: dit(π) | [qudit(F)]: Subspace generated in V ⊗ V |
| Ditset of σ: dit(σ) | [qudit(G)]: Subspace generated in V ⊗ V |
| Join: dit(π) ∪ dit(σ) ⊆ U × U | [qudit(F) ∪ qudit(G)] ⊆ V ⊗ V |
| Difference: dit(π) − dit(σ) ⊆ U × U | [qudit(F) − qudit(G)] ⊆ V ⊗ V |
| Mutual: dit(π) ∩ dit(σ) ⊆ U × U | [qudit(F) ∩ qudit(G)] ⊆ V ⊗ V |
In quantum mechanics, the probability information is carried by the state to be measured. Hence Table 6 deals with the set and vector-space versions of quantum observables, not quantum states. The next step is to apply linearization to the set and vector space versions of the quantum state, which carries the probability information. At the set level, the universal set U is equipped with a probability distribution p : U → [0, 1]. Table 7 gives the translation dictionary yielding the quantum logical entropy.
Table 7. Logical entropy + linearization = quantum logical entropy.
| Logical entropy | Quantum logical entropy |
|---|---|
| ρ(0U) = ρ(U) = ρ(U)2 | Pure state ρ(ψ) = ρ(ψ)2 |
| p × p on U × U | ρ(ψ) ⊗ ρ(ψ) on V ⊗ V |
| h(0U) = 1 − tr[ρ(0U)2] = 0 | h(ρ(ψ)) = 1 − tr[ρ(ψ)2] = 0 |
| π = f−1, h(π) = p × p(dit(π)) | h(F : ψ) = tr[P[qudit(F)]ρ(ψ) ⊗ ρ(ψ)] |
| h(π, σ) = p × p(dit(π) ∪ dit(σ)) | tr[P[qudit(F)∪qudit(G)]ρ(ψ) ⊗ ρ(ψ)] |
| h(π∣σ) = p × p(dit(π) − dit(σ)) | tr[P[qudit(F)−qudit(G)]ρ(ψ) ⊗ ρ(ψ)] |
| m(π, σ) = p × p(dit(π) ∩ dit(σ)) | tr[P[qudit(F)∩qudit(G)]ρ(ψ) ⊗ ρ(ψ)] |
| h(π) = h(π∣σ) + m(π, σ) | h(F : ψ) = h(F∣G : ψ) + m(F, G : ψ) |
| ρ(π) = ρ̂(0U) = ∑r∈f(U) Prρ(0U)Pr | ρ̂(ψ) = ∑r∈f(U) Prρ(ψ)Pr |
| h(π) = 1 − tr[ρ(π)2] | h(F : ψ) = 1 − tr[ρ̂(ψ)2] |
For an observable F, let f : U → ℝ be the F-eigenvalue function assigning the real eigenvalue f(ui) to each ui in the ON basis U = {u1, …, un} of F-eigenvectors. The image f(U) is the set of F-eigenvalues {r1, …, rm}. Let Pr : V → V be the projection matrix in the U-basis to the eigenspace of r. The projective F-measurement of the state ψ transforms the pure state density matrix ρ(ψ) (represented in the ON basis U of F-eigenvectors) to yield the Lüders mixture density matrix ρ̂(ψ) = ∑r∈f(U) Prρ(ψ)Pr [18, p. 279]. The off-diagonal elements of ρ(ψ) that are zeroed in ρ̂(ψ) are the coherences (quantum indistinctions or quindits) that are turned into “decoherences” (quantum distinctions or qudits of the observable being measured).
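A sketch with illustrative amplitudes (our own choice) of the Lüders mixture operation and the identity h(F : ψ) = h(ρ̂(ψ)) = h(π(F : ψ)):

```python
import numpy as np

# Pure state in the ON basis of F-eigenvectors of C^3; eigenvalue function
# f(u1) = f(u2) = 1, f(u3) = 0. Amplitudes are illustrative.
psi = np.array([1j / np.sqrt(2), 0.5, 0.5])
rho = np.outer(psi, psi.conj())                    # pure state rho(psi)
f = np.array([1.0, 1.0, 0.0])

# Lüders mixture rho_hat = sum_r P_r rho(psi) P_r zeroes the qudit coherences.
rho_hat = sum(np.diag((f == r).astype(float)) @ rho @ np.diag((f == r).astype(float))
              for r in np.unique(f))
h_quantum = 1 - np.trace(rho_hat @ rho_hat).real   # h(F:psi) = 1 - tr[rho_hat^2]

# Classical check: h(pi(F:psi)) with point probabilities |<u_i|psi>|^2.
p = np.abs(psi) ** 2
h_classical = sum(p[i] * p[k] for i in range(3) for k in range(3) if f[i] != f[k])
print(h_quantum, h_classical)                      # equal, per the proposition
```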
For any observable F and a pure state ψ, a quantum logical entropy was defined as h(F : ψ) = tr[P[qudit(F)]ρ(ψ) ⊗ ρ(ψ)]. That definition was the quantum generalization of the “classical” logical entropy defined as h(π) = p × p(dit(π)). When a projective F -measurement is performed on ψ, the pure state density matrix ρ(ψ) is transformed into the mixed state density matrix by the quantum Lüders mixture operation, which then defines the quantum logical entropy h(ρ̂(ψ)) = 1 − tr[ρ̂(ψ)2].
The first result is to show that these two entropies are the same: h(F : ψ) = h(ρ̂(ψ)). The proof proceeds by showing that they are both equal to the classical logical entropy of the partition π(F : ψ) defined on the ON basis U = {u1, …, un} of F-eigenvectors by the F-eigenvalues, with the point probabilities pi = ∣〈ui∣ψ〉∣2.
To show that h(ρ̂(ψ)) = 1 − tr[ρ̂(ψ)2] = h(π(F : ψ)) for ρ̂(ψ) = ∑r∈f(U) Prρ(ψ)Pr, we need to compute tr[ρ̂(ψ)2]. An off-diagonal element ρik(ψ) survives the Lüders operation only if f(ui) = f(uk), so tr[ρ̂(ψ)2] = ∑(ui,uk)∈indit(π(F:ψ)) pipk = 1 − h(π(F : ψ)).
This finishes the proof of the following proposition.
h(F : ψ) = h(π(F : ψ)) = h(ρ̂(ψ)).
This shows how the quantum case is so closely related to the set case that, in many instances, we can compute results in the quantum case by converting to the set case where computations are simpler.
Measurement creates distinctions, i.e., turns coherences into “decoherences,” which, classically, is the operation of distinguishing elements by classifying them according to some attribute like classifying the faces of a die by their parity. The fundamental theorem about quantum logical entropy and projective measurement, in the density matrix version, shows how the quantum logical entropy created (starting with h(ρ(ψ)) = 0 for the pure state ψ) by the measurement can be computed directly from the coherences of ρ(ψ) that are decohered in ρ̂(ψ).
(Measuring measurement). The increase in quantum logical entropy, h(F : ψ) = h(ρ̂(ψ)), due to the F-measurement of the pure state ψ is the sum of the absolute squares of the non-zero off-diagonal terms (coherences) in ρ(ψ) (represented in an ON basis of F-eigenvectors) that are zeroed (“decohered”) in the post-measurement Lüders mixture density matrix ρ̂(ψ) = ∑r∈f(U) Prρ(ψ)Pr.
h(ρ̂(ψ)) − h(ρ(ψ)) = (1 − tr[ρ̂(ψ)2]) − (1 − tr[ρ(ψ)2]) = ∑i,k (|ρik(ψ)|2 − |ρ̂ik(ψ)|2). Now (ui, uk) is a qudit of F iff ρik(ψ) is one of the off-diagonal terms zeroed by the Lüders mixture operation ∑r∈f(U) Prρ(ψ)Pr to obtain ρ̂(ψ) from ρ(ψ).
Since h(F : ψ) = h(π(F : ψ)) we can carry over the probability interpretation in the classical case h(π(F : ψ)) to the quantum case.
Interpretation of quantum logical entropy
The quantum logical entropy h(F : ψ) is the probability, in two independent F -measurements of a prepared pure state ψ, that different eigenvalues will be obtained—just as the logical entropy h(f−1) is the probability in two independent draws from U that different f-values will be obtained.
It might be helpful to carry out a quantum version of the numerical example. In V = ℂ3, let |ψ〉 = α|a〉 + β|b〉 + γ|c〉 be a normalized state vector so that the point probabilities are pa = |α|2, pb = |β|2, and pc = |γ|2, and the pure state density matrix is ρC(ψ) = |ψ〉〈ψ|.
The diagonal elements of ρC(ψ) ⊗ ρC(ψ) are real products of probabilities. The projection operator P[qudit(F)] to the subspace generated by qudit(F) in V ⊗ V is a 9 × 9 diagonal matrix whose non-zero diagonal entries are ones corresponding to qudits(F) = {a ⊗ c, b ⊗ c, ...}. Thus the product P[qudit(F)]ρC(ψ) ⊗ ρC(ψ) just picks out (along its diagonal) those pairs of probabilities corresponding to dit(f−1), namely {papc, pbpc, ...}, and then taking the trace sums them up to yield h(F : ψ) = h(π(F : ψ)). It sums the four entries corresponding to the qudits, namely papc, pcpa, pbpc, and pcpb, giving h(F : ψ) = 2papc + 2pbpc.
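A sketch of this tensor-product computation (the amplitudes α, β, γ are our own illustrative choices, with f(a) = f(b) = 1 and f(c) = 0 so that f−1 = {{a, b}, {c}}):

```python
import numpy as np

alpha, beta, gamma = 1 / np.sqrt(2), 0.5j, 0.5     # illustrative amplitudes
psi = np.array([alpha, beta, gamma])               # |psi> in the basis {a, b, c}
rho = np.outer(psi, psi.conj())                    # rho_C(psi)
rho_tensor = np.kron(rho, rho)                     # rho_C(psi) (x) rho_C(psi)

f = np.array([1.0, 1.0, 0.0])                      # eigenvalue function
# P_[qudit(F)]: 9x9 diagonal projection selecting pairs with f(u_i) != f(u_k).
qudit_diag = np.array([1.0 if f[i] != f[k] else 0.0
                       for i in range(3) for k in range(3)])
h = np.trace(np.diag(qudit_diag) @ rho_tensor).real

pa, pb, pc = np.abs(psi) ** 2
print(h, 2 * pa * pc + 2 * pb * pc)                # both 0.375: the four qudit entries
```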
Quantum logical entropy also has natural connections with other quantum notions such as the Hilbert-Schmidt distance tr[(ρ − τ)2] [20] between two density matrices ρ and τ. And, as usual, the quantum case is developed as the quantum version of the “classical” logical entropy. We previously defined the logical Hamming distance between two partitions.
Nielsen and Chuang are skeptical about developing the Hamming distance in the quantum context.
Unfortunately, the Hamming distance between two objects is simply a matter of labeling, and a priori there aren’t any labels in the Hilbert space arena of quantum mechanics! [24, p. 399]
Using these density matrices, there is also the notion of the logical cross-entropy of π and σ: h(π||σ) = 1 − tr[ρ(π)ρ(σ)], which extends to any two density matrices ρ and τ as h(ρ||τ) = 1 − tr[ρτ].
Since ρ and τ are also Hermitian matrices, each has an ON basis of eigenvectors, and this approach to Hamming distance avoids the Nielsen-Chuang misgivings by using the amplitudes of all possible relations between the two ON bases [4, pp. 89–90], so no arbitrary labeling of the bases is involved. Then, with the quantum logical Hamming distance defined by analogy as d(ρ, τ) = 2h(ρ||τ) − h(ρ) − h(τ), we have the theorem connecting it to an important existing notion in quantum information theory.
(Hamming = Hilbert-Schmidt distance). tr[(ρ − τ)2] = d(ρ, τ).
tr[(ρ − τ)2] = tr[ρ2 + τ2 − 2ρ†τ] = tr[ρ2] + tr[τ2] − 2 tr[ρ†τ] = 2h(ρ||τ) − h(ρ) − h(τ) = d(ρ, τ).
In that manner, the computation of the quantum logical entropies can be reduced to the computations in the corresponding “classical” case of logical entropies. Moreover, the definitions are made in Table 7 so that we have all the usual compound notions of quantum logical entropy that satisfy the usual Venn diagram relationships as illustrated in Figure 4.

Figure 4. Venn diagram relationships for quantum logical entropy.
The overall purpose of this paper has been to develop quantum logical entropy starting from the logic of partitions at the set level, developing the quantitative version of partitions as logical entropy, and then developing the corresponding quantum notion using the yoga of linearization to translate the set concepts into the corresponding (Hilbert) vector space concepts.
There are a number of other results in the literature ([4,20,28]) about quantum logical entropy, such as its concavity, subadditivity, non-decreasing value under projective measurement, a Holevo-type bound for quantum logical Hamming distance, and the extension of quantum logical entropy to post-selected quantum systems. More results are sure to come as more researchers become familiar with the logical entropy concepts, starting with the quantitative treatment of partitions in terms of distinctions.
We find this framework of partitions and distinction most suitable (at least conceptually) for describing the problems of quantum state discrimination, quantum cryptography and in general, for discussing quantum channel capacity. In these problems, we are basically interested in a distance measure between such sets of states, and this is exactly the kind of knowledge provided by logical entropy [Reference to [1]]. [2, p. 1]
In conventional information theory, or what Claude Shannon called the “Mathematical Theory of Communication” [14], “no concept of information itself was defined” [29, p. 458]. The extension of Shannon entropy to the quantum notion of von Neumann entropy did not solve that problem of defining quantum information or the problem of interpretation. Logical entropy as the quantification of partitions defines the notion of information-as-distinctions, and quantum logical entropy extends that notion to the quantum realm as the quantification of quantum distinctions or qudits. This answers the vision of Charles Bennett, one of the founders of quantum information theory.
So information really is a very useful abstraction. It is the notion of distinguishability abstracted away from what we are distinguishing, or from the carrier of information.... ...
And we ought to develop a theory of information which generalizes the theory of distinguishability to include these quantum properties.... [30, pp. 155–157]
Whenever possible, we ignore the ket notation |ui〉 and just write ui.