334 Geometric Embedding of Multidimensional Random Variables: Geometric Operations of Joint Distributions, Marginals, and Conditionals


Bosley Zhang


2026/05/25


Paper 3: Geometric Embedding of Multivariate Random Variables: A Unified Triple Operation of Joint Distribution, Marginalization, and Conditioning

Author: Zhang Suhang

Affiliation: Luoyang, Henan, China

Abstract

Building upon the algebraic-probabilistic-geometric triple isomorphism paradigm and the midpoint extremum theorem established in Paper 1, and the univariate distribution triple realization in Paper 2, this paper rigorously extends the triple unification system to n-dimensional continuous random vectors. It breaks through the limitation of traditional multivariate probability theory, which only maintains a probability-geometry bidirectional correspondence, and establishes a complete mapping in which algebraic convex analysis, multivariate probability measures, and Euclidean hypersurface geometry are placed in one-to-one correspondence: any n-dimensional joint probability distribution can be bidirectionally and equivalently embedded into two dual hypersurfaces (the density surface and the convex potential surface) in ℝ^{n+1} space. Relying on the multivariate midpoint extremum theorem, this paper achieves the triple synchronous derivation of three core classes of multivariate probability operations: 1. Marginalization: algebraic multiple integration elimination ⇔ probabilistic marginal distribution solving ⇔ geometric hypersurface orthogonal weighted projection; 2. Conditioning: algebraic fractional normalization transformation ⇔ probabilistic conditional distribution definition ⇔ geometric hyperplane slicing with vertical rescaling; 3. Independence: algebraic potential function additive decomposition ⇔ probabilistic density product decomposition ⇔ geometric hypersurface direct product topological decomposition. Simultaneously, it completes the triple-language unified characterization of Bayes' theorem and the multivariate normal distribution, proving that the elliptic paraboloid hypersurface of the multivariate normal naturally satisfies the multivariate midpoint extremum equivalence relation. 
This paper fills the algebraic deficiency gap in multivariate scenarios and achieves a fully self-consistent triple unification for static multivariate probability across the entire domain, providing high-dimensional foundational support for Paper 4 (geometric-algebraic reconstruction of probability axioms) and Paper 5 (geometric flows of stochastic processes).

Keywords: multivariate triple isomorphism; multivariate midpoint extremum theorem; convex potential hypersurface; marginalization projection; conditioning slicing; Bayesian geometrization and algebrization

---

§1 Introduction

1.1 The Legacy Gap in the Previous Paradigm

Paper 1, through the univariate midpoint extremum theorem, established the underlying equivalence among convex algebra, probability, and geometry for single variables, explicitly identifying the potential function h(x) = −log p(x) as the core medium for triple intercommunication. Paper 2 completed the concrete triple verification for univariate distributions. However, both are confined to one-dimensional spaces. Direct extension to multiple dimensions suffers from fatal shortcomings:

1. Traditional multivariate geometric probability and information geometry only construct probability-geometry mappings; multivariate convex algebra (quadratic forms, Hessian matrices, multivariate gradients) remains entirely detached, serving merely as a computational tool;
2. There is no corresponding midpoint extremum axiom in the multivariate setting, failing to explain the binding relationship among the expectation vector of multivariate random variables, the density peak, and the extremum point of the hypersurface;
3. The three fundamental operations—marginalization, conditioning, and independence—lack synchronous algebraic structural interpretations and remain at the levels of integration and geometric intuition.

1.2 Core Innovation Goals of This Paper

This paper no longer implicitly invokes the conclusions of Paper 1. It independently proposes the multivariate midpoint extremum theorem and systematically decomposes every multivariate probability proposition into three equivalent formulations (algebraic, probabilistic, and geometric), so that the three are derived in parallel at every theorem and every operation:

· Algebraic layer: multivariate convex analysis, matrix quadratic forms, differential homeomorphisms, and multiple integration as the foundational language;
· Probabilistic layer: multivariate joint, marginal, conditional, independence, and Bayesian measures as the standard language;
· Geometric layer: ℝ^{n+1} hypersurfaces, projections, slices, and direct-product Riemannian geometry as the intuitive language.

Two dual embedding surfaces are distinguished: the density surface z = p(x) for intuitive visualization, and the convex potential surface z = h(x) = −log p(x) for rigorous algebraic derivation, connected via the mutually inverse exponential-logarithmic transformation.

1.3 Organization of the Paper

§2 constructs the foundational definitions of multivariate triple embedding and the multivariate midpoint extremum theorem (the algebraic anchor of this paper); §3 presents a two-dimensional visualization triple comparison case; §4 proves the triple equivalence of marginalization; §5 proves the triple equivalence of conditioning; §6 proves the triple equivalence of independence; §7 presents the triple unified reconstruction of Bayes' theorem; §8 provides the global triple verification of the multivariate normal distribution; §9 demonstrates the invariance of the triple framework in high-dimensional non-visualizable spaces; §10 concludes and outlines connections to the series papers.

---

§2 Foundational Multivariate Triple Embedding and the Multivariate Midpoint Extremum Theorem

2.1 Dual Hypersurface Triple Embedding Definition

Let X = (X₁, X₂, ..., Xₙ) be an n-dimensional continuous random vector with joint Borel probability measure P, absolutely continuous with respect to Lebesgue measure on ℝⁿ, with joint density p(x). Define two pairs of bidirectional embedding mappings that serve as the carriers of the triple isomorphism:

1. Probabilistic-Intuitive Geometric Embedding (Density Surface)

Embedding space ℝ^{n+1}, embedding map ιₚ: ℝⁿ → ℝ^{n+1}, ιₚ(x) = (x, p(x)), with image Σₚ as the n-dimensional density hypersurface.

Probability-geometry correspondence: P(X ∈ A) = Vol_{n+1}{(x, z) : x ∈ A, 0 ≤ z ≤ p(x)}, that is, the volume under the hypersurface over A.

2. Algebraic-Foundational Geometric Embedding (Convex Potential Surface)

Embedding space ℝ^{n+1}, embedding map ιₕ: ℝⁿ → ℝ^{n+1}, ιₕ(x) = (x, h(x)), h(x) = −log p(x), with image Σₕ as the n-dimensional potential hypersurface.

Algebra-probability correspondence: P(X ∈ A) = ∫_A e^{−h(x)} dx, exactly matching the Gibbs algebraic measure form.

The two surfaces satisfy the algebraic inverse relation: p(x) = exp(−h(x)); all geometric operations are bidirectionally transformable via exponential and logarithmic mappings.
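
The dual embedding can be checked numerically on a small case. The following is a minimal sketch, assuming a standard bivariate normal as the example distribution (this choice, the grid bounds, and all variable names are illustrative and not part of the paper). It evaluates P(X ∈ A) as the volume under the density surface, using the Gibbs form e^{−h}, and confirms the exponential-logarithmic round trip between the two surfaces.

```python
import numpy as np

def h(x, y):
    """Potential surface of the standard bivariate normal (assumed example)."""
    return 0.5 * (x**2 + y**2) + np.log(2.0 * np.pi)

def p(x, y):
    """Density surface recovered from the potential: p = exp(-h)."""
    return np.exp(-h(x, y))

# Midpoint grid on [-8, 8]^2; the tails outside this box are negligible.
n = 800
edges = np.linspace(-8.0, 8.0, n + 1)
mid = 0.5 * (edges[:-1] + edges[1:])
dx = edges[1] - edges[0]
X, Y = np.meshgrid(mid, mid, indexing="ij")

density = p(X, Y)

# Volume under the whole density surface (should be close to 1).
total_mass = density.sum() * dx * dx

# Region A = first quadrant; by symmetry its exact probability is 1/4.
in_A = (X > 0) & (Y > 0)
prob_A = density[in_A].sum() * dx * dx

# Round trip between the two surfaces: h = -log p.
round_trip_error = np.max(np.abs(-np.log(density) - h(X, Y)))

print(total_mass, prob_A, round_trip_error)
```

The same grid sum serves as both the geometric volume and the algebraic integral of e^{−h}, which is the point of the correspondence stated above.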

---

2.2 Theorem 2.1 (Multivariate Midpoint Extremum Theorem, the Overall Algebraic Anchor of This Paper)

If h(x) ∈ C²(ℝⁿ) is a strictly convex multivariate potential function (corresponding to a log-concave, unimodal multivariate distribution), and μ = 𝔼[X] = ∫_{ℝⁿ} x e^{−h(x)} dx is the multivariate expectation midpoint vector, then the following three propositions are pairwise equivalent:

1. Algebraic Proposition: The gradient vanishes at μ, that is, ∇h(μ) = 0, the Hessian matrix ∇²h(μ) is positive definite, and h(x) attains its strict global minimum at μ;

2. Probabilistic Proposition: The joint density p(x) attains its global strict maximum at μ, and μ is the peak midpoint of the multivariate distribution;

3. Geometric Proposition: The potential hypersurface Σₕ at the point (μ, h(μ)) is the global weighted volumetric centroid and the global valley bottom of the hypersurface, with all directional principal curvatures strictly positive.

Corollary 2.1 (Multivariate Direct Product Extremum Corollary)

If X = (X₁, X₂) with independent components, then h(x₁, x₂) = h₁(x₁) + h₂(x₂). The global minimum point of the joint potential is (μ₁, μ₂), equal to the direct product of the component midpoints, and the joint density peak is the product of the component peaks.

Proof Sketch

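· Algebraic layer: For a strictly convex C² function, a zero gradient is necessary and sufficient for a global minimum, and that minimum is unique; a positive definite Hessian at the point is the standard second-order sufficient condition;
· Probabilistic layer: ∇h(x) = −∇log p(x) = −∇p(x)/p(x). Integrating ∇p over ℝⁿ gives the integral identity 𝔼[∇h(X)] = −∫ ∇p(x) dx = 0, since the boundary terms vanish for a decaying density. When ∇h is affine (the Gaussian case) or p is symmetric about μ, this yields the zero gradient condition ∇h(μ) = 0 directly; for skewed log-concave densities the mean and the mode can differ, so the zero gradient at μ is then a hypothesis rather than a consequence;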
· Geometric layer: e^{−h(x)} serves as the weighted volume element of the surface; the weighted centroid naturally coincides with the surface valley bottom, proven directly by the Riemannian measure centroid definition.
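
The three layers of the proof sketch can be illustrated numerically. The sketch below assumes a correlated bivariate normal with hypothetical values of μ and Σ (none of these numbers come from the paper). In this Gaussian case ∇h is affine, so the identity 𝔼[∇h(X)] = 0 transfers to ∇h(𝔼[X]) = 0, and the numerical mean, the valley bottom of h, and the density peak coincide.

```python
import numpy as np

# Hypothetical parameters of a bivariate normal (illustrative only).
mu = np.array([1.0, -0.5])
Sigma = np.array([[1.0, 0.6],
                  [0.6, 2.0]])
P = np.linalg.inv(Sigma)  # precision matrix, equal to the Hessian of h

def midpoints(center, half_width, n):
    """Cell midpoints and cell width of a uniform grid around `center`."""
    edges = np.linspace(center - half_width, center + half_width, n + 1)
    return 0.5 * (edges[:-1] + edges[1:]), edges[1] - edges[0]

# An odd cell count puts one midpoint (numerically) at the grid center mu.
xs, dx = midpoints(mu[0], 9.0, 601)
ys, dy = midpoints(mu[1], 9.0, 601)
X, Y = np.meshgrid(xs, ys, indexing="ij")
D = np.stack([X - mu[0], Y - mu[1]], axis=-1)   # displacements x - mu

grad_h = D @ P                                   # grad h(x) = P (x - mu), P symmetric
log_norm = np.log(2.0 * np.pi * np.sqrt(np.linalg.det(Sigma)))
h = 0.5 * np.sum(D * grad_h, axis=-1) + log_norm
w = np.exp(-h) * dx * dy                         # probability weights e^{-h} dx dy

# Probabilistic layer: expectation vector and E[grad h(X)].
mean_numeric = np.array([np.sum(w * X), np.sum(w * Y)])
expected_grad = np.tensordot(w, grad_h, axes=([0, 1], [0, 1]))

# Algebraic layer: gradient at the mean and Hessian eigenvalues.
grad_at_mean = P @ (mean_numeric - mu)
hessian_eigs = np.linalg.eigvalsh(P)

# Geometric layer: valley bottom of the potential surface on the grid.
k = np.argmin(h)
valley = np.array([X.flat[k], Y.flat[k]])

print(mean_numeric, valley, expected_grad, hessian_eigs)
```

A skewed density would be the natural stress test for this check, since there the mean and the valley bottom need not agree.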

---

2.3 Discrete and Mixed Distribution Triple Adaptations

· Discrete multivariate distributions: Replace the reference measure with counting measure; algebraic potential hᵢ = −log pᵢ; the probability midpoint is the discrete weighted mean; the geometry comprises extreme points of vertical line segments on a discrete lattice;
· Mixed multivariate distributions: Use a Lebesgue-counting mixed measure; the triple equivalence relations retain the same form, with only a partition of the integration domain according to the measure.
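
For the discrete adaptation, integrals become sums under counting measure. The following sketch uses a hypothetical 3×2 joint probability table (the numbers are invented for illustration) to show the discrete potential hᵢⱼ = −log pᵢⱼ, marginalization as summation, conditioning as slicing followed by renormalization, and the discrete weighted mean.

```python
import numpy as np

# Hypothetical joint table p[i, j] = P(X = i, Y = j); entries sum to 1.
p = np.array([[0.10, 0.05],
              [0.30, 0.15],
              [0.25, 0.15]])

h = -np.log(p)                          # discrete potential h_ij = -log p_ij

# Marginalization: counting-measure "integration" over j.
p_x = p.sum(axis=1)
h_x = -np.log(np.exp(-h).sum(axis=1))   # marginal potential, equals -log p_x

# Conditioning: slice the table at i = 1, then renormalize.
p_y_given_x1 = p[1] / p[1].sum()

# Probability midpoint of X: the discrete weighted mean.
mean_x = np.sum(np.arange(3) * p_x)

print(p_x, p_y_given_x1, mean_x)
```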

---

§3 Two-Dimensional Visualization Triple Comparison Benchmark

Taking the two-dimensional random vector (X, Y) as a concrete case, establish a static triple comparison table as a reference for all subsequent dynamic operations:

| Algebraic (Bivariate Convex Analysis) | Probabilistic (Bivariate Joint Distribution) | Geometric (z = h(x,y) Surface) |
|---|---|---|
| Bivariate convex quadratic potential | Negative joint log-density, −log p(x, y) | Convex elliptic paraboloid potential surface |
| Gradient zero point | Bivariate expectation midpoint, density peak | Surface valley bottom, volumetric centroid |
| Positive definite Hessian matrix | Log-concave, unimodal distribution with no multiple modes | Globally positive curvature in all directions, no saddle points |
| Normalization ∫∫ e^{−h(x,y)} dx dy = 1 | Total joint probability over the entire space equals 1 | Total volume under the density surface normalized to 1 |

Intuitive illustration: The bivariate normal density surface is a bell-shaped hill; the dual potential surface is an upward-opening, bowl-shaped paraboloid. The hilltop and the bowl bottom lie at the same (x, y) coordinates, corresponding to the unified triple midpoint extremum.

---

§4 Marginalization: Triple Equivalence Proof

4.1 Algebraic Formulation (Multiple Integration Elimination Theorem)

For the bivariate joint potential h(x, y), define the marginal potential:

hₓ(x) = −log ∫_{ℝ} e^{−h(x,y)} dy

Algebraic operation: Perform Lebesgue multiple integration elimination along the y-dimension—this is a marginal projection algebraic transformation of multivariate functions. The marginal potential remains strictly convex and satisfies the univariate midpoint extremum theorem.

---

4.2 Probabilistic Formulation (Marginal Distribution Definition)

Marginal density:

pₓ(x) = ∫_{ℝ} p(x,y) dy = ∫_{ℝ} e^{−h(x,y)} dy = e^{−hₓ(x)}

Probabilistic meaning: Eliminating the uncertainty of the random variable Y, retaining only the probability measure in the X-dimension.

---

4.3 Geometric Formulation (Weighted Orthogonal Projection)

1. Potential surface geometry: Σₕ projected along the y-axis via weighted orthogonal projection onto the xz-coordinate plane—the resulting one-dimensional convex curve is the marginal potential curve;
2. Density surface geometry: The bell-shaped hill is flattened laterally along y, accumulating vertical volume; the flattened profile is the marginal density curve;
3. Triple invariance: By the multivariate midpoint extremum theorem, the marginal midpoint after projection remains the algebraic gradient zero point, the marginal density peak, and the marginal surface valley bottom—the triple equivalence is not destroyed by projection.

---

4.4 Unified Conclusion

Marginalization ⇔ Algebraic multiple integration elimination ⇔ Probabilistic variable elimination ⇔ Geometric hypersurface weighted orthogonal projection
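
The marginalization equivalence can be sketched numerically. The example below assumes a zero-mean bivariate normal with unit variances and a hypothetical correlation ρ = 0.7 (an illustrative choice, not taken from the paper). It computes the marginal potential hₓ(x) = −log ∫ e^{−h(x,y)} dy with a numerically stable log-sum-exp and compares it with the known potential of the N(0, 1) marginal.

```python
import numpy as np

rho = 0.7  # hypothetical correlation

def h(x, y):
    """Joint potential of a zero-mean, unit-variance bivariate normal."""
    q = (x**2 - 2.0 * rho * x * y + y**2) / (1.0 - rho**2)
    return 0.5 * q + np.log(2.0 * np.pi * np.sqrt(1.0 - rho**2))

# Integration grid along y (the eliminated dimension).
y_edges = np.linspace(-10.0, 10.0, 2001)
ys = 0.5 * (y_edges[:-1] + y_edges[1:])
dy = y_edges[1] - y_edges[0]

# Points along x where the marginal potential is evaluated.
xs = np.linspace(-3.0, 3.0, 61)
X, Y = np.meshgrid(xs, ys, indexing="ij")

# h_X(x) = -log of the integral of exp(-h(x, y)) over y, via log-sum-exp.
H = h(X, Y)
m = H.min(axis=1, keepdims=True)
h_x = m[:, 0] - np.log(np.sum(np.exp(-(H - m)), axis=1) * dy)

# Known result: the marginal is N(0, 1), whose potential is x^2/2 + log(2 pi)/2.
h_x_exact = 0.5 * xs**2 + 0.5 * np.log(2.0 * np.pi)
max_err = np.max(np.abs(h_x - h_x_exact))

# The projected curve keeps its valley bottom at the marginal mean.
valley = xs[np.argmin(h_x)]

print(max_err, valley)
```

Subtracting the row minimum before exponentiating is a design choice that keeps the sum well scaled when h is large; it does not change the result.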

---

§5 Conditioning: Triple Equivalence Proof

5.1 Algebraic Formulation (Fractional Normalized Differential Transformation)

Fixing x = x₀, take the cross-section of the bivariate potential function to obtain the sectional potential h_{x₀}(y) = h(x₀, y). The algebraic normalization transformation is:

h(y|x₀) = h_{x₀}(y) + log ∫_{ℝ} e^{−h_{x₀}(y)} dy = h(x₀, y) − hₓ(x₀)

Essence: Translation normalization of the local sectional potential. Translation does not alter convexity or the gradient zero point; therefore, the cross-section still satisfies the univariate midpoint extremum theorem.

---

5.2 Probabilistic Formulation (Conditional Distribution Axiom)

Conditional density definition:

p(y|x₀) = p(x₀,y) / pₓ(x₀) = e^{−h(x₀,y)} / ∫ e^{−h(x₀,y)} dy = e^{−h(y|x₀)}

Probabilistic meaning: When X = x₀ is known, the posterior probability of Y is renormalized.

---

5.3 Geometric Formulation (Parallel Slicing + Vertical Rescaling)

1. Geometric operation: Use a hyperplane perpendicular to the x-axis, x = x₀, to slice the joint hypersurface in parallel, obtaining a one-dimensional cross-sectional curve;
2. Rescaling logic: The area under the cross-sectional density curve equals pₓ(x₀), which is generally not 1; rescale vertically along the z-axis by the factor 1/pₓ(x₀) to normalize it (on the potential surface this is a vertical translation by −hₓ(x₀));
3. Triple invariance: The conditional distribution midpoint after slicing is simultaneously the algebraic gradient zero point of the cross-section, the peak of the sectional density, and the valley bottom of the sectional curve.

---

5.4 Unified Conclusion

Conditioning ⇔ Algebraic sectional potential normalized translation ⇔ Probabilistic conditional measure normalization ⇔ Geometric hyperplane slicing with vertical rescaling
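
Slicing followed by vertical rescaling can be reproduced in a few lines. This sketch again assumes a zero-mean, unit-variance bivariate normal with hypothetical ρ = 0.7 and a hypothetical slice position x₀ = 1.2. Following the conditional density definition p(y|x₀) = p(x₀,y) / pₓ(x₀), the conditional potential is the sectional potential shifted by log ∫ e^{−h(x₀,y)} dy; the result is compared with the known conditional law N(ρx₀, 1 − ρ²).

```python
import numpy as np

rho, x0 = 0.7, 1.2  # hypothetical correlation and slice position

def h(x, y):
    """Joint potential of a zero-mean, unit-variance bivariate normal."""
    q = (x**2 - 2.0 * rho * x * y + y**2) / (1.0 - rho**2)
    return 0.5 * q + np.log(2.0 * np.pi * np.sqrt(1.0 - rho**2))

edges = np.linspace(-10.0, 10.0, 4001)
ys = 0.5 * (edges[:-1] + edges[1:])
dy = edges[1] - edges[0]

# Step 1 (slice): sectional potential along the hyperplane x = x0.
section = h(x0, ys)

# Step 2 (rescale): log of the area under the slice, i.e. log p_X(x0).
log_area = np.log(np.sum(np.exp(-section)) * dy)
h_cond = section + log_area          # h(y | x0) = -log p(y | x0)
p_cond = np.exp(-h_cond)

mass = np.sum(p_cond) * dy
cond_mean = np.sum(ys * p_cond) * dy
cond_var = np.sum((ys - cond_mean) ** 2 * p_cond) * dy
peak = ys[np.argmax(p_cond)]         # valley bottom of the normalized slice

print(mass, cond_mean, cond_var, peak)
```

Here the slice midpoint, the density peak of the slice, and the valley bottom of the sectional potential all sit at ρx₀, up to the grid resolution.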

---

§6 Independence: Triple Equivalence Proof

6.1 Algebraic Formulation (Potential Function Additive Decomposition)

X ⟂ Y ⇔ h(x,y) = hₓ(x) + hᵧ(y)

Algebraic property: The joint Hessian matrix is block-diagonal: ∇²h = diag(∇²hₓ, ∇²hᵧ). Global convexity is guaranteed independently by component convexity, with no cross second-order partial derivatives.

---

6.2 Probabilistic Formulation (Density Product Decomposition)

X ⟂ Y ⇔ p(x,y) = pₓ(x) pᵧ(y)

Derived directly from the exponential-logarithmic inverse relation: e^{−hₓ − hᵧ} = e^{−hₓ} ⋅ e^{−hᵧ}. Probabilistic independence is equivalent to density multiplicative decomposition.

---

6.3 Geometric Formulation (Hypersurface Direct Product Topological Decomposition)

1. Topological structure: The joint hypersurface can be decomposed as the Cartesian direct product of two one-dimensional surfaces of X and Y, with no surface coupling deformation;
2. Slicing characteristics: At any two positions x₁ and x₂, the normalized slices coincide exactly; equivalently, the potential slices h(x₁, ·) and h(x₂, ·) differ only by the constant hₓ(x₁) − hₓ(x₂);
3. Curvature characteristics: The Hessian determinant of the joint potential factorizes as hₓ''(x) · hᵧ''(y), so at the valley bottom the Gaussian curvature of the joint surface equals the product of the curvatures of the two component curves, with no cross-curvature terms.

---

6.4 Unified Conclusion

Independence ⇔ Algebraic potential additivity, block-diagonal Hessian ⇔ Probabilistic density product decomposition ⇔ Geometric hypersurface direct product with no coupling
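
The independence equivalence does not depend on normality. The sketch below assumes two hypothetical non-Gaussian convex component potentials, h1 and h2 (chosen only for illustration), builds the additive joint potential, and checks three things: the normalized density factorizes, normalized slices at two different x positions coincide, and the mixed second partial derivative of h vanishes.

```python
import numpy as np

def h1(x):
    """Hypothetical convex component potential (unnormalized)."""
    return 0.25 * x**4 + 0.5 * x**2

def h2(y):
    """Hypothetical convex component potential (unnormalized)."""
    return np.cosh(y)

def h(x, y):
    return h1(x) + h2(y)             # additive decomposition

n = 400
edges = np.linspace(-5.0, 5.0, n + 1)
grid = 0.5 * (edges[:-1] + edges[1:])
d = edges[1] - edges[0]
X, Y = np.meshgrid(grid, grid, indexing="ij")

joint = np.exp(-h(X, Y))
joint /= joint.sum() * d * d         # normalize the density surface

# Probabilistic layer: p(x, y) = p_X(x) p_Y(y).
p_x = joint.sum(axis=1) * d
p_y = joint.sum(axis=0) * d
factor_err = np.max(np.abs(joint - np.outer(p_x, p_y)))

# Geometric layer: normalized slices at different x positions coincide.
def normalized_slice(i):
    s = joint[i]
    return s / (s.sum() * d)

i1, i2 = 160, 280                    # two arbitrary slice positions
slice_err = np.max(np.abs(normalized_slice(i1) - normalized_slice(i2)))

# Algebraic layer: mixed partial of h by central finite differences.
a, b, eps = 0.5, -0.3, 1e-3
mixed = (h(a + eps, b + eps) - h(a + eps, b - eps)
         - h(a - eps, b + eps) + h(a - eps, b - eps)) / (4.0 * eps**2)

print(factor_err, slice_err, mixed)
```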

---

§7 Triple Unified Reconstruction of Bayes' Theorem

7.1 Algebraic Form

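Let the parameter θ have prior density π(θ), and let the observation x be given. The joint potential is h(x, θ) = h_θ(θ) + h_{x|θ}(x|θ), where h_θ = −log π is the prior potential and h_{x|θ} = −log p(x|θ) is the likelihood potential.

Algebraic Bayes:

h(θ|x₀) = h(x₀, θ) + log ∫ e^{−h(x₀, θ)} dθ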

Underlying algebraic logic: The joint potential cross-section is normalized by translation, identical in origin to the conditional distribution algebraic operation.

---

7.2 Probabilistic Form

Standard Bayesian measure:

π(θ|x₀) = p(x₀|θ) π(θ) / p(x₀)

The prior corresponds to the marginal potential, the likelihood to the conditional potential, and the posterior to the slice-normalized potential.

---

7.3 Geometric Form

1. Prior: The global base shape of the joint surface along the θ-direction;
2. Likelihood: The undulating deformation of the joint surface along the x-direction;
3. Posterior: The new curve obtained by slicing at the observation x = x₀ and then normalizing vertically.

Direct geometric interpretation: Bayesian updating = one slicing operation + one algebraic normalization + one geometric rescaling, with no additional probabilistic semantics.

---

7.4 Triple Unified Core

Bayesian inference ⇔ Algebraic sectional potential translation normalization ⇔ Probabilistic conditional measure updating ⇔ Geometric observation slicing normalization
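
Bayesian updating as "slice, then normalize" can be sketched with a conjugate model where the answer is known in closed form. The example assumes a normal prior θ ~ N(m0, s0²) and a normal likelihood x | θ ~ N(θ, s²); the model and all numeric values are hypothetical choices for illustration. The joint potential is sliced at x = x₀, shifted by the log evidence, and the resulting posterior is compared with the standard conjugate formulas.

```python
import numpy as np

m0, s0 = 0.0, 2.0    # hypothetical prior mean and standard deviation
s, x0 = 1.0, 1.5     # hypothetical likelihood std and observed value

def h_prior(t):
    """Prior potential -log pi(theta) for N(m0, s0^2)."""
    return 0.5 * ((t - m0) / s0) ** 2 + np.log(s0 * np.sqrt(2.0 * np.pi))

def h_lik(x, t):
    """Likelihood potential -log p(x | theta) for N(theta, s^2)."""
    return 0.5 * ((x - t) / s) ** 2 + np.log(s * np.sqrt(2.0 * np.pi))

edges = np.linspace(-15.0, 15.0, 6001)
theta = 0.5 * (edges[:-1] + edges[1:])
dt = edges[1] - edges[0]

# Slice the joint potential h(x, theta) = h_prior + h_lik at x = x0.
section = h_prior(theta) + h_lik(x0, theta)

# Normalize: log of the area under the slice is the log evidence log p(x0).
log_evidence = np.log(np.sum(np.exp(-section)) * dt)
h_post = section + log_evidence
post = np.exp(-h_post)

post_mean = np.sum(theta * post) * dt
post_var = np.sum((theta - post_mean) ** 2 * post) * dt

# Closed-form conjugate results for comparison.
prec = 1.0 / s0**2 + 1.0 / s**2
exact_mean = (m0 / s0**2 + x0 / s**2) / prec
exact_var = 1.0 / prec
var_x = s0**2 + s**2   # marginal variance of x
exact_log_evidence = -0.5 * (x0 - m0) ** 2 / var_x - 0.5 * np.log(2.0 * np.pi * var_x)

print(post_mean, post_var, log_evidence)
```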

---

§8 Multivariate Normal Distribution: Global Triple Verification

Let the n-dimensional normal random vector X ~ N(μ, Σ) have joint density:

p(x) = 1/((2π)^{n/2} |Σ|^{1/2}) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ))

---

8.1 Algebraic Layer

Potential function: h(x) = ½ (x − μ)ᵀ Σ⁻¹ (x − μ) + C, where C = ½ log((2π)ⁿ |Σ|)

1. Gradient: ∇h(x) = Σ⁻¹ (x − μ), so ∇h(μ) = 0, satisfying the midpoint extremum zero condition;
2. Hessian matrix: ∇²h = Σ⁻¹ is always positive definite—globally strictly convex;
3. Independence special case: Σ block-diagonal ⇔ cross partial derivatives vanish ⇔ potential function has additive decomposition.

---

8.2 Probabilistic Layer

1. μ is the multivariate expectation and the global density peak, satisfying the multivariate midpoint extremum;
2. Marginalization: Normal projections remain normal; Conditioning: Normal slices remain normal;
3. For jointly normal components, uncorrelatedness is equivalent to independence, matching the algebraic block-diagonal Hessian condition.

---

8.3 Geometric Layer

1. Potential surface: An elliptic paraboloid convex hypersurface in ℝ^{n+1}, with μ as the global valley bottom;
2. Marginalization: Orthogonal projection of the elliptic paraboloid remains a lower-dimensional elliptic paraboloid;
3. Conditioning: Parallel slicing of the elliptic paraboloid remains a lower-dimensional elliptic paraboloid.

Verification Conclusion: The multivariate normal distribution is the standard exemplar of multivariate triple isomorphism; all algebraic, probabilistic, and geometric conclusions are fully self-consistent, with no exceptional deviations.
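
The three layers of this verification can be checked with linear algebra alone. The sketch below assumes a hypothetical 3-dimensional normal, split as X = (X_a, X_b) with a = {0, 1} and b = {2}; the numeric values of μ and Σ are invented for illustration. It confirms that the Hessian of h is Σ⁻¹, that projecting (marginalizing) gives a paraboloid with Hessian equal to the Schur complement of the precision matrix, and that slicing (conditioning) gives a paraboloid whose Hessian is the precision block itself.

```python
import numpy as np

# Hypothetical parameters (illustrative only).
mu = np.array([1.0, 0.0, -1.0])
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])
P = np.linalg.inv(Sigma)             # precision matrix
a, b = [0, 1], [2]                   # kept block and eliminated/fixed block

def h(x):
    """Potential up to the additive constant C."""
    dlt = x - mu
    return 0.5 * dlt @ P @ dlt

# Algebraic layer: Hessian of h at mu by central finite differences.
eps, I = 1e-4, np.eye(3)
hess = np.array([[(h(mu + eps * (I[i] + I[j])) - h(mu + eps * (I[i] - I[j]))
                   - h(mu - eps * (I[i] - I[j])) + h(mu - eps * (I[i] + I[j])))
                  / (4.0 * eps**2) for j in range(3)] for i in range(3)])

Saa, Sab, Sbb = Sigma[np.ix_(a, a)], Sigma[np.ix_(a, b)], Sigma[np.ix_(b, b)]
Paa, Pab, Pbb = P[np.ix_(a, a)], P[np.ix_(a, b)], P[np.ix_(b, b)]

# Marginalization (projection): Hessian of the marginal potential of X_a.
marg_hessian = Paa - Pab @ np.linalg.inv(Pbb) @ Pab.T

# Conditioning (slice at X_b = xb): covariance and mean of X_a | X_b.
xb = np.array([0.5])
cond_cov = Saa - Sab @ np.linalg.inv(Sbb) @ Sab.T
cond_mean_cov_form = mu[a] + Sab @ np.linalg.inv(Sbb) @ (xb - mu[b])
cond_mean_prec_form = mu[a] - np.linalg.inv(Paa) @ Pab @ (xb - mu[b])

print(np.linalg.eigvalsh(P), cond_mean_cov_form)
```

The covariance form and the precision form of the conditional mean agree, which is the algebraic counterpart of the slice keeping its valley bottom under normalization.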

---

§9 Triple Invariance in High-Dimensional Non-Visualizable Spaces

For n ≥ 3 hypersurfaces that cannot be visually intuited, the triple unification relations remain entirely invariant and are not constrained by dimensionality:

1. Algebraic invariance: Multivariate convex analysis, matrix quadratic forms, and multiple integration rules do not change with increasing dimensionality;
2. Probabilistic invariance: Multivariate measure axioms and the definitions of marginalization, conditioning, and independence are universally applicable;
3. Geometric invariance: No visual depiction is required; relying solely on the intrinsic geometry of Riemannian manifolds, projection, slicing, and direct product are intrinsic operations independent of external visualization.

Computational value: Grid-based numerical integration in high dimensions has a cost that grows exponentially with the dimension. Under the triple isomorphism, probability integrals can be reinterpreted as intrinsic volume measurements on hypersurfaces, which suggests a geometric route to high-dimensional Bayesian numerical computation that avoids explicit grid integration.

---

§10 Conclusion and Series Connections

10.1 Core Achievements of This Paper (Filling the Multivariate Algebraic Gap)

1. Independently proposed the multivariate midpoint extremum theorem, filling the algebraic gap of the earlier version of Paper 3 and achieving triple unification in the multivariate setting without relying on implicit citations of Paper 1;
2. For all fundamental multivariate probability operations, achieved bidirectional mutual derivation among algebra, probability, and geometry, with no layer left detached;
3. Distinguished the dual system of density surfaces and convex potential surfaces, suited respectively for intuitive applications and rigorous theoretical derivations.

---

10.2 Final Qualitative Characterization of the Triple Unification

In this revised version, from definitions and core axioms through fundamental operations to example verification, the entire chain is connected across multivariate convex algebra, multivariate probability measures, and high-dimensional hypersurface geometry, forming a complete static closed loop with the univariate triple unification of Paper 1.

---

10.3 Series Paper Connections

1. Paper 1: Univariate triple isomorphism axiomatic system;
2. Paper 2: Concrete triple implementation of univariate distributions;
3. Paper 3: Concrete triple implementation of multivariate distributions (this paper);
4. Paper 4: Geometric reconstruction of probability axiom systems: Kolmogorov axioms equivalent to geometric measure axioms;
5. Paper 5: Stochastic processes and geometric flows: from random walks to Brownian motion to quantum probability.

---

Appendix A: Two-Dimensional Triple Operation Flowchart

1. Joint distribution: Algebraic bivariate convex function ⇔ Bivariate joint probability ⇔ 3D dual hypersurfaces
2. Marginalization: Algebraic integration over y ⇔ Marginal probability ⇔ Orthogonal projection onto the xz-plane
3. Conditioning: Algebraic sectional normalization ⇔ Conditional probability ⇔ Vertical slice rescaling
4. Independence: Algebraic potential additivity ⇔ Density product ⇔ Surface direct product decomposition

---

References

[1] Zhang Suhang. The Foundational Paradigm of Probability-Geometry Isomorphism: From Gaussian Distributions to General Measure Correspondence, 2026. (Univariate triple axioms)

[2] Zhang Suhang. Geometric Realization of Univariate Probability Distributions: Bell, Step, Lattice, and Fractal, 2026. (Univariate concrete verification)

[3] Anderson, T. W. An Introduction to Multivariate Statistical Analysis. Wiley, 2003. (Multivariate normal measures)

[4] Boyd, S., Vandenberghe, L. Convex Optimization. Cambridge University Press, 2004. (Multivariate convex algebra theory)

[5] Billingsley, P. Probability and Measure. Wiley, 1995. (Multivariate Lebesgue measure)

[6] Chern, S. S. Lectures on Differential Geometry. Peking University Press, 2009. (Riemannian intrinsic projection and slicing theory)
