Discrete approximations of continuous probability distributions obtained by minimizing Cramér-von Mises-type distances

Bibliographic Details
Published in: Statistical papers (Berlin, Germany), Vol. 64, no. 5, pp. 1669-1697
Main Authors: Barbiero, Alessandro; Hitaj, Asmerilda
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg; Springer Nature B.V., 01.10.2023
ISSN: 0932-5026, 1613-9798
Description
Summary: We consider the problem of approximating a continuous random variable, characterized by a cumulative distribution function (cdf) F(x), by means of k points, x_1 < x_2 < ⋯ < x_k, with probabilities p_i, i = 1, …, k. For a given k, one criterion for determining the x_i and p_i of the approximating k-point discrete distribution is to minimize some distance from the original distribution. Here we consider the weighted Cramér-von Mises distance between the original cdf F(x) and the step-wise cdf F̂(x) of the approximating discrete distribution, where the distance is characterized by a non-negative weighting function w(x). This problem has already been solved analytically when w(x) is the probability density function of the continuous random variable, w(x) = F′(x), and, when w(x) is a piece-wise constant function, through a numerical iterative procedure based on a homotopy continuation approach. In this paper, we propose and implement a solution to the problem for different choices of the weighting function w(x), highlighting how the results are affected by w(x) itself and by the number of approximating points k, in addition to F(x). Although an analytic solution is usually not available, the problem can be solved numerically through an iterative method that alternately updates the two subsets of k unknowns, the x_i's (or a transformation thereof) and the p_i's, until convergence. The main apparent advantage of these discrete approximations is their universality: they can be applied to most continuous distributions, whether or not their first moments exist. To shed some light on the proposed approaches, we also illustrate applications to several well-known continuous distributions (among them, the normal and the exponential) and to a practical problem where discretization is a useful tool.
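Illustration: the alternating scheme described in the summary can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the distance has the usual weighted form D = ∫ (F(x) − F̂(x))² w(x) dx and that w(x) > 0 between the support points. Under those assumptions, for fixed x_i the best level of F̂ on [x_i, x_{i+1}) is the w-weighted mean of F over that interval, and for fixed levels c_i the stationarity condition is F(x_i) = (c_{i−1} + c_i)/2. The function names `simpson` and `discretize_cvm`, the starting values, and the stopping rule are all hypothetical choices made for this example.

```python
from statistics import NormalDist


def simpson(g, a, b, n=200):
    """Composite Simpson rule for the integral of g over [a, b] (n must be even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3


def discretize_cvm(cdf, inv_cdf, w, k, tol=1e-10, max_iter=5000):
    """Alternating updates for a k-point approximation (illustrative sketch).

    cdf, inv_cdf: F and its inverse; w: weighting function, assumed positive
    between the support points. Returns (x, p) with x sorted increasingly.
    """
    # Assumed starting values: equally spaced quantiles.
    x = [inv_cdf(i / (k + 1)) for i in range(1, k + 1)]
    for _ in range(max_iter):
        # Step 1 (x fixed): level c_i of the step cdf on [x_i, x_{i+1}) is the
        # w-weighted mean of F there; c_0 = 0 and c_k = 1 are fixed.
        c = [0.0]
        for a, b in zip(x, x[1:]):
            num = simpson(lambda t: cdf(t) * w(t), a, b)
            den = simpson(w, a, b)
            c.append(num / den)
        c.append(1.0)
        # Step 2 (levels fixed): solve F(x_i) = (c_{i-1} + c_i) / 2 for x_i.
        new_x = [inv_cdf((c[i - 1] + c[i]) / 2) for i in range(1, k + 1)]
        shift = max(abs(s - t) for s, t in zip(new_x, x))
        x = new_x
        if shift < tol:
            break
    # Probabilities are the jumps of the step cdf.
    p = [c[i] - c[i - 1] for i in range(1, k + 1)]
    return x, p


# Usage: standard normal with w(x) = F'(x), the density.
std = NormalDist()
x, p = discretize_cvm(std.cdf, std.inv_cdf, std.pdf, k=5)
```

With w(x) = F′(x), the two update conditions above reduce to x_i = F⁻¹((2i − 1)/(2k)) and p_i = 1/k, so this case can serve as a check on the iteration. Passing a different `w` (for example `lambda t: 1.0`) changes only the weighted means in Step 1.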
DOI: 10.1007/s00362-022-01356-2