TY - GEN

T1 - Universal ε-approximators for integrals

AU - Langberg, Michael

AU - Schulman, Leonard J.

PY - 2010

Y1 - 2010

N2 - Let X be a space and F a family of 0,1-valued functions on X. Vapnik and Chervonenkis showed that if F is "simple" (finite VC dimension), then for every probability measure μ on X and ε > 0 there is a finite set S such that for all f ∈ F, (1/|S|)·∑_{x∈S} f(x) = [∫ f(x) dμ(x)] ± ε. Think of S as a "universal ε-approximator" for integration in F. S can actually be obtained w.h.p. just by sampling a few points from μ. This is a mainstay of computational learning theory. It was later extended by other authors to families of bounded (e.g., [0, 1]-valued) real functions. In this work we establish similar "universal ε-approximators" for families of unbounded nonnegative real functions - in particular, for the families over which one optimizes when performing data classification. (In this case the ε-approximation should be multiplicative.) Specifically, let F be the family of "k-median functions" (or k-means, etc.) on ℝ^d with an arbitrary norm ρ. That is, any set u_1, ..., u_k ∈ ℝ^d determines an f by f(x) = (min_i ρ(x - u_i))^α. (Here α ≥ 0.) Then for every measure μ on ℝ^d there exists a set S of cardinality poly(k, d, 1/ε) and a measure ν supported on S such that for every f ∈ F, ∑_{x∈S} f(x)ν(x) ∈ (1 ± ε)·(∫ f(x) dμ(x)).


UR - http://www.scopus.com/inward/record.url?scp=77951686690&partnerID=8YFLogxK

U2 - 10.1137/1.9781611973075.50

DO - 10.1137/1.9781611973075.50

M3 - Conference contribution

AN - SCOPUS:77951686690

SN - 9780898717013

T3 - Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms

SP - 598

EP - 607

BT - Proceedings of the 21st Annual ACM-SIAM Symposium on Discrete Algorithms

PB - Society for Industrial and Applied Mathematics (SIAM)

T2 - 21st Annual ACM-SIAM Symposium on Discrete Algorithms

Y2 - 17 January 2010 through 19 January 2010

ER -