TY - GEN
T1 - Universal ε-approximators for integrals
AU - Langberg, Michael
AU - Schulman, Leonard J.
PY - 2010
Y1 - 2010
N2 - Let X be a space and F a family of {0, 1}-valued functions on X. Vapnik and Chervonenkis showed that if F is "simple" (finite VC dimension), then for every probability measure μ on X and ε > 0 there is a finite set S such that for all f ∈ F, (1/|S|) ∑_{x∈S} f(x) = ∫ f(x) dμ(x) ± ε. Think of S as a "universal ε-approximator" for integration in F. S can actually be obtained w.h.p. just by sampling a few points from μ. This is a mainstay of computational learning theory. It was later extended by other authors to families of bounded (e.g., [0, 1]-valued) real functions. In this work we establish similar "universal ε-approximators" for families of unbounded nonnegative real functions, in particular for the families over which one optimizes when performing data classification. (In this case the ε-approximation should be multiplicative.) Specifically, let F be the family of "k-median functions" (or k-means, etc.) on ℝ^d with an arbitrary norm ρ. That is, any set u_1, ..., u_k ∈ ℝ^d determines an f by f(x) = (min_i ρ(x − u_i))^α. (Here α ≥ 0.) Then for every measure μ on ℝ^d there exists a set S of cardinality poly(k, d, 1/ε) and a measure ν supported on S such that for every f ∈ F, ∑_{x∈S} f(x) ν(x) ∈ (1 ± ε) · ∫ f(x) dμ(x).
UR - http://www.scopus.com/inward/record.url?scp=77951686690&partnerID=8YFLogxK
U2 - 10.1137/1.9781611973075.50
DO - 10.1137/1.9781611973075.50
M3 - Conference contribution
AN - SCOPUS:77951686690
SN - 9780898717013
T3 - Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms
SP - 598
EP - 607
BT - Proceedings of the 21st Annual ACM-SIAM Symposium on Discrete Algorithms
PB - Society for Industrial and Applied Mathematics (SIAM)
T2 - 21st Annual ACM-SIAM Symposium on Discrete Algorithms
Y2 - 17 January 2010 through 19 January 2010
ER -
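
Illustration (not part of the record): the abstract's guarantee says a small weighted set S can stand in for the full measure μ, multiplicatively, for every k-median cost function at once. The minimal Python sketch below only demonstrates what that guarantee asserts; it uses plain uniform sampling with uniform weights ν(x) = 1/|S|, not the paper's construction, and all names, sizes, and the Gaussian choice of μ are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

d, k, alpha = 2, 3, 1.0          # dimension, number of centers, exponent α
n, m = 20000, 400                # |support of μ| and |S|

X = rng.normal(size=(n, d))                  # points carrying μ (uniform weights)
S = X[rng.choice(n, size=m, replace=False)]  # candidate approximator set S
nu = np.full(m, 1.0 / m)                     # measure ν supported on S

def kmedian_cost(points, weights, centers):
    # f(x) = (min_i ||x - u_i||)^alpha, integrated against the given weights
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return np.sum(weights * dists.min(axis=1) ** alpha)

worst = 0.0
for _ in range(200):             # probe many f ∈ F via random center sets
    U = rng.normal(size=(k, d))
    true = kmedian_cost(X, np.full(n, 1.0 / n), U)   # ∫ f(x) dμ(x)
    approx = kmedian_cost(S, nu, U)                  # ∑_{x∈S} f(x) ν(x)
    worst = max(worst, abs(approx / true - 1.0))

print(f"worst multiplicative error over probed f: {worst:.3f}")

A uniform sample of this kind only approximates well with high probability for the probed centers; the point of the paper is a set S of size poly(k, d, 1/ε) whose guarantee holds deterministically for every f in the family simultaneously.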