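Uniform convexity of L^p (1<p<\infty)

Theorem. Let (X,\mathcal{X},\mu) be a measure space. For 1<p<\infty, L^p(X) is uniformly convex.

Proof. First note that the case 2\leq p<\infty follows immediately from Hanner's inequality. We now consider the case 1<p<2, where Hanner's inequality gives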

\displaystyle \|2f\|_{L^p}^p+\|2g\|_{L^p}^p\geq (\|f+g\|_{L^p}+\|f-g\|_{L^p})^p+|\|f+g\|_{L^p}-\|f-g\|_{L^p}|^p.

Taking \|f\|_{L^p}=\|g\|_{L^p}=1 in the inequality above, we get

\displaystyle  (\|f+g\|_{L^p}+\|f-g\|_{L^p})^p+|\|f+g\|_{L^p}-\|f-g\|_{L^p}|^p\leq 2\cdot 2^p\ \ \ \ \ (1)

Note that for a\geq b\geq 0 (and 1<p<2) we have the following estimate:

\displaystyle (a+b)^p+(a-b)^p\geq 2a^p+p(p-1)a^{p-2}b^2\ \ \ \ \ (2).
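One way to see estimate (2) for 1<p<2 is through the binomial series. The sketch below assumes 0\leq b<a and writes t=b/a (the symbol t is introduced here only for illustration); the boundary case a=b>0 then follows by continuity.

```latex
% Sketch of estimate (2) for 1 < p < 2; requires amsmath.
% Assumption: 0 <= b < a, with t = b/a in [0,1).
\begin{align*}
(1+t)^p+(1-t)^p
  &= 2\sum_{k=0}^{\infty}\binom{p}{2k}t^{2k}
   = 2+p(p-1)t^2+2\sum_{k=2}^{\infty}\binom{p}{2k}t^{2k},\\
\binom{p}{2k}
  &= \frac{p(p-1)(p-2)\cdots(p-2k+1)}{(2k)!}\;\geq\;0
   \qquad (k\geq 2),
\end{align*}
% since (p-2),...,(p-2k+1) are 2k-2 negative factors, an even number.
% Dropping the terms with k >= 2 and multiplying by a^p gives
\[
(a+b)^p+(a-b)^p\;\geq\;2a^p+p(p-1)a^{p-2}b^2 .
\]
```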

We distinguish two cases:

(1) \|f-g\|_{L^p}\leq \|f+g\|_{L^p}. Applying (2) with a=\|f+g\|_{L^p} and b=\|f-g\|_{L^p} and combining with (1), we obtain

\displaystyle 2\|f+g\|_{L^p}^p+p(p-1)\|f+g\|_{L^p}^{p-2}\|f-g\|_{L^p}^2\leq 2\cdot 2^p.

Hence, using \left\|\frac{f+g}{2}\right\|_{L^p}\leq 1 and p-2<0 for the second inequality,

\displaystyle \left\|\frac{f+g}{2}\right\|_{L^p}^p\leq 1-\frac{p(p-1)}{2}\left\|\frac{f+g}{2}\right\|_{L^p}^{p-2}\left\|\frac{f-g}{2}\right\|_{L^p}^2\leq 1-\frac{p(p-1)}{2}\left\|\frac{f-g}{2}\right\|_{L^p}^2.

(2) \|f+g\|_{L^p}\leq \|f-g\|_{L^p}. Then, applying (2) with a=\|f-g\|_{L^p} and b=\|f+g\|_{L^p},

\displaystyle 2\|f-g\|_{L^p}^p+p(p-1)\|f-g\|_{L^p}^{p-2}\|f+g\|_{L^p}^2\leq 2\cdot 2^p.

Hence

\displaystyle \left\| \frac{f-g}{2} \right\|_{L^p}^p+\frac{p(p-1)}{2}\left\|\frac{f-g}{2}\right\|_{L^p}^{p-2}\left\|\frac{f+g}{2}\right\|_{L^p}^2\leq 1\ \ \ \ \ (3).

When \|\frac{f+g}{2}\|_{L^p}\leq \frac{1}{2}, we trivially have, for any 0<\delta<\frac{1}{2},

\displaystyle \left\|\frac{f+g}{2}\right\|_{L^p}<1-\delta.

We may therefore assume that \| \frac{f+g}{2} \|_{L^p}\geq \frac{1}{2}. Then (3) gives

\displaystyle \left\| \frac{f-g}{2} \right\|_{L^p}^p+\frac{p(p-1)}{2}\left\|\frac{f-g}{2}\right\|_{L^p}^{p-2}\cdot \frac{1}{4}\leq 1.

Note that \|\frac{f-g}{2}\|_{L^p}\leq 1, so \|\frac{f-g}{2}\|_{L^p}^{p-2}\geq 1 because 1<p<2. Hence

\displaystyle \left\|\frac{f-g}{2}\right\|_{L^p}^p+\frac{1}{8}(p-1)p\leq 1.

Hence, for a suitable \delta>0 depending only on p,

\displaystyle \left\|\frac{f-g}{2}\right\|_{L^p}<1-\delta.

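Therefore, by the assumption of this case, \left\|\frac{f+g}{2}\right\|_{L^p}\leq \left\|\frac{f-g}{2}\right\|_{L^p}<1-\delta. In both cases \left\|\frac{f+g}{2}\right\|_{L^p} stays below 1 by an amount depending only on p and on a lower bound for \|f-g\|_{L^p}, which proves uniform convexity. \Box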
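The constants implicit in the two cases can be made explicit. The following is a sketch, assuming \|f\|_{L^p}=\|g\|_{L^p}=1 and \|f-g\|_{L^p}\geq\varepsilon with 0<\varepsilon\leq 2; the symbols \varepsilon, c_p and \delta(\varepsilon) are introduced here for illustration and are not claimed to be optimal.

```latex
% Explicit modulus for 1 < p < 2 (illustrative, not optimal); requires amsmath.
% Notation introduced here: c_p = p(p-1)/8, which lies in (0, 1/4).
\begin{align*}
\text{Case (1):}\quad
  \left\|\tfrac{f+g}{2}\right\|_{L^p}^p
  &\leq 1-\tfrac{p(p-1)}{2}\left(\tfrac{\varepsilon}{2}\right)^2
   = 1-c_p\,\varepsilon^2,\\
\text{Case (2):}\quad
  \left\|\tfrac{f+g}{2}\right\|_{L^p}
  &\leq \max\Bigl\{\tfrac12,\ (1-c_p)^{1/p}\Bigr\}
   = (1-c_p)^{1/p}.
\end{align*}
% The last equality holds because (1-c_p)^{1/p} > 3/4 > 1/2.
% Combining both cases:
\[
\left\|\tfrac{f+g}{2}\right\|_{L^p}\leq 1-\delta(\varepsilon),
\qquad
\delta(\varepsilon)=1-\bigl(1-c_p\min\{\varepsilon,1\}^2\bigr)^{1/p}>0 .
\]
```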

 

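Field extensions

We consider the general structure of fields. The basic object of study in field theory is the embedding of fields, in other words field extensions. We mainly study algebraic extensions, and our approach is to make systematic use of the existence of algebraic closures so as to treat algebraic extensions in as much generality as possible.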

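Preliminaries on algebras

Many ring structures arising in algebra are at the same time vector spaces over a field: the ring addition is the vector space addition, and the multiplication (x,y)\mapsto xy is a bilinear map on the vector space. A typical example is the ring M_n(k) of n\times n matrices over a field k.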

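Algebras over a commutative ring

In what follows R is a nonzero commutative ring with unit. All algebras are assumed to be unital and associative.

Definition An algebra over the ring R is a set A equipped with both a ring structure and an R-module structure such that the ring multiplication (x,y)\mapsto xy is a balanced product, that is,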

\displaystyle (rx)y=x(ry)=r(xy),\ \ x,y\in A,\ r\in R.
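As a quick check of the definition, the matrix ring M_n(k) mentioned above satisfies these identities entrywise. The sketch below assumes r\in k and x,y\in M_n(k), with x_{il} denoting the (i,l) entry; the index names are chosen here for illustration.

```latex
% Entrywise check that M_n(k) is a k-algebra; requires amsmath.
\begin{align*}
\bigl((rx)y\bigr)_{ij} &= \sum_{l=1}^{n}(r\,x_{il})\,y_{lj}
  = r\sum_{l=1}^{n}x_{il}\,y_{lj} = \bigl(r(xy)\bigr)_{ij},\\
\bigl(x(ry)\bigr)_{ij} &= \sum_{l=1}^{n}x_{il}\,(r\,y_{lj})
  = r\sum_{l=1}^{n}x_{il}\,y_{lj} = \bigl(r(xy)\bigr)_{ij}.
\end{align*}
% The second line uses the commutativity of k to move r past x_{il}.
```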

The value of problems in number theory

Fight with Infinity

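I think there are roughly two ways to judge the value of a mathematical problem: one is to ask whether it is interesting, the other whether it is important.

Being “interesting” mostly has to do with how surprising a statement is. Generally speaking, a problem that is simple to state yet hard to prove or disprove is usually interesting. If the problem is eventually proved, things become more interesting still: a simple picture emerging from chaos suggests that there must be some mechanism worth studying in depth. At this point the problem starts to become “important”: around it mathematicians build theories, trying to carry the experience gained in solving it over to more problems. New mathematics is born.

The importance of a problem depends on the place it occupies in our understanding of the phenomena (that is, in the “theory”). Everyone wants to move a stone that blocks the main road, while a stone lying by the roadside attracts little attention. Of course, if the roadside stone stubbornly resists every effort to push it aside, it becomes “important” in another way: this kind of importance is measured by how much new mathematics comes out of the various failures.

Number theory is the most peculiar branch of mathematics: we are astonishingly ignorant about the integers, so almost every number-theoretic problem is “interesting” to some degree, and it is precisely this that has drawn the finest minds to the subject. On the other hand, apart from a handful of classical problems, it seems hard to know a priori which problems are “important” in the overall picture; here we can only fall back on the second notion of importance and hope that as much mathematics as possible comes out of them.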

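Below is a rough classification. We give only the most representative (and often the most valuable) examples in each class; this does not mean that all problems in the same class are equally valuable. Comparing across classes, problems in (1) and (3) are usually the ones to treasure (they are rare in number theory), those in (2) and (4) come next, and those in (5), which pervade all of number theory, are less important:

(1) The first class includes the Riemann hypothesis and the Langlands program. They are important in both senses: they sit at the core of modern number theory and have also given rise to a great deal of “good mathematics”; they are the signposts by which all of modern number theory advances.

Among the unsolved Hilbert problems, the 9th problem and the 12th problem (Kronecker’s Jugendtraum) both belong here. Beyond their intrinsic theoretical value, they are also the wellspring of class field theory, the theory of complex multiplication, and the Langlands program.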

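(2) Another class of problems is theoretically important but, being too hard or too isolated, has not generated much mainstream mathematics, or can only be understood indirectly through the problems in (1): examples include Gauss’s class number conjecture, Artin’s primitive root conjecture, and so on. If someone manages to understand them “in the right way”, such problems may be promoted to (1).

(3) Fermat’s Last Theorem is not important in itself, but it is extremely important in the second sense: for example, it gave rise to Kummer’s theory of ideals, which in turn laid the foundations of algebraic number theory and algebraic geometry. Wiles’s proof deepened our understanding of the Langlands program, and the series of mathematical tools it produced is also extremely powerful (see the work of Richard Taylor).

As things stand, the abc conjecture, which is stronger than Fermat’s Last Theorem, should also belong to this class. Whether Shinichi Mochizuki’s recent work can advance the whole field the way Wiles’s work did remains to be seen.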

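(4) Likewise, the Goldbach conjecture and the twin prime conjecture are not of great importance in themselves (although they are typical examples of “interesting” problems). Because of them, people developed additive number theory (what Hua Luogeng called “堆垒数论”). The classical tools (such as sieve methods) have a narrow range of application and, compared with the mathematics that grew out of Fermat’s Last Theorem, currently occupy a marginal position. This explains why some mathematicians look down on work in this area. Of course, problems in (4) may also be promoted to (3): for example, additive number theory has recently absorbed new ideas from ergodic theory and seems to be returning to the mainstream (see the work of Terence Tao), and the latter in turn relies on the step up from van der Waerden’s theorem to Szemerédi’s theorem.

Erdős is the spokesman for “interest for its own sake”, and most of the conjectures he posed belong to (4). Probabilistic number theory (the Erdős–Kac theorem, etc.) and random graphs (the Erdős–Rényi model, etc.) are examples that were successfully promoted to (3), and Tao’s work mentioned above may promote the Erdős conjecture (if \sum 1/a_i diverges, then the integer sequence \{a_i\} contains arbitrarily long arithmetic progressions). Within Ramanujan’s work on modular functions, the Ramanujan conjecture has been successfully promoted via the Weil conjectures. The ancient congruent number problem has no importance of its own, but it has acquired importance through its connection with the BSD conjecture. Another relatively recent example is Roth’s theorem, which through the work of Vojta has been successfully absorbed into the framework of arithmetic geometry.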

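(5) Disproofs and counterexamples are not necessarily unimportant (especially in the second sense): for example, Littlewood disproved Gauss’s conjecture \pi(n)<\mathrm{Li}(n), which improved our understanding of the \zeta function and deserves to be placed in (4). In searching for counterexamples to Euler’s conjecture, Elkies, Frye and others found an unexpected application of the theory of elliptic curves, which has some algorithmic value (not to mention that a similar method of constructing elliptic curves provides the route from the Taniyama–Shimura conjecture to Fermat’s Last Theorem).

The problems collected in Guy’s Unsolved Problems in Number Theory are not necessarily unimportant either. In fact, a considerable portion of them have some degree of importance. We have already mentioned some examples in (2) and (4); a taste of these is enough to judge the rest.

Regrettably, in my view Guy, F26 belongs to neither of these two classes but to the least important (and most common) class of number-theoretic problems: it has no place in the overall theory and is unlikely to produce interesting mathematics. The counterexample is not huge (which means the problem is not that hard); moreover, since the counterexample was found by completely elementary means, its potential algorithmic value is also quite limited, and that is without even considering that Fuller, Iraids and others have already obtained much better results.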

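[Note]
After this post was written, the Douban user 魔术师 pointed out to me that Guy, F26 is related to the Kolmogorov complexity of primes. Seen this way, a Littlewood-style disproof could have promoted it to (4). I agree with his view: “This solution has demoted it to (5). … Only an answer that shows how to construct counterexamples, or proves that there are only finitely many counterexamples, or infinitely many, would count as a good one.”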


The Kakeya problem in finite fields

The Kakeya problem, in its best known formulation, is the following. Let E\subset\mathbf{R}^n be a set which contains a translate of every unit segment; equivalently, for every direction e\in S^{n-1}, E contains a unit line segment parallel to e. An n-dimensional ball of radius 1/2 is a simple example of a set with this property, but there are many other such sets, some of which have n-dimensional measure 0.

Can E be even smaller than that and have Hausdorff dimension strictly smaller than n?

Ordered field

An ordered field is a field together with a total ordering of its elements that is compatible with the field operations.

An ordered field necessarily has characteristic 0, since the elements 0<1<1+1<1+1+1<\dots are all distinct. Thus an ordered field contains infinitely many elements: a finite field cannot be ordered.

Every ordered field contains an ordered subfield that is isomorphic to the rational numbers. Any Dedekind complete ordered field is isomorphic to the real numbers. Squares are necessarily non-negative in an ordered field.

Definition A field (F,+,\times) together with a total order \leq on F is an ordered field if the order satisfies the following properties for all a, b and c in F:

  • if a\leq b then a+c\leq b+c, and
  • if 0\leq a and 0\leq b then 0\leq a\times b.
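The claim that squares are non-negative follows directly from the two axioms. A short derivation, where a is an arbitrary element of F and totality of \leq is used to split into two cases:

```latex
% Squares are non-negative in an ordered field (F,+,\times,\leq).
% Case 1: 0 <= a. The product axiom gives
\[
0\leq a\times a=a^2 .
\]
% Case 2: a <= 0. Adding -a to both sides (first axiom) gives 0 <= -a, so
\[
0\leq(-a)\times(-a)=a^2 .
\]
% In particular 0 <= 1 = 1^2, and 0 < 1 because 0 and 1 differ in a field.
```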

Fréchet derivative

The Fréchet derivative is a derivative defined on Banach spaces. It is commonly used to generalize the derivative of a real-valued function of a single real variable to the case of a vector-valued function of multiple real variables, and to define the functional derivative used widely in the calculus of variations.

Definition Let V and W be Banach spaces, and U\subset V be an open subset of V. A function f:U\to W is called Fréchet differentiable at x\in U if there exists a bounded linear operator A:V\to W such that

\displaystyle \lim_{h\to 0}\frac{\|f(x+h)-f(x)-Ah\|_W}{\|h\|_V}=0.

The limit here is meant in the usual sense of a limit of a function defined on a metric space.
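A small worked example may help fix the definition. It is only a sketch under extra assumptions not made in the text: V=H is a real Hilbert space with inner product \langle\cdot,\cdot\rangle, W=\mathbf{R}, and f(x)=\|x\|^2.

```latex
% Example (assumption: V = H is a real Hilbert space, W = R, f(x) = ||x||^2).
% Requires amsmath.
\begin{align*}
f(x+h)-f(x) &= \|x+h\|^2-\|x\|^2 = 2\langle x,h\rangle+\|h\|^2,\\
\frac{|f(x+h)-f(x)-2\langle x,h\rangle|}{\|h\|} &= \|h\|\longrightarrow 0
  \quad\text{as } h\to 0 .
\end{align*}
% Hence f is Frechet differentiable at every x, with Ah = 2<x,h>,
% which is bounded since |Ah| <= 2 ||x|| ||h|| by Cauchy-Schwarz.
```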

Relation to the Gâteaux derivative

Definition A function f:U\subset V\to W is called Gâteaux differentiable at x\in U if f has a directional derivative in every direction at x. This means that the limit

\displaystyle \lim_{t\to 0}\frac{f(x+th)-f(x)}{t}

exists for any chosen vector h in V, where t is taken from the scalar field associated with V.

If f is Fréchet differentiable at x, it is also Gâteaux differentiable there, and the limit is just Df(x)(h), where Df(x)=A denotes the Fréchet derivative of f at x.
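The implication from Fréchet to Gâteaux differentiability can be checked in one line. In the sketch below h\neq 0 is fixed and A is the bounded linear operator from the definition of the Fréchet derivative (the case h=0 is trivial).

```latex
% Frechet => Gateaux: fix h != 0 and let A be the Frechet derivative at x.
% Requires amsmath.
\begin{align*}
\left\|\frac{f(x+th)-f(x)}{t}-Ah\right\|_W
  &= \frac{\|f(x+th)-f(x)-A(th)\|_W}{|t|}\\
  &= \|h\|_V\cdot\frac{\|f(x+th)-f(x)-A(th)\|_W}{\|th\|_V}
  \longrightarrow 0 \quad\text{as } t\to 0,
\end{align*}
% because th -> 0 in V; linearity of A was used to write t Ah = A(th).
```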

Higher derivatives

If f:U\subset V\to W is a differentiable function at all points in an open subset U of V, it follows that its derivative

\displaystyle Df:U\to L(V,W)

is a function from U to the space L(V,W) of all bounded linear operators from V to W. This function may also have a derivative, the second order derivative of f, which, by the definition of derivative, will be a map

\displaystyle D^2f:U\to L(V,L(V,W)).

To make it easier to work with second order derivatives, the space on the right-hand side is identified with the Banach space L^2(V\times V,W) of all continuous bilinear maps from V\times V to W. An element \varphi in L(V,L(V,W)) is thus identified with \psi in L^2(V\times V,W) such that for all x and y in V

\displaystyle \varphi(x)(y)=\psi(x,y).
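To illustrate the identification, here is a sketch under assumptions that are not part of the text: V=H is a real Hilbert space, W=\mathbf{R}, and f(x)=\|x\|^2, whose first derivative is Df(x)k=2\langle x,k\rangle. The letters h and k denote directions, to keep them apart from the base point x.

```latex
% Example (assumption: V = H real Hilbert space, W = R, f(x) = ||x||^2,
% for which Df(x)k = 2<x,k>).
% The map x -> Df(x) is itself bounded and linear, so it is its own derivative:
\[
D^2f(x)(h)=Df(h)=2\langle h,\cdot\rangle\in L(H,\mathbf{R}),
\]
% and the associated bilinear map does not depend on the base point x:
\[
\varphi(h)(k)=D^2f(x)(h)(k)=2\langle h,k\rangle=\psi(h,k).
\]
```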

Invariance of domain and retraction

Invariance of domain

Invariance of domain states:

Theorem (Invariance of domain) If U is an open subset of \mathbf{R}^n and f:U\to\mathbf{R}^n is an injective continuous map, then V=f(U) is open and f is a homeomorphism between U and V.

The conclusion of the theorem can equivalently be formulated as: “f is an open map”. It is of crucial importance that both the domain and the range of f are contained in Euclidean space of the same dimension. The theorem also fails in general in infinite dimensions.
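The need for equal dimensions is already visible in the simplest possible example. The map below is chosen here purely for illustration; it sends an open subset of \mathbf{R}^1 into \mathbf{R}^2.

```latex
% Why equal dimensions matter (illustrative example, from R^1 into R^2):
\[
f:(-1,1)\to\mathbf{R}^2,\qquad f(t)=(t,0).
\]
% f is continuous and injective, and (-1,1) is open in R^1, but
\[
f\bigl((-1,1)\bigr)=(-1,1)\times\{0\}
\]
% contains no open disc of R^2, so it is not open in R^2.
```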

A retraction is a continuous mapping from the entire space into a subspace which preserves the position of all points in that subspace. A deformation retraction is a map which captures the idea of continuously shrinking a space into a subspace.

Retract

Definition Let X be a topological space and A be a subspace of X. Then a continuous map

\displaystyle r:X\to A

is a retraction if the restriction of r to A is the identity map on A; that is, r(a)=a for all a in A. Equivalently, denoting by

\displaystyle \iota:A\hookrightarrow X

the inclusion, a retraction is a continuous map r such that

\displaystyle r\circ\iota=\mathrm{id}_A.

A subspace A is called a retract of X if such a retraction exists.

Theorem Let X be a Hausdorff space and A be a retract of X. Then A is closed.

Proof Let x\notin A and a=r(x)\in A. Since X is Hausdorff, x and a have disjoint neighborhoods U and V, respectively. Then r^{-1}(V\cap A)\cap U is a neighborhood of x disjoint from A. Hence A is closed. \Box

A space X is known as an absolute retract if, for every normal space Y that contains X as a closed subspace, X is a retract of Y.

Deformation retract and strong deformation retract

Definition A continuous map

\displaystyle F:X\times [0,1]\to X

is a deformation retraction of a space X onto a subspace A if, for every x in X and a in A,

\displaystyle F(x,0)=x,\ F(x,1)\in A,\ \text{and}\ F(a,1)=a.

In other words, a deformation retraction is a homotopy between the identity map on X and a retraction. The subspace A is called a deformation retract of X. A deformation retraction is a special case of a homotopy equivalence.
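A standard example, written out as a sketch: take X=\mathbf{R}^n\setminus\{0\} and A=S^{n-1} (this choice of X and A is made here for illustration). The straight-line homotopy below is a deformation retraction, and it even fixes every point of A at all times t, which is the extra condition usually required of a strong deformation retraction.

```latex
% Example: X = R^n \ {0} deformation retracts onto A = S^{n-1}.
\[
F:X\times[0,1]\to X,\qquad
F(x,t)=(1-t)\,x+t\,\frac{x}{|x|}
      =\Bigl((1-t)+\frac{t}{|x|}\Bigr)x .
\]
% The scalar in front of x is positive, so F(x,t) is never 0 and F maps into X.
\[
F(x,0)=x,\qquad F(x,1)=\frac{x}{|x|}\in S^{n-1},\qquad
F(a,t)=a\ \text{ for all } a\in S^{n-1},\ t\in[0,1].
\]
```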

Separation theorems in the plane

1 The Jordan separation theorem

Our proof of the Jordan curve theorem divides into three parts. The first, which we call the Jordan separation theorem, states that a simple closed curve in the plane separates it into at least two components. The second says that an arc in the plane does not separate the plane. And the third, the Jordan curve theorem proper, says that a simple closed curve C in the plane separates it into precisely two components, of which C is the common boundary.

Lemma 1 Let C be a compact subspace of S^2, let b be a point of S^2\setminus C, and let h be a homeomorphism of S^2\setminus \{b\} with \mathbf{R}^2. Suppose U is a component of S^2\setminus C. If U does not contain b, then h(U) is a bounded component of \mathbf{R}^2\setminus h(C). If U contains b, then h(U\setminus \{b\}) is the unbounded component of \mathbf{R}^2\setminus h(C).

In particular, if S^2\setminus C has n components, then \mathbf{R}^2\setminus h(C) has n components.

Proof We show first that if U is a component of S^2\setminus C, then U\setminus \{b\} is connected.

Let (U_\alpha) be the set of components of S^2\setminus C; let V_\alpha=h(U_\alpha\setminus\{b\}). Because S^2\setminus C is locally connected, the sets U_\alpha are connected, disjoint, open subsets of S^2. Therefore the sets V_\alpha are connected, disjoint, open subsets of \mathbf{R}^2\setminus h(C), so the sets V_\alpha are the components of \mathbf{R}^2\setminus h(C).

Lemma 2 (Nulhomotopy lemma) Let a and b be points of S^2. Let A be a compact space, and let

\displaystyle f:A\to S^2\setminus\{a,b\}

be a continuous map. If a and b lie in the same component of S^2\setminus f(A), then f is nulhomotopic.

Definition 1 If X is a connected space and A\subset X, we say that A separates X if X\setminus A is not connected; if X\setminus A has n components, we say that A separates X into n components.

Definition 2 An arc A is a space homeomorphic to the unit interval [0,1]. The end points of A are the two points p and q such that A\setminus\{p\} and A\setminus\{q\} are connected; the other points of A are called interior points of A.

A simple closed curve is a space homeomorphic to the unit circle S^1.

Theorem 1 Suppose X=U\cup V, where U and V are open sets of X. Suppose that U\cap V is path connected, and that x_0\in U\cap V. Let i and j be the inclusion mappings of U and V, respectively, into X. Then the images of the induced homomorphisms

\displaystyle i_*:\pi_1(U,x_0)\to \pi_1(X,x_0)\ \ \text{and} \ \ j_*:\pi_1(V,x_0)\to \pi_1(X,x_0)

generate \pi_1(X,x_0).

Theorem 2 (The Jordan separation theorem) Let C be a simple closed curve in S^2. Then C separates S^2.

2 Invariance of domain

Lemma 3 (Homotopy extension lemma) Let X be a space such that X\times I is normal. Let A be a closed subspace of X, and let f:A\to Y be a continuous map, where Y is an open subspace of \mathbf{R}^n. If f is nulhomotopic, then f may be extended to a continuous map g:X\to Y that is also nulhomotopic.

The following lemma is a partial converse to the nulhomotopy lemma of the preceding section.

Lemma 4 (Borsuk lemma) Let a and b be points of S^2. Let A be a compact space, and let f:A\to S^2\setminus\{a,b\} be a continuous injective map. If f is nulhomotopic, then a and b lie in the same component of S^2\setminus f(A).

Theorem 3 (Invariance of domain) If U is an open subset of \mathbf{R}^2 and f:U\to\mathbf{R}^2 is continuous and injective, then f(U) is open in \mathbf{R}^2 and the inverse function f^{-1}:f(U)\to U is continuous.

3 The Jordan curve theorem

The special case of the Seifert-van Kampen theorem that we used in proving the Jordan separation theorem tells us something about the fundamental group of the space X=U\cup V in the case where the intersection U\cap V is path connected. In the next theorem, we examine what happens when U\cap V is not path connected. This result will enable us to complete the proof of the Jordan curve theorem.

Now we prove the Jordan curve theorem.

Theorem (The Jordan curve theorem) Let C be a simple closed curve in S^2. Then C separates S^2 into precisely two components W_1 and W_2. Each of the sets W_1 and W_2 has C as its boundary; that is, C=\overline{W}_i-W_i for i=1,2.

Application of ultrafilters

An ultrafilter on a set X is a collection \mathcal{U} of subsets of X with the following properties:

  1. \emptyset \notin\mathcal{U} and X\in\mathcal{U};
  2. \mathcal{U} is closed under finite intersection;
  3. if A\in\mathcal{U} and A\subset B, then B\in\mathcal{U};
  4. for every A\subset X, either A\in\mathcal{U} or X\setminus A\in\mathcal{U}.

A trivial example of an ultrafilter is the collection of all sets containing some fixed element x of X. Such ultrafilters are called principal. It is not trivial that there are any non-principal ultrafilters, but this can be proved using Zorn’s lemma.

The following facts about ultrafilters, which follow easily from the basic definition, will be used in this post. Let \mathcal{U} be an ultrafilter on a set X.

  1. If X is partitioned into finitely many sets A_1,\dots,A_n, then precisely one A_i belongs to \mathcal{U}.
  2. If F and G do not belong to \mathcal{U} then neither does F\cup G.
  3. If any finite set belongs to \mathcal{U} then \mathcal{U} is a principal ultrafilter.
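As an illustration of how such facts follow from the definition, here is a sketch of the argument for fact 2. It uses only the numbered properties in the definition of an ultrafilter given earlier.

```latex
% Sketch of fact 2, using ultrafilter properties 1, 2 and 4 from the definition.
% Suppose F and G are not in U. By property 4,
\[
X\setminus F\in\mathcal{U}\quad\text{and}\quad X\setminus G\in\mathcal{U}.
\]
% By property 2 (finite intersections),
\[
X\setminus(F\cup G)=(X\setminus F)\cap(X\setminus G)\in\mathcal{U}.
\]
% If F \cup G were also in U, property 2 would give
\[
\emptyset=(F\cup G)\cap\bigl(X\setminus(F\cup G)\bigr)\in\mathcal{U},
\]
% contradicting property 1. Hence F \cup G is not in U.
```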

Example 1: generalized limits

We can think of the process of taking limits of sequences as a linear functional L defined on the space of convergent sequences.

Can we generalize L by finding a functional \phi that is defined on all bounded sequences and not just on the convergent ones? In order for it to count as a generalization, we would like \phi to be linear, and we would like \phi(a) to equal L(a) whenever a is a convergent sequence.

If \mathcal{U} is a non-principal ultrafilter, and (a_1,a_2,\dots) is a sequence that takes values in [-1,1], then we can define a limit along \mathcal{U} as follows. Let \mathcal{J} be the collection of all subintervals J of [-1,1] such that \{n:a_n\in J\} belongs to \mathcal{U}. Then the ultrafilter properties of \mathcal{U} imply that \mathcal{J} has all the ultrafilter properties, restricted to intervals.

From this it follows that \mathcal{J} is something like a “principal interval-ultrafilter”. More precisely, it contains all open intervals that contain some particular point a. To see this, for each n\in\mathbf{N} partition [-1,1] into finitely many subintervals of length at most 1/n. Then one of these subintervals belongs to \mathcal{J}. So for every n we have an interval J_n of length at most 1/n that belongs to \mathcal{J}. Now let I_n=J_1\cap \dots\cap J_n. Since \mathcal{J} is closed under finite intersections, I_n belongs to \mathcal{J}. Let \{a\} be the intersection of the closures of the I_n (which are non-empty and nested). If U is any open interval containing a, then U contains some I_n, so U belongs to \mathcal{J}.

Thus, we have found a number a with the following property: for every \varepsilon>0, the set \{n:|a_n-a|<\varepsilon\} belongs to \mathcal{U}. Moreover, it is easy to see that this a is unique. We write it as \lim_{\mathcal{U}}a_n. It is easy to check that \lim_{\mathcal{U}} is linear.

To see even more clearly how this ties in with the usual notion of a limit, note that a_n converges to a if and only if for every \varepsilon>0, the set \{n:|a_n-a|<\varepsilon\} belongs to the cofinite filter.
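A concrete divergent sequence shows what the generalized limit does. In the sketch below, a_n=(-1)^n and E denotes the set of even positive integers (the name E is introduced here for illustration); which value the limit takes depends on the ultrafilter chosen.

```latex
% Example: a_n = (-1)^n, with E the set of even positive integers.
% By the ultrafilter properties exactly one of E and N \ E belongs to U, and
% (requires amsmath for the cases environment)
\[
\lim_{\mathcal{U}}(-1)^n=
\begin{cases}
\;\;\,1, & E\in\mathcal{U},\\
-1, & \mathbf{N}\setminus E\in\mathcal{U},
\end{cases}
\]
% since {n : |a_n - 1| < eps} = E and {n : |a_n + 1| < eps} = N \ E
% for every 0 < eps <= 2.
```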

Set systems as quantifiers

It is often better to think of a set system as a quantifier. In particular, if \mathcal{U} is an ultrafilter then one often finds oneself writing sentences of the form \{x\in X:P(x)\}\in\mathcal{U}, as we have already seen. But it can be much easier to deal with these sentences if one instead writes \mathcal{U}x\in X\ P(x). One can read this as “For \mathcal{U}-almost every x\in X, P(x)”.
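In this notation the ultrafilter properties become rules for manipulating the quantifier. The following is a sketch, with P and Q standing for arbitrary properties of points of X (the letter Q is introduced here for illustration); the subscript x\in X is dropped for brevity.

```latex
% Ultrafilter properties in quantifier notation (P, Q are properties of x in X).
% Requires amsmath.
\begin{align*}
\mathcal{U}x\ \bigl(P(x)\wedge Q(x)\bigr)
  &\iff \bigl(\mathcal{U}x\ P(x)\bigr)\wedge\bigl(\mathcal{U}x\ Q(x)\bigr),\\
\mathcal{U}x\ \neg P(x)
  &\iff \neg\,\bigl(\mathcal{U}x\ P(x)\bigr),\\
\mathcal{U}x\ \bigl(P(x)\vee Q(x)\bigr)
  &\iff \bigl(\mathcal{U}x\ P(x)\bigr)\vee\bigl(\mathcal{U}x\ Q(x)\bigr).
\end{align*}
% Line 1: closure under finite intersections and under supersets.
% Line 2: for every A, exactly one of A and X \ A lies in U.
% Line 3: supersets, and the fact that F, G not in U implies F \cup G not in U.
```

So the quantifier \mathcal{U}x commutes with all Boolean connectives, which is what makes it behave like both “for all” and “there exists” at once.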

Lattice property of the class of signed measures

Signed measures have values either in (-\infty,+\infty] or in [-\infty,+\infty), to avoid the possibility of adding +\infty to -\infty. If (X,\mathcal{X},\mu) is a signed measure space and A is a measurable set, define

\displaystyle \mu_+(A):=\sup_{B\subset A}\mu(B),\quad \mu_-(A):=-\inf_{B\subset A}\mu(B),\quad |\mu|=\mu_++\mu_-.

The set functions \mu_+, \mu_- and |\mu| are respectively the positive, negative and total variations of \mu.

Theorem If \mu and \nu are signed measures on a measurable space, there is a signed measure \mu\vee \nu majorizing \mu and \nu and majorized by every other signed measure that majorizes both \mu and \nu.

Proof We treat the case in which \mu-\nu is a well-defined signed measure, that is, \mu(X) and \nu(X) are not both +\infty or both -\infty. Let X_+ be a maximal positivity set and X_- a maximal negativity set for \mu-\nu, chosen so that X_+ and X_- are disjoint with union X. Define

\displaystyle (\mu\vee\nu)(A):=\mu(A\cap X_+)+\nu(A\cap X_-).

This sum defines a signed measure with the required properties.
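The two required properties can be verified directly. The following is a sketch, assuming that X_+ and X_- are disjoint with union X and that every sum written down is well defined (no +\infty is added to -\infty); the symbol \lambda for a competing majorant is introduced here for illustration.

```latex
% Sketch; assumes X = X_+ \cup X_- disjointly and that all sums are well defined.
% Requires amsmath.
% Majorant of mu: since (mu - nu)(A \cap X_-) <= 0,
\begin{align*}
\mu(A) &= \mu(A\cap X_+)+\mu(A\cap X_-)\\
       &\leq \mu(A\cap X_+)+\nu(A\cap X_-)=(\mu\vee\nu)(A),
\end{align*}
% and symmetrically nu(A) <= (mu v nu)(A), using (mu - nu)(A \cap X_+) >= 0.
% Minimality: if lambda is a signed measure with lambda >= mu and lambda >= nu, then
\begin{align*}
\lambda(A) &= \lambda(A\cap X_+)+\lambda(A\cap X_-)\\
           &\geq \mu(A\cap X_+)+\nu(A\cap X_-)=(\mu\vee\nu)(A).
\end{align*}
```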