Study notes from Convex Optimization by Stephen Boyd, Lieven Vandenberghe.
1. Affine and convex sets
1.1 Lines and line segments
Suppose x1=x2 are two points in Rn. Points of the form
y=θx1+(1−θ)x2(1.1.1)
where θ∈R,form the line passing through x1 and x2. The parameter value θ=0 corresponds to y=x2, and the parameter value θ=1 corresponds to y=x1. Values of the parameter θ between 0 and 1 corresponds to the line segment between x1 and x2.
1.2 Affine sets
A set C⊆Rn is affine sets if the line through any two distinct points in C lies in C, i.e.,
ifthen∀x1,x2∈C,θ∈Rθx1+(1−θ)x2∈C(1.2.1)
In other words, C contains the linear combination of any two points in C, provided the coefficients in the linear combination sum to one. This idea can be generalized to more than two points. We refer to a point of the form θx1+⋯+θkxk, where θ1+⋯+θk=1, as an affine combination of the points x1,⋯,xk.
Using induction from the definition of affine, it can be shown that an affine set contains every affine combination of its points: if C is an affine set, x1,⋯,xk∈C, and θ1+⋯+θk=1, the the point θ1x1+⋯+θkxk∈C. If C is an affine set and x0∈C, then the set
V=C−x0={x−x0∣x∈C}(1.2.3)
is a subspace, i.e., closed under sums and scalar multiplication. Thus, the affine set C can be expressed as C=V+x0={v+x0∣v∈V}, i.e., as a subplace plus an offset. We define the dimension of an affine set C as the dimension of the subspace V=C−x0, where x0 is any element of C.
The set of all affine combinations of points in some set C⊆Rn is called the affine hull of C, and denoted aff C:
The affine hull is the smallest affine set that contains C, in the following sense: if S is any affine set with C⊆S, then aff C⊆S.
We define the affine dimension of a set C as the dimension of its affine hull. Affine dimension is useful in the context of convex analysis and optimization, but is not always consistent with other definitions of dimension. If the affine dimension of a set C⊆Rn is less than n, then the set lies in the affine set aff C=Rn. We define the releative interior of the set C, denoted relint C, as its interior relative to aff C:
relint C={x∈C∣B(x,r)∩aff C⊆C for some r>0}(1.2.5)
where B(x,r)={y∣∥y−x∥≤r}, the ball of radius r and center x in the norm ∥⋅∥. (Here ∥⋅∥ is any norm, all norms define the same relative interior.) We can the define the relative boundary of a set C as cl C\relint C, where cl C is the closure of C.
1.3 Convex sets
A set C is convex if the line segment between any two points in C lies in C, i.e., if for any x1,x2∈C and any θ with 0≤θ≤1, we have
θx1+(1−θ)x2∈C(1.3.1)
Every affine set is also convex, since it contains the entire line between any two distinct points in it, and therefor also the line segment between the points.
We call a point of the form θ1x1+⋯+θkxk, where θ1+⋯+θk=1 and θi≥0,i=1,⋯,k, a convex combination of the points x1,⋯,xk. As with affine sets, it can be shown that a set is convex if and only if it contains every convex combination of its points.
The convex hull of a set C, denoted conv C, is the set of all convex combinations of points in C:
As the name suggests, the convex hull conv C is always convex. It is the smallest convex set that contains C: If B is any convex set that contains C, then conv C⊆B.
The idea of a convex combination can be generalized to include infinite sums, integrals, and, in the most general form, probability distributions. Suppose θ1,θ2,⋯ satisfy
θi≥0,i=1,2,⋯,∑i=1∞θi=1
and x1,x2,⋯∈C, where C⊆Rn is convex. Then
∑i=1∞θixi∈C
if the series converges. More generally, suppose p:Rn→R satisfies p(x)≥0 for all x∈C and ∫Cp(x)dx=1, where C⊂Rn is convex. Then
∫Cp(x)xdx∈C
if the integral exists.
In the most general form, suppose C⊆Rn is convex and x is a random vector with x∈C with probability one. The Ex∈C. Indeed, this form includes all the others as special cases.
2. Cones
A set C is called a cone, or nonnegative homogeneous, if for every x∈C and θ≥0 we have
θx∈C(2.1)
A set C is a convex cone is it is convex and a cone, which means that for any x1,x2∈C and θ1,θ2≥0, we have
θ1x1+θ2x2∈C(2.2)
A point of the form θ1x1+⋯+θkxk with θ1,⋯,θk≥0 is called a conic combination (or a nonnegative linear combination) of x1,⋯,xk. If xi are in a convex cone C, then every conic combination of xi is in C. Conversely, a set C is a convex cone if and only if it contains all conic combinations of its elements.
The conic hull of a set C is the set of all conic combinations of points in C, i.e.,
{θ1x1+⋯+θkxk∣xi∈C,θi≥0,i=1,⋯,k}(2.3)
The conic hull is also the the smallest convex cone that contains C.
3. Hyperplanes and halfspaces
3.1 Hyperplanes
A hyperplane is a set of the form
{x∣aTx=b}(3.1.1)
where a∈Rn,a=0 and b∈R. Analytically it is the solution set of a nontrivial linear equation among the components of x (and hence an affine set). The geometric interpretation can be understood by expressing the hyperplane in the form
{x∣aT(x−x0)=0}(3.1.2)
where x0 is any point in the hyperplane (i.e., any point that satisfies aTx0=b). This representation can in turn be expressed as
{x∣aT(x−x0)=0}=x0+a⊥(3.1.3)
where a⊥ denotes the orthogonal complement of a, i.e., the set of all vectors orthogonal to it:
a⊥={v∣aTv=0}(3.1.4)
This shows that the hyperplane consists of an offset x0, plus all vectors orthogonal to the vector a.
3.2 Halfspaces
A hyperplane divides Rn into two halfspaces. A (closed) halfspace is a set of the form
{x∣aTx≤b}(3.2.1)
where a=0, i.e., the solution set of one (nontrivial) linear inequality. Halfspaces are convex, but not affine. The halfspace can also be expressed as
{x∣aT(x−x0)≤0}(3.2.2)
Where x0 is any point on the associated hyperplane, i.e., satisfies aTx0=b. The representation (3.2.2) suggest a simple geometric interpretation: the halfspace consists of x0 plus any vector makes an obtuse (or right) angle with the vector a. The boundary of the halfspace is the hyperplane {x∣aT=b}. The set {x∣aTx<b}, which is the interior of the halfspace {x∣aTx≤b}, is called an open halfspace.
4. Euclidean balls and ellipsoids
4.1 Euclidean balls
A Euclidean ball (or just ball) in Rn has the form
where r>0, and ∥⋅∥2 denotes the Euclidean norm, i.e., ∥u∥2=(uTu)21. The vector xc is the center of the ball and the scalar r is its radius. B(xc,r) consists of all points within a distance r of the center xc. Another common representation for Euclidean ball is
B(xc,r)={xc+ru∣∥u∥2≤1}(4.1.2)
A Euclidean ball is a convex set: if ∥x1−xc∥2≤r,∥x2−xc∥2≤r,and 0≤θ≤1, then
A related family of convex sets is the ellipsoids, which have the form
E={x∣(x−xc)TP−1(x−xc)≤1}(4.2.1)
where P=PT≻0, i.e., P is symmetric and positive definite. The vector xc∈Rn is the center of the ellipsoid. The matrix P determines how far the ellipsoid extends in every direction from xc. The lengths of the semi-axes of E are given by λi, where λi are the eigenvalues of P. A ball is an ellipsoid with P=r2I. Another common representation of an ellipsoid is
E={xc+Au∣∥u∥2≤1}(4.2.2)
where A is square and nonsingular. In this representation we can assume without loss of generality that A is symmetric and positive definite. By taking A=P21, this representation gives the ellipsoid defined in (4.2.1). When the matrix A in (4.2.2) is symmetric positive semidefinite but singular, the set in (4.2.2) is called a degenerate ellipsoid. Its affine dimension is equal to the rank of A. Degenrate elliposids are also convex.
5. Norm balls and norm cones
5.1 Norm balls
Suppose ∥⋅∥ is any norm on Rn. From the general properties of norms it can be shown that a norm ball of radius r and center xc, given by
{x∣∥x−xc∥≤r}(5.1.1)
Norm ball is convex.
5.2 Norm cones
The norm cone associated with the norm ∥⋅∥ is the set
C={(x,t)∣∥x∥≤t}⊆Rn+1(5.2.1)
Norm cone is also a convex cone.
6. Polyhedra
A polyhedron is defined as the solution set of a finite number of linear equalities and inequalities:
P={x∣ajTx≤bj,j=1,⋯,m,cjTx=dj,j=1,⋯,p}(6.1)
A polyhedron is thus the intersection of a finite number of halfspaces and hyperplanes. Affine sets, rays, line segments, and halfspaces are all polyhedra. It is easily shown that polyhedra are convex sets. A bounded polyhedron is sometimes called a polytope, but some authors use the opposite convention (i.e., polytope for any set of the form (6.1), and polyhedron when it is bounded). It will be convenient to use the compact notation
where the symbol ⪯ denotes vector inequality or componentwise inequality in Rm: u⪯v means ui≤vi for i=1,⋯,m.
7. Simplexes
Simplexes are another important family of polyhedra. Suppose the k+1 points v0,⋯,vk∈Rn are affinely independent, which means v1−v0,⋯,vk−v0 are linearly independent. The simplex determined by them is given by
Where 1 denotes the vector with all entries one. The affine dimension of this simplex is k, so it is sometimes referred to as a k-dimensional simplex in Rn.
8. Positive semidefinite cone
We use the notation Sn to denote the set of symmetric n×n matrices,
Sn={X∈Rn×n∣X=XT}(8.1)
which is a vector space with dimension 2n(n+1). We use the notation S+n to denote the set of symmetric positive semidefinite matrices:
S+n={X∈Sn∣X⪰0}(8.2)
and the notation S++n to denote the set of symmetric positive definite matrices:
S++n={X∈Sn∣X⪰0}(8.3)
(This notation is meant to be analogous to R+, which denotes the nonnegative reals, and R++, which denotes the positive reals.)
The set S+n is a positive semidefinite cone. Obviously, S+n is also a convex cone: if θ1,θ2≥0 and A,B∈S+n, then θ1A+θ2B∈S+n. This can be seen directly from the definition of positive semidefiniteness: for any x∈Rn, we have