The proximal point method

Problem setup

Consider the unconstrained minimization problem

\[\minimize_{x \in \calH} f(x) \quad \iff \quad \text{find } x \in \calH \text{ such that } 0 \in \partial f(x),\]

where \(f : \calH \to \reals\) is \(\mu\)-strongly convex with \(\mu>0\).

For an initial point \(x^0 \in \calH\) and step size \(\gamma \in \reals_{++}\), the proximal point update [Mar70, Mor65, Roc76] is

\[(\forall k \in \naturals)\quad x^{k+1} = \prox_{\gamma f}(x^k).\]

In this example, we search for the smallest contraction factor \(\rho\in[0,1)\) provable using AutoLyap such that

\[\|x^k - x^\star\|^2 = O(\rho^k) \quad \textup{ as } \quad k\to\infty,\]

where

\[x^\star \in \Argmin_{x \in \calH} f(x).\]

Model the problem in AutoLyap and search for the smallest rho

\(f\) is modeled by StronglyConvex.
The optimization problem is represented by InclusionProblem.
The update rule is represented by ProximalPoint.
Distance-to-solution parameters are obtained with IterationIndependent.LinearConvergence.get_parameters_distance_to_solution.
The contraction factor is searched with IterationIndependent.LinearConvergence.bisection_search_rho.

Run the iteration-independent analysis

This example uses the MOSEK Fusion backend (backend="mosek_fusion"). Install the optional MOSEK dependency first:

pip install "autolyap[mosek]"

from autolyap import SolverOptions
from autolyap.algorithms import ProximalPoint
from autolyap.problemclass import InclusionProblem, StronglyConvex
from autolyap.iteration_independent import IterationIndependent

mu = 1.0
gamma = 0.6

problem = InclusionProblem([StronglyConvex(mu)])
algorithm = ProximalPoint(gamma=gamma)
solver_options = SolverOptions(backend="mosek_fusion")
# License-free option:
# solver_options = SolverOptions(backend="cvxpy", cvxpy_solver="CLARABEL")

# Build V(P, p, k) and R(T, t, k) for distance-to-solution analysis.
P, p, T, t = IterationIndependent.LinearConvergence.get_parameters_distance_to_solution(
    algorithm
)

result = IterationIndependent.LinearConvergence.bisection_search_rho(
    problem,
    algorithm,
    P,
    T,
    p=p,
    t=t,
    S_equals_T=True,
    s_equals_t=True,
    remove_C3=True,
    solver_options=solver_options,
)

if result["status"] != "feasible":
    raise RuntimeError("No feasible Lyapunov certificate in the requested rho interval.")

rho_autolyap = result["rho"]
rho_theory = (1.0 / (1.0 + gamma * mu)) ** 2

print(f"rho (AutoLyap): {rho_autolyap:.8f}")
print(f"rho (theory):   {rho_theory:.8f}")

What to inspect in result:

result["status"]: feasible, infeasible, or not_solved.
result["solve_status"]: raw backend status.
result["rho"]: certified contraction factor when feasible.
result["certificate"]: Lyapunov certificate matrices/scalars.

The computed value rho (AutoLyap) matches (up to solver numerical tolerances) the theoretical rate expression in [Roc76], i.e.,

\[\|x^k - x^\star\|^2 = O(\rho^k) \quad \textup{ as } \quad k\to\infty,\]

where

\[\rho = \left(\frac{1}{1+\gamma\mu}\right)^2,\]

and

\[x^\star \in \Argmin_{x \in \calH} f(x).\]

Equivalently,

\[\|x^k - x^\star\| = O\!\left(\left(\frac{1}{1+\gamma\mu}\right)^k\right).\]

Sweeping over 100 values of \(\gamma\) on \(0 < \gamma \le 5\) gives the plot below, with the theoretical rate in black and AutoLyap certificates as blue dots.

Proximal-point rho versus gamma with theoretical line and AutoLyap points.

References

[Mar70]

B. Martinet. Brève communication. Régularisation d'inéquations variationnelles par approximations successives. Revue française d'informatique et de recherche opérationnelle. Série rouge, 4(R3):154–158, 1970. doi:10.1051/m2an/197004R301541.

[Mor65]

Jean-Jacques Moreau. Proximité et dualité dans un espace hilbertien. Bulletin de la Société Mathématique de France, 93:273–299, 1965. doi:10.24033/bsmf.1625.

[Roc76] (1,2)

R. Tyrrell Rockafellar. Monotone operators and the proximal point algorithm. SIAM Journal on Control and Optimization, 14(5):877–898, 1976. doi:10.1137/0314056.