
Iterative Methods and Their
Dynamics with Applications

A Contemporary Study

Ioannis K. Argyros
Department of Mathematical Sciences
Cameron University
Lawton, Oklahoma, USA

Á. Alberto Magreñán
Universidad Internacional de La Rioja
Escuela Superior de Ingeniería y Tecnología
Logroño, La Rioja, Spain

A SCIENCE PUBLISHERS BOOK
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2017 by Taylor & Francis Group, LLC


CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper


Version Date: 20160826

International Standard Book Number-13: 978-1-4987-6360-8 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts
have been made to publish reliable data and information, but the author and publisher cannot assume
responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize to
copyright holders if permission to publish in this form has not been obtained. If any copyright material has
not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmit-
ted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented,
including photocopying, microfilming, and recording, or in any information storage or retrieval system,
without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.
com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood
Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and
registration for a variety of users. For organizations that have been granted a photocopy license by the CCC,
a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used
only for identification and explanation without intent to infringe.

Library of Congress Cataloging‑in‑Publication Data

Names: Argyros, Ioannis K. | Magreñán, Á. Alberto (Ángel Alberto)

Title: Iterative methods and their dynamics with applications : a
contemporary study / Ioannis K. Argyros and Á. Alberto Magreñán.
Description: Boca Raton, FL : CRC Press, [2016] | “A science publishers
book.” | Includes bibliographical references and index.
Identifiers: LCCN 2016038689| ISBN 9781498763608 (hardback : alk. paper) |
ISBN 9781498763622 (e-book : alk. paper)
Subjects: LCSH: Iterative methods (Mathematics) | Numerical analysis.
Classification: LCC QA297.8 .A7364 2016 | DDC 518/.26--dc23
LC record available at https://lccn.loc.gov/2016038689

Visit the Taylor & Francis Web site at


http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Dedication

To my parents Mercedes and Alberto


my grandmother Ascensión
and my beloved Lara.
Preface

This book offers a modern approach that combines recent and earlier results on numerical methods with applications in fields such as engineering and chemistry. The book also provides a comparison between various investigations of recent years. Each chapter contains several new theoretical results and important applications. The applications appear in the form of examples or case studies, or they are implied, since our results improve earlier ones that have already been applied to concrete problems. Chapters and sections have been written independently of each other; hence, the interested reader can go directly to a certain chapter or section and understand the material without having to go back and forth. Furthermore, several references to recent studies appear throughout the book.
This monograph presents the authors' recent work on the convergence and dynamics of iterative methods. It is the natural outgrowth of their related publications in these areas. Chapters are self-contained and can be read independently. Moreover, an extensive list of references is given in each chapter, in order to allow the reader to follow up on the underlying ideas. For these reasons, we think that several advanced courses can be taught using this book.
The book's results are expected to find applications in many areas of applied mathematics, engineering, computer science and real-world problems. As such, this monograph is suitable for researchers, graduate students and seminars in the above subjects, and for all science and engineering libraries.
The preparation of this book took place during the year 2016 in Lawton, Oklahoma,
USA and Logroño, La Rioja, Spain.
October 2016 Ioannis K. Argyros
Á. Alberto Magreñán
Contents

Dedication v
Preface vii
List of Figures xiii
List of Tables xv
1 Halley’s method 1
1.1 Introduction 1
1.2 Semilocal convergence of Halley’s method 3
1.3 Numerical examples 10
1.4 Basins of attraction 13
1.4.1 2 roots 13
1.4.2 3 roots 15
1.4.3 4 roots 15
References 23
2 Newton’s method for k-Fréchet differentiable operators 25
2.1 Introduction 26
2.2 Semilocal convergence analysis for Newton’s method 27
2.2.1 Uniqueness of solution 32
2.2.2 Special choices for function g 34
2.2.2.1 Choice 1 34
2.2.2.2 Choice 2 36
2.3 Numerical examples 39
References 43
3 Nonlinear Ill-posed equations 46
3.1 Introduction 46
3.2 Convergence analysis 50
3.3 Error bounds 55
3.4 Implementation of adaptive choice rule 57
3.4.1 Algorithm 57
3.5 Numerical examples 57
References 60
4 Sixth-order iterative methods 63


4.1 Introduction 63
4.2 Scheme for constructing sixth-order iterative methods 65
4.3 Sixth-order iterative methods contained in family USS 69
4.4 Numerical work 70
4.4.1 Solving nonlinear equations 71
4.5 Dynamics for method SG 72
4.5.1 Study of the fixed points and their stability 73
4.5.2 Study of the critical points and parameter spaces 74
References 84
5 Local convergence and basins of attraction of a two-step Newton-like method for equations with solutions of multiplicity greater than one 88
5.1 Introduction 89
5.2 Local convergence 90
5.3 Basins of attraction 96
5.3.1 Basins of F(x) = (x – 1)2(x + 1) 98
5.3.2 Basins of F(x) = (x – 1)3(x + 1) 99
5.3.3 Basins of F(x) = (x – 1)4(x + 1) 99
5.4 Numerical examples 101
References 104
6 Extending the Kantorovich theory for solving equations 108
6.1 Introduction 108
6.2 First convergence improvement 110
6.3 Second convergence improvement 113
References 116
7 Robust convergence for inexact Newton method 119
7.1 Introduction 119
7.2 Standard results on convex functions 120
7.3 Semilocal convergence 121
7.4 Special cases and applications 134
References 140
8 Inexact Gauss-Newton-like method for least square problems 144
8.1 Introduction 145
8.2 Auxiliary results 146
8.3 Local convergence analysis 148
8.4 Applications and examples 156
References 159
9 Lavrentiev Regularization methods for Ill-posed equations 161
9.1 Introduction 162
9.2 Basic assumptions and some preliminary results 163
9.3 Error estimates 165
9.3.1 A priori parameter choice 166
9.3.2 A posteriori parameter choice 166


9.4 Numerical examples 172
References 174
10 King-Werner-type methods of order 1 + √2 176
10.1 Introduction 176
10.2 Majorizing sequences for King-Werner-type methods 178
10.3 Convergence analysis of King-Werner-type methods 183
10.4 Numerical examples 188
References 191
11 Generalized equations and Newton’s method 193
11.1 Introduction 193
11.2 Preliminaries 194
11.3 Semilocal convergence 195
References 202
12 Newton’s method for generalized equations using 204
restricted domains
12.1 Introduction 204
12.2 Preliminaries 205
12.3 Local convergence 206
12.4 Special cases 210
References 212
13 Secant-like methods 216
13.1 Introduction 216
13.2 Semilocal convergence analysis of the secant method I 218
13.3 Semilocal convergence analysis of the secant method II 226
13.4 Local convergence analysis of the secant method I 227
13.5 Local convergence analysis of the secant method II 231
13.6 Numerical examples 232
References 238
14 King-Werner-like methods free of derivatives 241
14.1 Introduction 241
14.2 Semilocal convergence 242
14.3 Local convergence 245
14.4 Numerical examples 246
References 249
15 Müller’s method 251
15.1 Introduction 251
15.2 Convergence ball for method (15.1.2) 252
15.3 Numerical examples 256
References 258
16 Generalized Newton method with applications 259
16.1 Introduction 259

16.2 Preliminaries 260


16.3 Semilocal convergence 261
References 268
17 Newton-secant methods with values in a cone 271
17.1 Introduction 271
17.2 Convergence of the Newton-secant method 273
References 279
18 Gauss-Newton method with applications to convex optimization 282
18.1 Introduction 282
18.2 Gauss-Newton Algorithm and Quasi-Regularity condition 283
18.2.1 Gauss-Newton Algorithm GNA 283
18.2.2 Quasi Regularity 284
18.3 Semilocal convergence for GNA 286
18.4 Specializations and numerical examples 291
References 292
19 Directional Newton methods and restricted domains 295
19.1 Introduction 295
19.2 Semilocal convergence analysis 298
References 304
20 Expanding the applicability of the Gauss-Newton method for convex optimization under restricted convergence domains and majorant conditions 306
20.1 Introduction 307
20.2 Gauss-Newton Algorithm and Quasi-Regularity condition 307
20.2.1 Gauss-Newton Algorithm GNA 307
20.2.2 Quasi Regularity 308
20.3 Semi-local convergence 310
20.4 Numerical examples 315
References 316
21 Ball Convergence for eighth order method 319
21.1 Introduction 319
21.2 Local convergence analysis 320
21.3 Numerical examples 326
References 328
22 Expanding Kantorovich’s theorem for solving 331
generalized equations
22.1 Introduction 331
22.2 Preliminaries 332
22.3 Semilocal convergence 333
References 341
Index 343
List of Figures

1.1 Basins of attraction associated to Halley's method applied to $p(z)=z^2-1$  14
1.2 Basins of attraction associated to Halley's method applied to $p(z)=(z^2-1)(z-1)$  14
1.3 Basins of attraction associated to Halley's method applied to $p(z)=(z^2-1)^2$  15
1.4 Basins of attraction associated to Halley's method applied to $p(z)=z^3-1$  16
1.5 Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)(z-1)$  16
1.6 Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)\left(z-\left(-\frac{1}{2}-\frac{\sqrt{3}}{2}i\right)\right)$  17
1.7 Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)\left(z-\left(-\frac{1}{2}+\frac{\sqrt{3}}{2}i\right)\right)$  17
1.8 Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)\left(z-\left(-\frac{1}{2}-\frac{\sqrt{3}}{2}i\right)\right)(z-1)$  18
1.9 Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)\left(z-\left(-\frac{1}{2}+\frac{\sqrt{3}}{2}i\right)\right)(z-1)$  18
1.10 Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)(z^2+z+1)$  19
1.11 Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)^2$  19
1.12 Basins of attraction associated to Halley's method applied to $p(z)=z^4-1$  20
1.13 Basins of attraction associated to Halley's method applied to $p(z)=(z^4-1)(z-1)$  20
1.14 Basins of attraction associated to Halley's method applied to $p(z)=(z^4-1)(z^2-1)$  21
1.15 Basins of attraction associated to Halley's method applied to $p(z)=(z^4-1)(z^2-1)(z-i)$  21
1.16 Basins of attraction associated to Halley's method applied to $p(z)=(z^4-1)^2$  22
4.1 Bifurcation diagram of the fixed points 73
4.2 Region of stability of z = 1 74
4.3 Bifurcation diagram of the critical points 75
4.4 Parameter space associated to the free critical point cr1(a) 77
4.5 Parameter space associated to the free critical point cr2(a) 77
4.6 Parameter space associated to the free critical point cr3(a) 78
4.7 Parameter space associated to the free critical point cr4(a) 78
4.8 Detail of the parameter space associated to the free critical point cr1(a) 79
4.9 Basins of attraction associated to the method with a = –2.125 79
4.10 Basins of attraction associated to the method with a = –1.9 80
4.11 Basins of attraction associated to the method with a = –2.18 80
4.12 Basins of attraction associated to the method with a = –2 81
4.13 Basins of attraction associated to the method with a = 0 81
4.14 Basins of attraction associated to the method with a = –5 82
4.15 Basins of attraction associated to the method with a = 5 82
4.16 Basins of attraction associated to the method with a = 2i 83
5.1 Basins of attraction of the method (5.2) associated to the roots 98
of F(x) = (x – 1)2(x + 1) with ϕ(x) = x2
5.2 Basins of attraction of the method (5.2) associated to the roots 99
of F(x) = (x – 1)2(x + 1) with ϕ(x) = 2x
5.3 Basins of attraction of the method (5.2) associated to the roots 99
of F(x) = (x – 1)2(x + 1) with ϕ(x) = x4 + 1
5.4 Basins of attraction of the method (5.2) associated to the roots 100
of F(x) = (x – 1)3(x + 1) with ϕ(x) = x2
5.5 Basins of attraction of the method (5.2) associated to the roots 100
of F(x) = (x – 1)3(x + 1) with ϕ(x) = 2x
5.6 Basins of attraction of the method (5.2) associated to the roots 100
of F(x) = (x – 1)3(x + 1) with ϕ(x) = x4 + 1
5.7 Basins of attraction of the method (5.2) associated to the roots 101
of F(x) = (x – 1)4(x + 1) with ϕ(x) = x2
5.8 Basins of attraction of the method (5.2) associated to the roots 101
of F(x) = (x – 1)4(x + 1) with ϕ(x) = 2x
5.9 Basins of attraction of the method (5.2) associated to the roots 102
of F(x) = (x – 1)4(x + 1) with ϕ(x) = x4 + 1
13.1 Ammonia process. Taken from www.essentialchemicalindustry.org/chemicals/ammonia  237
List of Tables

2.1 Sequences $\bar u_n$ and $u_n$ and a priori error bounds  39
2.2 Sequences $\{t_n\}$, $\{\bar u_n\}$ and $\{u_n\}$  42
2.3 A priori error estimates 42
4.1 The (number of function evaluations, COC) for various sixth order 71
iterative methods
14.1 The comparison results of error estimates 247
22.1 Error bound comparison 340
Chapter 1

Halley’s method

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingeniería y Tecnología,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Semilocal convergence of Halley’s method . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Numerical examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4 Basins of attraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.1 2 roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.2 3 roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.4.3 4 roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

1.1 Introduction
In this chapter we are concerned with the problem of approximating a locally
unique solution x of the nonlinear equation
F(x) = 0, (1.1)
where F is twice Fréchet-differentiable operator defined on a nonempty open and
convex subset of a Banach space X with values in a Banach space Y .
Many problems from computational sciences and other disciplines can be
1
2  Iterative methods and their dynamics with applications: a contemporary study

brought in a form similar to equation (1.1) using mathematical modelling [1, 3,


5, 13, 19–21]. The solutions of these equations can be rarely be found in closed
form. That is why most solution methods for these equations are iterative. The
convergence matter is usually divided in local and semilocal study. The local
study is based on the solution whereas the semilocal one is based on the initial
point.
In the present chapter we provide a semilocal convergence analysis for Halley's method, defined by
$$x_{n+1}=x_n-\Gamma_F(x_n)F'(x_n)^{-1}F(x_n),\quad\text{for each }n=0,1,2,\ldots,\tag{1.2}$$
where $\Gamma_F(x)=(I-L_F(x))^{-1}$ and $L_F(x)=\dfrac{1}{2}F'(x)^{-1}F''(x)F'(x)^{-1}F(x)$.
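For intuition, here is a minimal Python sketch of iteration (1.2) in the scalar case, where $L_F(x)$ reduces to $F''(x)F(x)/(2F'(x)^2)$; the function names, tolerance and iteration cap are illustrative choices of this sketch, not part of the chapter.

```python
def halley(F, dF, d2F, x0, tol=1e-12, max_iter=50):
    """Halley's method (1.2) for a scalar equation F(x) = 0."""
    x = x0
    for _ in range(max_iter):
        Fx, dFx = F(x), dF(x)
        LF = d2F(x) * Fx / (2.0 * dFx**2)      # L_F(x) in one dimension
        step = Fx / (dFx * (1.0 - LF))         # Gamma_F(x) F'(x)^{-1} F(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# The scalar equation of Example 1.3.1 below: F(x) = x^3 - 0.64, x0 = 0.85
print(halley(lambda x: x**3 - 0.64, lambda x: 3*x**2, lambda x: 6*x, 0.85))
# prints ~0.8617739, the real cube root of 0.64
```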
2
The convergence of Halley’s method has been studied in many works such
as [1–5, 8, 11, 12]. The most used conditions for the study of the semilocal con-
vergence of Halley’s method are the following ones

(C1 ) There exists x0 ∈ D such that F  (x0 )−1 ∈ L(Y, X), the space of bounded
linear operator from Y into X;
(C2 ) F  (x0 )−1 F(x0 ) ≤ γ, γ > 0;
(C3 ) F  (x0 )−1 F  (x) ≤ M for each x in D, M > 0;
(C4 ) F  (x0 )−1 [F  (x) − F  (y)] ≤ Kx − y for each x and y in D, K > 0.

The corresponding sufficient convergence condition is given by



4K + M 2 − M M 2 + 2K
γ≤ √ , (1.3)
3K(M + M 2 + 2K)
where all the parameters comes from the above conditions.
Condition (C4 ) is not satisfied in many cases as the following one.
Example 1.1 Let $X=Y=\mathbb R$, define $D=[0,+\infty)$ and let
$$F(x)=\frac{4}{15}x^{2}\sqrt{x}+x^{2}+x-3.$$
Then, we obtain
$$|F''(x)-F''(y)|=|\sqrt{x}-\sqrt{y}|=\frac{|x-y|}{\sqrt{x}+\sqrt{y}}.$$
Therefore, there is no constant $K$ satisfying (C4).


Ezquerro and Hernández in [12], using recurrence relations, expanded the applicability of Halley's method by dropping condition (C4) and replacing (1.3) by
$$M\gamma<\frac{4-\sqrt6}{5}.\tag{1.4}$$
Halley’s method  3

In the present chapter we show how to expand even further the applicability
of Halley’s method using (C1 ), (C2 ), (C3 ) and the well-known center-Lipschitz
condition, defined by
(C5 ) F  (x0 )−1 [F  (x) − F  (x0 )] ≤ Lx − x0  f or each x ∈ D, L > 0.
We have that
L≤M (1.5)
M
holds in general. Moreover, as it was stated in [4] and [5] can be arbitrarily
L
large. In the literature (C3 ) is used to obtain
1
F  (x)−1 F  (x0 ) ≤ f or each x ∈ D. (1.6)
1 − Mx − x0 
However, using condition (C5 ) and if L < M we obtain the tighter estimate
1
F  (x)−1 F  (x0 ) ≤ f or each x ∈ D. (1.7)
1 − Lx − x0 
This idea, used in the study of many methods [4, 5], allows to obtain:

 a tighter convergence analysis,


 weaker sufficient convergence conditions,
 more precise error bounds on the location of the solution.

As a consequence of the idea, we can use a different approach than recur-


rent relations in our analysis. The chapter is organized as follows: Section 1.2
contains the study of the semilocal convergence of Halley’s method and the nu-
merical examples are presented Section 1.3. Finally, some basins of attraction
related to Halley’s method are presented in the concluding Section 1.4.

1.2 Semilocal convergence of Halley’s method


In order to present the semilocal convergence analysis of Halley's method, we shall first introduce an additional condition. Denoting
$$a=\frac{M\gamma}{2}\qquad\text{and}\qquad\gamma_0=\frac{\gamma}{1-a},$$
we can define the condition

(C6) Suppose that
$$\left(L+\frac{M}{2}\right)\gamma<1\tag{1.8}$$
and that there exists
$$b\in(0,\,1-L\gamma_0)\tag{1.9}$$
which is the smallest positive zero of the function $g$ defined on $(0,\,1-L\gamma_0)$ by
$$g(t)=\frac{\dfrac{a}{1-\frac{L\gamma_0}{1-t}}+\dfrac{a}{1-a}}{\left(1-\dfrac{L\gamma_0}{1-t}\right)\left(1-\dfrac{a}{(1-a)\left(1-\frac{L\gamma_0}{1-t}\right)^{2}}\left(\dfrac{a}{1-\frac{L\gamma_0}{1-t}}+\dfrac{a}{1-a}\right)\right)}-t.\tag{1.10}$$
Then, we get
$$b=\frac{\dfrac{a}{1-\frac{L\gamma_0}{1-b}}+\dfrac{a}{1-a}}{\left(1-\dfrac{L\gamma_0}{1-b}\right)\left(1-\dfrac{a}{(1-a)\left(1-\frac{L\gamma_0}{1-b}\right)^{2}}\left(\dfrac{a}{1-\frac{L\gamma_0}{1-b}}+\dfrac{a}{1-a}\right)\right)}.\tag{1.11}$$
Now, define
$$R_0=\frac{\gamma_0}{1-b}\qquad\text{and}\qquad\beta=LR_0.$$
Then, we have that $a\in(0,1)$, $\beta\in(0,1)$ and
$$\frac{a}{(1-a)(1-\beta)^{2}}\left(\frac{a}{1-\beta}+\frac{a}{1-a}\right)=\frac{M^{2}\gamma_0}{4(1-\beta)^{2}}\left(\frac{\gamma}{1-\beta}+\gamma_0\right)<1,\tag{1.12}$$
so that (1.11) can be rewritten as
$$b=\frac{\dfrac{M}{2}\left(\dfrac{\gamma}{1-\beta}+\gamma_0\right)}{(1-\beta)\left(1-\dfrac{M^{2}\gamma_0}{4(1-\beta)^{2}}\left(\dfrac{\gamma}{1-\beta}+\gamma_0\right)\right)}.$$
Furthermore, before stating the main theorem, we must define the following condition.

(C7) Suppose that
$$d=\frac{\dfrac{M}{2(1-\beta)}\gamma+\dfrac{M}{2}\gamma_0}{(1-\beta)\left(1-\dfrac{M}{2(1-\beta)^{2}}\gamma\right)}<1.$$

We refer to (C1)–(C3), (C5)–(C7) as the (C) conditions. Let $B(x,R)$ and $\overline B(x,R)$ stand, respectively, for the open and closed balls in $X$ with center $x$ and radius $R>0$.

Theorem 1.1
Let $F:D\subset X\to Y$ be a continuously twice Fréchet-differentiable operator, where $X$, $Y$ are Banach spaces and $D$ is an open and convex subset of $X$. Suppose that the (C) conditions hold and that $\overline B(x_0,R_0)\subset D$. Then, the sequence $\{x_n\}$ generated by Halley's method (1.2) is well defined, remains in $\overline B(x_0,R_0)$ for all $n\ge0$ and converges to a unique solution $x^{*}\in\overline B(x_0,R_0)$ of equation $F(x)=0$. Moreover, the following estimate holds for each $n=0,1,2,\ldots$:
$$\|x_{n+2}-x_{n+1}\|\le b\|x_{n+1}-x_n\|.\tag{1.13}$$
Furthermore, if there exists $R^{*}\ge R_0$ such that
$$B(x_0,R^{*})\subseteq D\tag{1.14}$$
and
$$\frac{L}{2}(R_0+R^{*})<1,\tag{1.15}$$
then $x^{*}$ is the only solution of equation $F(x)=0$ in $B(x_0,R^{*})$.

Proof: By (C1)–(C3) we get that
$$\|I-(I-L_F(x_0))\|=\|L_F(x_0)\|=\frac12\left\|F'(x_0)^{-1}F''(x_0)F'(x_0)^{-1}F(x_0)\right\|\le\frac12\left\|F'(x_0)^{-1}F''(x_0)\right\|\left\|F'(x_0)^{-1}F(x_0)\right\|\le\frac12M\gamma=a<1.\tag{1.16}$$
Moreover, it follows from the Banach lemma on invertible operators [2], [13] and (1.16) that $(I-L_F(x_0))^{-1}$ exists and
$$\left\|(I-L_F(x_0))^{-1}\right\|\le\frac{1}{1-\|L_F(x_0)\|}\le\frac{1}{1-a}.$$
Then, by (C2), (1.2) and the above estimate we obtain that
$$\|x_1-x_0\|=\left\|(I-L_F(x_0))^{-1}F'(x_0)^{-1}F(x_0)\right\|\le\left\|(I-L_F(x_0))^{-1}\right\|\left\|F'(x_0)^{-1}F(x_0)\right\|\le\frac{\gamma}{1-a}=\gamma_0<R_0.$$
Now, using (C5) we get that
$$\left\|I-F'(x_0)^{-1}F'(x_1)\right\|\le L\|x_1-x_0\|\le L\gamma_0<LR_0=\beta<1.$$

Hence, $F'(x_1)^{-1}$ exists and
$$\left\|F'(x_1)^{-1}F'(x_0)\right\|<\frac{1}{1-\beta}.$$
By (1.2) we can write
$$[I-L_F(x_0)](x_1-x_0)+F'(x_0)^{-1}F(x_0)=0,$$
or
$$F(x_0)+F'(x_0)(x_1-x_0)=\frac12F''(x_0)F'(x_0)^{-1}F(x_0)(x_1-x_0),$$
or
$$\begin{aligned}F'(x_0)^{-1}F(x_1)&=F'(x_0)^{-1}\left[F(x_1)-F(x_0)-F'(x_0)(x_1-x_0)+F(x_0)+F'(x_0)(x_1-x_0)\right]\\&=\frac12F'(x_0)^{-1}F''(x_0)F'(x_0)^{-1}F(x_0)(x_1-x_0)+\int_0^1F'(x_0)^{-1}F''(x_0+\theta(x_1-x_0))(1-\theta)\,d\theta\,(x_1-x_0)^2.\end{aligned}$$
Hence, we have that
$$\left\|F'(x_0)^{-1}F(x_1)\right\|\le\frac12M\gamma\|x_1-x_0\|+\frac12M\|x_1-x_0\|^2\le\frac{M}{2(1-\beta)}\gamma\|x_1-x_0\|+\frac12M\|x_1-x_0\|^2\le\frac M2\left(\frac{\gamma}{1-\beta}+\gamma_0\right)\|x_1-x_0\|.$$
Moreover, we get that
$$\begin{aligned}\|L_F(x_1)\|&=\frac12\left\|F'(x_1)^{-1}F''(x_1)F'(x_1)^{-1}F'(x_0)F'(x_0)^{-1}F(x_1)\right\|\\&\le\frac12\left\|F'(x_1)^{-1}F'(x_0)\right\|^2\left\|F'(x_0)^{-1}F''(x_1)\right\|\left\|F'(x_0)^{-1}F(x_1)\right\|\\&\le\frac{M}{2(1-\beta)^2}\left[\frac{M}{2(1-\beta)}\gamma\|x_1-x_0\|+\frac12M\|x_1-x_0\|^2\right]\\&\le\frac{M^2\left(\frac{\gamma}{1-\beta}+\gamma_0\right)\|x_1-x_0\|}{4(1-\beta)^2}\le\frac{M^2\gamma_0}{4(1-\beta)^2}\left(\frac{\gamma}{1-\beta}+\gamma_0\right)<1.\end{aligned}$$
Then, $(I-L_F(x_1))^{-1}$ exists and
$$\left\|(I-L_F(x_1))^{-1}\right\|\le\frac{1}{1-\|L_F(x_1)\|}\le\frac{1}{1-\frac{M}{2(1-\beta)^2}\left[\frac{M}{2(1-\beta)}\gamma\|x_1-x_0\|+\frac12M\|x_1-x_0\|^2\right]}\le\frac{1}{1-\frac{M^2\gamma_0}{4(1-\beta)^2}\left(\frac{\gamma}{1-\beta}+\gamma_0\right)}.$$
Halley’s method  7

So, x2 is well defined, and using (1.2) we get


F  (x1 )−1 F  (x0 )F  (x0 )−1 F(x1 )
x2 − x1  ≤
1 − LF (x1 )
M 1
γx1 − x0  + Mx1 − x0 2
2(1 − β ) 2
≤  
M M 1
(1 − β ) 1 − [ γx1 − x0  + Mx1 − x0 2 ]
2(1 − β ) 2(1 − β )
2 2
M γ
( + γ0 )
2 1−β
≤ x1 − x0  = bx1 − x0 .
M 2 γ0 γ
(1 − β )(1 − ( + γ0 ))
4(1 − β )2 1 − β
Therefore, we have that
x2 − x0  ≤ x2 − x1  + x1 − x0  ≤ bx1 − x0  + x1 − x0 
= (1 + b)x1 − x0 
1 − b2 x1 − x0  γ0
= x1 − x0  < ≤ = R0 .
1−b 1−b 1−b
Hence, x2 ∈ B(x0 , R0 ).
The above shows that for $n=0$ the following statements hold:

(i) $F'(x_{n+1})^{-1}$ exists and
$$\left\|F'(x_{n+1})^{-1}F'(x_0)\right\|<\frac{1}{1-\beta};$$
(ii) $\left\|F'(x_0)^{-1}F(x_{n+1})\right\|\le d_{n+1}$;
(iii) $d_{n+1}\le d\,d_n\le d^{\,n+1}d_0$;
(iv) $(I-L_F(x_{n+1}))^{-1}$ exists and
$$\left\|(I-L_F(x_{n+1}))^{-1}\right\|\le\frac{1}{1-\frac{M}{2(1-\beta)^2}d_{n+1}};$$
(v) $x_{n+2}$ is well defined and
$$\|x_{n+2}-x_{n+1}\|\le\frac{d_{n+1}}{(1-\beta)\left(1-\frac{M}{2(1-\beta)^2}d_{n+1}\right)}\le b\|x_{n+1}-x_n\|\le b^{n+1}\|x_1-x_0\|;$$
(vi) $\|x_{n+2}-x_0\|<R_0$.

Here, the sequence $\{d_n\}$ is defined by
$$d_0=\gamma,\qquad d_{n+1}=\frac{M}{2(1-\beta)}d_n\|x_{n+1}-x_n\|+\frac M2\|x_{n+1}-x_n\|^2,\quad n\ge0.$$
The rest of the proof is obtained using mathematical induction. Assume that (i)–(vi) are true for all natural integers $n\le k$, where $k\ge0$ is a fixed integer. Then $F'(x_{k+2})$ exists, since $x_{k+2}\in B(x_0,R_0)$ and
$$\left\|I-F'(x_0)^{-1}F'(x_{k+2})\right\|\le L\|x_{k+2}-x_0\|<LR_0=\beta<1.$$
Hence, $F'(x_{k+2})^{-1}$ exists and
$$\left\|F'(x_{k+2})^{-1}F'(x_0)\right\|\le\frac{1}{1-L\|x_{k+2}-x_0\|}<\frac{1}{1-\beta}.$$
Next, we estimate $\left\|F'(x_0)^{-1}F(x_{k+2})\right\|$. We have, by Halley's method, that
$$F(x_{k+2})=F(x_{k+2})-F(x_{k+1})-F'(x_{k+1})(x_{k+2}-x_{k+1})+\frac12F''(x_{k+1})F'(x_{k+1})^{-1}F(x_{k+1})(x_{k+2}-x_{k+1}),$$
or
$$\begin{aligned}F'(x_0)^{-1}F(x_{k+2})&=\frac12F'(x_0)^{-1}F''(x_{k+1})F'(x_{k+1})^{-1}F'(x_0)F'(x_0)^{-1}F(x_{k+1})(x_{k+2}-x_{k+1})\\&\quad+\int_0^1F'(x_0)^{-1}F''(x_{k+1}+\theta(x_{k+2}-x_{k+1}))(1-\theta)\,d\theta\,(x_{k+2}-x_{k+1})^2.\end{aligned}$$
Hence, we get that
$$\begin{aligned}\left\|F'(x_0)^{-1}F(x_{k+2})\right\|&\le\frac{M}{2(1-\beta)}d_{k+1}\|x_{k+2}-x_{k+1}\|+\frac M2\|x_{k+2}-x_{k+1}\|^2=d_{k+2}\\&=\frac M2\left(\frac{d_{k+1}}{1-\beta}+\|x_{k+2}-x_{k+1}\|\right)\|x_{k+2}-x_{k+1}\|\\&\le\frac M2\left(\frac{d^{\,k+1}\gamma}{1-\beta}+b^{k+1}\|x_1-x_0\|\right)\|x_{k+2}-x_{k+1}\|\\&\le\frac M2\left(\frac{d\gamma}{1-\beta}+b\gamma_0\right)\|x_{k+2}-x_{k+1}\|\le\frac M2\left(\frac{d\gamma}{1-\beta}+b\gamma_0\right)b^{k+1}\|x_1-x_0\|\\&\le\frac M2\left(\frac{d\gamma}{1-\beta}+b\gamma_0\right)b\gamma_0,\end{aligned}\tag{1.17}$$
Halley’s method  9

M dγ
dk+2 ≤ ( + bγ0 )xk+2 − xk+1 
2 1−β
M dγ dk+1
≤ ( + bγ0 )
2 1−β M
(1 − β )(1 − d k+1 γ)
2(1 − β )2
M dγ dk+1
≤ ( + bγ0 )
2 1−β M
(1 − β )(1 − dγ)
2(1 − β )2
M γ dk+1
≤ ( + γ0 ) = ddk+1 ≤ d k+2 d0
2 1−β M
(1 − β )(1 − γ)
2(1 − β )2
and
1 
LF (xk+2 ) ≤ F (xk+2 )−1 F  (x0 )2 F  (x0 )−1 F(xk+2 )
2 
F (x0 )−1 F  (xk+2 )
M M2 γ
≤ d k+2 ≤ ( + γ0 )γ0 < 1.
2(1 − β )2 4(1 − β ) 1 − β
2

As a consequence, (I − LF (xk+2 ))−1 exists and


1 1
(I − LF (xk+2 ))−1  ≤ ≤ .
1−
M M 2
γ
dk+2 1− ( + γ0 )γ0
2(1 − β )2 4(1 − β )2 1 − β
Therefore, xk+3 is well defined. Moreover, we obtain that

xk+3 − xk+2  ≤ (I − LF (xk+2 ))−1 F  (xk+2 )−1 F  (x0 )F  (x0 )−1 F(xk+2 )
dk+2

M
(1 − β )(1 − dk+2 )
2(1 − β )2
M γ
( + γ0 )xk+2 − xk+1 
2 1−β
≤ = bxk+2 − xk+1 
 M2 γ 
(1 − β ) 1 − ( + γ0 )γ0
4(1 − β )2 1 − β
≤ bk+2 x1 − x0 .

Furthermore, we have that

xk+3 − x0  ≤ xk+3 − xk+2  + xk+2 − xk+1  + · · · + x1 − x0 


≤ (bk+2 + bk+1 + · · · + 1)x1 − x0 
1 − bk+3 γ0
= x1 − x0  < = R0 .
1−b 1−b
Hence, we deduce that $x_{k+3}\in B(x_0,R_0)$, and the induction for (i)–(vi) is complete.
Let $m$ be a natural integer. Then, we have that
$$\|x_{k+m}-x_k\|\le\|x_{k+m}-x_{k+m-1}\|+\|x_{k+m-1}-x_{k+m-2}\|+\cdots+\|x_{k+1}-x_k\|\le(b^{m-1}+\cdots+b+1)\|x_{k+1}-x_k\|\le\frac{1-b^m}{1-b}b^k\|x_1-x_0\|.$$
It follows that $\{x_k\}$ is a complete (Cauchy) sequence in the Banach space $X$ and as such it converges to some $x^{*}\in\overline B(x_0,R_0)$ (since $\overline B(x_0,R_0)$ is a closed set). By letting $k\to\infty$ in (1.17) we obtain $F(x^{*})=0$. Moreover, we get
$$\|x^{*}-x_k\|\le\frac{b^k}{1-b}\|x_1-x_0\|.$$
In order to show the uniqueness part, let $y^{*}$ be a solution of $F(x)=0$ in $B(x_0,R^{*})$. Denote $Q=\int_0^1F'(x_0)^{-1}F'(x^{*}+\theta(y^{*}-x^{*}))\,d\theta$. Using (C5) we have in turn that
$$\|I-Q\|=\left\|\int_0^1F'(x_0)^{-1}\left[F'(x^{*}+\theta(y^{*}-x^{*}))-F'(x_0)\right]d\theta\right\|\le L\int_0^1\left[(1-\theta)\|x^{*}-x_0\|+\theta\|y^{*}-x_0\|\right]d\theta\le\frac L2(R_0+R^{*})<1.$$
It follows that $Q^{-1}$ exists. Now, using
$$0=F'(x_0)^{-1}(F(y^{*})-F(x^{*}))=Q(y^{*}-x^{*}),$$
we deduce $y^{*}=x^{*}$. □

1.3 Numerical examples


In order to show the applicability of the main result, we apply the conditions to two different examples.

Example 1.3.1 Consider the scalar function $F(x)=x^3-0.64$ on $D=(0,1)$ and choose as initial point $x_0=0.85$. Then,
$$F'(x)=3x^2,\qquad F''(x)=6x.\tag{1.18}$$
So, $F(x_0)=-0.025875$, $F'(x_0)=2.1675$ and $\gamma=0.011937716$. Moreover, we have for any $x\in D$ that
$$|F'(x_0)^{-1}F''(x)|=\frac{2|x|}{x_0^2}\le\frac{2}{x_0^2}=2.76816609\tag{1.19}$$
and
$$|F'(x_0)^{-1}[F'(x)-F'(x_0)]|=\frac{|x+x_0|}{x_0^2}|x-x_0|\le2.560553633|x-x_0|.\tag{1.20}$$
That is, we obtain $M=2.76816609$, $L=2.560553633$, $a=0.016522791$, $\gamma_0=0.012138274$ and $1-L\gamma_0=0.968919297$.
Furthermore, we have $b=0.035021656$, obtained by using the secant method. Then we get $R_0=0.012578805$, $\beta=0.032208705$ and $d=0.035628901$. As a consequence, conditions (C6) and (C7) are satisfied. On the other hand,
$$\overline B(x_0,R_0)=[0.837421195,\,0.862578805]\subset D,$$
so we can ensure the convergence of Halley's method by Theorem 1.1.
Now, we want to compare our uniqueness ball with the one obtained in [11]. Using (1.4) we get that
$$\beta_0=M\gamma=0.03304558<\frac{4-\sqrt6}{5},$$
so the conditions of [11] are also satisfied. The uniqueness ball there is $B(x_0,r_0\gamma)$, where
$$r_0=\frac{2-3\beta_0}{2(1-3\beta_0+\beta_0^2)}=1.053745860.$$
Then, we get $R_1=r_0\gamma=0.012579319>R_0$, so it is clear that we provide better information on the location of the solution. Moreover, by (1.14) and (1.15) we can set $R^{*}=0.15$, which extends the uniqueness ball from $\overline B(x_0,R_0)$ to $B(x_0,R^{*})$.
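The constants of Example 1.3.1 are easy to reproduce in a few lines of Python; as in the text, $b$ is computed with the secant method (the starting points $0$ and $0.1$ below are an assumption of this sketch).

```python
M, L = 2.76816609, 2.560553633
gamma = 0.025875 / 2.1675            # gamma = |F(x0)| / |F'(x0)| = 0.011937716
a = M * gamma / 2.0
gamma0 = gamma / (1.0 - a)

def g(t):
    # the function g of (1.10)
    q = 1.0 - L * gamma0 / (1.0 - t)
    num = a / q + a / (1.0 - a)
    return num / (q * (1.0 - a / ((1.0 - a) * q**2) * num)) - t

t0, t1 = 0.0, 0.1                    # secant iteration for the zero b of g
for _ in range(50):
    f0, f1 = g(t0), g(t1)
    t0, t1 = t1, t1 - f1 * (t1 - t0) / (f1 - f0)
    if abs(t1 - t0) < 1e-14:
        break

b = t1
R0 = gamma0 / (1.0 - b)
beta = L * R0
d = (M * gamma / (2 * (1 - beta)) + M * gamma0 / 2) / \
    ((1 - beta) * (1 - M * gamma / (2 * (1 - beta)**2)))
print(b, R0, beta, d)
# ~0.035021656, 0.012578805, 0.032208705, 0.035628901
```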
Example 1.3.2 In this example we provide an application of our results to a special nonlinear Hammerstein integral equation of the second kind; throughout, we use the max-norm. Consider
$$x(s)=1+\frac45\int_0^1G(s,t)x(t)^3\,dt,\quad s\in[0,1],\tag{1.21}$$
where $G$ is the Green's kernel on $[0,1]\times[0,1]$ defined by
$$G(s,t)=\begin{cases}t(1-s),&t\le s,\\ s(1-t),&s\le t.\end{cases}\tag{1.22}$$
Let $X=Y=C[0,1]$ and let $D$ be a suitable open convex subset of $X_1:=\{x\in X:\ x(s)>0,\ s\in[0,1]\}$, which will be given below. Define $F:D\to Y$ by
$$[F(x)](s)=x(s)-1-\frac45\int_0^1G(s,t)x(t)^3\,dt,\quad s\in[0,1].\tag{1.23}$$
We get that
$$[F'(x)y](s)=y(s)-\frac{12}5\int_0^1G(s,t)x(t)^2y(t)\,dt,\quad s\in[0,1],\tag{1.24}$$
and
$$[F''(x)yz](s)=-\frac{24}5\int_0^1G(s,t)x(t)y(t)z(t)\,dt,\quad s\in[0,1].\tag{1.25}$$
Let $x_0(s)=1$ for all $s\in[0,1]$. Then, for any $y\in D$, we have
$$[(I-F'(x_0))y](s)=\frac{12}5\int_0^1G(s,t)y(t)\,dt,\quad s\in[0,1],\tag{1.26}$$
which means
$$\|I-F'(x_0)\|\le\frac{12}5\max_{s\in[0,1]}\int_0^1G(s,t)\,dt=\frac{12}{5\times8}=\frac3{10}<1.\tag{1.27}$$
It follows from the Banach lemma on invertible operators that $F'(x_0)^{-1}$ exists and
$$\|F'(x_0)^{-1}\|\le\frac{1}{1-\frac3{10}}=\frac{10}7.\tag{1.28}$$
Moreover, from (1.23) we obtain $\|F(x_0)\|=\frac45\max_{s\in[0,1]}\int_0^1G(s,t)\,dt=\frac1{10}$. Then, we get $\gamma=\frac17$.
It is clear that $F''(x)$ is not bounded on $X$ or its subset $D$. A solution $x^{*}$ of $F(x)=0$ must satisfy
$$\|x^{*}\|-1-\frac1{10}\|x^{*}\|^3\le0,\tag{1.29}$$
i.e., $\|x^{*}\|\le\rho_1=1.153467305$ or $\|x^{*}\|\ge\rho_2=2.423622140$, where $\rho_1$ and $\rho_2$ are the positive roots of the equation $z-1-\frac1{10}z^3=0$. Consequently, if we look for a solution such that $\|x^{*}\|<\rho_1$, $x^{*}\in X_1$, we can consider $D=\{x\in X_1:\ \|x\|<r\}$, with $r\in(\rho_1,\rho_2)$, as a nonempty open convex subset of $X$. For example, choose $r=1.7$.
Using (1.24) and (1.25), we have that for any $x,y,z\in D$
$$\begin{aligned}|[(F'(x)-F'(x_0))y](s)|&=\frac{12}5\left|\int_0^1G(s,t)\bigl(x(t)^2-x_0(t)^2\bigr)y(t)\,dt\right|\\&\le\frac{12}5\int_0^1G(s,t)|x(t)+x_0(t)|\,|x(t)-x_0(t)|\,|y(t)|\,dt\\&\le\frac{12}5(r+1)\int_0^1G(s,t)|x(t)-x_0(t)|\,|y(t)|\,dt,\quad s\in[0,1],\end{aligned}\tag{1.30}$$
and
$$|[F''(x)yz](s)|\le\frac{24}5\int_0^1G(s,t)|x(t)|\,|y(t)|\,|z(t)|\,dt,\quad s\in[0,1].\tag{1.31}$$
Then, we have
$$\|F'(x)-F'(x_0)\|\le\frac{12}5\times\frac18(r+1)\|x-x_0\|=\frac{81}{100}\|x-x_0\|\tag{1.32}$$
and
$$\|F''(x)\|\le\frac{24}5\times\frac r8=\frac{51}{50}.\tag{1.33}$$
Hence, we can choose
$$M=\frac{51}{35},\quad L=\frac{81}{70},\quad a=\frac{51}{490},\quad\gamma_0=\frac{70}{439},\quad\left(L+\frac M2\right)\gamma=\frac{66}{245},\quad 1-L\gamma_0=\frac{358}{439},\tag{1.34}$$
$$b=0.4266517573,\quad R_0=0.2781089940,\quad\beta=0.3218118359,\quad d=0.5138822165.$$
We get
$$\overline B(x_0,R_0)\subset D.\tag{1.35}$$
So, all the conditions in Theorem 1.1 are satisfied and we can assert the convergence of Halley's method (1.2) starting at $x_0$. Moreover, by (1.14) and (1.15) we can set $R^{*}=0.7$, which extends the uniqueness ball from $\overline B(x_0,R_0)$ to $B(x_0,R^{*})$.
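As an informal numerical check of Example 1.3.2, one can discretize (1.21) and solve the resulting finite-dimensional system. The sketch below uses a composite trapezoidal rule and plain Newton steps (both are choices of this sketch, not of the chapter) and confirms that the solution stays well inside $\overline B(x_0,R_0)$.

```python
import numpy as np

m = 101
s = np.linspace(0.0, 1.0, m)
w = np.full(m, 1.0 / (m - 1)); w[0] = w[-1] = 0.5 / (m - 1)  # trapezoid weights
S, T = np.meshgrid(s, s, indexing="ij")
G = np.where(T <= S, T * (1.0 - S), S * (1.0 - T))           # Green's kernel (1.22)

x = np.ones(m)                                               # x0(s) = 1
for _ in range(20):
    Fx = x - 1.0 - 0.8 * G @ (w * x**3)                      # F(x) of (1.23)
    J = np.eye(m) - 2.4 * G * (w * x**2)                     # F'(x) of (1.24)
    x -= np.linalg.solve(J, Fx)

print(np.max(np.abs(x - 1.0)))   # ~0.13, well below R0 ~ 0.278
```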

1.4 Basins of attraction


In this section, we present some basins of attraction related to Halley’s method
applied to different polynomials using the techniques that appear in [6, 7, 9, 10,
14–18].
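All the pictures below follow the same recipe: each point of a grid on $[-10,10]\times[-10,10]$ is taken as a starting point for Halley's method applied to $p(z)=0$, and the point is colored according to the root it converges to (black when it converges to no root). A minimal Python sketch (grid size, iteration count and tolerance are illustrative choices) is the following.

```python
import numpy as np
import matplotlib.pyplot as plt

def halley_basins(p, dp, d2p, roots, box=10.0, n=400, iters=30, tol=1e-6):
    """Color index of the root reached from each grid point (0 = none)."""
    x = np.linspace(-box, box, n)
    X, Y = np.meshgrid(x, x)
    Z = X + 1j * Y
    with np.errstate(all="ignore"):                 # ignore divisions at poles
        for _ in range(iters):
            pz, dpz = p(Z), dp(Z)
            LF = d2p(Z) * pz / (2.0 * dpz**2)
            Z = Z - pz / (dpz * (1.0 - LF))         # Halley step (1.2)
    basin = np.zeros(Z.shape, dtype=int)
    for k, r in enumerate(roots, start=1):
        basin[np.abs(Z - r) < tol] = k
    return basin

# Figure 1.1: p(z) = z^2 - 1, with roots r1 = -1 and r2 = 1
basin = halley_basins(lambda z: z**2 - 1, lambda z: 2 * z,
                      lambda z: 2 + 0 * z, roots=[-1.0, 1.0])
plt.imshow(basin, extent=[-10, 10, -10, 10], origin="lower")
plt.show()
```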

1.4.1 2 roots
Throughout this subsection, cyan is the color associated with convergence to $r_1=-1$ and magenta with convergence to $r_2=1$. Moreover, black marks the points for which there is no convergence to any of the roots. In Figures 1.1–1.3, basins of attraction related to polynomials with two different roots of different multiplicities are shown.
Figure 1.1: Basins of attraction associated to Halley's method applied to $p(z)=z^2-1$.

Figure 1.2: Basins of attraction associated to Halley's method applied to $p(z)=(z^2-1)(z-1)$.

Figure 1.3: Basins of attraction associated to Halley's method applied to $p(z)=(z^2-1)^2$.

1.4.2 3 roots
Throughout this subsection, cyan is the color associated with convergence to $r_1=-1$, magenta with convergence to $r_2=-\frac12-\frac{\sqrt3}{2}i$, and yellow with convergence to $r_3=-\frac12+\frac{\sqrt3}{2}i$. Moreover, black marks the points for which there is no convergence to any of the roots. In Figures 1.4–1.11, basins of attraction related to polynomials with three different roots of different multiplicities are shown.

1.4.3 4 roots
Throughout this subsection, cyan is the color associated with convergence to $r_1=-1$, magenta with convergence to $r_2=1$, yellow with convergence to $r_3=i$ and red with convergence to $r_4=-i$. Moreover, black marks the points for which there is no convergence to any of the roots. In Figures 1.12–1.16, basins of attraction related to polynomials with four different roots of different multiplicities are shown.
Figure 1.4: Basins of attraction associated to Halley's method applied to $p(z)=z^3-1$.

Figure 1.5: Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)(z-1)$.

Figure 1.6: Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)\left(z-\left(-\frac12-\frac{\sqrt3}{2}i\right)\right)$.

Figure 1.7: Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)\left(z-\left(-\frac12+\frac{\sqrt3}{2}i\right)\right)$.

Figure 1.8: Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)\left(z-\left(-\frac12-\frac{\sqrt3}{2}i\right)\right)(z-1)$.

Figure 1.9: Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)\left(z-\left(-\frac12+\frac{\sqrt3}{2}i\right)\right)(z-1)$.

Figure 1.10: Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)(z^2+z+1)$.

Figure 1.11: Basins of attraction associated to Halley's method applied to $p(z)=(z^3-1)^2$.

Figure 1.12: Basins of attraction associated to Halley's method applied to $p(z)=z^4-1$.

Figure 1.13: Basins of attraction associated to Halley's method applied to $p(z)=(z^4-1)(z-1)$.

Figure 1.14: Basins of attraction associated to Halley's method applied to $p(z)=(z^4-1)(z^2-1)$.

Figure 1.15: Basins of attraction associated to Halley's method applied to $p(z)=(z^4-1)(z^2-1)(z-i)$.

Figure 1.16: Basins of attraction associated to Halley's method applied to $p(z)=(z^4-1)^2$.
References

[1] S. Amat, S. Busquier and J. M. Gutiérrez, Geometric constructions of iter-


ative functions to solve nonlinear equations, J. Comput. Appl. Math., 157,
197–205, 2003.
[2] I. K. Argyros, The convergence of Halley-Chebyshev type method under
Newton-Kantorovich hypotheses, Appl. Math. Lett., 6, 71–74, 1993.
[3] I. K. Argyros, Convergence and Application of Newton-type Iterations,
Springer, 2008.
[4] I. K. Argyros and S. Hilout, Numerical methods in Nonlinear Analysis,
World Scientific Publ. Comp. New Jersey, 2013.
[5] I. K. Argyros and H. M. Ren, Ball convergence theorems for Halley’s
method in Banach spaces, J. Appl. Math. Computing, 38, 453–465, 2012.
[6] I. K. Argyros and Á. A. Magreñán, On the convergence of an optimal
fourth-order family of methods and its dynamics, Appl. Math. Comput. 252,
336–346, 2015.
[7] I. K. Argyros and Á. A. Magreñán, Local Convergence and the Dynamics
of a Two-Step Newton-Like Method, Int. J. Bifurcation and Chaos, 26 (5),
1630012 2016.
[8] V. Candela and A. Marquina, Recurrence relations for rational cubic meth-
ods I: The Halley method, Computing, 44, 169–184, 1990.
[9] A. Cordero, J. M. Gutiérrez, Á. A. Magreñán and J. R. Torregrosa, Stability
analysis of a parametric family of iterative methods for solving nonlinear
models, Appl. Math. Comput., 285, 26–40, 2016.


[10] A. Cordero, L. Feng, Á. A. Magreñán and J. R. Torregrosa, A new fourth-


order family for solving nonlinear problems and its dynamics, J. Math.
Chemistry, 53 (3), 893–910, 2015.
[11] P. Deuflhard, Newton Methods for Nonlinear Problems: Affine Invariance
and Adaptive Algorithms, Springer-Verlag, Berlin, Heidelberg, 2004.
[12] J. A. Ezquerro and M. A. Hernández, New Kantorovich-type conditions for
Halley’s method, Appl. Num. Anal. Comp. Math., 2, 70–77, 2005.
[13] A. Fraile, E. Larrodé, Á. A. Magreñán and J. A. Sicilia, Decision model
for siting transport and logistic facilities in urban environments: A method-
ological approach. J. Comput. Appl. Math., 291, 478–487, 2016.
[14] Y. H. Geum, Y. I. Kim and Á. A. Magreñán, A biparametric extension of King's fourth-order methods and their dynamics, Appl. Math. Comput., 282, 254–275, 2016.
[15] T. Lotfi, Á. A. Magreñán, K. Mahdiani and J. J. Rainer, A variant of Steffensen-King's type family with accelerated sixth-order convergence and high efficiency index: dynamic study and approach, Appl. Math. Comput., 252, 347–353, 2015.
[16] Á. A. Magreñán, Different anomalies in a Jarratt family of iterative root-
finding methods, Appl. Math. Comput., 233, 29–38, 2014.
[17] Á. A. Magreñán, A new tool to study real dynamics: The convergence
plane, Appl. Math. Comput., 248, 215–224, 2014.
[18] Á. A. Magreñán and I. K. Argyros, On the local convergence and the dynamics of Chebyshev-Halley methods with six and eight order of convergence, J. Comput. Appl. Math., 298, 236–251, 2016.
[19] B. Royo, J. A. Sicilia, M. J. Oliveros and E. Larrodé, Solving a long-
distance routing problem using ant colony optimization. Appl. Math., 9
(2L), 415–421, 2015.
[20] J. A. Sicilia, C. Quemada, B. Royo and D. Escuín, An optimization algorithm for solving the rich vehicle routing problem based on variable neighborhood search and tabu search metaheuristics, J. Comput. Appl. Math., 291, 468–477, 2016.
[21] J. A. Sicilia, D. Escuín, B. Royo, E. Larrodé and J. Medrano, A hybrid algorithm for solving the general vehicle routing problem in the case of the urban freight distribution, in Computer-based Modelling and Optimization in Transportation (pp. 463–475), Springer International Publishing, 2014.
Chapter 2

Newton’s method for


k-Fréchet differentiable
operators

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingeniería y Tecnología,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2 Semilocal convergence analysis for Newton’s method . . . . . . . . . . . . . . . 27
2.2.1 Uniqueness of solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2.2 Special choices for function g . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.2.2.1 Choice 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.2.2.2 Choice 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3 Numerical examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

2.1 Introduction
We determine a solution $x^{*}$ of the equation
$$F(x)=0,$$
where $X$ and $Y$ are Banach spaces, $D\subseteq X$ is a convex set and $F:D\to Y$ is a Fréchet-differentiable operator. In particular, we expand the applicability of Newton's method, defined by
$$x_{n+1}=x_n-[F'(x_n)]^{-1}F(x_n),\quad\text{for each }n=0,1,2,\ldots,\tag{2.1}$$
by considering weaker sufficient convergence criteria than in earlier studies [13].


We will use the following notation throughout the chapter:

• $L(X,Y)$ denotes the space of bounded linear operators from the Banach space $X$ into the Banach space $Y$;
• $U(x,r)$ denotes the open ball centered at the point $x\in X$ of radius $r>0$;
• $\overline U(x,r)$ denotes the closed ball centered at $x\in X$ of radius $r>0$.

The Newton-Kantorovich theorem [20] requires the following conditions:

(C1) there exists $\Gamma_0=[F'(x_0)]^{-1}\in L(Y,X)$ for some $x_0\in D$, and $\|\Gamma_0F(x_0)\|\le\eta$;
(C2) $\|\Gamma_0F''(x)\|\le l$, for $x\in D$;
(C3) $h=l\eta\le\dfrac12$ and $\overline U\!\left(x_0,\dfrac{1-\sqrt{1-2l\eta}}{l}\right)\subset D$.

The corresponding majorizing sequence $\{u_n\}$ for $\{x_n\}$ is defined by
$$u_0=0,\qquad u_{n+1}=u_n-\frac{p(u_n)}{p'(u_n)},\quad\text{for each }n=0,1,2,\ldots,\tag{2.2}$$
where
$$p(t)=\frac l2t^2-t+\eta.\tag{2.3}$$
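In other words, (2.2) is nothing but Newton's method applied to the scalar polynomial $p$ of (2.3). A small sketch makes the monotone convergence to the smallest zero $\frac{1-\sqrt{1-2l\eta}}{l}$ of $p$ visible (the sample values $l=1$, $\eta=0.4$, for which $h=0.4\le\frac12$, are our own).

```python
def kantorovich_majorizing(l, eta, n_terms=10):
    """The majorizing sequence (2.2) for Newton's method."""
    p  = lambda t: 0.5 * l * t**2 - t + eta      # polynomial (2.3)
    dp = lambda t: l * t - 1.0
    u, seq = 0.0, [0.0]
    for _ in range(n_terms):
        u = u - p(u) / dp(u)
        seq.append(u)
    return seq

print(kantorovich_majorizing(l=1.0, eta=0.4))
# increases monotonically to (1 - (1 - 0.8) ** 0.5) / 1 = 0.5527864...
```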
There exists a plethora of studies in which the authors use Lipschitz-type conditions [1, 2, 4–6, 8–11, 13, 14, 17, 20]. The main idea of these works is to extend the Newton-Kantorovich theorem by relaxing the second condition. In particular, the weakened version of (C2) is given by
$$\|\Gamma_0F''(x)\|\le\omega(\|x\|)\quad\text{for each }x\in D,$$
where $\omega:\mathbb R_+\cup\{0\}\to\mathbb R$ is a non-decreasing continuous real function; conditions of this kind are known as $\omega$-conditions.
Recently, in [13], for $D_0=U(x_0,r_0)$ and $f\in C^2([0,s'])$, a new result was given using the following conditions:

(D1) there exists $\Gamma_0=[F'(x_0)]^{-1}\in L(Y,X)$ for some $x_0\in D$ and $\|\Gamma_0F(x_0)\|\le-\dfrac{f(0)}{f'(0)}$;
(D2) $\|\Gamma_0F''(x)\|\le f''(s)$ if $\|x-x_0\|\le s\le s'$;
(D3) the equation $f(s)=0$ has a solution in $[0,s']$.

The majorizing sequence $\{u_n\}$ for $\{x_n\}$ is defined as in (2.2), with $f$ replacing $p$.
In this chapter, we present:

• a more flexible way of computing upper bounds on the inverses $[F'(x_n)]^{-1}F'(x_0)$, in order to avoid the problems presented by the previous conditions,
• a more precise majorizing sequence,
• a tighter convergence analysis and weaker sufficient semilocal convergence conditions in many interesting cases.

The chapter is organized as follows: in Section 2.2 we present the study of the semilocal convergence of Newton's method. The numerical examples are then presented in the concluding Section 2.3.

2.2 Semilocal convergence analysis for Newton’s method


Conditions (D̃1)–(D̃5) introduced below will be referred to as the (D̃) conditions.

Suppose there exist two scalar, continuously differentiable and non-decreasing functions $f\in C^{(k)}([0,t_1])$ and $g:[0,t_1]\to\mathbb R$, with $t_1>0$, $k\ge3$, that satisfy the following conditions:

(D̃1) there exists $\Gamma_0=[F'(x_0)]^{-1}\in L(Y,X)$ for some $x_0\in D$, with $\|\Gamma_0F(x_0)\|\le-\dfrac{f(0)}{g(0)-1}$ and $\|\Gamma_0F^{(i)}(x_0)\|\le f^{(i)}(0)$, for $i=2,3,\ldots,k-1$ and $k\ge3$;
(D̃2) $\|\Gamma_0F^{(k)}(x)\|\le f^{(k)}(t)$, for $\|x-x_0\|\le t$, $x\in D$ and $t\in[0,t_1]$;
(D̃3) $\|\Gamma_0(F'(x)-F'(x_0))\|\le g(\|x-x_0\|)$, for $\|x-x_0\|\le t$, $x\in D$ and $t\in[0,t_1]$;
(D̃4) $f\in C^{(k)}([0,t_1])$, with $t_1\in\mathbb R$ and $k\ge3$, is such that $f(0)>0$, $f'(0)<0$, $f^{(i)}(0)>0$ for $i=2,3,\ldots,k-1$, and $f^{(k)}(t)>0$.
The key idea is to use condition (D̃3) to find more precise upper bounds on the norms $\|F'(x_n)^{-1}F'(x_0)\|$ than those obtained using (D̃2) [13].
Suppose that $g(t)<1$ and $x\in\overline U(x_0,t)$. Then, using (D̃3), we get that
$$\|\Gamma_0(F'(x)-F'(x_0))\|\le g(\|x-x_0\|)\le g(t)<1.$$
Then, it follows from the Banach lemma on invertible operators [20] and from the preceding estimate that $F'(x)^{-1}\in L(Y,X)$ and
$$\left\|F'(x)^{-1}F'(x_0)\right\|\le\frac{1}{1-g(t)}.$$
However, if (D̃2) is used instead, we obtain
$$\left\|F'(x)^{-1}F'(x_0)\right\|\le-\frac{1}{f'(t)}.$$
Hence, if $g(t)\le1+f'(t)$, the new estimate is at least as tight as the old one, which makes the advantage of this idea easy to see.

Theorem 2.2.1 Let $\{t_n\}$ be the scalar sequence given by
$$t_0=0,\qquad t_n=N(t_{n-1})=t_{n-1}-\frac{f(t_{n-1})}{g(t_{n-1})-1},\quad n\in\mathbb N,\tag{2.4}$$
where $f\in C^{(1)}([0,t_1])$ and $t_1\in\mathbb R$.

Suppose that there exists a solution $\beta\in(0,t_1)$ of $f'(t)=0$ such that $f(\beta)\le0$, and that $g$ is continuously differentiable and non-decreasing on $[0,t_1]$. Moreover, suppose that

(D̃5)
$$g(t)<1\quad\text{for each }t\in[0,\beta]\tag{2.5}$$
and
$$(1-g(t))^2+f'(t)(1-g(t))+g'(t)f(t)\ge0\quad\text{for each }t\in[t_1,\beta].\tag{2.6}$$

Then, (2.4) is a non-decreasing sequence that converges to $t^{*}$, the smallest positive zero of $f$.

Proof: As $f(0)>0$, we have $t^{*}\ge0$. Using the well-known mean value theorem, we get that
$$t_1-t^{*}=N(0)-N(t^{*})=N'(\theta_0)(t_0-t^{*})\quad\text{with }\theta_0\in(0,t^{*}),$$
that is, $t_1<t^{*}$, since $N'(t)>0$ on the interval $[\theta_0,t^{*})$ by (2.6). Moreover, by the definition of the sequence, we obtain that
$$t_1=-\frac{f(0)}{g(0)-1}>0,$$
since $g(0)<1$ (by (2.5)) and $f(0)>0$.
Finally, using mathematical induction on $n$, we obtain $t_n<t^{*}$ and $t_n-t_{n-1}\ge0$, since $(t_{n-1},t^{*})\subset(0,t^{*})$. As a consequence, we find that the sequence (2.4) converges to some $r\in[0,t^{*}]$. Furthermore, as $t^{*}$ is the unique root of $f(t)=0$ in $[0,t^{*}]$, it clearly follows that $r=t^{*}$. □
The next step is to prove that the sequence $\{t_n\}$ is a majorizing sequence for Newton's method (2.1). We present this fact in the following result.

Lemma 2.1
Suppose that the (D̃) conditions are satisfied, that $x_n\in D$ for all $n\ge0$, and that $f(\beta)\le0$, where $\beta$ is a solution of the equation $f'(t)=0$ in $(0,t_1)$; let $\{x_n\}$ be the sequence generated by Newton's method (2.1). Moreover, denote $\Gamma_n=[F'(x_n)]^{-1}$. Then, for all $n\ge1$, we have:

(i$_n$) $\|\Gamma_nF'(x_0)\|\le-\dfrac{1}{g(t_n)-1}$,
(ii$_n$) $\|\Gamma_0F^{(i)}(x_n)\|\le f^{(i)}(t_n)$, for $i=2,3,\ldots,k-1$,
(iii$_n$) $\|\Gamma_0F(x_n)\|\le f(t_n)$,
(iv$_n$) $\|x_{n+1}-x_n\|\le t_{n+1}-t_n$.

Proof: We again use mathematical induction on $n$ to prove this result. The case $n=1$ is clear, so we omit it. We suppose that (i$_n$)–(iv$_n$) are true for $n=1,2,\ldots,m-1$, and we prove them for $n=m$.
From the induction hypothesis (iv$_{m-1}$) we obtain that
$$\|x_m-x_0\|\le\|x_m-x_{m-1}\|+\|x_{m-1}-x_{m-2}\|+\cdots+\|x_1-x_0\|\le(t_m-t_{m-1})+(t_{m-1}-t_{m-2})+\cdots+(t_1-t_0)=t_m-t_0=t_m,$$
since $t_0=0$.

(i$_m$) Using (D̃3) and (2.5) we get that
$$\|\Gamma_0(F'(x_m)-F'(x_0))\|\le g(\|x_m-x_0\|)\le g(t_m)<1.\tag{2.7}$$
It follows from the Banach lemma on invertible operators [7, 10, 20] and from (2.7) that $\Gamma_m$ exists and $\|\Gamma_mF'(x_0)\|\le-\dfrac{1}{g(t_m)-1}$.

(ii$_m$) For any $i\in\{2,3,\ldots,k-1\}$, Taylor's formula with $x=x_{m-1}+\tau(x_m-x_{m-1})$ gives
$$\begin{aligned}F^{(i)}(x_m)&=F^{(i)}(x_{m-1})+F^{(i+1)}(x_{m-1})(x_m-x_{m-1})+\frac1{2!}F^{(i+2)}(x_{m-1})(x_m-x_{m-1})^2+\cdots\\&\quad+\frac1{(k-1-i)!}F^{(k-1)}(x_{m-1})(x_m-x_{m-1})^{k-1-i}+\frac1{(k-1-i)!}\int_{x_{m-1}}^{x_m}F^{(k)}(x)(x_m-x)^{k-1-i}\,dx.\end{aligned}$$
Moreover, as $\|x-x_0\|\le t$ and $\|x_m-x_{m-1}\|\le t_m-t_{m-1}$, it follows that
$$\begin{aligned}\|\Gamma_0F^{(i)}(x_m)\|&\le\sum_{j=1}^{k-i}\frac1{(j-1)!}\|\Gamma_0F^{(j+i-1)}(x_{m-1})\|\,\|x_m-x_{m-1}\|^{j-1}\\&\quad+\frac1{(k-1-i)!}\int_0^1\left\|\Gamma_0F^{(k)}\bigl(x_{m-1}+\tau(x_m-x_{m-1})\bigr)\right\|(1-\tau)^{k-2}\|x_m-x_{m-1}\|^{k-i}\,d\tau\\&\le\sum_{j=1}^{k-i}\frac1{(j-1)!}f^{(j+i-1)}(t_{m-1})(t_m-t_{m-1})^{j-1}+\frac1{(k-1-i)!}\int_{t_{m-1}}^{t_m}f^{(k)}(t)(t_m-t)^{k-i-1}\,dt=f^{(i)}(t_m).\end{aligned}$$

(iii$_m$) From Taylor's formula we have
$$F(x_m)=\sum_{j=2}^{k-1}\frac1{j!}F^{(j)}(x_{m-1})(x_m-x_{m-1})^j+\frac1{(k-1)!}\int_0^1F^{(k)}\bigl(x_{m-1}+t(x_m-x_{m-1})\bigr)(1-t)^{k-1}(x_m-x_{m-1})^k\,dt,$$
since $F(x_{m-1})+F'(x_{m-1})(x_m-x_{m-1})=0$. As a consequence, we get
$$\begin{aligned}\|\Gamma_0F(x_m)\|&\le\sum_{j=2}^{k-1}\frac1{j!}f^{(j)}(t_{m-1})(t_m-t_{m-1})^j+\frac1{(k-1)!}\int_0^1f^{(k)}\bigl(t_{m-1}+t(t_m-t_{m-1})\bigr)(1-t)^{k-1}(t_m-t_{m-1})^k\,dt\\&=f(t_m)-f(t_{m-1})-f'(t_{m-1})(t_m-t_{m-1})\le f(t_m),\end{aligned}$$
since $f(s)+f'(s)(t-s)\ge0$ for each $s,t\in[0,t_1]$ with $s\le t$. Indeed, let $\varphi(u)=f(u)+f'(u)(t-u)$ for each fixed $t$ and $u\in[s,t]$. Then $\varphi'(u)=f''(u)(t-u)\ge0$, since $f''(u)\ge0$ and $t\ge u$. Hence, the function $\varphi$ is increasing for $u\in[s,t]$ and $\varphi(u)>0$, since $\varphi(s)=f(s)>0$ (see also Lemma 2.2).

(iv$_m$) $\|x_{m+1}-x_m\|\le\|\Gamma_mF'(x_0)\|\,\|\Gamma_0F(x_m)\|\le-\dfrac{f(t_m)}{g(t_m)-1}=t_{m+1}-t_m$.

The proof is now complete. □
Then, the main result for Newton’s method (2.1) is presented.

Theorem 2.2.2 Let $X$ and $Y$ be two Banach spaces and let $F:D\subseteq X\to Y$ be a nonlinear, $k$ ($k\ge3$) times continuously differentiable operator on a non-empty open convex domain $D$ of the Banach space $X$; let $f\in C^{(k)}([0,t_1])$, with $t_1\in\mathbb R$, and $g:[0,t_1]\to\mathbb R$ be continuous and non-decreasing functions. Suppose that the (D̃) conditions hold, that there exists $\beta\in(0,t_1)$ such that $f'(\beta)=0$ and $f(\beta)\le0$, and that $\overline U(x_0,t^{*})\subseteq D$. Then, Newton's sequence $\{x_n\}$, given by (2.1) and starting at $x_0$, converges to a solution $x^{*}$ of $F(x)=0$. Moreover, $x_n,x^{*}\in\overline U(x_0,t^{*})$ and
$$\|x^{*}-x_n\|\le t^{*}-t_n,\quad\text{for all }n=0,1,2,\ldots,$$
where $\{t_n\}$ is defined in (2.4).

Proof: From (D̃1) and (D̃2) we get that $x_1$ is well defined and $\|x_1-x_0\|=t_1<t^{*}$, so that $x_1\in\overline U(x_0,t^{*})\subset D$. We now suppose that $x_n$ is well defined and $x_n\in\overline U(x_0,t^{*})$ for $n=1,2,\ldots,m-1$. It follows from Lemma 2.1 that the operator $\Gamma_{m-1}$ exists and $\|\Gamma_{m-1}F'(x_0)\|\le-\dfrac{1}{g(t_{m-1})-1}$. In addition, $x_m$ is well defined. Moreover, since $\|x_{n+1}-x_n\|\le t_{n+1}-t_n$ for $n=1,2,\ldots,m-1$, we have
$$\|x_m-x_0\|\le\sum_{i=0}^{m-1}\|x_{i+1}-x_i\|\le\sum_{i=0}^{m-1}(t_{i+1}-t_i)=t_m<t^{*},$$
so that $x_m\in\overline U(x_0,t^{*})$. Therefore, Newton's sequence $\{x_n\}$ is well defined and $x_n\in\overline U(x_0,t^{*})$ for all $n\ge0$. By the last lemma, we also have that $\|\Gamma_0F(x_n)\|\le f(t_n)$ and
$$\|x_{n+1}-x_n\|\le\|\Gamma_nF'(x_0)\|\,\|\Gamma_0F(x_n)\|\le-\frac{f(t_n)}{g(t_n)-1}=t_{n+1}-t_n,$$
for all $n\ge0$, so that $\{t_n\}$ is a majorizing sequence for $\{x_n\}$, which is therefore convergent. Moreover, as $\lim_{n\to+\infty}t_n=t^{*}$, if $x^{*}=\lim_{n\to+\infty}x_n$, then $\|x^{*}-x_n\|\le t^{*}-t_n$ for all $n=0,1,2,\ldots$ Furthermore, as $\|\Gamma_0F(x_n)\|\le f(t_n)$ for all $n=0,1,2,\ldots$, by letting $n\to+\infty$ it follows that $\Gamma_0F(x^{*})=0$. Finally, by the continuity of $F$, we get that $F(x^{*})=0$, since $\Gamma_0\in L(Y,X)$. □

2.2.1 Uniqueness of solution


In order to prove the uniqueness part we need the following auxiliary result, whose proof can be found in [15] (Lemma 2.10).

Lemma 2.2
Suppose that the conditions of Theorem 2.2.2 hold. Then, the scalar function $f(t)$ has two real zeros $t^{*}$ and $t^{**}$ such that $0<t^{*}\le t^{**}$, and if $x\in\overline U(x_0,t^{**})\cap D$, then
$$\|\Gamma_0F''(x)\|\le f''(t),\quad\text{for }\|x-x_0\|\le t.$$

On the other hand, if $f(t)$ has two real zeros $t^{*}$ and $t^{**}$ such that $0<t^{*}\le t^{**}$, then the uniqueness of the solution follows from the next result.

Theorem 2.2.3 Under the conditions of Theorem 2.2.2, the scalar function $f(t)$ has two real zeros $t^{*}$ and $t^{**}$ such that $0<t^{*}\le t^{**}$. Let
$$\mu^{**}=\frac{f''(t^{**})\,t^{**}}{2(1-g(t^{**}))}.$$
Suppose that $\mu^{**}\in(0,1)$. Then, the solution $x^{*}$ is unique in $\overline U(x_0,t^{**})\cap D$ if $t^{*}<t^{**}$, or in $\overline U(x_0,t^{*})$ if $t^{**}=t^{*}$.

Proof: Suppose that $t^{*}<t^{**}$ and that $y^{*}$ is a solution of $F(x)=0$ in $\overline U(x_0,t^{**})\cap D$. We get
$$y^{*}-x_{n+1}=-\Gamma_n\left(F(y^{*})-F(x_n)-F'(x_n)(y^{*}-x_n)\right)=-\Gamma_n\int_0^1F''\bigl(x_n+\tau(y^{*}-x_n)\bigr)(1-\tau)(y^{*}-x_n)^2\,d\tau.$$
As $\|x_n+\tau(y^{*}-x_n)-x_0\|\le t_n+\tau(t^{**}-t_n)\le t^{**}$, it follows from Lemma 2.1 that
$$\|y^{*}-x_{n+1}\|\le-\frac{\int_0^1f''\bigl(t_n+\tau(t^{**}-t_n)\bigr)(1-\tau)\,d\tau}{g(t_n)-1}\,\|y^{*}-x_n\|^2\le\mu^{**}\|y^{*}-x_n\|^2<\|y^{*}-x_n\|.$$
Therefore, $\lim_{n\to+\infty}x_n=y^{*}$; that is, $x^{*}=y^{*}$.
If $t^{**}=t^{*}$ and $y^{*}$ is a solution of $F(x)=0$ in $\overline U(x_0,t^{**})$, then similarly
$$\|y^{*}-x_{n+1}\|\le\mu^{**}\|y^{*}-x_n\|^2<\|y^{*}-x_n\|.$$
Then, the proof is complete. □


Another uniqueness result, using only the function $g$, is presented next.

Theorem 2.2.4 Under the conditions of Theorem 2.2.3, if the scalar function $f(t)$ has two real zeros $t^{*}$ and $t^{**}$ such that $0<t^{*}\le t^{**}$ and there exists $r_0\ge t^{**}$ such that
$$\int_0^1g((1-\tau)r_0+\tau t)\,d\tau<1\quad\text{if }t=t^{**}$$
or
$$\int_0^1g((1-\tau)r_0+\tau t)\,d\tau\le1\quad\text{if }t=t^{*},$$
then the solution $x^{*}$ is unique in $\overline U(x_0,r_0)\cap D$ if $t=t^{**}$, or in $U(x_0,r_0)$ if $r_0=t=t^{*}$.

Proof: Let $y^{*}\in\overline U(x_0,r_0)$ be a solution of the equation $F(x)=0$, and let $A=\int_0^1F'(y^{*}+\tau(x^{*}-y^{*}))\,d\tau$. Using (D̃3) in the first case, we obtain in turn
$$\|\Gamma_0(A-F'(x_0))\|\le\int_0^1g(\|y^{*}+\tau(x^{*}-y^{*})-x_0\|)\,d\tau\le\int_0^1g((1-\tau)\|y^{*}-x_0\|+\tau\|x^{*}-x_0\|)\,d\tau\le\int_0^1g((1-\tau)r_0+\tau t)\,d\tau<1.$$
Moreover, if $r_0=t=t^{*}$ and $x\in U(x_0,r_0)$, we have that
$$\|\Gamma_0(A-F'(x_0))\|<\int_0^1g((1-\tau)r_0+\tau t)\,d\tau\le1.$$
In either case, it follows from the Banach lemma on invertible operators that $A^{-1}\in L(Y,X)$. Using the identity
$$0=F(x^{*})-F(y^{*})=A(x^{*}-y^{*}),$$
we deduce that $x^{*}=y^{*}$. □

2.2.2 Special choices for function g


In this section we present two choices for function g, whereas function f is as
previously determined in [15]. We also provide conditions on f and g that guar-
antee tighter majorizing sequences.

2.2.2.1 Choice 1
We choose $g(t)=1+f'(t)$. In this case (D̃3) can be dropped, since only (D̃2) is needed for the derivation of the upper bounds on $\|\Gamma_mF'(x_0)\|$. Moreover, our scalar sequence $\{t_n\}$ specializes to the sequence $\{v_n\}$ considered in [15] and defined by
$$v_0=0,\qquad v_n=N(v_{n-1}):=v_{n-1}-\frac{f(v_{n-1})}{f'(v_{n-1})}.\tag{2.8}$$
Indeed, we have in turn from Taylor's formula that
$$I-\Gamma_{m-1}F'(x_m)=I-\Gamma_{m-1}\left(\sum_{i=1}^{k-1}\frac1{(i-1)!}F^{(i)}(x_{m-1})(x_m-x_{m-1})^{i-1}+\frac1{(k-2)!}\int_{x_{m-1}}^{x_m}F^{(k)}(x)(x_m-x)^{k-2}\,dx\right).$$
If conditions (D̃2) are satisfied, denote $v=v_{m-1}+\tau(v_m-v_{m-1})$ with $\tau\in[0,1]$; then
$$\|x-x_0\|\le\tau\|x_m-x_{m-1}\|+\|x_{m-1}-x_{m-2}\|+\cdots+\|x_1-x_0\|\le\tau(v_m-v_{m-1})+(v_{m-1}-v_{m-2})+\cdots+(v_1-v_0)=v.$$
Consequently, we have in turn
$$\begin{aligned}\|I-\Gamma_{m-1}F'(x_m)\|&\le\|\Gamma_{m-1}\|\Biggl(\sum_{i=2}^{k-1}\frac1{(i-1)!}\|F^{(i)}(x_{m-1})\|\,\|x_m-x_{m-1}\|^{i-1}\\&\quad+\frac1{(k-2)!}\int_0^1\left\|F^{(k)}\bigl(x_{m-1}+\tau(x_m-x_{m-1})\bigr)\right\|(1-\tau)^{k-2}\|x_m-x_{m-1}\|^{k-1}\,d\tau\Biggr)\\&=-\frac1{f'(v_{m-1})}\left(\sum_{i=2}^{k-1}\frac1{(i-1)!}f^{(i)}(v_{m-1})(v_m-v_{m-1})^{i-1}+\frac1{(k-2)!}\int_{v_{m-1}}^{v_m}f^{(k)}(v)(v_m-v)^{k-2}\,dv\right)\\&=1-\frac{f'(v_m)}{f'(v_{m-1})}<1.\end{aligned}$$
Next we justify this choice of the function $g$ and then provide a comparison between the sequences $\{t_n\}$ and $\{v_n\}$. It follows from the Banach lemma that
$$\|\Gamma_mF'(x_0)\|\le-\frac1{f'(v_m)}.$$
Notice also that (2.5) and (2.6) are satisfied and $N'(t)\ge0$. Hence, we arrive at the results obtained in [15], and the choice $g(t)=1+f'(t)$ is justified under the (D̃) conditions. However, if
$$f'(t)+1\ge g(t)\quad\text{for each }t\in[0,t_1]\tag{2.9}$$
and
$$f(s)(1-g(t))+f(t)f'(s)\le0\quad\text{for each }s,t\in[0,t_1]\text{ with }s\le t\tag{2.10}$$
hold, then an inductive argument shows that
$$t_n\le v_n\quad\text{for each }n=1,2,\ldots\tag{2.11}$$
and
$$t_{n+1}-t_n\le v_{n+1}-v_n\quad\text{for each }n=1,2,\ldots.\tag{2.12}$$
Indeed, using (2.9) we have that
$$t_1=-\frac{f(0)}{g(0)-1}\le-\frac{f(0)}{f'(0)}=v_1\ \Longrightarrow\ t_1\le v_1,$$
$$t_2=N(t_1)\le N(v_1)=v_1-\frac{f(v_1)}{g(v_1)-1}\le v_1-\frac{f(v_1)}{f'(v_1)}=v_2\ \Longrightarrow\ t_2\le v_2$$
(since $N$ is an increasing function), and by (2.10)
$$t_2-t_1=-\frac{f(t_1)}{g(t_1)-1}\le-\frac{f(v_1)}{f'(v_1)}=v_2-v_1\ \Longrightarrow\ t_2-t_1\le v_2-v_1.$$
Suppose that $t_m\le v_m$ and $t_m-t_{m-1}\le v_m-v_{m-1}$ for each $m=1,2,\ldots,n$. Then, in an analogous way we show that (2.11) and (2.12) hold for $m=n+1$; the induction is complete. Notice that (2.5), (2.6), (2.9) and (2.10) hold for $g(t)=f'(t)+1$. Hence, the class of functions $g$ that satisfy (2.5), (2.6), (2.9) and (2.10) is non-empty. Moreover, if strict inequality holds in (2.9) or (2.10), then so does it in (2.11) and (2.12).

2.2.2.2 Choice 2
Define function g as follows:


k−1 (i)  t (k)
f (0) i−1 f (u)
g(t) = t + (t − u)k−2 du.
i=2
(i − 1)! 0 (k − 2)!

Let us also define iteration {sn} as {tn} but using function g as above. Then, as
in Choice 1 we have in turn that

    I − Γ0 F'(xm) = I − Γ0 [ Σ_{i=1}^{k−1} (1/(i−1)!) F^(i)(x0)(xm − x0)^{i−1}
        + (1/(k−2)!) ∫_{x0}^{xm} F^(k)(x)(xm − x)^{k−2} dx ].

Then, for s = s0 + τ(sm − s0), τ ∈ [0, 1],

    ‖I − Γ0 F'(xm)‖ ≤ ‖Γ0‖ Σ_{i=2}^{k−1} (1/(i−1)!) ‖F^(i)(x0)‖ ‖xm − x0‖^{i−1}
        + (1/(k−2)!) ∫₀¹ ‖Γ0‖ ‖F^(k)(x0 + τ(xm − x0))‖ (1 − τ)^{k−2} ‖xm − x0‖^{k−1} dτ
      = Σ_{i=2}^{k−1} (1/(i−1)!) f^(i)(0)(sm − s0)^{i−1}
        + (1/(k−2)!) ∫₀^{sm} f^(k)(s)(sm − s)^{k−2} ds
      = g(sm) < 1.

Then, we have again that

    ‖Γm F'(x0)‖ ≤ −1/(g(sm) − 1).

In this case, since f^(i)(0) ≤ f^(i)(t) for each i = 2, 3, . . . , k − 1 and t ∈ [0, t1], we
have that (2.5) is satisfied. Then again, if (2.9) and (2.10) hold we have, as in
Choice 2.2.2.1, that

sn ≤ vn (2.13)

and
sn+1 − sn ≤ vn+1 − vn (2.14)
for each n = 1, 2, 3, . . .
Then, again, (2.13) and (2.14) hold as strict inequalities if (2.9) or (2.10) hold
as strict inequalities. Other choices of function g are possible, such as any
polynomial of degree m ∈ N satisfying (D̃3), (2.5), (2.6) (preferably as strict
inequalities), (2.9) and (2.10). This way sequence {tn} will be tighter than {vn},
as was already shown in Choices 2.2.2.1 and 2.2.2.2.

Remark 2.2.5 It follows from the proof of Lemma 2.1 and Theorem 2.2.2 that
the scalar sequence {rn} defined for each n = 0, 1, 2, . . . by

    r0 = 0,  r1 = −f(0)/(g(0) − 1),
    rn+2 = rn+1 − βn+1 (rn+1 − rn)²,

where

    βn+1 = β̄n+1/(g(rn+1) − 1),

    β̄n+1 = Σ_{j=2}^{k−1} (f^(j)(rn)/j!)(rn+1 − rn)^{j−2}
           + (1/(k−1)!) ∫₀¹ f^(k)(rn + t(rn+1 − rn))(1 − t)^{k−1} (rn+1 − rn)^{k−2} dt,

is the tightest majorizing sequence for {xn} (among the ones mentioned above in
this chapter), which obviously converges under the hypotheses of Theorem 2.2.2.
These hypotheses depend on function f having a positive zero. However, this may
not be the case (see also the numerical examples). Then, one can simply study
the convergence of {rn} directly, hoping to obtain weaker sufficient convergence
conditions. For example, if

    0 ≤ −βn+1 (rn+1 − rn) ≤ γ < 1                                     (2.15)

for each n = 0, 1, 2, . . . and some γ ∈ [0, 1), then we obtain that sequence {rn}
is non-decreasing and converges to some r∗ ≥ r1. Hypothesis (2.15) can then
replace the existence of the zero β of equation f(t) = 0. Note that we have pro-
vided sufficient convergence conditions which on the one hand imply (2.15), do
not necessarily imply the existence of β, and can be weaker [4, 5, 10]. For simplic-
ity we will only present the case when k = 2. Then, we have for g(t) = l0 t and
f(t) = p(t) = (l/2)t² − t + η that the scalar sequence {un} defined by

    u0 = 0,  u1 = η,  u2 = u1 − l0 (u1 − u0)²/(2(l0 u1 − 1)),
    un+1 = un − l (un − un−1)²/(2(l0 un − 1)),  for each n = 2, 3, . . .     (2.16)

is also a majorizing sequence for {xn} [10]. Note that

    l0 ≤ l

holds in general, and l/l0 can be arbitrarily large [6, 9, 11]. Then, {un} converges
provided that

    h1 = Lη ≤ 1/2,                                                    (2.17)

where

    L = (1/8)(4l0 + √(l0 l + 8l0²) + √(l0 l)).

Note that

    h ≤ 1/2 ⇒ h1 ≤ 1/2

but not necessarily vice versa, unless l0 = l; moreover, h1/h → 0 as l0/l → 0.
The sequence {ūn} used in [13] (see also (2.2)) is given by

    ū0 = 0,
    ūn = ūn−1 − f(ūn−1)/f'(ūn−1),  for each n = 1, 2, . . .

It can then easily be seen that (2.5) and (2.6) are satisfied if 2l0 ≤ l.

We also have that

    g(t) ≤ 1 + f'(t) for each t ≥ 0.                                  (2.18)

Moreover, strict inequality holds in (2.18) if l0 < l.
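
The following small C++ sketch compares {un} from (2.16) with the Kantorovich
sequence {ūn}, i.e., Newton's method applied to p(t) = (l/2)t² − t + η. The values
of l0, l and η below are illustrative placeholders (with 2l0 ≤ l), not data from
the text.

```cpp
#include <cstdio>

// Minimal sketch: {u_n} from (2.16) versus the Kantorovich sequence {ubar_n}
// (Newton's method for p(t) = (l/2)t^2 - t + eta). Sample values only.
int main() {
    const double l0 = 1.0, l = 2.0, eta = 0.2;
    double up = 0.0, u = eta;                // u_0 = 0, u_1 = eta
    double ub = eta;                         // ubar_1 = eta (ubar_0 = 0)
    for (int n = 2; n <= 6; ++n) {
        double c = (n == 2) ? l0 : l;        // (2.16) uses l0 once, then l
        double un = u - c*(u - up)*(u - up)/(2.0*(l0*u - 1.0));
        up = u; u = un;
        ub -= (0.5*l*ub*ub - ub + eta)/(l*ub - 1.0);   // Newton step for p
        std::printf("n=%d  u_n=%.12f  ubar_n=%.12f\n", n, u, ub);
    }
    return 0;
}
```

With these sample values both sequences converge, and {un} stays below {ūn},
illustrating the tighter error bounds discussed above.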



2.3 Numerical examples

Example 2.3.1 Let X = Y = R, x0 = 1 and D = U(1, 1 − ξ) for ξ ∈ (0, 1/2).
Define the scalar function F on D by

    F(x) = x³ − ξ.                                                    (2.19)

Then, we get

    η = (1/3)(1 − ξ),  l = 2(2 − ξ)  and  l0 = 3 − ξ.

Then, the Kantorovich hypothesis is not satisfied:

    h = lη = (2/3)(2 − ξ)(1 − ξ) > 1/2  for all ξ ∈ (0, 1/2).

Notice also that f(t) = p(t) = (2 − ξ)t² − t + (1/3)(1 − ξ), g(t) = l0 t = (3 − ξ)t,
and that function f has no real zeros. Hence, there is no guarantee that Newton's
method starting from x0 = 1 converges to x∗ = ∛ξ. However, the new conditions
are satisfied for ξ ∈ (0.4271907643 . . . , 1/2). Notice also that γ used in [13] is
given by [10, 11]

    γ = 2l/(l + √(l² + 8l0 l)).

In order for us to compare the error bounds, let ξ = 0.6. Then, h < 1/2 and
h1 < 1/2. Then, we can construct sequences {un} and {ūn} using (2.2) and (2.8),
respectively. Using Theorem 2.2.2 we see that these sequences converge. Be-
sides, we obtain the a priori error bounds for these sequences, and we show that
un < ūn and |un+1 − un| < |ūn+1 − ūn|, as can be seen in Table 2.1. In addition,
u∗ = 0.166997 . . . < ū∗ = 0.177385 . . . .

Table 2.1: Sequences un and ūn and a priori error bounds.

    Iteration   un              ūn              |un+1 − un|             |ūn+1 − ūn|
    1           0.133333 . . .   0.133333 . . .   0.002278 . . .           0.039716 . . .
    2           0.164706 . . .   0.173050 . . .   0.000012 . . .           0.004284 . . .
    3           0.166985 . . .   0.177334 . . .   3.4382 . . . × 10⁻¹⁰     0.000051 . . .
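
The threshold 0.4271907643 . . . quoted above can be located numerically. A short
C++ sketch comparing, for F(x) = x³ − ξ, the Kantorovich condition h ≤ 1/2
with the weaker condition (2.17) while scanning ξ:

```cpp
#include <cmath>
#include <cstdio>

// For F(x) = x^3 - xi: eta = (1 - xi)/3, l = 2(2 - xi), l0 = 3 - xi,
// h = l*eta (Kantorovich) and h1 = L*eta with L as in (2.17).
int main() {
    for (double xi = 0.40; xi < 0.505; xi += 0.01) {
        double eta = (1.0 - xi)/3.0, l = 2.0*(2.0 - xi), l0 = 3.0 - xi;
        double h  = l*eta;
        double L  = (4.0*l0 + std::sqrt(l0*l + 8.0*l0*l0) + std::sqrt(l0*l))/8.0;
        double h1 = L*eta;
        std::printf("xi=%.2f  h=%.4f (%s)  h1=%.4f (%s)\n",
                    xi, h,  h  <= 0.5 ? "ok" : "fails",
                        h1, h1 <= 0.5 ? "ok" : "fails");
    }
    return 0;
}
```

The output shows h > 1/2 throughout (0, 1/2), while h1 ≤ 1/2 from roughly
ξ ≈ 0.427 on, matching the interval stated in the example.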

Example 2.3.2 In this example we provide an application of our results to a
special nonlinear Hammerstein integral equation of the second kind. In this
example we will use the max-norm. Consider the equation

    x(s) = 1/2 + ∫₀¹ G(s, t) x(t)³ dt,                                (2.20)

where x ∈ C([0, 1]), t ∈ [0, 1] and the kernel G is the Green's function

    G(s, t) = (1 − s)t,  t ≤ s,
              s(1 − t),  s ≤ t.                                       (2.21)

Solving (2.20) is equivalent to solving F(x) = 0, where F : Ω ⊆ C([0, 1]) →
C([0, 1]),

    [F(x)](s) = x(s) − 1/2 − ∫₀¹ G(s, t) x(t)³ dt,  s ∈ [0, 1],

and Ω is a suitable non-empty open convex domain. Moreover,

    [F'(x)y](s) = y(s) − 3 ∫₀¹ G(s, t) x(t)² y(t) dt,
    [F''(x)(yz)](s) = −6 ∫₀¹ G(s, t) x(t) z(t) y(t) dt,
    [F'''(x)(yzw)](s) = −6 ∫₀¹ G(s, t) w(t) z(t) y(t) dt.

Observe that F''(x) is not bounded in a general domain Ω. Taking into
account that a solution x∗(s) of (2.20) in C([0, 1]) must satisfy

    ‖x∗‖ − 1/2 − (1/8)‖x∗‖³ ≤ 0,

it follows that ‖x∗‖ ≤ σ1 = 0.5173 . . . or ‖x∗‖ ≥ σ2 = 2.5340 . . . , where σ1 and
σ2 are the two real positive roots of t³/8 − t + 1/2 = 0. Thus, from Kantorovich's
theory, we can only approximate by Newton's method the one solution x∗(s)
which satisfies ‖x∗‖ ∈ [0, σ1], since we can consider Ω = U(0, σ), with σ ∈ (σ1, σ2),
where F''(x) is bounded, and take x0 ∈ U(0, σ) as starting point. We see in
this example that we improve the a priori error bounds obtained by Kantorovich.
We study the case where Ω = U(0, σ) with σ ∈ (σ1, σ2). For example, we
take σ = 2 and choose, as is usually done [16], the starting point x0(s) = u(s) =
1/2. Since ‖I − F'(x0)‖ ≤ 3/32 < 1, the operator Γ0 is well defined and ‖Γ0 F(x0)‖ ≤
1/58 = η. Moreover, ‖Γ0 F''(x)‖ ≤ 48/29 = l and lη = 24/841 = 0.0285 . . . ≤ 1/2.
Hence, Kantorovich's theory can be applied, so that p(s) = (24/29)s² − s + 1/58,

s∗ = 0.0174 . . . , and U(x0, s∗) ⊆ Ω = U(0, σ). Consequently, Newton's method
can be applied to approximate a solution x∗(s) in Ω = U(0, σ).
For Theorem 2.2.1, if we choose t0 = 0, we obtain ‖F''(x0)‖ ≤ 3/8, ‖F'''(x)‖ ≤
3/4, f(t) = (4/29)t³ + (6/29)t² − t + 1/58, β = 1.1329 . . . and f(q) = −0.6495 . . . ≤ 0,
so that the conditions of Theorem 2.2.1 hold. On the other hand, we can take
the function g(t) = (12/29)(1 + t)t and l0 = 36/29. We construct the sequences
{ūn}, {tn} = {vn} and {un} by (2.2), (2.4), (2.8) and (2.16), respectively. Once
again, these sequences converge because the conditions of Theorem 2.2.2 hold.
After that, we apply Newton's method for approximating a solution with the
features mentioned above. For this, we use a discretization process. Thus, we
approximate the integral of (2.20) by the following Gauss-Legendre quadrature
formula with 8 nodes:

    ∫₀¹ φ(t) dt ≈ Σ_{j=1}^{8} ωj φ(tj),

where the nodes tj and the weights ωj are known. Moreover, we denote x(ti) by
xi, i = 1, 2, . . . , 8, so that equation (2.20) is now transformed into the following
system of nonlinear equations:

    xi = 1/2 + Σ_{j=1}^{8} aij xj³,

where

    aij = ωj tj (1 − ti)  if j ≤ i,
          ωj ti (1 − tj)  if j > i.

Then, we write the above system in the following matrix form:

    F(x) = x − v − Aw = 0,                                            (2.22)

where x = (x1, x2, . . . , x8)ᵀ, v = (1/2, 1/2, . . . , 1/2)ᵀ, A = (aij)_{i,j=1}^{8} and
w = (x1³, x2³, . . . , x8³)ᵀ. Besides,

    F'(x) = I − 3A diag{x1², x2², . . . , x8²}.

Since x0(s) = 1/2 has been chosen as starting point for the theoretical study, a
reasonable choice of initial approximation for Newton's method seems to be the
vector x0 = (1/2, 1/2, . . . , 1/2)ᵀ.
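
Since the discretized problem is only 8-dimensional, it can be solved directly.
The following self-contained C++ sketch assembles A from the standard 8-point
Gauss-Legendre nodes and weights mapped to [0, 1] and runs Newton's method
from x0 = (1/2, . . . , 1/2)ᵀ; the dense elimination solver and the printed step
norms are implementation choices, not taken from the text.

```cpp
#include <cmath>
#include <cstdio>
#include <utility>

const int N = 8;
// Standard 8-point Gauss-Legendre nodes/weights on [-1, 1] (positive half)
const double xg[4] = {0.1834346424956498, 0.5255324099163290,
                      0.7966664774136267, 0.9602898564975363};
const double wg[4] = {0.3626837833783620, 0.3137066458778873,
                      0.2223810344533745, 0.1012285362903763};

// Gaussian elimination with partial pivoting: solves M s = b, result in b
void solve(double M[N][N], double b[N]) {
    for (int k = 0; k < N; ++k) {
        int p = k;
        for (int i = k + 1; i < N; ++i)
            if (std::fabs(M[i][k]) > std::fabs(M[p][k])) p = i;
        for (int j = 0; j < N; ++j) std::swap(M[k][j], M[p][j]);
        std::swap(b[k], b[p]);
        for (int i = k + 1; i < N; ++i) {
            double m = M[i][k]/M[k][k];
            for (int j = k; j < N; ++j) M[i][j] -= m*M[k][j];
            b[i] -= m*b[k];
        }
    }
    for (int i = N - 1; i >= 0; --i) {
        for (int j = i + 1; j < N; ++j) b[i] -= M[i][j]*b[j];
        b[i] /= M[i][i];
    }
}

int main() {
    double t[N], w[N], A[N][N], x[N];
    for (int j = 0; j < 4; ++j) {            // map to [0, 1], increasing order
        t[j] = 0.5*(1.0 - xg[3 - j]);  t[j + 4] = 0.5*(1.0 + xg[j]);
        w[j] = 0.5*wg[3 - j];          w[j + 4] = 0.5*wg[j];
    }
    for (int i = 0; i < N; ++i)              // a_ij as defined in the text
        for (int j = 0; j < N; ++j)
            A[i][j] = (j <= i) ? w[j]*t[j]*(1.0 - t[i]) : w[j]*t[i]*(1.0 - t[j]);
    for (int i = 0; i < N; ++i) x[i] = 0.5;  // x0 = (1/2, ..., 1/2)^T
    for (int it = 1; it <= 10; ++it) {
        double J[N][N], s[N], nrm = 0.0;
        for (int i = 0; i < N; ++i) {        // s := -F(x), J := F'(x)
            s[i] = -(x[i] - 0.5);
            for (int j = 0; j < N; ++j) {
                s[i] += A[i][j]*x[j]*x[j]*x[j];
                J[i][j] = (i == j ? 1.0 : 0.0) - 3.0*A[i][j]*x[j]*x[j];
            }
        }
        solve(J, s);                          // Newton step: J s = -F(x)
        for (int i = 0; i < N; ++i) { x[i] += s[i]; nrm = std::fmax(nrm, std::fabs(s[i])); }
        std::printf("iter %d  max |step| = %.3e\n", it, nrm);
        if (nrm < 1e-14) break;
    }
    return 0;
}
```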
We observe in Table 2.2 that tn < un < ūn for the majorizing sequences given
by (2.4), (2.16) and (2.8), respectively. Besides, the corresponding majorizing se-
quences {un} and {tn} provide better a priori error estimates than the majorizing
sequence {ūn} obtained by Kantorovich or in [13] using the polynomial p(s).
The a priori error estimates are shown in Table 2.3. Observe the improvement
obtained from the majorizing sequences {tn} and {un} constructed in this work.

Table 2.2: Sequences {tn}, {un} and {ūn}.

    n   tn                         un                         ūn
    2   0.0173040441616297 . . .   0.0174299241630703 . . .   0.0174946186637403 . . .
    3   0.0173040450083055 . . .   0.0174299542337089 . . .   0.0174946733196340 . . .
    4   0.0173040450083055 . . .   0.0174299542337097 . . .   0.0174946733196365 . . .

Table 2.3: A priori error estimates.

    n   |un+1 − un|             |tn+1 − tn|             |ūn+1 − ūn|
    2   7.6488 . . . × 10⁻¹⁶     8.4667 . . . × 10⁻¹⁰     5.4655 . . . × 10⁻⁸
    3   4.9489 . . . × 10⁻³¹     1.5457 . . . × 10⁻¹⁹     2.5459 . . . × 10⁻¹⁵
    4   2.0717 . . . × 10⁻⁶¹     5.1520 . . . × 10⁻³⁹     5.5242 . . . × 10⁻³⁰
References

[1] S. Amat, C. Bermúdez, S. Busquier and D. Mestiri, A family of Halley-


Chebyshev iterative schemes for non-Fréchet differentiable operators. J.
Comput. Appl. Math., 228 (1), 486–493, 2009.
[2] S. Amat and S. Busquier, Third-order iterative methods under Kantorovich
conditions. J. Math. Anal. Appl., 336 (1), 243–261, 2007.
[3] S. Amat, S. Busquier and J. M. Gutiérrez, Third-order iterative methods
with applications to Hammerstein equations: a unified approach. J. Comput.
Appl. Math., 235, 2936–2943, 2011.
[4] I. K. Argyros, On the convergence of Newton’s method for polynomial
equations and applications in radiative transfer. Monatshefte für Mathe-
matik, 127 (4), 265–276, 1999.
[5] I. K. Argyros, A Newton-Kantorovich theorem for equations involving m-
Fréchet differentiable operators. J. Comput. Appl. Math., 131, 149–159,
2001.
[6] I. K. Argyros, On the Newton-Kantorovich hypothesis for solving equa-
tions. J. Comput. Appl. Math., 169 (2), 315–332, 2004.
[7] I. K. Argyros, Computational theory of iterative methods, Series: Studies
in Computational Mathematics 15. Editors: C. K. Chui and L. Wuytack.
Elsevier Publications, New York, USA, 2007.
[8] I. K. Argyros, A unified approach for the convergence of a certain numeri-
cal algorithm using recurrent functions. Computing, 90 (3), 131–164, 2010.
[9] I. K. Argyros, On Newton’s method using recurrent functions and hy-
potheses on the first and second Fréchet derivatives. Atti del seminario


Matemático e Fı́sico Dell’Universitá de Modena, Regio Emilia, 57, 1–18,


2010.
[10] I. K. Argyros, Y. J. Cho and S. Hilout, Numerical methods for equations
and its applications. CRC Press/Taylor and Francis, New York, 2012.
[11] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method. Journal of Complexity, 28, 364–387, 2012.
[12] K. E. Atkinson, The numerical solution of a nonlinear boundary integral
equation on smooth surfaces. IMA Journal of Numerical Analysis, 14, 461–
483, 1994.
[13] J. A. Ezquerro, D. González and M. A. Hernández, Majorizing sequences
for Newton’s method from initial value problems. J. Comput. Appl. Math.,
236, 2246–2258, 2012.
[14] J. A. Ezquerro, D. González and M. A. Hernández, A modification of
the classic conditions of Newton-Kantorovich for Newton’s method. Math.
Comput. Modelling, 57 (3-4), 584–594, 2013.
[15] J. A. Ezquerro, D. González and M. A. Hernández, A semilocal conver-
gence result for Newton’s method under generalized conditions of Kan-
torovich, J. Complexity, 30 (3), 309–324, 2014.
[16] J. A. Ezquerro and M. A. Hernández, Generalized differentiability condi-
tions for Newton’s method. IMA Journal of Numerical Analysis, 22 (2),
187–205, 2002.
[17] J. A. Ezquerro and M. A. Hernández, Halley’s method for operators with
unbounded second derivative. Appl. Numer. Math., 57 (3), 354–360, 2007.
[18] A. Fraile, E. Larrodé, Á. A. Magreñán and J. A. Sicilia, Decision model
for siting transport and logistic facilities in urban environments: A method-
ological approach. J. Comput. Appl. Math., 291, 478–487, 2016.
[19] J. M. Gutiérrez, A new semilocal convergence theorem for Newton’s
method. J. Comput. Appl. Math., 79, 131–145, 1997.
[20] L. V. Kantorovich and G. P. Akilov, Functional analysis. Pergamon Press,
Oxford, 1982.
[21] A. D. Polyanin and A. V. Manzhirov, Handbook of integral equations. CRC
Press, Boca Raton, 1998.
[22] B. Royo, J. A. Sicilia, M. J. Oliveros and E. Larrodé, Solving a long-
distance routing problem using ant colony optimization. Appl. Math., 9
(2L), 415–421, 2015.

[23] J. A. Sicilia, C. Quemada, B. Royo and D. Escuı́n, An optimization algo-


rithm for solving the rich vehicle routing problem based on variable neigh-
borhood search and tabu search metaheuristics. J. Comput. Appl. Math.,
291, 468–477, 2016.
[24] J. A. Sicilia, D. Escuı́n, B. Royo, E. Larrodé and J. Medrano, A hybrid
algorithm for solving the general vehicle routing problem in the case of the
urban freight distribution. In Computer-based Modelling and Optimization
in Transportation (pp. 463–475). Springer International Publishing, 2014.
Chapter 3

Nonlinear Ill-posed
equations

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2 Convergence analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.3 Error bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4 Implementation of adaptive choice rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.4.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.5 Numerical examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

3.1 Introduction
In this chapter we provide an extended analysis of the Lavrentiev regulariza-
tion for nonlinear ill-posed problems

F(x) = y, (3.1)

where F : D(F) ⊆ X → X is a nonlinear monotone operator considered in [22].



First of all, let us recall that F is a monotone operator if, for all x1, x2 ∈ D(F),
it satisfies the relation

    ⟨F(x1) − F(x2), x1 − x2⟩ ≥ 0.                                     (3.2)

Throughout the chapter we will denote by X a real Hilbert space with inner
product ⟨·, ·⟩ and norm ‖·‖. Let U(x, ρ) and Ū(x, ρ) stand, respectively, for the
open and closed balls in the Hilbert space X with center x ∈ X and radius ρ > 0.
We also denote by L(X) the space of all bounded linear operators from X into X.
Our first assumption is that x∗ is a solution. Moreover, we also assume that
F has a locally uniformly bounded Fréchet derivative F  in a ball around the
solution.
In applications, usually only noisy data y^δ are available, such that

    ‖y − y^δ‖ ≤ δ.                                                    (3.3)

Then the problem of recovering x∗ from the noisy equation F(x) = y^δ is ill-posed.


In previous papers, we considered the iterative regularization method given by

    x^δ_{n+1,α} = x^δ_{n,α} − (F'(x0) + αI)⁻¹ (F(x^δ_{n,α}) − y^δ + α(x^δ_{n,α} − x0)),   (3.4)

and proved that (x^δ_{n,α}) converges linearly to the unique solution x^δ_α of

    F(x) + α(x − x0) = y^δ.                                           (3.5)

It is known, see Theorem 1.1 in [23], that equation (3.5) has a unique solution
x^δ_α for α > 0, provided F is Fréchet differentiable and monotone in the ball
B_r(x∗) ⊂ D(F) with radius r = ‖x∗ − x0‖ + δ/α. However, the regularized equa-
tion (3.5) remains nonlinear, and it could be complicated to solve it numerically.
In the last years, several authors [8–11, 17] have considered iterative regular-
ization methods for obtaining stable approximate solutions for equation (3.5).
As it appears in [18], an iterative method with iterations defined by

    x^δ_{k+1} = Φ(x^δ_0, x^δ_1, ··· , x^δ_k; y^δ),

where x^δ_0 := x0 ∈ D(F) is a known initial approximation of x∗, for a known
function Φ, together with a stopping rule which determines a stopping index
kδ ∈ N, is called an iterative regularization method if the condition
‖x^δ_{kδ} − x∗‖ → 0 as δ → 0 is satisfied.
Bakushinskii and Smirnova in [8] considered the iteratively regularized
Lavrentiev method defined as
    x^δ_{k+1} = x^δ_k − (A^δ_k + αk I)⁻¹ (F(x^δ_k) − y^δ + αk (x^δ_k − x0)),  k = 0, 1, 2, ··· ,   (3.6)

where A^δ_k := F'(x^δ_k) and {αk} is a sequence of positive real numbers such that
lim_{k→∞} αk = 0. Actually, the stopping index kδ in [8] was chosen according to
the discrepancy principle

    ‖F(x^δ_{kδ}) − y^δ‖ ≤ τδ < ‖F(x^δ_k) − y^δ‖,  0 ≤ k < kδ,

for some τ > 1, and it was shown that x^δ_{kδ} → x∗ as δ → 0 under the following
well-known assumptions:
 There exists L > 0 such that for all x, y ∈ D(F)

    ‖F'(x) − F'(y)‖ ≤ L‖x − y‖.

 There exists p > 0 such that

    (αk − αk+1)/(αk αk+1) ≤ p,  ∀k ∈ N.                               (3.7)

 (2 + Lσ)‖x0 − x∗‖ t d ≤ σ − 2‖x0 − x∗‖ t ≤ d α0, where σ := (√τ − 1)²,
t := pα0 + 1 and d = 2(t‖x0 − x∗‖ + pσ).
However, the authors did not give an error estimate for ‖x^δ_{kδ} − x∗‖. Later,
in [18], Mahale and Nair considered method (3.6) and obtained an error estimate
for ‖x^δ_{kδ} − x∗‖ under a weaker condition than (3.7). Precisely, they chose the
stopping index kδ as the first nonnegative integer such that x^δ_k in (3.7) is defined
for each k ∈ {0, 1, 2, ··· , kδ} and

    αkδ ‖(A^δ_{kδ} + αkδ I)⁻¹ (F(x^δ_{kδ}) − y^δ)‖ ≤ c0 δ  with c0 > 4.

In [18], Mahale and Nair showed that x^δ_{kδ} → x∗ as δ → 0, and an optimal order
error estimate for ‖x^δ_{kδ} − x∗‖ was obtained under the following assumptions:
Assumption 1 There exists r > 0 such that U(x∗ , r) ⊆ D(F) and F is Fréchet
differentiable at all x ∈ U(x∗ , r).

Assumption 2 (see Assumption 3 in [22]) There exists a constant k0 > 0 such
that for every x, u ∈ U(x∗, r) and v ∈ X there exists an element Φ(x, u, v) ∈ X
such that [F'(x) − F'(u)]v = F'(u)Φ(x, u, v), ‖Φ(x, u, v)‖ ≤ k0‖v‖‖x − u‖.

Assumption 3 There exists a continuous, strictly monotonically increasing
function ϕ : (0, a] → (0, ∞) with a ≥ ‖F'(x∗)‖ satisfying
1. lim_{λ→0} ϕ(λ) = 0
2. for α ≤ 1, ϕ(α) ≥ α
3. sup_{λ≥0} αϕ(λ)/(λ + α) ≤ c_ϕ ϕ(α), ∀λ ∈ (0, a]
4. there exists w ∈ X such that

    x0 − x∗ = ϕ(F'(x∗))w.                                             (3.8)

Assumption 4 There exists a sequence {αk} of positive real numbers such that
lim_{k→∞} αk = 0 and there exists μ > 1 such that

    1 ≤ αk/αk+1 ≤ μ,  ∀k ∈ N.                                         (3.9)
It is worth noticing that (3.9) is weaker than (3.7).
In [14], motivated by the iteratively regularized Lavrentiev method, we showed
the quadratic convergence of the method defined by

    x^δ_{n+1,α} = x^δ_{n,α} − (F'(x^δ_{n,α}) + αI)⁻¹ (F(x^δ_{n,α}) − y^δ + α(x^δ_{n,α} − x0)),   (3.10)

where x^δ_{0,α} := x0 is a starting point of the iteration. Let Rα(x) = F'(x) + αI and

    G(x) = x − Rα(x)⁻¹ (F(x) − y^δ + α(x − x0)).                      (3.11)

With the above notation, x^δ_{n+1,α} = G(x^δ_{n,α}).
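
To fix ideas, here is a scalar C++ sketch of iteration (3.10) for the monotone
model problem F(x) = x³; the values of x∗, δ, α and x0 below are illustrative
choices only, not data from the text.

```cpp
#include <cmath>
#include <cstdio>

// Scalar sketch of (3.10) for the monotone model F(x) = x^3 (sample data).
double F(double x)  { return x*x*x; }
double Fp(double x) { return 3.0*x*x; }

int main() {
    const double xstar = 0.5, delta = 1e-6, alpha = 1e-2;
    const double ydelta = F(xstar) + delta;   // noisy data, |y - y^delta| <= delta
    const double x0 = 0.8;                    // starting point x^delta_{0,alpha}
    double x = x0;
    for (int n = 1; n <= 20; ++n) {
        double step = (F(x) - ydelta + alpha*(x - x0))/(Fp(x) + alpha);
        x -= step;                            // one step of (3.10)
        std::printf("n=%2d  x=%.12f  |step|=%.3e\n", n, x, std::fabs(step));
        if (std::fabs(step) < 1e-14) break;
    }
    std::printf("limit approximates x^delta_alpha; |x - x*| = %.3e\n",
                std::fabs(x - xstar));
    return 0;
}
```

The iterates converge to x^δ_α, the solution of (3.5), whose distance to x∗ is
controlled by the error bounds of Section 3.3.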
The alternative assumptions used instead of Assumptions 1–2 in [14] are,
respectively

Assumption 5 There exists r > 0 such that U(x0, r) ∪ U(x∗, r) ⊆ D(F) and F
is Fréchet differentiable at all x ∈ U(x0, r) ∪ U(x∗, r).

Assumption 6 There exists a constant k0 > 0 such that for every x, u ∈ U(x0, r) ∪
U(x∗, r) and v ∈ X there exists an element Φ(x, u, v) ∈ X satisfying
[F'(x) − F'(u)]v = F'(u)Φ(x, u, v), ‖Φ(x, u, v)‖ ≤ k0‖v‖‖x − u‖.

The condition [F'(x) − F'(u)]v = F'(u)Φ(x, u, v), ‖Φ(x, u, v)‖ ≤ k0‖v‖‖x − u‖
in Assumption 6 is basically a Lipschitz-type condition (cf. [4, 16] and the refer-
ences therein). However, it is in general very difficult to verify, or may not even
be satisfied [1]–[6]. In order to expand the applicability of the method, we
consider the following weaker assumptions.

Assumption 7 There exists r > 0 such that U(x0 , r) ⊆ D(F) and F is Fréchet
differentiable at all x ∈ U(x0 , r).

Assumption 8 Let x0 ∈ X be fixed. There exists a constant L > 0 such that for
every x, u ∈ U(x0, r) ⊆ D(F) and v ∈ X there exists an element Φ(x, u, v) ∈ X
satisfying [F'(x) − F'(u)]v = F'(u)Φ(x, u, v), ‖Φ(x, u, v)‖ ≤ L‖v‖(‖x − x0‖ +
‖u − x0‖).

Note that, from the estimate

    ‖x − u‖ ≤ ‖x − x0‖ + ‖x0 − u‖,

Assumption 6 clearly implies Assumption 8 with L = k0, but not necessarily vice
versa. We will assume, throughout the chapter, that the operator F satisfies
Assumptions 7 and 8.
Remark 3.1.1 If Assumptions 7 and 8 are fulfilled only for all x, u ∈ U(x0, r) ∩
Q ≠ ∅, where Q is a convex closed a priori set with x∗ ∈ Q, then we can
modify the method (3.10) as

    x^δ_{n+1,α} = P_Q(G(x^δ_{n,α}))

to obtain the same estimates in this chapter, where P_Q is the metric projection
onto the set Q.
The chapter is organized as follows. In Section 3.2, we provide a convergence
result for the method. Then, in Section 3.3, we give error bounds. In Section
3.4, we deal with the starting point and the algorithm. Finally, some numerical
examples are presented in Section 3.5.

3.2 Convergence analysis


We will use majorizing sequences in order to prove our results. Recall [1,
Definition 1.3.11]: a nonnegative sequence {tn} is said to be a majorizing
sequence of a sequence {xn} in X if

    ‖xn+1 − xn‖ ≤ tn+1 − tn,  ∀n ≥ 0.

We will use the following reformulation of Lemma 1.3.12 in [1] in order to


prove our results.

Lemma 3.2.1 (cf. [13], Lemma 2.1) Let {tn} be a majorizing sequence for {xn}
in X. If lim_{n→∞} tn = t∗, then x∗ = lim_{n→∞} xn exists and

    ‖x∗ − xn‖ ≤ t∗ − tn,  ∀n ≥ 0.                                     (3.12)

Moreover, we need the following auxiliary result on majorizing sequences
for method (3.10).

Lemma 3.2.2 Suppose that there exist non-negative numbers L, η such that
Lemma 3.2.2 Suppose that there exist non-negative numbers L, η such that

16Lη ≤ 1. (3.13)

Let

    q = (1 − √(1 − 16Lη))/2.                                          (3.14)

Then, the scalar sequence {tn} given by

    t0 = 0,  t1 = η,  tn+1 = tn + (L/2)(5tn + 3tn−1)(tn − tn−1)  for each n = 1, 2, ···   (3.15)

is well defined, nondecreasing, bounded from above by t∗∗ given by

    t∗∗ = η/(1 − q)                                                   (3.16)

and converges to its unique least upper bound t∗, which satisfies

    η ≤ t∗ ≤ t∗∗.                                                     (3.17)

Moreover, the following estimates hold for each n = 1, 2, ···:

    tn+1 − tn ≤ q(tn − tn−1)                                          (3.18)

and

    t∗ − tn ≤ qⁿ η/(1 − q).                                           (3.19)
Proof: It is clear that q ∈ (0, 1). Using induction we show that

    (L/2)(5tm + 3tm−1) ≤ q.                                           (3.20)

The estimate is true for m = 1 by the definition of sequence {tn} and (3.13).
Then, we have by (3.15) that t2 − t1 ≤ q(t1 − t0) and

    t2 ≤ η + qη = (1 + q)η = ((1 − q²)/(1 − q))η < η/(1 − q) = t∗∗.

Assume that condition (3.20) is satisfied for all integers k ≤ m. As a consequence,
we have

    tm+1 − tm ≤ q(tm − tm−1)                                          (3.21)

and

    tm+1 ≤ ((1 − q^{m+1})/(1 − q))η.                                  (3.22)

Now, we must prove that (3.20) is satisfied for m + 1. Using (3.21) and (3.22),
(3.20) holds true if

    (L/2)[5(1 − q^m)/(1 − q) + 3(1 − q^{m−1})/(1 − q)]η ≤ q.           (3.23)

Estimate (3.23) motivates us to define recurrent functions fm on [0, 1) by

    fm(t) = L[5(1 + t + ··· + t^{m−1}) + 3(1 + t + ··· + t^{m−2})]η − 2t.   (3.24)

We need a relationship between two consecutive functions fm. Using (3.24) we
get that

    fm+1(t) = fm(t) + (5t + 3)Lη t^{m−1} ≥ fm(t).                      (3.25)

Define the function f∞ on [0, 1) by

    f∞(t) = lim_{m→∞} fm(t).                                          (3.26)

Then, using (3.24) we obtain

    f∞(t) = 8Lη/(1 − t) − 2t.                                         (3.27)

It is obvious that (3.23) is true if

    f∞(q) ≤ 0,                                                        (3.28)

since

    fm(q) ≤ fm+1(q) ≤ ··· ≤ f∞(q).                                    (3.29)

But (3.28) is true by the choice of q in (3.14). The induction for (3.20) is
complete. Hence, sequence {tn} is nondecreasing, bounded from above by t∗∗,
and as such it converges to some t∗ which satisfies (3.17). □
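
The recurrence (3.15) and the ratio bound (3.18) are easy to check numerically.
A minimal C++ sketch, with sample values of L and η satisfying (3.13):

```cpp
#include <cmath>
#include <cstdio>

// Computes {t_n} from (3.15) and checks (3.18) for sample L, eta with
// 16*L*eta <= 1 (here 16*L*eta = 0.5).
int main() {
    const double L = 1.0, eta = 1.0/32.0;
    const double q = (1.0 - std::sqrt(1.0 - 16.0*L*eta))/2.0;
    double tp = 0.0, t = eta;                 // t_0 = 0, t_1 = eta
    std::printf("q = %.6f, bound t** = %.6f\n", q, eta/(1.0 - q));
    for (int n = 1; n <= 8; ++n) {
        double tn = t + 0.5*L*(5.0*t + 3.0*tp)*(t - tp);   // (3.15)
        std::printf("n=%d  t_{n+1}-t_n = %.3e  q*(t_n - t_{n-1}) = %.3e\n",
                    n, tn - t, q*(t - tp));
        tp = t; t = tn;
    }
    return 0;
}
```

Each printed difference stays below the corresponding q-multiple, illustrating
(3.18) and the geometric tail estimate (3.19).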
We will also need the following lemma.

Lemma 3.2.3 ([14], Lemma 2.3) For u, v ∈ B_{r0}(x0),

    F(u) − F(v) − F'(u)(u − v) = F'(u) ∫₀¹ Φ(v + t(u − v), u, u − v) dt.

From now on we assume that, for r > 0, L > 0 and q given in (3.14),

    δ/α < η ≤ min{1/(16L), r(1 − q)}                                  (3.30)

and ‖x∗ − x0‖ ≤ ρ, where

    ρ ≤ (1/L)(√(1 + 2L(η − δ/α)) − 1).                                (3.31)

The following remark is also interesting for our study.

Remark 3.2.4 Note that (3.30) and (3.31) imply

    (L/2)ρ² + ρ + δ/α ≤ η ≤ min{1/(16L), r(1 − q)}.                   (3.32)
Now, we present the main result for this chapter.

Theorem 3.2.5 Suppose Assumption 8 is satisfied. Let the assumptions of
Lemma 3.2.2 hold with η as in (3.32). Then the sequence {x^δ_{n,α}} defined in
(3.10) is well defined and x^δ_{n,α} ∈ U(x0, t∗) for all n ≥ 0. Further, {x^δ_{n,α}} is a
Cauchy sequence in U(x0, t∗) and hence converges to x^δ_α ∈ U(x0, t∗) ⊂ U(x0, t∗∗),
and

    F(x^δ_α) = y^δ + α(x0 − x^δ_α).

Moreover, the following estimates hold for all n ≥ 0:

    ‖x^δ_{n+1,α} − x^δ_{n,α}‖ ≤ tn+1 − tn,                             (3.33)

    ‖x^δ_{n,α} − x^δ_α‖ ≤ t∗ − tn ≤ qⁿη/(1 − q)                        (3.34)

and

    ‖x^δ_{n+1,α} − x^δ_α‖ ≤ (L/2)[5‖x^δ_{n,α} − x0‖ + 3‖x^δ_α − x0‖] ‖x^δ_{n,α} − x^δ_α‖.   (3.35)
2
Proof: We begin by proving that

    ‖x^δ_{n+1,α} − x^δ_{n,α}‖ ≤ (L/2)[5‖x^δ_{n,α} − x0‖ + 3‖x^δ_{n−1,α} − x0‖] ‖x^δ_{n,α} − x^δ_{n−1,α}‖.   (3.36)

With G as in (3.11), we have, for u, v ∈ B_{t∗}(x0),

    G(u) − G(v)
      = u − v − Rα(u)⁻¹[F(u) − y^δ + α(u − x0)] + Rα(v)⁻¹[F(v) − y^δ + α(v − x0)]
      = u − v − [Rα(u)⁻¹ − Rα(v)⁻¹](F(v) − y^δ + α(v − x0)) − Rα(u)⁻¹(F(u) − F(v) + α(u − v))
      = Rα(u)⁻¹[F'(u)(u − v) − (F(u) − F(v))] − Rα(u)⁻¹[F'(v) − F'(u)]Rα(v)⁻¹(F(v) − y^δ + α(v − x0))
      = Rα(u)⁻¹[F'(u)(u − v) − (F(u) − F(v))] − Rα(u)⁻¹[F'(v) − F'(u)](v − G(v))
      = Rα(u)⁻¹[F'(u)(u − v) + ∫₀¹ F'(u + t(v − u))(v − u) dt] − Rα(u)⁻¹[F'(v) − F'(u)](v − G(v))
      = Rα(u)⁻¹ ∫₀¹ (F'(u + t(v − u)) − F'(u))(v − u) dt − Rα(u)⁻¹[F'(v) − F'(u)](v − G(v)).

The penultimate step follows from Lemma 3.2.3. So, by Assumption 8 and the
estimate ‖Rα(u)⁻¹F'(u)‖ ≤ 1, we have

    ‖G(u) − G(v)‖ ≤ L ∫₀¹ [‖u + t(v − u) − x0‖ + ‖u − x0‖] dt ‖v − u‖
                    + L[‖v − x0‖ + ‖u − x0‖] ‖v − G(v)‖
      ≤ (L/2)[3‖u − x0‖ + ‖v − x0‖] ‖v − u‖ + L[‖v − x0‖ + ‖u − x0‖] ‖v − G(v)‖
      ≤ (L/2)[5‖u − x0‖ + 3‖v − x0‖] ‖v − u‖,                          (3.37)

since in the applications below either G(v) = u or G(v) = v, so that
‖v − G(v)‖ ≤ ‖v − u‖.

Now, by taking u = x^δ_{n,α} and v = x^δ_{n−1,α} in (3.37), we obtain (3.36).
Next we shall prove that the sequence (tn) defined in Lemma 3.2.2 is a ma-
jorizing sequence of the sequence (x^δ_{n,α}). Note that F(x∗) = y, so by Lemma
3.2.3,

    ‖x^δ_{1,α} − x0‖ = ‖Rα(x0)⁻¹(F(x0) − y^δ)‖
      = ‖Rα(x0)⁻¹(F(x0) − y + y − y^δ)‖
      = ‖Rα(x0)⁻¹(F(x0) − F(x∗) − F'(x0)(x0 − x∗) + F'(x0)(x0 − x∗) + y − y^δ)‖
      ≤ ‖Rα(x0)⁻¹(F(x0) − F(x∗) − F'(x0)(x0 − x∗))‖
        + ‖Rα(x0)⁻¹F'(x0)(x0 − x∗)‖ + ‖Rα(x0)⁻¹(y − y^δ)‖
      ≤ ‖Rα(x0)⁻¹F'(x0)‖ ∫₀¹ ‖Φ(x∗ + t(x0 − x∗), x0, x0 − x∗)‖ dt
        + ‖Rα(x0)⁻¹F'(x0)‖ ‖x0 − x∗‖ + δ/α
      ≤ (L/2)‖x0 − x∗‖² + ‖x0 − x∗‖ + δ/α
      ≤ (L/2)ρ² + ρ + δ/α
      ≤ η = t1 − t0.

Assume that ‖x^δ_{i+1,α} − x^δ_{i,α}‖ ≤ ti+1 − ti for all i ≤ k for some k. Then

    ‖x^δ_{k+1,α} − x0‖ ≤ ‖x^δ_{k+1,α} − x^δ_{k,α}‖ + ‖x^δ_{k,α} − x^δ_{k−1,α}‖ + ··· + ‖x^δ_{1,α} − x0‖
      ≤ tk+1 − tk + tk − tk−1 + ··· + t1 − t0 = tk+1 ≤ t∗.
As a consequence, x^δ_{i+1,α} ∈ B_{t∗}(x0) for all i ≤ k, and therefore, by (3.36),

    ‖x^δ_{k+2,α} − x^δ_{k+1,α}‖ ≤ (L/2)[5‖x^δ_{k+1,α} − x0‖ + 3‖x^δ_{k,α} − x0‖] ‖x^δ_{k+1,α} − x^δ_{k,α}‖
      ≤ (L/2)(5tk+1 + 3tk)(tk+1 − tk) = tk+2 − tk+1.

Hence, by induction, ‖x^δ_{n+1,α} − x^δ_{n,α}‖ ≤ tn+1 − tn for all n ≥ 0, and therefore
{tn}, n ≥ 0, is a majorizing sequence of the sequence {x^δ_{n,α}}. In particular,
‖x^δ_{n,α} − x0‖ ≤ tn ≤ t∗, i.e., x^δ_{n,α} ∈ U(x0, t∗), for all n ≥ 0. So, {x^δ_{n,α}},
n ≥ 0, is a Cauchy sequence and converges to some x^δ_α ∈ U(x0, t∗) ⊂ U(x0, t∗∗),
and by Lemma 3.2.1

    ‖x^δ_α − x^δ_{n,α}‖ ≤ t∗ − tn ≤ qⁿη/(1 − q).
In order to prove (3.35), we observe that G(x^δ_α) = x^δ_α, so (3.35) follows from
(3.37) by taking u = x^δ_{n,α} and v = x^δ_α. Now, letting n → ∞ in (3.10), we
obtain F(x^δ_α) = y^δ + α(x0 − x^δ_α). □

Remark 3.2.6 The convergence of the method is quadratic, as was shown
in [14] under Assumption 6. Since the error bounds shown in Theorem 3.2.5
are too pessimistic, we will use the well-known computational order of conver-
gence (COC) ([5]) defined by

    ρ ≈ ln(‖xn+1 − x^δ_α‖/‖xn − x^δ_α‖) / ln(‖xn − x^δ_α‖/‖xn−1 − x^δ_α‖).

3.3 Error bounds

We will provide an error estimate for ‖x^δ_{n,α} − x∗‖ under the following assumption.

Assumption 9 There exists a continuous, strictly monotonically increasing
function ϕ : (0, a] → (0, ∞) with a ≥ ‖F'(x0)‖ satisfying:

(i) lim_{λ→0} ϕ(λ) = 0,

(ii) sup_{λ≥0} αϕ(λ)/(λ + α) ≤ ϕ(α), ∀λ ∈ (0, a], and

(iii) there exists v ∈ X with ‖v‖ ≤ 1 (cf. [20]) such that

    x0 − x∗ = ϕ(F'(x0))v.

Proposition 3.3.1 Let F : D(F) ⊆ X → X be a monotone operator in X. Let x^δ_α
be the solution of (3.5) and xα := x⁰_α. Then

    ‖x^δ_α − xα‖ ≤ δ/α.

Proof: The result follows from the monotonicity of F and the relation

    F(x^δ_α) − F(xα) + α(x^δ_α − xα) = y^δ − y.  □
Theorem 3.3.2 ([22, Proposition 4.1], [23, Theorem 3.3]) Suppose that As-
sumptions 7, 8 and the hypotheses of Proposition 3.3.1 hold. Let x∗ ∈ D(F) be a
solution of (3.1). Then, the following assertion holds:

    ‖xα − x∗‖ ≤ (Lr + 1)ϕ(α).

Theorem 3.3.3 Suppose the hypotheses of Theorem 3.2.5 and Theorem 3.3.2
hold. Then, the following assertion is satisfied:

    ‖x^δ_{n,α} − x∗‖ ≤ qⁿη/(1 − q) + c1(ϕ(α) + δ/α),

where c1 = max{1, L0 r + 1}.

Let

    c̄ := max{η/(1 − q) + 1, L0 r + 2},                                (3.38)

and let

    nδ := min{n : qⁿ ≤ δ/α}.                                          (3.39)

Theorem 3.3.4 Let c̄ and nδ be as in (3.38) and (3.39), respectively. Suppose
that the hypotheses of Theorem 3.3.3 hold. Then, the following assertion holds:

    ‖x^δ_{nδ,α} − x∗‖ ≤ c̄(ϕ(α) + δ/α).                                (3.40)

Note that the error estimate ϕ(α) + δ/α in (3.40) is of optimal order if α := αδ
satisfies ϕ(αδ)αδ = δ.
Now, using the function ψ(λ) := λϕ⁻¹(λ), 0 < λ ≤ a, we have δ =
αδ ϕ(αδ) = ψ(ϕ(αδ)), so that αδ = ϕ⁻¹(ψ⁻¹(δ)).
Now, we present the following result.

Theorem 3.3.5 Let ψ(λ) := λϕ⁻¹(λ) for 0 < λ ≤ a, and let the assumptions of
Theorem 3.3.4 hold. For δ > 0, let α := αδ = ϕ⁻¹(ψ⁻¹(δ)) and let nδ be as in
(3.39). Then

    ‖x^δ_{nδ,α} − x∗‖ = O(ψ⁻¹(δ)).
Now, we present a parameter choice rule based on the well-known balancing
principle studied in [12, 19, 21]. In this method, the regularization parameter α
is chosen from some finite set

    D_M(α) := {αi = μⁱα0, i = 0, 1, ··· , M},

where μ > 1 and α0 > 0, and we let

    ni := min{n : e^{−γ0 n} ≤ δ/αi}.

Then, for i = 0, 1, ··· , M, we have

    ‖x^δ_{ni,αi} − x^δ_{αi}‖ ≤ c δ/αi,  ∀i = 0, 1, ··· , M.

Let xi := x^δ_{ni,αi}. In the parameter choice strategy that we consider in this
chapter, we select α = αi from D_M(α) and operate only with the corresponding
xi, i = 0, 1, ··· , M.
Theorem 3.3.6 ([22, Theorem 3.1], see also the proof) Assume that there exists
i ∈ {0, 1, 2, ··· , M} such that ϕ(αi) ≤ δ/αi. Suppose the hypotheses of Theorem
3.3.3 and Theorem 3.3.4 hold, and let

    l := max{i : ϕ(αi) ≤ δ/αi} < M,

    k := max{i : ‖xi − xj‖ ≤ 4c̄ δ/αj, j = 0, 1, 2, ··· , i}.

Then l ≤ k and

    ‖x∗ − xk‖ ≤ cψ⁻¹(δ),

where c = 6c̄μ.

3.4 Implementation of adaptive choice rule

In this section we provide a starting point for the iteration approximating the
unique solution x^δ_α of (3.5), and an algorithm for determining the parameter
fulfilling the aforementioned balancing principle.
First of all, in order to select the starting point, we consider the following
steps:

 For q = (1 − √(1 − 16Lη))/2, choose 0 < α0 < 1 and μ > 1.
 Choose η such that η satisfies (3.13).
 Choose ρ such that ρ satisfies (3.31).
 Choose x0 ∈ D(F) such that ‖x0 − x∗‖ ≤ ρ.
 Choose the parameter αM = μ^M α0 big enough, with μ > 1 not too large.
 Choose ni such that ni = min{n : qⁿ ≤ δ/αi}.

On the other hand, the adaptive algorithm associated with the choice of the
parameter specified in Theorem 3.3.6 involves the following steps:

3.4.1 Algorithm

 Step 1. Set i ← 0.
 Step 2. Solve xi := x^δ_{ni,αi} by using the iteration (3.10).
 Step 3. If ‖xi − xj‖ > 4c̄ δ/αj for some j ≤ i, then take k = i − 1 and stop.
 Step 4. Set i = i + 1 and return to Step 2.
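
A C++ sketch of the whole procedure follows, for the scalar model F(x) = x³
already used above; μ, α0, M, c̄, q and the noise level are illustrative choices,
and the inner solver is the scalar version of iteration (3.10) from Section 3.2.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Adaptive balancing-principle sketch for the scalar model F(x) = x^3.
double F(double x)  { return x*x*x; }
double Fp(double x) { return 3.0*x*x; }

// n_i steps of iteration (3.10) with parameter alpha, starting from x0
double iterate(double x0, double ydelta, double alpha, int ni) {
    double x = x0;
    for (int n = 0; n < ni; ++n)
        x -= (F(x) - ydelta + alpha*(x - x0))/(Fp(x) + alpha);
    return x;
}

int main() {
    const double xstar = 0.5, delta = 1e-6, x0 = 0.8;
    const double ydelta = F(xstar) + delta;
    const double alpha0 = delta, mu = 1.5, cbar = 2.0, q = 0.5; // placeholders
    const int M = 25;
    std::vector<double> xs;
    int k = M;
    for (int i = 0; i <= M; ++i) {
        double alpha = std::pow(mu, i)*alpha0;
        int ni = 1;                            // smallest n with q^n <= delta/alpha
        while (std::pow(q, ni) > delta/alpha) ++ni;
        xs.push_back(iterate(x0, ydelta, alpha, ni));
        bool stop = false;                     // Step 3 of Algorithm 3.4.1
        for (int j = 0; j <= i && !stop; ++j)
            if (std::fabs(xs[i] - xs[j]) > 4.0*cbar*delta/(std::pow(mu, j)*alpha0))
                { k = i - 1; stop = true; }
        if (stop) break;
    }
    std::printf("chosen k = %d, x_k = %.8f, |x_k - x*| = %.3e\n",
                k, xs[k], std::fabs(xs[k] - xstar));
    return 0;
}
```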

3.5 Numerical examples


In this section we present the numerical examples.

Example 3.5.1 Let X = Y = R, D = [0, ∞), x0 = 1 and define the function F
on D by

    F(x) = x^{1+1/i}/(1 + 1/i) + c1 x + c2,                            (3.41)

where c1, c2 are real parameters and i > 2 is an integer. Then F'(x) = x^{1/i} + c1
is not Lipschitz on D. However, the central Lipschitz condition (C2) holds for
L0 = 1. In fact, we have

    ‖F'(x) − F'(x0)‖ = |x^{1/i} − x0^{1/i}|
                     = |x − x0| / (x0^{(i−1)/i} + ··· + x^{(i−1)/i}),

so

    ‖F'(x) − F'(x0)‖ ≤ L0|x − x0|.
Example 3.5.2 ([7, Example 7.5]) We consider the integral equation

    u(s) = f(s) + λ ∫_a^b G(s, t)u(t)^{1+1/n} dt,  n ∈ N.              (3.42)

Here, f is a given continuous function satisfying f(s) > 0, s ∈ [a, b], λ is a real
number, and the kernel G is continuous and positive in [a, b] × [a, b].
Equations of the form (3.42) generalize equations of the form

    u(s) = ∫_a^b G(s, t)u(t)ⁿ dt                                       (3.43)

studied in [1]–[6]. Instead of (3.42) we can try to solve the equation F(u) = 0,
where

    F : Ω ⊆ C[a, b] → C[a, b],  Ω = {u ∈ C[a, b] : u(s) ≥ 0, s ∈ [a, b]},

and

    F(u)(s) = u(s) − f(s) − λ ∫_a^b G(s, t)u(t)^{1+1/n} dt.

We consider the max-norm. The derivative F' is given by

    F'(u)v(s) = v(s) − λ(1 + 1/n) ∫_a^b G(s, t)u(t)^{1/n} v(t) dt,  v ∈ Ω.

It is worth noticing that F' does not satisfy a Lipschitz-type condition in Ω. Let
us choose [a, b] = [0, 1], G(s, t) = 1 and y(t) = 0. Then F'(y)v(s) = v(s) and

    ‖F'(x) − F'(y)‖ = |λ|(1 + 1/n) ∫₀¹ x(t)^{1/n} dt.

If F' were a Lipschitz function, then

    ‖F'(x) − F'(y)‖ ≤ L1‖x − y‖,

or, equivalently, the inequality

    ∫₀¹ x(t)^{1/n} dt ≤ L2 max_{s∈[0,1]} x(s)                          (3.44)

would hold for all x ∈ Ω and for a constant L2. But this is not true. Consider,
for example, the functions

    xj(t) = t/j,  j ≥ 1, t ∈ [0, 1].

If these are substituted into (3.44),

    1/(j^{1/n}(1 + 1/n)) ≤ L2/j ⇔ j^{1−1/n} ≤ L2(1 + 1/n),  ∀j ≥ 1.

This inequality is not true when j → ∞.
Therefore, Assumption 6 is not satisfied in this case. However, Assumption
8 holds. To show this, let x0(t) = f(t) and γ = min_{s∈[a,b]} f(s) > 0. Then, for
v ∈ Ω,

    ‖[F'(x) − F'(x0)]v‖ = |λ|(1 + 1/n) max_{s∈[a,b]} |∫_a^b G(s, t)(x(t)^{1/n} − f(t)^{1/n})v(t) dt|
                        ≤ |λ|(1 + 1/n) max_{s∈[a,b]} Gn(s, t) ‖v‖,

where

    Gn(s, t) = G(s, t)|x(t) − f(t)| / (x(t)^{(n−1)/n} + x(t)^{(n−2)/n} f(t)^{1/n} + ··· + f(t)^{(n−1)/n}).

Hence,

    ‖[F'(x) − F'(x0)]v‖ ≤ (|λ|(1 + 1/n)/γ^{(n−1)/n}) max_{s∈[a,b]} ∫_a^b G(s, t) dt ‖v‖ ‖x − x0‖
                        ≤ L0 ‖v‖ ‖x − x0‖,

where L0 = |λ|(1 + 1/n)N/γ^{(n−1)/n} and N = max_{s∈[a,b]} ∫_a^b G(s, t) dt. Then, by
the inequality

    ‖[F'(x) − F'(u)]v‖ ≤ ‖[F'(x) − F'(x0)]v‖ + ‖[F'(x0) − F'(u)]v‖
                       ≤ L0‖v‖(‖x − x0‖ + ‖u − x0‖),

Assumption 8 holds for sufficiently small λ.


References

[1] I. K. Argyros, Convergence and Applications of Newton-type Iterations,


Springer, New York, 2008.
[2] I. K. Argyros, Approximating solutions of equations using Newton’s
method with a modified Newton’s method iterate as a starting point. Rev.
Anal. Numer. Theor. Approx., 36, 123–138, 2005.
[3] I. K. Argyros, A Semilocal convergence for directional Newton methods,
Math. Comput. (AMS), 80, 327–343, 2011.
[4] I. K. Argyros, D. González and Á. A. Magreñán, A Semilocal Convergence
for a Uniparametric Family of Efficient Secant-Like Methods, J. Funct.
Spaces, Volume 2014, Article ID 467980, 10 pages. http://dx.doi.org/
10.1155/2014/467980.
[5] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method, J. Complexity, 28, 364–387, 2012.
[6] I. K. Argyros, Y. J. Cho and S. Hilout, Numerical methods for equations
and its applications, CRC Press, Taylor and Francis, New York, 2012.
[7] I. K. Argyros, Y. J. Cho and S. George, Expanding the applicability of
Lavrentiev regularization method for ill-posed problems, Boundary Value
Problems, 2013:114, 2013.
[8] A. Bakushinsky and A. Smirnova, On application of generalized discrep-
ancy principle to iterative methods for nonlinear ill-posed problems, Numer.
Funct. Anal. Optim., 26, 35–48, 2005.
[9] A. B. Bakushinskii, The problem of convergence of the iteratively regular-
ized Gauss-Newton method, Comput. Math. Phys., 32, 1353–1359, 1992.


[10] A. B. Bakushinskii, Iterative methods without saturation for solving degen-


erate nonlinear operator equations, Dokl. Akad. Nauk., 344, 7–8, 1995.
[11] A. B. Bakushinsky and M. Yu. Kokurin, Iterative methods for approximate
solution of inverse problems, volume 577 of Mathematics and Its Applica-
tions (New York). Springer, Dordrecht, 2004.
[12] F. Bauer and T. Hohage, A Lepskij-type stopping rule for regularized New-
ton methods. Inverse Problems, 21(6), 1975–1991, 2005.
[13] S. George and A. I. Elmahdy, An analysis of Lavrentiev regularization for
nonlinear ill-posed problems using an iterative regularization method, Int.
J. Comput. Appl. Math., 5 (3), 369–381, 2010.
[14] S. George and A. I. Elmahdy, A quadratic convergence yielding iterative
method for nonlinear ill-posed operator equations, Comput. Methods Appl.
Math., 12(1), 32–45, 2012.
[15] C. W. Groetsch, J. T. King and D. Murio, Asymptotic analysis of a finite
element method for Fredholm equations of the first kind. In Treatment of
Integral Equations by Numerical Methods, Eds.: C. T. H. Baker and G. F.
Miller, Academic Press, London, 1–11, 1982.
[16] J. M. Gutiérrez, Á. A. Magreñán and N. Romero, On the semilocal conver-
gence of Newton-Kantorovich method under center-Lipschitz conditions,
Appl. Math. Comput., 221, 79–88, 2013.
[17] B. Kaltenbacher, A. Neubauer and O. Scherzer, Iterative regularization
methods for nonlinear ill-posed problems, volume 6 of Radon Series on
Computational and Applied Mathematics. Walter de Gruyter GmbH & Co.
KG, Berlin, 2008.
[18] P. Mahale and M. T. Nair, Iterated Lavrentiev regularization for nonlinear
ill-posed problems. ANZIAM Journal, 51, 191–217, 2009.
[19] P. Mathe and S. V. Perverzev, Geometry of linear ill-posed problems in
variable Hilbert scales, Inverse Problems, 19 (3), 789–803, 2003.
[20] M. T. Nair and P. Ravishankar, Regularized versions of continuous New-
ton’s method and continuous modified Newton’s method under general
source conditions, Numer. Funct. Anal. Optim., 29 (9-10), 1140–1165,
2008.
[21] S. V. Perverzev and E. Schock, On the adaptive selection of the parameter
in regularization of ill-posed problems, SIAM J. Numer. Anal., 43, 2060–
2076, 2005.

[22] E. V. Semenova, Lavrentiev regularization and balancing principle for solv-


ing ill-posed problems with monotone operators, Comput. Methods Appl.
Math., 4, 444–454, 2010.
[23] U. Tautenhahn, On the method of Lavrentiev regularization for nonlinear
ill-posed problems, Inverse Problems, 18, 191–207, 2002.
Chapter 4

Sixth-order iterative
methods

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.2 Scheme for constructing sixth-order iterative methods . . . . . . . . . . . . . . . 65
4.3 Sixth-order iterative methods contained in family USS . . . . . . . . . . . . . . 69
4.4 Numerical work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.4.1 Solving nonlinear equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.5 Dynamics for method SG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.5.1 Study of the fixed points and their stability . . . . . . . . . . . . . . . . . 73
4.5.2 Study of the critical points and parameter spaces . . . . . . . . . . . 74

4.1 Introduction
We develop sixth-order iterative methods in order to approximate zeros x∗ of
a function f defined on the real line. These methods can be used to solve many
problems from computational sciences and other disciplines [14, 17, 25, 34, 36,
37].
Sharma and Guha [35], through a modification of the well-known Ostrowski
method [32], developed the following sixth-order convergent method (SG):

    yn = xn − f(xn)/f'(xn),
    zn = yn − [f(xn)/(f(xn) − 2f(yn))] · [f(yn)/f'(xn)],                      (4.1)
    xn+1 = zn − [(f(xn) + a f(yn))/(f(xn) + (a − 2) f(yn))] · [f(zn)/f'(xn)],

where a ∈ R.
Neta [31] proposed the following three-step uniparametric family of sixth-order
methods (N):

    yn = xn − f(xn)/f'(xn),
    zn = yn − [(f(xn) + α f(yn))/(f(xn) + (α − 2) f(yn))] · [f(yn)/f'(xn)],   (4.2)
    xn+1 = zn − [(f(xn) − f(yn))/(f(xn) − 3 f(yn))] · [f(zn)/f'(xn)].

Note that:

 If α = 0 the second steps of both methods are the same.


 If a = −1 the third steps of both methods are the same.

Grau and Dı́az in [18] developed another sixth-order variant of Ostrowski's
method (GD):

    yn = xn − f(xn)/f'(xn),
    zn = yn − [f(xn)/(f(xn) − 2f(yn))] · [f(yn)/f'(xn)],                      (4.3)
    xn+1 = zn − [f(xn)/(f(xn) − 2f(yn))] · [f(zn)/f'(xn)].
Chun and Ham [9] derived the following family of sixth-order variants of
Ostrowski's method (CH):

    yn = xn − f(xn)/f'(xn),
    zn = yn − [f(xn)/(f(xn) − 2f(yn))] · [f(yn)/f'(xn)],                      (4.4)
    xn+1 = zn − H(un) · [f(zn)/f'(xn)],

where un = f(yn)/f(xn) and H(t) represents a real valued function satisfying
H(0) = 1 and H'(0) = 2.

 If H(t) = (1 + βt)/(1 + (β − 2)t), then (CH) = (SG).
 If H(t) = 1/(1 − 2t), then (CH) = (GD).
Chun and Neta [10] proposed the following sixth-order iterative method,
based on a modification of the Kung and Traub method [24] (CN):

    yn = xn − f(xn)/f'(xn),
    zn = yn − [f(yn)/f'(xn)] · 1/(1 − f(yn)/f(xn))²,                          (4.5)
    xn+1 = zn − [f(zn)/f'(xn)] · 1/(1 − f(yn)/f(xn) − f(zn)/f(xn))².

In this chapter, we propose a three-step unification scheme for sixth-order
iterative methods. The rest of the chapter is organised as follows: Section 4.2
presents the scheme. In Section 4.3, through the scheme we generate various
well-known sixth-order iterative methods. In Section 4.4, a numerical compari-
son of various methods is shown. Finally, in Section 4.5 we study the dynamics
of method SG.

4.2 Scheme for constructing sixth-order iterative
methods

Our main purpose is to develop a unifying scheme for the sixth-order methods
presented by other authors in the literature. For this purpose, we consider the
following three-step iterative scheme:

    yn = xn − f(xn)/f'(xn),
    zn = yn − [f(yn)/f'(xn)] [1 + Σ_{j=1}^{m} aj (f(yn)/f(xn))^j],            (4.6)
    xn+1 = zn − [f(zn)/f'(xn)] [1 + Σ_{k=1}^{l} bk ((μ1 f(yn) + μ2 f(zn))/f(xn))^k].

Here, aj, bk, μ1, μ2, m and l are independent parameters. The parameters aj,
bk, μ1, μ2 ∈ R, while m, l ∈ N.
In order to establish the values of the parameters we have the following result.

Theorem 4.2.1 Let x∗ be a simple zero of a sufficiently differentiable function
f : D ⊂ R → R in an open interval D. If the starting point x0 is sufficiently close
to the solution x∗, then the scheme (4.6) defines sixth-order iterative methods if
and only if a1 = 2 and b1 = 2/μ1. In that case, the error equation for the family
of methods is given by

    en+1 = [c2 (c3 c1 − 5c2² + a2 c2²)(−6c2² + c2² b2 μ1² + c3 c1)/c1⁵] en⁶ + O(en⁷),   (4.7)

where en = xn − x∗ and cm = f^(m)(x∗)/m! with m ≥ 1.
Proof
Expanding f(xn) around the solution x∗ through the Taylor series,

    f(xn) = c1 en + c2 en² + c3 en³ + c4 en⁴ + O(en⁵).                        (4.8)

From the preceding equation we have

    f'(xn) = c1 + 2c2 en + 3c3 en² + 4c4 en³ + O(en⁴).                        (4.9)

Now, we consider the division

    f(xn)/f'(xn) = en − (c2/c1) en² − (2(c3 c1 − c2²)/c1²) en³
                   − ((3c4 c1² − 7c2 c3 c1 + 4c2³)/c1³) en⁴ + O(en⁵).         (4.10)

From the first step, we get

    yn − x∗ = en − f(xn)/f'(xn),                                              (4.11)

and, substituting, we obtain

    yn − x∗ = (c2/c1) en² + (2(c3 c1 − c2²)/c1²) en³
              + ((3c4 c1² − 7c2 c3 c1 + 4c2³)/c1³) en⁴ + O(en⁵).              (4.12)

Using the Taylor expansion of f(yn) around the solution x∗ and f(x∗) = 0, we
get

    f(yn) = Σ_{j≥1} cj (yn − x∗)^j;

substituting again, we get

    f(yn) = c2 en² + (2(c3 c1 − c2²)/c1) en³
            + ((3c4 c1² − 7c2 c3 c1 + 5c2³)/c1²) en⁴ + O(en⁵).                (4.13)

Considering the division,

    f(yn)/f(xn) = (c2/c1) en + ((2c3 c1 − 3c2²)/c1²) en²
                  + ((3c4 c1² − 10c2 c3 c1 + 8c2³)/c1³) en³ + O(en⁴).         (4.14)

From the second step of our scheme (4.6), we have

    zn = xn − [f(xn)/f'(xn)] [1 + (f(yn)/f(xn))(1 + a1 (f(yn)/f(xn))
             + a2 (f(yn)/f(xn))² + ···)];                                     (4.15)

substituting f(xn)/f'(xn) from equation (4.10) and f(yn)/f(xn) from equation
(4.14) into the above equation, we obtain

    zn − x∗ = −(c2²(a1 − 2)/c1²) en³
              − (c2(4a1 c3 c1 − 7a1 c2² + a2 c2² − 7c3 c1 + 9c2²)/c1³) en⁴ + O(en⁵).   (4.16)

Expanding f(zn) around the solution x∗ through the Taylor series,

    f(zn) = Σ_{k≥1} ck (zn − x∗)^k,                                           (4.17)

and substituting into the above equation, we get

    f(zn) = −(c2²(a1 − 2)/c1) en³
            − (c2(4a1 c3 c1 − 7a1 c2² + a2 c2² − 7c3 c1 + 9c2²)/c1²) en⁴ + O(en⁵).    (4.18)

From the third step of our scheme (4.6), we obtain

    xn+1 = zn − [f(zn)/f'(xn)] [1 + b1 (μ1 f(yn)/f(xn) + μ2 f(zn)/f(xn))
             + b2 (μ1 f(yn)/f(xn) + μ2 f(zn)/f(xn))² + ···];
substituting from equations (4.9), (4.14) and (4.18) into the above equation, we
obtain the error relation

    xn+1 = x∗ + (c2³(−2 + b1 μ1)(a1 − 2)/c1³) en⁴
      + (c2²/c1⁴)[(−11a1 − 11b1 μ1 + 6a1 b1 μ1 + 20) c1 c3
         + (((−2 + a1)μ1² b2 + (−12a1 + a2 + 19)μ1 + (−4 + 4a1 − a1²)μ2) b1
            − 2a2 + 18a1 − 26) c2²] en⁵
      + (c2/c1⁵)[(−16a1 − 16b1 μ1 + 28 + 9a1 b1 μ1) c1² c2 c4
         + (12a1 b1 μ1 − 20b1 μ1 + 33 − 20a1) c1² c3²
         + (131b1 μ1 − 89a1 b1 μ1 − 28b1 μ2 − 15b2 μ1² + 8a1 b2 μ1² + 8b1 μ1 a2
            + 30b1 μ2 a1 − 8b1 μ2 a1² − 15a2 + 125a1 − 167) c1 c2² c3
         + (86a1 b1 μ1 − 15b1 μ1 a2 − 58b1 μ2 a1 + 17b1 μ2 a1² − 15a1 b2 μ1²
            + 4a2 b1 μ2 + a2 b2 μ1² − 8b2 μ1 μ2 + a1 b3 μ1³ + b1 μ1 a3 − 2a2 b1 μ2 a1
            + 8b2 μ1 μ2 a1 − 2b2 μ1 μ2 a1² − a1² − 98a1 + 24a2 − 2a3 + 108 + 48b1 μ2
            − 111b1 μ1 + 25b2 μ1² − 2b3 μ1³) c2⁴] en⁶ + O(en⁷).               (4.19)
The three-step scheme (4.6) defines a family of sixth-order methods if the fol-
lowing three equations are satisfied simultaneously:

    (a1 − 2)(b1 μ1 − 2) = 0,
    −11a1 − 11b1 μ1 + 6a1 b1 μ1 + 20 = 0,
    ((−2 + a1)μ1² b2 + (−12a1 + a2 + 19)μ1 + (−4 + 4a1 − a1²)μ2) b1 − 2a2 + 18a1 − 26 = 0.

It is easy to get

    a1 = 2 and b1 = 2/μ1.

Substituting a1 = 2 and b1 = 2/μ1 in equation (4.19) produces the required
error equation (4.7). □

Therefore, we present the following three-step sixth-order iterative scheme
(USS):

    yn = xn − f(xn)/f'(xn),
    zn = yn − [f(yn)/f'(xn)] [1 + 2 f(yn)/f(xn) + Σ_{j=2}^{m} aj (f(yn)/f(xn))^j],
    xn+1 = zn − [f(zn)/f'(xn)] [1 + (2/μ1)(μ1 f(yn) + μ2 f(zn))/f(xn)          (4.20)
             + Σ_{k=2}^{l} bk ((μ1 f(yn) + μ2 f(zn))/f(xn))^k].

It is worth noticing that the parameters aj, bk (with j ≥ 2 and k ≥ 2) and μm
(with m = 1, 2) are free to choose.
In the next section we present some values of the parameters for which we
obtain well-known methods.
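
For experimentation, scheme (4.20) can be implemented directly with finite
truncations m and l. The following C++ sketch is one such implementation; the
truncation order 10 and the test setup (function f1 from Section 4.4.1, starting
point 1.5) are illustrative choices. With aj = 2^j, bk = 2^k and μ2 = 0 it
approximates method GD, as derived in the next section.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// One USS step (4.20) with finite truncations: a holds a_2..a_m, b holds
// b_2..b_l; a_1 = 2 and b_1 = 2/mu1 are fixed by Theorem 4.2.1.
double uss_step(double x, double (*f)(double), double (*fp)(double),
                const std::vector<double>& a, const std::vector<double>& b,
                double mu1, double mu2) {
    double fx = f(x), dfx = fp(x);
    double y = x - fx/dfx;
    double u = f(y)/fx;                       // u = f(y_n)/f(x_n)
    double s = 1.0 + 2.0*u, up = u;
    for (double aj : a) { up *= u; s += aj*up; }
    double z = y - f(y)/dfx*s;
    double w = (mu1*f(y) + mu2*f(z))/fx;      // argument of the third-step series
    double r = 1.0 + (2.0/mu1)*w, wp = w;
    for (double bk : b) { wp *= w; r += bk*wp; }
    return z - f(z)/dfx*r;
}

double f1(double x)  { return x*x*x + 4.0*x*x - 10.0; }
double f1p(double x) { return 3.0*x*x + 8.0*x; }

int main() {
    std::vector<double> a, b;                 // truncated GD choice
    for (int j = 2; j <= 10; ++j) a.push_back(std::pow(2.0, j));
    for (int k = 2; k <= 10; ++k) b.push_back(std::pow(2.0, k));
    double x = 1.5;
    for (int n = 1; n <= 5 && std::fabs(f1(x)) > 1e-14; ++n) {
        x = uss_step(x, f1, f1p, a, b, 1.0, 0.0);
        std::printf("n=%d  x=%.15f  |f(x)|=%.3e\n", n, x, std::fabs(f1(x)));
    }
    return 0;
}
```

Near the root, |f(yn)/f(xn)| is small, so the truncated geometric sums are
accurate and the sixth order is preserved in practice.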

4.3 Sixth-order iterative methods contained in family
USS

We begin with the Sharma et al. method ([35]). First of all, let us consider

    aj = 2^j for j ≥ 2,  m = ∞,  μ1 = 1,
    bk = 2(2 − β)^{k−1} for k ≥ 2,  l = ∞,  μ2 = 0,

where β ∈ R. Substituting these values in scheme USS, we obtain

    yn = xn − f(xn)/f'(xn),
    zn = yn − [f(yn)/f'(xn)] [1 + 2 f(yn)/f(xn) + Σ_{j=2}^{∞} 2^j (f(yn)/f(xn))^j],
    xn+1 = zn − [f(zn)/f'(xn)] [1 + 2 f(yn)/f(xn) + Σ_{k=2}^{∞} 2(2 − β)^{k−1} (f(yn)/f(xn))^k].

Using the equivalence 1 + r + r² + r³ + ··· = 1/(1 − r) for |r| < 1 in the second
and third steps, we obtain SG.
The next one to be studied is the Chun et al. method [9]. We begin by con-
sidering

    aj = 2^j for j ≥ 2,  m = ∞,  μ1 = 1,
    bk = ωk for k ≥ 2,  l = ∞,  μ2 = 0,

where ωk ∈ R. Substituting in USS, we get

    yn = xn − f(xn)/f'(xn),
    zn = yn − [f(yn)/f'(xn)] [1 + 2 f(yn)/f(xn) + Σ_{j=2}^{∞} 2^j (f(yn)/f(xn))^j],      (4.21)
    xn+1 = zn − [f(zn)/f'(xn)] [1 + 2 f(yn)/f(xn) + Σ_{k=2}^{∞} ωk (f(yn)/f(xn))^k].

Using again the equivalence 1 + r + r² + r³ + ··· = 1/(1 − r) for |r| < 1 in the
second step, we get the second step of method CH. The third step of method
CH is given by xn+1 = zn − H(un) f(zn)/f'(xn) (see equation (4.4)). Here
un = f(yn)/f(xn) and H(t) is a real valued function satisfying H(0) = 1 and
H'(0) = 2. Moreover, notice that the function H(un) can be expressed as

    H(f(yn)/f(xn)) = 1 + 2 f(yn)/f(xn) + Σ_{k=2}^{∞} ωk (f(yn)/f(xn))^k.

Hence, method (4.21), which comes from scheme (USS), is method CH.
Method (GD) can be obtained using

    aj = 2^j for j ≥ 2,  m = ∞,  μ1 = 1,
    bk = 2^k for k ≥ 2,  l = ∞,  μ2 = 0,

in the second and third steps of scheme USS.
Next, to derive the family (N), we consider

    aj = 2(−(α − 2))^{j−1} for j ≥ 2,  m = ∞,  μ1 = 1,
    bk = 2 · 3^{k−1} for k ≥ 2,  l = ∞,  μ2 = 0,

where α ∈ R.
Finally, to obtain CN, the parameters are

    aj = j + 1 for j ≥ 2,  m = ∞,  μ1 = 1,
    bk = k + 1 for k ≥ 2,  l = ∞,  μ2 = 1.
Substituting in USS, we get

    yn = xn − f(xn)/f'(xn),
    zn = yn − [f(yn)/f'(xn)] [1 + 2 f(yn)/f(xn) + Σ_{j=2}^{∞} (j + 1)(f(yn)/f(xn))^j],   (4.22)
    xn+1 = zn − [f(zn)/f'(xn)] [1 + 2(f(yn) + f(zn))/f(xn)
             + Σ_{k=2}^{∞} (k + 1)((f(yn) + f(zn))/f(xn))^k].

Using the equivalence 1 + 2r + 3r² + 4r³ + ··· = 1/(1 − r)² for |r| < 1 in the
second and third steps of the preceding method, we get method (CN).

4.4 Numerical work

Let {xn}_{n=0}^{∞} be a sequence, generated by an iterative method, converging to
x∗, and let en = xn − x∗. If there exist a real number ξ ∈ [1, ∞) and a nonzero
constant C such that

    lim_{n→∞} |en+1|/|en|^ξ = C,

then ξ is called the convergence order of the sequence and the constant C is
called the asymptotic error constant. From the preceding relation, the computa-
tional order of convergence (COC) is approximated as follows:

    ρ ≈ log |(xn+1 − x∗)/(xn − x∗)| / log |(xn − x∗)/(xn−1 − x∗)|.

All the computations are performed in the programming language C++. For
numerical precision, the C++ library ARPREC [1] is used. For convergence
of the method, it is required that the distance between two consecutive iterates
(|xn+1 − xn|) and the absolute value of the function (|f(xn)|), also referred to as
the residual, be less than 10⁻³⁰⁰. The maximum allowed number of iterations
is 200.
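
As a double-precision illustration (the actual experiments use ARPREC), the
following C++ sketch runs method SG (4.1) on f1 from x0 = 100 and prints a
COC estimate computed from successive differences; the tolerance 10⁻¹³ replaces
the 10⁻³⁰⁰ used with multiprecision, since in double precision the estimate
saturates after a couple of steps.

```cpp
#include <cmath>
#include <cstdio>

// Method SG (4.1) with a = 2 on f1(x) = x^3 + 4x^2 - 10, x0 = 100.
double f(double x)  { return x*x*x + 4.0*x*x - 10.0; }
double fp(double x) { return 3.0*x*x + 8.0*x; }

int main() {
    const double a = 2.0, tol = 1e-13;
    double x = 100.0, xm1 = 0, xm2 = 0, xm3 = 0;
    for (int n = 1; n <= 200; ++n) {
        double fx = f(x), dfx = fp(x);
        double y  = x - fx/dfx;
        double fy = f(y);
        double z  = y - fx/(fx - 2.0*fy)*fy/dfx;
        xm3 = xm2; xm2 = xm1; xm1 = x;
        x = z - (fx + a*fy)/(fx + (a - 2.0)*fy)*f(z)/dfx;
        double d1 = std::fabs(x - xm1), d2 = std::fabs(xm1 - xm2),
               d3 = std::fabs(xm2 - xm3);
        std::printf("n=%2d  x=%.15f  |f(x)|=%.3e", n, x, std::fabs(f(x)));
        if (n >= 3 && d1 > 0 && d2 > 0 && d3 > 0)    // COC from differences
            std::printf("  COC~%.2f", std::log(d1/d2)/std::log(d2/d3));
        std::printf("\n");
        if (d1 < tol && std::fabs(f(x)) < tol) break;
    }
    return 0;
}
```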

4.4.1 Solving nonlinear equations

The methods are tested on the functions

    f1(x) = x³ + 4x² − 10,  x∗ ≈ 1.365.
    f2(x) = x² − exp(x) − 3x + 2,  x∗ ≈ 0.257.
    f3(x) = x exp(x²) − sin²(x) + 3 cos(x) + 5,  x∗ ≈ −1.207.
    f4(x) = x⁴ + sin(π/x²) − 5,  x∗ = √2.
    f5(x) = eˣ sin x + log(1 + x²),  x∗ = 0.
    f6(x) = √(2 + x²) sin(π/x²) + 1/(x⁴ + 1) − (17√3 + 1)/17,  x∗ = −2.

Table 4.1: The (number of function evaluations, COC) for various sixth order iterative
methods.

f (x) x0 SG N GD CH CN
f1 (x) 100 (36, 6) (36, 6) (28, 6) (36, 6) (28, 6)
f2 (x) 100 (172, 6) (884, 6) (160, 6) (172, 6) (108, 6)
f3 (x) −1 (20, 6) (20, 6) (20, 6) (20, 6) (32, 6)
f4 (x) 1 (20, 6) (20, 6) (20, 6) (20, 6) (20, 6)
f5 (x) 1 (24, 6) (24, 6) (24, 6) (24, 6) (24, 6)
f5 (x) 3 (24, 6) (28, 6) (20, 6) (24, 6) (20, 6)
f6 (x) −1.5 (24, 6) (28, 6) (20, 6) (24, 6) (20, 6)

The free parameters are chosen as follows: in method SG, a = 2; in method N,
α = 5; and in method CH, β = 3. The outcome of the numerical experimentation
is presented in Table 4.1, which reports (number of function evaluations, COC
during the second to last iterative step). COC is rounded to the nearest signifi-
cant digits. In Table 4.1, we see that the methods GD and CN show the best
results.

4.5 Dynamics for method SG

In this section we will use the dynamical tools presented in related works such
as [7, 8, 11, 12, 16, 27–30].
First of all, let us recall some basic concepts related to dynamics. Given a
rational function R : Ĉ → Ĉ, where Ĉ is the Riemann sphere, the orbit of a point
z0 ∈ Ĉ is defined by

    {z0, R(z0), R²(z0), ..., Rⁿ(z0), ...}.

A point z0 ∈ Ĉ is called a fixed point of R(z) if it verifies R(z0) = z0.
Moreover, z0 is called a periodic point of period p > 1 if it is a point such that
R^p(z0) = z0 but R^k(z0) ≠ z0 for each k < p. Moreover, a point z0 is called
pre-periodic if it is not periodic but there exists a k > 0 such that R^k(z0) is
periodic.
There exist different types of fixed points depending on the associated mul-
tiplier |R'(z0)|. Taking the associated multiplier into account, a fixed point z0 is
called:

 superattractor if |R'(z0)| = 0,
 attractor if |R'(z0)| < 1,
 repulsor if |R'(z0)| > 1,
 and parabolic if |R'(z0)| = 1.

The fixed points that do not correspond to the roots of the polynomial p(z)
are called strange fixed points. On the other hand, a critical point z0 is a point
which satisfies R'(z0) = 0. Moreover, we call free critical points those critical
points which are not roots of the polynomial p(z).
The basin of attraction of an attractor α is defined as

    A(α) = {z0 ∈ Ĉ : Rⁿ(z0) → α, n → ∞}.

The Fatou set of the rational function R, F(R), is the set of points z ∈ Ĉ
whose orbits tend to an attractor (fixed point, periodic orbit or infinity). Its
complement in Ĉ is the Julia set, J(R). That means that the basin of attraction
of any fixed point belongs to the Fatou set, and the boundaries of these basins
of attraction belong to the Julia set.
By applying this operator to a quadratic polynomial with two different roots
A and B, p(z) = (z − A)(z − B), and using the Möbius map h(z) = (z − A)/(z − B),
which carries root A to 0, root B to ∞ and ∞ to 1, the rational operator associated
with the family of iterative schemes is finally

    G(z, a) = z⁶(az³ + 2az² + 2az + 2a + z⁴ + 2z³ + 3z² + 4z + 2) /
              (2az⁴ + 2az³ + 2az² + az + 2z⁴ + 4z³ + 3z² + 2z + 1).          (4.23)
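
The operator (4.23) can be iterated directly with std::complex. The following
C++ sketch classifies the orbit of a single starting point using the same iteration
budget (2000) as the pictures below; the thresholds and the particular a and z0
are illustrative choices.

```cpp
#include <complex>
#include <cstdio>

// Iterates the rational operator G(z, a) of (4.23) and classifies the orbit:
// 0 (root A), infinity (root B), the strange fixed point 1, or none.
using C = std::complex<double>;

C G(C z, C a) {
    C z2 = z*z, z3 = z2*z, z4 = z3*z, z6 = z3*z3;
    C num = z6*(a*z3 + 2.0*a*z2 + 2.0*a*z + 2.0*a + z4 + 2.0*z3 + 3.0*z2 + 4.0*z + 2.0);
    C den = 2.0*a*z4 + 2.0*a*z3 + 2.0*a*z2 + a*z + 2.0*z4 + 4.0*z3 + 3.0*z2 + 2.0*z + 1.0;
    return num/den;
}

int main() {
    const C a(-2.0, 0.0);                     // sample parameter value
    C z(0.4, 0.3);                            // sample starting point
    for (int n = 1; n <= 2000; ++n) {
        z = G(z, a);
        if (std::abs(z) < 1e-8)       { std::printf("-> 0 (root A), n=%d\n", n); return 0; }
        if (std::abs(z) > 1e8)        { std::printf("-> infinity (root B), n=%d\n", n); return 0; }
        if (std::abs(z - 1.0) < 1e-8) { std::printf("-> 1 (divergence), n=%d\n", n); return 0; }
    }
    std::printf("no convergence within 2000 iterations\n");
    return 0;
}
```

Repeating this classification over a grid of starting points (for dynamical planes)
or of parameter values a (for parameter spaces) produces the pictures discussed
in the remainder of this section.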

4.5.1 Study of the fixed points and their stability


It is clear that z = 0 and z = ∞ are fixed points of G(z, a). Moreover, there exist
some strange fixed points which are:
 z = 1 related to divergence to ∞
 The roots of
p(z) = az7 + 3az6 + 5az5 + 7az4 + 5az3 + 3az2 + az + z8

+3z7 + 6z6 + 10z5 + 12z4 + 10z3 + 6z2 + 3z + 1

The roots of this polynomial depend on the value of the parameter a.

In Figure 4.1 the bifurcation diagram of the fixed points is shown

Figure 4.1: Bifurcation diagram of the fixed points.

Theorem 4.5.0
The character of the strange fixed point z = 1 is as follows:

i) If −52/25 < ℜ(a) < −76/39 and

    −√(−975ℜ(a)² − 3928ℜ(a) − 3952)/(5√39) < ℑ(a) < √(−975ℜ(a)² − 3928ℜ(a) − 3952)/(5√39),

then z = 1 is an attractor. Moreover, if a = −2 it is a superattractor.

ii) When −52/25 < ℜ(a) < −76/39 and

    ℑ(a) = −√(−975ℜ(a)² − 3928ℜ(a) − 3952)/(5√39)  ∨  ℑ(a) = √(−975ℜ(a)² − 3928ℜ(a) − 3952)/(5√39),

z = 1 is a parabolic point.

iii) In any other case, z = 1 is a repulsor.

Proof It follows from the expression of G'(z, a) evaluated at z = 1: solving
|G'(1, a)| = 1 yields the boundary curves

    ℑ(a) = ±√(−975ℜ(a)² − 3928ℜ(a) − 3952)/(5√39),

and |G'(1, a)| < 1 holds exactly under the conditions of i). □

In Figure 4.2 the region in which z = 1 is attractor is shown.

Figure 4.2: Region of stability of z = 1.
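
These stability assertions can be sanity-checked numerically: z = 1 is a fixed
point of G(·, a) for every a, so a finite-difference estimate of |G'(1, a)| decides
its character. A small C++ sketch (the sample values of a are illustrative):

```cpp
#include <complex>
#include <cstdio>

// Estimates |G'(1, a)| by a central finite difference; by the theorem above,
// a = -2 should give a value near 0 (superattractor) and a = 0 a value
// above 1 (repulsor).
using C = std::complex<double>;

C G(C z, C a) {
    C z2 = z*z, z3 = z2*z, z4 = z3*z, z6 = z3*z3;
    C num = z6*(a*z3 + 2.0*a*z2 + 2.0*a*z + 2.0*a + z4 + 2.0*z3 + 3.0*z2 + 4.0*z + 2.0);
    C den = 2.0*a*z4 + 2.0*a*z3 + 2.0*a*z2 + a*z + 2.0*z4 + 4.0*z3 + 3.0*z2 + 2.0*z + 1.0;
    return num/den;
}

int main() {
    const double h = 1e-6;
    const double vals[] = {-2.2, -2.0, -1.9487, -1.0, 0.0};  // sample a values
    for (double re : vals) {
        C a(re, 0.0);
        C dG = (G(C(1.0 + h, 0.0), a) - G(C(1.0 - h, 0.0), a))/(2.0*h);
        std::printf("a=%+.4f  |G'(1,a)| ~ %.6f  -> %s\n", re, std::abs(dG),
                    std::abs(dG) < 1.0 ? "attracting" : "repelling/neutral");
    }
    return 0;
}
```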

4.5.2 Study of the critical points and parameter spaces

It is a well-known fact that there is at least one critical point associated with
each invariant Fatou component [13, 19]. The critical points of the family are
the solutions of G'(z, a) = 0, where

    G'(z, a) = 2z⁵(z + 1)²(z² + 1)(5a²z³ + 4a²z² + 5a²z + 6az⁴ + 10az³ + 20az²
               + 10az + 6a + 6z⁴ + 12z³ + 12z² + 12z + 6) /
               (2az⁴ + 2az³ + 2az² + az + 2z⁴ + 4z³ + 3z² + 2z + 1)².
It is clear that z = 0 and z = ∞ are critical points. Furthermore, the free critical points are the roots of the polynomial

\[
q(z) = 5a^{2}z^{3}+4a^{2}z^{2}+5a^{2}z+6az^{4}+10az^{3}+20az^{2}+10az+6a+6z^{4}+12z^{3}+12z^{2}+12z+6,
\]

which will be called cr_i(a) for i = 1, 2, 3, 4, together with

cr5(a) = −1,  cr6(a) = i.

It is easy to see that cr5 and cr6 are preimages of the strange fixed point z = 1, so we will not consider them in the dynamical study, as the stability of this fixed point has already been analyzed.
In Figure 4.3 the bifurcation diagram of the critical points is shown.

Figure 4.3: Bifurcation diagram of the critical points.

So, there are four independent free critical points in which we are interested. Without loss of generality, we consider in this chapter the free critical points cr1(a) and cr2(a), since their behavior is representative. Now, we are going to look for the best members of the family by means of the parameter spaces associated to the free critical points.
The study of the orbits of the critical points gives information about the dynamical behavior of an iterative method. Specifically, to determine whether there exists any
attracting periodic orbit different from the roots of the polynomial p(z), the following question must be answered: for which values of the parameter are the orbits of the free critical points attracted to periodic orbits? One way of answering this question is to draw the parameter spaces associated to the free critical points. There exist open regions of the parameter space for which the iteration of the free critical points does not converge to any of the roots of the polynomial p(z); in these regions the critical points converge to attracting cycles, to a strange fixed point, or even diverge.
In Figure 4.4 the parameter space associated to cr1(a) is shown; we observe that there exists a zone with no convergence to the roots, which can be better seen in Figure 4.8, while Figure 4.5 shows the parameter space associated to cr2(a). A point is painted in cyan if the iteration of the method starting in z0 = cr1(a) converges to the fixed point 0 (related to root A), in magenta if it converges to ∞ (related to root B) and in yellow if the iteration converges to 1 (related to ∞). Moreover, convergence, after a maximum of 2000 iterations and with a tolerance of 10⁻⁶, to any of the strange fixed points appears in red, convergence to 2-cycles in orange, to 3-cycles in light green, to 4-cycles in dark red, to 5-cycles in dark blue, to 6-cycles in dark green, to 7-cycles in dark yellow, and to 8-cycles in white. The regions in black correspond to zones of convergence to other cycles. As a consequence, every point of the plane which is neither cyan nor magenta is not a good choice of a in terms of numerical behavior. A sketch of how such parameter planes can be generated is given below.
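A minimal sketch (ours, not the code used for the original figures) follows; the critical-point function cr, the grid resolution and the color codes are illustrative assumptions, and attracting cycles are not classified here (everything that does not reach 0, ∞ or 1 is left black).

```python
import numpy as np

def G(z, a):
    num = z**6 * (a*z**3 + 2*a*z**2 + 2*a*z + 2*a
                  + z**4 + 2*z**3 + 3*z**2 + 4*z + 2)
    den = (2*a*z**4 + 2*a*z**3 + 2*a*z**2 + a*z
           + 2*z**4 + 4*z**3 + 3*z**2 + 2*z + 1)
    return num / den

def parameter_plane(cr, re_range, im_range, n=200, max_iter=2000, tol=1e-6):
    """Iterate the free critical point cr(a) for each parameter a on a grid.
    Codes: 0 -> converged to 0 (cyan), 1 -> to infinity (magenta),
    2 -> to the fixed point 1 (yellow), 3 -> none detected (black)."""
    img = np.full((n, n), 3, dtype=int)
    for i, y in enumerate(np.linspace(im_range[0], im_range[1], n)):
        for j, x in enumerate(np.linspace(re_range[0], re_range[1], n)):
            a = complex(x, y)
            z = cr(a)
            try:
                for _ in range(max_iter):
                    z = G(z, a)
                    if abs(z) < tol:
                        img[i, j] = 0; break
                    if abs(z) > 1 / tol:
                        img[i, j] = 1; break
                    if abs(z - 1) < tol:
                        img[i, j] = 2; break
            except (ZeroDivisionError, OverflowError):
                img[i, j] = 1   # numerical blow-up treated as escape to infinity
    return img
```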
Once the anomalies have been detected, the next step consists of describing them in the dynamical planes. In these dynamical planes the convergence to 0 appears in magenta, the convergence to ∞ in cyan, and in black the zones with no convergence to the roots after a maximum of 2000 iterations with a tolerance of 10⁻⁶.
Then, focusing the attention on the region shown in Figure 4.8, it is evident that there exist members of the family with complicated behavior. In Figure 4.9, the dynamical plane of a member of the family with regions of convergence to some of the strange fixed points is shown.
In Figures 4.10 and 4.11, dynamical planes of members of the family with regions of convergence to an attracting 2-cycle are shown; those regions are painted in black since there is no convergence to any of the roots. Both values of the parameter correspond to red zones in the parameter planes.
On the other hand, in Figure 4.12 the dynamical plane of a member of the family with regions of convergence to z = 1, related to ∞ and painted in black, is presented. This value of the parameter a is located in a yellow zone of the parameter planes.
Other special cases are shown in Figures 4.13, 4.14, 4.15 and 4.16, where there are no convergence problems, as the only basins of attraction correspond to the roots of p(z).
Figure 4.4: Parameter space associated to the free critical point cr1 (a).

Figure 4.5: Parameter space associated to the free critical point cr2 (a).
Figure 4.6: Parameter space associated to the free critical point cr3 (a).

Figure 4.7: Parameter space associated to the free critical point cr4 (a).
Figure 4.8: Detail of the parameter space associated to the free critical point cr1 (a).

Figure 4.9: Basins of attraction associated to the method with a = −2.125.


Figure 4.10: Basins of attraction associated to the method with a = −1.9.

Figure 4.11: Basins of attraction associated to the method with a = −2.18.


Figure 4.12: Basins of attraction associated to the method with a = −2.

Figure 4.13: Basins of attraction associated to the method with a = 0.


Figure 4.14: Basins of attraction associated to the method with a = −5.

Figure 4.15: Basins of attraction associated to the method with a = 5.


Figure 4.16: Basins of attraction associated to the method with a = 2i.

As a consequence, we can state that this family is very stable and that the
existing anomalies are very rare in practice.
References

[1] ARPREC. C++/Fortran-90 arbitrary precision package. Available at http://crd.lbl.gov/~dhbailey/mpdist/.
[2] S. Amat, S. Busquier and S. Plaza, Dynamics of the King and Jarratt itera-
tions, Aequationes Math., 69, 212–223, 2005.
[3] G. Ardelean, A comparison between iterative methods by using the basins
of attraction, Appl. Math. Comput., 218 (1), 88–95, 2011.
[4] I. K. Argyros, Computational theory of iterative methods, Series: Studies
in Computational Mathematics, Editors: C. K. Chui and L. Wuytack, Vol.
15, Elsevier Publ. Comp., New York, 2007.
[5] I. K. Argyros and S. Hilout, A unifying theorem for Newton’s method on
spaces with a convergence structure. J. Complexity, 27(1), 39–54, 2011.
[6] I. K. Argyros, D. Chen and Q. Qian, The Jarratt method in Banach space setting, J. Comput. Appl. Math., 51, 103–106, 1994.
[7] I. K. Argyros and Á. A. Magreñán, On the convergence of an optimal
fourth-order family of methods and its dynamics, Appl. Math. Comput.,
252, 336–346, 2015.
[8] I. K. Argyros and Á. A. Magreñán, Local Convergence and the Dynamics of a Two-Step Newton-Like Method, Int. J. Bifurcation and Chaos, 26 (5), 1630012, 2016.
[9] C. Chun and Y. Ham, Some sixth-order variants of Ostrowski root-finding
methods, Appl. Math. Comput., 193, 389–394, 2007.
[10] C. Chun and B. Neta, A new sixth-order scheme for nonlinear equations,
Appl. Math. Let., 25 (2), 185–189, 2012.
[11] A. Cordero, J. M. Gutiérrez, Á. A. Magreñán and J. R. Torregrosa, Stability analysis of a parametric family of iterative methods for solving nonlinear models, Appl. Math. Comput., 285, 26–40, 2016.
[12] A. Cordero, L. Feng, Á. A. Magreñán and J. R. Torregrosa, A new fourth-
order family for solving nonlinear problems and its dynamics, J. Math.
Chemistry, 53 (3), 893–910, 2015.
[13] P. Fatou, Sur les équations fonctionelles, Bull. Soc. Math. France, 47, 161–
271, 1919.
[14] A. Fraile, E. Larrodé, Á. A. Magreñán and J. A. Sicilia, Decision model
for siting transport and logistic facilities in urban environments: A method-
ological approach. J. Comput. Appl. Math., 291, 478–487, 2016.
[15] M. García-Olivo, J. M. Gutiérrez and Á. A. Magreñán, A complex dynamical approach of Chebyshev's method, SeMA Journal, 71 (1), 57–68, 2015.
[16] Y. H. Geum, Y. I. Kim, and Á. A. Magreñán, A biparametric extension
of King’s fourth-order methods and their dynamics, Appl. Math. Comput.,
282, 254–275, 2016.
[17] R. González-Crespo, R. Ferro, L. Joyanes, S. Velazco and A. G. Castillo,
Use of ARIMA mathematical analysis to model the implementation of ex-
pert system courses by means of free software OpenSim and Sloodle plat-
forms in virtual university campuses, Expert Systems with Applications, 40
(18), 7381–7390, 2013.
[18] M. Grau and J. L. Dı́az-Barrero, An improvement to Ostrowski root-finding
method, Appl. Math. Comput., 173 (1), 450–456, 2006.
[19] G. Julia, Memoire sur l’iteration des fonctions rationelles, J. de Math. pures
et appliquées, 8 (1), 47–215, 1918.
[20] S. K. Khattri and I. K. Argyros, Sixth order derivative free family of itera-
tive methods. Appl. Math. Comput., 217 (12), 5500–5507, 2011.
[21] S. K. Khattri and I. K. Argyros, How to develop fourth and seventh order
iterative methods? Novi. Sad. J. Math., 40, 61–67, 2010.
[22] S. K. Khattri and T. Log, Constructing third-order derivative-free iterative
methods. Int. J. Comput. Math., 88 (7), 1509–1518, 2011.
[23] Y. I. Kim, A new two-step biparametric family of sixth-order iterative meth-
ods free from second derivatives for solving nonlinear algebraic equations,
Appl. Math. Comput. 215, 3418–3424, 2009.
[24] H. T. Kung and J. F. Traub, Optimal order of one-point and multipoint iter-
ations, J. Assoc. Comput. Mach., 21, 643–651, 1974.
[25] W. Lorenzo, R. G. Crespo and A. Castillo, A prototype for linear features generalization. Int. J. Interac. Multim. Artif. Intell., 1 (3), 60–66, 2010.
[26] T. Lotfi, Á. A. Magreñán, K. Mahdiani and J. J. Rainer, A variant of Steffensen-King's type family with accelerated sixth-order convergence and high efficiency index: Dynamic study and approach, Appl. Math. Comput., 252, 347–353, 2015.
[27] T. Lotfi, Á. A. Magreñán, K. Mahdiani and J. J. Rainer, A variant of
Steffensen-King’s type family with accelerated sixth-order convergence
and high efficiency index: Dynamic study and approach, Appl. Math. Com-
put., 252, 347–353, 2015.
[28] Á. A. Magreñán, Different anomalies in a Jarratt family of iterative root-
finding methods, Appl. Math. Comput., 233, 29–38, 2014.
[29] Á. A. Magreñán, A new tool to study real dynamics: The convergence
plane, Appl. Math. Comput., 248, 215–224, 2014.
[30] Á. A. Magreñán and I. K. Argyros, On the local convergence and the dynamics of Chebyshev-Halley methods with six and eight order of convergence, J. Comput. Appl. Math., 298, 236–251, 2016.
[31] B. Neta, A sixth-order family of methods for nonlinear equations, Int. J. Comput. Math., 7, 157–161, 1979.
[32] A. M. Ostrowski, Solutions of Equations and System Equations, Academic
Press, New York, 1960.
[33] H. Ren, Q. Wu and W. Bi, New variants of Jarratt’s method with sixth-order
convergence, Numer. Algorithms, 52, 585–603, 2009.
[34] B. Royo, J. A. Sicilia, M. J. Oliveros and E. Larrodé, Solving a long-
distance routing problem using ant colony optimization. Appl. Math., 9
(2L), 415–421, 2015.
[35] J. R. Sharma and R. K. Guha, A family of modified Ostrowski methods with
accelerated sixth order convergence, Appl. Math. Comput., 190, 111–115,
2007.
[36] J. A. Sicilia, C. Quemada, B. Royo and D. Escuı́n, An optimization algo-
rithm for solving the rich vehicle routing problem based on variable neigh-
borhood search and tabu search metaheuristics. J. Comput. Appl. Math.,
291, 468–477, 2016.
[37] J. A. Sicilia, D. Escuı́n, B. Royo, E. Larrodé and J. Medrano, A hybrid
algorithm for solving the general vehicle routing problem in the case of the
urban freight distribution. In Computer-based Modelling and Optimization
in Transportation (pp. 463–475). Springer International Publishing, 2014.
[38] H. Susanto and N. Karjanto, Newton's method basin of attraction revisited, Appl. Math. Comput., 215, 1084–1090, 2009.
[39] J. F. Traub, Iterative Methods for the Solution of Equations, Prentice Hall,
1964. Reissued Chelsea Publishing Company, 1982.
[40] X. Wang, J. Kou and Y. Li, A variant of Jarratt method with sixth-order
convergence, Appl. Math. Comput., 204, 14–19, 2008.
Chapter 5

Local convergence and


basins of attraction of a
two-step Newton-like
method for equations with
solutions of multiplicity
greater than one

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
5.1 Introduction
5.2 Local convergence
5.3 Basins of attraction
5.3.1 Basins of F(x) = (x − 1)²(x + 1)
5.3.2 Basins of F(x) = (x − 1)³(x + 1)
5.3.3 Basins of F(x) = (x − 1)⁴(x + 1)
5.4 Numerical examples

5.1 Introduction
Let S = R or S = C, let D ⊆ S be convex and let F : D → S be a differentiable function. We shall approximate solutions x∗ of the equation

F(x) = 0.    (5.1)

Many problems from Applied Sciences, including engineering, can be solved by finding the solutions of equations of the form (5.1) [4, 5, 20, 25, 31, 39, 43–45].
We study the local convergence and the dynamics of the two-step Newton-like method defined for each n = 0, 1, 2, . . . by

\[
\begin{aligned}
y_n &= x_n - \alpha F'(x_n)^{-1} F(x_n),\\
x_{n+1} &= x_n - \varphi(x_n) F'(x_n)^{-1} F(x_n),
\end{aligned} \tag{5.2}
\]

where x0 is an initial point, α ∈ S is a given parameter and φ : S → S is a given function.
Special choices of the parameter α and the function φ lead to well known methods.
 Newton's method. Choose

α = 0 and φ(x) = 1 for each x ∈ D    (5.3)

to obtain

x_{n+1} = x_n − F′(x_n)^{−1}F(x_n), for each n = 0, 1, 2, . . . .    (5.4)

 Laguerre's method. Choose α = 1 and

\[
\varphi(x) = -1 + \frac{q}{1 + \operatorname{sign}(q-m)\sqrt{\dfrac{q-m}{m}\left[(q-1) - q\,\dfrac{F(x)F''(x)}{F'(x)^{2}}\right]}}\,,
\]

where q ≠ 0 is a real parameter. In particular, special choices of q lead to known methods, for instance q = 2m.

 Euler-Chebyshev method. Choose α = 1 and

\[
\varphi(x) = -1 + \frac{m(3-m)}{2} + \frac{m^{2}}{2}\,\frac{F(x)F''(x)}{F'(x)^{2}}.
\]
 Osada's method. Choose α = 1 and

\[
\varphi(x) = -1 + \frac{1}{2}\,m(m+1) - \frac{(m-1)^{2}}{2}\,\frac{F'(x)^{2}}{F''(x)F(x)}.
\]

Other choices of φ can be found in [4, 5, 13, 29, 30, 33, 34, 36–41, 45, 46] and
the references therein.
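For the scalar case, method (5.2) can be sketched in a few lines of Python; this is an illustrative implementation (the stopping rule and the test equation are our choices), with Newton's choice (5.3) as a sanity check. The first substep y_n is computed for fidelity to (5.2), even though, as written there, the x-iteration is driven by φ alone.

```python
def two_step_method(F, dF, x0, alpha, phi, tol=1e-12, max_iter=100):
    """A scalar sketch of method (5.2)."""
    x = x0
    for _ in range(max_iter):
        fx, dfx = F(x), dF(x)
        y = x - alpha * fx / dfx          # first substep of (5.2)
        x_new = x - phi(x) * fx / dfx     # second substep of (5.2)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Newton's method is recovered with the choice (5.3): alpha = 0, phi = 1.
root = two_step_method(lambda x: x**3 - 2, lambda x: 3 * x**2,
                       x0=1.0, alpha=0.0, phi=lambda x: 1.0)
print(root)   # close to 2**(1/3)
```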
The rest of the chapter is organized as follows: Section 5.2 contains the lo-
cal convergence analysis of method (5.2). The dynamics of method (5.2) are
discussed in Section 5.3. Finally, the numerical examples are presented in the
concluding Section 5.4.

5.2 Local convergence


We present the local convergence analysis of method (5.2) in this section. It is convenient for the local convergence analysis that follows to introduce some constants and functions. Let L0 > 0, L > 0, α ∈ S be given parameters and let m > 1 be a given integer. Set

\[
r_0 = \frac{2}{L_0}\left(\frac{3}{2}\right)^{m-2}.
\]
Define functions g0, H0, hm, g1 and H1 on the interval [0, r0) by

\[
g_0(t) = \frac{L_0}{2}\left(\frac{1}{3}\right)^{m-2} t + 1 - \left(\frac{1}{2}\right)^{m-2},
\qquad
H_0(t) = g_0(t) - 1,
\]

\[
h_m(t) =
\begin{cases}
\dfrac{L_0}{3}\,t + \dfrac{1}{2}, & m = 2,\\[2mm]
\dfrac{L_0}{2}\left(\dfrac{1}{3}\right)^{m-1} t + \dfrac{1}{3}\left(\dfrac{1}{2}\right)^{m-3}, & m \ge 3,
\end{cases}
\]

\[
g_1(t) = \frac{h_m(t) + \dfrac{|1-\alpha| L}{m}\left(\dfrac{1}{2}\right)^{m-2}}{1 - g_0(t)}
\]

and

\[
H_1(t) = h_m(t) + \frac{|1-\alpha| L}{m}\left(\frac{1}{2}\right)^{m-2} + g_0(t) - 1.
\]
Notice that

g0(r0) = 1, H0(r0) = 0 and 0 ≤ g0(t) < 1 for each t ∈ [0, r0).

Hence, functions g1 and H1 are well defined on the interval [0, r0). Suppose that

\[
h_m(0) + \left(\frac{1}{2}\right)^{m-2}\left[\frac{|1-\alpha| L}{m} - 1\right] < 0.
\]
Then, we have that

\[
H_1(0) = h_m(0) + \left(\frac{1}{2}\right)^{m-2}\left[\frac{|1-\alpha| L}{m} - 1\right] < 0
\]

and

\[
H_1(r_0) = h_m(r_0) + \left(\frac{1}{2}\right)^{m-2}\frac{|1-\alpha| L}{m} + \frac{L_0}{2}\left(\frac{1}{3}\right)^{m-2} r_0 - \left(\frac{1}{2}\right)^{m-2}
= h_m(r_0) + \left(\frac{1}{2}\right)^{m-2}\frac{|1-\alpha| L}{m} > 0,
\]

since g0(r0) = 1. It follows from the intermediate value theorem that function H1 has zeros in the interval (0, r0). Denote by r1 the smallest such zero. Then, we have that

H1(r1) = 0, r1 < r0 and 0 ≤ g1(t) < 1 for each t ∈ (0, r1).

Suppose that there exists a continuous function ψ : [0, r1) → [0, +∞) such that

\[
h_m(r_0) + \left(\frac{1}{2}\right)^{m-2}\left[\frac{L}{m}\bigl(|1-\alpha| + \psi(0)\bigr) - 1\right] < 0. \tag{5.5}
\]

Define the function g2 on the interval [0, r1) by

\[
g_2(t) = \frac{1}{1-g_0(t)}\left[h_m(t) + \left(\frac{1}{2}\right)^{m-2}\frac{L}{m}\bigl(|1-\alpha| + \psi(t)\bigr)\right]
\]

and the function H2 on the interval [0, r1] by

\[
\begin{aligned}
H_2(t) &= H_1(t) + \frac{L}{m}\,\psi(t)\left(\frac{1}{2}\right)^{m-2}\\
&= h_m(t) + \left(\frac{1}{2}\right)^{m-2}\frac{L}{m}\bigl(|1-\alpha| + \psi(t)\bigr)
+ \frac{L_0}{2}\left(\frac{1}{3}\right)^{m-2} t - \left(\frac{1}{2}\right)^{m-2}.
\end{aligned}
\]

By (5.5) we have that H2(0) < 0. Moreover, we have that

\[
H_2(r_1) = H_1(r_1) + \frac{L}{m}\,\psi(r_1)\left(\frac{1}{2}\right)^{m-2} \ge 0.
\]

Then, function H2 has zeros in the interval (0, r1]. Denote the smallest such zero by r. Then, we have that

r ≤ r1 < r0,    (5.6)
0 ≤ g0(t) < 1,    (5.7)

0 ≤ g1(t) < 1    (5.8)

and

0 ≤ g2(t) < 1 for each t ∈ [0, r).    (5.9)

Let U(v, ρ) and Ū(v, ρ) stand, respectively, for the open and closed balls in S with center v ∈ S and radius ρ > 0. Next, using the preceding notation, we present the local convergence analysis of method (5.2).
Theorem 5.2.1 Let m > 1 be an integer and let F : D ⊂ S → S be an m-times differentiable function. Suppose that there exist L0 > 0, L > 0, α ∈ S, x∗ ∈ D and functions φ : S → S, ψ : [0, r1) → [0, +∞) such that φ is continuous, ψ is continuous and non-decreasing, and the following hold:

\[
h_m(0) + \left(\frac{1}{2}\right)^{m-2}\left[\frac{L}{m}\bigl(|1-\alpha| + \psi(0)\bigr) - 1\right] < 0,
\]

\[
F(x_*) = F'(x_*) = \cdots = F^{(m-1)}(x_*) = 0, \quad F^{(m)}(x_*) \ne 0, \tag{5.10}
\]

\[
|1 - \varphi(x)| \le \psi(\|x - x_*\|) \quad \text{for each } x \in \bar U(x_*, r_1), \tag{5.11}
\]

\[
\|F^{(m)}(x_*)^{-1}\bigl(F^{(m)}(x) - F^{(m)}(x_*)\bigr)\| \le L_0\|x - x_*\| \quad \text{for each } x \in D, \tag{5.12}
\]

\[
\|F^{(m)}(x_*)^{-1}F^{(m)}(x)\| \le L \quad \text{for each } x \in D \tag{5.13}
\]

and

\[
\bar U(x_*, r) \subseteq D, \tag{5.14}
\]

where r1 and r are defined above this Theorem. Then, the sequence {xn} generated for x0 ∈ U(x∗, r) \ {x∗} by method (5.2) is well defined, remains in Ū(x∗, r) for each n = 0, 1, 2, . . . and converges to x∗. Moreover, the following estimates hold for each n = 0, 1, 2, . . .

\[
\|y_n - x_*\| \le g_1(\|x_n - x_*\|)\,\|x_n - x_*\| \le \|x_n - x_*\| < r \tag{5.15}
\]

and

\[
\|x_{n+1} - x_*\| \le g_2(\|x_n - x_*\|)\,\|x_n - x_*\| < \|x_n - x_*\|, \tag{5.16}
\]

where the “g” functions are defined previously. Furthermore, for R ∈ [r, r0) the limit point x∗ is the only solution of equation F(x) = 0 in Ū(x∗, R) ∩ D, where r0 is defined previously.
Proof: We shall show estimates (5.15) and (5.16) using mathematical induction.
By hypothesis x0 ∈ U(x∗ , r) \ {x∗ }. Then, we have in turn the approximation
\[
\begin{aligned}
F'(x_0) &- F^{(m)}(x_*)(x_0 - x_*)^{m-1}\\
&= F'(x_0) - F'(x_*) - F^{(m)}(x_*)(x_0 - x_*)^{m-1}\\
&= \int_0^1 F''(x_* + \theta_1(x_0 - x_*))(x_0 - x_*)\,d\theta_1 - F^{(m)}(x_*)(x_0 - x_*)^{m-1}\\
&= \int_0^1 \bigl[F''(x_* + \theta_1(x_0 - x_*)) - F''(x_*)\bigr](x_0 - x_*)\,d\theta_1 - F^{(m)}(x_*)(x_0 - x_*)^{m-1}\\
&= \int_0^1\!\!\int_0^1 F'''(x_* + \theta_2\theta_1(x_0 - x_*))\,\theta_1 (x_0 - x_*)^2\,d\theta_1\,d\theta_2 - F^{(m)}(x_*)(x_0 - x_*)^{m-1}\\
&= \cdots\\
&= \int_0^1\!\!\cdots\!\!\int_0^1 \bigl[F^{(m)}(x_* + \theta_{m-1}\cdots\theta_1(x_0 - x_*)) - F^{(m)}(x_*)\bigr]\,\theta_{m-2}\cdots\theta_1 (x_0 - x_*)^{m-1}\,d\theta_1\cdots d\theta_{m-1}\\
&\quad - \int_0^1\!\!\cdots\!\!\int_0^1 (1 - \theta_{m-2}\cdots\theta_1)\,F^{(m)}(x_*)(x_0 - x_*)^{m-1}\,d\theta_1\cdots d\theta_{m-1}.
\end{aligned} \tag{5.17}
\]
Then, using (5.6), (5.7), (5.12) and (5.17), we get for x0 ≠ x∗ that

\[
\begin{aligned}
\Bigl\|\bigl[F^{(m)}&(x_*)(x_0 - x_*)^{m-1}\bigr]^{-1}\bigl(F'(x_0) - F^{(m)}(x_*)(x_0 - x_*)^{m-1}\bigr)\Bigr\|\\
&\le \|x_0 - x_*\|^{1-m}\Bigl\{\Bigl\|\int_0^1\!\!\cdots\!\!\int_0^1 F^{(m)}(x_*)^{-1}\bigl[F^{(m)}(x_* + \theta_{m-1}\cdots\theta_1(x_0 - x_*)) - F^{(m)}(x_*)\bigr]\\
&\qquad\times\,\theta_{m-2}\cdots\theta_1(x_0 - x_*)^{m-1}\,d\theta_1\cdots d\theta_{m-1}\Bigr\|
+ \int_0^1\!\!\cdots\!\!\int_0^1 (1 - \theta_{m-2}\cdots\theta_1)\,\|x_0 - x_*\|^{m-1}\,d\theta_1\cdots d\theta_{m-1}\Bigr\}\\
&\le \|x_0 - x_*\|^{1-m}\Bigl\{L_0\int_0^1\!\!\cdots\!\!\int_0^1 \theta_{m-1}(\theta_{m-2}\cdots\theta_1)^2\,\|x_0 - x_*\|^m\,d\theta_1\cdots d\theta_{m-1}\\
&\qquad + \int_0^1\!\!\cdots\!\!\int_0^1 (1 - \theta_{m-2}\cdots\theta_1)\,\|x_0 - x_*\|^{m-1}\,d\theta_1\cdots d\theta_{m-1}\Bigr\}\\
&= \frac{L_0}{2}\left(\frac{1}{3}\right)^{m-2}\|x_0 - x_*\| + 1 - \left(\frac{1}{2}\right)^{m-2}
= g_0(\|x_0 - x_*\|) \le g_0(r) < g_0(r_0) = 1.
\end{aligned} \tag{5.18}
\]
It follows from (5.18) and the Banach lemma on invertible operators [4, 5, 39, 45] that F′(x0)^{−1} ∈ L(S, S) and

\[
\|F'(x_0)^{-1}F^{(m)}(x_*)\| \le \frac{1}{\|x_0 - x_*\|^{m-1}\bigl(1 - g_0(\|x_0 - x_*\|)\bigr)}. \tag{5.19}
\]
Hence, y0 is well defined by the first substep of method (5.2) for n = 0. Using method (5.2) for n = 0 and (5.10), we obtain the approximation

\[
\begin{aligned}
y_0 - x_* &= x_0 - x_* - F'(x_0)^{-1}F(x_0) + (1-\alpha)F'(x_0)^{-1}F(x_0)\\
&= F'(x_0)^{-1}\bigl[F'(x_0)(x_0 - x_*) - F(x_0)\bigr] + (1-\alpha)F'(x_0)^{-1}F(x_0)\\
&= -\bigl(F'(x_0)^{-1}F^{(m)}(x_*)\bigr)\int_0^1 F^{(m)}(x_*)^{-1}\bigl[F'(x_* + \theta_1(x_0 - x_*)) - F'(x_0)\bigr](x_0 - x_*)\,d\theta_1\\
&\quad + (1-\alpha)\bigl(F'(x_0)^{-1}F^{(m)}(x_*)\bigr)F^{(m)}(x_*)^{-1}F(x_0).
\end{aligned} \tag{5.20}
\]
We shall find upper bounds on the norms of the integral expression and of F(x0) in (5.20). We have in turn that

\[
\begin{aligned}
\int_0^1 &\bigl[F'(x_* + \theta_1(x_0 - x_*)) - F'(x_0)\bigr](x_0 - x_*)\,d\theta_1\\
&= -\int_0^1\!\!\int_0^1 F''(x_0 + \theta_2(x_* + \theta_1(x_0 - x_*) - x_0))(1-\theta_1)(x_0 - x_*)^2\,d\theta_1\,d\theta_2\\
&= -\int_0^1\!\!\int_0^1 \bigl[F''(x_0 + \theta_2(\theta_1 - 1)(x_0 - x_*)) - F''(x_*)\bigr](1-\theta_1)(x_0 - x_*)^2\,d\theta_1\,d\theta_2\\
&= -\int_0^1\!\!\int_0^1\!\!\int_0^1 F'''(x_* + \theta_3(x_0 + \theta_2(\theta_1-1)(x_0 - x_*) - x_*))(1-(1-\theta_1)\theta_2)(1-\theta_1)(x_0 - x_*)^3\,d\theta_1\,d\theta_2\,d\theta_3\\
&= -\int_0^1\!\!\int_0^1\!\!\int_0^1 \bigl[F'''(x_* + \theta_3(x_0 + \theta_2(\theta_1-1)(x_0 - x_*) - x_*)) - F'''(x_*)\bigr](1-(1-\theta_1)\theta_2)(1-\theta_1)(x_0 - x_*)^3\,d\theta_1\,d\theta_2\,d\theta_3\\
&= \cdots\\
&= -\int_0^1\!\!\cdots\!\!\int_0^1 \bigl[F^{(m)}(x_* + \theta_m\theta_{m-2}\cdots\theta_3(1-(1-\theta_1)\theta_2)(x_0 - x_*)) - F^{(m)}(x_*)\bigr]\\
&\qquad\times\,\theta_{m-1}\cdots\theta_3(1-(1-\theta_1)\theta_2)(1-\theta_1)(x_0 - x_*)^m\,d\theta_1\cdots d\theta_m\\
&\quad - \int_0^1\!\!\cdots\!\!\int_0^1 F^{(m)}(x_*)\,\theta_{m-1}\cdots\theta_3(1-(1-\theta_1)\theta_2)(1-\theta_1)(x_0 - x_*)^m\,d\theta_1\cdots d\theta_{m-1}.
\end{aligned} \tag{5.21}
\]
Then, using (5.21), the definition of the function hm and (5.12), we get that

\[
\Bigl\|F^{(m)}(x_*)^{-1}\int_0^1 \bigl[F'(x_* + \theta_1(x_0 - x_*)) - F'(x_0)\bigr](x_0 - x_*)\,d\theta_1\Bigr\| \le h_m(\|x_0 - x_*\|)\,\|x_0 - x_*\|^m. \tag{5.22}
\]
We also have that

\[
F'(x_0) = F'(x_0) - F'(x_*) = \int_0^1 F''(x_* + \theta_1(x_0 - x_*))(x_0 - x_*)\,d\theta_1
= \cdots = \int_0^1\!\!\cdots\!\!\int_0^1 F^{(m)}(x_* + \theta_{m-1}\cdots\theta_1(x_0 - x_*))\,\theta_{m-2}\cdots\theta_1(x_0 - x_*)^{m-1}\,d\theta_1\cdots d\theta_{m-1}. \tag{5.23}
\]

Using (5.13) and (5.23), we get that

\[
\|F^{(m)}(x_*)^{-1}F'(x_0)\| \le L\left(\frac{1}{2}\right)^{m-2}\|x_0 - x_*\|^{m-1}. \tag{5.24}
\]
Therefore, since

\[
F(x_0) = F(x_0) - F(x_*) = \int_0^1 F'(x_* + \theta(x_0 - x_*))(x_0 - x_*)\,d\theta, \tag{5.25}
\]

we get from (5.24) that

\[
\|F^{(m)}(x_*)^{-1}F(x_0)\| = \Bigl\|\int_0^1 F^{(m)}(x_*)^{-1}F'(x_* + \theta(x_0 - x_*))(x_0 - x_*)\,d\theta\Bigr\|
\le L\int_0^1 \theta^{m-1}\left(\frac{1}{2}\right)^{m-2}\|x_0 - x_*\|^m\,d\theta = \frac{L}{m}\left(\frac{1}{2}\right)^{m-2}\|x_0 - x_*\|^m. \tag{5.26}
\]
Then, using the definition of the function g1, (5.6), (5.8), (5.19), (5.20), (5.22) and (5.26), we get in turn that

\[
\begin{aligned}
\|y_0 - x_*\| &\le \|F'(x_0)^{-1}F^{(m)}(x_*)\|\,\Bigl\|\int_0^1 F^{(m)}(x_*)^{-1}\bigl[F'(x_* + \theta(x_0 - x_*)) - F'(x_0)\bigr](x_0 - x_*)\,d\theta\Bigr\|\\
&\quad + |1-\alpha|\,\|F'(x_0)^{-1}F^{(m)}(x_*)\|\,\|F^{(m)}(x_*)^{-1}F(x_0)\|\\
&\le \frac{\Bigl[h_m(\|x_0 - x_*\|) + \dfrac{|1-\alpha| L}{m}\left(\dfrac{1}{2}\right)^{m-2}\Bigr]\|x_0 - x_*\|^m}{\|x_0 - x_*\|^{m-1}\bigl(1 - g_0(\|x_0 - x_*\|)\bigr)}\\
&= g_1(\|x_0 - x_*\|)\,\|x_0 - x_*\| < \|x_0 - x_*\| < r,
\end{aligned} \tag{5.27}
\]
which shows (5.15) for n = 0 and y0 ∈ U(x∗ , r). We also have that x1 is well
defined by the second substep of method (5.2) for n = 0. Using method (5.2) for
n = 0, we have that

\[
\begin{aligned}
x_1 - x_* &= x_0 - x_* - \varphi(x_0)F'(x_0)^{-1}F(x_0)\\
&= x_0 - x_* - F'(x_0)^{-1}F(x_0) + (1 - \varphi(x_0))F'(x_0)^{-1}F(x_0)\\
&= y_0 - x_* + (1 - \varphi(x_0))\bigl(F'(x_0)^{-1}F^{(m)}(x_*)\bigr)F^{(m)}(x_*)^{-1}F(x_0).
\end{aligned} \tag{5.28}
\]

In view of the definition of the function g2, (5.6), (5.9), (5.11), (5.19), (5.26) and (5.28), we get that

\[
\begin{aligned}
\|x_1 - x_*\| &\le \|y_0 - x_*\| + |1 - \varphi(x_0)|\,\|F'(x_0)^{-1}F^{(m)}(x_*)\|\,\|F^{(m)}(x_*)^{-1}F(x_0)\|\\
&\le \Biggl[g_1(\|x_0 - x_*\|) + \frac{\psi(\|x_0 - x_*\|)\,\dfrac{L}{m}\left(\dfrac{1}{2}\right)^{m-2}\|x_0 - x_*\|^{m-1}}{\|x_0 - x_*\|^{m-1}\bigl(1 - g_0(\|x_0 - x_*\|)\bigr)}\Biggr]\|x_0 - x_*\|\\
&= g_2(\|x_0 - x_*\|)\,\|x_0 - x_*\| < \|x_0 - x_*\| < r,
\end{aligned} \tag{5.29}
\]

which shows (5.16) for n = 0 and x1 ∈ U(x , r). By simply replacing x0 , y0 , x1
by xk , yk , xk+1 in the preceding estimates we arrive at estimates (5.15) and (5.16).
Then, from the estimates xk+1 −x∗  ≤ cxk −x∗  < r, x = g2 (x0 −x∗ ) ∈ [0, 1),
we get that lim xk = x∗ and xk+1 ∈ U(x∗ , r). Finally to show the uniqueness part,
k→∞
let y∗ ∈ Ū(x∗ , R) be such that F(y∗ ) = F  (y∗ ) = · · · = F (m−1) (y∗ ) = 0, F (m) (y∗ ) =
1
0. Set T = 0 F  (y∗ + θ (x∗ − y∗ ))dθ . Then, in view of (5.18) for x0 replaced by
z = y∗ + θ (x∗ − y∗ ) we get in turn that

\[
\begin{aligned}
\Bigl\|\bigl[F^{(m)}(x_*)(z - x_*)^{m-1}\bigr]^{-1}\bigl(T - F^{(m)}(x_*)(z - x_*)^{m-1}\bigr)\Bigr\|
&\le \frac{L_0}{2}\left(\frac{1}{3}\right)^{m-2}\int_0^1 (1-\theta)\,\|y_* - x_*\|\,d\theta + 1 - \left(\frac{1}{2}\right)^{m-2}\\
&\le \frac{L_0}{2}\left(\frac{1}{3}\right)^{m-2} R + 1 - \left(\frac{1}{2}\right)^{m-2} < 1.
\end{aligned}
\]

It follows that T^{−1} exists. Then, from the identity 0 = F(y∗) − F(x∗) = T(y∗ − x∗), we conclude that x∗ = y∗.

5.3 Basins of attraction


The dynamical properties related to an iterative method applied to polynomials give important information about its stability and reliability. Many authors such as Amat et al. [1–3], Gutiérrez et al. [21], Magreñán [33, 34] and many others [6–13, 15–19, 21–24, 26–29, 32, 35, 38–41, 45] have found interesting dynamical planes, including periodical behavior and other anomalies. One of our main interests in this chapter is the study of the parameter spaces associated to a family
of iterative methods, which allows us to distinguish between the good and bad methods in terms of their numerical properties.
Firstly, some concepts of complex dynamics that are used in this work are shown. Given a rational function R : Ĉ → Ĉ, where Ĉ is the Riemann sphere, the orbit of a point z0 ∈ Ĉ is defined as

{z0, R(z0), R²(z0), . . . , Rⁿ(z0), . . .}.

A point z0 ∈ Ĉ is called a fixed point of R(z) if it verifies R(z0) = z0. Moreover, z0 is called a periodic point of period p > 1 if it is a point such that R^p(z0) = z0 but R^k(z0) ≠ z0 for each k < p. Moreover, a point z0 is called pre-periodic if it is not periodic but there exists a k > 0 such that R^k(z0) is periodic.
There exist different types of fixed points depending on the associated multiplier |R′(z0)|. Taking the associated multiplier into account, a fixed point z0 is called:
 superattractor if |R′(z0)| = 0,
 attractor if |R′(z0)| < 1,
 repulsor if |R′(z0)| > 1,
 and parabolic if |R′(z0)| = 1.
The fixed points that do not correspond to the roots of the polynomial p(z) are called strange fixed points. On the other hand, a critical point z0 is a point which satisfies R′(z0) = 0.
The basin of attraction of an attractor α is defined as

A (α) = {z0 ∈ Ĉ : Rn (z0 ) →α, n→∞}.

The Fatou set of the rational function R, F (R), is the set of points z ∈ Ĉ
whose orbits tend to an attractor (fixed point, periodic orbit or infinity). Its com-
plement in Ĉ is the Julia set, J (R). That means that the basin of attraction of
any fixed point belongs to the Fatou set and the boundaries of these basins of
attraction belong to the Julia set.
Now, we are going to see some of the basins of attraction related to different choices of α and φ(x). Specifically, we have chosen three values of α:
 α = 1/2,
 α = 1,
 α = −1,
and three choices of the function φ:
 φ(x) = x²,
 φ(x) = x/2,
 φ(x) = x⁴ + 1,
when method (5.2) is applied to three different polynomials:
 F(x) = (x − 1)²(x + 1),
 F(x) = (x − 1)³(x + 1),
 F(x) = (x − 1)⁴(x + 1).
A point is painted in cyan if the iteration of the method starting in x0 converges to the fixed point 1 (of multiplicity greater than one), in magenta if it converges to −1 (a simple root) and in yellow if the iteration diverges to ∞. If after 100 iterations the sequence neither converges to any of the roots with a tolerance of 10⁻³ nor diverges to infinity, that point is painted in black.
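A rough sketch of how such basin pictures can be produced is the following; it iterates the second substep of (5.2) on a complex grid and colors each starting point by the detected limit. The grid bounds, resolution and escape radius are illustrative assumptions, not values taken from the original experiments.

```python
import numpy as np

def basin_image(phi, F, dF, bounds=(-3, 3, -3, 3), n=400,
                max_iter=100, tol=1e-3, escape=1e6):
    """Color codes: 0 -> root 1 (cyan), 1 -> root -1 (magenta),
    2 -> divergence to infinity (yellow), 3 -> none detected (black)."""
    xmin, xmax, ymin, ymax = bounds
    img = np.full((n, n), 3, dtype=int)
    for i, y in enumerate(np.linspace(ymin, ymax, n)):
        for j, x in enumerate(np.linspace(xmin, xmax, n)):
            z = complex(x, y)
            for _ in range(max_iter):
                try:
                    z = z - phi(z) * F(z) / dF(z)   # second substep of (5.2)
                except ZeroDivisionError:
                    break
                if abs(z - 1) < tol:
                    img[i, j] = 0; break
                if abs(z + 1) < tol:
                    img[i, j] = 1; break
                if abs(z) > escape:
                    img[i, j] = 2; break
    return img

F  = lambda z: (z - 1)**2 * (z + 1)
dF = lambda z: (z - 1)**2 + 2 * (z - 1) * (z + 1)
img = basin_image(phi=lambda z: z**2, F=F, dF=dF)
```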

5.3.1 Basins of F(x) = (x − 1)²(x + 1)

In this subsection we are going to see how the basins of attraction change with α and with the function φ(x) when F(x) = (x − 1)²(x + 1).
In Figure 5.1 we see how the basins change with α for φ(x) = x².

Figure 5.1: Basins of attraction of method (5.2) associated to the roots of F(x) = (x − 1)²(x + 1) with φ(x) = x², for α = 0.5, α = 1 and α = −1 (left to right).

In Figure 5.2 we see how the basins change with α for φ(x) = x/2.
In Figure 5.3 we see how the basins change with α for φ(x) = x⁴ + 1.
Figure 5.2: Basins of attraction of method (5.2) associated to the roots of F(x) = (x − 1)²(x + 1) with φ(x) = x/2, for α = 0.5, α = 1 and α = −1 (left to right).

Figure 5.3: Basins of attraction of method (5.2) associated to the roots of F(x) = (x − 1)²(x + 1) with φ(x) = x⁴ + 1, for α = 0.5, α = 1 and α = −1 (left to right).

5.3.2 Basins of F(x) = (x − 1)³(x + 1)

In this subsection we are going to see how the basins of attraction change with α and with the function φ(x) when F(x) = (x − 1)³(x + 1).
In Figure 5.4 we see how the basins change with α for φ(x) = x².
In Figure 5.5 we see how the basins change with α for φ(x) = x/2.
In Figure 5.6 we see how the basins change with α for φ(x) = x⁴ + 1.

5.3.3 Basins of F(x) = (x − 1)⁴(x + 1)

In this subsection we are going to see how the basins of attraction change with α and with the function φ(x) when F(x) = (x − 1)⁴(x + 1).
In Figure 5.7 we see how the basins change with α for φ(x) = x².
In Figure 5.8 we see how the basins change with α for φ(x) = x/2.
Figure 5.4: Basins of attraction of method (5.2) associated to the roots of F(x) = (x − 1)³(x + 1) with φ(x) = x², for α = 0.5, α = 1 and α = −1 (left to right).

Figure 5.5: Basins of attraction of method (5.2) associated to the roots of F(x) = (x − 1)³(x + 1) with φ(x) = x/2, for α = 0.5, α = 1 and α = −1 (left to right).

Figure 5.6: Basins of attraction of method (5.2) associated to the roots of F(x) = (x − 1)³(x + 1) with φ(x) = x⁴ + 1, for α = 0.5, α = 1 and α = −1 (left to right).
Figure 5.7: Basins of attraction of method (5.2) associated to the roots of F(x) = (x − 1)⁴(x + 1) with φ(x) = x², for α = 0.5, α = 1 and α = −1 (left to right).

Figure 5.8: Basins of attraction of method (5.2) associated to the roots of F(x) = (x − 1)⁴(x + 1) with φ(x) = x/2, for α = 0.5, α = 1 and α = −1 (left to right).

In Figure 5.9 we see how the basins change with α for φ(x) = x⁴ + 1.

5.4 Numerical examples


We present numerical examples in this section.
Example 5.4.1 Let S = R, D = [−1, 1] and define the function F on D by

F(x) = x(eˣ − 1).

Then, we have that x∗ = 0, m = 2, F″(x∗) = 2,

\[
L = \frac{3}{2}e
\]

and

\[
L_0 = \frac{1}{2}(3e - 4) = \frac{3e}{2} - 2,
\]
Figure 5.9: Basins of attraction of method (5.2) associated to the roots of F(x) = (x − 1)⁴(x + 1) with φ(x) = x⁴ + 1, for α = 0.5, α = 1 and α = −1 (left to right).

since

\[
L = \frac{1}{2}\max_{-1\le x\le 1}\left|(x+2)e^{x}\right| = \frac{3}{2}e
\]

and

\[
L_0 = \frac{1}{2}\max_{-1\le x\le 1}\left|2\left(\frac{x}{2!} + \cdots + \frac{x^{n-1}}{n!} + \cdots\right) + e^{x}\right| = \frac{1}{2}(3e - 4).
\]
Now choosing

α = 0.85, φ(x) = 1 − x and ψ(t) = t,

we obtain that the conditions of Theorem 5.2.1 hold, and method (5.2) with α = 0.85 and φ(x) = 1 − x converges to x∗ = 0. Moreover,

r0 = 0.962731 . . . , r1 = 0.112174 . . . and r = 0.051511 . . . .
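These radii can be reproduced directly from the functions g0, hm, H1 and H2 of Section 5.2; the following small script (ours, for verification only) finds the smallest zeros by bisection for the data of Example 5.4.1.

```python
import math

# Data of Example 5.4.1: m = 2, alpha = 0.85, psi(t) = t.
e = math.e
L, L0, alpha, m = 1.5 * e, 1.5 * e - 2, 0.85, 2

g0 = lambda t: (L0 / 2) * (1 / 3)**(m - 2) * t + 1 - 0.5**(m - 2)
hm = lambda t: L0 * t / 3 + 0.5                      # the m = 2 branch
H1 = lambda t: hm(t) + 0.5**(m - 2) * abs(1 - alpha) * L / m + g0(t) - 1
H2 = lambda t: hm(t) + 0.5**(m - 2) * (L / m) * (abs(1 - alpha) + t) + g0(t) - 1

def smallest_zero(H, a, b, iters=200):
    """Bisection on [a, b] with H(a) < 0 < H(b); here H is monotone,
    so this is the smallest zero."""
    for _ in range(iters):
        c = 0.5 * (a + b)
        a, b = (c, b) if H(c) < 0 else (a, c)
    return 0.5 * (a + b)

r0 = (2 / L0) * 1.5**(m - 2)
r1 = smallest_zero(H1, 0.0, r0)
r  = smallest_zero(H2, 0.0, r1)
print(r0, r1, r)   # 0.962731..., 0.112174..., 0.051511...
```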
Example 5.4.2 Let S = R, D = (1, 3) and define the function F on D by

F(x) = (x − 2)⁶(x − 3)⁴.

Then, we have that x∗ = 2, m = 6, F⁽⁶⁾(x∗) = 720,

L = 743 and L0 = 742.

Now choosing

α = 1, φ(x) = 3x − 5 and ψ(t) = 3t,

we obtain that the conditions of Theorem 5.2.1 hold, and method (5.2) with α = 1 and φ(x) = 3x − 5 converges to x∗ = 2. Moreover,

r0 = 0.0136456 . . . , r1 = 0.00341139 . . . and r = 0.000710411 . . . .
References

[1] S. Amat, S. Busquier and S. Plaza, Dynamics of the King and Jarratt itera-
tions, Aequationes Math., 69 (3), 212–223, 2005.
[2] S. Amat, S. Busquier and S. Plaza, Chaotic dynamics of a third-order
Newton-type method, J. Math. Anal. Appl., 366 (1), 24–32, 2010.
[3] S. Amat, M. A. Hernández and N. Romero, A modified Chebyshev’s itera-
tive method with at least sixth order of convergence, Appl. Math. Comput.,
206 (1), 164–174, 2008.
[4] I. K. Argyros, Convergence and Application of Newton-type Iterations,
Springer, 2008.
[5] I. K. Argyros and S. Hilout, Numerical methods in Nonlinear Analysis,
World Scientific Publ. Comp. New Jersey, 2013.
[6] I. K. Argyros and Á. A. Magreñán, On the convergence of an optimal
fourth-order family of methods and its dynamics, Appl. Math. Comput. 252,
336–346, 2015.
[7] I. K. Argyros and Á. A. Magreñán, Local Convergence and the Dynamics
of a Two-Step Newton-Like Method, Int. J. Bifurcation and Chaos, 26 (5),
1630012 2016.
[8] R. Behl, Development and analysis of some new iterative methods for nu-
merical solutions of nonlinear equations (PhD Thesis), Punjab University,
2013.
[9] D. D. Bruns and J. E. Bailey, Nonlinear feedback control for operating a
nonisothermal CSTR near an unstable steady state, Chem. Eng. Sci., 32,
257–264, 1977.

[10] V. Candela and A. Marquina, Recurrence relations for rational cubic meth-
ods I: The Halley method, Computing, 44, 169–184, 1990.
[11] V. Candela and A. Marquina, Recurrence relations for rational cubic meth-
ods II: The Chebyshev method, Computing, 45 (4), 355–367, 1990.
[12] F. Chicharro, A. Cordero and J. R. Torregrosa, Drawing dynamical and
parameters planes of iterative families and methods, The Scientific World
Journal, Volume 2013, Article ID 780153.
[13] C. Chun, Some improvements of Jarratt's method with sixth-order convergence, Appl. Math. Comput., 190 (2), 1432–1437, 2007.
[14] A. Cordero, J. Garcı́a-Maimó, J. R. Torregrosa, M. P. Vassileva and P. Vin-
del, Chaos in King’s iterative family, Appl. Math. Let., 26, 842–848, 2013.
[15] A. Cordero, J. M. Gutiérrez, Á. A. Magreñán and J. R. Torregrosa, Stability analysis of a parametric family of iterative methods for solving nonlinear models, Appl. Math. Comput., 285, 26–40, 2016.
[16] A. Cordero, L. Feng, Á. A. Magreñán and J. R. Torregrosa, A new fourth-order family for solving nonlinear problems and its dynamics, J. Math. Chemistry, 53 (3), 893–910, 2015.
[17] J. A. Ezquerro and M. A. Hernández, Recurrence relations for Chebyshev-
type methods, Appl. Math. Optim., 41(2), 227–236, 2000.
[18] J. A. Ezquerro and M. A. Hernández, New iterations of R-order four with
reduced computational cost, BIT Numer. Math., 49, 325–342, 2009.
[19] J. A. Ezquerro and M. A. Hernández, On the R-order of the Halley method,
J. Math. Anal. Appl., 303, 591–601, 2005.
[20] A. Fraile, E. Larrodé, Á. A. Magreñán and J. A. Sicilia, Decision model
for siting transport and logistic facilities in urban environments: A method-
ological approach. J. Comput. Appl. Math., 291, 478–487, 2016.
[21] J. M. Gutiérrez and M. A. Hernández, Recurrence relations for the super-
Halley method, Comput. Math. Appl., 36 (7), 1–8, 1998.
[22] M. Ganesh and M. C. Joshi, Numerical solvability of Hammerstein integral
equations of mixed type, IMA J. Numer. Anal., 11, 21–31, 1991.
[23] M. García-Olivo, J. M. Gutiérrez and Á. A. Magreñán, A complex dynamical approach of Chebyshev's method, SeMA Journal, 71 (1), 57–68, 2015.
[24] Y. H. Geum, Y. I. Kim and Á. A. Magreñán, A biparametric extension of King's fourth-order methods and their dynamics, Appl. Math. Comput., 282, 254–275, 2016.
[25] R. González-Crespo, R. Ferro, L. Joyanes, S. Velazco and A. G. Castillo, Use of ARIMA mathematical analysis to model the implementation of expert system courses by means of free software OpenSim and Sloodle platforms in virtual university campuses, Expert Systems with Applications, 40 (18), 7381–7390, 2013.
[26] M. A. Hernández, Chebyshev’s approximation algorithms and applications,
Computers Math. Applic., 41 (3-4), 433–455, 2001.
[27] M. A. Hernández and M. A. Salanova, Sufficient conditions for semilocal
convergence of a fourth order multipoint iterative method for solving equa-
tions in Banach spaces, Southwest J. Pure Appl. Math., 1, 29–40, 1999.
[28] P. Jarratt, Some fourth order multipoint methods for solving equations,
Math. Comput., 20 (95), 434–437, 1966.
[29] J. Kou and Y. Li, An improvement of the Jarratt method, Appl. Math. Com-
put., 189, 1816–1821, 2007.
[30] J. Kou and X. Wang, Semilocal convergence of a modified multi-point Jar-
ratt method in Banach spaces under general continuity conditions, Numer.
Algorithms, 60, 369–390, 2012.
[31] W. Lorenzo, R. G. Crespo and A. Castillo, A prototype for linear features
generalization, Int. J. Interac. Multim. Artif. Intell., 1(3), 60–66, 2010.
[32] T. Lotfi, Á. A. Magreñán, K. Mahdiani and J. J. Rainer, A variant of Steffensen-King's type family with accelerated sixth-order convergence and high efficiency index: Dynamic study and approach, Appl. Math. Comput., 252, 347–353, 2015.
[33] Á. A. Magreñán, Different anomalies in a Jarratt family of iterative root-
finding methods, Appl. Math. Comput., 233, 29–38, 2014.
[34] Á. A. Magreñán, A new tool to study real dynamics: The convergence
plane, Appl. Math. Comput., 248, 215–224, 2014.
[35] Á. A. Magreñán and I. K. Argyros, On the local convergence and the dynamics of Chebyshev-Halley methods with six and eight order of convergence, J. Comput. Appl. Math., 298, 236–251, 2016.
[36] B. Neta and C. Chun, A family of Laguerre methods to find multiple roots
of nonlinear equations, Appl. Math. Comput., 219, 10987–11004, 2013.
[37] B. Neta and C. Chun, Erratum on a family of Laguerre methods to find
multiple roots of nonlinear equations, Appl. Math. Comput., 248, 693–696,
2014.
[38] M. Petković, B. Neta, L. Petković and J. Džunić, Multipoint Methods for Solving Nonlinear Equations, Elsevier, 2013.
[39] L. B. Rall, Computational solution of nonlinear operator equations, Robert
E. Krieger, New York, 1979.
[40] H. Ren, Q. Wu and W. Bi, New variants of Jarratt method with sixth-order
convergence, Numer. Algorithms, 52(4), 585–603, 2009.
[41] W. C. Rheinboldt, An adaptive continuation process for solving systems
of nonlinear equations, Polish Academy of Science, Banach Ctr. Publ., 3,
129–142, 1978.
[42] B. Royo, J. A. Sicilia, M. J. Oliveros and E. Larrodé, Solving a long-
distance routing problem using ant colony optimization. Appl. Math., 9
(2L), 415–421, 2015.
[43] J. A. Sicilia, C. Quemada, B. Royo and D. Escuı́n, An optimization algo-
rithm for solving the rich vehicle routing problem based on variable neigh-
borhood search and tabu search metaheuristics. J. Comput. Appl. Math.,
291, 468–477, 2016.
[44] J. A. Sicilia, D. Escuı́n, B. Royo, E. Larrodé and J. Medrano, A hybrid
algorithm for solving the general vehicle routing problem in the case of the
urban freight distribution. In Computer-based Modelling and Optimization
in Transportation (pp. 463–475). Springer International Publishing, 2014.
[45] J. F. Traub, Iterative methods for the solution of equations, Prentice-Hall
Series in Automatic Computation, Englewood Cliffs, N. J., 1964.
[46] X. Wang, J. Kou and C. Gu, Semilocal convergence of a sixth-order Jarratt
method in Banach spaces, Numer. Algorithms, 57, 441–456, 2011.
Chapter 6

Extending the
Kantorovich theory for
solving equations

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
6.1 Introduction
6.2 First convergence improvement
6.3 Second convergence improvement

6.1 Introduction
Let X, Y be Banach spaces, let D ⊂ X be convex and let F : D ⊂ X → Y be a Fréchet differentiable operator. We shall determine a solution x∗ of the equation

F(x) = 0.    (6.1)
Many problems from Applied Sciences can be solved by finding the solutions of equations of the form (6.1) [5, 9, 12, 13, 27, 31–33].
The most popular iterative method is undoubtedly Newton's method, defined by

x_{n+1} = x_n − F′(x_n)^{−1}F(x_n), for each n = 0, 1, 2, · · · ,    (6.2)

where x0 is an initial point. Newton's method has order of convergence 2 if the initial point x0 is chosen sufficiently close to the solution x∗.
The computation of the inverse may be expensive in general. That is why the modified Newton's method, defined by

y_{n+1} = y_n − F′(y_0)^{−1}F(y_n), for each n = 0, 1, 2, · · · ,    (6.3)

where y0 = x0, is often used instead. The modified Newton's method is only linearly convergent.


The famous Kantorovich theorem [26] is an important result in numerical analysis and in Nonlinear Functional Analysis, where it is also used as a semilocal result for establishing the existence of a solution of a nonlinear equation in an abstract space. The well-known Kantorovich hypothesis

\[
h = L\eta \le \frac{1}{2} \tag{6.4}
\]

is the sufficient criterion for the semilocal convergence of Newton's method, where L > 0 and η > 0 are defined by

\[
\|F'(x_0)^{-1}(F'(x) - F'(y))\| \le L\|x - y\| \quad \text{for each } x, y \in D
\]

and

\[
\|F'(x_0)^{-1}F(x_0)\| \le \eta,
\]

respectively.
In a series of papers [1–11, 14–17], we provided weaker sufficient convergence criteria. One of the most recent and weakest criteria is given by

\[
h_1 = L_1\eta \le \frac{1}{2}, \tag{6.5}
\]

where

\[
L_1 = \frac{1}{8}\left(4L_0 + \sqrt{L_0 L + 8L_0^{2}} + \sqrt{L_0 L}\right)
\]

and L0 is the center-Lipschitz constant given by

\[
\|F'(x_0)^{-1}(F'(x) - F'(x_0))\| \le L_0\|x - x_0\| \quad \text{for each } x \in D.
\]

Notice that L0 ≤ L,

\[
h \le \frac{1}{2} \;\Rightarrow\; h_1 \le \frac{1}{2} \tag{6.6}
\]
and

\[
\frac{h_1}{h} \to 0 \quad \text{as} \quad \frac{L_0}{L} \to 0. \tag{6.7}
\]

It is well known that in practice the computation of the constant L requires that of the constant L0. Throughout the chapter we denote by U(x, ρ) and Ū(x, ρ) the open and closed balls with center x ∈ D and radius ρ > 0, respectively.
In this chapter we present two improvements of criterion (6.5), based on the modified Newton's method and on Newton's method, in Sections 6.2 and 6.3 respectively.
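The contrast between (6.2) and (6.3) is easy to observe numerically; a minimal sketch (the test equation x² = 2 and the iteration counts are illustrative choices) shows the quadratic decay of Newton's error against the linear decay of the modified method.

```python
def newton(F, dF, x0, n):
    """Newton's method (6.2)."""
    x = x0
    for _ in range(n):
        x = x - F(x) / dF(x)
    return x

def modified_newton(F, dF, x0, n):
    """Modified Newton's method (6.3): the derivative is frozen at y0 = x0."""
    y, d0 = x0, dF(x0)
    for _ in range(n):
        y = y - F(y) / d0
    return y

F, dF = lambda x: x**2 - 2, lambda x: 2 * x
for n in (2, 4, 8):
    print(n, abs(newton(F, dF, 1.0, n) - 2**0.5),
             abs(modified_newton(F, dF, 1.0, n) - 2**0.5))
```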

6.2 First convergence improvement

In this section, using the modified Newton's method previously defined, we provide an improvement of the convergence criteria for Newton's method.

Remark 6.2.1 As is shown in [8], if

\[
\eta \le \frac{1}{2L_0},
\]

then equation F(x) = 0 has a unique solution in Ū(x0, r0), where

\[
r_0 = \frac{1 - \sqrt{1 - 2L_0\eta}}{L_0}.
\]

On the other hand, the modified Newton's method

y_{n+1} = y_n − F′(y_0)^{−1}F(y_n), y_0 = x_0, for each n = 0, 1, 2, · · ·

remains in Ū(x0, r0) for each n = 0, 1, 2, · · · and converges to x∗. Moreover, the following estimates hold for each n = 0, 1, 2, · · ·

\[
\|y_{n+2} - y_{n+1}\| \le q\|y_{n+1} - y_n\|
\]

and

\[
\|y_n - x_*\| \le \frac{q^{n}\eta}{1-q},
\]

where

\[
q = 1 - \sqrt{1 - 2L_0\eta}.
\]

The Newton iterates xn may belong to Ū(x0, r) for some radius r, possibly with r > r0, while the convergence remains quadratic. Below we present such a case.
Let us now define b, r, η1, η2 and η0 by

\[
b = \frac{2 - (L + 2L_0)r_0}{2 - Lr_0}, \qquad r = \frac{r_0}{1-b}, \qquad
\eta_1 = \frac{1}{2L_0}\left[1 - \left(\frac{L}{L + 2L_0}\right)^{2}\right], \qquad
\eta_2 = \frac{1-\delta^{2}}{2L_0},
\]

where

\[
\delta = 1 - \frac{L + 3L_0 - \sqrt{L^{2} + 9L_0^{2} + 2L_0L}}{L}
\]

and

\[
\eta_0 = \min\{\eta_1, \eta_2\}.
\]

If η < η1, then b ∈ (0, 1) and r > r0. Moreover, if η < η2, then

\[
\frac{L_0 r_0}{2} \le \frac{2 - (L + 2L_0)r_0}{2 - Lr_0}.
\]

Due to the choice of b, r and r0 we get

\[
\frac{L b^{k-1} r_0}{2\bigl(1 - L_0(1 + b^{k-1})r_0\bigr)} \le \frac{L r_0}{2(1 - L_0 r)} \le b \quad \text{for each } k = 2, 3, \cdots.
\]

Now, we present the following result.

Theorem 6.2.1 Let F : D ⊆ X → Y be Fréchet differentiable. Suppose that there exist x0 ∈ D and parameters L0 > 0, L ≥ L0 and η > 0 such that

F′(x0)^{−1} ∈ L(Y, X),    (6.8)

‖F′(x0)^{−1}F(x0)‖ ≤ η,    (6.9)

‖F′(x0)^{−1}[F′(x) − F′(x0)]‖ ≤ L0‖x − x0‖ for each x ∈ D,    (6.10)

‖F′(x0)^{−1}[F′(x) − F′(y)]‖ ≤ L‖x − y‖ for each x, y ∈ D,    (6.11)

η < η0    (6.12)

and

U(x0, r) ⊆ D.    (6.13)

Then, the sequence {xn} generated by Newton's method (6.2) is well defined, remains in the open ball U(x0, r) for each n = 0, 1, 2, · · · and converges to a unique solution x∗ of equation F(x) = 0 located in the closed ball Ū(x0, r). Moreover, the following estimates hold for each n = 0, 1, 2, · · ·

\[
\|x_{n+1} - x_*\| \le \frac{L}{2(1 - L_0\|x_n - x_0\|)}\,\|x_n - x_*\|^{2}. \tag{6.14}
\]
Proof: It follows from condition (6.12) that

\[
h_0 = L_0\eta \le \frac{1}{2}, \tag{6.15}
\]

which implies that r0 is well defined and equation F(x) = 0 has a unique solution x∗ ∈ Ū(x0, r0).
By means of Newton's method we get

\[
x_{n+1} - x_* = x_n - x_* - F'(x_n)^{-1}F(x_n)
= -F'(x_n)^{-1}\int_0^1 \bigl[F'(x_* + \theta(x_n - x_*)) - F'(x_n)\bigr](x_n - x_*)\,d\theta. \tag{6.16}
\]
It is clear that x1 = y1 ∈ Ū(x0, r0) ⊆ Ū(x0, r). Now, using (6.11), (6.15) and (6.16) for n = 0, we obtain

\[
\|x_1 - x_*\| \le \frac{L_0}{2}\|x_0 - x_*\|^{2} \le \frac{L_0 r_0}{2}\|x_0 - x_*\| < b\|x_0 - x_*\|,
\]

\[
\|x_2 - x_*\| \le \frac{L\|x_1 - x_*\|^{2}}{2(1 - L_0\|x_1 - x_0\|)} \le \frac{L\|x_1 - x_*\|}{2(1 - L_0\eta)}\|x_1 - x_*\| \le \frac{Lbr_0}{2(1 - L_0\eta)}\|x_1 - x_*\| < b\|x_1 - x_*\|,
\]

\[
\|x_2 - x_0\| \le \|x_2 - x_*\| + \|x_* - x_0\| \le b^{2}r_0 + r_0 = (b^{2} + 1)r_0 \le (b + 1)r_0 = \frac{1 - b^{2}}{1 - b}\,r_0 < r
\]

and

\[
\|F'(x_0)^{-1}(F'(x_1) - F'(x_0))\| \le L_0\|x_1 - x_0\| < 1.
\]


Now, it follows from the Banach lemma on invertible operators [26] that the inverse F′(x1)^{−1} ∈ L(Y, X) exists and

\[
\|F'(x_1)^{-1}F'(x_0)\| \le \frac{1}{1 - L_0\|x_1 - x_0\|} \le \frac{1}{1 - L_0\eta}.
\]

In a similar way, we get that F′(xk)^{−1} ∈ L(Y, X) and

\[
\|F'(x_k)^{-1}F'(x_0)\| \le \frac{1}{1 - L_0\|x_k - x_0\|}.
\]

Finally, for k ≥ 2 we have
\[
\begin{aligned}
\|x_{k+1} - x_*\| &\le \frac{L\|x_k - x_*\|}{2(1 - L_0\|x_k - x_0\|)}\,\|x_k - x_*\|\\
&\le \frac{Lb^{k-1}r_0\,\|x_k - x_*\|}{2\bigl(1 - L_0(\|x_k - x_*\| + \|x_* - x_0\|)\bigr)}\\
&\le \frac{Lb^{k-1}r_0\,\|x_k - x_*\|}{2\bigl(1 - L_0(b^{k-1} + 1)r_0\bigr)}
\le \frac{Lb^{k-1}r_0\,\|x_k - x_*\|}{2(1 - L_0 r)}
< b\|x_k - x_*\|
\end{aligned}
\]

and

\[
\begin{aligned}
\|x_{k+1} - x_0\| &\le \|x_{k+1} - x_*\| + \|x_* - x_0\|
\le b^{k}\|x_* - x_0\| + \|x_* - x_0\|\\
&\le (b^{k} + 1)\|x_* - x_0\|
\le (b + 1)\|x_* - x_0\|
\le \frac{1 - b^{2}}{1 - b}\,r_0 < r.
\end{aligned}
\]

As a consequence, xk+1 ∈ U(x0, r) and lim_{k→∞} xk = x∗.

Remark 6.2.2 It follows from (6.14) that Newton's method converges quadratically to x∗. Moreover, r is given in closed form, and (6.12) can be weaker than the Kantorovich hypothesis [26]

\[
\eta \le \frac{1}{2L}
\]

for sufficiently small L0.

6.3 Second convergence improvement

In this section we provide another improvement of the convergence criteria for Newton's method (6.2), using (6.5) and a tight majorizing sequence for Newton's method.

Remark 6.3.1 We have from the proof of the semilocal convergence of Theorem 3.2 in [4] that the sequence {rn} defined by

\[
r_0 = 0, \quad r_1 = \eta, \quad
r_2 = r_1 + \frac{b_1 L_0 (r_1 - r_0)^{2}}{2(1 - a_1 L_0 r_1)}, \quad
r_{n+2} = r_{n+1} + \frac{b_{n+1} L (r_{n+1} - r_n)^{2}}{2(1 - a_{n+1} L_0 r_{n+1})} \quad \text{for each } n = 1, 2, \cdots,
\]
where

\[
a_{n+1} = \frac{\|x_{n+1} - x_0\|}{r_{n+1}} \quad \text{and} \quad
b_{n+1} = \left(\frac{\|x_{n+1} - x_n\|}{r_{n+1} - r_n}\right)^{2} \quad \text{for each } n = 0, 1, 2, \cdots,
\]

is a tighter majorizing sequence than {tn}. Therefore, it may converge under weaker convergence criteria than {tn}. As an example, let us suppose that there exist a ∈ [0, 1] and b ∈ [0, 1] such that

a_{n+1} ≤ a and b_{n+1} ≤ b

for each n = 0, 1, 2, · · · . It is clear that one can always choose a = b = 1.

We consider the sequence {sn} defined by

\[
s_0 = 0, \quad s_1 = \eta, \quad
s_2 = s_1 + \frac{\bar L_0 (s_1 - s_0)^{2}}{2(1 - \bar L_0 s_1)}, \quad
s_{n+2} = s_{n+1} + \frac{\bar L (s_{n+1} - s_n)^{2}}{2(1 - \bar L_0 s_{n+1})} \quad \text{for each } n = 1, 2, \cdots,
\]

where

\[
\bar L = bL \quad \text{and} \quad \bar L_0 = aL_0.
\]

Then, it is clear that {sn} is less tight than {rn} but tighter than {tn}, and it is still a majorizing sequence for the sequence {xn} generated by Newton's method (6.2). Taking this fact into account, we will find weaker convergence criteria using L̄0 and L̄ instead of L0 and L. The new convergence criteria are weaker if a ≠ 1 or b ≠ 1. If a = b = 1, then they are the same as the earlier ones. Now, setting

\[
K = \max\{aL_0, bL\}
\]

and defining the function

\[
f(t) = \frac{K}{2}t^{2} - t + \eta
\]

and the sequence {un} by
\[
u_0 = 0, \qquad u_{n+1} = u_n - \frac{f(u_n)}{f'(u_n)} \quad \text{for each } n = 0, 1, 2, \cdots,
\]

we obtain the Kantorovich-type convergence criterion [26]

\[
h = K\eta \le \frac{1}{2},
\]

which is at least as weak as the classical Kantorovich criterion given by (6.4), since

K ≤ L.

Furthermore, it is clear that the classical error bounds in the literature for this setting [3, 5, 6] are also improved if K < L.
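The criterion h = Kη ≤ 1/2 and the scalar iteration {un} are straightforward to implement; a short sketch follows (the parameter values a, b, L0, L and η are illustrative, and the closed form of the smallest zero of f is used only as a check).

```python
import math

def kantorovich_type(a, b, L0, L, eta, n_steps=10):
    """Check h = K*eta <= 1/2 for K = max(a*L0, b*L); when it holds, run
    Newton's iteration on f(t) = (K/2)t^2 - t + eta starting from u0 = 0,
    which increases to the smallest zero of f."""
    K = max(a * L0, b * L)
    if K * eta > 0.5:
        return K, K * eta, None
    u = 0.0
    for _ in range(n_steps):
        u = u - (0.5 * K * u**2 - u + eta) / (K * u - 1)
    return K, K * eta, u

K, h, u = kantorovich_type(a=0.8, b=0.9, L0=1.0, L=2.0, eta=0.25)
print(K, h, u, (1 - math.sqrt(1 - 2 * K * 0.25)) / K)   # u matches the closed form
```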
References

[1] S. Amat, S. Busquier and M. Negra, Adaptive approximation of nonlinear


operators, Numer. Funct. Anal. Optim., 25, 397–405, 2004.
[2] I. K. Argyros, Computational theory of iterative methods. Series: Studies
in Computational Mathematics, 15, Editors: C. K. Chui and L. Wuytack,
Elsevier Publ. Co. New York, U.S.A., 2007.
[3] I. K. Argyros, Y. J. Cho and S. Hilout, Numerical method for equations and
its applications. CRC Press/Taylor and Francis, New York, 2012.
[4] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method, J. Complexity, 28, 364–387, 2012.
[5] I. K. Argyros, A unifying local-semilocal convergence analysis and appli-
cations for two-point Newton-like methods in Banach space, J. Math. Anal.
Appl., 298, 374–397, 2004.
[6] I. K. Argyros, On the Newton-Kantorovich hypothesis for solving equa-
tions, J. Comput. Appl. Math., 169, 315–332, 2004.
[7] I. K. Argyros, Concerning the “terra incognita” between convergence re-
gions of two Newton methods, Nonlinear Anal., 62, 179–194, 2005.
[8] I. K. Argyros, Approximating solutions of equations using Newton’s
method with a modified Newton’s method iterate as a starting point, Rev.
Anal. Numér. Théor. Approx., 36 (2), 123–138, 2007.
[9] I. K. Argyros, S. George and Á. A. Magreñán, Local convergence for multi-
point-parametric Chebyshev-Halley-type methods of high convergence or-
der, J. Comp. App. Math., 282, 215–224, 2015.
[10] I. K. Argyros, On a class of Newton-like methods for solving nonlinear
equations, J. Comput. Appl. Math., 228, 115–122, 2009.
[11] I. K. Argyros, A semilocal convergence analysis for directional Newton methods, Math. Comput., 80, 327–343, 2011.
[12] I. K. Argyros, D. González and Á. A. Magreñán, A semilocal conver-
gence for a uniparametric family of efficient Secant-like methods. J. Func.
Spaces, 2014.
[13] I. K. Argyros and Á. A. Magreñán, A unified convergence analysis for
Secant-type methods. J. Korean Math. Soc., 51 (6), 1155–1175, 2014.
[14] I. K. Argyros, Y. J. Cho and S. Hilout, Numerical method for equations and
its applications. CRC Press/Taylor and Francis, New York, 2012.
[15] I. K. Argyros and S. Hilout, Enclosing roots of polynomial equations and
their applications to iterative processes, Surveys Math. Appl., 4, 119–132,
2009.
[16] I. K. Argyros and S. Hilout, Extending the Newton-Kantorovich hypothesis
for solving equations, J. Comput. Appl. Math., 234, 2993–3006, 2010.
[17] I. K. Argyros and S. Hilout, Computational methods in nonlinear analysis,
World Scientific Pub. Comp., New Jersey, 2013.
[18] W. Bi, Q. Wu and H. Ren, Convergence ball and error analysis of
Ostrowski-Traub’s method, Appl. Math. J. Chinese Univ. Ser. B, 25, 374–
378, 2010.
[19] E. Cătinaş, The inexact, inexact perturbed, and quasi-Newton methods are
equivalent models, Math. Comp., 74 (249), 291–301, 2005.
[20] P. Deuflhard, Newton methods for nonlinear problems. Affine invariance
and adaptive algorithms, Springer Series in Computational Mathematics,
35, Springer-Verlag, Berlin, 2004.
[21] J. A. Ezquerro, J. M. Gutiérrez, M. A. Hernández, N. Romero and M. J.
Rubio, The Newton method: from Newton to Kantorovich (Spanish), Gac.
R. Soc. Mat. Esp., 13 (1), 53–76, 2010.
[22] J. A. Ezquerro and M. A. Hernández, On the R-order of convergence of
Newton’s method under mild differentiability conditions, J. Comput. Appl.
Math., 197(1), 53–61, 2006.
[23] J. A. Ezquerro and M. A. Hernández, An improvement of the region of
accessibility of Chebyshev’s method from Newton’s method, Math. Comp.,
78(267), 1613–1627, 2009.
[24] J. A. Ezquerro, M. A. Hernández and N. Romero, Newton-type methods of
high order and domains of semilocal and global convergence, Appl. Math.
Comput., 214 (1), 142–154, 2009.
[25] M. A. Hernández, A modification of the classical Kantorovich conditions for Newton's method, J. Comp. Appl. Math., 137, 201–205, 2001.
[26] L. V. Kantorovich and G. P. Akilov, Functional Analysis, Pergamon Press,
Oxford, 1982.
[27] W. Lorenzo, R. G. Crespo and A. Castillo, A prototype for linear features
generalization, Int. J. Interac. Multim. Artif. Intell., 1(3), 60–66, 2010.
[28] P. D. Proinov, General local convergence theory for a class of iterative pro-
cesses and its applications to Newton’s method, J. Complexity, 25, 38–62,
2009.
[29] P. D. Proinov, New general convergence theory for iterative processes and its applications to Newton-Kantorovich type theorems, J. Complexity, 26, 3–42, 2010.
[30] H. Ren and Q. Wu, Convergence ball of a modified secant method with
convergence order 1.839 . . . , Appl. Math. Comput., 188, 281–285, 2007.
[31] B. Royo, J. A. Sicilia, M. J. Oliveros and E. Larrodé, Solving a long-
distance routing problem using ant colony optimization. Appl. Math., 9
(2L), 415–421, 2015.
[32] J. A. Sicilia, C. Quemada, B. Royo and D. Escuı́n, An optimization algo-
rithm for solving the rich vehicle routing problem based on variable neigh-
borhood search and tabu search metaheuristics. J. Comput. Appl. Math.,
291, 468–477, 2016.
[33] J. A. Sicilia, D. Escuı́n, B. Royo, E. Larrodé and J. Medrano, A hybrid
algorithm for solving the general vehicle routing problem in the case of the
urban freight distribution. In Computer-based Modelling and Optimization
in Transportation (pp. 463–475). Springer International Publishing, 2014.
[34] J. F. Traub, Iterative methods for the solution of equations, Prentice-Hall
Series in Automatic Computation, Englewood Cliffs, N. J., 1964.
[35] Q. Wu and H. Ren, A note on some new iterative methods with third-order
convergence, Appl. Math. Comput., 188, 1790–1793, 2007.
Chapter 7

Robust convergence for


inexact Newton method

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
7.1 Introduction
7.2 Standard results on convex functions
7.3 Semilocal convergence
7.4 Special cases and applications

7.1 Introduction
The main task of this chapter is to use iterative methods to find solutions x∗ of the equation

F(x) = 0,    (7.1)

where F : D ⊂ X → Y is a Fréchet-differentiable operator, X, Y are Banach spaces and D ⊂ X. Many problems from Applied Sciences can be solved by finding the solutions of equations of the form (7.1) [9, 12, 14, 17, 18, 23, 25, 28–34].

The most well-known and widely used iterative method is Newton's method, owing to its simplicity and efficiency:

\[
x_{n+1} = x_n - F'(x_n)^{-1}F(x_n) \quad \text{for each } n = 0, 1, 2, \cdots, \tag{7.2}
\]

where F′(x) denotes the Fréchet-derivative of F at x ∈ D.
Concerning the semilocal convergence of Newton's method, one of the most important results is the famous Kantorovich theorem [24] for solving nonlinear equations. This theorem provides a simple and transparent convergence criterion for operators with bounded second derivatives F″ or Lipschitz continuous first derivatives. Each Newton iteration requires the solution of a linear system, which may be very expensive to compute; in floating point computations these linear systems are solved inexactly. Therefore, it is very important to work with an error tolerance on the residual. The convergence domain of Newton-like methods is, in general, small, so one of the main goals of researchers in the area is to enlarge it without additional hypotheses. In particular, using the elegant work by Ferreira and Svaiter [22] and optimization considerations, we will expand this domain. In this chapter, the semilocal convergence analysis depends on a scalar majorant function. Relevant work in this field can be found in [10–22].
The main idea is that we introduce a center-majorant function (see (7.3)) that can be used to provide more precise information on the upper bounds of ‖F′(x)^{−1}F′(x0)‖ than in [22], where the less precise majorant function is used (see (7.4)). This modification provides a flexibility that leads to more precise error estimates on the distances ‖xn+1 − xn‖ and ‖xn − x∗‖ for each n = 1, 2, · · · under the same hypotheses and computational cost as in [22]. Furthermore, we show that these advantages can even be obtained under weaker convergence criteria in some interesting special cases.
In order to make the chapter as self-contained as possible, we present results taken from other studies such as [12, 22, 23] without the proofs. We only provide the proofs of the results related to our new center-majorant function.
The chapter is organized as follows: in Section 7.2 we present some results on convex functions. The main result is presented in Section 7.3. Finally, some special cases and applications are provided in Section 7.4.

7.2 Standard results on convex functions

Let us denote by U(x, R) and Ū(x, R), respectively, the open and closed balls in the Banach space X with center x ∈ X and radius R > 0. Let us also denote by L(X, Y) the space of bounded linear operators from the Banach space X into the Banach space Y.
We need the following auxiliary results of elementary convex analysis.
Proposition 7.2.1 (cf. [23]) Let I ⊂ R be an interval and ϕ : I → R be convex.

1. For each u_0 ∈ int(I), the function

u ↦ [ϕ(u_0) − ϕ(u)] / (u_0 − u),  u ∈ I, u ≠ u_0,

is increasing and there exists (in R)

D^− ϕ(u_0) = lim_{u→u_0^−} [ϕ(u_0) − ϕ(u)] / (u_0 − u) = sup_{u<u_0} [ϕ(u_0) − ϕ(u)] / (u_0 − u).

2. If u, v, w ∈ I, u < w and u ≤ v ≤ w, then

ϕ(v) − ϕ(u) ≤ [(v − u)/(w − u)] [ϕ(w) − ϕ(u)].
Proposition 7.2.2 ([22]) If h : [a, b) → R is convex, differentiable at a, h'(a) < 0 and

lim_{t→b^−} h(t) = 0,

then

a − h(a)/h'(a) ≤ b,

with equality if and only if h is affine in [a, b).
7.3 Semilocal convergence

We present the main semilocal convergence result as follows.
Theorem 7.3.1 Let X and Y be Banach spaces, R ∈ R, D ⊆ X and F : D → Y be continuous and continuously differentiable on int(D). Let x_0 ∈ int(D) with F'(x_0)^{-1} ∈ L(Y, X). Suppose that there exist f_0 : [0, R) → R and f : [0, R) → R continuously differentiable, such that U(x_0, R) ⊆ D,

‖F'(x_0)^{-1}[F'(x) − F'(x_0)]‖ ≤ f_0'(‖x − x_0‖) − f_0'(0),     (7.3)

for each x ∈ U(x_0, R),

‖F'(x_0)^{-1}[F'(y) − F'(x)]‖ ≤ f'(‖y − x‖ + ‖x − x_0‖) − f'(‖x − x_0‖),     (7.4)

for each x, y ∈ U(x_0, R) with ‖x − x_0‖ + ‖y − x‖ < R, and

‖F'(x_0)^{-1}F(x_0)‖ ≤ f(0),     (7.5)

(C_1) f_0'(0) = −1, f(0) > 0, f'(0) = −1, f_0(t) ≤ f(t), f_0'(t) ≤ f'(t) for each t ∈ [0, R);

(C_2) f_0', f' are strictly increasing and convex;

(C_3) f(t) < 0 for some t ∈ (0, R).

Let

β := sup_{t∈[0,R)} −f(t),  t_* := min f^{-1}({0}),  τ̄ := sup{t ∈ [0, R) : f(t) < 0}.

Choose 0 ≤ ρ < β/2. Define

κ_ρ := sup_{ρ<t<R} [−(f(t) + 2ρ)] / [|f_0'(ρ)|(t − ρ)],  λ_ρ := sup{t ∈ [ρ, R) : κ_ρ + f'(t) < 0},  Θ_ρ := κ_ρ / (2 − κ_ρ).     (7.6)

Then, for any θ ∈ [0, Θ_ρ] and z_0 ∈ Ū(x_0, ρ), the sequence {z_n} generated by the inexact Newton method with initial point z_0 and residual relative error tolerance θ, i.e., for n = 0, 1, . . . , z_{n+1} = z_n + S_n with

‖F'(z_0)^{-1}[F(z_n) + F'(z_n)S_n]‖ ≤ θ ‖F'(z_0)^{-1}F(z_n)‖,     (7.7)

is well defined (for any particular choice of each S_n),

‖F'(z_0)^{-1}F(z_n)‖ ≤ [(1 + θ²)/2]^n [f(0) + 2ρ],  n = 0, 1, . . . ,     (7.8)

the sequence {z_n} is contained in U(z_0, λ_ρ) and converges to a point x_* ∈ Ū(x_0, t_*), which is the unique zero of F on U(x_0, τ̄). Moreover, if

(C_4) λ_ρ < R − ρ,

then the sequence {z_n} satisfies, for n = 0, 1, . . . ,

‖x_* − z_{n+1}‖ ≤ [(1 + θ)/2] [D^− f'(λ_ρ + ρ) / |f_0'(λ_ρ + ρ)|] ‖x_* − z_n‖² + θ [(f_0'(λ_ρ + ρ) + 2|f_0'(ρ)|) / |f_0'(λ_ρ + ρ)|] ‖x_* − z_n‖.

If, additionally, 0 ≤ θ < κ_ρ/(4 + κ_ρ), then {z_n} converges Q-linearly as follows:

‖x_* − z_{n+1}‖ ≤ [(1 + θ)/2 + 2θ/κ_ρ] ‖x_* − z_n‖,  n = 0, 1, . . . .
Remark 7.3.2 (a) If f_0 = f, then Theorem 7.3.1 reduces to the corresponding result [22, Theorem 2.1]. On the other hand, if f_0' < f' it constitutes a clear improvement, since 1/|f_0'(t)| < 1/|f'(t)|. So our theorem is at least as good as the previous one.

(b) If one chooses θ = 0 in Theorem 7.3.1, we recover the exact Newton method and its convergence properties. If θ = θ_n at each iteration and θ_n goes to zero as n goes to infinity, the penultimate inequality of Theorem 7.3.1 implies that the generated sequence converges to the solution with superlinear rate.
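To make the role of the residual tolerance θ concrete, the following is a minimal numerical sketch of an inexact Newton step in the spirit of (7.7), written in Python. It is not part of the original analysis: the test function, the random perturbation used to imitate an inexact linear solver, and the use of unscaled norms (instead of the F'(z_0)^{-1}-scaled ones of (7.7)) are all illustrative assumptions.

    import numpy as np

    def inexact_newton(F, J, x0, theta=0.1, tol=1e-10, max_iter=50, seed=0):
        # Sketch: accept any step S with ||F(x) + J(x) S|| <= theta * ||F(x)||,
        # mimicking the residual relative error tolerance in (7.7).
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            Fx = F(x)
            if np.linalg.norm(Fx) <= tol:
                break
            S = np.linalg.solve(J(x), -Fx)    # exact Newton step
            d = rng.standard_normal(S.shape)  # perturb it to imitate an
            d *= 0.5 * theta * np.linalg.norm(Fx) / np.linalg.norm(J(x) @ d)
            # inexact inner solve: the residual is J(x) @ d, of norm
            # 0.5*theta*||F(x)|| <= theta*||F(x)||
            x = x + S + d
        return x

    # illustrative test problem (an assumption, not from the text):
    F = lambda v: np.array([v[0]**2 - 2.0, v[0] * v[1] - 1.0])
    J = lambda v: np.array([[2.0 * v[0], 0.0], [v[1], v[0]]])
    print(inexact_newton(F, J, [1.5, 1.0]))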
Hereafter we assume that the hypotheses of Theorem 7.3.1 hold. The scalar function f in the above theorem is called a majorant function for F at the point x_0, whereas the function f_0 is the center-majorant function for F at the point x_0.
Next, we analyze some basic properties of the functions f_0 and f. Condition (C_2) implies strict convexity of f and f_0. It is worth noticing that t_* is the smallest root of f(t) = 0 and, since f is convex, if the equation has another root, this second root is τ̄.
Define

t̄ := sup{t ∈ [0, R) : f'(t) < 0}.     (7.9)
Proposition 7.3.3 (cf. [22]) The following statements on the majorant function hold:

i) f'(t) < 0 for any t ∈ [0, t̄) (and f'(t) ≥ 0 for any t ∈ [0, R) \ [0, t̄));

ii) 0 < t_* < t̄ ≤ τ̄ ≤ R;

iii) β = − lim_{t→t̄^−} f(t), 0 < β < t̄.
Now, we will prove a special case of Theorem 7.3.1: the case ρ = 0 and z_0 = x_0. As in [22], in order to simplify the notation in the case ρ = 0, we will use κ, λ and Θ instead of κ_0, λ_0 and Θ_0, respectively:

κ := sup_{0<t<R} −f(t)/t,  λ := sup{t ∈ [0, R) : κ + f'(t) < 0},  Θ := κ/(2 − κ).     (7.10)
Proposition 7.3.4 (cf. [22]) For κ, λ, Θ as in (7.10) it holds that

0 < κ < 1,  0 < Θ < 1,  t_* < λ ≤ t̄ ≤ τ̄,     (7.11)

and

f'(t) + κ < 0 for all t ∈ [0, λ),  inf_{0≤t<R} [f(t) + κt] = lim_{t→λ^−} [f(t) + κt] = 0.     (7.12)
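For a concrete majorant function, the quantities in (7.10) are easy to approximate numerically. The following short Python sketch (an illustration, not from the text; the data L = 1, β = 0.3 are arbitrary) evaluates κ, λ and Θ on a grid for the quadratic majorant f(t) = (L/2)t² − t + β used later in Theorem 7.4.2, and compares them with their closed forms.

    import numpy as np

    L, beta = 1.0, 0.3                 # illustrative data with 2*beta*L < 1
    f  = lambda t: 0.5 * L * t**2 - t + beta
    fp = lambda t: L * t - 1.0         # f'
    R  = 1.0 / L

    t = np.linspace(1e-8, R - 1e-8, 200000)
    kappa = np.max(-f(t) / t)          # kappa = sup_{0<t<R} -f(t)/t
    lam   = np.max(t[kappa + fp(t) < 0])   # sup{t : kappa + f'(t) < 0}
    Theta = kappa / (2.0 - kappa)

    # closed forms for this f: kappa = 1 - sqrt(2*beta*L), lambda = sqrt(2*beta/L)
    print(kappa, 1.0 - np.sqrt(2 * beta * L))
    print(lam, np.sqrt(2 * beta / L))
    print(Theta)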
First, we need a Banach-type result for invertible operators. We will use the following one, which appears in [24].

Proposition 7.3.5 If ‖x − x_0‖ ≤ t < t̄, then F'(x)^{-1} ∈ L(Y, X) and

‖F'(x)^{-1}F'(x_0)‖ ≤ 1/(−f_0'(t)).
Proof: We have by (7.9) that f_0'(t) < 0, since f_0'(t) ≤ f'(t). Using (7.3) and (C_1) we get

‖F'(x_0)^{-1}F'(x) − I‖ = ‖F'(x_0)^{-1}[F'(x) − F'(x_0)]‖ ≤ f_0'(‖x − x_0‖) − f_0'(0) ≤ f_0'(t) + 1 < 1.

Using the Banach lemma [24] and the preceding inequality, we conclude that F'(x)^{-1} ∈ L(Y, X) and

‖F'(x)^{-1}F'(x_0)‖ = ‖(F'(x_0)^{-1}F'(x))^{-1}‖ ≤ 1/[1 − (f_0'(t) + 1)] = −1/f_0'(t). □

Notice that the corresponding result in [22] uses f instead of f_0 to obtain

‖F'(x)^{-1}F'(x_0)‖ ≤ −1/f'(t).

But

−1/f_0'(t) ≤ −1/f'(t).

Therefore, if f_0'(t) < f'(t), then our result is more precise and less expensive to arrive at. Furthermore, this modification influences the error bounds and the sufficient convergence criteria.
The linearization errors on F and f are given, respectively, by

E_F(y, x) := F(y) − [F(x) + F'(x)(y − x)],  x ∈ U(x_0, R), y ∈ D,     (7.13)

e_f(v, t) := f(v) − [f(t) + f'(t)(v − t)],  t, v ∈ [0, R).     (7.14)

The linearization error of the majorant function bounds the linearization error of F.

Lemma 7.3.6 (cf. [22]) If x, y ∈ X and ‖x − x_0‖ + ‖y − x‖ < R, then

‖F'(x_0)^{-1}E_F(y, x)‖ ≤ e_f(‖x − x_0‖ + ‖y − x‖, ‖x − x_0‖).

Convexity of f and f' guarantees that e_f(t + s, t) is increasing in s and t.
Lemma 7.3.7 (cf. [22]) If 0 ≤ b ≤ t, 0 ≤ a ≤ s and t + s < R, then

e_f(a + b, b) ≤ e_f(t + s, t),

e_f(a + b, b) ≤ (1/2) [(f'(t + s) − f'(t))/s] a²,  s ≠ 0.

Next, we bound the linearization error E_F using the linearization error of the majorant function.
Corollary 7.3.8 (cf. [22]) If x, y ∈ X, ‖x − x_0‖ ≤ t, ‖y − x‖ ≤ s and s + t < R, then

‖F'(x_0)^{-1}E_F(y, x)‖ ≤ e_f(t + s, t),

‖F'(x_0)^{-1}E_F(y, x)‖ ≤ (1/2) [(f'(s + t) − f'(t))/s] ‖y − x‖²,  s ≠ 0.

In the following result, it is worth noting that the first inequality is very useful to obtain the asymptotic bounds for the inexact Newton method, and the second one will be used to show the robustness of the method with respect to the initial iterate.
Corollary 7.3.9 ([22]) For each y ∈ U(x_0, R),

−f(‖y − x_0‖) ≤ ‖F'(x_0)^{-1}F(y)‖ ≤ f(‖y − x_0‖) + 2‖y − x_0‖.

From the first inequality above, this corollary shows that F has no zeros in the region t_* < ‖x − x_0‖ < τ̄.
Lemma 7.3.10 If x ∈ X and ‖x − x_0‖ ≤ t < R, then

‖F'(x_0)^{-1}F'(x)‖ ≤ 2 + f_0'(t).

Proof: From (7.3) and (C_1) we obtain

‖F'(x_0)^{-1}F'(x)‖ ≤ ‖I‖ + ‖F'(x_0)^{-1}[F'(x) − F'(x_0)]‖ ≤ 1 + f_0'(‖x − x_0‖) − f_0'(0) ≤ 2 + f_0'(t). □     (7.15)
Lemma 7.3.11 Take θ ≥ 0, 0 ≤ t ≤ λ and x_*, x, y ∈ X. If λ < R, ‖x − x_0‖ ≤ t, ‖x_* − x‖ ≤ λ − t, F(x_*) = 0 and

‖F'(x_0)^{-1}[F(x) + F'(x)(y − x)]‖ ≤ θ ‖F'(x_0)^{-1}F(x)‖,     (7.16)

then

‖x_* − y‖ ≤ [(1 + θ)/2 + 2θ/κ] ‖x_* − x‖,     (7.17)

‖x_* − y‖ ≤ [(1 + θ)/2] [D^− f'(λ)/|f_0'(λ)|] ‖x_* − x‖² + θ [(2 + f_0'(λ))/|f_0'(λ)|] ‖x_* − x‖.     (7.18)
Proof: In view of (7.13) and F(x_*) = 0, we get in turn that

y − x_* = F'(x)^{-1} [E_F(x_*, x) + [F(x) + F'(x)(y − x)]]

and

−F'(x_0)^{-1}F(x) = F'(x_0)^{-1} [E_F(x_*, x) + F'(x)(x_* − x)].

It follows from (7.16) and the preceding identities that

‖x_* − y‖ ≤ ‖F'(x)^{-1}F'(x_0)‖ [‖F'(x_0)^{-1}E_F(x_*, x)‖ + θ ‖F'(x_0)^{-1}F(x)‖]

and

‖F'(x_0)^{-1}F(x)‖ ≤ ‖F'(x_0)^{-1}E_F(x_*, x)‖ + ‖F'(x_0)^{-1}F'(x)‖ ‖x_* − x‖.

Combining the two above inequalities with Proposition 7.3.5, Corollary 7.3.8 (with y = x_* and s = λ − t) and Lemma 7.3.10, we have that

‖x_* − y‖ ≤ [1/|f_0'(t)|] [ [(1 + θ)/2] [(f'(λ) − f'(t))/(λ − t)] ‖x_* − x‖² + θ (2 + f_0'(t)) ‖x_* − x‖ ].

Since ‖x_* − x‖ ≤ λ − t, f' < −κ < 0 in [0, λ) and f_0', f' are increasing, the first inequality follows from the last inequality. Using Proposition 7.2.1 and taking into account that f_0' ≤ f' < 0 in [0, λ) and increasing, we obtain the second inequality from the above inequality. □
In the following result we consider a single inexact Newton iteration with relative error θ.

Lemma 7.3.12 Take t, ε, θ ≥ 0 and x ∈ D such that

‖x − x_0‖ ≤ t < t̄,  ‖F'(x_0)^{-1}F(x)‖ ≤ f(t) + ε,  t − (1 + θ)(f(t) + ε)/f_0'(t) < R.     (7.19)

If y ∈ X and

‖F'(x_0)^{-1}[F(x) + F'(x)(y − x)]‖ ≤ θ ‖F'(x_0)^{-1}F(x)‖,     (7.20)

then the following estimates hold:

1. ‖y − x‖ ≤ −(1 + θ)(f(t) + ε)/f_0'(t);

2. ‖y − x_0‖ ≤ t − (1 + θ)(f(t) + ε)/f_0'(t) < R;

3. ‖F'(x_0)^{-1}F(y)‖ ≤ f(t − (1 + θ)(f(t) + ε)/f_0'(t)) + ε + 2θ(f(t) + ε).
Proof: From Proposition 7.3.5 and the first inequality in (7.19), we conclude that F'(x)^{-1} ∈ L(Y, X) and ‖F'(x)^{-1}F'(x_0)‖ ≤ −1/f_0'(t). Therefore, using also the triangle inequality, (7.20) and the identity

y − x = F'(x)^{-1}F'(x_0) [F'(x_0)^{-1}[F(x) + F'(x)(y − x)] − F'(x_0)^{-1}F(x)],

we conclude that

‖y − x‖ ≤ [−1/f_0'(t)] (1 + θ) ‖F'(x_0)^{-1}F(x)‖.

To end the proof of the first item, use the second inequality in (7.19) and the above inequality. The second item follows from the first one, the first and third inequalities in (7.19) and the triangle inequality. Using the definition of the error (7.13), we have that

F(y) = E_F(y, x) + F'(x_0) [F'(x_0)^{-1}[F(x) + F'(x)(y − x)]].

Hence, using (7.20), the second inequality in (7.19) and the triangle inequality, we have that

‖F'(x_0)^{-1}F(y)‖ ≤ ‖F'(x_0)^{-1}E_F(y, x)‖ + θ ‖F'(x_0)^{-1}F(x)‖ ≤ ‖F'(x_0)^{-1}E_F(y, x)‖ + θ(f(t) + ε).

Using the first item, (7.19) and Lemma 7.3.6 with s = −(1 + θ)(f(t) + ε)/f_0'(t), we get

‖F'(x_0)^{-1}E_F(y, x)‖ ≤ e_f(t − (1 + θ)(f(t) + ε)/f_0'(t), t) = f(t − (1 + θ)(f(t) + ε)/f_0'(t)) + ε + θ(f(t) + ε).

Combining the two above estimates, we obtain the inequality in the third item. □
In view of Lemma 7.3.12, define, for θ ≥ 0, the auxiliary map n_θ : [0, t̄) × [0, ∞) → R × R,

n_θ(t, ε) := (t − (1 + θ)(f(t) + ε)/f_0'(t), ε + 2θ(f(t) + ε)).     (7.21)

Let

Ω := {(t, ε) ∈ R × R : 0 ≤ t < λ, 0 ≤ ε ≤ κt, 0 < f(t) + ε}.     (7.22)
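The pair (t, ε) evolves under n_θ like a scalar model of the inexact iteration, and Lemma 7.3.13 below shows that f(t_n) + ε_n decays at least at the rate (1 + θ²)/2. A few lines of Python make this visible (a sketch under the same illustrative quadratic majorant as before, with f_0 = f, i.e., the setting of [22]):

    L, beta, theta = 1.0, 0.3, 0.05      # illustrative data; theta <= Theta here
    f   = lambda t: 0.5 * L * t**2 - t + beta
    f0p = lambda t: L * t - 1.0          # f_0' (= f' in this sketch)

    t, eps = 0.0, 0.0                    # (t_0, eps_0) = (0, 0) lies in Omega
    for n in range(8):
        val = f(t) + eps
        print(n, t, val)                 # val shrinks by at least (1+theta^2)/2
        t, eps = t - (1 + theta) * val / f0p(t), eps + 2 * theta * val  # (7.21)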
Lemma 7.3.13 If 0 ≤ θ ≤ Θ, (t, ε) ∈ Ω and (t_+, ε_+) = n_θ(t, ε), that is,

t_+ = t − (1 + θ)(f(t) + ε)/f_0'(t),  ε_+ = ε + 2θ(f(t) + ε),

then n_θ(t, ε) ∈ Ω, t < t_+, ε ≤ ε_+ and

f(t_+) + ε_+ < [(1 + θ²)/2] (f(t) + ε).
Proof: Since 0 ≤ t < λ, according to (7.10) we have f'(t) < −κ < 0. Therefore t < t_+ and ε ≤ ε_+.
As ε ≤ κt, f(t) + ε > 0 and −1 ≤ f_0'(t) < f_0'(t) + κ < 0,

−(f(t) + ε)/f_0'(t) ≤ −(f(t) + κt)/f_0'(t) = −[(f(t) + κt)/(f_0'(t) + κ)] [1 + κ/f_0'(t)] ≤ −[(f(t) + κt)/(f_0'(t) + κ)] (1 − κ).     (7.23)

The function h(s) := f(s) + κs is differentiable at the point t, h'(t) < 0, is strictly convex and

lim_{s→λ^−} h(s) = 0.

Hence, using Proposition 7.2.2, we have t − h(t)/h'(t) < λ, which implies

−(f(t) + κt)/(f_0'(t) + κ) < λ − t.     (7.24)

Using the above inequality with (7.23) and the definition of t_+, we conclude that

t_+ < t + (1 + θ)(1 − κ)(λ − t).

Using (7.10) and (7.12), we have (1 + θ)(1 − κ) ≤ 1 − θ < 1, which combined with the above inequality yields t_+ < λ. Using (7.10), the definition of ε_+ and the inequality ε ≤ κt, we obtain

ε_+ ≤ 2θ(f(t) + ε) + κt ≤ κ(t + (1 + θ)(f(t) + ε)).

Taking into account the inequalities f(t) + ε > 0 and −1 ≤ f_0'(t) < 0, we have that

f(t) + ε ≤ −(f(t) + ε)/f_0'(t).

Using the two above inequalities with the definition of t_+, we obtain ε_+ ≤ κ t_+.
In order to prove the two last inequalities, note that, using the definition (7.14), we get

f(t_+) + ε_+ = f(t) + f'(t)(t_+ − t) + e_f(t_+, t) + 2θ(f(t) + ε) + ε = θ(f(t) + ε) + e_f(t_+, t) = θ(f(t) + ε) + ∫_t^{t_+} (f'(u) − f'(t)) du.

Taking into account that f' is strictly increasing, we conclude that the integral
is positive. So, the last equality implies that f(t_+) + ε_+ ≥ θ(f(t) + ε) > 0. Taking s ∈ [t_+, λ) and using the convexity of f', we have that

∫_t^{t_+} (f'(u) − f'(t)) du ≤ ∫_t^{t_+} [(u − t)/(s − t)] (f'(s) − f'(t)) du = (1/2) [(t_+ − t)²/(s − t)] (f'(s) − f'(t)).

If we now substitute the last inequality into the above equation, we have that

f(t_+) + ε_+ ≤ θ(f(t) + ε) + (1/2) [(t_+ − t)²/(s − t)] (f'(s) − f'(t))
= [θ + (1/2) [(1 + θ)²/(s − t)] [(f(t) + ε)/(−f_0'(t))] [(f'(s) − f'(t))/(−f_0'(t))]] (f(t) + ε).

On the other hand, due to f'(s) + κ < 0 and −1 ≤ f_0'(t), we can conclude that

(f'(s) − f'(t))/(−f_0'(t)) = (f'(s) + κ − (f'(t) + κ))/(−f_0'(t)) ≤ 1 − κ.

Using (7.23), (7.24) and the last two inequalities, and taking into account that (1 + θ)(1 − κ) ≤ 1 − θ, we conclude that

f(t_+) + ε_+ ≤ [θ + (1/2)(1 + θ)²(1 − κ)² (λ − t)/(s − t)] (f(t) + ε) ≤ [θ + (1/2)(1 − θ)² (λ − t)/(s − t)] (f(t) + ε),

and the result follows taking the limit s → λ^−. □
Therefore, we shall deal with a family of mappings describing all possible inexact iterations.

Definition 7.3.14 (cf. [22]) For θ ≥ 0, N_θ is the family of maps N_θ : U(x_0, t̄) → X such that

‖F'(x_0)^{-1}[F(x) + F'(x)(N_θ(x) − x)]‖ ≤ θ ‖F'(x_0)^{-1}F(x)‖,     (7.25)

for each x ∈ U(x_0, t̄).
If x ∈ U(x_0, t̄), then F'(x)^{-1} ∈ L(Y, X). Hence, for θ = 0, N_0 has a single element, namely the exact Newton iteration map given by

N_0 : U(x_0, t̄) → X,  x ↦ N_0(x) = x − F'(x)^{-1}F(x).

Clearly, if 0 ≤ θ ≤ θ', then N_0 ⊂ N_θ ⊂ N_θ'. Therefore, N_θ is non-empty for all θ ≥ 0.
Remark 7.3.15 (cf. [22]) For any θ ∈ (0, 1) and N_θ ∈ N_θ,

N_θ(x) = x ⇔ F(x) = 0,  x ∈ U(x_0, t̄).

In other words, the fixed points of the exact Newton iteration, the fixed points of the inexact Newton iteration N_θ and the zeros of F are the same.
Using a combination of Lemmas 7.3.12 and 7.3.13, we obtain the main tool for the analysis of the inexact Newton method. Define

K(t, ε) := {x ∈ X : ‖x − x_0‖ ≤ t, ‖F'(x_0)^{-1}F(x)‖ ≤ f(t) + ε},     (7.26)

and

K := ∪_{(t,ε)∈Ω} K(t, ε),     (7.27)

where n_θ and N_θ were defined in (7.21) and Definition 7.3.14, respectively.
Proposition 7.3.16 (cf. [22]) Take 0 ≤ θ ≤ Θ and N_θ ∈ N_θ. Then for any (t, ε) ∈ Ω and x ∈ K(t, ε),

N_θ(K(t, ε)) ⊂ K(n_θ(t, ε)) ⊂ K,  ‖N_θ(x) − x‖ ≤ t_+ − t,

where t_+ is the first component of n_θ(t, ε). Moreover,

n_θ(Ω) ⊂ Ω,  N_θ(K) ⊂ K.     (7.28)

Proof: The result follows by using the definitions (7.21), (7.22), Definition 7.3.14, (7.26) and (7.27) together with Lemmas 7.3.12 and 7.3.13. □
Theorem 7.3.17 Take 0 ≤ θ ≤ Θ and N_θ ∈ N_θ. For any (t_0, ε_0) ∈ Ω and y_0 ∈ K(t_0, ε_0), the sequences

y_{n+1} = N_θ(y_n),  (t_{n+1}, ε_{n+1}) = n_θ(t_n, ε_n),  n = 0, 1, . . . ,     (7.29)

are well defined,

y_n ∈ K(t_n, ε_n),  (t_n, ε_n) ∈ Ω,  n = 0, 1, . . . ,     (7.30)

the sequence {t_n} is strictly increasing and converges to some t̃ ∈ (0, λ], the sequence {ε_n} is non-decreasing and converges to some ε̃ ∈ [0, κλ],

‖F'(x_0)^{-1}F(y_n)‖ ≤ f(t_n) + ε_n ≤ [(1 + θ²)/2]^n (f(t_0) + ε_0),  n = 0, 1, . . . ,     (7.31)

the sequence {y_n} is contained in U(x_0, λ) and converges to a point x_* ∈ Ū(x_0, t_*) which is the unique zero of F in U(x_0, τ̄), and

‖y_{n+1} − y_n‖ ≤ t_{n+1} − t_n,  ‖x_* − y_n‖ ≤ t̃ − t_n,  n = 0, 1, . . . .     (7.32)

Moreover, if the following condition holds:

(C_4) λ < R,

then the sequence {y_n} satisfies, for n = 0, 1, . . . ,

‖x_* − y_{n+1}‖ ≤ [(1 + θ)/2] [D^− f'(λ)/|f_0'(λ)|] ‖x_* − y_n‖² + θ [(2 + f_0'(λ))/|f_0'(λ)|] ‖x_* − y_n‖.     (7.33)

Hence, if 0 ≤ θ < κ/(4 + κ), then {y_n} converges Q-linearly as follows:

‖x_* − y_{n+1}‖ ≤ [(1 + θ)/2 + 2θ/κ] ‖x_* − y_n‖,  n = 0, 1, . . . .     (7.34)
Proof: From the assumptions on θ, (t_0, ε_0), y_0 and the two last inclusions in Proposition 7.3.16, it is clear that the sequences {(t_n, ε_n)} and {y_n} defined in (7.29) are well defined. Furthermore, since (7.30) holds for n = 0, using the first inclusion in Proposition 7.3.16 and an inductive argument on n, we conclude that (7.30) holds for all n. The first inequality in (7.32) follows from Proposition 7.3.16, (7.30) and (7.29), while the first inequality in (7.31) follows from (7.30) and the definition of K(t, ε) in (7.26). The definition of Ω in (7.22) shows that

Ω ⊂ [0, λ) × [0, κλ).

Hence, using (7.30) and the definition of K(t, ε), we have t_n ∈ [0, λ), ε_n ∈ [0, κλ), y_n ∈ U(x_0, λ), n = 0, 1, . . . .
Using Lemma 7.3.13 and (7.22), we conclude that {t_n} is strictly increasing, {ε_n} is non-decreasing and the second inequality in (7.31) holds for all n. Hence, in view of the first two above inclusions, {t_n} and {ε_n} converge, respectively, to some t̃ ∈ (0, λ] and ε̃ ∈ [0, κλ]. Convergence to t̃, together with the first inequality in (7.32) and the inclusion y_n ∈ U(x_0, λ), implies that {y_n} converges to some x_* ∈ Ū(x_0, λ) and that the second inequality in (7.32) holds for all n.
Using the inclusion y_n ∈ U(x_0, λ), the first inequality in Corollary 7.3.9 and (7.31), we have that

−f(‖y_n − x_0‖) ≤ [(1 + θ²)/2]^n (f(t_0) + ε_0),  n = 0, 1, . . . .

According to (7.12), f' < −κ in [0, λ). Therefore, since f(t_*) = 0 and t_* < λ,

f(t) ≤ −κ(t − t_*),  t_* ≤ t < λ.

Hence, if ‖y_n − x_0‖ ≥ t_*, we can combine the two above inequalities, setting t = ‖y_n − x_0‖ in the second, to obtain

‖y_n − x_0‖ − t_* ≤ [(1 + θ²)/2]^n (f(t_0) + ε_0)/κ.

Note that the above inequality remains valid even if ‖y_n − x_0‖ < t_*. Thus, taking the limit n → ∞ in the above inequality, we conclude that ‖x_* − x_0‖ ≤ t_*. Furthermore, now that we know that x_* is in the interior of the domain of F, we can conclude that F(x_*) = 0. The classical version of Kantorovich's theorem on Newton's method for a generic majorant function (cf. [6]) guarantees that, under the assumptions of Theorem 7.3.1, F has a unique zero in U(x_0, τ̄). Therefore x_* must be this zero of F. To prove (7.33) and (7.34), note that we have ‖y_n − x_0‖ ≤ t_n for all n = 0, 1, . . .. Moreover, we obtain from the second inequality in (7.32) that ‖x_* − y_n‖ ≤ λ − t_n for all n = 0, 1, . . .. Hence, using (C_4), F(x_*) = 0 and the first equality in (7.29), and applying Lemma 7.3.11, we arrive at the desired inequalities. In order to conclude the proof, note that for 0 ≤ θ < κ/(4 + κ) the quantity in the bracket in (7.34) is less than one, which implies that the sequence {y_n} converges Q-linearly. □
Proposition 7.3.18 If 0 ≤ ρ < β/2, then

ρ < t̄/2 < t̄,  f_0'(ρ) < 0.

Proof: The assumption ρ < β/2 and Proposition 7.3.3, item (iii), prove the first two inequalities of the proposition. The last inequality follows from the first inequality, Proposition 7.3.3, item (i), and f_0' ≤ f'. □
Now, using all the preceding results we can prove Theorem 7.3.1.
Proof: (Proof of Theorem 7.3.1) We begin with the case ρ = 0 and z_0 = x_0. Note that, from the definitions in (7.6) and (7.10), we have that

κ_0 = κ,  λ_0 = λ,  Θ_0 = Θ.

Since

(0, 0) ∈ Ω,  x_0 ∈ K(0, 0),

using Theorem 7.3.17 we conclude that Theorem 7.3.1 holds for ρ = 0. For proving the general case, take

0 ≤ ρ < β/2,  z_0 ∈ Ū(x_0, ρ).     (7.35)

Using Proposition 7.3.18 and (7.9), we conclude that ρ < t̄/2 and f_0'(ρ) < 0. Define

g : [0, R − ρ) → R,  g(t) = [−1/f_0'(ρ)] [f(t + ρ) + 2ρ].     (7.36)

We claim that g is a majorant function for F at the point z_0. Trivially, U(z_0, R − ρ) ⊂ D, g'(0) = −1, g(0) > 0. Moreover, g' is also convex and strictly increasing. To end the proof that g satisfies (C_1), (C_2) and (C_3), using Proposition 7.3.3, item (iii), and the second inequality in (7.35), we have that

lim_{t→t̄−ρ} g(t) = [−1/f_0'(ρ)] (2ρ − β) < 0.
Using Proposition 7.3.5, we obtain that

‖F'(z_0)^{-1}F'(x_0)‖ ≤ −1/f_0'(ρ).     (7.37)

Hence, using also the second inequality of Corollary 7.3.9, we get

‖F'(z_0)^{-1}F(z_0)‖ ≤ ‖F'(z_0)^{-1}F'(x_0)‖ ‖F'(x_0)^{-1}F(z_0)‖ ≤ [−1/f_0'(ρ)] [f(‖z_0 − x_0‖) + 2‖z_0 − x_0‖].

As f' ≥ −1, the function t ↦ f(t) + 2t is (strictly) increasing. Combining this fact with the above inequality and (7.36), we conclude that

‖F'(z_0)^{-1}F(z_0)‖ ≤ g(0).

To end the proof that g is a majorant function for F at z_0, take x, y ∈ X such that

x, y ∈ U(z_0, R − ρ),  ‖x − z_0‖ + ‖y − x‖ < R − ρ.

Therefore x, y ∈ U(x_0, R), ‖x − x_0‖ + ‖y − x‖ < R and, using (7.37) together with (7.4), we obtain that

‖F'(z_0)^{-1}[F'(y) − F'(x)]‖ ≤ ‖F'(z_0)^{-1}F'(x_0)‖ ‖F'(x_0)^{-1}[F'(y) − F'(x)]‖ ≤ [−1/f_0'(ρ)] [f'(‖y − x‖ + ‖x − x_0‖) − f'(‖x − x_0‖)].

Since f' is convex, the function t ↦ f'(s + t) − f'(t) is increasing for s ≥ 0, and since

‖x − x_0‖ ≤ ‖x − z_0‖ + ‖z_0 − x_0‖ ≤ ‖x − z_0‖ + ρ,

we get

f'(‖y − x‖ + ‖x − x_0‖) − f'(‖x − x_0‖) ≤ f'(‖y − x‖ + ‖x − z_0‖ + ρ) − f'(‖x − z_0‖ + ρ).

Combining the two above inequalities with the definition of g, we obtain

‖F'(z_0)^{-1}[F'(y) − F'(x)]‖ ≤ g'(‖y − x‖ + ‖x − z_0‖) − g'(‖x − z_0‖).
Note that for κ_ρ, λ_ρ and Θ_ρ as defined in (7.6), we have that

κ_ρ = sup_{0<t<R−ρ} −g(t)/t,  λ_ρ = sup{t ∈ [0, R − ρ) : κ_ρ + g'(t) < 0},  Θ_ρ = κ_ρ/(2 − κ_ρ),

which are the same as (7.10) with g instead of f. Hence, applying Theorem 7.3.1 for F and the majorant function g at the point z_0 with ρ = 0, we conclude that the sequence {z_n} is well defined, remains in U(z_0, λ_ρ), satisfies (7.8) and converges to some z_* ∈ Ū(z_0, t_{*,ρ}) which is a zero of F, where t_{*,ρ} is the smallest solution of g(t) = 0. Using (7.36), we conclude that t_{*,ρ} is the smallest solution of

f(ρ + t) + 2ρ = 0.

Hence, in view of item ii) of Proposition 7.3.3, we have ρ + t_{*,ρ} < t̄ ≤ τ̄ and Ū(z_0, t_{*,ρ}) ⊂ U(x_0, τ̄). Hence, z_* is the unique zero of F in U(x_0, τ̄), which we already called x_*. Since

g'(t) = f'(t + ρ)/|f_0'(ρ)|,  D^− g'(t) = D^− f'(t + ρ)/|f_0'(ρ)|,  t ∈ [0, R − ρ),

applying again Theorem 7.3.1 for F and the majorant function g at the point z_0 with ρ = 0, we conclude that the estimates in item (C_4) also hold. □
7.4 Special cases and applications

We present some specializations of Theorem 7.3.1 to interesting cases such as Kantorovich's theory (cf. [2, 6, 9, 11, 24, 29, 37]) and even Smale's α-theory or Wang's γ-theory (cf. [9, 12, 13, 15, 19–21, 33, 35]).
Theorem 7.4.1 Let X and Y be Banach spaces, D ⊆ X and F : D → Y be Fréchet-differentiable on int(D). Take x_0 ∈ int(D) with F'(x_0)^{-1} ∈ L(Y, X). Suppose that U(x_0, 1/γ) ⊂ D, β > 0, γ_0 > 0, γ > 0 with γ_0 ≤ γ and that

‖F'(x_0)^{-1}F(x_0)‖ ≤ β,  α = βγ < 3 − 2√2,  0 ≤ θ ≤ (1 − 2√α − α)/(1 + 2√α + α),

‖F'(x_0)^{-1}[F'(x) − F'(y)]‖ ≤ [2γ/(1 − γr)³] ‖x − y‖,

for each x, y ∈ U(x_0, r), 0 < r < 1/γ,

‖F'(x_0)^{-1}[F'(x) − F'(x_0)]‖ ≤ [γ_0(2 − γ_0 r)/(1 − γ_0 r)²] ‖x − x_0‖,

for each x ∈ U(x_0, r), 0 < r < 1/γ.
Then, the sequence generated by the inexact Newton method for solving F(x) = 0 with starting point x_0 and residual relative error tolerance θ, i.e., for n = 0, 1, . . . ,

x_{n+1} = x_n + S_n,  ‖F'(x_0)^{-1}[F(x_n) + F'(x_n)S_n]‖ ≤ θ ‖F'(x_0)^{-1}F(x_n)‖,

is well defined, the generated sequence {x_n} converges to a point x_* which is a zero of F,

‖F'(x_0)^{-1}F(x_n)‖ ≤ [(1 + θ²)/2]^n β,  n = 0, 1, . . . ,

the sequence {x_n} is contained in U(x_0, λ), x_* ∈ Ū(x_0, t_*) and x_* is the unique zero of F in U(x_0, τ̄), where

λ := β/(√α + α),

t_* = [1 + α − √(1 − 6α + α²)]/(4γ),  τ̄ = [1 + α + √(1 − 6α + α²)]/(4γ).

Moreover, the sequence {x_n} satisfies, for n = 0, 1, . . . ,

‖x_* − x_{n+1}‖ ≤ [(1 + θ)/2] [D^− f'(λ)/|f_0'(λ)|] ‖x_* − x_n‖² + θ [(f_0'(λ) + 2)/|f_0'(λ)|] ‖x_* − x_n‖.

If, additionally, 0 ≤ θ < (1 − 2√α − α)/(5 − 2√α − α), then {x_n} converges Q-linearly as follows:

‖x_* − x_{n+1}‖ ≤ [(1 + θ)/2 + 2θ/(1 − 2√α − α)] ‖x_* − x_n‖,  n = 0, 1, . . . .
Proof: The function f : [0, 1/γ) → R,

f(t) = t/(1 − γt) − 2t + β,

is a majorant function for F at x_0 (cf. [20]). Therefore, all the results follow from Theorem 7.3.1, applied to this particular context by choosing f_0 : [0, 1/γ) → R such that

f_0'(t) = −2 + 1/(1 − γ_0 t)². □

Notice that if F is analytic on U(x_0, 1/γ), we can choose

γ_0 = sup_{k>1} ‖F'(x_0)^{-1}F^{(k)}(x_0)/k!‖^{1/(k−1)}

(cf. [1, 11, 12, 16, 37]). A semilocal convergence result for Newton's method is instrumental in the complexity analysis of linear and quadratic minimization problems by means of self-concordant functions (cf. [23, 28]). Also in this setting, Theorem 7.3.1 provides a semilocal convergence result for Newton's method with a relative error tolerance.
Theorem 7.4.2 Let X and Y be Banach spaces, D ⊆ X and F : D → Y be Fréchet-differentiable on int(D). Take x_0 ∈ int(D) with F'(x_0)^{-1} ∈ L(Y, X). Suppose that there exist constants L_0 > 0, L > 0 with L_0 ≤ L and β > 0 such that βL < 1/2, U(x_0, 1/L) ⊂ D and

‖F'(x_0)^{-1}[F'(x) − F'(x_0)]‖ ≤ L_0 ‖x − x_0‖,  x ∈ U(x_0, 1/L),

‖F'(x_0)^{-1}[F'(y) − F'(x)]‖ ≤ L ‖x − y‖,  x, y ∈ U(x_0, 1/L),

‖F'(x_0)^{-1}F(x_0)‖ ≤ β,  0 ≤ θ ≤ (1 − √(2βL))/(1 + √(2βL)).

Then, the sequence generated by the inexact Newton method for solving F(x) = 0 with starting point x_0 and residual relative error tolerance θ, i.e., for n = 0, 1, . . . ,

x_{n+1} = x_n + S_n,  ‖F'(x_0)^{-1}[F(x_n) + F'(x_n)S_n]‖ ≤ θ ‖F'(x_0)^{-1}F(x_n)‖,

is well defined,

‖F'(x_0)^{-1}F(x_n)‖ ≤ [(1 + θ²)/2]^n β,  n = 0, 1, . . . ,

the sequence {x_n} is contained in U(x_0, λ) and converges to a point x_* ∈ Ū(x_0, t_*) which is the unique zero of F in U(x_0, 1/L), where

λ := √(2βL)/L,  t_* = [1 − √(1 − 2Lβ)]/L.

Moreover, the sequence {x_n} satisfies, for n = 0, 1, . . . , with δ = L_0/L,

‖x_* − x_{n+1}‖ ≤ [(1 + θ)/2] [L/(1 − δ√(2βL))] ‖x_* − x_n‖² + θ [(1 + δ√(2βL))/(1 − δ√(2βL))] ‖x_* − x_n‖.

If, additionally, 0 ≤ θ < (1 − √(2βL))/(5 − √(2βL)), then the sequence {x_n} converges Q-linearly as follows:

‖x_* − x_{n+1}‖ ≤ [(1 + θ)/2 + 2θ/(1 − √(2βL))] ‖x_* − x_n‖,  n = 0, 1, . . . .
Proof: The function f : [0, 1/L) → R,

f(t) := (L/2)t² − t + β,

is a majorant function for F at the point x_0. Then, all the results follow from Theorem 7.3.1, applied to this particular context by choosing f_0 : [0, 1/L) → R such that f_0'(t) = L_0 t − 1. □
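For concrete data, the quantities of Theorem 7.4.2 can be evaluated directly. The lines below (an illustrative sketch; the values of β, L and L_0 are arbitrary) compute λ, t_*, the admissible tolerance bound and the threshold for Q-linear convergence.

    import numpy as np

    beta, L, L0 = 0.3, 1.0, 0.7        # illustrative: beta*L < 1/2, L0 <= L
    assert 2 * beta * L < 1 and L0 <= L

    s = np.sqrt(2 * beta * L)
    lam       = s / L                  # lambda = sqrt(2*beta*L)/L
    t_star    = (1 - np.sqrt(1 - 2 * L * beta)) / L
    theta_max = (1 - s) / (1 + s)      # admissible residual tolerance
    q_switch  = (1 - s) / (5 - s)      # below this, Q-linear convergence
    print(lam, t_star, theta_max, q_switch)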
Remark 7.4.3 (a) If f_0 = f, all the preceding results reduce to the corresponding ones in [22]. If f_0 < f, the earlier results are improved, since the error bounds on the distances are more precise. Moreover, these advantages are obtained under the same convergence criteria and computational cost, since the function f_0 is a special case of f.

(b) Let us show that even weaker criteria can be obtained in the special case when θ = 0, that is, when θ_n = 0 for each n = 0, 1, 2, . . .. Then we have Newton's method defined by (7.2).
Using Proposition 7.2.1 and Corollary 7.3.9 for f(t) = (L/2)t² − t + β and f_0'(t) = L_0 t − 1, we deduce that

‖x_{n+1} − x_n‖ ≤ L̄‖x_n − x_{n−1}‖² / [2(1 − L_0‖x_n − x_0‖)]  for each  n = 1, 2, . . . ,

where

L̄ = L_0 if n = 1 and L̄ = L if n > 1.

Define the scalar sequence {r_n} by

r_0 = 0,  r_1 = β,  r_2 = r_1 + L_0(r_1 − r_0)² / [2(1 − L_0 r_1)],

r_{n+1} = r_n + L(r_n − r_{n−1})² / [2(1 − L_0 r_n)]  for each  n = 2, 3, . . . .

Then, we have that (cf. [11])

‖x_{n+1} − x_n‖ ≤ r_{n+1} − r_n  for each  n = 0, 1, 2, . . . .

It was shown in [11] that {r_n} is a majorizing sequence for {x_n} and converges provided that

h_1 = L_1 β ≤ 1/2,

where

L_1 = (1/8) [4L_0 + √(L_0 L + 8L_0²) + √(L_0 L)].
The corresponding results deduced from Theorem 7.3.1 (with f_0 = f) are given by

‖x_{n+1} − x_n‖ ≤ L‖x_n − x_{n−1}‖² / [2(1 − L‖x_n − x_0‖)]  for each  n = 1, 2, . . . ,

and the corresponding majorizing sequence {v_n} is defined by

v_0 = 0,  v_1 = β,  v_{n+1} = v_n + L(v_n − v_{n−1})² / [2(1 − L v_n)] = v_n − f(v_n)/f'(v_n)  for each  n = 1, 2, . . . .

The sufficient convergence criterion for the sequence {v_n} is given by the Kantorovich hypothesis, famous for its simplicity and clarity (cf. [24]),

h = Lβ ≤ 1/2.

Notice that

h ≤ 1/2 ⇒ h_1 ≤ 1/2,
but not necessarily vice versa, unless L_0 = L.
We also have that L_0 ≤ L holds in general and L/L_0 can be arbitrarily large (cf. [11–13]). Moreover, an inductive argument shows

r_n ≤ v_n,

r_{n+1} − r_n ≤ v_{n+1} − v_n

and

r_* = lim_{n→∞} r_n ≤ v_* = lim_{n→∞} v_n.

The first two preceding inequalities are strict for each n = 2, 3, . . . if L_0 < L. Finally, notice that

h_1/h → 0 as L_0/L → 0.

The preceding estimate shows by at most how many times the applicability of Newton's method is expanded with our approach. Examples where L_0 < L and where the old convergence criteria are not satisfied but the new convergence criteria apply can be found, for example, in [11].
Notice that the Lipschitz conditions in the last two theorems imply the center-Lipschitz conditions with γ_0 ≤ γ and L_0 ≤ L. That is, the center-Lipschitz conditions are not additional hypotheses to the Lipschitz conditions.
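The comparison between {r_n} and {v_n} is easy to observe numerically; the following Python sketch uses illustrative values with L_0 < L (the recurrences are the ones displayed above):

    L0, L, beta = 0.5, 1.0, 0.45       # illustrative: h = L*beta <= 1/2, L0 < L

    r = [0.0, beta]                    # sequence {r_n}, built with L0 (cf. [11])
    r.append(r[-1] + L0 * (r[-1] - r[-2])**2 / (2 * (1 - L0 * r[-1])))
    v = [0.0, beta]                    # classical Kantorovich sequence {v_n}
    for _ in range(8):
        r.append(r[-1] + L * (r[-1] - r[-2])**2 / (2 * (1 - L0 * r[-1])))
        v.append(v[-1] + L * (v[-1] - v[-2])**2 / (2 * (1 - L  * v[-1])))
    print(r[-1], v[-1])                # the limits satisfy r* <= v*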
(c) Another important consideration is that if the modified inexact Newton method defined by

x̄_{n+1} = x̄_n + S_n,  ‖F'(x_0)^{-1}[F(x̄_n) + F'(x_0)S_n]‖ ≤ θ ‖F'(x_0)^{-1}F(x̄_n)‖

is used, then f_0 can replace f in all the preceding results. The resulting convergence criteria are weaker, but the rate of convergence is only linear. However, we can start with the slower method until a certain finite step N ≥ 1 such that the convergence criteria of Theorem 7.3.1 hold for x_0 = x̄_N; then we can continue with the faster method (a schematic implementation is sketched after this remark). Such an approach has been given by us in [11–13] for Newton's method (7.2). Notice that the approach in [22] must use f instead of f_0. That means, e.g., in the case of Newton's method the sufficient convergence criterion shall be

H = Lβ ≤ 1/2

for the modified Newton's method, whereas in our case it will be

H_0 = L_0β ≤ 1/2.
Notice again that

H ≤ 1/2 ⇒ H_0 ≤ 1/2  and  H_0/H → 0 as L_0/L → 0.

Finally, the error bounds are also more precise in this case (cf. [5, 11]).
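The switching strategy in (c) above is straightforward to implement. Everything in the following Python sketch is illustrative: the test equation, the value of L and, in particular, the switch test, which simply monitors a Kantorovich-type quantity L·‖F'(x)^{-1}F(x)‖ at the current iterate.

    import numpy as np

    def hybrid_newton(F, J, x0, L, crit=0.5, N_max=25, tol=1e-12):
        x, J0 = np.asarray(x0, dtype=float), J(x0)
        # Phase 1: modified Newton (frozen J(x0)); cheap, linearly convergent.
        for _ in range(N_max):
            step = np.linalg.solve(J(x), F(x))
            if L * np.linalg.norm(step) <= crit:   # hypothetical switch test,
                break                              # mimicking H0 <= 1/2
            x = x - np.linalg.solve(J0, F(x))
        # Phase 2: full Newton from the improved starting point.
        for _ in range(N_max):
            if np.linalg.norm(F(x)) <= tol:
                break
            x = x - np.linalg.solve(J(x), F(x))
        return x

    # illustrative scalar equation sin(x) + x/2 - 0.2 = 0:
    F = lambda v: np.array([np.sin(v[0]) + 0.5 * v[0] - 0.2])
    J = lambda v: np.array([[np.cos(v[0]) + 0.5]])
    print(hybrid_newton(F, J, [1.0], L=1.0))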
References

[1] F. Alvarez, J. Bolte and J. Munier, A unifying local convergence result for Newton's method in Riemannian manifolds. Found. Comput. Math., 8, 197–226, 2008.
[2] S. Amat, C. Bermúdez, S. Busquier, M. J. Legaz and S. Plaza, On a family
of high-order iterative methods under Kantorovich conditions and some
applications. Abstr. Appl. Anal., Art. ID 782170, 14 pages, 2012.
[3] S. Amat, S. Busquier and J. M. Gutiérrez, Geometric constructions of iter-
ative functions to solve nonlinear equations. J. Comput. Appl. Math., 157,
197–205, 2003.
[4] I. K. Argyros, Forcing sequences and inexact Newton iterates in Banach
space. Appl. Math. Let., 13, 69–75, 2000.
[5] I. K. Argyros, Approximating solutions of equations using Newton’s
method with a modified Newton’s method iterate as a starting point. Rev.
Anal. Numer. Theor. Approx., 36, 123–138, 2007.
[6] I. K. Argyros, Computational theory of iterative methods. Series: Studies
in Computational Mathematics, 15, Editors: C. K. Chui and L. Wuytack,
Elsevier Publ. Co. New York, U.S.A., 2007.
[7] I. K. Argyros, On the semilocal convergence of inexact Newton methods in
Banach spaces, J. Comput. Appl. Math., 228, 434–443, 2009.
[8] I. K. Argyros, Y. J. Cho and S. Hilout, On the local convergence analysis of
inexact Gauss–Newton–like methods. Panamer. Math. J., 21, 11–18, 2011.
[9] I. K. Argyros, Y. J. Cho and S. Hilout, Numerical methods for equations and
its applications, CRC Press/Taylor and Francis Group, New York, 2012.
[10] I. K. Argyros and S. Hilout, Improved local convergence of Newton's method under weak majorant condition, J. Comput. Appl. Math., 236, 1892–1902, 2012.
[11] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method. J. Complexity, 28, 364–387, 2012.
[12] I. K. Argyros and S. Hilout, Computational Methods in Nonlinear Analysis:
Efficient Algorithms, Fixed Point Theory and Applications, World Scien-
tific, London, 2013.
[13] I. K. Argyros and S. Hilout, Improved local convergence analysis of inex-
act Gauss-Newton like methods under the majorant condition in Banach
spaces. J. Franklin Inst., 350, 1531–1544, 2013.
[14] L. Blum, F. Cucker, M. Shub and S. Smale, Complexity and Real Compu-
tation, Springer-Verlag, New York, 1998.
[15] J. Chen and W. Li, Convergence behaviour of inexact Newton methods un-
der weak Lipschitz condition J. Comput. Appl. Math., 191, 143–164, 2006.
[16] F. Cianciaruso, Convergence of Newton–Kantorovich approximations to an approximate zero. Numer. Funct. Anal. Optim., 28, 631–645, 2007.
[17] R. S. Dembo, S. C. Eisenstat and T. Steihaug, Inexact Newton methods. SIAM J. Numer. Anal., 19, 400–408, 1982.
[18] J. E. Jr. Dennis and R. B. Schnabel, Numerical methods for unconstrained
optimization and nonlinear equations (Corrected reprint of the 1983 origi-
nal), Classics in Appl. Math. SIAM, 16, Philadelphia, PA, 1996.
[19] O. P. Ferreira and B. F. Svaiter, Kantorovich’s theorem on Newton’s method
in Riemannian manifolds, J. Complexity, 18, 304–329, 2002.
[20] O. P. Ferreira and B. F. Svaiter, Kantorovich’s majorant principle for New-
ton’s method, Comput. Optim. Appl., 42, 213–229, 2009.
[21] O. P. Ferreira, M. L. N. Gonçalves and P. R. Oliveira, Local convergence
analysis of inexact Gauss-Newton like methods under majorant condition,
J. Comput. Appl. Math., 236, 2487–2498, 2012.
[22] O. P. Ferreira and B. F. Svaiter, A robust Kantorovich’s theorem on inexact
Newton method with relative residual error tolerance. J. Complexity, 28,
346–363, 2012.
[23] J. B. Hiriart-Urruty and C. Lemaréchal, Convex analysis and minimization
algorithms (two volumes). I. Fundamentals. II. Advanced theory and bundle
methods, 305 and 306, Springer–Verlag, Berlin, 1993.
[24] L. V. Kantorovich and G. P. Akilov, Functional Analysis, Pergamon Press, Oxford, 1982.
[25] W. Lorenzo, R. G. Crespo and A. Castillo, A prototype for linear features
generalization, Int. J. Interac. Multim. Artif. Intell., 1(3), 60–66, 2010.
[26] I. Moret, A Kantorovich–type theorem for inexact Newton methods, Nu-
mer. Funct. Anal. Optim., 10, 351–365, 1989.
[27] B. Morini, Convergence behaviour of inexact Newton methods. Math.
Comp., 68, 1605–1613, 1999.
[28] Y. Nesterov and A. Nemirovskii, Interior–point polynomial algorithms in
convex programming, SIAM Studies in Appl. Math., 13, Philadelphia, PA,
1994.
[29] F. A. Potra, The Kantorovich theorem and interior point methods, Math.
Program., 102, 47–70, 2005.
[30] B. Royo, J. A. Sicilia, M. J. Oliveros and E. Larrodé, Solving a long-
distance routing problem using ant colony optimization. Appl. Math., 9
(2L), 415–421, 2015.
[31] J. A. Sicilia, C. Quemada, B. Royo and D. Escuı́n, An optimization algo-
rithm for solving the rich vehicle routing problem based on variable neigh-
borhood search and tabu search metaheuristics. J. Comput. Appl. Math.,
291, 468–477, 2016.
[32] J. A. Sicilia, D. Escuı́n, B. Royo, E. Larrodé and J. Medrano, A hybrid
algorithm for solving the general vehicle routing problem in the case of the
urban freight distribution. In Computer-based Modelling and Optimization
in Transportation (pp. 463–475). Springer International Publishing, 2014.
[33] S. Smale, Newton method estimates from data at one point. The merging
of disciplines: New Directions in Pure, Applied and Computational Math-
ematics (R. Ewing, K. Gross, C. Martin, eds.), Springer–Verlag, New York,
185–196, 1986.
[34] J. F. Traub, Iterative methods for the solution of equations, Englewood
Cliffs, New Jersey: Prentice Hall, 1964.
[35] X. Wang, Convergence of Newton's method and uniqueness of the solution of equations in Banach space, IMA J. Numer. Anal., 20, 123–134, 2001.
[36] T. J. Ypma, Local convergence of inexact Newton methods, SIAM J. Numer.
Anal., 21, 583–590, 1984.
[37] P. P. Zabrejko and D. F. Nguen, The majorant method in the theory of Newton–Kantorovich approximations and the Ptǎk error estimates, Numer. Funct. Anal. Optim., 9, 671–684, 1987.
Chapter 8

Inexact Gauss-Newton-like method for least square problems

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505, USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán
Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a, 26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
8.1 Introduction
8.2 Auxiliary results
8.3 Local convergence analysis
8.4 Applications and examples

8.1 Introduction
In this chapter we are interested in locating a solution x_* of the nonlinear least squares problem:

min G(x) := (1/2) F(x)^T F(x),     (8.1)

where F is a Fréchet-differentiable operator defined on R^n with values in R^m, m ≥ n.
In Computational Sciences the practice of numerical analysis is essentially connected to variants of Newton's method [1]–[15]. In the present chapter we consider the local convergence of the inexact Gauss-Newton method (IGNM) for solving problem (8.1).
In particular, we define the method as follows [5, 8, 10, 11]:
Let U(x, r) and Ū(x, r) denote, respectively, the open and closed balls in R^n with center x and radius r > 0. Denote by L(R^n, R^m) the space of bounded linear operators from R^n into R^m. For k = 0 step 1 until convergence do:
Find the step Δ_k which satisfies

B_k Δ_k = −F'(x_k)^T F(x_k) + r_k,  where  ‖r_k‖/‖F'(x_k)^T F(x_k)‖ ≤ η_k.     (8.2)

Set x_{k+1} = x_k + Δ_k.
Here, B_k is an n × n nonsingular matrix, and {η_k} is a sequence of forcing terms such that η_k ∈ [0, 1] for each k.
The process is called the inexact Gauss-Newton method (IGNM) if B_k = F'(x_k)^T F'(x_k), and it is called the inexact Gauss-Newton-like method (IGNLM) if B_k = B(x_k) approximates F'(x_k)^T F'(x_k). There is a plethora of convergence results for (IGNM) and (IGNLM) under various Lipschitz-type conditions [1]–[15]. Recently, studies have focused on the analysis of stopping residual controls ‖r_k‖/‖F(x_k)‖ ≤ η_k and their effect on convergence properties [3, 5–8, 10, 11], by considering an iterative form where a scaled residual control is performed at each iteration as follows:
For k = 0 step 1 until convergence do:
Find the step Δ_k which satisfies

B_k Δ_k = −F(x_k) + r_k,  where  ‖P_k r_k‖/‖P_k F(x_k)‖ ≤ θ_k.     (8.3)

Set x_{k+1} = x_k + Δ_k.
Here, P_k^{-1} ∈ L(R^m, R^n) for each k.
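To fix ideas, here is a minimal Python sketch of iteration (8.2) with B_k = F'(x_k)^T F'(x_k), i.e., the (IGNM) case. It is only an illustration: the residual function F, the constant forcing term η and the random perturbation used to imitate a truncated inner solver are assumptions, not part of the chapter's analysis.

    import numpy as np

    def ignm(F, J, x0, eta=0.1, tol=1e-10, max_iter=100, seed=0):
        # iteration (8.2): B_k Delta_k = -J^T F + r_k, ||r_k|| <= eta ||J^T F||
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            Jx, Fx = J(x), F(x)
            g = Jx.T @ Fx                     # = F'(x_k)^T F(x_k)
            if np.linalg.norm(g) <= tol:
                break
            B = Jx.T @ Jx                     # B_k for (IGNM)
            r = rng.standard_normal(g.shape)  # imitate an inexact inner solve
            r *= 0.5 * eta * np.linalg.norm(g) / np.linalg.norm(r)
            delta = np.linalg.solve(B, -g + r)   # so B_k Delta_k = -g + r_k
            x = x + delta
        return x

    # illustrative overdetermined residual (m = 3 > n = 2), an assumption:
    F = lambda v: np.array([v[0] - 1.0, v[1] - 2.0, 0.1 * v[0] * v[1]])
    J = lambda v: np.array([[1.0, 0.0], [0.0, 1.0], [0.1 * v[1], 0.1 * v[0]]])
    print(ignm(F, J, [0.0, 0.0]))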
In this chapter, we are motivated by the studies [5–7] and by optimization considerations. We show our advantages by considering a combination of weak Lipschitz and weak center-Lipschitz conditions.
The chapter is organized as follows. In Section 8.2, we present some auxiliary results on Moore-Penrose inverses. The local convergence of (IGNM) and (IGNLM) is presented in Section 8.3. Numerical examples are presented in Section 8.4.
8.2 Auxiliary results

We denote by R^{m×n} the set of all m × n matrices A. We also denote by A† the Moore-Penrose inverse of the matrix A [12, 13]. If A has full rank, i.e., if rank(A) = min{m, n} = n, then A† = (A^T A)^{-1}A^T. Next, we present some standard lemmas.

Lemma 8.2.1 [12]. Suppose that A, E ∈ R^{m×n}, B = A + E, ‖A†‖‖E‖ < 1 and rank(A) = rank(B). Then

‖B†‖ ≤ ‖A†‖ / (1 − ‖A†‖‖E‖).     (8.4)

Moreover, if rank(A) = rank(B) = min{m, n}, then

‖B† − A†‖ ≤ √2 ‖A†‖²‖E‖ / (1 − ‖A†‖‖E‖).     (8.5)

Lemma 8.2.2 [12]. Suppose that A, E ∈ R^{m×n}, B = A + E, ‖E‖‖A†‖ < 1 and rank(A) = n. Then rank(B) = n.
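These perturbation bounds are straightforward to check numerically. The snippet below (illustrative only; it uses NumPy's pinv and the spectral norm) tests (8.4) and (8.5) for a random full-rank matrix and a small perturbation.

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((5, 3))    # full column rank almost surely
    E = 1e-3 * rng.standard_normal((5, 3))
    B = A + E

    Adag, Bdag = np.linalg.pinv(A), np.linalg.pinv(B)
    a, e = np.linalg.norm(Adag, 2), np.linalg.norm(E, 2)
    assert a * e < 1 and np.linalg.matrix_rank(B) == 3

    print(np.linalg.norm(Bdag, 2) <= a / (1 - a * e))          # checks (8.4)
    print(np.linalg.norm(Bdag - Adag, 2)
          <= np.sqrt(2) * a**2 * e / (1 - a * e))              # checks (8.5)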
Now, we present results involving the weak Lipschitz condition ([15]) in the case F'(x)† = F'(x)^{-1}.

Lemma 8.2.3 Suppose that F is continuously Fréchet-differentiable in U(x_*, r), F(x_*) = 0 and F'(x_*) has full rank.

(i) If F' satisfies the center-Lipschitz condition with L_0 average:

‖F'(x^θ) − F'(x_*)‖ ≤ ∫_0^{θ s(x)} L_0(u) du  for each x ∈ U(x_*, r), 0 ≤ θ ≤ 1,     (8.6)

where x^θ = x_* + θ(x − x_*), s(x) = ‖x − x_*‖ and L_0 is a positive integrable function, then, for β = ‖F'(x_*)†‖, if

β ∫_0^{s(x)} L_0(u) du < 1,     (8.7)

we have

∫_0^1 ‖F'(x^θ) − F'(x_*)‖ s(x) dθ ≤ ∫_0^{s(x)} L_0(u)(s(x) − u) du     (8.8)

and

‖F'(y)†F(x)‖ ≤ [‖x − x_*‖ + β ∫_0^{s(x)} L_0(u)(s(x) − u) du] / [1 − β ∫_0^{s(y)} L_0(u) du]     (8.9)

for each x and y ∈ U(x_*, r).

(ii) If F' satisfies the radius Lipschitz condition with L average:

‖F'(x) − F'(x^θ)‖ ≤ ∫_{θ s(x)}^{s(x)} L(u) du  for each x ∈ U(x_*, r), 0 ≤ θ ≤ 1,     (8.10)

where L is a positive integrable function, then

∫_0^1 ‖F'(x) − F'(x^θ)‖ s(x) dθ ≤ ∫_0^{s(x)} L(u)u du     (8.11)

and, if (8.7) is satisfied,

‖F'(x)†F(x)‖ ≤ s(x) + [β ∫_0^{s(x)} L(u)u du − β ∫_0^{s(x)} (L(u) − L_0(u)) s(x) du] / [1 − β ∫_0^{s(x)} L_0(u) du]
≤ s(x) + [β ∫_0^{s(x)} L(u)u du] / [1 − β ∫_0^{s(x)} L_0(u) du].     (8.12)
Proof: (i) Simply use L_0 instead of L in the proof of (ii) in Lemma 2.3 of [5, page 101].
(ii) Using (8.10), we get

∫_0^1 ‖F'(x) − F'(x^θ)‖ s(x) dθ ≤ ∫_0^1 ∫_{θ s(x)}^{s(x)} L(u) du s(x) dθ = ∫_0^{s(x)} L(u)u du,

which shows (8.11). Moreover, since F'(x_*) has full rank, it follows from Lemmas 8.2.1 and 8.2.2 and

‖F'(x_*)†‖ ‖F'(x) − F'(x_*)‖ ≤ β ∫_0^{s(x)} L_0(u) du < 1  for each x ∈ U(x_*, r)     (8.13)

that F'(x) has full rank and

‖[F'(x)^T F'(x)]^{-1} F'(x)^T‖ ≤ β / [1 − β ∫_0^{s(x)} L_0(u) du]  for each x ∈ U(x_*, r).     (8.14)

Notice that we have the estimates

F'(y)†F'(x^θ) = I − F'(y)†(F'(y) − F'(x^θ))     (8.15)

and

F'(y)†F(x) = F'(y)†(F(x) − F(x_*)) = ∫_0^1 F'(y)†F'(x^θ) dθ (x − x_*).     (8.16)

Then, in view of (8.11) and (8.14)–(8.16), we obtain

‖F'(x)†F(x)‖ = ‖∫_0^1 (I − F'(x)†(F'(x) − F'(x^θ))) dθ (x − x_*)‖ ≤ (1 + ‖F'(x)†‖ ∫_0^1 ‖F'(x) − F'(x^θ)‖ dθ) ‖x − x_*‖,

which implies (8.12). □
Remark 8.2.4 If F' satisfies the radius Lipschitz condition with L average, then F' satisfies the center-Lipschitz condition with L_0 average, but not necessarily vice versa even if L_0 = L. Notice also that in general

L_0(u) ≤ L(u)  for each u ∈ [0, R]     (8.17)

holds and L/L_0 can be arbitrarily large.
If L_0 = L, then our results reduce to the corresponding ones in Lemma 8.2.3 of [5]. Otherwise, if strict inequality holds in (8.17), then our estimates (8.7), (8.8), (8.9) and (8.12) are more precise than the corresponding ones obtained from the preceding ones for L_0 = L, see (3.1)–(3.4) in [5]. This improvement is obtained under the same computational cost, since the computation of the function L involves the computation of L_0 as a special case.
Next, we complete this auxiliary section with two more results involving functions appearing in the convergence analysis of the next section.

Lemma 8.2.5 [15] Let

ϕ(t) := (1/t^α) ∫_0^t L(u)u^{α−1} du,  α ≥ 1, 0 ≤ t ≤ r,     (8.18)

where L is a positive integrable function, monotonically increasing on [0, r]. Then the function ϕ is nondecreasing for each α.

Lemma 8.2.6 [15] Let

ψ(t) := (1/t²) ∫_0^t L(u)(αt − u) du,  α ≥ 1, 0 ≤ t ≤ r,     (8.19)

where L is as in Lemma 8.2.5. Then the function ψ is monotonically increasing.
8.3 Local convergence analysis

In this section, we present the main local convergence results for (IGNM) and (IGNLM).
The proofs are omitted, since they are analogous to the corresponding ones in [5], where we replace the old estimates

β ∫_0^{s(x)} L(u) du < 1,     (8.20)

∫_0^1 ‖F'(x^θ) − F'(x_*)‖ s(x) dθ ≤ ∫_0^{s(x)} L(u)(s(x) − u) du,     (8.21)

‖F'(y)†F(x)‖ ≤ [‖x − x_*‖ + β ∫_0^{s(x)} L(u)(s(x) − u) du] / [1 − β ∫_0^{s(y)} L(u) du]     (8.22)

and

‖F'(x)†F(x)‖ ≤ s(x) + [β ∫_0^{s(x)} L(u) du] / [1 − β ∫_0^{s(x)} L(u) du]     (8.23)

by the new and more precise (8.7), (8.8), (8.9) and (8.12), respectively.
Next, we present the local convergence results, the first one involving (IGNM).
Theorem 8.3.1 Suppose x_* satisfies (8.1), F has a continuous derivative in U(x_*, r), F'(x_*) has full rank and F' satisfies the radius Lipschitz condition with L average and the center-Lipschitz condition with L_0 average, where L and L_0 are nondecreasing. Assume B_k = F'(x_k)^T F'(x_k) for each k in (8.2) and

v_k = θ_k ‖(P_k F'(x_k)^T F'(x_k))^{-1}‖ ‖P_k F'(x_k)^T F'(x_k)‖ = θ_k Cond(P_k F'(x_k)^T F'(x_k))

with v_k ≤ v < 1. Let r > 0 satisfy

(1 + v) [β ∫_0^r L(u)u du] / [r(1 − β ∫_0^r L_0(u) du)] + [√2 c β² ∫_0^r L_0(u) du] / [r(1 − β ∫_0^r L_0(u) du)] + v ≤ 1.     (8.24)

Then (IGNM) is convergent for all x_0 ∈ U(x_*, r) and

‖x_{k+1} − x_*‖ ≤ (1 + v) [β ∫_0^{s(x_0)} L(u)u du] / [s(x_0)²(1 − β ∫_0^{s(x_0)} L_0(u) du)] ‖x_k − x_*‖² + [ (√2 c β² ∫_0^{s(x_0)} L_0(u) du) / (s(x_0)(1 − β ∫_0^{s(x_0)} L_0(u) du)) + v ] ‖x_k − x_*‖,     (8.25)

where c = ‖F(x_*)‖, β = ‖(F'(x_*)^T F'(x_*))^{-1} F'(x_*)^T‖ and

q = (1 + v) [β ∫_0^{s(x_0)} L(u)u du] / [s(x_0)(1 − β ∫_0^{s(x_0)} L_0(u) du)] + [√2 c β² ∫_0^{s(x_0)} L_0(u) du] / [s(x_0)(1 − β ∫_0^{s(x_0)} L_0(u) du)] + v < 1.     (8.26)
Proof: Let x_0 ∈ U(x_*, r), where r satisfies (8.24). From the monotonicity of L_0 and L, (8.17) and Lemma 8.2.5, we have in turn

q = (1 + v) [β ∫_0^{s(x_0)} L(u)u du] / [s(x_0)(1 − β ∫_0^{s(x_0)} L_0(u) du)] + [√2 c β² ∫_0^{s(x_0)} L_0(u) du] / [s(x_0)(1 − β ∫_0^{s(x_0)} L_0(u) du)] + v
< (1 + v) [β ∫_0^r L(u)u du] / [r(1 − β ∫_0^r L_0(u) du)] + [√2 c β² ∫_0^r L_0(u) du] / [r(1 − β ∫_0^r L_0(u) du)] + v ≤ 1

and

‖[F'(x_*)^T F'(x_*)]^{-1} F'(x_*)^T‖ ‖F'(x) − F'(x_*)‖ ≤ β ∫_0^{s(x)} L_0(u) du ≤ β ∫_0^r L_0(u) du < 1

for each x ∈ U(x_*, r). That is, q given by (8.26) is less than 1.
By Lemmas 8.2.1 and 8.2.2, F'(x) has full rank for each x ∈ U(x_*, r) and

‖[F'(x)^T F'(x)]^{-1} F'(x)^T‖ ≤ β / [1 − β ∫_0^{s(x)} L_0(u) du],  for each x ∈ U(x_*, r),

‖[F'(x)^T F'(x)]^{-1} F'(x)^T − [F'(x_*)^T F'(x_*)]^{-1} F'(x_*)^T‖ ≤ [√2 β² ∫_0^{s(x)} L_0(u) du] / [1 − β ∫_0^{s(x)} L_0(u) du],  for each x ∈ U(x_*, r).

Then, if x_k ∈ U(x_*, r), we have by (8.2) in turn that

x_{k+1} − x_* = x_k − x_* − F'(x_k)†(F(x_k) − F(x_*)) + (F'(x_k)^T F'(x_k))^{-1} r_k
= F'(x_k)† ∫_0^1 (F'(x_k) − F'(x^θ))(x_k − x_*) dθ + (F'(x_k)^T F'(x_k))^{-1} P_k^{-1} P_k r_k + (F'(x_*)^T F'(x_*))^{-1} F'(x_*)^T F(x_*) − (F'(x_k)^T F'(x_k))^{-1} F'(x_k)^T F(x_*).
Hence, by Lemma 8.2.3 and conditions (8.10) and (8.6), we obtain

‖x_{k+1} − x_*‖ ≤ ‖F'(x_k)†‖ ∫_0^1 ‖F'(x_k) − F'(x^θ)‖ ‖x_k − x_*‖ dθ + θ_k ‖(P_k F'(x_k)^T F'(x_k))^{-1}‖ ‖P_k F'(x_k)^T F'(x_k)‖ ‖(F'(x_k)^T F'(x_k))^{-1} F'(x_k)^T F(x_k)‖ + ‖(F'(x_k)^T F'(x_k))^{-1} F'(x_k)^T − (F'(x_*)^T F'(x_*))^{-1} F'(x_*)^T‖ ‖F(x_*)‖

≤ [β / (1 − β ∫_0^{s(x_k)} L_0(u) du)] ∫_0^{s(x_k)} L(u)u du + θ_k Cond(P_k F'(x_k)^T F'(x_k)) [s(x_k) + (β ∫_0^{s(x_k)} L_0(u) du) / (1 − β ∫_0^{s(x_k)} L_0(u) du)] + [√2 c β² ∫_0^{s(x_k)} L_0(u) du] / [1 − β ∫_0^{s(x_k)} L_0(u) du]

≤ (1 + v_k) [β ∫_0^{s(x_k)} L(u)u du] / [1 − β ∫_0^{s(x_k)} L_0(u) du] + v_k s(x_k) + [√2 c β² ∫_0^{s(x_k)} L_0(u) du] / [1 − β ∫_0^{s(x_k)} L_0(u) du].
Taking k = 0 above, we get ‖x_1 − x_*‖ ≤ q‖x_0 − x_*‖ < ‖x_0 − x_*‖, that is, x_1 ∈ U(x_*, r). By mathematical induction, all x_k belong to U(x_*, r) and s(x_k) = ‖x_k − x_*‖ decreases monotonically. Hence, for all k ≥ 0, we have

‖x_{k+1} − x_*‖ ≤ (1 + v_k) [β ∫_0^{s(x_k)} L(u)u du] / [s(x_k)²(1 − β ∫_0^{s(x_k)} L_0(u) du)] s(x_k)² + v_k s(x_k) + [√2 c β² ∫_0^{s(x_k)} L_0(u) du] / [1 − β ∫_0^{s(x_k)} L_0(u) du]

≤ (1 + v) [β ∫_0^{s(x_0)} L(u)u du] / [s(x_0)²(1 − β ∫_0^{s(x_0)} L_0(u) du)] s(x_k)² + [v + (√2 c β² ∫_0^{s(x_0)} L_0(u) du) / (s(x_0)(1 − β ∫_0^{s(x_0)} L_0(u) du))] s(x_k). □
Theorem 8.3.2 Suppose x_* satisfies (8.1), F(x_*) = 0, F has a continuous derivative in U(x_*, r), F'(x_*) has full rank and F' satisfies the center-Lipschitz condition (8.6) with L_0 average, where L_0 is nondecreasing. Assume B_k = F'(x_0)^T F'(x_0) for each k in (8.2) and

v_k = θ_k ‖(P_0 F'(x_0)^T F'(x_0))^{-1}‖ ‖P_0 F'(x_0)^T F'(x_0)‖ = θ_k Cond(P_0 F'(x_0)^T F'(x_0))

with v_k ≤ v < 1. Let r > 0 satisfy

(1 + v) [β ∫_0^r L_0(u)(r − u) du] / [r(1 − β ∫_0^r L_0(u) du)] + [v + β ∫_0^r L_0(u) du] / [1 − β ∫_0^r L_0(u) du] ≤ 1.     (8.27)

Then the modified inexact Gauss-Newton method (MIGNM) is convergent for all x_0 ∈ U(x_*, r),

‖x_{k+1} − x_*‖ ≤ (1 + v) [β ∫_0^{s(x_0)} L_0(u)(s(x_0) − u) du] / [s(x_0)²(1 − β ∫_0^{s(x_0)} L_0(u) du)] ‖x_k − x_*‖² + [v + β ∫_0^{s(x_0)} L_0(u) du] / [1 − β ∫_0^{s(x_0)} L_0(u) du] ‖x_k − x_*‖,     (8.28)

and the corresponding convergence factor, obtained from (8.28) as in (8.26), is less than 1.

Proof: Simply replace L by L_0 in the proof of Theorem 8.3.2 of [5]. □
Next, we present the corresponding results for the inexact Gauss-Newton-like method (IGNLM), where B_k = B(x_k) approximates F'(x_k)^T F'(x_k).

Theorem 8.3.3 Suppose x_* satisfies (8.1), F has a continuous derivative in U(x_*, r), F'(x_*) has full rank and F' satisfies the radius Lipschitz condition (8.10) with L average and the center-Lipschitz condition (8.6) with L_0 average, where L and L_0 are nondecreasing. Let B(x) be invertible with

‖B(x)^{-1} F'(x)^T F'(x)‖ ≤ ω_1,  ‖B(x)^{-1} F'(x)^T F'(x) − I‖ ≤ ω_2,     (8.29)

and v_k = θ_k ‖(P_k F'(x_k)^T F'(x_k))^{-1}‖ ‖P_k F'(x_k)^T F'(x_k)‖ = θ_k Cond(P_k F'(x_k)^T F'(x_k)) with v_k ≤ v < 1. Let r > 0 satisfy

(1 + v) [β ω_1 ∫_0^r L(u)u du] / [r(1 − β ∫_0^r L_0(u) du)] + ω_2 + ω_1 v + [√2 c β² ω_1 ∫_0^r L_0(u) du] / [r(1 − β ∫_0^r L_0(u) du)] ≤ 1.     (8.30)

Then (IGNLM) is convergent for all x_0 ∈ U(x_*, r) and

‖x_{k+1} − x_*‖ ≤ (1 + v) [β ω_1 ∫_0^{s(x_0)} L(u)u du] / [s(x_0)²(1 − β ∫_0^{s(x_0)} L_0(u) du)] ‖x_k − x_*‖² + [ω_2 + ω_1 v + (√2 c β² ω_1 ∫_0^{s(x_0)} L_0(u) du) / (s(x_0)(1 − β ∫_0^{s(x_0)} L_0(u) du))] ‖x_k − x_*‖,     (8.31)

where c = ‖F(x_*)‖, β = ‖[F'(x_*)^T F'(x_*)]^{-1} F'(x_*)^T‖ and

q = (1 + v) [β ω_1 ∫_0^{s(x_0)} L(u)u du] / [s(x_0)(1 − β ∫_0^{s(x_0)} L_0(u) du)] + ω_2 + ω_1 v + [√2 c β² ω_1 ∫_0^{s(x_0)} L_0(u) du] / [s(x_0)(1 − β ∫_0^{s(x_0)} L_0(u) du)]     (8.32)

is less than 1.
Proof: Let x_0 ∈ U(x_*, r), where r satisfies (8.30). By the monotonicity of L_0 and L, (8.17) and Lemma 8.2.5, we have

q = (1 + v) [β ω_1 ∫_0^{s(x_0)} L(u)u du] / [s(x_0)(1 − β ∫_0^{s(x_0)} L_0(u) du)] + ω_2 + ω_1 v + [√2 c β² ω_1 ∫_0^{s(x_0)} L_0(u) du] / [s(x_0)(1 − β ∫_0^{s(x_0)} L_0(u) du)]
< (1 + v) [β ω_1 ∫_0^r L(u)u du] / [r(1 − β ∫_0^r L_0(u) du)] + ω_2 + ω_1 v + [√2 c β² ω_1 ∫_0^r L_0(u) du] / [r(1 − β ∫_0^r L_0(u) du)] ≤ 1,

and

‖[F'(x_*)^T F'(x_*)]^{-1} F'(x_*)^T‖ ‖F'(x) − F'(x_*)‖ ≤ β ∫_0^{s(x)} L_0(u) du ≤ β ∫_0^r L_0(u) du < 1

for each x ∈ U(x_*, r). That is, q given by (8.32) is less than 1.
Since F'(x), x ∈ U(x_*, r), has full rank by Lemmas 8.2.1 and 8.2.2, we have

‖[F'(x)^T F'(x)]^{-1} F'(x)^T‖ ≤ β / [1 − β ∫_0^{s(x)} L_0(u) du],  for each x ∈ U(x_*, r),

‖[F'(x)^T F'(x)]^{-1} F'(x)^T − [F'(x_*)^T F'(x_*)]^{-1} F'(x_*)^T‖ ≤ [√2 β² ∫_0^{s(x)} L_0(u) du] / [1 − β ∫_0^{s(x)} L_0(u) du],  for each x ∈ U(x_*, r).

Then, if x_k ∈ U(x_*, r), we have by (8.2) in turn that

x_{k+1} − x_* = x_k − x_* − B_k^{-1} F'(x_k)^T (F(x_k) − F(x_*)) + B_k^{-1} r_k
= x_k − x_* − B_k^{-1} F'(x_k)^T ∫_0^1 F'(x^θ)(x_k − x_*) dθ + B_k^{-1} P_k^{-1} P_k r_k + B_k^{-1} F'(x_k)^T F'(x_k) {[F'(x_*)^T F'(x_*)]^{-1} F'(x_*)^T F(x_*) − (F'(x_k)^T F'(x_k))^{-1} F'(x_k)^T F(x_*)}
= B_k^{-1} F'(x_k)^T F'(x_k) F'(x_k)† ∫_0^1 (F'(x_k) − F'(x^θ))(x_k − x_*) dθ − B_k^{-1}(F'(x_k)^T F'(x_k) − B_k)(x_k − x_*) + B_k^{-1} P_k^{-1} P_k r_k + B_k^{-1} F'(x_k)^T F'(x_k) {[F'(x_*)^T F'(x_*)]^{-1} F'(x_*)^T F(x_*) − (F'(x_k)^T F'(x_k))^{-1} F'(x_k)^T F(x_*)}.
In view of Lemma 8.2.3 and conditions (8.10) and (8.6), we obtain

‖x_{k+1} − x_*‖ ≤ ‖B_k^{-1} F'(x_k)^T F'(x_k)‖ ‖F'(x_k)†‖ ∫_0^1 ‖F'(x_k) − F'(x^θ)‖ ‖x_k − x_*‖ dθ + ‖B_k^{-1}(F'(x_k)^T F'(x_k) − B_k)‖ ‖x_k − x_*‖ + θ_k ‖B_k^{-1} F'(x_k)^T F'(x_k)‖ ‖(P_k F'(x_k)^T F'(x_k))^{-1}‖ ‖P_k F'(x_k)^T F'(x_k)‖ ‖F'(x_k)† F(x_k)‖ + ‖B_k^{-1} F'(x_k)^T F'(x_k)‖ ‖[F'(x_*)^T F'(x_*)]^{-1} F'(x_*)^T − [F'(x_k)^T F'(x_k)]^{-1} F'(x_k)^T‖ ‖F(x_*)‖

≤ [β ω_1 / (1 − β ∫_0^{s(x_k)} L_0(u) du)] ∫_0^{s(x_k)} L(u)u du + ω_2 s(x_k) + ω_1 v_k [s(x_k) + (β ∫_0^{s(x_k)} L_0(u) du) / (1 − β ∫_0^{s(x_k)} L_0(u) du)] + [√2 c β² ω_1 ∫_0^{s(x_k)} L_0(u) du] / [1 − β ∫_0^{s(x_k)} L_0(u) du]

≤ (1 + v_k) [β ω_1 ∫_0^{s(x_k)} L(u)u du] / [1 − β ∫_0^{s(x_k)} L_0(u) du] + (ω_2 + ω_1 v_k) s(x_k) + [√2 c β² ω_1 ∫_0^{s(x_k)} L_0(u) du] / [1 − β ∫_0^{s(x_k)} L_0(u) du].

If k = 0 above, we obtain ‖x_1 − x_*‖ ≤ q‖x_0 − x_*‖ < ‖x_0 − x_*‖, that is, x_1 ∈ U(x_*, r); this shows that (8.2) can be continued an infinite number of times. By mathematical induction, all x_k belong to U(x_*, r) and s(x_k) = ‖x_k − x_*‖ decreases monotonically.
Hence, for all k ≥ 0, we have

‖x_{k+1} − x_*‖ ≤ (1 + v_k) [β ω_1 ∫_0^{s(x_k)} L(u)u du] / [s(x_k)²(1 − β ∫_0^{s(x_k)} L_0(u) du)] s(x_k)² + (ω_2 + ω_1 v_k) s(x_k) + [√2 c β² ω_1 ∫_0^{s(x_k)} L_0(u) du] / [s(x_k)(1 − β ∫_0^{s(x_k)} L_0(u) du)] s(x_k)

≤ (1 + v) [β ω_1 ∫_0^{s(x_0)} L(u)u du] / [s(x_0)²(1 − β ∫_0^{s(x_0)} L_0(u) du)] s(x_k)² + [ω_2 + ω_1 v + (√2 c β² ω_1 ∫_0^{s(x_0)} L_0(u) du) / (s(x_0)(1 − β ∫_0^{s(x_0)} L_0(u) du))] s(x_k). □

Remark 8.3.4 If L_0 = L, our results reduce to the corresponding ones in [5], which in turn improved earlier ones [6, 7, 10]. On the other hand, if L_0 < L, our results constitute an improvement with the advantages explained above.
8.4 Applications and examples

First, let L_0 and L be positive constant functions. Then the following corollaries are obtained.
Corollary 8.4.1 Suppose x_* satisfies (8.1), F has a continuous derivative in U(x_*, r), F'(x_*) has full rank and F' satisfies the radius Lipschitz condition with L average,

‖F'(x) − F'(x^θ)‖ ≤ (1 − θ)L ‖x − x_*‖,  0 ≤ θ ≤ 1,     (8.33)

and the center-Lipschitz condition with L_0 average,

‖F'(x^θ) − F'(x_*)‖ ≤ L_0 θ ‖x − x_*‖,     (8.34)

where x^θ = x_* + θ(x − x_*), s(x) = ‖x − x_*‖. Assume B_k = F'(x_k)^T F'(x_k) for each k in (8.2) and v_k = θ_k ‖[P_k F'(x_k)^T F'(x_k)]^{-1}‖ ‖P_k F'(x_k)^T F'(x_k)‖ = θ_k Cond(P_k F'(x_k)^T F'(x_k)) with v_k ≤ v < 1. Let r > 0 satisfy

r = 2(1 − v − √2 c L_0 β²) / (β[(1 + v)L + 2(1 − v)L_0]),     (8.35)

where c = ‖F(x_*)‖, β = ‖[F'(x_*)^T F'(x_*)]^{-1} F'(x_*)^T‖. Then (IGNM) is convergent for all x_0 ∈ U(x_*, r),

q = v + [Lβ(1 + v)‖x_0 − x_*‖ + 2√2 c L_0 β²] / [2(1 − L_0 β ‖x_0 − x_*‖)] < 1,     (8.36)

and the inequality (8.25) holds.
Corollary 8.4.2 Suppose x_* satisfies (8.1), F(x_*) = 0, F has a continuous derivative in U(x_*, r), F'(x_*) has full rank and F' satisfies the center-Lipschitz condition (8.34) with L_0 average. Assume B_k = F'(x_0)^T F'(x_0) for each k in (8.2) and v_k = θ_k ‖[P_0 F'(x_0)^T F'(x_0)]^{-1}‖ ‖P_0 F'(x_0)^T F'(x_0)‖ = θ_k Cond(P_0 F'(x_0)^T F'(x_0)) with v_k ≤ v < 1. Let r > 0 satisfy

r = 2(1 − v) / (β L_0 (5 + v)),     (8.37)

where β = ‖[F'(x_*)^T F'(x_*)]^{-1} F'(x_*)^T‖. Then (MIGNM) is convergent for all x_0 ∈ U(x_*, r),

q = [L_0 β ‖x_0 − x_*‖ + v] / [1 − L_0 β ‖x_0 − x_*‖] + [L_0 β ‖x_0 − x_*‖(1 + v)] / [2(1 − L_0 β ‖x_0 − x_*‖)] < 1,     (8.38)

and the inequality (8.28) holds.
Corollary 8.4.3 Suppose x_* satisfies (8.1), F has a continuous derivative in U(x_*, r), F'(x_*) has full rank and F' satisfies the radius Lipschitz condition (8.33) with L average and the center-Lipschitz condition (8.34) with L_0 average. Assume B(x) and F'(x)^T F'(x) satisfy (8.29), and v_k = θ_k ‖[P_k F'(x_k)^T F'(x_k)]^{-1}‖ ‖P_k F'(x_k)^T F'(x_k)‖ = θ_k Cond(P_k F'(x_k)^T F'(x_k)) with v_k ≤ v < 1. Let r > 0 satisfy

r = 2(1 − vω_1 − ω_2 − √2 c L_0 β² ω_1) / (β[(1 + v)ω_1 L + 2(1 − ω_2 − ω_1 v)L_0]),     (8.39)

where c = ‖F(x_*)‖, β = ‖[F'(x_*)^T F'(x_*)]^{-1} F'(x_*)^T‖. Then (IGNLM) is convergent for all x_0 ∈ U(x_*, r),

q = ω_1 v + ω_2 + [Lβω_1(1 + v)‖x_0 − x_*‖ + 2√2 c L_0 β² ω_1] / [2(1 − L_0 β ‖x_0 − x_*‖)] < 1,     (8.40)

and the inequality (8.31) holds.

Remark 8.4.4 (a) If L_0 = L, the results reduce to the corresponding ones in [5]. On the other hand, if L_0 < L, the new results constitute an improvement.

(b) The results of Section 5 in [5–7], using only the center-Lipschitz condition, can be improved with L_0 replacing L.
Example 8.4.5 Let X = Y = R³, D = Ū(0, 1) and x_* = (0, 0, 0)^T. Define the function F on D for w = (x, y, z)^T by

F(w) = (e^x − 1, [(e − 1)/2] y² + y, z)^T.     (8.41)

Then, the Fréchet derivative of F is given by

F'(w) = diag(e^x, (e − 1)y + 1, 1).

Notice that we have F(x_*) = 0, F'(x_*) = F'(x_*)^{-1} = diag{1, 1, 1} and L_0 = e − 1 < L = e.
e − 1 < L = e.
References

[1] I. K. Argyros, Y. J. Cho and S. Hilout, Numerical methods for equations


and its applications, CRC Press, Taylor and Francis, New York, 2012.
[2] I. K. Argyros and S. Hilout, Weak convergence conditions for inexact
Newton-type methods, Appl. Math. Comput., 218, 2800–2809, 2011.
[3] I. K. Argyros and S. Hilout, Extending the applicability of the Gauss-
Newton method under average Lipschitz-type conditions, Numer. Algo-
rithms, 58, 23–52, 2011.
[4] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of
Newton’s method, J. Complexity, 28, 364–387, 2012.
[5] J. Chen, The convergence analysis of inexact Gauss-Newton methods for
nonlinear problems, Comput. Optim. Appl., 40, 97–118, 2008.
[6] J. Chen and W. Li, Convergence of Gauss-Newton’s method and uniqueness
of the solution, Appl. Math. Comput., 170, 686–705, 2005.
[7] J. Chen and W. Li, Convergence behaviour of inexact Newton methods
under weaker Lipschitz condition, J. Comput. Appl. Math., 191, 143–164,
2006.
[8] R. S. Dembo, S. C. Eisenstat and T. Steihaus, Inexact Newton methods,
SIAM J. Numer. Anal., 19, 400–408, 1982.
[9] J. E. Jr. Dennis and R. B. Schnabel, Numerical methods for unconstrained
optimization and nonlinear equations (Corrected reprint of the 1983 origi-
nal), Classics in Appl. Math., SIAM 16, Philadelphia, PA, 1996.
[10] B. Morini, Convergence behaviour of inexact Newton method, Math.
Comp., 61, 1605–1613, 1999.
159
160  References

[11] J. M. Ortega and W. C. Rheinholdt, Iterative Solution of Nonlinear Equa-


tions in Several Variables, Academic Press, New York, 1970.
[12] G. W. Stewart, On the continuity of the generalized inverse, SIAM J. Appl.
Math., 17, 33–45, 1969.
[13] W. C. Rheinholdt, An adaptive continuation process for solving systems
of nonlinear equations, Polish Academy of Science, Banach Ctr. Publ., 3,
129–142, 1977.
[14] J. F. Traub, Iterative methods for the solution of equations, Prentice-Hall
Series in Automatic Computation, Englewood Cliffs, N. J., 1964.
[15] X. Wang, The Convergence on Newton’s method. Kexue Tongbao (A spe-
cial issue of Mathematics, Physics & Chemistry), 25, 36–37, 1980.
Chapter 9

Lavrentiev Regularization
methods for Ill-posed
equations

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
9.2 Basic assumptions and some preliminary results . . . . . . . . . . . . . . . . . . . . 163
9.3 Error estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
9.3.1 Apriori parameter choice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
9.3.2 Aposteriori parameter choice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
9.4 Numerical examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

161
162  Iterative methods and their dynamics with applications: a contemporary study

9.1 Introduction
In this chapter, we consider the problem of approximately solving the nonlin-
ear ill-posed operator equation of the form

F(x) = y, (9.1)

where F : D(F) ⊂ X → X is a monotone operator and X is a real Hilbert space.


We denote the inner product and the corresponding norm on a Hilbert space by
·, · and  · , respectively. Let U(x, r) stand for the open ball in X with center
x ∈ X and radius r > 0. Recall that F is said to be a monotone operator if it
satisfies the relation
F(x1 ) − F(x2 ), x1 − x2  ≥ 0 (9.2)
for all x1 , x2 ∈ D(F).
We assume, throughout this chapter, that yδ ∈ Y are the available noisy data
with
y − yδ  ≤ δ (9.3)
and (9.1) has a solution x̂. Since (9.1) is ill-posed, the regularization methods
are used ( [9–11, 13, 14, 19, 21, 22]) for approximately solving (9.1). Lavrentiev
regularization is used to obtain a stable approximate solution of (9.1). In the
Lavrentiev regularization, the approximate solution is obtained as a solution of
the equation
F(x) + α(x − x0 ) = yδ , (9.4)
where α > 0 is the regularization parameter and x0 is an initial guess for the
solution x̂. For deriving the error estimates, we shall make use of the following
equivalent form of (9.4),

xαδ = x0 + (Aδα + αI)−1 [yδ − F(xαδ ) + Aδα (xαδ − x0 )], (9.5)

where Aδα = F  (xαδ ).


In [15], Mahale and Nair, motivated by the chapter of Tautenhan [22], con-
sidered Lavrentiev regularization of (9.1) under a general source condition on
x̂ − x0 and obtained an order optimal error estimate.
In the present chapter, we are motivated by [15]. In particular, we expand the
applicability of the Lavrentiev regularization of (9.1) by weakening one of the
major hypotheses in [15] (see below Assumption 2.1 (ii) in the next section).
The chapter is organized as follows: In Section 9.2, we present some basic
assumptions and preliminaries required. The main order optimal result using the
apriori and a posteriori parameter choice is presented in Section 9.3. Finally some
numerical examples are presented in the concluding Section 9.4.
Lavrentiev Regularization methods for Ill-posed equations  163

9.2 Basic assumptions and some preliminary results


We use the following assumptions to prove the results in this chapter.

Assumption 10 (1) There exists r > 0 such that U(x0 , r) ⊆ D(F) and F :
U(x0 , r) → X is Fréchet differentiable.
(2) There exists K0 > 0 such that, for all uθ = u + θ (x0 − u) ∈ U(x0 , r), θ ∈
[0, 1] and v ∈ X, there exists an element, say φ (x0 , uθ , v) ∈ X, satisfying

[F  (x0 ) − F  (uθ )]v = F  (uθ )φ (x0 , uθ , v), φ (x0 , uθ , v) ≤ K0 vx0 − uθ 

for all uθ ∈ U(x0 , r) and v ∈ X.


(3) F  (u) + αI)−1 F  (uθ ) ≤ 1 for all uθ ∈ U(x0 , r)
(4) (F  (u) + αI)−1  ≤ α1 .

The condition (2) in Assumption 10 weakens the popular hypotheses given


in [15], [17] and [20].

Assumption 11 There exists a constant K > 0 such that, for all x, y ∈ U(x̂, r) and
v ∈ X, there exists an element denoted by P(x, u, v) ∈ X satisfying

[F  (x) − F  (u)]v = F  (u)P(x, u, v), P(x, u, v) ≤ Kvx − u.

Clearly, Assumption 11 implies Assumption 10 (2) with K0 = K, but not


necessarily vice versa. Note that K0 ≤ K holds in general and KK0 can be arbitrarily
large [1]– [5]. In fact, there are many classes of operators satisfying Assumption
10 (2), but not Assumption 11 (see the numerical examples at the end of this
chapter). Furthermore, if K0 is sufficiently smaller than K which can happen
since KK0 can be arbitrarily large, then the results obtained in this chapter provide
a tighter error analysis than the one in [15].
Finally, note that the computation of constant K is more expensive than the
computation of K0 .

Assumption 12 There exists a continuous and strictly monotonically increasing


function ϕ : (0, a] → (0, ∞) with a ≥ F  (x0 ) satisfying
(1) limλ →0 ϕ(λ ) = 0;
(2) supλ ≥0 αϕ(λ )
λ +α ≤ ϕ(α) for all α ∈ (0, a];
(3) there exists v ∈ X with v ≤ 1 such that

x̂ − x0 = ϕ(F  (x0 ))v. (9.6)

Note that condition (9.6) is suitable for both mildly and severely ill-posed prob-
lems [16], [17]. Further note that the source condition (9.6) involves the known
initial approximation x0 whereas the source condition considered in [15] requires
the knowledge of the unknown x̂.
We need some auxiliary results based on Assumption 10.
164  Iterative methods and their dynamics with applications: a contemporary study

Proposition 9.2.1 For each u ∈ U(x0 , r) and α > 0,


5K0
(F  (u)+αI)−1 [F(x̂)−F(u)−F  (u)(x̂−u) ≤ x̂−u2 +2K0 x̂−x0 x̂−u.
2

Proof: From the fundamental theorem of integration, for any u ∈ U(x0 , r), we
get
 1
F(x̂) − F(u) = F  (u + t(x̂ − u))(x̂ − u)dt.
0
Hence, by Assumption 10,
F(x̂) − F(u) − F  (u)(x̂ − u)
 1
= [F  (u + t(x̂ − u)) − F  (x0 ) + F  (x0 ) − F  (u)](x̂ − u)dt
0
 1
= F  (x0 )[φ (x0 , u + t(x̂ − u), u − x̂) − φ (x0 , u, x̂ − u)]dt.
0

Then, by (2), (3) in Assumptions 10 and the inequality (F  (u)+αI)−1 F  (uθ ) ≤
1, we get
(F  (u) + αI)−1 [F(x̂) − F(u) − F  (u)(x̂ − u)
 1
≤ φ (x0 , u + t(x̂ − u), u − x̂) + φ (x0 , u, x̂ − u)dt.
0
 1
≤ [ K0 (u − x0  + x̂ − ut)dt + K0 u − x0 ]x̂ − u
0
 1
≤ [ K0 (u − x̂ + x̂ − x0  + x̂ − ut)dt + K0 (u − x̂ + x̂ − x0 )]x̂ − u
0
5K0
≤ x̂ − u2 + 2K0 x̂ − x0 x̂ − u.
2

Theorem 9.2.2 ( [16, Theorem 2.2], also see [22]) Let xα be the solution of (9.4)
with y in place of yδ . Then
(1) xαδ − xα  ≤ αδ ,
(2) xα − x̂ ≤ x0 − x̂,
(3) F(xαδ ) − F(xα ) ≤ δ .
Remark 9.2.3 From Theorem 9.2.2 and triangle inequality we have
δ
xαδ − x̂ ≤ + x0 − x̂.
α
Lavrentiev Regularization methods for Ill-posed equations  165

So (9.5) is meaningful if
δ
r> + x0 − x̂.
α

9.3 Error estimates


We will try to find error bounds for xα − x̂ in order to find the error estimates.
Theorem 9.3.1 Let Assumption 10, 12 hold, and let K0 x0 − x̂ < 29 . Then
xα − x̂ ≤ Cϕ(α)
1+2K0 x0 −x̂
where C = 1−9K0 x0 −x̂ /2 .

Proof: Let Aα = F  (xα ) and A0 = F  (x0 ). Then


xα = x0 + (Aα + αI)−1 [y − F(xα ) + Aα (xα − x0 )].
So
xα − x̂ = x0 − x̂ + (Aα + αI)−1 [y − F(xα ) + Aα (xα − x0 )]
= x0 − x̂ + (Aα + αI)−1 [F(x̂) − F(xα ) + Aα (xα − x̂) + Aα (x̂ − x0 )]
= (Aα + αI)−1 [α(x0 − x̂) + F(x̂) − F(xα ) + Aα (xα − x̂)]
= (Aα + αI)−1 α(x0 − x̂) + (Aα + αI)−1 [F(x̂) − F(xα ) + Aα (xα − x̂)]
= (A0 + αI)−1 α(x0 − x̂) + [(Aα + αI)−1 − (A0 + αI)−1 ]α(x0 − x̂)
+(Aα + αI)−1 [F(x̂) − F(xα ) + Aα (xα − x̂)]
= v0 + (Aα + αI)−1 (A0 − Aα )v0 + Δ1 (9.7)
where v0 = (A0 +αI)−1 α(x0 − x̂) and Δ1 = (Aα +αI)−1 [F(x̂)−F(xα )+Aα (xα −
x̂)]. By Assumption 12, we have
v0  ≤ ϕ(α), (9.8)
by Assumption 10, 12 and Theorem 9.2.2, we get
(Aα + αI)−1 (A0 − Aα )v0 = (Aα + αI)−1 A0 ϕ(x0 , xα , v0 )
≤ K0 x0 − xα v0 
≤ K0 (x0 − x̂ + x̂ − xα )v0 
≤ 2K0 x0 − x̂ϕ(α) (9.9)
and by Proposition 9.2.1
5K0
Δ1 ≤ x̂ − xα 2 + 2K0 x̂ − x0 x̂ − xα 
2
9K0
≤ x̂ − x0 x̂ − xα . (9.10)
2
166  Iterative methods and their dynamics with applications: a contemporary study

The result now follows from (9.7), (9.8), (9.9) and (9.10).
The following Theorem is a consequence of Theorem 9.2.2 and Theorem
9.3.1.
Theorem 9.3.2 Under the assumptions of Theorem 9.3.1
δ
xαδ − x̂ ≤ C( + ϕ(α))
α
where C is as in Theorem 9.3.1.

9.3.1 Apriori parameter choice


Let ψ : (0, ϕ(a)] → (0, aϕ(a)] be defined as
ψ(λ ) := λ ϕ −1 (λ ). (9.11)
δ
Then α = ϕ(α) ⇔ δ = ψ(ϕ(α)).
Theorem 9.3.3 Let the assumptions of Theorem 9.3.1 be satisfied. If we choose
the regularization parameter as α = ϕ −1 (ψ −1 (δ )) with ψ defined as in (9.11),
then
xαδ − x̂ ≤ 2Cψ −1 (δ ). (9.12)

9.3.2 Aposteriori parameter choice


It is worth noticing that the choice of α in the above Theorem depends on the
unknown source function ϕ. In applications, it is preferable that parameter α is
chosen independent of the source function ϕ, but may depend on (δ , yδ ), and
hence on the regularized solution.
For Lavrentiev regularization (9.5), Tautenhan in [22] considered the follow-
ing discrepancy principal for chosing α,
α(Aδα + αI)−1 [F(xαδ ) − yδ ] = cδ (9.13)
where c > 0 is an appropriate constant. The error estimate in [22] is obtained
under the assumption that the solution satisfies a Holder-type condition. In [15],
Mahale and Nair considered the discrepancy principle (9.13) and extended the
analysis of Tautenhan [22] to include both Holder type and logarithmic type
conditions. In this chapter we consider discrepancy principle (9.13) and derive
an order optimal error estimate under the Assumption 12.
From the following proposition presented in [22], in which the existence of
α satisfying (9.13) is showed.
Proposition 9.3.4 ( [22, Proposition 4.1]) Let F be monotone and F(x0 ) −
(c−1)δ
yδ  ≥ cδ with c > 2. Then there exists an α ≥ β0 := x 0 −x̂
satisfying (9.13).
Lavrentiev Regularization methods for Ill-posed equations  167

Lemma 9.3.5 Let Assumption 10 and assumptions in Proposition 9.3.4 hold and
α := α(δ ) is chosen according to (9.13). Then
(c − 2)δ
α(A0 + αI)−1 (F(xα ) − y) ≥ (9.14)
1 + k1
K0 (2c−1) x0 −x̂
where k1 = c−1 .

Proof: By (9.13), we obtain

|cδ − α(Aδα + αI)−1 (F(xα ) − y)| = |α(Aδα + αI)−1 (F(xαδ ) − yδ )


−α(Aδα + αI)−1 (F(xα ) − y)|
≤ α(Aδα + αI)−1 [F(xαδ ) − F(xα )
−(yδ − y)]
≤ F(xαδ ) − F(xα ) + yδ − y
≤ 2δ .

The last step follows from Theorem 9.2.2. So

(c − 2)δ ≤ α(Aδα + αI)−1 (F(xα ) − y) ≤ (c + 2)δ .

Let
aα := α(A0 + αI)−1 (F(xα ) − y).
Then

α(Aδα + αI)−1 (F(xα ) − y) ≤ aα  + α[(Aδα + αI)−1 − (A0 + αI)−1 ]


(F(xα ) − y)
≤ aα  + [(Aδα + αI)−1 (A0 − Aδα )aα 
≤ aα  + [(Aδα + αI)−1 Aδα ϕ(x0 , xαδ , aα )
≤ aα (1 + K0 x0 − xαδ )
≤ aα (1 + K0 (x0 − x̂ + x̂ − xαδ ))
δ
≤ aα (1 + K0 (x0 − x̂ + + x̂ − x0 ))
α
δ
≤ aα (1 + K0 (2x0 − x̂ + ))
α
x0 − x̂
≤ aα (1 + K0 (2x0 − x̂ + ))
c−1
≤ aα (1 + k1 ),
(c−2)δ
which in turn implies α(A0 + αI)−1 (F(xα ) − y) ≥ 1+k1 .

Lemma 9.3.6 Let the assumptions of Theorem 9.3.1 and Proposition 9.3.4 hold.
Then
168  Iterative methods and their dynamics with applications: a contemporary study

(1) α(A0 + αI)−1 (F(xα ) − y) ≤ μαϕ(α),


(2) α ≥ ϕ −1 (ψ −1 (ξ δ )) where μ = C(1 + 3K2 0 x0 − x̂) and ξ = 1+k1 .
c−2

Proof: By Assumption 10, it is clear that for x, z ∈ U(x0 , r) and u ∈ X,

F  (z)u = F  (x0 )u − F  (z)ϕ(x0 , z, u), ϕ(x0 , z, u) ≤ K0 x0 − zu.

So
 1
F(xα ) − F(x̂) = F  (x̂ + t(xα − x̂))(xα − x̂)dt
0
 1
= A0 (xα − x̂) − F  (x̂ + t(xα − x̂))ϕ(x0 , x̂ + t(xα − x̂), xα − x̂)dt.
0

Hence

α(A0 + αI)−1 (F(xα ) − F(x̂))


= α(A0 + αI)−1 A0 (xα − x̂)
 1
− α(A0 + αI)−1 F  (x̂ + t(xα − x̂))
0
ϕ(x0 , x̂ + t(xα − x̂), xα − x̂)dt
xα − x̂
≤ α(xα − x̂ + K0 (x0 − x̂ + )xα − x̂)
2
3K0
≤ α(1 + x0 − x̂)xα − x̂.
2
The last follows from Theorem 9.2.2. Now by using Theorem 9.3.1, we have
α(A0 + αI)−1 (F(xα ) − F(x̂)) ≤ C(1 + 3K2 0 x0 − x̂)αϕ(α) = μαϕ(α). In
view of (9.14), we get
(c − 2)δ
≤ μαϕ(α)
1 + k1
which implies, by definition of ψ

(c − 2)δ
ψ(ϕ(α)) = αϕ(α) ≥ := ξ δ
1 + k1
(c−2)
where ξ = 1+k1 . Thus α ≥ ϕ −1 (ψ −1 (ξ δ )). This completes the proof.
Theorem 9.3.7 Let Assumption 10 be satisfied and 4K0 x0 − x̂ < 1. Then for
0 < α0 ≤ α,

α(A0 + αI)−1 (F(xα ) − F(x̂))


xα − xα0  ≤ .
(1 − 4x0 − x̂)α0
Lavrentiev Regularization methods for Ill-posed equations  169

Proof: Since
F(xα ) − y + α(xα − x0 ) = 0, (9.15)

F(xα0 ) − y + α0 (xα0 − x0 ) = 0, (9.16)


and

α0 (xα − xα0 ) = (α − α0 )(x0 − xα ) + α0 (x0 − xα0 ) − α(x0 − xα ),

we have by (9.15) and (9.16),


α − α0
α0 (xα − xα0 ) = (F(xα ) − y) + F(xα0 ) − F(xα ),
α
and
α − α0
(Aα + α0 I)(xα − xα0 ) = (F(xα ) − y) + [F(xα0 ) − F(xα ) + Aα (xα − xα0 )].
α
Hence

α − α0
xα −xα0 = (Aα +α0 I)−1 (F(xα )−y)+(Aα +α0 I)−1 [F(xα0 )−F(xα )+Aα (xα −xα0 )].
α

Thus by Proposition 9.3.4


α − α0
xα − xα0  ≤  (Aα + α0 I)−1 (F(xα ) − y) + Γ1 (9.17)
α
where Γ1 = (Aα + α0 I)−1 [F(xα0 ) − F(xα ) + Aα (xα − xα0 )]. By Fundamental
Theorem of Integration and Assumption 10, we have
 1
F(xα0 ) − F(xα ) + Aα (xα − xα0 ) = [F  (xα + t(xα0 − xα )) − Aα ](xα0 − xα )dt
0
 1
= [F  (xα + t(xα0 − xα )) − F  (x0 )
0
+F  (x0 ) − Aα ](xα0 − xα )dt
 1
= [−F  (uθ )ϕ(x0 , uθ , xα0 − xα )
0
+Aα ϕ(x0 , xα , xα0 − xα )]dt

where uθ = x0 + θ (x0 − (xα +t(xα0 − xα ))). Hence, again by Assumption 10, we


170  Iterative methods and their dynamics with applications: a contemporary study

get in turn that


 1
Γ1 ≤  ϕ(x0 , uθ , xα0 − xα )dt + ϕ(x0 , xα , xα0 − xα )
0
 1
≤ K0 [ x0 − uθ dt + x0 − xα ]xα0 − xα 
0
≤ K0 [x0 − xθ /2 + x0 − xα ]xα0 − xα 
≤ K0 [x0 − xα /2 + x0 − xα0 /2 + x0 − xα ]xα0 − xα 
≤ K0 [3(x0 − x̂ + x̂ − xα )/2 + (x0 − x̂ + x̂ − xα0 )/2]xα0 − xα 
≤ 4K0 x0 − x̂xα0 − xα . (9.18)
α−α0
The last step follows from Theorem 9.2.2. Now since α < 1, we have
α − α0
 (Aα + α0 I)−1 (F(xα ) − y) ≤ (Aα + α0 I)−1 (F(xα ) − y)
α
1
≤ α0 (Aα + α0 I)−1
α0 α
(Aα + α0 I)α(Aα + α0 I)−1 (F(xα ) − y)
α0 (λ + α)
≤ sup | |
λ ≥0 α(λ + α0

α(Aα + α0 I)−1 (F(xα ) − y)


α0
α(Aα + α0 I)−1 (F(xα ) − y)
≤ . (9.19)
α0
So by (9.17), (9.18) and (9.19) we have
α(Aα + α0 I)−1 (F(xα ) − y)
xα − xα0  ≤ .
α0 (1 − 4K0 x − x̂)

Lemma 9.3.8 Let assumptions of Lemma 9.3.6 hold. Then
α(Aα + α0 I)−1 (F(xα ) − y) ≤ β δ ,
where β = (c + 2)(1 + K0 (4c−3) x
c−1
0 −x̂
).
Proof: Observe that
α(Aα + α0 I)−1 (F(xα ) − y) ≤ α(Aδα + α0 I)−1 (F(xα ) − y)
+α[(Aα + α0 I)−1 − (Aδα + α0 I)−1 ]
(F(xα ) − y)
≤ α(Aδα + α0 I)−1 (F(xα ) − y) (9.20)
+α(Aα + α0 I) (Aα − Aα )(Aα + α0 I)−1
−1 δ δ

(F(xα ) − y).
Lavrentiev Regularization methods for Ill-posed equations  171

Let aδ = α(Aδα + α0 I)−1 (F(xα ) − y). Then

α(Aα + α0 I)−1 (F(xα ) − y) ≤ aδ  + α(Aα + α0 I)−1 (Aδα − A0 + A0 − Aα )


(Aδα + α0 I)−1 (F(xα ) − y)
≤ aδ  +  − (Aα + α0 I)−1 Aδα ϕ(x0 , xαδ , aδ )
+(Aα + α0 I)−1 Aα ϕ(x0 , xα , aδ )
≤ aδ (1 + K0 (x0 − xαδ  + x0 − xα ))
≤ aδ (1 + K0 (xα − xαδ  + 2x0 − xα ))
δ
≤ aδ (1 + K0 ( + 4x0 − x̂))
α
1
≤ aδ (1 + K0 ( + 4)x0 − x̂)
c−1
4c − 3
≤ aδ (1 + K0 x0 − x̂).
c−1
Now since aδ  ≤ (c + 2)δ , we have
4c − 3
α(Aα + α0 I)−1 (F(xα ) − y) ≤ (1 + K0 x0 − x̂)(c + 2)δ .
c−1

Now, we present the main result of the chapter.

Theorem 9.3.9 Let assumptions of Lemma 9.3.6 hold. If, in addition, 9K0 x0 −
x̂ < 2, then
xαδ − x̂ ≤ ℘ψ −1 (ηδ )
where ℘ = 1
ξ + 1−4K01 x0 −x̂ +C and η = max{ξ , β }.

Proof: Firstly, we begin with the case α := α(δ ) ≤ α0 . Then by Theorem 9.3.2
we get
δ
xαδ − x̂ ≤ C( + ϕ(α0 )). (9.21)
α
Now, we follow with the case α := α(δ ) > α0 . In this case by Theorem 9.3.7,
we obtain

xαδ − x̂ ≤ xαδ − xα  + xα − xα0  + xα0 − x̂


δ α(Aα + α0 I)−1 (F(xα ) − y)
≤ +Cϕ(α0 ) + . (9.22)
α α0 (1 − 4K0 x − x̂)
From (9.21) and (9.22)

δ α(Aα + α0 I)−1 (F(xα ) − y)


xαδ − x̂ ≤ +Cϕ(α0 ) +
α α0 (1 − 4K0 x − x̂)
172  Iterative methods and their dynamics with applications: a contemporary study

for all α ∈ (0, a]. Let α0 := ϕ −1 ψ −1 (β δ ) with β = (1 + K0 (4c − 3))x0 − x̂(c +


2)/(c − 1). Then by Lemma 9.3.6 and 9.3.8, we get

δ
xαδ − x̂ ≤ +Cϕ(α0 )
α
α(Aα + α0 I)−1 (F(xα ) − y)
+
α0 (1 − 4K0 x − x̂)
δ βδ
≤ + +Cψ −1 (β δ ).
ϕ −1 ψ −1 (ξ δ ) (1 − 4K0 x − x̂)ϕ −1 ψ −1 (β δ )
λ
Now since ϕ −1 ψ −1 (λ ) = ψ −1 (λ ) we have

ψ −1 (ξ δ ) ψ −1 (β δ )
xαδ − x̂ ≤ + +Cψ −1 (β δ )
ξ 1 − 4K0 x − x̂
1 1
≤ ( + +C)ψ −1 (ηδ ). (9.23)
ξ 1 − 4K0 x0 − x̂

where η = max{ξ , β }. This completes the proof.

9.4 Numerical examples


Example 9.4.1 Let X = ([0, 1]). We consider the space of continuous functions
defined on [0, 1] equipped with the max norm and D(F) = U(0, 1). Define an
operator F on D(F) by
 1
F(h)(x) = h(x) − 5 xθ h(θ )3 dθ . (9.24)
0

Then the Fréchet-derivative is given by


 1

F (h[u])(x) = u(x) − 15 xθ h(θ )2 u(θ )dθ (9.25)
0

for all u ∈ D(F). Using (9.24), (9.25), Assumptions 10 (2), 11 for x̂ = 0, we get
K0 = 7.5 < K = 15.

Example 9.4.2 Let X = D(F) = R, x̂ = 0 and define a function F on D(F) by

F(x) = d0 x − d1 sin 1 + d1 sin ed2 x , (9.26)

where d0 , d1 and d2 are the given parameters. Note that F(x̂) = F(0) = 0. Then it
can easily be seen that, for d2 sufficiently large and d1 sufficiently small, KK0 can
be arbitrarily large.
Lavrentiev Regularization methods for Ill-posed equations  173

We now present two examples where Assumption 11 is not satisfied, but As-
sumption 10 (2) is satisfied.

Example 9.4.3 Let X = D(F) = R, x̂ = 0 and define a function F on D by


1
x1+ i i
F(x) = + c1 x − c1 − , (9.27)
1+ i1 i + 1

where c1 is a real parameter and i > 2 is an integer. Then F  (x) = x1/i + c1 is


not Lipschitz on D. Hence Assumption 11 is not satisfied. However, the central
Lipschitz condition in Assumption 11 (2) holds for K0 = 1. We also have that
F(x̂) = 0. Indeed, we have

F  (x) − F  (x̂) = |x1/i − x̂1/i |


|x − x̂|
= i−1 i−1
x̂ i + · · · + x i

and so
F  (x) − F  (x̂) ≤ K0 |x − x̂|.
References

[1] I. K. Argyros, Convergence and Application of Newton-type Iterations,


Springer, 2008.
[2] I. K. Argyros, Approximating solutions of equations using Newton’s
method with a modified Newton’s method iterate as a starting point, Rev.
Anal. Numer. Theor. Approx., 36, 123–138, 2007.
[3] I. K. Argyros, A semilocal convergence for directional Newton methods,
Math. Comput. (AMS), 80, 327–343, 2011.
[4] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method, J. Complexity, 28, 364–387, 2012.
[5] I. K. Argyros, Y. J. Cho and S. Hilout, Numerical methods for equations
and its applications, CRC Press, Taylor and Francis, New York, 2012.
[6] A. Bakushinskii and A. Seminova, On application of generalized discrep-
ancy principle to iterative methods for nonlinear ill-posed problems, Numer.
Funct. Anal. Optim., 26, 35–48, 2005.
[7] A. Bakushinskii and A. Seminova, A posteriori stopping rule for regular-
ized fixed point iterations, Nonlinear Anal., 64, 1255–1261, 2006.
[8] A. Bakushinskii and A. Seminova, Iterative regularization and general-
ized discrepancy principle for monotone operator equations, Numer. Funct.
Anal. Optim., 28, 13–25, 2007.
[9] A. Binder, H. W. Engl, C. W. Groetsch, A. Neubauer and O. Scherzer,
Weakly closed nonlinear operators and parameter identification in parabolic
equations by Tikhonov regularization, Appl. Anal., 55, 215–235, 1994.
[10] H. W. Engl, M. Hanke and A. Neubauer, Regularization of Inverse Prob-
lems, Dordrecht, Kluwer, 1993.
174
References  175

[11] H. W. Engl, K. Kunisch and A. Neubauer, Convergence rates for Tikhonov


regularization of nonlinear ill-posed problems, Inverse Problems, 5, 523–
540, 1989.
[12] Q. Jin, On the iteratively regularized Gauss-Newton method for solving
nonlinear ill-posed problems, Math. Comp., 69, 1603–1623, 2000.
[13] Q. Jin and Z. Y. Hou, On the choice of the regularization parameter for ordi-
nary and iterated Tikhonov regularization of nonlinear ill-posed problems,
Inverse Problems, 13, 815–827, 1997.
[14] Q. Jin and Z. Y. Hou, On an a posteriori parameter choice strategy for
Tikhonov regularization of nonlinear ill-posed problems, Numer. Math., 83,
139–159, 1990.
[15] P. Mahale and M. T. Nair, Lavrentiev regularization of nonlinear ill-posed
equations under general source condition, J. Nonlinear Anal. Optim., 4 (2),
193–204, 2013.
[16] P. Mahale and M. T. Nair, Iterated Lavrentiev regularization for nonlinear
ill-posed problems, ANZIAM, 51, 191–217, 2009.
[17] P. Mahale and M.T. Nair, General source conditions for nonlinear ill-posed
problems, Numer. Funct. Anal. Optim. 28, 111–126, 2007.
[18] M. T. Nair and U. Tautanhahn, Lavrentiev regularization for linear ill-posed
problems under general source conditions, Z. Anal. Anwendungen, 23, 167–
185, 2004.
[19] O. Scherzer, H. W. Engl and K. Kunisch, Optimal a posteriori parameter
choice for Tikhonov regularization for solving nonlinear ill-posed prob-
lems, SIAM J. Numer. Anal., 30, 1796–1838, 1993.
[20] E. V. Semenova, Lavrentiev regularization and balancing principle for solv-
ing ill-posed problems with monotone operators, Comput. Methods Appl.
Math., 4, 444–454, 2010.
[21] U. Tautenhahn, Lavrentiev regularization of nonlinear ill-posed problems,
Vietnam J. Math., 32, 29–41, 2004.
[22] U. Tautenhahn, On the method of Lavrentiev regularization for nonlinear
ill-posed problems, Inverse Problems, 18, 191–207, 2002.
[23] U. Tautenhahn and Q. Jin, Tikhonov regularization and a posteriori rule for
solving nonlinear ill-posed problems, Inverse Problems, 19, 1–21, 2003.
Chapter 10

King-Werner-type

methods of order 1 + 2

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
10.2 Majorizing sequences for King-Werner-type methods . . . . . . . . . . . . . . . 178
10.3 Convergence analysis of King-Werner-type methods . . . . . . . . . . . . . . . . 183
10.4 Numerical examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188

10.1 Introduction
Iterative methods are used to generate a sequence of approximating a solution x∗
of the nonlinear equation
F(x) = 0, (10.1)
where F is Fréchet-differentiable operator defined on a convex subset D of a
Banach space X with values in a Banach space Y .
176

King-Werner-type methods of order 1 + 2  177

Newton’s method converges quadratically to x if the initial guess is close to


enough to the solution. Iterative methods of convergence order higher than two
such as Chebyshev-Halley-type methods [1]– [21] require the evaluation of the
second Fréchet-derivative, which is very expensive in general. However, there
are integral equations where the second Fréchet-derivative is diagonal by blocks
and inexpensive [1,2,5] or for quadratic equations, the second Fréchet-derivative
is constant. Moreover, in some applications involving stiff systems, high order
methods are useful. That is why it is important to study high-order methods.
In particular, Werner in [15,16] studied a method originally proposed by King
[8] defined by: Given x0 , y0 ∈ D, let
xn+1 = xn − F  ( xn +y n −1
2 ) F(xn ), (10.2)
yn+1 = xn+1 − F  ( n +y
x n −1
2 ) F(xn+1 )

for each n = 0, 1, 2, . . . and X = Ri , Y = R where i is a whole number. The local


convergence analysis is based on assumptions of the form:
(H0 ) There exists x ∈ D such that F(x ) = 0;
(H1 ) F ∈ C2,a (D), a ∈ (0, 1];
(H2 ) F  (x)−1 ∈ L(Y, X) and F  (x)−1  ≤ Γ;
(H3 ) The Lipschitz condition
F  (x) − F  (y) ≤ L1 x − y
holds for each x, y ∈ D;
(H4 ) The Lipschitz condition
F  (x) − F  (y) ≤ L2,a x − ya
holds for each x, y ∈ D;
(H5 ) U(x , b) ⊆ D for
b = min{b0 , ρ0 , ρ1 },
where
2
b0 =
L1 Γ
and ρ0 , ρ1 solve the system [14, p.337]
Bv1+a + Aw = 1
2Av2 + Avw = w,
ΓL2,a
A = 12 ΓL1 , B = 4(a+1)(a+2) .

The convergence order was shown to be 1 + 2. However, there are cases where
condition (H4 ) is not satisfied.
For example, define function f : [−1, 1] → (−∞, ∞) by
f (x) = x2 ln x2 + x2 + x + 1, f (0) = 1.
178  Iterative methods and their dynamics with applications: a contemporary study

Then, we have that limx→0 x2 ln x2 = 0, limx→0 x ln x2 = 0, f  (x) = 2x ln x2 +4x+1


and f  (x) = 2(ln x2 + 4). Then, function f does not satisfy (H4 ).
Recently, McDougall et al. in [9] studied the King-Werner-type method with
repeated initial points, namely:
Given x0 ∈ D,
let
y0 = x0
x1 = x0 − F  ( x0 +y0 −1
2 ) F(x0 ),
 xn−1 +yn−1 −1 (10.3)
yn = xn − F ( 2 ) F(xn ),
xn+1 = xn − F  ( xn +y n −1
2 ) F(xn )
for each n = 1, 2, ... and X = Y = R. Notice that the initial predictor step is just
a Newton step based on the estimated derivative. The re-use of the derivative
means that the evaluations of the yn values in (10.3) essentially come for free,
which then enables the more appropriate value of the derivative to be used√in the
corrector step in (10.3). Method (10.3) was also shown to be of order 1 + 2.
In the present chapter we study the convergence analysis of a more gen-
eralization of method (10.3) in a Banach space setting. The King-Werner-type
method is defined by: Given x0 , y0 ∈ D, let

xn = xn−1 − F  ( xn−1 +y
2
n−1 −1
) F(xn−1 ) f or each n = 1, 2, ...
 xn−1 +yn−1 −1 (10.4)
yn = xn − F ( 2 ) F(xn ) f or each n = 1, 2, ....

We will use only hypotheses up to the first Fréchet derivative of operator


F in order to prove the convergence of the method. Hence, the applicability is
extended.
√ Notice that the efficiency index of method (10.4) is given in [9] to
1 1
be ( 2 + 1) 2 ≈ 1.5538 which is larger than 2 2 ≈ 1.4142 of Newton’s method
1
and 3 3 ≈ 1.4422 of the cubic convergence methods such as Halley’s method [6],
Chebyshev’s method [2] and Potra-Ptak’s method [10].
The chapter is organized as follows: In Section 10.2 results on majorizing
sequences for King-Werner-type methods are presented. The semilocal and local
convergence results are provided in Section 10.3. Some numerical examples are
presented in Section 10.4.

10.2 Majorizing sequences for King-Werner-type


methods
We present some auxiliary results on majorizing sequence that will be used for
King-Werner-type methods (10.3) and (10.4).

Lemma 10.2.1
Let L0 > 0, L > 0, s ≥ 0, η > 0 be given parameters. Denote by α the only positive

King-Werner-type methods of order 1 + 2  179

root of polynomial p defined by


L0 3 L0 2
p(t) = t + t + Lt − L. (10.5)
2 2
Suppose that
L(s + η)
0< L0 L(s+η)
≤ α ≤ 1 − L0 η. (10.6)
2(1 − 2 (2 + 2−L0 s )η)

Then, scalar sequence {tn } generated by

t0 = 0,t1 = η, s0 = s
L(s +t −2t )(t −t )
sn = tn + n−1 Ln0 n−1 n n−1
2(1− 2 (tn−1 +sn−1 )) (10.7)
L(sn−1 +tn −2tn−1 )(tn −tn−1 )
tn+1 = tn + L f or each n = 1, 2...
2(1− 20 (tn +sn ))

is well defined, increasing, bounded above by


η
t  = (10.8)
1−α
and converges to its unique least upper bound t  which satisfies η ≤ t  ≤ t  . More-
over, the following estimates hold

0 < sn − tn ≤ α(tn − tn−1 ), 0 < tn+1 − tn ≤ α(tn − tn−1 ) (10.9)

and
sn < tn+1 (10.10)
for each n = 1, 2, ....

Proof: We have that p(0) = −L < 0 and p(1) = L0 > 0. Then if follows by the
intermediate value theorem that p has roots α ∈ (0, 1). On the other hand, by the
Descartes rule of signs α is the only positive root of polynomial p. Now from
(10.6) and the definition of sequence {tn } we have that

L(sk − tk + tk+1 − tk )
0< ≤α (10.11)
2(1 − L20 (tk + sk ))

and
L(sk − tk + tk+1 − tk )
0< ≤α (10.12)
2(1 − L20 (tk+1 + sk+1 ))
hold for k = 0. Since, (10.11) and (10.12) are satisfied for k = 0 we get from
(10.7) that
0 < sk − tk ≤ α(tk − tk−1 ) ≤ α k η (10.13)
0 < tk+1 − tk ≤ α(tk − tk−1 ) ≤ α k η (10.14)
180  Iterative methods and their dynamics with applications: a contemporary study

1 − α k+1
sk ≤ η < t  (10.15)
1−α
1 − α k+1
tk+1 ≤ η < t  (10.16)
1−α
and
sk < tk+1
also hold for k = 1. Notice that if we show (10.12) using induction then (10.11)
follows too for all k = 0, 1, 2, .... Evidently, (10.12) holds, if

L k L0 1 − α k+1 1 − α k+2
(α η + α k η) + α( + )η − α ≤ 0. (10.17)
2 2 1−α 1−α
Estimate (10.17) motivates us to introduce recurrent polynomials fk on [0, 1)
defined by
L0
fk (t) = Lt k−1 η + (1 + t + ... + t k + 1 + t + ... + t k+1 )η − 1 (10.18)
2
and show instead of (10.17) that

fk (α) ≤ 0 f or each k = 1, 2, .... (10.19)

We need a relationship between two consecutive polynomials fk . Using (10.18)


and some algebraic manipulation we get that

fk+1 (t) = fk (t) + p(t)t k−1 η, (10.20)

where polynomial p is defined by (10.5). In particular, we have by (10.20), the


definition of α and p that
fk+1 (α) = fk (α), (10.21)
since p(α) = 0.
Let us define function f∞ on [0, 1) by

f∞ (t) = lim fk (t). (10.22)


k→∞

Moreover, using (10.18) and (10.22) we get that


L0 η
f∞ (α) = − 1. (10.23)
1−α
Then, in view of (10.19), (10.21) and (10.23) we must show instead of (10.19)
that
f∞ (α) ≤ 0, (10.24)
which is true by (10.6). Induction is now complete. It follows that sequence {tk }
is increasing, bounded above by t  and as such that it converges to some t  . 2

King-Werner-type methods of order 1 + 2  181

Remark 10.2.2 Estimate (10.6) is the sufficient convergence condition for


sequence {tn }. It shows how small η and s should be chosen so that (10.6) is
satisfied.
Next, we present a result related to another majorizing, tighter than sequence
{tn }, sequence for King-Werner-type methods (10.3) and (10.4), but for which
the sufficient convergence condition is more complicated than (10.6).
Let us define the scalar sequence {rn } by

r0 = 0, r1 = η, q0 = s, q1 = r1 + L0 (q0 +r1L−2r 0 )(r1 −r0 )


, r2 = r1 + L0 (q0 +r1L−2r0 )(r1 −r0 )
,
02(1− 2 (r0 +q0 )) 2(1−0
2 (r1 +q1 ))
qn = rn + L(qn−1 +rLn0−2rn−1 )(rn −rn−1 )
2(1− 2 (rn−1 +qn−1 ))
rn+1 = rn + L(qn−1 +rn −2r
L
n−1 )(rn −rn−1 )
2(1− 20 (rn +qn ))
(10.25)
for each n = 2,3,....
Define polynomial p1 by

p1 (t) = L0t 2 + Lt − L.

Then, 
−L + L2 + 4L0 L
β=
2L0
is the only positive root of polynomial p1 . Moreover, define recurrent polynomi-
als gk on [0, 1) by

1 − t k+1 L0
gk (t) = L(r2 − r1 )t k−1 + L0 ( )(r2 − r1 ) + (2r1 + (r2 − r1 )t) − 1.
1−t 2
Then, we get
gk+1 (t) = gk (t) + p1 (t)t k−1 (r2 − r1 )
and
gk+1 (β ) = gk (β ) f or each k = 1, 2, ....
On the other hand, define function g∞ on [0, 1) by

g∞ (t) = lim gk (t).


k→∞

Then, we obtain
1
g∞ (β ) = (2L0 r1 + (r2 − r1 )β L0 − 2).
2
Finally, if one defines parameter β0 by

L(q1 + r2 − 2r1 )
β0 = .
2(1 − L20 (r2 + q2 ))
182  Iterative methods and their dynamics with applications: a contemporary study

It is worth noticing that r1 , q1 and r2 depend on the initial data. Then, using the
proof of Lemma 10.2.1, the above definitions and β instead of α we arrive at the
following result.
Lemma 10.2.3 Suppose that

2(1 − L0 r1 )
0 < β0 ≤ β ≤ . (10.26)
(r2 − r1 )L0

Then, scalar sequence {rn } generated by (10.25) is well defined, increasing,


bounded above by r given by
r2 − r1
r = η +
1−β
and converges to its unique least upper bound r which satisfies

r2 ≤ r ≤ r .

Moreover, the following estimates hold

0 < rn+2 − rn+1 ≤ β n (r2 − r1 )


0 < qn+1 − rn+1 ≤ β n (r2 − r1 )
n+1
rn+2 ≤ r1 + 1−β
1−β (r2 − r1 )
n+1
qn+2 ≤ r1 + 1−β
1−β (r2 − r1 ) + β (r2 − r1 )

and
rn+1 ≤ qn
for each n = 1, 2, ....
From a straight simple induction argument and the hypothesis L0 ≤ L we get
in turn that:
Lemma 10.2.4 Suppose that hypotheses (10.6), (2.22) are satisfied and L0 ≤
L. Then, the following estimates hold

q n ≤ sn ,
rn ≤ tn ,
qn − rn ≤ sn − tn ,
rn+1 − qn ≤ tn+1 − sn ,
rn+1 − rn ≤ tn+1 − tn

and
r ≤ t  .
Moreover strict inequality holds in all but the last inequality for each n = 2, 3, ...,
if L0 < L.

King-Werner-type methods of order 1 + 2  183

10.3 Convergence analysis of King-Werner-type


methods
In this section we provide the convergence results.
Theorem 10.3.1 Let F : D ⊂ X → Y be a Fréchet-differentiable operator. Sup-
pose that there exist x0 , y0 ∈ D, L0 > 0, L > 0, s ≥ 0, η > 0 with L0 ≤ L such that
for each x, y ∈ D
F  (x0 )−1 ∈ L(Y, X) (10.27)
F  (x0 )−1 F(x0 ) ≤ η (10.28)
x0 − y0  ≤ s (10.29)
 −1  
F (x0 ) (F (x) − F (y)) ≤ Lx − y (10.30)
F  (x0 )−1 (F  (x) − F  (x0 )) ≤ L0 x − x0  (10.31)

U(x0 ,t ) ⊆ D (10.32)
and hypotheses of Lemma 10.2.1 hold, where t  is given in Lemma 10.2.1. Then,
sequence {xn } generated by King-Werner-type (10.4) is well defined, remains in
U(x0 ,t  ) and converges to a unique solution x ∈ U(x0 ,t  ) of equation F(x) = 0.
Moreover, the following estimates hold for each n = 0, 1, 2, ...

xn − x  ≤ t  − tn , (10.33)

where, {tn } is given in Lemma 10.2.1. Furthermore, if there exists R > t  such
that
U(x0 , R) ⊆ D (10.34)
and
L0 (t  + R) ≤ 2, (10.35)

then, the point x is the only solution of equation F(x) = 0 in U(x0 , R).

Proof: We shall show using induction that the following assertions hold:
(Ik ) xk+1 − xk  ≤ tk+1 − tk
(IIk ) yk − xk  ≤ sk − tk
and
(IIIk ) F  ( xk +y k −1 
2 ) F (x0 ) ≤ L0
1
.
1− 2 ( xk −x0 + yk −x0 )
We have that (Ik ), (IIk ) hold for k = 0 by the initial conditions (10.27)-(10.29).
Using (10.31), (II0 ), (10.6) and (10.16) we get

F  (x0 )−1 (F  ( x0 +y  x0 +y0


2 ) − F (x0 )) ≤ L0  2 − x0  = 2 x0 − y0  (10.36)
0 L0

≤ L20 s ≤ L02η ≤ L20 t  < 1.

It follows from (10.36) and the Banach lemma on invertible operators [7] that
184  Iterative methods and their dynamics with applications: a contemporary study

F  ( x0 +y0 −1
2 ) ∈ L(Y, X) so that (III0 ) is satisfied. Suppose that (Ik ), (IIk ), (IIIk )
hold for k < n, we get that

n+1 
n+1
xn+1 − x0  ≤ xi − xi−1  ≤ (ti − ti−1 ) = tn+1 − t0 = tn+1 < t  ,
i=1 i=1

yn − x0  ≤ yn − xn  + xn − x0  ≤ sn − tn + tn − t0 = sn < t 


xn + yn 1 1
 − x0  ≤ [xn − x0  + yn − x0 ] < (t  + t  ) = t  ,
2 2 2
and
xn + θ (xn+1 − xn ) − x0  ≤ tn + θ (tn+1 − tn ) ≤ t  .
Therefore, xn+1 , yn , xn +y 
2 ∈ U(x0 ,t ). As in (10.36) we get that
n

F  (x0 )−1 (F  ( xn +y  xn +yn


2 ) − F (x0 )) ≤ L0  2 − x0 
n

≤ L20 (xn − x0  + yn − x0 ) (10.37)


≤ L20 (tn + sn ) ≤ L0t  < 1.

Therefore, F  ( xn +yn −1
2 ) ∈ L(Y, X) and (IIIn ) holds. Using (10.4) we get in turn
that
F(xn ) = F(xn ) − F(xn−1 ) − F  ( xn−1 +y n−1
)(xn − xn−1 )
1  2
= 0 [F (xn−1 + θ (xn − xn−1 )) − F  ( xn−1 +y
2
n−1
)](xn − xn−1 )dθ .
(10.38)
Then, by (10.7), (10.30), (IIIn ) and (10.38) we get
xn+1 − xn  ≤

F  ( xn +y n −1  
2 ) F (x0 )F (x0 )
−1 1 
0
[F (xn−1 + θ (xn − xn−1 ))
 xn−1 +yn−1
−F ( )]dθ xn − xn−1 
2 1
≤ L0
L
2xn−1 + 2θ (xn − xn−1 ) − xn−1 − yn−1 xn − xn−1 dθ
2(1− 2 (tn +sn )) 0
 1
≤ L0
L
0
(xn−1 − yn−1 ) + 2θ (xn − xn−1 )xn − xn−1 dθ
2(1− 2(tn +sn ))
≤ L
L
(sn−1 − tn−1 + tn − tn−1 )(tn − tn−1 ) = tn+1 − tn ,
2(1− 20 (tn +sn ))
(10.39)
which shows (In ). Moreover, from (10.4) we get that
yn − xn  ≤ F  ( xn−1 +y
2
n−1 −1 
) F (x0 )F  (x0 )−1 F(xn )
L(sn−1 −tn−1 +tn −tn−1 )(tn −tn−1 ) (10.40)
≤ L0 = sn − tn ,
2(1− 2 (tn−1 +sn−1 ))

which shows that (IIn ) holds. The induction is now complete. Thus, {xn } is
a complete sequence in a Banach space X and as such it converges to some
x ∈ U(x0 ,t  ) (since U(x0 ,t  ) is a closed set). In view of (10.39) we obtain the
estimate
L
F  (x0 )−1 F(xn ) ≤ ((sn−1 − tn−1 ) + (tn − tn−1 ))(tn − tn−1 ). (10.41)
2

King-Werner-type methods of order 1 + 2  185

By letting n → ∞ in (10.41) we obtain that F(x ) = 0. Estimate (10.33) is ob-


tained from (In ) by using standard majorization techniques [2,5,7,12]. The proof
of the uniqueness part can be founded in [6]. 2
Remark 10.3.2 It follows from the proof of (10.39) and (10.40) for n = 1
and x0 = y0 that (10.30) is not needed in the computation of the upper bound on
x2 − x1  and y1 − x1 . Then, under the hypotheses of Lemma 10.2.3 we obtain
(using (10.31) instead of (10.30)) that

x2 − x1  ≤ F  ( x1 +y 1 −1  
2 ) F (x0 )F (x0 )
−1 1 
0
[F (x0 + θ (x1 − x0 ))
−F  ( x0 +y 0
)]x − x dθ
2 1 0
1
≤ L0 x1 −x0 F  (x0 )−1 0 [F  (x0 + θ (x1 − x0 )) − F  (x0 )]dθ 
1− 2 ( x1 −x0 + y1 −x0 )
L0 (q0 −r0 +r1 −r0 )(r1 −r0 )
≤ L = r2 − r1
2(1− 20 (q1 +r1 ))

and similarly
y1 − x1  ≤ q1 − r1 ,
which justify the definition of q1 , r1 , r2 and consequently the definition of se-
quence {rn }.
Then, we present the following result.
Theorem 10.3.3 Let F : D ⊆ X → Y be a Fréchet-differentiable operator.
Suppose that there exist x0 , y0 ∈ D, L0 > 0, L > 0, η > 0 with L0 ≤ L such that
for each x, y ∈ D
F  (x0 )−1 ∈ L(Y, X)
F  (x0 )−1 F(x0 ) ≤ η
F  (x0 )−1 (F  (x) − F  (y)) ≤ Lx − y
F  (x0 )−1 (F  (x) − F  (x0 )) ≤ L0 x − x0 
U(x0 , r ) ⊆ D
and hypotheses of Lemma 10.2.3 hold with s = 0, where r is given in Lemma
10.2.3. Then, sequence {xn } generated by King-Werner-type method (10.3) with
y0 = x0 is well defined, remains in U(x0 , r ) and converges to a unique solution
x ∈ U(x0 , r ) of equation F(x) = 0. Moreover, the following estimates hold for
each n = 0, 1, 2, ...
xn − x  ≤ r − rn ,
where, {rn } is given in Lemma 10.2.3. Furthermore, if there exists R > r such
that
U(x0 , R) ⊆ D
and
L0 (r + R) ≤ 2,
then, the point x is the only solution of equation F(x) = 0 in U(x0 , R).
186  Iterative methods and their dynamics with applications: a contemporary study

Remark 10.3.4 (a) It is worth noticing that (10.31) is not an additional to


(10.30) hypothesis, since (10.30) always implies (10.31) but not necessarily vice
versa. We also have that
L0 ≤ L
holds in general and LL0 can be arbitrarily large [2–6]. In the literature (with the
exception of our works) (10.30) is only used for the computation of the upper
bounds of the inverses of the operators involved.
(b) The limit point t  (or r ) can replaced by t  (or r ) (which are given in
closed form) in the hypotheses of Theorem 10.3.1 (or Theorem 10.3.3).
Next, we present the local convergence analysis of King-Werner-type method
(10.4).
Theorem 10.3.5 Let F : D ⊆ X → Y be a Fréchet-differentiable operator.
Suppose that there exist x ∈ D, l0 > 0, l > 0 with l0 ≤ l such that for each
x, y ∈ D
F(x ) = 0, F  (x )−1 ∈ L(Y, X) (10.42)
F(x )−1 (F  (x) − F  (y)) ≤ lx − y (10.43)
 −1    
F(x ) (F (x) − F (x )) ≤ l0 x − x  (10.44)
and
U(x , ρ) ⊆ D, (10.45)
where
2
ρ= . (10.46)
3l + 2l0
Then, sequence {xn } generated by King-Werner-type method (10.4) is well de-
fined, remains in U(x , ρ) and converges to x , provided that x0 , y0 ∈ U(x , ρ).
Proof: Using the hypotheses that x0 , y0 ∈ U(x , ρ), (10.44) and (10.46) we have

F  (x )−1 (F  (x ) − F  (x0 )) ≤ l0 x − x0  < l0 ρ < 1. (10.47)

It follows from (10.47) and the Banach lemma on invertible operators that
F  (x0 )−1 ∈ L(Y, X) and
1
F  (x0 )−1 F  (x ) ≤ . (10.48)
1 − l0 x − x0 
Similarly, we have that
x0 + y0 1 1
 − x  ≤ (x − x0  + x − y0 ) < (ρ + ρ) = ρ,
2 2 2
x0 +y0
so, 2 ∈ U(x , ρ) and

x0 + y0 x + y0 l
F  (x )−1 (F  (x )−F  ( )) ≤ l0 x − 0  ≤ 0 (x −x0 +x −y0 ) < l0 ρ < 1.
2 2 2
(10.49)

King-Werner-type methods of order 1 + 2  187

Then, again we have that F  ( x0 +y0 −1


2 ) ∈ L(Y, X) and
x0 + y0 −1   1
F  ( ) F (x ) ≤ . (10.50)
2 1 − 2 (x − x0  + x − y0 )
l 0 

Using the second substep of (10.4) and (10.42), we get in turn that
x1 − x = x0 − x − F  ( x0 +y0 −1  x0 +y0 −1
2 ) F(x0 ) = F ( 2 )
[F  ( x0 +y 
2 )(x0 − x ) −F(x0 )]
0

 x0 +y0 −1   1 (10.51)
= −F ( 2 ) F (x ) 0 F  (x )−1 [F  (x + θ (x0 − x ))
+y
−F  ( 0 2 0 )](x0 − x )dθ .
x

Then, in view of (10.43), (10.46), (10.50) and (10.51) we obtain that


 1   −1  
x1 − x  ≤ F  ( x0 +y 0 −1   
2 ) F (x ) 0 F (x ) [F (x + θ (x0 − x ))
 0 0 +y
−F

( 2 )]x0 − x dθ
x
x +y
1
l 0 2 0 −x −θ (x0 −x ) x0 −x dθ
≤ 0
l0   −y )
 1 1−1 2 ( x −x0 + x

0

2 −θ | x0 −x + 2 y0 −x )dθ
1
l(|
< 0
1−l0 ρ x0 − x 
3
4 lρ 
< 1−l0 ρ x0 − x  < x0 − x  < ρ.
(10.52)
Therefore, x1 ∈ U(x , ρ).
Notice that x0 +y 
2 ∈ U(x , ρ), since
0

x0 + y0 1
 − x  ≤ (x0 − x  + y0 − x ) < ρ,
2 2
using the first substep of (10.4) and (10.42) we get that
y1 − x = x1 − x − F  ( x0 +y 0 −1
2 ) F(x1 )
+y 1
= −F  ( 0 2 0 )−1 F  (x ) 0 [F  (x )−1 (F  (x + θ (x1 − x ))
x
(10.53)
−F  ( x0 +y 
2 ))](x1 − x )dθ .
0

Next, using (10.43), (10.46), (10.50) and (10.53) we obtain in turn that
 1   −1  
y1 − x  ≤ F  ( x0 +y 0 −1  
2 ) F (x ) 0 F (x ) (F (x + θ (x1 − x ))

 x0 +y0 
−F ( 2 ))x1 − x dθ
  
≤ l0
l
 
( x −x
2
0
+ x −y
2
0
+ x −x
2
1
)x1 − x 
1− 2 ( x0 −x + y0 −x )
3lρ  
≤ 2(1−l0 ρ) x1 − x  ≤ x1 − x  < ρ,
(10.54)
 x1 +y1 
which shows y1 ∈ U(x , ρ). We also have that ∈ U(x , ρ). If we replace
2
the role of x0 , y0 by xk , yk in the preceding estimates we obtain as in (10.52) and
(10.54) that
xk+1 − x  < xk − x  < ρ (10.55)
188  Iterative methods and their dynamics with applications: a contemporary study

and
yk+1 − x  < xk+1 − x  < ρ, (10.56)

respectively. By letting k → ∞ in (10.55), we obtain limk→∞ xk = x . 2
Remark 10.3.6 Notice that the radius of convergence for Newton’s method
due to Traub [12] or Rheinboldt [11] is given by
2
ξ= . (10.57)
3l
The corresponding radius due to us in [2, 5, 6] is given by
2
ξ1 = . (10.58)
2l0 + l
Comparing (10.46), (10.57) and (10.58), we see that
ρ < ξ < ξ1 . (10.59)
Notice however that King-Werner-type method (10.4) is faster than Newton’s
method. Finally notice that
ρ ρ 1 ξ1 l0
→ 1, → and →3 as → 0. (10.60)
ξ ξ1 3 ξ l

10.4 Numerical examples


Example 10.4.1 Let X = Y = R, D = U(0, 1) and x = 0. Define mapping F on
D by
F(x) = ex − 1. (10.61)
Then, the Fréchet-derivatives of F are given by
F  (x) = ex , F  (x) = ex .
Notice that F(x ) = 0, F  (x ) = F  (x )−1 = 1, a = 1, L2,1 = e, l0 = e − 1 < l =
L1 = e and Γ = e. Then, using Theorem 10.3.5 and Remark 10.3.6 we obtain the
following radii
ξ = .245252961, ξ1 = .324947231 and ρ = .172541576.
2 2
Then, the system in (H5 ), since A = e
2 and B = e
24 becomes
e2 2 e2
v + w=1
24 2
e2
vw = w.
e 2 v2 +
2
Hence, we have ρ0 = .1350812188, ρ1 = .2691499885, b0 = .270670566 and
b = .1350812188. Notice that under our approach and under weaker hypotheses
b < ρ < ξ < ξ1 .

King-Werner-type methods of order 1 + 2  189

Next we present an example when X = Y = C[0, 1].


Example 10.4.2 Let X = Y = C[0, 1], the space of continuous functions defined
on [0, 1] equipped with the max norm and D = U(0, 1). Define operator F on D
by
 1
F(x)(t) = x(t) − tθ x3 (θ )dθ .
0
Then, we have
 1

F (x)(w)(t) = w(t) − 3 tθ x2 (θ )w(θ )dθ f or each w ∈ D.
0

Then, we have for x (t) = 0 (t ∈ [0, 1]) that l = L1 = 3 and l0 = 32 . Then, using
(10.46) we obtain that
1
ρ= .
6
Example 10.4.3 Let also X = Y = C[0, 1] equipped with the max norm and D =
U(0, r) for some r > 1. Define F on D by
 1
F(x)(s) = x(s) − y(s) − μ G(s,t)x3 (t)dt, x ∈ C[0, 1], s ∈ [0, 1].
0

y ∈ C[0, 1] is given, μ is a real parameter and the Kernel G is the Green’s function
defined by 
(1 − s)t if t ≤ s
G(s,t) =
s(1 − t) if s ≤ t.
Then, the Fréchet derivative of F is defined by
 1

(F (x)(w))(s) = w(s) − 3μ G(s,t)x2 (t)w(t)dt, w ∈ C[0, 1], s ∈ [0, 1].
0

Let us choose x0 (s) = y(s) = 1 and |μ| < 83 . Then, we have that

I − F  (x0 ) ≤ 38 μ, F  (x0 )−1 ∈ L(Y, X),


|μ|
F  (x0 )−1  ≤ 8−3|μ|
8
, η = 8−3|μ| , L0 = 3(1+r)|μ|
8−3|μ| ,

and
6r|μ|
L= .
8 − 3|μ|
Let us simple choose y0 (s) = 1, r = 3 and μ = 12 . Then, we have that

s = 0, η = .076923077, L0 = .923076923, L = 1.384615385


190  Iterative methods and their dynamics with applications: a contemporary study

and
L(s + η)
L(s+η)
= 0.057441746, α = 0.711345739,
2(1 − 2 (2 + 2−L0 s )η)
L0

1 − L0 η = 0.928994083.
That is, condition (10.6) is satisfied and Theorem 10.3.1 applies.
References

[1] S. Amat, S. Busquier and M. Negra, Adaptive approximation of nonlinear


operators, Numer. Funct. Anal. Optim., 25, 397–405, 2004.
[2] I. K. Argyros, Computational theory of iterative methods, Series: Studies
in Computational Mathematics 15, Editors, C. K. Chui and L. Wuytack,
Elservier Publ. Co. New York, U.S.A., 2007.
[3] I. K. Argyros, A semilocal convergence analysis for directional Newton
methods, Math. Comput., 80 (273), 327–343, 2011.
[4] I. K. Argyros and Á. A. Magreñán, On the convergence of an optimal
fourth-order family of methods and its dynamics, Appl. Math. Comput. 252,
336–346, 2015.
[5] I. K. Argyros, A. Cordero, Á. A. Magreñán and J. R. Torregrosa, On the
convergence of a higher order family of methods and its dynamics. J. Com-
put. Appl. Math., 80, 327–343, 2016.
[6] I. K. Argyros and S. Hilout, Estimating upper bounds on the limit points of
majorizing sequences for Newton’s method, Numer. Algorithms, 62, 115–
132, 2013.
[7] I. K. Argyros and S. Hilout, Computational methods in nonlinear analysis.
Efficient algorithms, fixed point theory and applications, World Scientific,
2013.
[8] I. K. Argyros and H. M. Ren, Ball convergence theorems for Halley’s
method in Banach spaces. J. Appl. Math. Comput., 38, 453–465, 2012.
[9] A. Cordero, L. Feng, Á. A. Magreñán and J. R. Torregrosa, A new fourth-
order family for solving nonlinear problems and its dynamics, J. Math.
Chemistry, 53 (3), 893–910, 2015.
191
192  References

[10] Y. H. Geum, Y. I., Kim and Á. A., Magreñán, A biparametric extension
of King’s fourth-order methods and their dynamics, Appl. Math. Comput.,
282, 254–275, 2016.
[11] L. V. Kantorovich and G. P. Akilov, Functional Analysis, Pergamon Press,
Oxford, 1982.
[12] R. F. King, Tangent methods for nonlinear equations, Numer. Math., 18,
298–304, 1972.
[13] Á. A. Magreñán, Different anomalies in a Jarratt family of iterative root-
finding methods, Appl. Math. Comput., 233, 29–38, 2014.
[14] Á. A. Magreñán, A new tool to study real dynamics: The convergence
plane, Appl. Math. Comput., 248, 215–224, 2014.
[15] T. J. McDougall and S. J. Wotherspoon, A simple
√ modification of Newton’s
method to achieve convergence of order 1 + 2, Appl. Math. Let., 29, 20–
25, 2014.
[16] F. A. Potra and V. Ptak, Nondiscrete induction and iterative processes [J].
Research Notes in Mathematics, 103, Pitman, Boston, 5, 112–119, 1984.
[17] H. M. Ren, Q. B. Wu and W. H. Bi, On convergence of a new secant like
method for solving nonlinear equations, Appl. Math. Comput., 217, 583–
589, 2010.
[18] W. C. Rheinboldt, An adaptive continuation process for solving systems
of nonlinear equations, Polish Academy of Science, Banach Ctr. Publ., 3,
129–142, 1977.
[19] J. F. Traub, Iterative Methods for the Solution of Equations, Englewood
Cliffs, Prentice Hull, 1984.

[20] W. Werner, Uber ein Verfahren der Ordnung 1 + 2 zur Nullstellenbestim-
mung, Numer. Math., 32, 333–342, 1979.

[21] W. Werner, Some supplementary results on the 1 + 2 order method for
the solution of nonlinear equations, Numer. Math., 38, 383–392, 1982.
Chapter 11

Generalized equations
and Newton’s method

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
11.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
11.3 Semilocal convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

11.1 Introduction
In [18], G. S. Silva considered the problem of approximating the solution of the
generalized equation
F(x) + Q(x)  0, (11.1)
where F : D −→ H is a Fréchet differentiable function, H is a Hilbert space
with inner product ., . and corresponding norm ., D ⊆ H an open set and
T : H ⇒ H is set-valued and maximal monotone. It is well known that the sys-
tem of nonlinear equations and abstract inequality system can be modelled as
equation of the form (11.1) [17]. If ψ : H −→ (−∞, +∞] is a proper lower
193
194  Iterative methods and their dynamics with applications: a contemporary study

semicontinuous convex function and

Q(x) = ∂ ψ(x) = {u ∈ H : ψ(y) ≥ ψ(x) + u, y − x}, for all y ∈ H (11.2)

then (11.1) becomes the variational inequality problem

F(x) + ∂ ψ(x)  0,

including linear and nonlinear complementary problems. Newton’s method for


solving (11.1) for an initial guess x0 is defined by

F(xk ) + F  (xk )(xk+1 − xk ) + Q(xk+1 )  0, k = 0, 1, 2 . . . (11.3)

has been studied by several authors [1]– [24].


In [13], Kantorovich obtained a convergence result for Newton’s method for
solving the equation F(x) = 0 under some assumptions on the derivative F  (x0 )
and F  (x0 )−1 F(x0 ). Kantorovich, used the majorization principle to prove his
results. Later in [16], Robinson considered generalization of the Kantorovich
theorem of the type F(x) ∈ K, where K is a nonempty closed and convex cone,
and obtained convergence results and error bounds for this method. Josephy [12],
considered a semilocal Newton’s method of the kind (11.3) in order to solving
(11.1) with F = NC the normal cone mapping of a convex set C ⊂ R2 .
In this chapter, we expand the applicability of Newton’s method for solving
(11.1) using the same functions as in [18].
The rest of this chapter is organized as follows. In Section 11.2, we present
some preliminaries and in Section 11.3 we present the main results are presented.

11.2 Preliminaries
Let U(x, ρ) and Ū(x, ρ) stand respectively for open and closed balls in H with
center x ∈ H and radius ρ > 0.
The following items are stated briefly here in order to make the chapter as
selfcontains as possible. More details can be found in [18].
Definition 11.2.1 Let D ⊆ H be an open nonempty subset of H, h : D −→ H be
a Fréchet differentiable function with Fréchet derivative h and Q : H ⇒ H be
a set mapping. The partial linearization of the mapping h + Q at x ∈ H is the
set-valued mapping Lh (x, .) : H ⇒ H given by

Lh (x, y) := h(x) + h (x)(y − x) + Q(y). (11.4)

For each x ∈ H, the inverse Lh (x, .)−1 : H ⇒ H of the mapping Lh (x, .) at z ∈ H


is defined by

Lh (x, .)−1 := {y ∈ H : z ∈ h(x) + h (x)(y − x) + Q(y)}. (11.5)


Generalized equations and Newton’s method  195

Definition 11.2.2 Let Q : H ⇒ H be a set-valued operator. Q is said to be mono-


tone if for any x, y ∈ domQ and u ∈ Q(y), v ∈ Q(x) implies that the following
inequality holds:
u − v, y − x ≥ 0.
A subset of H × H is monotone if it is the graph of a monotone operator. If
ϕ : H −→ (−∞, +∞] is a proper function then the subgradient of ϕ is monotone.

Definition 11.2.3 Let Q : H ⇒ H be monotone. Then Q is maximal monotone if


the following implication holds for all x, u ∈ H :

u − v, y − x ≥ 0 for each y ∈ domQ and v ∈ Q(y) ⇒ x ∈ domQ and v ∈ Q(y).


(11.6)

Lemma 11.2.4 ( [22]) Let G be a positive operator (i.e.,G(x), x ≥ 0). The fol-
lowing statements about G hold:
 G2  = G2 ;
 If G−1 exists, then G−1 is a positive operator.

Lemma 11.2.5 ( [21, Lemma 2.2]) Let G be a positive operator. Suppose G−1
exists, then for each x ∈ H we have

x2
G(x), x ≥ .
G−1 

11.3 Semilocal convergence


In this section we use some more flexible scalar majorant functions than in [18]
in order to obtain our main convergence result.

Theorem 11.3.1 Let F : D ⊆ H −→ H be continuous with Fréchet derivative F 


continuous on D. Let also Q : H ⇒ H be a set-valued operator. Suppose that there
exists x0 ∈ D such that F  (x0 ) is a positive operator and F̂  (x0 )−1 exists. Let R > 0
and ρ := sup{t ∈ [0, R) : U(x0 ,t) ⊆ D}. Suppose that there exists f0 : [0, R) −→ R
twice continuously differentiable such that for each x ∈ U(x0 , ρ)

F̂  (x0 )−1 F  (x) − F  (x0 ) ≤ f0 (x − x0 ) − f0 (0). (11.7)

Moreover, suppose that


(h01 ) f0 (0) > 0 and f0 (0) = −1.
(h02 ) f0 is convex and strictly increasing.
(h03 ) f0 (t) = 0 for some t ∈ (0, R).
196  Iterative methods and their dynamics with applications: a contemporary study

Then, sequence {tn0 } generated by t00 = 0,

f0 (tn0 )
0
tn+1 = tn0 − , n = 0, 1, 2, . . .
f0 (tn0 )

is strictly increasing, remains in (0,t0∗ ) and converges to t0∗ , where t0∗ is the small-
est zero of function f0 in (0, R). Furthermore, suppose that for each x, y ∈ D1 :=
Ū(x0 , ρ) ∩U(x0 ,t0∗ ) there exists f1 : [0, ρ1 ) −→ R, ρ1 = min{ρ,t0∗ } such that

F̂  (x0 )−1 F  (y) − F  (x) ≤ f1 (x − y + x − x0 ) − f1 (x − x0 )
(11.8)
and x1 − x0  ≤ f1 (0)

Moreover, suppose that


(h11 ) f1 (0) > 0 and f1 (0) = −1.
(h12 ) f1 is convex and strictly increasing.
(h13 ) f1 (t) = 0 for some t ∈ (0, ρ1 ).
(h14 ) f0 (t) ≤ f1 (t) and f0 (t) ≤ f1 (t) for each t ∈ [0, ρ1 ).
Then, f1 has a smallest zero t1∗ ∈ (0, ρ1 ), the sequences generated by generalized
Newton’s method for solving the generalized equation F(x) + Q(x)  0 and the
scalar equation f1 (0) = 0, with initial point x0 and t01 (or s0 = 0), respectively,

0 ∈ F(xn ) + F  (xn )(xn+1 − xn ) + Q(xn+1 ),

f1 (tn ) f1 (sn )
1
tn+1 = tn1 −  (or sn+1 = sn −  )
f1 (tn ) f0 (sn )
are well defined, {tn1 } (or sn ) is strictly increasing, remians in (0,t1∗ ) and
converges to t1∗ . Moreover, sequence {xn } generated by generalized Newton’s
method (11.3) is well defined, remains in U(x0 ,t1∗ ) and converges to a point x∗ ∈
Ū(x0 ,t1∗ ), which is the unique solution of generalized equation F(x) + Q(x)  0
in Ū(x0 ,t1∗ ). Furthermore, the following estimates hold:

xn − x∗  ≤ t1∗ − tn1 , xn − x∗  ≤ t1∗ − sn ,

t1∗ − tn+1
1
xn+1 − x∗  ≤ xn − x∗ 2
(t1∗ − tn1 )2
t1∗ − s1n+1
xn+1 − x∗  ≤ xn − x∗ 2
(t1∗ − s1n )2
sn+1 − sn ≤ tn+1
1
− tn1 ,
Generalized equations and Newton’s method  197

and sequences {tn1 }, {sn } and {xn } converge Q−linearly as follows:


1
xn+1 − x∗  ≤ xn − x∗ ,
2
1
t1∗ − tn+1
1
≤ (t1∗ − tn ),
2
and
1
t1∗ − s1n+1 ≤ (t1∗ − sn ).
2
Finally, if
(h15 ) f1 (t1∗ ) < 0,
then the sequences {tn1 }, {sn } and {xn } converge Q−quadratically as follows:

D− f1 (t1∗ )
xn+1 − x∗  ≤ xn − x∗ 2 ,
−2 f1 (t1∗ )

D− f1 (t1∗ )
xn+1 − x∗  ≤ xn − x∗ 2 ,
−2 f0 (t1∗ )
D− f1 (t1∗ ) ∗
t1∗ − tn+1 ≤ (t − tn )2 ,
−2 f1 (t1∗ ) 1
and
D− f1 (t1∗ ) ∗
t1∗ − sn+1 ≤ (t − sn )2 ,
−2 f0 (t1∗ ) 1
where D− stands for the left directional derivative of function f1 .

Remark 11.3.2 (a) Suppose that there exists f : [0, R) −→ R twice continu-
ously differentiable such that for each x, y ∈ U(x0 , ρ)

F̂  (x0 )−1 F  (y) − F  (x) ≤ f  (x − y + x − x0 ) − f  (x − x0 ).


(11.9)
If f1 (t) = f0 (t) = f (t) for each t ∈ [0, R), then Theorem 11.3.1 specializes
to Theorem 4 in [18]. On the other hand, if

f0 (t) ≤ f1 (t) ≤ f (t) for each t ∈ [0, ρ1 ), (11.10)

then, our Theorem is an improvement of Theorem 4 under the same com-


putational cost, since in practice the computation of function f requires
the computation of functions f0 or f1 as special cases. Moreover, we have
that for each t ∈ [0, ρ1 )
f0 (t) ≤ f (t) (11.11)
and
f1 (t) ≤ f (t) (11.12)
198  Iterative methods and their dynamics with applications: a contemporary study

leading to tₙ¹ ≤ tₙ, sₙ ≤ tₙ,

tₙ₊₁¹ − tₙ¹ ≤ tₙ₊₁ − tₙ    (11.13)

sₙ₊₁ − sₙ ≤ tₙ₊₁ − tₙ    (11.14)

and

t₁* ≤ t*,    (11.15)

where {tₙ} is defined by

t₀ = 0,  tₙ₊₁ = tₙ − f(tₙ)/f′(tₙ),

t* = limₙ→∞ tₙ and t* is the smallest zero of function f in (0, R) (provided that the "h" conditions hold for function f replacing f₁ and f₀), and provided that

−f₁(t)/f₁′(t) ≤ −f(s)/f′(s)    (11.16)

or

−f₁(t)/f₀′(t) ≤ −f(s)/f′(s),    (11.17)

hold, respectively, for each t ≤ s. Estimates (11.13) and (11.14) are strict if (11.16) and (11.17) hold as strict inequalities.
(b) We have improved the error bounds and the location of the solution x*, but not necessarily the convergence domain of the generalized Newton's method (11.3). We can also show that the convergence domain can be improved in some interesting special cases. Let Q ≡ {0},

f(t) = (L/2)t² − t + η,

f₀(t) = (L₀/2)t² − t + η

and

f₁(t) = (L₁/2)t² − t + η,

where ‖x₁ − x₀‖ ≤ η and L, L₀ and L₁ are Lipschitz constants satisfying

‖F̂′(x₀)⁻¹[F′(y) − F′(x)]‖ ≤ L‖y − x‖,

‖F̂′(x₀)⁻¹[F′(x) − F′(x₀)]‖ ≤ L₀‖x − x₀‖

and

‖F̂′(x₀)⁻¹[F′(y) − F′(x)]‖ ≤ L₁‖y − x‖,

on the corresponding balls. Then, we have that

L₀ ≤ L

and

L₁ ≤ L.

The corresponding majorizing sequences are

t₀ = 0,  tₙ₊₁ = tₙ − f(tₙ)/f′(tₙ) = tₙ + L(tₙ − tₙ₋₁)²/(2(1 − Ltₙ)),

t₀¹ = 0,  tₙ₊₁¹ = tₙ¹ − f₁(tₙ¹)/f₁′(tₙ¹) = tₙ¹ + L₁(tₙ¹ − tₙ₋₁¹)²/(2(1 − L₁tₙ¹)),

s₀ = 0,  sₙ₊₁ = sₙ − [f₁(sₙ) − f₁(sₙ₋₁) − f₁′(sₙ₋₁)(sₙ − sₙ₋₁)]/f₀′(sₙ) = sₙ + L₁(sₙ − sₙ₋₁)²/(2(1 − L₀sₙ)).

Then, the sequences converge provided, respectively, that

h = Lη ≤ 1/2    (11.18)

and, for the last two,

h₁ = L₁η ≤ 1/2,

so

h ≤ 1/2 =⇒ h₁ ≤ 1/2.

It turns out from the proof of Theorem 11.3.1 that the sequence {rₙ} defined by [6]

r₀ = 0,  r₁ = η,  r₂ = r₁ + L₀(r₁ − r₀)²/(2(1 − L₀r₁)),

rₙ₊₂ = rₙ₊₁ + L₁(rₙ₊₁ − rₙ)²/(2(1 − L₀rₙ₊₁))

is also a tighter majorizing sequence than the preceding ones for {xₙ}. The sufficient convergence condition for {rₙ} is given by [6]:

h₂ = Kη ≤ 1/2,

where

K = (1/8)(4L₀ + √(L₀L₁ + 8L₀²) + √(L₀L₁)).

Then, we have that

h₁ ≤ 1/2 =⇒ h₂ ≤ 1/2.

Therefore, the old results in [18] have also been improved. Similar improvements follow for Smale's alpha theory [2, 6] or Wang's theory [18, 22, 24]. Examples where L₀ < L or L₁ < L or L₀ < L₁ can be found in [6]. Notice that (11.18) is the celebrated Newton-Kantorovich hypothesis for solving nonlinear equations using Newton's method [13], employed as a sufficient convergence condition in all earlier studies other than ours.
(c) The introduction of (11.8) depends on (11.7) (i.e., f₁ depends on f₀). Such an introduction was not possible before (i.e., when f was used instead of f₁).

Now we can prove Theorem 11.3.1.

Proof of Theorem 11.3.1 It is clear that the iterates {xₙ} lie in D₁, which is a more precise location than the set Ū(x₀, ρ) used in [18], since D₁ ⊆ Ū(x₀, ρ). Then, the definition of function f₁ becomes possible and replaces f in the proof of [18], whereas for the computation of the upper bounds on ‖F̄′(x)⁻¹‖ we use the more precise function f₀ instead of f, as is shown in the next perturbation Banach lemma [13]. □
Lemma 11.3.3 Let x₀ ∈ D be such that F̄′(x₀) is a positive operator and F̄′(x₀)⁻¹ exists. If ‖x − x₀‖ ≤ t < t₀*, then F̄′(x) is a positive operator and F̄′(x)⁻¹ exists. Furthermore,

‖F̄′(x)⁻¹‖ ≤ ‖F̄′(x₀)⁻¹‖ / |f₀′(t)|.    (11.19)
Proof: It is easy to see that

‖F̄′(x) − F̄′(x₀)‖ ≤ (1/2)‖F′(x) − F′(x₀)‖ + (1/2)‖(F′(x) − F′(x₀))*‖ = ‖F′(x) − F′(x₀)‖.    (11.20)

Let x ∈ Ū(x₀, t), 0 ≤ t < t₀*. Thus f₀′(t) < 0. Using (h01) and (h02), we get in turn that

‖F̄′(x₀)⁻¹[F̄′(x) − F̄′(x₀)]‖ ≤ ‖F̄′(x₀)⁻¹[F′(x) − F′(x₀)]‖
≤ f₀′(‖x − x₀‖) − f₀′(0)
≤ f₀′(t) + 1 < 1.    (11.21)

Now, using the Banach lemma on invertible operators, we have that F̄′(x)⁻¹ exists. Moreover, using the above inequality,

‖F̄′(x)⁻¹‖ ≤ ‖F̄′(x₀)⁻¹‖ / (1 − ‖F̄′(x₀)⁻¹[F̄′(x) − F̄′(x₀)]‖) ≤ ‖F̄′(x₀)⁻¹‖ / (1 − (f₀′(t) + 1)) = ‖F̄′(x₀)⁻¹‖ / |f₀′(t)|.

Now, using (11.21) we obtain in turn that

‖F̄′(x) − F̄′(x₀)‖ ≤ 1 / ‖F̄′(x₀)⁻¹‖.    (11.22)

Consequently, we get

⟨(F̄′(x₀) − F̄′(x))y, y⟩ ≤ ‖F̄′(x₀) − F̄′(x)‖‖y‖² ≤ ‖y‖² / ‖F̄′(x₀)⁻¹‖,

which implies

⟨F̄′(x₀)y, y⟩ − ‖y‖² / ‖F̄′(x₀)⁻¹‖ ≤ ⟨F̄′(x)y, y⟩.    (11.23)

Now, since F̄′(x₀) is a positive operator and F̄′(x₀)⁻¹ exists by assumption, we get

⟨F̄′(x₀)y, y⟩ ≥ ‖y‖² / ‖F̄′(x₀)⁻¹‖.    (11.24)

The result now follows from (11.23) and (11.24). □
Remark 11.3.4 This result improves the corresponding one in [18, Lemma 8] (which uses function f instead of f₀ or f₁), leading to more precise estimates on the distances ‖xₙ₊₁ − x*‖, which, together with the idea of restricted convergence domains, lead to the aforementioned advantages stated in Remark 11.3.2.
References

[1] I. K. Argyros, Concerning the convergence of Newton’s method and


quadratic majorants, J. Appl. Math. Computing, 29, 391–400, 2009.
[2] I. K. Argyros, A Kantorovich-type convergence analysis of the Newton-
Josephy method for solving variational inequalities, Numer. Algorithms, 55,
447–466, 2010.
[3] I. K. Argyros, Variational inequalities problems and fixed point problems,
Computers Math. Appl., 60, 2292–2301, 2010.
[4] I. K. Argyros, Improved local convergence of Newton’s method under weak
majorant condition, J. Comput. Appl. Math., 236, 1892–1902, 2012.
[5] I. K. Argyros, Improved local convergence analysis of inexact Gauss-Newton like methods under the majorant condition, J. Franklin Inst., 350 (6), 1531–1544, 2013.
[6] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method, J. Complexity, 28, 364–387, 2012.
[7] A. I. Dontchev and R. T. Rockafellar, Implicit functions and solution map-
pings, Springer Monographs in Mathematics, Springer, Dordrecht (2009).
[8] O. Ferreira, A robust semi-local convergence analysis of Newton's method for cone inclusion problems in Banach spaces under affine invariant majorant condition, J. Comput. Appl. Math., 279, 318–335, 2015.
[9] O. P. Ferreira, M. L. N. Goncalves and P. R. Oliveira, Convergence of the Gauss-Newton method for convex composite optimization under a majorant condition, SIAM J. Optim., 23 (3), 1757–1783, 2013.
[10] O. P. Ferreira and G. N. Silva, Inexact Newton’s method to nonlinear func-
tions with values in a cone, arXiv: 1510.01947 (2015).

[11] O. P. Ferreira and B. F. Svaiter, Kantorovich’s majorants principle for New-


ton’s method, Comput. Optim. Appl., 42 (2), 213–229, 2009.
[12] N. Josephy, Newton’s method for generalized equations and the PIES en-
ergy model, University of Wisconsin-Madison, 1979.
[13] L. V. Kantorovic̆, On Newton’s method for functional equations, Doklady
Akad Nauk SSSR (N.S.), 59, 1237–1240, 1948.
[14] A. Pietrus and C. Jean-Alexis, Newton-secant method for functions with
values in a cone, Serdica Math. J., 39 (3-4), 271–286, 2013.
[15] F. A. Potra, The Kantorovich theorem and interior point methods, Mathe-
matical Programming, 102 (1), 47–70, 2005.
[16] S. M. Robinson, Strongly regular generalized equations, Math. Oper. Res.,
5 (1), 43–62, 1980.
[17] R. T. Rockafellar, Convex analysis, Princeton Mathematical Series, No. 28, Princeton University Press, Princeton, N. J., 1970.
[18] G. N. Silva, On the Kantorovich’s theorem for Newton’s method for solving
generalized equations under the majorant condition, Appl. Math. Comput.,
286, 178–188, 2016.
[19] S. Smale, Newton's method estimates from data at one point. In R. Ewing, K. Gross and C. Martin, editors, The Merging of Disciplines: New Directions in Pure, Applied and Computational Mathematics, pages 185–196, Springer, New York, 1986.
[20] J. F. Traub and H. Woźniakowski, Convergence and complexity of Newton
iteration for operator equations, J. Assoc. Comput. Mach., 26 (2), 250–258,
1979.
[21] L. U. Uko and I. K. Argyros, Generalized equations, variational inequalities and a weak Kantorovich theorem, Numer. Algorithms, 52 (3), 321–333, 2009.
[22] J. Wang, Convergence ball of Newton’s method for generalized equation
and uniqueness of the solution, J. Nonlinear Convex Anal., 16 (9), 1847–
1859, 2015.
[23] P. P. Zabrejko and D. F. Nguen, The majorant method in the theory of Newton-Kantorovich approximations and the Pták error estimates, Numer. Funct. Anal. Optim., 9 (5-6), 671–684, 1987.
[24] Y. Zhang, J. Wang and S. M. Gau, Convergence criteria of the generalized
Newton method and uniqueness of solution for generalized equations, J.
Nonlinear Convex. Anal., 16 (7), 1485–1499, 2015.
Chapter 12

Newton’s method for


generalized equations
using restricted domains

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
12.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
12.3 Local convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
12.4 Special cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

12.1 Introduction
In this chapter we are concerned with the study of the generalized equation

F(x) + Q(x) ∋ 0,    (12.1)


where F : D −→ H is a nonlinear Fréchet differentiable operator defined on an open subset D of the Hilbert space H, and Q : H ⇒ H is set-valued and maximal monotone. Many problems from the Applied Sciences can be solved by finding the solutions of equations in a form like (12.1) [17–22, 27, 28, 30, 39]. If ψ : H −→ (−∞, +∞] is a proper lower semicontinuous convex function and

Q(x) = ∂ψ(x) = {u ∈ H : ψ(y) ≥ ψ(x) + ⟨u, y − x⟩ for all y ∈ H},

then (12.1) becomes the variational inequality problem

F(x) + ∂ψ(x) ∋ 0,

including linear and nonlinear complementarity problems; additional comments about such problems can be found in [1–32, 36–40].
In the present chapter, we consider the well-known Newton's method defined as

F(xₖ) + F′(xₖ)(xₖ₊₁ − xₖ) + Q(xₖ₊₁) ∋ 0 for each k = 0, 1, 2, . . .    (12.2)

for solving (12.1) in an approximate way. We will use the idea of restricted convergence domains to present a convergence analysis of (12.2). In our analysis we relax the Lipschitz-type continuity of the derivative of the operator involved. Our main goal in this chapter is to find a larger convergence domain for method (12.2). By means of restricted convergence domains, we obtain a finer convergence analysis with the advantages (A):
 tighter error estimates on the distances involved and
 the information on the location of the solution is at least as precise.
These advantages were obtained (under the same computational cost) using
the same or weaker hypotheses as in [36].
The rest of the chapter is organized as follows. In Section 12.2 we present
some preliminaries, then in Section 12.3 we present the local convergence study
of method (12.2). Finally, in Section 12.4 we provide the special cases and nu-
merical examples.

12.2 Preliminaries
In order to make the chapter as self-contained as possible, we reintroduce some standard notation and auxiliary results on the monotonicity of set-valued operators presented in [18, 22, 27].
Throughout this chapter we will denote by U(w, ρ) and Ū(w, ρ) the open and closed balls in H, respectively, with center w ∈ H and radius ρ > 0.

Definition 12.2.1 Let Q : H ⇒ H be a set-valued operator. Q is said to be monotone if for any x, y ∈ dom Q, u ∈ Q(y) and v ∈ Q(x) imply that the following inequality holds:

⟨u − v, y − x⟩ ≥ 0.

A subset of H × H is monotone if it is the graph of a monotone operator. If ϕ : H −→ (−∞, +∞] is a proper function, then the subgradient of ϕ is monotone.

Definition 12.2.2 Let Q : H ⇒ H be monotone. Then Q is maximal monotone if the following holds for all x, u ∈ H:

⟨u − v, y − x⟩ ≥ 0 for each y ∈ dom Q and v ∈ Q(y) =⇒ x ∈ dom Q and u ∈ Q(x).

We will be using the following results for proving our results.


Lemma 12.2.3 Let G be a positive operator (i.e., ⟨G(x), x⟩ ≥ 0). The following statements about G hold:
• ‖G²‖ = ‖G‖²;
• If G⁻¹ exists, then G⁻¹ is a positive operator.

Lemma 12.2.4 Let G be a positive operator. Suppose that G⁻¹ exists; then for each x ∈ H we have

⟨G(x), x⟩ ≥ ‖x‖² / ‖G⁻¹‖.

Lemma 12.2.5 Let B : H −→ H be a bounded linear operator and I : H −→ H the identity operator. If ‖B − I‖ < 1, then B is invertible and ‖B⁻¹‖ ≤ 1/(1 − ‖B − I‖).

Let G : H −→ H be a bounded linear operator. Then Ĝ := (1/2)(G + G*), where G* is the adjoint of G. Hereafter, we assume that Q : H ⇒ H is a set-valued maximal monotone operator and F : H −→ H is a Fréchet differentiable operator.

12.3 Local convergence


In this section, we present our main convergence result for Newton's method (12.2).

Theorem 12.3.1 Let F : D ⊂ H −→ H be a nonlinear operator with a continuous Fréchet derivative F′, where D is an open subset of H. Let Q : H ⇒ H be a set-valued operator and x* ∈ D. Suppose that 0 ∈ F(x*) + Q(x*), F̂′(x*) is a positive operator and F̂′(x*)⁻¹ exists. Let R > 0 and suppose that there exist f₀, f : [0, R) −→ R twice continuously differentiable such that

(h0)

‖F̂′(x*)⁻¹[F′(x) − F′(x*)]‖ ≤ f₀′(‖x − x*‖) − f₀′(0)    (12.3)

for each x ∈ D, and

‖F̂′(x*)⁻¹[F′(x) − F′(x* + θ(x − x*))]‖ ≤ f′(‖x − x*‖) − f′(θ‖x − x*‖)    (12.4)

for each x ∈ D₀ = D ∩ U(x*, R), θ ∈ [0, 1].

(h1) f(0) = f₀(0) = 0 and f′(0) = f₀′(0) = −1, f₀(t) ≤ f(t), f₀′(t) ≤ f′(t) for each t ∈ [0, R).

(h2) f₀′, f′ are convex and strictly increasing.

Let ν := sup{t ∈ [0, R) : f′(t) < 0} and r := sup{t ∈ (0, ν) : |f(t)/(t f′(t)) − 1| < 1}.
Then, the sequences with starting point x₀ ∈ U(x*, r)\{x*} and t₀ = ‖x* − x₀‖, respectively,

0 ∈ F(xₖ) + F′(xₖ)(xₖ₊₁ − xₖ) + Q(xₖ₊₁),  tₖ₊₁ = |tₖ − f(tₖ)/f′(tₖ)|,  k = 0, 1, . . . ,    (12.5)

are well defined, {tₖ} is strictly decreasing, is contained in (0, r) and converges to 0, and {xₖ} is contained in U(x*, r) and converges to the point x*, which is the unique solution of the generalized equation F(x) + Q(x) ∋ 0 in U(x*, σ̄), where σ̄ = min{r, σ} and σ := sup{0 < t < R : f(t) < 0}. Moreover, the sequence {tₖ₊₁/tₖ²} is strictly decreasing and

‖x* − xₖ₊₁‖ ≤ [tₖ₊₁/tₖ²] ‖xₖ − x*‖²,  tₖ₊₁/tₖ² ≤ f″(t₀)/(2|f′(t₀)|),  k = 0, 1, . . . .    (12.6)

If, additionally, [ρf′(ρ) − f(ρ)]/(ρf′(ρ)) = 1 and ρ < R, then r = ρ is the optimal convergence radius. Furthermore, for t ∈ (0, r) and x ∈ Ū(x*, t),

‖xₖ₊₁ − x*‖ ≤ e_f(‖xₖ − x*‖, 0)/|f₀′(‖xₖ − x*‖)| := αₖ ≤ e_f(‖xₖ − x*‖, 0)/|f′(‖xₖ − x*‖)| ≤ [|η_f(t)|/t²] ‖xₖ − x*‖²,    (12.7)

where

e_f(s, t) := f(t) − (f(s) + f′(s)(t − s)) for each s, t ∈ [0, R)

and

η_f(t) := t − f(t)/f′(t) for each t ∈ [0, ν).

Finally, by the second inequality in (12.7), there exists r* ≥ r such that limₖ→∞ xₖ = x* if x₀ ∈ U(x*, r*) − {x*}.
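For intuition, the scalar sequence tₖ₊₁ = |tₖ − f(tₖ)/f′(tₖ)| of (12.5) can be simulated directly. A minimal Python sketch, assuming the quadratic majorant f(t) = (L/2)t² − t of Section 12.4 with a sample constant L and any t₀ ∈ (0, r), r = 2/(3L), illustrates the strict decrease to 0 and the bound on tₖ₊₁/tₖ² from (12.6):

```python
L = 2.0                      # sample constant (assumed)
r = 2 / (3 * L)              # convergence radius for f(t) = (L/2)t**2 - t
t = 0.9 * r                  # t0 = ||x* - x0|| must lie in (0, r)

f = lambda u: 0.5 * L * u**2 - u
df = lambda u: L * u - 1     # f'(u) < 0 on (0, 1/L)

bound = L / (2 * abs(df(t)))          # f''(t0) / (2 |f'(t0)|) in (12.6)
for k in range(6):
    t_next = abs(t - f(t) / df(t))    # majorizing iteration of (12.5)
    print(f"k={k}: t={t:.3e}, t_next/t^2={t_next / t**2:.6f} <= {bound:.6f}")
    t = t_next
```

The printed ratios decrease monotonically, as the theorem asserts for {tₖ₊₁/tₖ²}.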

Before proving the results of previous Theorem we must consider the following
results.
From now on we assume that the hypotheses of Theorem 12.3.1 hold.
Remark 12.3.2 The introduction of the center-Lipschitz-type condition (12.3) (i.e., function f₀) leads to the introduction of the restricted Lipschitz-type condition (12.4). The condition used in earlier studies such as [36] is given by

‖F̂′(x*)⁻¹[F′(x) − F′(x* + θ(x − x*))]‖ ≤ f₁′(‖x − x*‖) − f₁′(θ‖x − x*‖)    (12.8)

for each x ∈ D, θ ∈ [0, 1], where f₁ : [0, +∞) −→ R is also twice continuously differentiable. It follows from (12.3), (12.4) and (12.8) that

f₀(t) ≤ f₁(t)    (12.9)

f(t) ≤ f₁(t)    (12.10)

for each t ∈ [0, ν), since D₀ ⊆ D. If f₀(t) = f₁(t) = f(t) for each t ∈ [0, ν), then our results reduce to the corresponding ones in [36]. Otherwise (i.e., if strict inequality holds in (12.9) or (12.10)), the new results improve the old ones. Indeed, let

r₁ := sup{t ∈ (0, ν₁) : |f₁(t)/(t f₁′(t)) − 1| < 1},

where ν₁ := sup{t ∈ [0, +∞) : f₁′(t) < 0}. Then, the error bounds are (corresponding to (12.7)):

‖xₖ₊₁ − x*‖ ≤ e_{f₁}(‖xₖ − x*‖, 0)/|f₁′(‖xₖ − x*‖)| := βₖ ≤ [|η_{f₁}(t)|/t²] ‖xₖ − x*‖².    (12.11)

In view of the definitions of r, r₁ and estimates (12.7), (12.9), (12.10) and (12.11), it is clear that

r₁ ≤ r    (12.12)

and

αₖ ≤ βₖ,  k = 1, 2, . . . .    (12.13)

Hence, we obtain a larger radius of convergence and tighter error estimates on the distances involved, leading to a wider choice of initial guesses x₀ and fewer computations of the iterates xₖ in order to achieve a desired error tolerance. It is also worth noticing that these advantages are obtained under the same computational cost, due to the fact that in practice the computation of function f₁ requires the computation, as special cases, of functions f₀ and f. The introduction of function f was not possible before (when only function f₁ was used). This introduction is possible using function f₀ (i.e., f is a function of f₀).

Next, we present an auxiliary Banach lemma relating the operator F to the majorant function f₀.

Lemma 12.3.3 Let x* ∈ H be such that F̂′(x*) is a positive operator and F̂′(x*)⁻¹ exists. If ‖x − x*‖ ≤ min{R, ν}, then F̂′(x) is a positive operator and F̂′(x)⁻¹ exists. Moreover,

‖F̂′(x)⁻¹‖ ≤ ‖F̂′(x*)⁻¹‖ / |f₀′(‖x − x*‖)|.    (12.14)

Proof: First, note that

‖F̂′(x) − F̂′(x*)‖ ≤ (1/2)‖F′(x) − F′(x*)‖ + (1/2)‖(F′(x) − F′(x*))*‖ = ‖F′(x) − F′(x*)‖.    (12.15)

Take x ∈ U(x*, r). Since r < ν, we have ‖x − x*‖ < ν. Thus f′(‖x − x*‖) < 0, which, together with (12.15), implies that for all x ∈ U(x*, r)

‖F̂′(x*)⁻¹[F̂′(x) − F̂′(x*)]‖ ≤ ‖F̂′(x*)⁻¹[F′(x) − F′(x*)]‖ ≤ f₀′(‖x − x*‖) − f₀′(0) < 1.    (12.16)

By the Banach lemma, we conclude that F̂′(x)⁻¹ exists. Moreover, taking into account the last inequality,

‖F̂′(x)⁻¹‖ ≤ ‖F̂′(x*)⁻¹‖ / (1 − ‖F̂′(x*)⁻¹[F̂′(x) − F̂′(x*)]‖) ≤ ‖F̂′(x*)⁻¹‖ / (1 − (f₀′(‖x − x*‖) − f₀′(0))) = ‖F̂′(x*)⁻¹‖ / |f₀′(‖x − x*‖)|.

The last step follows from the fact that r = min{R, ν}. Moreover, using (12.16) we have

‖F̂′(x) − F̂′(x*)‖ ≤ 1 / ‖F̂′(x*)⁻¹‖.    (12.17)

Choose y ∈ H. Then, from the last inequality,

⟨(F̂′(x*) − F̂′(x))y, y⟩ ≤ ‖F̂′(x*) − F̂′(x)‖‖y‖² ≤ ‖y‖² / ‖F̂′(x*)⁻¹‖,

which implies

⟨F̂′(x*)y, y⟩ − ‖y‖² / ‖F̂′(x*)⁻¹‖ ≤ ⟨F̂′(x)y, y⟩.

Since F̂′(x*) is a positive operator and F̂′(x*)⁻¹ exists by assumption, we obtain by Lemma 12.2.4 that

⟨F̂′(x*)y, y⟩ ≥ ‖y‖² / ‖F̂′(x*)⁻¹‖.

Therefore, combining the two last inequalities, we conclude that ⟨F̂′(x)y, y⟩ ≥ 0, i.e., F̂′(x) is a positive operator. Hence F̂′(x)⁻¹ exists, and by Lemma 12.2.4 we have that for any y ∈ H

⟨F̂′(x)y, y⟩ ≥ ‖y‖² / ‖F̂′(x)⁻¹‖.

It is worth noticing that ⟨F̂′(x)y, y⟩ = ⟨F′(x)y, y⟩. Thus, we conclude that the Newton iteration mapping is well defined. □

Now, denoting by N_{F+Q} the Newton iteration mapping for F + Q in that region, namely N_{F+Q} : U(x*, r) −→ H, it is defined by

0 ∈ F(x) + F′(x)(N_{F+Q}(x) − x) + Q(N_{F+Q}(x)), for all x ∈ U(x*, r).    (12.18)

Remark 12.3.4 Under condition (12.8), it was shown in [36] that

‖F̂′(x)⁻¹‖ ≤ ‖F̂′(x*)⁻¹‖ / |f₁′(‖x − x*‖)|    (12.19)

instead of (12.14). However, (12.14) gives a tighter error estimate than (12.19), since |f₁′(t)| ≤ |f₀′(t)|. This is really important in the proof of Theorem 12.3.1.
Proof of Theorem 12.3.1 Simply follow the proof of Theorem 4 in [36], but realize that the iterates xₖ lie in D₀, which is a more precise location than D (used in [36]), allowing the use of the tighter function f instead of f₁, and also the use of the tighter function f₀ instead of f₁ for the computation of the upper bounds on the inverses ‖F̂′(x)⁻¹‖ (i.e., we use (12.14) instead of (12.19)).

12.4 Special cases


First of all, let the functions f₀, f and f₁ be defined by

f₀(t) = (L₀/2)t² − t,

f(t) = (L/2)t² − t

and

f₁(t) = (L₁/2)t² − t
for some positive constants L0 , L and L1 to be determined using a specialized
operator F.

Example 12.4.1 Let X = Y = R³, D = Ū(0, 1), x* = (0, 0, 0)ᵀ. Define the function F on D, for w = (x, y, z)ᵀ, by

F(w) = (eˣ − 1, ((e − 1)/2)y² + y, z)ᵀ.

Then, the Fréchet derivative is given by

F′(w) = diag(eˣ, (e − 1)y + 1, 1).

Notice that L₀ = e − 1, L = e^{1/L₀}, L₁ = e. Hence f₀(t) < f(t) < f₁(t). Therefore, we have that the conditions of Theorem 12.3.1 hold. Moreover, we get that

r₁ := 2/(3L₁) < r := 2/(3L) < r* := 2/(2L₀ + L).
Furthermore, the corresponding error bounds are:

‖xₖ₊₁ − x*‖ ≤ L‖xₖ − x*‖² / (2(1 − L₀‖xₖ − x*‖)),

‖xₖ₊₁ − x*‖ ≤ L‖xₖ − x*‖² / (2(1 − L‖xₖ − x*‖))

and

‖xₖ₊₁ − x*‖ ≤ L₁‖xₖ − x*‖² / (2(1 − L₁‖xₖ − x*‖)).
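The quantities in Example 12.4.1 are easy to check numerically. The following Python sketch (our own illustration; the sample distance d is a hypothetical value, not from the example) evaluates the three radii and the three error-bound factors:

```python
import math

# Lipschitz constants as derived in Example 12.4.1.
L0 = math.e - 1          # center-Lipschitz constant
L = math.e ** (1 / L0)   # restricted Lipschitz constant on D0
L1 = math.e              # Lipschitz constant on the full domain D

# Convergence radii: the restricted-domain analysis enlarges the radius.
r1 = 2 / (3 * L1)
r = 2 / (3 * L)
rstar = 2 / (2 * L0 + L)
print(f"r1 = {r1:.6f} < r = {r:.6f} < r* = {rstar:.6f}")

# Error-bound factors at a sample distance d = ||x_k - x*|| (hypothetical).
d = 0.1
new_bound = L * d**2 / (2 * (1 - L0 * d))   # f0 in the denominator (tightest)
mid_bound = L * d**2 / (2 * (1 - L * d))
old_bound = L1 * d**2 / (2 * (1 - L1 * d))  # earlier bound based on f1
print(f"bounds: {new_bound:.6e} <= {mid_bound:.6e} <= {old_bound:.6e}")
```

Running it reproduces the ordering r₁ < r < r* and the corresponding ordering of the error-bound factors.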
References

[1] I. K. Argyros, A semilocal convergence analysis for directional Newton


methods, Math. Comput. AMS, 80, 327–343, 2011.
[2] I. K. Argyros, Computational theory of iterative methods, Studies in Com-
putational Mathematics, 15, Editors: K. Chui and L. Wuytack. Elsevier
New York, U.S.A., 2007.
[3] I. K. Argyros, Concerning the convergence of Newton’s method and
quadratic majorants, J. Appl. Math. Comput., 29, 391–400, 2009.
[4] I. K. Argyros and D. González, Local convergence analysis of inexact
Gauss-Newton method for singular systems of equations under majorant
and center-majorant condition, SeMA, 69 (1), 37–51, 2015.
[5] I. K. Argyros and S. Hilout, On the Gauss-Newton method, J. Appl. Math.,
1–14, 2010.
[6] I. K. Argyros and S. Hilout, Extending the applicability of the Gauss-
Newton method under average Lipschitz-conditions, Numer. Algorithms,
58, 23–52, 2011.
[7] I. K. Argyros and S. Hilout, On the solution of systems of equations with
constant rank derivatives, Numer. Algorithms, 57, 235–253, 2011.
[8] I. K. Argyros, Y. J. Cho and S. Hilout, Numerical methods for equations and
its applications, CRC Press/Taylor and Francis Group, New York, 2012.
[9] I. K. Argyros and S. Hilout, Improved local convergence of Newton’s
method under weaker majorant condition, J. Comput. Appl. Math., 236 (7),
1892–1902, 2012.
[10] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method, J. Complexity, 28, 364–387, 2012.

[11] I. K. Argyros and S. Hilout, Computational Methods in Nonlinear Analysis,


World Scientific Publ. Comp., New Jersey (2013).
[12] L. Blum, F. Cucker, M. Shub and S. Smale. Complexity and real computa-
tion. Springer-Verlag, New York, 1998.
[13] L. Blum, F. Cucker, M. Shub and S. Smale, Complexity and Real Compu-
tation, Springer-Verlag, New York, 1997.
[14] D. C. Chang, J. Wang and J. C. Yao, Newton's method for variational inequality problems: Smale's point estimate theory under the γ-condition, Applicable Anal., 94 (1), 44–55, 2015.
[15] J. E. Dennis Jr. and R. B. Schnabel, Numerical Methods for Unconstrained
Optimization and Nonlinear Equations, in: Classics in Applied Mathemat-
ics, SIAM, Philadelphia, 1996.
[16] P. Deuflhard and G. Heindl, Affine invariant convergence for Newton’s
method and extensions to related methods, SIAM J. Numer. Anal., 16 (1),
1–10, 1979.
[17] A. L. Dontchev and R. T. Rockafellar, Implicit Functions and Solution
Mappings. A View From Variational Analysis, in: Springer Monographs
in Mathematics, Springer, Dordrecht, 2009.
[18] S. P. Dokov and A. L. Dontchev, Robinson’s strong regularity implies ro-
bust local convergence of Newton’s method Optimal control (Gainesville,
FL, 1997), Vol. 15 of Appl. Optim., pages 116–129 Kluwer Acad. Publ.,
Dordrecht (1998).
[19] A. L. Dontchev, Local analysis of a Newton-type method based on partial
linearization. In The mathematics of numerical analysis (Park City, UT,
1995). Amer. Math. Soc., Providence, RI, 1996.
[20] A. L. Dontchev, Local convergence of the Newton method for generalized
equations, C. R. Acad. Sci. Paris Ser. I Math., 322 (4), 327–331, 1996.
[21] A. L. Dontchev and R. T. Rockafellar, Implicit functions and solution map-
pings. Springer Monographs in Mathematics. Springer, Dordrecht, 2009.
[22] A. L. Dontchev and R. T. Rockafellar, Newton’s method for generalized
equations: a sequential implicit function theorem, Math. Program., 123 (1,
Ser. B), 139–159, 2010.
[23] O. Ferreira, Local convergence of Newton’s method in Banach space from
the viewpoint of the majorant principle, IMA J. Numer. Anal., 29 (3), 746–
759, 2009.

[24] O. P. Ferreira, M. L. N. Goncalves and P. R. Oliveira, Local convergence


analysis of the Gauss-Newton method under a majorant condition, J. Com-
plexity, 27 (1), 111–125, 2011.
[25] O. P. Ferreira and B. F. Svaiter, Kantorovich’s majorants principle for New-
ton’s method. Comput. Optim. Appl., 42 (2), 213–229, 2009.
[26] O. P. Ferreira, A robust semi-local convergence analysis of Newton’s
method for cone inclusion problems in Banach spaces under affine invariant
majorant condition, J. Comput. Appl. Math., 279, 318–335, 2015.
[27] N. Josephy, Newton's Method for Generalized Equations and the PIES Energy Model, University of Wisconsin-Madison, 1979.
[28] A. Pietrus and C. Jean-Alexis, Newton-secant method for functions with
values in a cone, Serdica Math. J., 39 (3-4), 271–286, 2013.
[29] L. B. Rall, A Note on the Convergence of Newton’s Method, SIAM J. Nu-
mer. Anal., 11 (1), 34–36, 1974.
[30] S. M. Robinson, Extension of Newton’s method to nonlinear functions with
values in a cone, Numer. Math., 19, 341–347, 1972.
[31] S. M. Robinson, Strongly regular generalized equations, Math. Oper. Res.,
5 (1), 43–62, 1980.
[32] S. M. Robinson, Normed convex processes, Trans. Amer. Math. Soc., 174,
127–140, 1972.
[33] B. Royo, J. A. Sicilia, M. J. Oliveros and E. Larrodé, Solving a long-
distance routing problem using ant colony optimization. Appl. Math., 9
(2L), 415–421, 2015.
[34] J. A. Sicilia, C. Quemada, B. Royo and D. Escuı́n, An optimization algo-
rithm for solving the rich vehicle routing problem based on variable neigh-
borhood search and tabu search metaheuristics. J. Comput. Appl. Math.,
291, 468–477, 2016.
[35] J. A. Sicilia, D. Escuı́n, B. Royo, E. Larrodé and J. Medrano, A hybrid
algorithm for solving the general vehicle routing problem in the case of the
urban freight distribution. In Computer-based Modelling and Optimization
in Transportation (pp. 463–475). Springer International Publishing, 2014.
[36] G. N. Silva, Convergence of the Newton’s method for generalized equations
under the majorant condition, arXiv:1603.05280v1[math.NA] 16 March
2016.

[37] S. Smale, Newton's method estimates from data at one point, in: R. Ewing, K. Gross and C. Martin (Eds.), The Merging of Disciplines: New Directions in Pure, Applied and Computational Mathematics, pp. 185–196, Springer-Verlag, New York, 1986.
[38] J. F. Traub and H. Wozniakowski, Convergence and complexity of Newton
iteration for operator equations, J. Assoc. Comput. Mach., 26 (2), 250–258,
1979.
[39] L. U. Uko, Generalized equations and the generalized Newton method,
Math. Programming, 73 (3, Ser. A), 251–268, 1996.
[40] X. Wang, Convergence of Newton’s method and uniqueness of the solution
of equation in Banach space, IMA J. Numer. Anal., 20, 123–134, 2000.
Chapter 13

Secant-like methods

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
13.2 Semilocal convergence analysis of the secant method I . . . . . . . . . . . . . . 218
13.3 Semilocal convergence analysis of the secant method II . . . . . . . . . . . . . 226
13.4 Local convergence analysis of the secant method I . . . . . . . . . . . . . . . . . . 227
13.5 Local convergence analysis of the secant method II . . . . . . . . . . . . . . . . . 231
13.6 Numerical examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

13.1 Introduction
In this chapter we study the problem of finding a locally unique solution x* of the equation

F(x) = 0,    (13.1)

where F is a Fréchet-differentiable operator defined on a convex subset D of a Banach space X with values in a Banach space Y.

Many problems from the Applied Sciences can be solved by finding the solutions of equations in a form like (13.1) [6, 7, 17, 22].

In this chapter, we present both a semilocal and a local convergence study for the secant-like method defined for each n = 0, 1, 2, . . . by

xₙ₊₁ = xₙ − Aₙ⁻¹F(xₙ),  Aₙ = δF(yₙ, zₙ),    (13.2)

where x₋₁, x₀ ∈ D are initial points, yₙ = g₁(xₙ₋₁, xₙ), zₙ = g₂(xₙ₋₁, xₙ), δF : D × D → L(X, Y) and g₁, g₂ : D × D → D. The mapping δF may be a divided difference of order one at the points (x, y) ∈ D × D with x ≠ y, satisfying

δF(x, y)(x − y) = F(x) − F(y),    (13.3)

and, if the operator F is Fréchet-differentiable, then F′(x) = δF(x, x) for each x ∈ D.


Many methods involving divided differences are special cases of method (13.2):

• Amat et al. method (X = Y = Rᵐ)

xₙ₊₁ = xₙ − δF(yₙ, zₙ)⁻¹F(xₙ)    (13.4)

choosing g₁(x, y) = x − aF(x) and g₂(x, y) = x + bF(x) for a, b ∈ [0, +∞).

• Ezquerro et al. method [12]

xₙ₊₁ = xₙ − δF(yₙ, xₙ)⁻¹F(xₙ)    (13.5)

choosing g₁(x, y) = λy + (1 − λ)x and g₂(x, y) = y for all λ ∈ [0, 1].

• Kurchatov method [6, 19, 26]

xₙ₊₁ = xₙ − δF(2xₙ − xₙ₋₁, xₙ₋₁)⁻¹F(xₙ)    (13.6)

choosing g₁(x, y) = 2y − x and g₂(x, y) = x.

• Secant method [5–7, 11, 17, 20–22]

xₙ₊₁ = xₙ − δF(xₙ₋₁, xₙ)⁻¹F(xₙ)    (13.7)

choosing g₁(x, y) = x and g₂(x, y) = y.

If x₋₁ = x₀ and zₙ = yₙ = xₙ for each n = 0, 1, 2, . . ., then we obtain Newton-like methods [4–9, 11, 22]. Many other choices are also possible [5–7, 11, 16, 22]; see the sketch below.
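To fix ideas, the following Python sketch (our own illustration; the scalar test equation, the parameter values a = b = 0.1, λ = 1/2 and the starting points are hypothetical choices, not taken from the references) implements the family (13.2) for scalar equations and recovers the four special cases through the choices of g₁ and g₂ listed above:

```python
def divided_difference(F, x, y):
    """First-order divided difference satisfying (13.3) for a scalar F."""
    return (F(x) - F(y)) / (x - y) if x != y else (F(x + 1e-8) - F(x)) / 1e-8

def secant_type(F, g1, g2, x_prev, x0, tol=1e-12, max_iter=50):
    """General method (13.2): x_{n+1} = x_n - A_n^{-1} F(x_n), A_n = dF(y_n, z_n)."""
    for _ in range(max_iter):
        y, z = g1(x_prev, x0), g2(x_prev, x0)
        A = divided_difference(F, y, z)
        x_prev, x0 = x0, x0 - F(x0) / A
        if abs(x0 - x_prev) < tol:
            break
    return x0

F = lambda x: x**3 - 10   # sample scalar equation with root 10**(1/3)

# Special cases of (13.2) via the choices of g1, g2 listed above.
secant    = secant_type(F, lambda x, y: x, lambda x, y: y, 2.0, 2.5)
kurchatov = secant_type(F, lambda x, y: 2*y - x, lambda x, y: x, 2.0, 2.5)
ezquerro  = secant_type(F, lambda x, y: 0.5*y + 0.5*x, lambda x, y: y, 2.0, 2.5)
amat      = secant_type(F, lambda x, y: x - 0.1*F(x), lambda x, y: x + 0.1*F(x), 2.0, 2.5)
print(secant, kurchatov, ezquerro, amat)  # all approx 2.15443469...
```

All four calls converge to the root 10^{1/3} ≈ 2.15443; only the choice of g₁, g₂ changes.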
A very important problem in the study of iterative procedures is the con-
vergence domain. In general the convergence domain is small. Therefore, it is

important to enlarge the convergence domain without additional hypotheses. Another important problem is to find more precise error estimates on the distances ‖xₙ₊₁ − xₙ‖ and ‖xₙ − x*‖. These are our objectives in this chapter.

The simplified secant method, defined for each n = 0, 1, 2, . . . by

xₙ₊₁ = xₙ − A₀⁻¹F(xₙ),  (x₋₁, x₀ ∈ D),

was first studied by S. Ulm [30]. The first semilocal convergence analysis was given by P. Laasonen [19]. His results were improved by F. A. Potra and V. Pták [20–22]. A semilocal convergence analysis for general secant-type methods was given by Argyros [5–9], J. E. Dennis [11], Hernández et al. [16, 17], Potra [20–22] and many others [4, 12, 14], who have provided sufficient convergence conditions for the secant method based on Lipschitz-type conditions on δF.
method based on Lipschitz–type conditions on δ F.
The use of Lipschitz and center-Lipschitz conditions is one way to enlarge the convergence domain of different methods. This technique consists of using both conditions together, instead of only the Lipschitz condition, which allows us to find a finer majorizing sequence, that is, a larger convergence domain. It has been used to find weaker convergence criteria for Newton's method by Amat et al. in [1–4] and Argyros in [5–9]. On the other hand, Gutiérrez et al. in [14, 15] give sufficient conditions for Newton's method using both Lipschitz and center-Lipschitz conditions.
Here using Lipschitz and center–Lipschitz conditions, we present semilocal
convergence results for method (13.2). Notice that our error bounds and the infor-
mation on the location of the solution are, under the same convergence condition,
more precise than the old ones given in earlier studies.
The rest of the chapter is organized as follows: In Sections 13.2 and 13.3,
we present the semilocal convergence study. Then, in Sections 13.4 and 13.5
we present the local convergence analysis of the secant method. Finally, some
numerical examples are presented in the concluding Section 13.6.

13.2 Semilocal convergence analysis of the secant method I
We shall show the semilocal convergence of method (13.2) under a set of conditions. The conditions (C) are:

(C1) F is a Fréchet-differentiable operator defined on a convex subset D of a Banach space X with values in a Banach space Y.

(C2) x₋₁ and x₀ are two points belonging to D satisfying the inequality ‖x₋₁ − x₀‖ ≤ c.

(C3) There exist continuous and nondecreasing functions h₁, h₂ : [0, +∞) × [0, +∞) → [0, +∞) and functions g₁, g₂ : D × D → D such that for each u, v ∈ D there exist y = g₁(u, v) ∈ D and z = g₂(u, v) ∈ D with

‖y − x₀‖ ≤ h₁(‖u − x₀‖, ‖v − x₀‖)

and

‖z − x₀‖ ≤ h₂(‖u − x₀‖, ‖v − x₀‖).

(C4) There exist a mapping δF : D × D → L(X, Y) and continuous, nondecreasing functions f₀, f₁ : [0, +∞) × [0, +∞) → [0, +∞) with f₀(0, 0) = f₁(0, 0) = 0 such that, for x₋₁, x₀, y₀ = g₁(x₋₁, x₀), z₀ = g₂(x₋₁, x₀) ∈ D, F′(x₀)⁻¹, A₀⁻¹ ∈ L(Y, X),

‖A₀⁻¹F(x₀)‖ ≤ η

and for each x, y, z ∈ D

‖F′(x₀)⁻¹(δF(x, y) − F′(x₀))‖ ≤ f₀(‖x − x₀‖, ‖y − x₀‖)

and

‖F′(x₀)⁻¹(δF(x, y) − F′(z))‖ ≤ f₁(‖x − z‖, ‖y − z‖).

(C5) Define the functions ϕ, ψ : [0, +∞) → [0, +∞) by

ϕ(t) = f₀(h₁(t, t), h₂(t, t))

and

ψ(t) = ∫₀¹ f₁(t + θη + h₁(t + c, t), t + θη + h₂(t + c, t)) dθ.

Moreover, define the function q by

q(t) = ψ(t)/(1 − ϕ(t)).

Suppose that the equation

t(1 − q(t)) − η = 0    (13.8)

has at least one positive zero, and that the smallest positive zero, denoted by r, satisfies

ϕ(r) < 1,

h₁(c, 0) ≤ r̄,  h₁(r, r) ≤ r̄

and

h₂(c, 0) ≤ r̄,  h₂(r, r) ≤ r̄, for some r̄ ≥ r.

Set q = q(r).

(C6 ) Ū(x0 , r̄) ⊆ D.


(C7) There exists r₁ ≥ r such that

∫₀¹ f₀(θr + (1 − θ)r₁, θr + (1 − θ)r₁) dθ < 1.

Now using the conditions (C) we present the main semilocal convergence
result.

Theorem 13.2.1 Suppose that conditions (C) hold. Then, the sequence {xn }
generated by method (13.2) starting at x−1 and x0 is well defined, remains in
U(x0 , r) for all n ≥ 0 and converges to a solution x∗ ∈ Ū(x0 , r) of equation
F(x) = 0, which is unique in Ū(x0 , r1 ) ∩ D.

Proof: We shall first show, using mathematical induction, that the sequence {xₙ} is well defined and belongs to U(x₀, r). By (C2)-(C4), y₀ = g₁(x₋₁, x₀) ∈ D, z₀ = g₂(x₋₁, x₀) ∈ D and A₀⁻¹ ∈ L(Y, X). It follows that x₁ is well defined and ‖x₁ − x₀‖ = ‖A₀⁻¹F(x₀)‖ ≤ η < r, since r is a solution of equation (13.8). Then, we have that x₁ ∈ U(x₀, r). Moreover, y₁ = g₁(x₀, x₁) and z₁ = g₂(x₀, x₁) are well defined by the last two hypotheses in (C5). By the second condition in (C4), we get that

‖F′(x₀)⁻¹(A₁ − F′(x₀))‖ ≤ f₀(‖y₁ − x₀‖, ‖z₁ − x₀‖)
≤ f₀(h₁(‖x₀ − x₀‖, ‖x₁ − x₀‖), h₂(‖x₀ − x₀‖, ‖x₁ − x₀‖))    (13.9)
≤ f₀(h₁(0, η), h₂(0, η)) ≤ ϕ(r) < 1.

It follows from (13.9) and the Banach lemma on invertible operators [7, 18, 22] that A₁⁻¹ ∈ L(Y, X) and

‖A₁⁻¹F′(x₀)‖ ≤ 1/(1 − f₀(h₁(0, η), h₂(0, η))) ≤ 1/(1 − ϕ(r)).    (13.10)

Therefore, x₂ is well defined. We can write, using method (13.2) for n = 0 and (C1), that

F(x₁) = F(x₁) − F(x₀) − A₀(x₁ − x₀) = ∫₀¹ [F′(x₀ + θ(x₁ − x₀)) − A₀] dθ (x₁ − x₀).    (13.11)
Then, using the first and second conditions in (C4), (C3) and (13.11), we get in turn that

‖F′(x₀)⁻¹F(x₁)‖ = ‖∫₀¹ F′(x₀)⁻¹[F′(x₀ + θ(x₁ − x₀)) − A₀] dθ (x₁ − x₀)‖
≤ ∫₀¹ f₁(‖x₀ − x₀‖ + θ‖x₁ − x₀‖ + ‖y₀ − x₀‖, ‖x₀ − x₀‖ + θ‖x₁ − x₀‖ + ‖z₀ − x₀‖) dθ ‖x₁ − x₀‖
≤ ∫₀¹ f₁(θη + h₁(‖x₋₁ − x₀‖, ‖x₀ − x₀‖), θη + h₂(‖x₋₁ − x₀‖, ‖x₀ − x₀‖)) dθ ‖x₁ − x₀‖
≤ ∫₀¹ f₁(θη + h₁(c, 0), θη + h₂(c, 0)) dθ ‖x₁ − x₀‖
≤ ψ(r)‖x₁ − x₀‖.    (13.12)
By (13.2), (13.10), (C5) and (13.12), we obtain

‖x₂ − x₁‖ = ‖A₁⁻¹F(x₁)‖ ≤ ‖A₁⁻¹F′(x₀)‖‖F′(x₀)⁻¹F(x₁)‖ ≤ ψ(r)‖x₁ − x₀‖/(1 − ϕ(r)) ≤ q‖x₁ − x₀‖ < η    (13.13)

and

‖x₂ − x₀‖ ≤ ‖x₂ − x₁‖ + ‖x₁ − x₀‖ ≤ (1 + q)‖x₁ − x₀‖ ≤ [(1 − q²)/(1 − q)]η < η/(1 − q) = r.    (13.14)

That is, x₂ ∈ U(x₀, r). On the other hand, y₂ = g₁(x₁, x₂) and z₂ = g₂(x₁, x₂) are well defined by the last two hypotheses in (C5). Then, as in (13.9), we have that

‖F′(x₀)⁻¹(A₂ − F′(x₀))‖ ≤ f₀(h₁(η, r), h₂(η, r)) ≤ ϕ(r) < 1.

Hence, we have that

‖A₂⁻¹F′(x₀)‖ ≤ 1/(1 − f₀(h₁(η, r), h₂(η, r))) ≤ 1/(1 − ϕ(r)).    (13.15)

That is, x₃ is well defined. Furthermore, y₂ = g₁(x₁, x₂) and z₂ = g₂(x₁, x₂) are well defined. Then, we have by method (13.2) for n = 1 and (C1) that

F(x₂) = F(x₂) − F(x₁) − A₁(x₂ − x₁) = ∫₀¹ [F′(x₁ + θ(x₂ − x₁)) − A₁] dθ (x₂ − x₁).    (13.16)
(13.16)
222  Iterative methods and their dynamics with applications: a contemporary study

Then, from the first, second hypotheses in (C4 ), (C3 ) and (13.11) we have as
in (13.12) that
1
F  (x0 )−1 F(x2 ) ≤ 0 f1 (x1 − x0  + θ x2 − x1  + y1 − x0 ,

x1 − x0  + θ x2 − x1  + z1 − x0 )dθ x2 − x1 


1
≤ 0
f1 (η + θ η + h1 (x0 − x0 , x1 − x0 ),

η + θ η + h2 (x0 − x0 , x1 − x0 ))dθ x2 − x1 


1
≤ 0
f1 ((1 + θ )η + h1 (c, η), (1 + θ )η+

h2 (c, η))dθ x2 − x1 

≤ ψ(r)x2 − x1 .
(13.17)
We get

‖x₃ − x₂‖ ≤ ‖A₂⁻¹F′(x₀)‖‖F′(x₀)⁻¹F(x₂)‖ ≤ [ψ(r)/(1 − ϕ(r))]‖x₂ − x₁‖ ≤ q‖x₂ − x₁‖ < q²‖x₁ − x₀‖ < η

and

‖x₃ − x₀‖ ≤ ‖x₃ − x₂‖ + ‖x₂ − x₁‖ + ‖x₁ − x₀‖ ≤ [(1 − q³)/(1 − q)]η < η/(1 − q) = r.

Then, similarly, we have that y₃, z₃ are well defined and

‖A₃⁻¹F′(x₀)‖ ≤ 1/(1 − f₀(h₁(‖x₂ − x₀‖, ‖x₃ − x₀‖), h₂(‖x₂ − x₀‖, ‖x₃ − x₀‖))) ≤ 1/(1 − f₀(h₁(r, r), h₂(r, r))) ≤ 1/(1 − ϕ(r)).    (13.18)

We also have that


1
F  (x0 )−1 F(x3 ) ≤ 0
f1 (x2 − x0  + θ x3 − x2  + y2 − x0 ,

x2 − x0  + θ x3 − x2  + z2 − x0 )dθ x3 − x2 


1
≤ 0
f1 (r + θ η + h1 (η, r),

r + θ η + h2 (η, r))dθ x3 − x2 

≤ ψ(r)x3 − x2 .
(13.19)
In view of (13.18) and (13.19), we get that

‖x₄ − x₃‖ ≤ ‖A₃⁻¹F′(x₀)‖‖F′(x₀)⁻¹F(x₃)‖ ≤ [ψ(r)/(1 − ϕ(r))]‖x₃ − x₂‖ ≤ q‖x₃ − x₂‖ < q³‖x₁ − x₀‖ < η

and

‖x₄ − x₀‖ ≤ ‖x₄ − x₃‖ + ‖x₃ − x₂‖ + ‖x₂ − x₁‖ + ‖x₁ − x₀‖ ≤ [(1 − q⁴)/(1 − q)]η < η/(1 − q) = r.

Then, we get, as in (13.18), that y₄, z₄ are well defined and

‖A₄⁻¹F′(x₀)‖ ≤ 1/(1 − ϕ(r))    (13.20)

and

‖F′(x₀)⁻¹F(x₄)‖ ≤ ∫₀¹ f₁(r + θη + h₁(r, r), r + θη + h₂(r, r)) dθ ‖x₄ − x₃‖ ≤ ψ(r)‖x₄ − x₃‖.    (13.21)

Then, we have from (13.21) that

‖x₅ − x₄‖ ≤ ‖A₄⁻¹F′(x₀)‖‖F′(x₀)⁻¹F(x₄)‖ ≤ [ψ(r)/(1 − ϕ(r))]‖x₄ − x₃‖ ≤ q‖x₄ − x₃‖ < q⁴‖x₁ − x₀‖ < η

and

‖x₅ − x₀‖ ≤ ‖x₅ − x₄‖ + ‖x₄ − x₃‖ + ‖x₃ − x₂‖ + ‖x₂ − x₁‖ + ‖x₁ − x₀‖ ≤ [(1 − q⁵)/(1 − q)]η < η/(1 − q) = r.

After that, supposing that for i = 1, 2, . . . , k − 1:

• the operator Aₖ⁻¹ ∈ L(Y, X) and ‖Aₖ⁻¹F′(x₀)‖ ≤ 1/(1 − ϕ(r)),
• ‖xₖ₊₁ − xₖ‖ ≤ q‖xₖ − xₖ₋₁‖,
• ‖xₖ₊₁ − x₀‖ ≤ [(1 − qᵏ⁺¹)/(1 − q)]η < r,

we analogously get

‖Aₖ₊₁⁻¹F′(x₀)‖ ≤ 1/(1 − ϕ(r)),

‖xₖ₊₂ − xₖ₊₁‖ ≤ qᵏ⁺¹η    (13.22)

and

‖xₖ₊₂ − x₀‖ ≤ [(1 − qᵏ⁺²)/(1 − q)]η < r.
Thus, we get that xₖ₊₂ ∈ U(x₀, r); that is, the sequence {xₙ} is well defined. Using (13.22) we have

‖xₖ₊ⱼ − xₖ‖ ≤ ‖xₖ₊ⱼ − xₖ₊ⱼ₋₁‖ + ‖xₖ₊ⱼ₋₁ − xₖ₊ⱼ₋₂‖ + · · · + ‖xₖ₊₁ − xₖ‖
≤ (qʲ⁻¹ + qʲ⁻² + · · · + q + 1)‖xₖ₊₁ − xₖ‖
= [(1 − qʲ)/(1 − q)]‖xₖ₊₁ − xₖ‖ < [qᵏ/(1 − q)]‖x₁ − x₀‖    (13.23)

for j = 1, 2, . . . and q < 1. It follows from (13.23) that the sequence {xₖ} is complete in the Banach space X and as such it converges to some x* ∈ Ū(x₀, r) (since Ū(x₀, r) is a closed set). From the estimate

‖F′(x₀)⁻¹F(xₖ)‖ ≤ ψ(r)‖xₖ − xₖ₋₁‖

and letting k → ∞, we deduce that F(x*) = 0. Next, we study the uniqueness part. Let y* ∈ Ū(x₀, r₁) be such that F(y*) = 0. Then, using the last condition in (C4) and (C7), we get in turn that, for Q = ∫₀¹ F′(y* + θ(x* − y*)) dθ,

‖F′(x₀)⁻¹(F′(x₀) − Q)‖ ≤ ∫₀¹ f₀(‖y* + θ(x* − y*) − x₀‖, ‖y* + θ(x* − y*) − x₀‖) dθ
≤ ∫₀¹ f₀(θ‖x* − x₀‖ + (1 − θ)‖y* − x₀‖, θ‖x* − x₀‖ + (1 − θ)‖y* − x₀‖) dθ
≤ ∫₀¹ f₀(θr + (1 − θ)r₁, θr + (1 − θ)r₁) dθ < 1.    (13.24)

It follows from (13.24) and the Banach lemma on invertible operators that Q⁻¹ ∈ L(Y, X). Then, using the identity 0 = F(x*) − F(y*) = Q(x* − y*), we conclude that y* = x*. □

Remark 13.2.2 (a) The condition A₀⁻¹ ∈ L(Y, X) can be dropped as follows. Suppose that

f₀(h₁(c, 0), h₂(c, 0)) < 1.    (13.25)

Then, we have by the third condition in (C4) that

‖F′(x₀)⁻¹(A₀ − F′(x₀))‖ ≤ f₀(‖y₀ − x₀‖, ‖z₀ − x₀‖)
≤ f₀(h₁(‖x₋₁ − x₀‖, ‖x₀ − x₀‖), h₂(‖x₋₁ − x₀‖, ‖x₀ − x₀‖))
≤ f₀(h₁(c, 0), h₂(c, 0)) < 1.

It follows from (13.25) that A₀⁻¹ ∈ L(Y, X) and

‖A₀⁻¹F′(x₀)‖ ≤ 1/(1 − f₀(h₁(c, 0), h₂(c, 0))).    (13.26)

Then, due to the estimate

‖x₁ − x₀‖ ≤ ‖A₀⁻¹F′(x₀)‖‖F′(x₀)⁻¹F(x₀)‖ ≤ ‖F′(x₀)⁻¹F(x₀)‖/(1 − f₀(h₁(c, 0), h₂(c, 0))),

we can define

η = ‖F′(x₀)⁻¹F(x₀)‖/(1 − f₀(h₁(c, 0), h₂(c, 0))).    (13.27)

Then, the conclusions of Theorem 13.2.1 hold with (13.25) replacing the condition A₀⁻¹ ∈ L(Y, X) and with η given by (13.27).

(b) In view of the second and third conditions in (C4), we have that

f₀(s₁, s₂) ≤ f₁(s₃, s₄) for each sᵢ ≥ 0, i = 1, 2, 3, 4, with s₁ ≤ s₃ and s₂ ≤ s₄.


13.3 Semilocal convergence analysis of the secant method II
Next, we show the semilocal convergence of method (13.2) under different con-
ditions. The conditions (H) are:

(H1) F is a continuous operator defined on a convex subset D of a Banach space X with values in a Banach space Y.

(H2) x₋₁ and x₀ are two points belonging to D satisfying the inequality ‖x₋₁ − x₀‖ ≤ c.

(H3) There exist continuous and nondecreasing functions h₁, h₂ : [0, +∞) × [0, +∞) → [0, +∞) and functions g₁, g₂ : D × D → D such that for each u, v ∈ D there exist y = g₁(u, v) ∈ D and z = g₂(u, v) ∈ D with

‖y − x₀‖ ≤ h₁(‖u − x₀‖, ‖v − x₀‖)

and

‖z − x₀‖ ≤ h₂(‖u − x₀‖, ‖v − x₀‖).

(H4) There exist a divided difference of order one δF and continuous, nondecreasing functions f₀, f₁ : [0, +∞) × [0, +∞) → [0, +∞) such that, for x₋₁, x₀, y₀ = g₁(x₋₁, x₀), z₀ = g₂(x₋₁, x₀) ∈ D, A₀⁻¹ ∈ L(Y, X),

‖A₀⁻¹F(x₀)‖ ≤ η

and for each x, y, u, v ∈ D

‖A₀⁻¹(δF(x, y) − A₀)‖ ≤ f₀(‖x − y₀‖, ‖y − z₀‖)

and

‖A₀⁻¹(δF(x, y) − δF(u, v))‖ ≤ f₁(‖x − u‖, ‖y − v‖).

(H5) Define the functions ϕ, ψ : [0, +∞) → [0, +∞) by

ϕ(t) = f₀(h₁(t, t) + h₁(c, 0), h₂(t, t) + h₂(c, 0)),

ψ(t) = f₁(t + h₁(t + c, t), t + h₂(t + c, t))

and

q(t) = ψ(t)/(1 − ϕ(t)).

Suppose that the equation

t(1 − q(t)) − η = 0

has at least one positive zero, and that the smallest positive zero, denoted by R, satisfies

ϕ(R) < 1,

h₁(c, 0) ≤ R̄,  h₁(R, R) ≤ R̄

and

h₂(c, 0) ≤ R̄,  h₂(R, R) ≤ R̄, for some R̄ ≥ R.

Set q = q(R).
(H6 ) Ū(x0 , R̄) ⊆ D.
(H7 ) There exists R1 ≥ R such that

f0 (R1 + h1 (c, 0), R1 + h2 (c, 0)) < 1.

Using the (H) conditions we present the following semilocal convergence result.

Theorem 13.3.1 Suppose that the conditions (H) hold. Then, the sequence {xn }
generated by method (13.2) starting at x−1 and x0 is well defined, remains in
U(x0 , R) for all n ≥ 0 and converges to a solution x∗ ∈ Ū(x0 , R) of equation
F(x) = 0, which is unique in Ū(x0 , R1 ) ∩ D.

It is worth noticing that the second and third conditions in (H4 ) do not neces-
sarily imply the Fréchet-differentiability of F.

13.4 Local convergence analysis of the secant method I


Next, we shall show the local convergence of method under a set of conditions
(C∗ ). The conditions (C∗ ) are:
(C1∗ ) F is a Fréchet-differentiable operator defined on a convex subset D of a
Banach space X with values in a Banach space Y.
(C2∗ ) There exist x∗ ∈ D, such that F(x∗ ) = 0 and F  (x∗ )−1 ∈ L(Y, X ).
(C3*) There exist continuous and nondecreasing functions h₁, h₂ : [0, +∞) × [0, +∞) → [0, +∞) and functions g₁, g₂ : D × D → D such that for each u, v ∈ D there exist y = g₁(u, v) ∈ D and z = g₂(u, v) ∈ D with

‖y − x*‖ ≤ h₁(‖u − x*‖, ‖v − x*‖)

and

‖z − x*‖ ≤ h₂(‖u − x*‖, ‖v − x*‖).
228  Iterative methods and their dynamics with applications: a contemporary study

(C4*) There exist a mapping δF : D × D → L(X, Y) and continuous, nondecreasing functions f₀, f₁ : [0, +∞) × [0, +∞) → [0, +∞) with f₀(0, 0) = f₁(0, 0) = 0 such that, for x₋₁, x₀, y₀ = g₁(x₋₁, x₀), z₀ = g₂(x₋₁, x₀) ∈ D and each x, y, z ∈ D,

‖F′(x*)⁻¹(δF(x, y) − F′(x*))‖ ≤ f₀(‖x − x*‖, ‖y − x*‖)

and

‖F′(x*)⁻¹(δF(x, y) − F′(z))‖ ≤ f₁(‖x − z‖, ‖y − z‖).

(C5*) Define the functions ϕ, ψ : [0, +∞) → [0, +∞) by

ϕ(t) = f₀(h₁(t, t), h₂(t, t))

and

ψ(t) = ∫₀¹ f₁(θt + h₁(t, t), θt + h₂(t, t)) dθ.

Moreover, define the functions q and p by

q(t) = ψ(t)/(1 − ϕ(t))

and

p(t) = ∫₀¹ f₁(θt + h₁(t, t), θt + h₂(t, t)) dθ + f₀(h₁(t, t), h₂(t, t)) − 1.

Suppose that the equation

p(t) = 0    (13.28)

has at least one positive zero, and that the smallest positive zero, denoted by r, satisfies

ϕ(r) < 1,

h₁(r, r) ≤ r̄

and

h₂(r, r) ≤ r̄, for some r̄ ≥ r.

Set q = q(r).
(C6*) Ū(x₀, r̄) ⊆ D.

(C7*) There exists r₁ ≥ r such that

∫₀¹ f₀((1 − θ)r₁, (1 − θ)r₁) dθ < 1.

Theorem 13.4.1 Suppose that the conditions (C∗ ) hold. Then, the sequence {xn }
generated for x−1 , x0 ∈ U(x∗ , r) \ {x∗ } by method (13.2) is well defined, remains
in U(x∗ , r) and converges to the solution x∗ of equation F(x) = 0, which is unique
in Ū(x0 , r1 ) ∩ D.
Proof: The proof is similar to the proof of Theorem 13.2.1. We shall first show, using mathematical induction, that the sequence {xₙ} is well defined and belongs to U(x*, r). We have by the first condition in (C4*) that

‖F′(x*)⁻¹(A₀ − F′(x*))‖ ≤ f₀(‖y₀ − x*‖, ‖z₀ − x*‖)
≤ f₀(h₁(‖x₋₁ − x*‖, ‖x₀ − x*‖), h₂(‖x₋₁ − x*‖, ‖x₀ − x*‖))
≤ f₀(h₁(r, r), h₂(r, r)) ≤ ϕ(r) < 1.

Hence, A₀⁻¹ ∈ L(Y, X) and

‖A₀⁻¹F′(x*)‖ ≤ 1/(1 − f₀(h₁(‖x₋₁ − x*‖, ‖x₀ − x*‖), h₂(‖x₋₁ − x*‖, ‖x₀ − x*‖))) ≤ 1/(1 − ϕ(r)).
Thus, x₁ is well defined. Then, we have by the second condition in (C4*) that

‖∫₀¹ F′(x*)⁻¹[F′(x* + θ(x₀ − x*)) − A₀] dθ (x₀ − x*)‖
≤ ∫₀¹ f₁(‖x* + θ(x₀ − x*) − y₀‖, ‖x* + θ(x₀ − x*) − z₀‖) dθ ‖x₀ − x*‖
≤ ∫₀¹ f₁(θ‖x₀ − x*‖ + ‖y₀ − x*‖, θ‖x₀ − x*‖ + ‖z₀ − x*‖) dθ ‖x₀ − x*‖
≤ ∫₀¹ f₁(θ‖x₀ − x*‖ + h₁(‖x₋₁ − x*‖, ‖x₀ − x*‖), θ‖x₀ − x*‖ + h₂(‖x₋₁ − x*‖, ‖x₀ − x*‖)) dθ ‖x₀ − x*‖
≤ ∫₀¹ f₁(θr + h₁(r, r), θr + h₂(r, r)) dθ ‖x₀ − x*‖.

In view of the approximation

x₁ − x* = x₀ − x* − A₀⁻¹F(x₀)
= −A₀⁻¹(F(x₀) − F(x*) − A₀(x₀ − x*))
= −A₀⁻¹ ∫₀¹ [F′(x* + θ(x₀ − x*)) − A₀] dθ (x₀ − x*),

we obtain
1
x1 − x∗  ≤ A−1 ∗
0 F(x ) 0
F  (x∗ )−1 (F  (x∗ + θ (x0 − x∗ )) − A0 )
dθ x0 − x∗ 

≤ qx0 − x∗  < x0 − x∗  < r,

which shows x1 ∈ U(x∗ .r).


Then, we must show that A−1
1 ∈ L(Y, X ). We get

F  (x∗ )−1 (A1 − F  (x∗ )) ≤ f0 (h1 (x0 − x∗ , x1 − x∗ ),


h2 (x0 − x∗ , x1 − x∗ ))

≤ f0 (h1 (r, r), h2 (r, r))x1 − x∗ 

= ϕ(r) < 1,
so
1
A−1  ∗
1 F (x ) ≤ .
1 − ϕ(r)
That is x2 is well defined. Then, we obtain
 
1
F  (x∗ )−1 0 F  (x∗ + θ (x1 − x∗ )) − A1 dθ x1 − x∗ 

≤ f1 (θ x1 − x∗  + y1 − x∗ , θ x1 − x∗  + z1 − x∗ )dθ x1 − x∗ 

≤ f1 (θ r + h1 (r, r), θ r + h2 (r, r), )dθ x1 − x∗ .

Using the identity


 
1

x2 − x = −A−1
1 F (x + θ (x1 − x )) − A1 dθ x1 − x∗ ,
 ∗ ∗
0

we get again that

x2 − x∗  ≤ qx1 − x∗  < x1 − x∗  < r,

which shows that x2 ∈ U(x∗ , r).


In a similar way, we obtain

xk+1 − xk  ≤ qxk − x∗  < xk − x∗  < r,

which shows that lim xk = x∗ and xk+1 ∈ U(x∗ , r). The part related to the unique-
k→∞
ness has been shown in Theorem 13.2.1. .

13.5 Local convergence analysis of the secant method II


Now we consider, as in the semilocal case, another set of conditions. The condi-
tions (H ∗ ) are:
(H1∗ ) F is a continuous operator defined on a convex subset D of a Banach space
X with values in a Banach space Y.
(H2∗ ) There exists x∗ ∈ D such that F(x∗ ) = 0.
(H3*) There exist continuous and nondecreasing functions h₁, h₂ : [0, +∞) × [0, +∞) → [0, +∞) and functions g₁, g₂ : D × D → D such that for each u, v ∈ D there exist y = g₁(u, v) ∈ D and z = g₂(u, v) ∈ D with

‖y − x*‖ ≤ h₁(‖u − x*‖, ‖v − x*‖)

and

‖z − x*‖ ≤ h₂(‖u − x*‖, ‖v − x*‖).

(H4*) There exist a divided difference of order one δF, a linear operator M ∈ L(X, Y), and continuous, nondecreasing functions f₀, f₁ : [0, +∞) × [0, +∞) → [0, +∞) such that, for x₋₁, x₀, y₀ = g₁(x₋₁, x₀), z₀ = g₂(x₋₁, x₀) ∈ D, M⁻¹ ∈ L(Y, X) and for each x, y, u, v ∈ D

‖M⁻¹(δF(x, y) − M)‖ ≤ f₀(‖x − x*‖, ‖y − x*‖)

and

‖M⁻¹(δF(x, x*) − δF(u, v))‖ ≤ f₁(‖x − u‖, ‖x* − v‖).

(H5*) Define the functions ϕ, ψ : [0, +∞) → [0, +∞) by

ϕ(t) = f₀(h₁(t, t), h₂(t, t))

and

ψ(t) = f₁(t + h₁(t, t), t + h₂(t, t)).

Moreover, define the functions q and p by

q(t) = ψ(t)/(1 − ϕ(t))

and

p(t) = f₁(t + h₁(t, t), t + h₂(t, t)) + f₀(h₁(t, t), h₂(t, t)) − 1.

Suppose that the equation

p(t) = 0

has at least one positive zero, and denote by R the smallest positive zero, which satisfies

ϕ(R) < 1,

h1 (R, R) ≤ R̄
and
h2 (R, R) ≤ R̄, for some R̄ ≥ R.
Set q = q(R).
(H6∗ ) Ū(x0 , R̄) ⊆ D.
(H7∗ ) There exists R1 ≥ R such that
f0 (0, R1 ) < 1.

Theorem 13.5.1 Suppose that the conditions (H*) hold. Then, the sequence {xₙ} generated for x₋₁, x₀ ∈ Ū(x*, R) \ {x*} by method (13.2) is well defined, remains in U(x*, R) and converges to the solution x* of equation F(x) = 0, which is unique in Ū(x₀, R₁) ∩ D.
The results in Sections 13.4 and 13.5 hold for Newton-like methods, if we set
x−1 = x0 and zn = yn = xn , for each n = 0, 1, 2 . . ..

13.6 Numerical examples


Example 13.6.1 We consider the integral equation

x(s) = 1 + ∫₀¹ G(s, t)x(t)² dt,  s ∈ [0, 1],    (13.29)

where x ∈ C[0, 1] and the kernel G is the Green's function on [0, 1] × [0, 1].
We use a discretization process to transform equation (13.29) into a finite-dimensional problem. For this, we follow the process described in Section 13.2 of [12] with m = 8, and we obtain the following system of nonlinear equations:

F(x) ≡ x − 1 − A vₓ = 0,  F : R⁸ → R⁸,    (13.30)

where

x = (x₁, x₂, . . . , x₈)ᵀ,  1 = (1, 1, . . . , 1)ᵀ,  A = (aᵢⱼ)⁸ᵢ,ⱼ₌₁,  vₓ = (x₁², x₂², . . . , x₈²)ᵀ.

We use the divided difference of first order of F given by [u, v; F] = I − B, where B = (bᵢⱼ)⁸ᵢ,ⱼ₌₁ with bᵢⱼ = aᵢⱼ(uⱼ + vⱼ).
If we choose the starting points x₋₁ = (7/10, . . . , 7/10)ᵀ and x₀ = (18/10, . . . , 18/10)ᵀ, method (13.5) with λ = 1/2 and the max-norm, we obtain c = 11/10, y₀ = (5/4, . . . , 5/4)ᵀ, β = 1.555774 . . ., η = 0.7839875 . . ., m = 0.257405 . . . and the polynomial

p(t) = −0.789051 + t(1 − 0.257406/(1 − 0.19223(0.55 + 2t)))

has no real roots, so the conditions in [12] are not satisfied and we cannot ensure the convergence of method (13.5).
Next, we are going to use our (C) conditions in order to ensure the convergence of the method. We consider g₁(x, y) = (y + x)/2 and g₂(x, y) = y, and we define the functions

h₁(s, t) = s/2 + t/2,

h₂(s, t) = t,

f₀(s, t) = 0.0103507 . . . (s + t)

and

f₁(s, t) = 0.047827 . . . (s + t),

which satisfy the (C3) and (C4) conditions. Moreover, we define

ϕ(t) = 0.0207015t,

ψ(t) = 0.0640438 + 0.191311t

and

q(t) = ψ(t)/(1 − ϕ(t)) = (0.0640438 + 0.191311t)/(1 − 0.0207015t).

The solutions of equation (13.8) are

r₁ = 1.09603 . . . and r₂ = 3.39564 . . . .

Then, denoting r = 1.09603 . . . and r̄ = r, it is easy to see that the following conditions are verified:

ϕ(r) = 0.0226895 . . . < 1,

h₁(c, 0) = 0.55 ≤ r̄,  h₁(r, r) = r ≤ r̄

and

h₂(c, 0) = 0 ≤ r̄,  h₂(r, r) = r ≤ r̄.

Finally, to show that condition (C7) is verified, we choose r₁ = 1.25 and we obtain that

∫₀¹ f₀(θr + (1 − θ)r₁, θr + (1 − θ)r₁) dθ = 0.0258769 . . . < 1.

So, all the conditions of Theorem 13.2.1 are satisfied and, as a consequence, we can ensure the convergence of method (13.5).
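The root r of equation (13.8) used above can be checked with a short Python sketch (our own illustration; we take η ≈ 0.789051, the value consistent with the displayed roots r₁ and r₂, and a simple scan-plus-bisection routine):

```python
def smallest_positive_zero(g, a=1e-12, b=10.0, n=200, tol=1e-10):
    """Scan for the first sign change of g on (0, b], then bisect."""
    xs = [a + i * (b - a) / n for i in range(n + 1)]
    for x0, x1 in zip(xs, xs[1:]):
        if g(x0) * g(x1) <= 0:
            while x1 - x0 > tol:
                m = 0.5 * (x0 + x1)
                x0, x1 = (m, x1) if g(x0) * g(m) > 0 else (x0, m)
            return 0.5 * (x0 + x1)
    return None

phi = lambda t: 0.0207015 * t
psi = lambda t: 0.0640438 + 0.191311 * t
q = lambda t: psi(t) / (1 - phi(t))
eta = 0.789051   # assumed: the value consistent with the quoted roots

r = smallest_positive_zero(lambda t: t * (1 - q(t)) - eta)
print(r)  # approx 1.096, matching r1 in the example up to rounding
```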

Example 13.6.2 We consider the same problem as in Example 13.6.1, but with method (13.4). If we choose the starting points x₋₁ = (12/10, . . . , 12/10)ᵀ and x₀ = (14/10, . . . , 14/10)ᵀ, method (13.4) with a = 0.8, b = 0.75 and the max-norm, we obtain c = 0.2,

y₀ = (1.11459, 1.16864, 1.23603, 1.28163, 1.28163, 1.23603, 1.16864, 1.11459)ᵀ,

z₀ = (1.70443, 1.64679, 1.5749, 1.52626, 1.52626, 1.5749, 1.64679, 1.70443)ᵀ,

β = 1.49123 . . ., δ = 0.380542 . . ., η = 0.394737 . . ., γ = 1.67305 . . . and M = 0.47606 . . .. One of the necessary conditions given in [4] is not satisfied, since

Mδγ² = 0.507089 . . . > 1/2,

so we cannot ensure the convergence of method (13.4).


Next, we are going to use our (H) conditions in order to ensure the convergence of the method. We consider g₁(x, y) = x − aF(x) and g₂(x, y) = x + bF(x), and we define the functions

h₁(s, t) = p₀t + 0.5p,

h₂(s, t) = p₁t + 0.5p,

f₀(s, t) = 0.0579754 . . . (s + t)

and

f₁(s, t) = 0.0579754 . . . (s + t),

where p = ‖A₀‖, p₀ = ‖I − 0.75 δF[x₋₁, x₀]‖ = 1.16988 . . . and p₁ = ‖I + 0.8 δF[x₋₁, x₀]‖ = 1.03601 . . ., which satisfy the (H3) and (H4) conditions. Moreover, we define

ϕ(t) = 0.0579754(2.98245 + 2.20589t),

ψ(t) = 0.10934 + 0.243838t

and

q(t) = ψ(t)/(1 − ϕ(t)) = (0.10934 + 0.243838t)/(1 − 0.0579754(2.98245 + 2.20589t)).
The solutions of the equation t(1 − q(t)) − η = 0 are

r₁ = 0.598033 . . . and r₂ = 1.46863 . . . .

Then, denoting r = 0.598033 . . . and r̄ = 1.44524 . . ., it is easy to see that the following conditions are verified:

ϕ(r) = 0.24939 . . . < 1,

h₁(c, 0) = 0.745613 . . . ≤ r̄,  h₁(r, r) = r̄ ≤ r̄

and

h₂(c, 0) = 0.745613 . . . ≤ r̄,  h₂(r, r) ≤ r̄.

Finally, to verify the uniqueness condition, we choose r₁ = 0.6 and we obtain that

∫₀¹ f₀(θr + (1 − θ)r₁, θr + (1 − θ)r₁) dθ = 0.0695705 . . . < 1.

So, all the conditions of Theorem 13.3.1 are satisfied and, as a consequence, we can ensure the convergence of method (13.4).

Example 13.6.3 Let X = Y = R³, D = U(0, 1), x* = (0, 0, 0)ᵀ and define the function F on D by

F(x, y, z) = (eˣ − 1, y² + y, z)ᵀ.    (13.31)

We have that, for u = (x, y, z)ᵀ,

F′(u) = diag(eˣ, 2y + 1, 1),    (13.32)

so that, using the norm of the maximum of the rows and (13.31)-(13.32), F′(x*) = diag{1, 1, 1}. In this case, we take the domain Ω = B(x₀, R), where x₀ = 0.5 and R = 0.3, and choose x₋₁ = 0.4. We are going to use our (C*) conditions in order to ensure the convergence of method (13.7). We define the functions

h₁(s, t) = t,

h₂(s, t) = s,

f₀(s, t) = (e/2)(s + t)

and

f₁(s, t) = e(s + t),

which satisfy the (C3*) and (C4*) conditions. Moreover, we define

ϕ(t) = et,

ψ(t) = 3et,

q(t) = ψ(t)/(1 − ϕ(t)) = 3et/(1 − et)

and

p(t) = ∫₀¹ f₁(θt + h₁(t, t), θt + h₂(t, t)) dθ + f₀(h₁(t, t), h₂(t, t)) − 1 = 10.8731t − 1.

The root of p(t) is

r = 0.0919699 . . . ,

so, denoting r = 0.0919699 . . . and r̄ = r, it is easy to see that the following conditions are verified:

ϕ(r) = 0.25 < 1,

h₁(r, r) = r ≤ r̄

and

h₂(r, r) = r ≤ r̄.

Finally, it is clear that Ū(x₀, r̄) ⊆ U(x₀, R), and to show that condition (C7*) is verified we choose r₁ = 0.1 and obtain that

∫₀¹ f₀(θr + (1 − θ)r₁, θr + (1 − θ)r₁) dθ = 0.135914 . . . < 1.

So, all the conditions of Theorem 13.4.1 are satisfied and, as a consequence, we can ensure the convergence of method (13.7).

Example 13.6.4 In order to show the applicability of our theory to a real problem, we are going to consider the following quartic equation, which describes the fraction of the nitrogen-hydrogen feed that gets converted to ammonia (the so-called fractional conversion) [13, 25]. The ammonia process is shown in Figure 13.1. For 250 atm and 500°C, this equation takes the form:

f(x) = x⁴ − 7.79075x³ + 14.7445x² + 2.511x − 1.674.

In this case, we take the domain Ω = B(x₀, R), where x₀ = 0.5 and R = 0.3. Choose x₋₁ = 0.4. We are going to use our (C*) conditions in order to ensure the convergence of method (13.6). Define the functions

h₁(s, t) = s + 2t,

h₂(s, t) = s,

f₀(s, t) = 0.111304(32.2642s + 20.5458t)

and

f₁(s, t) = 0.111304(40.9686s + 30.7779t),

which satisfy the (C3*) and (C4*) conditions. Moreover, we define

ϕ(t) = 13.0603t,

ψ(t) = 21.0985t,

Figure 13.1: Ammonia process. Taken from www.essentialchemicalindustry.org/chemicals/ammonia.

q(t) = ψ(t)/(1 − ϕ(t)) = 21.0985t/(1 − 13.0603t)

and

p(t) = ∫₀¹ f₁(θt + h₁(t, t), θt + h₂(t, t)) dθ + f₀(h₁(t, t), h₂(t, t)) − 1 = 34.1587t − 1.

The root of p(t) is

r = 0.0292751 . . . .

Then, denoting r = 0.0292751 . . . and r̄ = 3r = 0.0878253 . . ., it is easy to see that the following conditions are verified:

ϕ(r) = 0.38234 . . . < 1,

h₁(r, r) = r̄ ≤ r̄

and

h₂(r, r) = r ≤ r̄.

Finally, it is clear that Ū(x₀, r̄) ⊆ U(x₀, R). Next, to show that condition (C7*) is verified, we choose r₁ = 0.1 and we obtain that

∫₀¹ f₀(θr + (1 − θ)r₁, θr + (1 − θ)r₁) dθ = 0.293899 . . . < 1.

So, all the conditions of Theorem 13.4.1 are satisfied and, as a consequence, we can ensure the convergence of method (13.6).
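As a sanity check, Kurchatov's method (13.6) can be run directly on this quartic. The sketch below (our own illustration) converges from x₋₁ = 0.4, x₀ = 0.5 to the physically meaningful fractional conversion x* ≈ 0.2777:

```python
def f(x):
    return x**4 - 7.79075*x**3 + 14.7445*x**2 + 2.511*x - 1.674

def kurchatov(f, x_prev, x0, tol=1e-12, max_iter=50):
    """Method (13.6): A_n = divided difference at (2x_n - x_{n-1}, x_{n-1})."""
    for _ in range(max_iter):
        u, v = 2*x0 - x_prev, x_prev
        A = (f(u) - f(v)) / (u - v)      # divided difference satisfying (13.3)
        x_prev, x0 = x0, x0 - f(x0) / A
        if abs(x0 - x_prev) < tol:
            break
    return x0

print(kurchatov(f, 0.4, 0.5))   # approx 0.2777 (fractional conversion)
```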
References

[1] S. Amat, S. Busquier and J. M. Gutiérrez, On the local convergence of secant-type methods, Int. J. Comput. Math., 81 (8), 1153–1161, 2004.
[2] S. Amat, S. Busquier and Á. A. Magreñán, Reducing chaos and bifurca-
tions in Newton-type methods, Abst. Appl. Anal., 2013 (2013), Article ID
726701, 10 pages, http://dx.doi.org/10.1155/2013/726701.
[3] S. Amat, Á. A. Magreñán and N. Romero, On a family of two step Newton-
type methods, Appl. Math. Comput., 219 (4), 11341–11347, 2013.
[4] S. Amat, J. A. Ezquerro and M. A. Hernández, On a Steffensen-like method
for solving equations, Calcolo, 53 (2), 171–188, 2016.
[5] I. K. Argyros, A unifying local–semilocal convergence analysis and ap-
plications for two–point Newton–like methods in Banach space, J. Math.
Anal. Appl., 298, 374–397, 2004.
[6] I. K. Argyros, Computational theory of iterative methods, Elsevier Science,
Editors C. K. Chui and L. Wuytack, San Diego (U.S.A.), 2007.
[7] I. K. Argyros, A semilocal convergence analysis for directional Newton
methods, Math. Comput. AMS, 80, 327–343, 2011.
[8] I. K. Argyros, Weaker conditions for the convergence of Newton’s method, J. Complexity, 28, 364–387, 2012.
[9] I. K. Argyros, D. González and Á. A. Magreñán, A semilocal conver-
gence for a uniparametric family of efficient secant-Like methods, J. Funct.
Spaces, Volume 2014 (2014), Article ID 467980, 10 pages. http://dx.
doi.org/10.1155/2014/467980.
[10] I. K. Argyros and Á. A. Magreñán, A unified convergence analysis for
Secant-type methods, J. Korean Math. Soc., 51 (6), 1155–1175, 2014.

[11] J. E. Dennis, Toward a unified convergence theory for Newton-like meth-


ods, Nonlinear Funct. Anal. App. (L. B. Rall, ed.), Academic Press, New
York, 425–472, 1971.
[12] J. A. Ezquerro, M. A. Hernández and A. I. Velasco, An analysis of the
semilocal convergence for secant-like methods, Appl. Math. Comput., 266,
883–892, 2015.
[13] V. B. Gopalan and J. D. Seader, Application of interval Newton’s method to
chemical engineering problems, Reliable computing, 1 (3), 215–223, 1995.
[14] J. M. Gutiérrez, Á. A. Magreñán and N. Romero, On the semilocal conver-
gence of Newton-Kantorovich method under center-Lipschitz conditions,
Appl. Math. Comput., 221, 79–88, 2013.
[15] J. M. Gutiérrez, Á. A. Magreñán and J. L. Varona, The “Gauss-
Seidelization” of iterative methods for solving nonlinear equations in the
complex plane. Appl. Math. Comput., 218 (6), 2467–2479, 2011.
[16] M. A. Hernández, M. J. Rubio and J. A. Ezquerro, Solving a special case of
conservative problems by secant–like method, Appl. Math. Comput., 169,
926–942, 2005.
[17] M. A. Hernández, M. J. Rubio and J. A. Ezquerro, Secant-like methods for
solving nonlinear integral equations of the Hammerstein type, J. Comput.
Appl. Math., 115, 245–254, 2000.
[18] V. A. Kurchatov, On a method of linear interpolation for the solution of
functional equations, Dokl. Akad. Nauk SSSR (Russian), 198 (3), 524–526,
1971. Translation in Soviet Math. Dokl., 12, 835–838, 1974.
[19] P. Laasonen, Ein überquadratisch konvergenter iterativer Algorithmus, Ann. Acad. Sci. Fenn. Ser. A I, 450, 1–10, 1969.
[20] F. A. Potra, An error analysis for the secant method, Numer. Math., 38,
427–445, 1982.
[21] F. A. Potra, Sharp error bounds for a class of Newton-like methods, Lib.
Math., 5, 71–84, 1985.
[22] F. A. Potra and V. Pták, Nondiscrete induction and iterative processes, Pit-
man, New York, 1984.
[23] J. W. Schmidt, Untere Fehlerschranken für Regula-Falsi-Verfahren, Period. Math. Hungar., 9, 241–247, 1978.
[24] A. S. Sergeev, On the method of Chords (in Russian), Sibirsk. Math. J., 11,
282–289, 1961.

[25] M. Shacham, An improved memory method for the solution of a nonlinear


equation, Chem. Engin. Sci., 44, 1495–1501, 1989.
[26] S. M. Shakhno, On a Kurchatov’s method of linear interpolation for solving
nonlinear equations, PAMM-Proc. Appl. Math. Mech., 4, 650–651, 2004.
[27] B. Royo, J. A. Sicilia, M. J. Oliveros and E. Larrodé, Solving a long-
distance routing problem using ant colony optimization. Appl. Math., 9
(2L), 415–421, 2015.
[28] J. A. Sicilia, C. Quemada, B. Royo and D. Escuı́n, An optimization algo-
rithm for solving the rich vehicle routing problem based on variable neigh-
borhood search and tabu search metaheuristics. J. Comput. Appl. Math.,
291, 468–477, 2016.
[29] J. A. Sicilia, D. Escuı́n, B. Royo, E. Larrodé and J. Medrano, A hybrid
algorithm for solving the general vehicle routing problem in the case of the
urban freight distribution. In Computer-based Modelling and Optimization
in Transportation (pp. 463–475). Springer International Publishing, 2014.
[30] S. Ulm, Majorant principle and the method of Chords (in Russian), Izv.
Akad. Nauk Eston. SSR, Ser. Fiz.-Mat., 13, 217–227, 1964.
[31] T. Yamamoto, A convergence Theorem for Newton-like methods in Banach
spaces, Numer. Math., 51, 545–557, 1987.
Chapter 14

King-Werner-like
methods free of
derivatives

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
14.2 Semilocal convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
14.3 Local convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
14.4 Numerical examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246

14.1 Introduction
Recently, Argyros and Ren in [6] studied King-Werner-like methods for approx-
imating a locally unique solution x∗ of equation

F(x) = 0, (14.1)


where F is a Fréchet-differentiable operator defined on a convex subset of a Ba-


nach space B1 with values in a Banach space B2 .
In particular, they studied the semilocal convergence analysis of the method defined for n = 0, 1, 2, … by

x_{n+1} = x_n − A_n^{−1}F(x_n),
y_{n+1} = x_{n+1} − A_n^{−1}F(x_{n+1}),        (14.2)

where x0, y0 are initial points, A_n = [x_n, y_n; F] and [x, y; F] denotes a divided difference of order one for the operator F at points x, y ∈ Ω [2, 4, 7] satisfying

[x, y; F](x − y) = F(x) − F(y) for each x, y ∈ Ω with x ≠ y.        (14.3)

If F is Fréchet-differentiable on Ω, then F′(x) = [x, x; F] for each x ∈ Ω.
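For scalar equations, where [x, y; F] = (F(x) − F(y))/(x − y), method (14.2) can be transcribed directly. The following is only an illustrative sketch, assuming x_n ≠ y_n at every step:

```python
# Sketch of method (14.2) for scalar equations, using the divided difference
# [x, y; F] = (F(x) - F(y)) / (x - y); we assume x_n != y_n at every step.
def king_werner(F, x0, y0, tol=1e-12, max_iter=50):
    x, y = x0, y0
    for _ in range(max_iter):
        A = (F(x) - F(y)) / (x - y)    # A_n = [x_n, y_n; F]
        x_new = x - F(x) / A           # x_{n+1} = x_n - A_n^{-1} F(x_n)
        y_new = x_new - F(x_new) / A   # y_{n+1} = x_{n+1} - A_n^{-1} F(x_{n+1})
        if abs(x_new - x) <= tol:
            return x_new
        x, y = x_new, y_new
    return x
```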


On the other hand, the local convergence analysis of method (14.2) was given in [9] in the special case when B1 = B2 = R. The convergence order of method (14.2) was shown to be 1 + √2.
The chapter is organized as follows: In Section 14.2 the semilocal conver-
gence study of method (14.2) is presented. Then, in Section 14.3 the local conver-
gence study is provided. Finally, the numerical examples are presented in Section
14.4.

14.2 Semilocal convergence


First of all, we need some auxiliary results related to majorizing sequences. All
the proofs of these results can be found in [6].

Lemma 14.2.1 Let L0 > 0, L > 0, s0 ≥ 0, t1 ≥ 0 be given parameters. Denote by α the only root in the interval (0, 1) of the polynomial p defined by

p(t) = L0t³ + L0t² + 2Lt − 2L.        (14.4)

Suppose that

0 < L(t1 + s0)/(1 − L0(t1 + s1 + s0)) ≤ α ≤ 1 − 2L0t1/(1 − L0s0),        (14.5)

where
s1 = t1 + L(t1 + s0)t1.        (14.6)

Then, the scalar sequence {tn} defined for each n = 1, 2, … by

t0 = 0,
s_{n+1} = t_{n+1} + L(t_{n+1} − t_n + s_n − t_n)(t_{n+1} − t_n)/(1 − L0(t_n − t_0 + s_n + s_0)) for each n = 1, 2, …,
t_{n+2} = t_{n+1} + L(t_{n+1} − t_n + s_n − t_n)(t_{n+1} − t_n)/(1 − L0(t_{n+1} − t_0 + s_{n+1} + s_0)) for each n = 0, 1, 2, …        (14.7)
is well defined, increasing, bounded above by

t∗∗ = t1/(1 − α)        (14.8)

and converges to its unique least upper bound t∗, which satisfies

t1 ≤ t∗ ≤ t∗∗.        (14.9)

Moreover, the following estimates hold:

s_n − t_n ≤ α(t_n − t_{n−1}) ≤ αⁿ(t1 − t0),        (14.10)
t_{n+1} − t_n ≤ α(t_n − t_{n−1}) ≤ αⁿ(t1 − t0)        (14.11)
and
t_n ≤ s_n        (14.12)

for each n = 1, 2, ….
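The sequences (14.6)–(14.7) are straightforward to tabulate, which is convenient when checking condition (14.5) on concrete data; a minimal sketch:

```python
# Sketch generating the majorizing sequences {t_n}, {s_n} of (14.6)-(14.7).
def majorizing(L0, L, s0, t1, n_terms=20):
    t = [0.0, t1]                          # t_0, t_1
    s = [s0, t1 + L * (t1 + s0) * t1]      # s_0, s_1 from (14.6)
    # n = 0 instance of the t-recurrence in (14.7):
    t.append(t[1] + L * (t[1] - t[0] + s[0] - t[0]) * (t[1] - t[0])
                  / (1 - L0 * (t[1] - t[0] + s[1] + s[0])))
    for n in range(1, n_terms):
        num = L * (t[n+1] - t[n] + s[n] - t[n]) * (t[n+1] - t[n])
        s.append(t[n+1] + num / (1 - L0 * (t[n] - t[0] + s[n] + s[0])))
        t.append(t[n+1] + num / (1 - L0 * (t[n+1] - t[0] + s[n+1] + s[0])))
    return t, s

t, s = majorizing(L0=0.46, L=0.69, s0=0.0, t1=0.077)  # sample data
print(t[-1])                                          # approximates t*
```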

Throughout this chapter, we denote by U(w, ρ) and Ū(w, ρ) the open and closed balls in B1, respectively, with center w ∈ B1 and radius ρ > 0.
Now, using {tn } as a majorizing sequence we present our main semilocal
result.

Theorem 14.2.2 Let F : Ω ⊂ B1 → B2 be a Fréchet-differentiable operator. Suppose that there exists a divided difference [·, ·; F] of order one for the operator F on Ω × Ω. Moreover, suppose that there exist x0, y0 ∈ Ω, L0 > 0, L > 0, s0 ≥ 0, t1 ≥ 0 such that

A0^{−1} ∈ L(B2, B1),        (14.13)
‖A0^{−1}F(x0)‖ ≤ t1,        (14.14)
‖x0 − y0‖ ≤ s0,        (14.15)
‖A0^{−1}([x, y; F] − A0)‖ ≤ L0(‖x − x0‖ + ‖y − y0‖) for each x, y ∈ Ω,        (14.16)
‖A0^{−1}([x, y; F] − [z, v; F])‖ ≤ L(‖x − z‖ + ‖y − v‖)        (14.17)
for each x, y, z, v ∈ Ω1 = Ω ∩ U(x0, 1/(2L0)),

U(x0, t∗) ⊆ Ω        (14.18)

and the hypotheses of Lemma 14.2.1 hold, where A0 = [x0, y0; F] and t∗ is given in Lemma 14.2.1. Then, the sequence {xn} generated by method (14.2) is well defined, remains in U(x0, t∗) and converges to a unique solution x∗ ∈ Ū(x0, t∗) of equation F(x) = 0. Moreover, the following estimates hold for each n = 0, 1, 2, …:

‖xn − x∗‖ ≤ t∗ − tn.        (14.19)

Furthermore, if there exists R > t∗ such that

U(x0, R) ⊆ Ω        (14.20)
and
L0(t∗ + R + s0) < 1,        (14.21)

then the point x∗ is the only solution of equation F(x) = 0 in U(x0, R).

Proof: Simply notice that the iterates remain in Ω1, which is a more precise location than the set Ω used in [6], since Ω1 ⊆ Ω. In view of this, the proof follows from the corresponding one in [6]. □
Remark 14.2.3 (a) The limit point t∗ can be replaced by t∗∗, given in closed form by (14.8), in Theorem 14.2.2.

(b) In [6], Argyros and Ren used the stronger condition

‖A0^{−1}([x, y; F] − [z, v; F])‖ ≤ L1(‖x − z‖ + ‖y − v‖) for each x, y, z, v ∈ Ω.

Notice that

L0 ≤ L1 and L ≤ L1

hold in general and L1/L0 can be arbitrarily large [2–6]. Moreover, it follows from the proof of Theorem 14.2.2 that hypothesis (14.17) is not needed to compute an upper bound for ‖A0^{−1}F(x1)‖. Hence, we can define the more precise (than {tn}) majorizing sequence {t̄n} (for {xn}) by

t̄0 = 0, t̄1 = t1, s̄0 = s0, s̄1 = t̄1 + L0(t̄1 + s̄0)t̄1,
s̄_{n+1} = t̄_{n+1} + L(t̄_{n+1} − t̄_n + s̄_n − t̄_n)(t̄_{n+1} − t̄_n)/(1 − L0(t̄_n − t̄_0 + s̄_n + s̄_0)) for each n = 1, 2, …        (14.22)
and
t̄_{n+2} = t̄_{n+1} + L(t̄_{n+1} − t̄_n + s̄_n − t̄_n)(t̄_{n+1} − t̄_n)/(1 − L0(t̄_{n+1} − t̄_0 + s̄_{n+1} + s̄_0)) for each n = 0, 1, …        (14.23)

Then, from a simple induction argument we have that

t̄n ≤ tn,        (14.24)
s̄n ≤ sn,        (14.25)
t̄_{n+1} − t̄_n ≤ t_{n+1} − t_n,        (14.26)
s̄n − t̄n ≤ sn − tn        (14.27)
and
t̄∗ = lim_{n→∞} t̄n ≤ t∗.        (14.28)

Furthermore, if L0 < L, then (14.24)–(14.27) are strict for n ≥ 2, n ≥ 1, n ≥ 1, n ≥ 1, respectively. Clearly, the sequence {t̄n} is increasing, converges to t̄∗ under the hypotheses of Lemma 14.2.1, and can replace {tn} as a majorizing sequence for {xn} in Theorem 14.2.2. Finally, the old sequences using L1 instead of L in [6] are less precise than the new ones.

14.3 Local convergence

Now, we present the main local convergence result.

Theorem 14.3.1 Let F : Ω ⊆ B1 → B2 be a Fréchet-differentiable operator. Suppose that there exist x∗ ∈ Ω, l0 > 0 and l > 0 such that

F(x∗) = 0, F′(x∗)^{−1} ∈ L(B2, B1),        (14.29)

‖F′(x∗)^{−1}([x, y; F] − F′(x∗))‖ ≤ l0(‖x − x∗‖ + ‖y − x∗‖) for each x, y ∈ Ω,        (14.30)

‖F′(x∗)^{−1}([x, y; F] − [z, u; F])‖ ≤ l(‖x − z‖ + ‖y − u‖)        (14.31)
for each x, y, z, u ∈ Ω2 := Ω ∩ U(x∗, 1/(2l0))

and
U(x∗, ρ) ⊆ Ω,        (14.32)
where
ρ = 1/((1 + √2)l + 2l0).        (14.33)

Then, the sequence {xn} generated by method (14.2) is well defined, remains in U(x∗, ρ) and converges to x∗ with order 1 + √2 at least, provided that x0, y0 ∈ U(x∗, ρ). Furthermore, the following estimates

‖x_{n+2} − x∗‖ ≤ ((√2 − 1)/ρ²)‖x_{n+1} − x∗‖² ‖x_n − x∗‖        (14.34)
and
‖x_n − x∗‖ ≤ ((√2 − 1)/ρ)^{F_n − 1}‖x_1 − x∗‖^{F_n}        (14.35)

hold for each n = 1, 2, …, where F_n is a generalized Fibonacci sequence defined by F1 = F2 = 1 and F_{n+2} = 2F_{n+1} + F_n.

Proof: Similar to the proof of Theorem 14.2.2. □

Remark 14.3.2 (a) For the special case B1 = B2 = R, the radius of the convergence ball for method (14.2) is given in [10] by

ρ̄ = s/M,        (14.36)

where s ≈ 0.55279 is a constant and M > 0 is an upper bound for |F′(x∗)^{−1}F″(x)| on the given domain Ω. Using (14.31) we have

‖F′(x∗)^{−1}(F′(x) − F′(y))‖ ≤ 2l‖x − y‖ for any x, y ∈ Ω.        (14.37)

That is, we can choose l = M/2. Setting l0 = l, we have from (14.33) that

ρ = 2/((3 + √2)M) = 2(3 − √2)/(5M) ≈ 0.63432/M > s/M = ρ̄.        (14.38)

Therefore, even in this special case, a bigger radius of convergence ball for method (14.2) has been given in Theorem 14.3.1.

(b) Notice that we have

l0 ≤ l1 and l ≤ l1,        (14.39)
where
‖F′(x∗)^{−1}([x, y; F] − [z, u; F])‖ ≤ l1(‖x − z‖ + ‖y − u‖) for each x, y, z, u ∈ Ω.

The radius given in [6] is

ρ0 = 1/((1 + √2)l1 + 2l0) ≤ ρ.        (14.40)

Moreover, if l < l1, then ρ0 < ρ and the new error bounds (14.34) and (14.35) are tighter than the old ones in [6], which use ρ0 instead of ρ.

14.4 Numerical examples

Example 14.4.1 Let B1 = B2 = R, Ω = (−1, 1) and define F on Ω by

F(x) = e^x − 1.        (14.41)

Then, x∗ = 0 is a solution of (14.41) and F′(x∗) = 1. Note that for any x, y, z, u ∈ Ω, we have

|F′(x∗)^{−1}([x, y; F] − [z, u; F])|
= |∫_0^1 (F′(tx + (1 − t)y) − F′(tz + (1 − t)u)) dt|
= |∫_0^1 ∫_0^1 F″(θ(tx + (1 − t)y) + (1 − θ)(tz + (1 − t)u)) (tx + (1 − t)y − (tz + (1 − t)u)) dθ dt|
= |∫_0^1 ∫_0^1 e^{θ(tx+(1−t)y)+(1−θ)(tz+(1−t)u)} (t(x − z) + (1 − t)(y − u)) dθ dt|
≤ ∫_0^1 e|t(x − z) + (1 − t)(y − u)| dt
≤ (e/2)(|x − z| + |y − u|)        (14.42)

and

|F′(x∗)^{−1}([x, y; F] − [x∗, x∗; F])| = |∫_0^1 F′(tx + (1 − t)y) dt − F′(x∗)|
= |∫_0^1 (e^{tx+(1−t)y} − 1) dt|
= |∫_0^1 (tx + (1 − t)y)(1 + (tx + (1 − t)y)/2! + (tx + (1 − t)y)²/3! + ···) dt|
≤ |∫_0^1 (tx + (1 − t)y)(1 + 1/2! + 1/3! + ···) dt|
≤ ((e − 1)/2)(|x − x∗| + |y − x∗|).        (14.43)
That is to say, the Lipschitz condition (14.31) and the center-Lipschitz condition (14.30) are true for l1 = e/2, l = e^{1/(e−1)}/2 and l0 = (e − 1)/2, respectively. Using (14.33) in Theorem 14.3.1, we can deduce that the radius of the convergence ball for method (14.2) obtained in [6] is

ρ0 = 1/((1 + √2)l1 + 2l0) = 2/((3 + √2)e − 2) ≈ 0.200018471,        (14.44)

which is smaller than the corresponding new radius

ρ = 1/((1 + √2)l + 2l0) ≈ 0.2578325131698342986.        (14.45)

Let us choose x0 = 0.2, y0 = 0.199. Suppose sequences {xn} and {yn} are generated by method (14.2). Table 14.1 gives a comparison of the error estimates, which shows that tighter error estimates are obtained from the new (14.34) or (14.35) using ρ instead of the ρ0 used in [6].

Table 14.1: Comparison of error estimates.

n    new (14.35) using ρ    old (14.35) using ρ0
3    0.0498                 0.0828
4    0.0031                 0.0142
5    2.9778e-06             1.7305e-04
6    1.7108e-13             4.4047e-09
7    5.4309e-31             3.4761e-20

Hence the new results are more precise than the old ones in [6].
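The underlying iteration itself is easy to run; the sketch below applies method (14.2) to F(x) = e^x − 1 from x0 = 0.2, y0 = 0.199 and prints the true errors |xn − x∗|, which the bounds in Table 14.1 majorize:

```python
import math

# Sketch: method (14.2) on F(x) = e^x - 1 with x0 = 0.2, y0 = 0.199 (x* = 0).
F = lambda t: math.exp(t) - 1.0
x, y = 0.2, 0.199
for n in range(1, 7):
    if x == y:
        break                    # iterates have merged to x* numerically
    A = (F(x) - F(y)) / (x - y)  # A_n = [x_n, y_n; F]
    x = x - F(x) / A             # x_{n+1}
    y = x - F(x) / A             # y_{n+1}
    print(n, abs(x))             # true error |x_n - x*|
```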

Example 14.4.2 Let B1 = B2 = C[0, 1], the space of continuous functions defined on [0, 1], equipped with the max norm, and Ω = U(0, 1). Define the function F on Ω by

F(x)(s) = x(s) − 5∫_0^1 s t x(t)³ dt,        (14.46)

and the divided difference of F by

[x, y; F] = ∫_0^1 F′(tx + (1 − t)y) dt.        (14.47)

Then, we have

[F′(x)y](s) = y(s) − 15∫_0^1 s t x(t)² y(t) dt for all y ∈ Ω.        (14.48)

We have x∗(s) = 0 for all s ∈ [0, 1], l0 = 3.75 and l = l1 = 7.5. Using Theorem 14.3.1, we can deduce that the radius of the convergence ball for method (14.2) is given by

ρ0 = ρ = 1/((1 + √2)l + 2l0) ≈ 0.039052429.        (14.49)

Example 14.4.3 Let B1 = B2 = C[0, 1] be equipped with the max norm and Ω = U(0, r) for some r > 1. Define F on Ω by

F(x)(s) = x(s) − y(s) − μ∫_0^1 G(s, t)x(t)³ dt, x ∈ C[0, 1], s ∈ [0, 1],

where y ∈ C[0, 1] is given, μ is a real parameter and the kernel G is the Green's function defined by

G(s, t) = (1 − s)t if t ≤ s, and G(s, t) = s(1 − t) if s ≤ t.

Then, the Fréchet derivative of F is defined by

(F′(x)(w))(s) = w(s) − 3μ∫_0^1 G(s, t)x(t)²w(t) dt, w ∈ C[0, 1], s ∈ [0, 1].

Let us choose x0(s) = y0(s) = y(s) = 1 and |μ| < 8/3. Then, we have that

‖I − A0‖ ≤ (3/8)|μ|, A0^{−1} ∈ L(B2, B1),
‖A0^{−1}‖ ≤ 8/(8 − 3|μ|), s0 = 0, t1 = |μ|/(8 − 3|μ|), L0 = 3(1 + r)|μ|/(2(8 − 3|μ|))

and
L = 3r|μ|/(8 − 3|μ|).

Let us choose r = 3 and μ = 1/2. Then, we have that

t1 = 0.076923077, L0 ≈ 0.461538462, L = L1 ≈ 0.692307692

and

L(t1 + s0)/(1 − L0(t1 + s1 + s0)) ≈ 0.057441746, α ≈ 0.711345739,
1 − 2L0t1/(1 − L0s0) ≈ 0.928994083.

That is, condition (14.5) is satisfied and Theorem 14.2.2 applies.
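These figures are easy to verify numerically (a sketch; numpy.roots extracts the root α of (14.4) in (0, 1)):

```python
import numpy as np

# Sketch verifying the numbers of Example 14.4.3 for r = 3, mu = 1/2.
r, mu = 3.0, 0.5
t1 = mu / (8 - 3 * mu)                       # 0.076923...
L0 = 3 * (1 + r) * mu / (2 * (8 - 3 * mu))   # 0.461538...
L = 3 * r * mu / (8 - 3 * mu)                # 0.692307...
s0 = 0.0
s1 = t1 + L * (t1 + s0) * t1                 # from (14.6)

# alpha: the only root of p(t) = L0 t^3 + L0 t^2 + 2L t - 2L in (0, 1):
alpha = [z.real for z in np.roots([L0, L0, 2 * L, -2 * L])
         if abs(z.imag) < 1e-12 and 0 < z.real < 1][0]

lhs = L * (t1 + s0) / (1 - L0 * (t1 + s1 + s0))
rhs = 1 - 2 * L0 * t1 / (1 - L0 * s0)
print(lhs, alpha, rhs)  # 0.05744... <= 0.71134... <= 0.92899..., so (14.5) holds
```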
References

[1] S. Amat, S. Busquier and M. Negra, Adaptive approximation of nonlinear


operators, Numer. Funct. Anal. Optim., 25, 397–405, 2004.
[2] I. K. Argyros, Computational theory of iterative methods, Series: Studies
in Computational Mathematics 15, Editors, C. K. Chui and L. Wuytack,
Elsevier Publ. Co., New York, USA, 2007.
[3] I. K. Argyros, A semilocal convergence analysis for directional Newton
methods, Math. Comput., 80, 327–343, 2011.
[4] I. K. Argyros and S. Hilout, Computational methods in nonlinear analysis.
Efficient algorithms, fixed point theory and applications, World Scientific,
2013.
[5] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method, J. Complexity, 28, 364–387, 2012.
[6] I. K. Argyros and H. Ren, On the convergence of efficient King-Werner-type methods of order 1 + √2 free of derivatives, Appl. Math. Comput., 217 (2), 612–621, 2010.
[7] L. V. Kantorovich and G. P. Akilov, Functional Analysis, Pergamon Press,
Oxford, 1982.
[8] R. F. King, Tangent methods for nonlinear equations, Numer. Math., 18,
298–304, 1972.
[9] T. J. McDougall and S. J. Wotherspoon, A simple modification of Newton’s method to achieve convergence of order 1 + √2, Appl. Math. Lett., 29, 20–25, 2014.
[10] H. Ren, Q. Wu and W. Bi, On convergence of a new secant like method for
solving nonlinear equations, Appl. Math. Comput., 217, 583–589, 2010.

[11] W. C. Rheinboldt, An adaptive continuation process for solving systems


of nonlinear equations, Polish Academy of Science, Banach Ctr. Publ., 3,
129–142, 1977.
[12] J. F. Traub, Iterative Methods for the Solution of Equations, Prentice-Hall, Englewood Cliffs, 1984.

[13] W. Werner, Über ein Verfahren der Ordnung 1 + √2 zur Nullstellenbestimmung, Numer. Math., 32, 333–342, 1979.

[14] W. Werner, Some supplementary results on the 1 + √2 order method for the solution of nonlinear equations, Numer. Math., 38, 383–392, 1982.
Chapter 15

Müller’s method

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
15.2 Convergence ball for method (15.2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
15.3 Numerical examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256

15.1 Introduction
In this chapter we are concerned with approximating a solution of the equation
f(x) = 0,        (15.1)

where f is defined on an open or closed domain D of the real line R. The widely used [4] Müller's method is defined in [2] by

x_{n+1} = x_n − 2C_n/(B_n ± √(B_n² − 4A_nC_n)), n = 0, 1, 2, …, x_{−2}, x_{−1}, x_0 ∈ D,        (15.2)

where

A_n = f[x_n, x_{n−1}, x_{n−2}], B_n = f[x_n, x_{n−1}] + A_n(x_n − x_{n−1}), C_n = f(x_n),        (15.3)
and f[·, ·], f[·, ·, ·] are divided differences of order one and two, respectively (see [3]). The sign in the denominator of (15.2) is chosen so as to give the larger value. It is a derivative-free method and has convergence order 1.839… under reasonable conditions [2]. Xie [5] established a semilocal convergence theorem for the method by bounding the third and fourth derivatives. Bi et al. [6] presented a new semilocal convergence theorem for the method using the γ-condition. Wu et al. [7] gave the convergence ball and error analysis of the method under the hypotheses that the second- and third-order derivatives of the function f are bounded.
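For completeness, (15.2)–(15.3) admit a direct transcription; the sketch below uses complex arithmetic so the square root is always defined and selects the sign that makes the denominator larger in magnitude:

```python
import cmath

# Sketch of Muller's method (15.2)-(15.3), with the divided differences
# f[x, y] = (f(x) - f(y)) / (x - y) and
# f[x, y, z] = (f[x, y] - f[y, z]) / (x - z).
def muller(f, x2, x1, x0, tol=1e-12, max_iter=50):
    # x2, x1, x0 play the roles of x_{n-2}, x_{n-1}, x_n
    for _ in range(max_iter):
        d01 = (f(x0) - f(x1)) / (x0 - x1)
        d12 = (f(x1) - f(x2)) / (x1 - x2)
        A = (d01 - d12) / (x0 - x2)          # A_n
        B = d01 + A * (x0 - x1)              # B_n
        C = f(x0)                            # C_n
        disc = cmath.sqrt(B * B - 4 * A * C)
        denom = B + disc if abs(B + disc) >= abs(B - disc) else B - disc
        x2, x1, x0 = x1, x0, x0 - 2 * C / denom
        if abs(x0 - x1) < tol:
            break
    return x0

print(muller(lambda x: x**3 - 1, 0.5, 0.8, 1.2))  # -> 1.0
```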
In this chapter, we use a weaker condition to provide a new estimate of the radius of the convergence ball of Müller's method. We assume that f is differentiable in D, f′(x∗) ≠ 0, and that the following Lipschitz conditions hold:

| f′(x∗)^{−1}( f[x, y] − f[u, v])| ≤ K(|x − u| + |y − v|) for any x, y, u, v ∈ D        (15.4)

and

| f′(x∗)^{−1}( f[x, x∗] − f[x∗, x∗])| ≤ K∗|x − x∗| for any x ∈ D,        (15.5)

where K > 0 and K∗ > 0 are constants and x∗ ∈ D is a solution of (15.1).

The chapter is organized as follows: In Section 15.2 we present the convergence ball analysis of method (15.2). Finally, some numerical examples are provided in Section 15.3.

15.2 Convergence ball for method (15.2)


Throughout this chapter we will denote U(x, r) as an open ball around x with
radius r.
Next, we have the following result.

Theorem 15.2.1 Suppose x∗ ∈ D is a solution of equation (15.1), f′(x∗) ≠ 0, and conditions (15.4) and (15.5) are satisfied. Denote

R∗ = 1/(5K + K∗ + 2√(4K² + 2KK∗)).        (15.6)

Assume
U(x∗, R∗) ⊆ D.        (15.7)

Then, the sequence {xn} generated by Müller's method (15.2) starting from any three distinct points x_{−2}, x_{−1}, x_0 ∈ U(x∗, R∗) is well defined and converges to x∗. Moreover, the following estimates hold:

|x∗ − x_{n+1}| ≤ (C/R∗)^{F_{n+1} − 1}|x∗ − x_0|^{F_n}|x∗ − x_{−1}|^{F_{n−1}} for any n = 0, 1, 2, …,        (15.8)

where
C = 2K/√(4K² + 2KK∗),

and {F_n} is the Fibonacci sequence defined by F_{−1} = 1, F_0 = 1 and F_{n+1} = F_n + F_{n−1} for any n = 0, 1, ….
Proof: We will use induction. First of all, denote e_n = x∗ − x_n (n = −2, −1, …). Let x_{−2}, x_{−1}, x_0 ∈ U(x∗, R∗) be distinct points. By (15.3) and (15.4), we have

| f′(x∗)^{−1}A_0| = | f′(x∗)^{−1} f[x_0, x_{−1}, x_{−2}]| = | f′(x∗)^{−1}( f[x_0, x_{−2}] − f[x_{−1}, x_{−2}])/(x_0 − x_{−1})| ≤ K.        (15.9)

Now, using x_{−1}, x_0 ∈ U(x∗, R∗), (15.3), (15.5) and (15.9), we get

|1 − f′(x∗)^{−1}B_0| = |1 − f′(x∗)^{−1}( f[x_0, x_{−1}] + A_0(e_{−1} − e_0))|
= | f′(x∗)^{−1}( f[x∗, x∗] − f[x∗, x_0] + f[x_0, x∗] − f[x_0, x_{−1}] + A_0(e_0 − e_{−1}))|
≤ K∗|e_0| + K|e_{−1}| + K(|e_0| + |e_{−1}|) = (K + K∗)|e_0| + 2K|e_{−1}|
< (3K + K∗)R∗ = (3K + K∗)/(5K + K∗ + 2√(4K² + 2KK∗)) < 1,        (15.10)

which means

| f′(x∗)^{−1}B_0| > 1 − (3K + K∗)R∗ = (2K + 2√(4K² + 2KK∗))/(5K + K∗ + 2√(4K² + 2KK∗)) > 0        (15.11)

and

|B_0^{−1} f′(x∗)| < 1/(1 − (3K + K∗)R∗) = (5K + K∗ + 2√(4K² + 2KK∗))/(2K + 2√(4K² + 2KK∗)).        (15.12)

Using x_0 ∈ U(x∗, R∗), f(x∗) = 0, (15.3) and (15.5), we have

| f′(x∗)^{−1}C_0| = | f′(x∗)^{−1}( f(x_0) − f(x∗))| = | f′(x∗)^{−1}( f[x∗, x_0] − f[x∗, x∗] + f[x∗, x∗])e_0|
≤ K∗|e_0|² + |e_0| < K∗R∗² + R∗.        (15.13)

In view of (15.9), (15.11), (15.13) and (15.6), we have

( f′(x∗)^{−1})²B_0² − 4( f′(x∗)^{−1})²A_0C_0 ≥ | f′(x∗)^{−1}B_0|² − 4| f′(x∗)^{−1}A_0| | f′(x∗)^{−1}C_0|
> (1 − (3K + K∗)R∗)² − 4K(K∗R∗² + R∗)
= [(3K + K∗)² − 4KK∗]R∗² − [2(3K + K∗) + 4K]R∗ + 1
= (9K² + 2KK∗ + K∗²)R∗² − 2(5K + K∗)R∗ + 1 = 0.        (15.14)

Therefore, x_1 can be defined. Denote sign(s) = 1 if s ≥ 0 and sign(s) = −1 if s < 0. Now, considering (15.2), we obtain

e_1 = e_0 + 2C_0/(B_0 + sign(B_0)√(B_0² − 4A_0C_0)) = e_0 + (B_0 − sign(B_0)√(B_0² − 4A_0C_0))/(2A_0)
= (2A_0e_0 + B_0 − sign(B_0)√(B_0² − 4A_0C_0))/(2A_0)
= ((2A_0e_0 + B_0)² − B_0² + 4A_0C_0)/(2A_0(2A_0e_0 + B_0 + sign(B_0)√(B_0² − 4A_0C_0)))
= 2(A_0e_0² + B_0e_0 + C_0)/(2A_0e_0 + B_0 + sign(B_0)√(B_0² − 4A_0C_0)),        (15.15)

so we have

|e_1| = 2|A_0e_0² + B_0e_0 + C_0|/|2A_0e_0 + B_0 + sign(B_0)√(B_0² − 4A_0C_0)|.        (15.16)

Next we show that 2A_0e_0 + B_0 has the same sign as B_0. In fact, using (15.9), (15.6) and (15.12), we get

(2A_0e_0 + B_0)/B_0 = 1 + 2 f′(x∗)^{−1}A_0 f′(x∗)B_0^{−1}e_0 ≥ 1 − 2| f′(x∗)^{−1}A_0| | f′(x∗)B_0^{−1}| |e_0|
> 1 − [2K(5K + K∗ + 2√(4K² + 2KK∗))/(2K + 2√(4K² + 2KK∗))]R∗
= √(4K² + 2KK∗)/(K + √(4K² + 2KK∗)) > 0.        (15.17)

Hence, by (15.16), (15.3), (15.9), (15.4), (15.14), (15.11) and (15.17), we obtain

|e_1| = 2|A_0e_0² + B_0e_0 + C_0|/(|2A_0e_0 + B_0| + √(B_0² − 4A_0C_0))
= 2|(A_0e_0 + B_0)e_0 − f[x∗, x_0]e_0|/(|2A_0e_0 + B_0| + √(B_0² − 4A_0C_0))
= 2|A_0e_{−1} + A_0(e_0 − e_{−1}) + f[x_0, x_{−1}] − A_0(e_0 − e_{−1}) − f[x_0, x∗]| |e_0|/(|2A_0e_0 + B_0| + √(B_0² − 4A_0C_0))
= 2| f′(x∗)^{−1}(A_0e_{−1} + f[x_0, x_{−1}] − f[x_0, x∗])| |e_0| / (| f′(x∗)^{−1}B_0| (2A_0e_0 + B_0)/B_0 + √(( f′(x∗)^{−1}B_0)² − 4 f′(x∗)^{−1}A_0 f′(x∗)^{−1}C_0))
≤ 4K|e_{−1}| |e_0| / [((2K + 2√(4K² + 2KK∗))/(5K + K∗ + 2√(4K² + 2KK∗))) (√(4K² + 2KK∗)/(K + √(4K² + 2KK∗)))]
= [2K(5K + K∗ + 2√(4K² + 2KK∗))/√(4K² + 2KK∗)] |e_0| |e_{−1}|.        (15.18)

As a consequence, from the definition of C in this theorem, we obtain

|e_1|/R∗ ≤ [2K/√(4K² + 2KK∗)](|e_0|/R∗)(|e_{−1}|/R∗) = C(|e_0|/R∗)(|e_{−1}|/R∗) < 1,        (15.19)

that is, x_1 ∈ U(x∗, R∗). Using an inductive procedure on n = 0, 1, 2, …, we get x_n ∈ U(x∗, R∗) and

|e_{n+1}|/R∗ ≤ C(|e_n|/R∗)(|e_{n−1}|/R∗) for any n = 0, 1, 2, ….        (15.20)

Denoting
ρ_n = C|e_n|/R∗, n = 0, 1, 2, …,        (15.21)
we obtain
ρ_{n+1} ≤ ρ_nρ_{n−1}, n = 0, 1, 2, ….        (15.22)

Clearly the following relation holds:

ρ_{n+1} ≤ ρ_0^{F_n}ρ_{−1}^{F_{n−1}}, n = 0, 1, 2, ….        (15.23)

Consequently, estimates (15.8) are true. □


Remark 15.2.2
(a) Condition (15.4) used in Theorem 15.2.1 is weaker than the boundedness conditions on the second- and third-order derivatives of f used in [7]. In fact, suppose that f is twice differentiable on D and

| f′(x∗)^{−1} f″(x)| ≤ N for any x ∈ D;        (15.24)

then, for any x, y, u, v ∈ D, we get

| f′(x∗)^{−1}( f[x, y] − f[u, v])| = | f′(x∗)^{−1}∫_0^1 ( f′(tx + (1 − t)y) − f′(tu + (1 − t)v)) dt|
= | f′(x∗)^{−1}∫_0^1 ∫_0^1 f″(s(tx + (1 − t)y) + (1 − s)(tu + (1 − t)v))(t(x − u) + (1 − t)(y − v)) ds dt|
≤ (N/2)(|x − u| + |y − v|),        (15.25)

which shows that condition (15.4) holds for K = N/2.

(b) Notice that in [6] a semilocal convergence result is given under the γ-condition of order two, which is a condition on the third-order derivative of f; however, we do not use any information on f‴ in our theorem.

Remark 15.2.3 It is worth noticing that from (15.4) and (15.5)

K∗ ≤ K        (15.26)

holds in general and K/K∗ can be arbitrarily large [1]. Therefore, the condition K∗ ≤ K in Theorem 15.2.1 involves no loss of generality. If we use only condition (15.4) in Theorem 15.2.1, we can establish a similar theorem by replacing (15.6) and (15.8) by

R̄∗ = 1/(2(3 + √6)K)        (15.27)

and

|x∗ − x_{n+1}| ≤ ((4 + 2√6)K)^{F_{n+1} − 1}|x∗ − x_0|^{F_n}|x∗ − x_{−1}|^{F_{n−1}} for any n = 0, 1, 2, …,        (15.28)

respectively. On the other hand, if (15.26) holds strictly, it is clear that the following inequalities are true:

R̄∗ < R∗        (15.29)
and
(4 + 2√6)K < C/R∗ = 2K(5K + K∗)/√(4K² + 2KK∗) + 4K,        (15.30)

which means that we have a bigger radius of the convergence ball of Müller's method and tighter errors by using conditions (15.4) and (15.5) simultaneously instead of using only condition (15.4).
15.3 Numerical examples

Example 15.3.1 Let f be defined on D = [−5/2, 1/2] by

f(x) = x³ ln x² + x⁵ − x⁴ + 4x if x ≠ 0, and f(0) = 0.        (15.31)

Then, we have

f′(x) = 3x² ln x² + 5x⁴ − 4x³ + 2x² + 4 if x ≠ 0, and f′(0) = 4,        (15.32)
f″(x) = 6x ln x² + 20x³ − 12x² + 10x if x ≠ 0, and f″(0) = 0,        (15.33)
and
f‴(x) = 6 ln x² + 60x² − 24x + 22 if x ≠ 0, with f‴(x) → −∞ as x → 0,        (15.34)

which means f″ is continuous on D while f‴ is unbounded on D. Then, Theorem 15.2.1 applies; however, the theorem in [7] cannot apply.
Example 15.3.2 Let f be defined on D = [−1, 1] by

f(x) = e^x − 1.        (15.35)

Then, we have

f′(x) = e^x, f″(x) = e^x, f‴(x) = e^x.        (15.36)

It is obvious that x∗ = 0, f′(x∗) = 1 and for any x, y, u, v ∈ D, we have

| f′(x∗)^{−1}( f[x, y] − f[u, v])| = |∫_0^1 ( f′(tx + (1 − t)y) − f′(tu + (1 − t)v)) dt|
= |∫_0^1 (e^{tx+(1−t)y} − e^{tu+(1−t)v}) dt|
= |∫_0^1 ∫_0^1 e^{s(tx+(1−t)y)+(1−s)(tu+(1−t)v)} ds (t(x − u) + (1 − t)(y − v)) dt|
≤ (e/2)(|x − u| + |y − v|).        (15.37)

Moreover, for any x ∈ D, we have

| f′(x∗)^{−1}( f[x, x∗] − f[x∗, x∗])| = |∫_0^1 ( f′(tx) − f′(0)) dt|
= |∫_0^1 (e^{tx} − 1) dt| = |∫_0^1 tx(1 + tx/2! + (tx)²/3! + …) dt|
≤ ∫_0^1 t|x|(1 + 1/2! + 1/3! + …) dt = ((e − 1)/2)|x − x∗|.        (15.38)

Therefore, we can choose K = e/2 and K∗ = (e − 1)/2 in (15.4) and (15.5), respectively. By (15.6), we have R∗ ≈ 0.0720 and (15.7) is true. Therefore, all conditions of Theorem 15.2.1 hold, so it applies. Moreover, if we use only condition (15.4), we have only R̄∗ ≈ 0.0675. Note also that we can choose the constants N (the upper bound for | f′(x∗)^{−1} f″(x)| on D) and M (the upper bound for | f′(x∗)^{−1} f‴(x)| on D) of Theorem 1 in [7] as N = M = e. However, the other condition 1215N² ≤ 32M of Theorem 1 in [7] is not satisfied, and so Theorem 1 in [7] cannot apply.
Example 15.3.3 Let f be defined on D = [−1, 1] by

f(x) = x²/2 + 3x if 0 ≤ x ≤ 1, and f(x) = −x²/2 + 3x if −1 ≤ x < 0.        (15.39)

Then, we have x∗ = 0,

f′(x) = |x| + 3 (that is, f′(x) = x + 3 for 0 ≤ x ≤ 1 and f′(x) = −x + 3 for −1 ≤ x < 0)        (15.40)

and

f″(x) = 1 for 0 < x ≤ 1, f″(x) = −1 for −1 ≤ x < 0, while f″(0) does not exist.        (15.41)

Notice that, for any x, y, u, v ∈ D, we have

| f′(x∗)^{−1}( f[x, y] − f[u, v])| = (1/3)|∫_0^1 ( f′(tx + (1 − t)y) − f′(tu + (1 − t)v)) dt|
= (1/3)|∫_0^1 (|tx + (1 − t)y| − |tu + (1 − t)v|) dt|
≤ (1/3)∫_0^1 ||tx + (1 − t)y| − |tu + (1 − t)v|| dt
≤ (1/3)∫_0^1 |tx + (1 − t)y − (tu + (1 − t)v)| dt
≤ (1/3)∫_0^1 (t|x − u| + (1 − t)|y − v|) dt
= (1/6)(|x − u| + |y − v|).        (15.42)

That is, condition (15.4) holds for K = 1/6, but f′ is not differentiable at 0 ∈ D.
References

[1] I. K. Argyros, Computational theory of iterative methods, Series: Studies in Computational Mathematics 15, Editors, C. K. Chui and L. Wuytack, Elsevier Publ. Co., New York, USA, 2007.
[2] D. E. Müller, A method for solving algebraic equations using an automatic
computer, Math. Tables Other Aids Comput., 10, 208–215, 1956.
[3] J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, Springer-
Verlag, New York, 1980.
[4] C. F. Gerald and P. O. Wheatley, Applied Numerical Analysis, Addison-
Wesley, Reading, MA, 1994.
[5] S. Q. Xie, Convergence of Müller’s method for finding roots, Mathematics
in Practice and Theory, 2, 18–25, 1980 (in Chinese).
[6] W. H. Bi, H. M. Ren and Q. B. Wu, A new semilocal convergence Theorem
of Müller’s method, Appl. Math. Comput., 199, 375–384, 2008.
[7] Q. B. Wu, H. M. Ren and W. H. Bi, Convergence ball and error analysis of
Müller’s method, Appl. Math. Comput., 184, 464–470, 2007.

Chapter 16

Generalized Newton
method with applications

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
16.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
16.3 Semilocal convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261

16.1 Introduction
In this chapter we are interested in approximately solving the generalized equation: Find x∗ ∈ H such that

0 ∈ F(x∗) + T(x∗),        (16.1)

where F : H −→ H is a Fréchet differentiable function, H is a Hilbert space and T : H ⇒ H is a set-valued maximal monotone operator. The importance of studying problem (16.1) and its applications in the physical and engineering sciences and many other areas can be found in [1]–[33].

The generalized Newton method for equation (16.1) is defined as

0 ∈ F(x_n) + F′(x_n)(x_{n+1} − x_n) + T(x_{n+1}), n = 0, 1, 2, ….        (16.2)
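To make (16.2) concrete in the simplest setting, take H = R and T the normal cone operator of [0, ∞); then (16.1) becomes the scalar complementarity problem x ≥ 0, F(x) ≥ 0, xF(x) = 0, and, when F′(x_n) > 0, the subproblem (16.2) has the closed-form projected-Newton solution sketched below (our illustration, not taken from the chapter):

```python
# Sketch: generalized Newton method (16.2) for H = R and T the normal cone
# of [0, inf), i.e. the complementarity problem x >= 0, F(x) >= 0, x F(x) = 0.
def generalized_newton(F, dF, x0, n_iter=25):
    x = x0
    for _ in range(n_iter):
        x = max(0.0, x - F(x) / dF(x))  # exact solution of (16.2) in this case
    return x

print(generalized_newton(lambda x: x * x - 4, lambda x: 2 * x, x0=3.0))  # -> 2.0
print(generalized_newton(lambda x: x + 1.0, lambda x: 1.0, x0=0.5))      # -> 0.0
```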
Using the classical Lipschitz condition, Uko [26, 27] established the convergence of (16.2). The convergence of (16.2) has also been studied by many other authors under various conditions (see Robinson [17], Josephy [12]; also see [18, 28]). The convergence of the generalized Newton method (16.2) was shown in [29, 33] using Lipschitz continuity conditions on F′. However, there are problems where the Lipschitz continuity of F′ does not hold. Motivated by these constraints, we present a convergence analysis of the generalized Newton method (16.2) using generalized continuity conditions on F′. Our results are weaker even if we specialize the conditions on F′ to the condition given in [23].
The rest of the chapter is organized as follows: In Section 16.2 we present
some needed preliminaries. Then, in Section 16.3 the convergence of generalized
Newton method (16.2) is presented.

16.2 Preliminaries
Let x ∈ H and r > 0. Throughout this chapter we will use U(x, r) and Ū(x, r) to denote the open and closed metric balls, respectively, at x with radius r:

U(x, r) := {y ∈ H : ‖x − y‖ < r} and Ū(x, r) := {y ∈ H : ‖x − y‖ ≤ r}.

Recall that a bounded linear operator G : H → H is called a positive operator if G is self-conjugate and ⟨Gx, x⟩ ≥ 0 for each x ∈ H (cf. [22], p. 313). We present the following lemma taken from [29].

Lemma 16.2.1 Let G be a positive operator. Then the following conditions hold:

(a) ‖G²‖ = ‖G‖²;
(b) if G^{−1} exists, then G^{−1} is also a positive operator and

⟨Gx, x⟩ ≥ ‖x‖²/‖G^{−1}‖ for each x ∈ H.        (16.3)

Now, let T : H ⇒ H be a set-valued operator. The domain dom T of T is defined as dom T := {x ∈ H : T(x) ≠ ∅}. Next, let us recall notions of monotonicity for set-valued operators taken from [32].

Definition 16.2.2 Let T : H ⇒ H be a set-valued operator. T is said to be

(a) monotone if

⟨u − v, y − x⟩ ≥ 0 for each u ∈ T(y) and v ∈ T(x);        (16.4)

(b) maximal monotone if it is monotone and the following implication holds:

⟨u − v, x − y⟩ ≥ 0 for each y ∈ dom T and v ∈ T(y) ⇒ x ∈ dom T and u ∈ T(x).        (16.5)

Let G : H −→ H be a bounded linear operator. Then Ĝ := (1/2)(G + G∗), where G∗ is the adjoint of G. Hereafter, we assume that T : H ⇒ H is a set-valued maximal monotone operator and F : H −→ H is a Fréchet differentiable operator.

16.3 Semilocal convergence

First, we begin with some needed definitions.

Definition 16.3.1 Let R > 0 and x̄ ∈ H be such that F̂′(x̄)^{−1} exists. Then, the operator F̂′(x̄)^{−1}F′ is said to satisfy the center-Lipschitz condition with L0-average at x̄ on U(x̄, R) if

‖F̂′(x̄)^{−1}(F′(x) − F′(x̄))‖ ≤ ∫_0^{‖x−x̄‖} L0(u) du

for each x ∈ U(x̄, R), where L0 is a non-negative non-decreasing integrable function on the interval [0, R] satisfying ∫_0^{R0} L0(u) du = 1 for some R0 ≥ 0.

Definition 16.3.2 Let R > 0 and x̄ ∈ H be such that F̂′(x̄)^{−1} exists. Then, the operator F̂′(x̄)^{−1}F′ is said to satisfy the center-Lipschitz condition in the inscribed sphere with (L0, L)-average at x̄ on U(x̄, R̄), R̄ = min{R0, R}, if

‖F̂′(x̄)^{−1}(F′(y) − F′(x))‖ ≤ ∫_{‖x−x̄‖}^{‖x−x̄‖+‖y−x‖} L(u) du

for each x, y ∈ U(x̄, R0) with ‖x − x̄‖ + ‖y − x‖ < R̄, where L is a non-negative non-decreasing integrable function on the interval [0, R̄].

Definition 16.3.3 Let R > 0 and x̄ ∈ H be such that F̂′(x̄)^{−1} exists. Then, the operator F̂′(x̄)^{−1}F′ is said to satisfy the center-Lipschitz condition in the inscribed sphere with L1-average at x̄ on U(x̄, R) if

‖F̂′(x̄)^{−1}(F′(y) − F′(x))‖ ≤ ∫_{‖x−x̄‖}^{‖x−x̄‖+‖y−x‖} L1(u) du

for each x, y ∈ U(x̄, R) with ‖x − x̄‖ + ‖y − x‖ < R, where L1 is a non-negative non-decreasing integrable function on the interval [0, R].

The convergence analysis of the generalized Newton method (16.2) in [33] was based on Definition 16.3.3. However, we have that

L0(u) ≤ L1(u)        (16.6)
and
L(u) ≤ L1(u)        (16.7)

for each u ∈ [0, R̄], since R̄ ≤ R. Hence, if the functions L0 and L are used instead of L1, we obtain a more precise convergence analysis, which is the main objective of the present chapter. It is worth noticing that in practice the computation of the function L1 requires, as special cases, the computation of the functions L0 and L.

Now, let β > 0. Define the scalar majorizing function h1 by

h1(t) = β − t + ∫_0^t L1(u)(t − u) du for each t ∈ [0, R]        (16.8)

and the majorizing sequence {tn} by t0 = 0, t_{n+1} = t_n − h1′(t_n)^{−1}h1(t_n). The sequence {tn} can also be written as t0 = 0, t1 = β,

t_{n+1} = t_n + [∫_0^1 ∫_{t_{n−1}}^{t_{n−1}+θ(t_n−t_{n−1})} L1(u) du (t_n − t_{n−1}) dθ] / (1 − ∫_0^{t_n} L1(u) du).        (16.9)

The sequence {tn} was used in [33]. However, in this chapter we will use the sequence {sn} defined by s0 = 0, s1 = β,

s_{n+1} = s_n + [∫_0^1 ∫_{s_{n−1}}^{s_{n−1}+θ(s_n−s_{n−1})} L(u) du (s_n − s_{n−1}) dθ] / (1 − ∫_0^{s_n} L0(u) du).        (16.10)

We need an auxiliary result where we compare the sequence {tn} to {sn}. Let R1 > 0 and b > 0 be such that

∫_0^{R1} L1(u) du = 1 and b = ∫_0^{R1} L1(u)u du.        (16.11)

Lemma 16.3.4 The following items hold.

(a) The function h1 is monotonically decreasing on [0, R1] and monotonically increasing on [R1, R]. Suppose that

β ≤ b.        (16.12)

Then, h1 has a unique zero in each of the intervals [0, R1] and [R1, R], denoted by r1 and r2, respectively, which satisfy

β < r1 < (R1/b)β < R1 < r2 < R.        (16.13)

Moreover, r1 < r2 if β < b and r1 = r2 if β = b.

(b) Under hypothesis (16.12), the sequences {tn} and {sn} are non-decreasing and converge to r1 and s∗ = lim_{n→∞} sn, respectively, so that

sn ≤ tn ≤ r1,        (16.14)
0 ≤ s_{n+1} − sn ≤ t_{n+1} − tn        (16.15)
and
s∗ ≤ r1.        (16.16)

Proof: The proof of part (a), and the fact that the sequence {tn} non-decreasingly converges to r1, can be found in [29, 33]. Using simple induction, (16.6), (16.7), (16.9) and (16.10), estimates (16.14) and (16.15) are obtained. Therefore, the sequence {sn} is non-decreasing and bounded above by r1, and as such it converges to s∗, which satisfies (16.16). □
We also need the following lemma.

Lemma 16.3.5 Let r < R0. Let also x̄ ∈ H be such that F̂′(x̄) is a positive operator and F̂′(x̄)^{−1} exists. Suppose that F̂′(x̄)^{−1}F′ satisfies the center-Lipschitz condition with L0-average at x̄ on U(x̄, r). Then, for each x ∈ U(x̄, r), F̂′(x) is a positive operator, F̂′(x)^{−1} exists and

‖F̂′(x)^{−1}‖ ≤ ‖F̂′(x̄)^{−1}‖ / (1 − ∫_0^{‖x−x̄‖} L0(u) du).        (16.17)

Proof: Simply use L0 instead of L1 in the proof of the corresponding result in [29, 33]. □

Remark 16.3.6 The estimate corresponding to (16.17) in [29, 33] is

‖F̂′(x)^{−1}‖ ≤ ‖F̂′(x̄)^{−1}‖ / (1 − ∫_0^{‖x−x̄‖} L1(u) du).        (16.18)

In view of (16.6), (16.17) and (16.18), estimate (16.17) is more precise than (16.18) if L0(u) < L1(u).

Next, we present the main semilocal result of the chapter.

Theorem 16.3.7 Suppose that there exists x0 ∈ D such that F̂′(x0)^{−1} exists, (16.12) holds, F̂′(x0)^{−1}F′ satisfies the center-Lipschitz condition with L0-average at x0 on U(x0, s∗) and the center-Lipschitz condition in the inscribed sphere with (L0, L)-average at x0 on U(x0, s∗), and F̂′(x0) is a positive operator. Then, the sequence {xn} generated with initial point x0 by the generalized Newton method (16.2), provided that

‖x1 − x0‖ ≤ β,        (16.19)

is well defined in U(x0, s∗), remains in U(x0, s∗) for each n = 0, 1, 2, · · · and converges to a solution x∗ of (16.1) in Ū(x0, s∗). Moreover, the following estimates hold:

‖x∗ − xn‖ ≤ s∗ − sn for each n = 0, 1, 2, · · ·        (16.20)
Proof: We shall show by mathematical induction that the sequence {xn} is well defined and satisfies

‖x_{k+1} − x_k‖ ≤ s_{k+1} − s_k for each k = 0, 1, 2, · · ·        (16.21)

Estimate (16.21) holds for k = 0 by (16.19). Suppose that (16.21) holds for n = 0, 1, 2, · · · , k − 1. We shall show that (16.21) holds for n = k. Notice that

‖x_k − x_0‖ ≤ ‖x_k − x_{k−1}‖ + · · · + ‖x_1 − x_0‖ ≤ s_k − s_{k−1} + · · · + s_1 − s_0 = s_k − s_0 < s∗.        (16.22)

By (16.22) and Lemma 16.3.5, F̂′(x_k) is a positive operator, F̂′(x_k)^{−1} exists and

‖F̂′(x_k)^{−1}‖ ≤ ‖F̂′(x_0)^{−1}‖ / (1 − ∫_0^{‖x_k−x_0‖} L0(u) du).        (16.23)

By Lemma 16.2.1, we have that

‖x‖²/‖F̂′(x_k)^{−1}‖ ≤ ⟨F̂′(x_k)x, x⟩ = ⟨F′(x_k)x, x⟩ for each x ∈ H,        (16.24)

so, by Remark 2.5 of [29, 33],

0 ∈ F(x_k) + F′(x_k)(x_{k+1} − x_k) + T(x_{k+1}).        (16.25)

From the hypotheses we obtain

0 ∈ F(x_{k−1}) + F′(x_{k−1})(x_k − x_{k−1}) + T(x_k).        (16.26)

Using (16.25), (16.26) and the fact that T is a maximal monotone operator, we get that ⟨−F(x_{k−1}) − F′(x_{k−1})(x_k − x_{k−1}) + F(x_k) + F′(x_k)(x_{k+1} − x_k), x_k − x_{k+1}⟩ ≥ 0, so

⟨F(x_k) − F(x_{k−1}) − F′(x_{k−1})(x_k − x_{k−1}), x_k − x_{k+1}⟩ ≥ ⟨F′(x_k)(x_k − x_{k+1}), x_k − x_{k+1}⟩.        (16.27)

By (16.24), we have that

‖x_k − x_{k+1}‖²/‖F̂′(x_k)^{−1}‖ ≤ ⟨F̂′(x_k)(x_k − x_{k+1}), x_k − x_{k+1}⟩ = ⟨F′(x_k)(x_k − x_{k+1}), x_k − x_{k+1}⟩,

leading, together with (16.27), to

‖x_k − x_{k+1}‖ ≤ ‖F̂′(x_k)^{−1}‖ ‖F(x_k) − F(x_{k−1}) − F′(x_{k−1})(x_k − x_{k−1})‖
≤ ‖F̂′(x_k)^{−1}‖ ‖∫_0^1 (F′(x_{k−1} + θ(x_k − x_{k−1})) − F′(x_{k−1}))(x_k − x_{k−1}) dθ‖
≤ [1/(1 − ∫_0^{s_k−s_0} L0(u) du)] ∫_0^1 ∫_{‖x_{k−1}−x_0‖}^{‖x_{k−1}−x_0‖+θ‖x_k−x_{k−1}‖} L(u) du ‖x_k − x_{k−1}‖ dθ
≤ [1/(1 − ∫_0^{s_k} L0(u) du)] ∫_0^1 ∫_{s_{k−1}}^{s_{k−1}+θ(s_k−s_{k−1})} L(u) du (s_k − s_{k−1}) dθ
= s_{k+1} − s_k.        (16.28)

It follows from (16.28) that the sequence {x_k} is Cauchy in the Hilbert space H and as such it converges to some x∗ ∈ Ū(x_0, s∗) (since Ū(x_0, s∗) is a closed set). Moreover, since T is maximal monotone and F ∈ C¹, we deduce that x∗ solves (16.1). Furthermore, (16.20) follows from (16.21) by using standard majorization techniques [1, 4, 5]. □
Next, we present the following result related to the uniqueness.

Proposition 16.3.8 Suppose that the hypotheses of Theorem 16.3.7 hold, except the center-Lipschitz (L0, L) condition, for x1 ∈ H such that

0 ∈ F(x0) + F′(x0)(x1 − x0) + T(x1).

Then, there exists a unique solution x∗ of (16.1) in U(x0, s∗).

Proof: Simply replace the function L1 by L0 in the proof of Theorem 4.1 in [33]. □

Remark 16.3.9 (a) Notice that the function L1 is not needed in the proof of Theorem 4.1 in [33], since it can be replaced by L0, which is more precise than L1 (see (16.6)), at all stages of the proof. On the other hand, in view of (16.16), we obtain better information on the uniqueness of x∗.

(b) The results obtained in this chapter can be improved even further if in Definition 16.3.2 we consider instead the center-Lipschitz condition in the inscribed sphere with (L0, K)-average at x̄ on U(x̄, R − ‖x1 − x0‖) (or U(x̄, R − β)), given by

‖F̂′(x̄)^{−1}(F′(y) − F′(x))‖ ≤ ∫_{‖x−x̄‖}^{‖x−x̄‖+‖y−x‖} K(u) du

for each x, y ∈ U(x̄, R − ‖x1 − x0‖) (or U(x̄, R − β)). It is worth noticing that

K(u) ≤ L(u) for each u ∈ [0, R̄].        (16.29)

Then, by simply noticing that the iterates {xn} remain in U(x0, R − ‖x1 − x0‖), which is a more precise location than U(x0, R), the function K can replace L in all the preceding results. Furthermore, define the sequence {αn} by α0 = 0, α1 = β,

α_{n+1} = α_n + [∫_0^1 ∫_{α_{n−1}}^{α_{n−1}+θ(α_n−α_{n−1})} K(u) du (α_n − α_{n−1}) dθ] / (1 − ∫_0^{α_n} L0(u) du).        (16.30)

Then, in view of (16.10), (16.29) and (16.30), we have that

α_n ≤ s_n,        (16.31)
0 ≤ α_{n+1} − α_n ≤ s_{n+1} − s_n        (16.32)
and
α∗ = lim_{n→∞} α_n ≤ s∗        (16.33)

hold under hypothesis (16.12).

(c) We have extended the applicability of Newton's method under hypothesis (16.12). At this point one may wonder whether the sequences {αn} and {sn} converge under a hypothesis weaker than (16.12). It turns out that this is indeed the case when L0, L, L1, K are constant functions. It follows from (16.9), (16.10) and (16.30) that the sequences {tn}, {sn} and {αn} then reduce, respectively, to

t0 = 0, t1 = β, t_{n+1} = t_n + L(t_n − t_{n−1})²/(2(1 − Lt_n)),        (16.34)
s0 = 0, s1 = β, s_{n+1} = s_n + L(s_n − s_{n−1})²/(2(1 − L0s_n)),        (16.35)
α0 = 0, α1 = β, α_{n+1} = α_n + K(α_n − α_{n−1})²/(2(1 − L0α_n)).        (16.36)
The sequence {tn} converges provided that the Kantorovich condition [13, 14, 25]

h_K = Lβ ≤ 1/2        (16.37)

is satisfied, whereas {sn} and {αn} converge [6] if

h_1 = L̄1β ≤ 1/2        (16.38)
and
h_2 = L̄2β ≤ 1/2        (16.39)

are satisfied, respectively, where L̄1 = (1/8)(L + 4L0 + √(L² + 8L0L)) and L̄2 = (1/8)(L + 4L0 + √(L² + 8L0K)). Notice that

h_K ≤ 1/2 ⇒ h_1 ≤ 1/2 ⇒ h_2 ≤ 1/2        (16.40)

but not necessarily vice versa, unless L0 = L = K. Examples where L0 < K < L < L1 can be found in [6]. Hence, we have also extended the convergence domain in this case. Similar advantages are obtained in the case of Smale's α-theory [23, 24] or Wang's γ-condition [30], [33].
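The three recurrences (16.34)–(16.36) share one template and are easy to compare numerically; the sketch below uses arbitrary sample values with L0 < K < L and exhibits α∗ ≤ s∗ ≤ t∗, in line with (16.31)–(16.33):

```python
# Sketch: the recurrences (16.34)-(16.36) all have the form
#   x_{n+1} = x_n + num * (x_n - x_{n-1})**2 / (2 * (1 - den * x_n)).
def limit(beta, num, den, n=200):
    prev, cur = 0.0, beta
    for _ in range(n):
        prev, cur = cur, cur + num * (cur - prev) ** 2 / (2 * (1 - den * cur))
    return cur

beta, L0, K, L = 0.4, 0.8, 0.9, 1.0   # arbitrary sample values with L0 < K < L
print(limit(beta, L, L))              # t*,            (16.34)
print(limit(beta, L, L0))             # s*     <= t*,  (16.35)
print(limit(beta, K, L0))             # alpha* <= s*,  (16.36)
```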
References

[1] I. K. Argyros, Concerning the convergence of Newton’s method and


quadratic majorants, J. Appl. Math. Computing, 29, 391–400, 2009.
[2] I. K. Argyros, A Kantorovich-type convergence analysis of the Newton-
Josephy method for solving variational inequalities, Numer. Algorithms, 55,
447–466, 2010.
[3] I. K. Argyros, Variational inequalities problems and fixed point problems,
Computers Math. Appl., 60, 2292–2301, 2010.
[4] I. K. Argyros, Improved local convergence of Newton’s method under weak majorant condition, J. Comput. Appl. Math., 236, 1892–1902, 2012.
[5] I. K. Argyros and Á. A. Magreñán, Local convergence analysis of proxi-
mal Gauss-Newton method for penalized nonlinear least squares problems,
Appl. Math. Comput., 241, 401–408 , 2014.
[6] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method, J. Complexity, 28, 364–387, 2012.
[7] A. I. Dontchev and R. T. Rockafellar, Implicit functions and solution map-
pings, Springer Monographs in Mathematics, Springer, Dordrecht, 2009.
[8] O. Ferreira, A robust semi-local convergence analysis of Newton’s method
for cone inclusion problems in Banach spaces under affine invariant majo-
rant condition, J. Comput. Appl. Math., 279, 318–335, 2015.
[9] O. P. Ferreira, M. L. N. Goncalves and P. R. Oliveria, Convergence of the
Gauss-Newton method for convex composite optimization under a majo-
rant condition, SIAM J. Optim., 23 (3), 1757–1783, 2013.
[10] O. P. Ferreira and G. N. Silva, Inexact Newton’s method to nonlinear func-
tions with values in a cone, arXiv: 1510.01947, 2015.

[11] O. P. Ferreira and B. F. Svaiter, Kantorovich’s majorants principle for New-


ton’s method, Comput. Optim. Appl., 42 (2), 213–229, 2009.
[12] N. Josephy, Newton’s method for generalized equations and the PIES en-
ergy model, University of Wisconsin-Madison, 1979.
[13] L. V. Kantorovic̆, On Newton’s method for functional equations, Doklady
Akad Nauk SSSR (N.S.), 59, 1237–1240, 1948.
[14] L. V. Kantorovic̆ and G. P. Akilov, Functional analysis, Oxford, Pergamon,
1982.
[15] A. Pietrus and C. Jean-Alexis, Newton-secant method for functions with
values in a cone, Serdica Math. J., 39 (3-4), 271–286, 2013.
[16] F. A. Potra, The Kantorovich theorem and interior point methods, Mathem.
Programming, 102 (1), 47–70, 2005.
[17] S. M. Robinson, Strongly regular generalized equations, Math. Oper. Res.,
5 (1), 43–62, 1980.
[18] R. T. Rockafellar, Convex analysis, Princeton Mathematical Series, No. 28,
Princeton University Press, Princeton, N. J., 1970.
[19] B. Royo, J. A. Sicilia, M. J. Oliveros and E. Larrodé, Solving a long-
distance routing problem using ant colony optimization. Appl. Math., 9
(2L), 415–421, 2015.
[20] J. A. Sicilia, C. Quemada, B. Royo and D. Escuı́n, An optimization algo-
rithm for solving the rich vehicle routing problem based on variable neigh-
borhood search and tabu search metaheuristics. J. Comput. Appl. Math.,
291, 468–477, 2016.
[21] J. A. Sicilia, D. Escuı́n, B. Royo, E. Larrodé and J. Medrano, A hybrid
algorithm for solving the general vehicle routing problem in the case of the
urban freight distribution. In Computer-based Modelling and Optimization
in Transportation (pp. 463–475). Springer International Publishing, 2014.
[22] W. Rudin, Functional Analysis, McGraw-Hill, Inc., 1973.
[23] S. Smale, Newton’s method estimates from data at one point. In R. Ewing,
K. Gross and C. Martin, editors, The Merging of Disciplines: New Direc-
tions in pure, Applied and Computational Mathematics, 185–196, Springer
New York, 1986.
[24] S. Smale, Complexity theory and numerical analysis, Acta. Numer., 6, 523–
551, 1997.

[25] J. F. Traub and H. Woźniakowski, Convergence and complexity of Newton


iteration for operator equations, J. Assoc. Comput. Mach., 26 (2), 250–258,
1979.
[26] L. U. Uko and I. K. Argyros, Generalized equation, variational inequalities and a weak Kantorovich theorem, Numer. Algorithms, 52 (3), 321–333, 2009.
[27] L. U. Uko, Generalized equations and the generalized Newton method,
Math. Program., 73, 251–268, 1996.
[28] J. Wang, Convergence ball of Newton’s method for generalized equation
and uniqueness of the solution, J. Nonlinear Convex Anal., 16 (9), 1847–
1859, 2015.
[29] J. H. Wang, Convergence ball of Newton’s method for inclusion problems
and uniqueness of the solution, J. Nonlinear Convex Anal. (To Appear).
[30] X. H. Wang, Convergence of Newton’s method and uniqueness of the so-
lution of equations in Banach space, IMA J. Numer. Anal., 20, 123–134,
2000.
[31] P. P. Zabrejko and D. F. Nguen, The majorant method in the theory of Newton-Kantorovich approximations and the Pták error estimates, Numer. Funct. Anal. Optim., 9 (5-6), 671–684, 1987.
[32] E. Zeidler, Non-linear Functional analysis and Applications IIB, Non-linear
Monotone operators, Springer, Berlin, 1990.
[33] Y. Zhang, J. Wang and S. M. Gau, Convergence criteria of the generalized
Newton method and uniqueness of solution for generalized equations, J.
Nonlinear Convex. Anal., 16 (7), 1485–1499, 2015.
Chapter 17

Newton-secant methods
with values in a cone

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
17.2 Convergence of the Newton-secant method . . . . . . . . . . . . . . . . . . . . . . . . . 273

17.1 Introduction
We study the variational inclusion

0 ∈ F(x) + G(x) + E(x),        (17.1)

where X, Y are Banach spaces, D ⊂ X is an open set, F : D −→ Y is a smooth operator, G : D −→ Y is a continuous operator, [·, ·; G] is a divided difference of order one for G [13] and E : X ⇒ Y is a set-valued operator. Many problems from the applied sciences can be solved by finding solutions of inclusions of the form (17.1) [1]–[28].

In particular, the inclusion

0 ∈ F(x_m) + G(x_m) + (F′(x_m) + [x_{m−1}, x_m; G])(x_{m+1} − x_m) + E(x_{m+1}),        (17.2)

where x0 ∈ D is an initial point, was studied in [19]. If E = {0}, the method was studied by Cătinaş [5]. Then, assuming E(x) = −C for each x ∈ X, with C ⊆ Y a closed convex cone, Pietrus and Jean-Alexis [19] studied the algorithm

minimize{‖x − x_m‖ : F(x_m) + G(x_m) + (F′(x_m) + [x_{m−1}, x_m; G])(x − x_m) ∈ C}.        (17.3)

Next, we define the algorithm for solving (17.1). Let u1, u2 ∈ D. Define a set-valued operator Q(u1, u2) : X −→ Y by

Q(u1, u2)x := (F′(u2) + [u1, u2; G])x − C.        (17.4)

Then, Q(u1, u2) is a normed convex process. Its inverse, defined for each y ∈ Y by

Q^{−1}(u1, u2)y := {z ∈ X : (F′(u2) + [u1, u2; G])z ∈ y + C},

is also a normed convex process. Then, we have the following algorithm [19].

Algorithm Newton-secant-cone(F, G, C, x0, x1, ε)

1. If Q^{−1}(x0, x1)[−F(x1) − G(x1)] = ∅, stop.
2. Do while e > ε:
   (a) Choose x as a solution of the problem
       minimize{‖x − x1‖ : F(x1) + G(x1) + (F′(x1) + [x0, x1; G])(x − x1) ∈ C}.
   (b) Compute e := ‖x − x1‖; set x0 := x1, x1 := x.
3. Return x.
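When C = {0}, the feasibility constraint in step 2(a) becomes an equation, so the step reduces to a linear solve and the algorithm is exactly iteration (17.2) with E ≡ {0}. A scalar sketch of this special case, with F smooth and G merely Lipschitz:

```python
# Sketch of Algorithm Newton-secant-cone for C = {0} (scalar case), where
# step 2(a) amounts to one linear solve per iteration.
def newton_secant(F, dF, G, x0, x1, eps=1e-12, max_iter=50):
    for _ in range(max_iter):
        dd = (G(x1) - G(x0)) / (x1 - x0)          # [x0, x1; G]
        x = x1 - (F(x1) + G(x1)) / (dF(x1) + dd)  # update of (17.2)
        e, x0, x1 = abs(x - x1), x1, x
        if e <= eps:
            break
    return x1

# F smooth, G only Lipschitz (not differentiable at 0):
print(newton_secant(lambda x: x**3 - 1, lambda x: 3 * x**2,
                    lambda x: 0.1 * abs(x), x0=1.2, x1=1.0))  # ~0.9667
```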
Remark 17.1.1 Notice that the continuity of the linear operator F′(x_m), together with C being closed and convex, implies that the feasible set of (17.3) is a closed convex set for all m. Hence, the existence of a feasible point x̄ implies that each solution of (17.3) lies in the intersection of the feasible set of (17.3) with the closed ball of center x_m and radius ‖x̄ − x_m‖. Then, by [6–10], a solution of (17.3) exists, since X is reflexive and the function ‖x − x_m‖ is weakly lower semicontinuous.

The convergence of the Newton-secant method (17.3) was shown in [19] using Lipschitz continuity conditions on F′ and divided differences of order one and two for G. However, there are problems where the Lipschitz continuity of F′ does not hold or the divided differences of order two of G do not exist. Motivated by these constraints, we present a convergence analysis of the Newton-secant method (17.3) using generalized continuity conditions on F′ and hypotheses only on the divided difference of order one for G. Our results are weaker even if we specialize the conditions on F′ to the conditions given in [19]. In this way we expand the applicability of the Newton-secant method (17.3).
The rest of the chapter is organized as follows: In Section 17.2 we present the convergence of the Newton-secant method (17.3).

17.2 Convergence of the Newton-secant method


First of all, we present a needed auxiliary result on majorizing sequences.

Lemma 17.2.1 Let a ≥ 0, b ≥ 0 and c > 0 be given parameters. Let also w : R+ −→ R+, w1 : R+³ −→ R+, w2 : R+⁴ −→ R+ be continuous, nondecreasing functions. Suppose that

a ≤ b ≤ 2a        (17.5)

and that the equation

(t − a)(1 − q(t)) + a − b = 0        (17.6)

has zeros greater than a, where

q(t) = [c/(1 − cp(t))](∫_0^1 w(aθ) dθ + w1(a, a, a))        (17.7)
and
p(t) = w(t − a) + w2(t, t − a, t, t − a).        (17.8)

Denote by t̄ the smallest such zero. Then, the scalar sequence {tn} defined by

t0 = 0, t1 = a, t2 ≥ b,
t_{n+1} = t_n + [c/(1 − cp_n)][∫_0^1 w(θ(t_n − t_{n−1})) dθ + w1(t_n − t_{n−1}, t_n − t_{n−2}, t_{n−1} − t_{n−2})](t_n − t_{n−1}) for each n = 2, 3, …        (17.9)

is well defined, nondecreasing, bounded from above by t∗∗ given by

t∗∗ = (t2 − t1)/(1 − q) + a        (17.10)

and converges to its unique least upper bound t∗, which satisfies

b ≤ t∗ ≤ t∗∗,        (17.11)

where

p_n = w(t_n − a) + w2(t_{n−1}, t_{n−1} − a, t_n, t_{n−1} − a),
q = [c/(1 − cp)](∫_0^1 w(aθ) dθ + w1(a, a, a))
and
p = p(t̄).

Even more, the following estimates hold:

0 ≤ t2 − t1 ≤ t1 − t0,        (17.12)
0 ≤ t_{n+1} − t_n ≤ q(t_n − t_{n−1})        (17.13)
and
0 ≤ t∗ − t_n ≤ [q^{n−1}/(1 − q)](t2 − t1), n = 2, 3, ….        (17.14)
Proof: By (17.5), we have that t0 ≤ t1 ≤ t2 and t2 − t1 ≤ t1 − t0. It then follows from (17.9) and a simple inductive argument that

t0 ≤ t1 ≤ t2 ≤ … ≤ tk ≤ t_{k+1}
and
t_{k+1} − t0 = (t_{k+1} − tk) + (tk − t_{k−1}) + … + (t2 − t1) + (t1 − t0)
≤ (q^{k−1} + … + 1)(t2 − t1) + t1 − t0
≤ (t2 − t1)/(1 − q) + t1 − t0 = t∗∗.

Therefore, the sequence {tk} is nondecreasing, bounded from above by t∗∗ and as such it converges to t∗, which satisfies (17.11). Let m ≥ 1. We can write

t_{n+m} − tn = (t_{n+m} − t_{n+m−1}) + … + (t_{n+1} − tn)
≤ (q^{n+m−2} + … + q^{n−1})(t2 − t1)
= [(1 − q^m)/(1 − q)]q^{n−1}(t2 − t1).

By letting m −→ ∞ in the preceding estimate we show (17.14). □
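The majorizing sequence (17.9) can be tabulated for given data; in the sketch below the θ-integral is approximated by the midpoint rule, and all parameter values are arbitrary samples:

```python
import numpy as np

# Sketch generating the sequence {t_n} of (17.9) for given a, b, c, w, w1, w2.
def t_sequence(a, b, c, w, w1, w2, n_terms=15, nodes=200):
    t = [0.0, a, b]                          # t_0 = 0, t_1 = a, t_2 = b
    theta = (np.arange(nodes) + 0.5) / nodes
    for n in range(2, n_terms):
        p_n = w(t[n] - a) + w2(t[n-1], t[n-1] - a, t[n], t[n-1] - a)
        integral = np.mean(w(theta * (t[n] - t[n-1])))
        step = (c / (1 - c * p_n)) * (
            integral + w1(t[n] - t[n-1], t[n] - t[n-2], t[n-1] - t[n-2])
        ) * (t[n] - t[n-1])
        t.append(t[n] + step)
    return t

# Arbitrary sample data with a <= b <= 2a:
seq = t_sequence(0.1, 0.15, 1.0,
                 w=lambda u: 0.5 * u,
                 w1=lambda u, v, z: 0.2 * u,
                 w2=lambda u, v, z, s: 0.05 * (u + s))
print(seq[-1])  # approximates the limit t*
```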
Now, we present the main convergence result of the chapter.

Theorem 17.2.2 Let D, X, Y, F, G, Q be as defined above. Suppose:

(i) There exist points x0, x1 ∈ D such that Q(x0, x1) carries X onto Y.

(ii) There exist c > 0 and a ≥ 0 such that

‖Q^{−1}(x0, x1)‖ ≤ c
and
‖x1 − x0‖ ≤ a.

(iii) There exist continuous, nondecreasing functions w : R+ −→ R+, w1 : R+³ −→ R+, w2 : R+⁴ −→ R+ such that for each x, y, z ∈ D

‖F′(x) − F′(y)‖ ≤ w(‖x − y‖),
‖[x, y; G] − [z, y; G]‖ ≤ w1(‖x − z‖, ‖x − y‖, ‖z − y‖)
and
‖[x, y; G] − [x0, x1; G]‖ ≤ w2(‖x − x0‖, ‖y − x1‖, ‖x − x1‖, ‖y − x0‖).

(iv) ‖x2 − x1‖ ≤ b − a, where x2 is defined by the algorithm and Remark 17.1.1.

(v) The hypotheses of Lemma 17.2.1 hold.

(vi) Ū(x0, t∗) ⊂ D.

(vii) w(0) = w1(0, 0, 0) = w2(0, 0, 0, 0) = 0.

Then, there exists at least one sequence {xn} generated by method (17.3) which is well defined in U(x0, t∗), remains in U(x0, t∗) for each n = 0, 1, 2, … and converges to some x∗ ∈ Ū(x0, t∗) such that F(x∗) + G(x∗) ∈ C. Moreover, the following estimates hold:

‖xn − x∗‖ ≤ t∗ − tn,        (17.15)

where the sequence {tn} is defined by (17.9) and t∗ = lim_{n−→∞} tn.

Proof: We begin by defining operator Am for each m = 1, 2, . . . as

Am = F(xm+1 ) + G(xm+1 ) − F(x∗ ) − G(x∗ )


−[F(xm+1 ) − F(xm ) − (F  (xm ) + [xm−1 , xm ; G])(xm+1 − xm ).
+G(xm+1 ) − G(xm )]. (17.16)

Then, we obtain $A_m \in C - F(x^*) - G(x^*)$ and by continuity $\lim_{m \longrightarrow \infty} A_m = 0$. It follows that $F(x^*) + G(x^*) \in C$, since $C - F(x^*) - G(x^*)$ is closed. Therefore, the point $x^*$ solves (17.1). Using induction, we shall show that sequence $\{x_m\}$ is well defined and satisfies
$$\|x_{m+1} - x_m\| \le t_{m+1} - t_m \quad \text{for each } m = 0, 1, 2, \ldots, \quad (17.17)$$
where sequence $\{t_m\}$ is defined by (17.9). By the second hypothesis in (ii), (iv) and (17.9), we obtain the existence of $x_2$ which solves (17.3) for $m = 1$, $\|x_1 - x_0\| \le a = t_1 - t_0$ and $\|x_2 - x_1\| \le b - a = t_2 - t_1$, which shows (17.17) for $m = 0, 1$. Suppose that (17.17) holds for $i = 3, \ldots, m$, where $x_1, x_2, \ldots, x_m$ are defined by (17.3). Then, we get that $\|x_i - x_0\| \le t_i - t_0 < t^*$. That is, $x_i \in U(x_0, t^*)$. Now, it is easy to see that, with
$$B_i(x) = -(F'(x_i) - F'(x_1) + [x_{i-1}, x_i; G] - [x_0, x_1; G])x, \quad (17.18)$$
we have
$$Q(x_{i-1}, x_i)x = (Q(x_0, x_1) - B_i)x. \quad (17.19)$$
By (ii), (iii), (17.19), Lemma 17.2.1 and the induction hypotheses we get
$$\|Q^{-1}(x_0, x_1)B_i\| \le c\,[w(\|x_i - x_1\|) + w_2(\|x_{i-1} - x_0\|, \|x_{i-1} - x_1\|, \|x_i - x_0\|, \|x_{i-1} - x_1\|)]$$
$$\le c\,[w(t_i - t_1) + w_2(t_{i-1} - t_0,\, t_{i-1} - t_1,\, t_i - t_0,\, t_{i-1} - t_1)] \le c\,[w(t^* - a) + w_2(t^*,\, t^* - a,\, t^*,\, t^* - a)] < 1. \quad (17.20)$$
It follows from (17.20) and the Banach perturbation Lemma [13] that
$$Q^{-1}(x_{i-1}, x_i) \in L(Y, X) \quad (17.21)$$
and
$$\|Q^{-1}(x_{i-1}, x_i)\| \le \frac{\|Q^{-1}(x_0, x_1)\|}{1 - \|Q^{-1}(x_0, x_1)B_i\|} \le \frac{c}{1 - c\,p_i}. \quad (17.22)$$

The existence of $x_i$ solving (17.3) follows from the fact that $Q(x_{i-1}, x_i) : X \longrightarrow Y$. Next, we must find a solution of
$$F(x_m) + G(x_m) + (F'(x_m) + [x_{m-1}, x_m; G])(x - x_m) \in F(x_{m-1}) + (F'(x_{m-1}) + [x_{m-2}, x_{m-1}; G])(x_m - x_{m-1}) + G(x_{m-1}) + C. \quad (17.23)$$
The right hand side of (17.23) is contained in the cone $C$, since $x_m$ solves (17.3). That is, any $x$ satisfying (17.23) is feasible for (17.3). Using (17.23), we can get $x$ as the solution of
$$x - x_m \in Q^{-1}(x_{m-1}, x_m)\big(-F(x_m) - G(x_m) + F(x_{m-1}) + G(x_{m-1}) + (F'(x_{m-1}) + [x_{m-2}, x_{m-1}; G])(x_m - x_{m-1})\big). \quad (17.24)$$
The right hand side of (17.24) contains an element of least norm, so there exists $\bar{x}$ satisfying (17.23) and (17.24) such that
$$\|\bar{x} - x_m\| \le \|Q^{-1}(x_{m-1}, x_m)\|\,\big\|{-F(x_m)} + F(x_{m-1}) + F'(x_{m-1})(x_m - x_{m-1}) + G(x_{m-1}) - G(x_m) + [x_{m-2}, x_{m-1}; G](x_m - x_{m-1})\big\|. \quad (17.25)$$
In view of (17.9), (iii), (17.22), (17.25) and the induction hypotheses we obtain
$$\|\bar{x} - x_m\| \le \frac{c}{1 - c\,p_m}\left[\int_0^1 w(\theta\|x_m - x_{m-1}\|)\,d\theta\,\|x_m - x_{m-1}\| + \|[x_m, x_{m-1}; G] - [x_{m-2}, x_{m-1}; G]\|\,\|x_m - x_{m-1}\|\right]$$
$$\le \frac{c}{1 - c\,p_m}\left[\int_0^1 w(\theta\|x_m - x_{m-1}\|)\,d\theta + w_1(\|x_m - x_{m-1}\|, \|x_m - x_{m-2}\|, \|x_{m-1} - x_{m-2}\|)\right]\|x_m - x_{m-1}\|$$
$$\le \frac{c}{1 - c\,p_m}\left[\int_0^1 w(\theta(t_m - t_{m-1}))\,d\theta + w_1(t_m - t_{m-1},\, t_m - t_{m-2},\, t_{m-1} - t_{m-2})\right](t_m - t_{m-1}) = t_{m+1} - t_m. \quad (17.26)$$
By Lemma 17.2.1, sequence $\{t_m\}$ is a Cauchy sequence. In view of (17.26), sequence $\{x_m\}$ is a Cauchy sequence in the Banach space $X$, and
$$\|x_{m+1} - x_m\| \le \|\bar{x} - x_m\| \le t_{m+1} - t_m. \qquad \Box$$
Remark 17.2.3 (a) Our results specialize to the corresponding ones in [19]: choose $w(t) = Lt$, $w_1(t,t,t) = L_1$ and $w_2(t,t,t,t) = L_2$. But even in this case, our results are weaker, since we do not use the hypothesis on the divided difference of order two
$$\|[x, y, z; G]\| \le K,$$
but use instead the second hypothesis in (iii), which involves only the divided difference of order one. Moreover, the results in [19] cannot be used to solve the numerical examples at the end of the chapter, since $F'$ is not Lipschitz continuous. However, our results can apply to solve inclusion problems. Hence, we have expanded the applicability of Newton-secant method (17.3).

(b) Our results can be improved even further as follows: Let $r_0$ be defined as the smallest positive zero of equation
$$w_2(t,t,t,t) = 1.$$
Define $D_0 = D \cap U(x_0, r_0)$. Suppose that the first and second hypotheses in (iii) are replaced by
$$\|F'(x) - F'(y)\| \le \bar{w}(\|x - y\|)$$
and
$$\|[x, y; G] - [z, y; G]\| \le \bar{w}_1(\|x - z\|, \|x - y\|, \|z - y\|)$$
for each $x, y, z \in D_0$. Denote the new conditions by (iii)'. Then, clearly condition (iii)' can replace (iii) in Theorem 17.2.2, since the iterates $\{x_m\}$ lie in $D_0$, which is a more accurate location than $D$. Moreover, since $D_0 \subseteq D$, we have that
$$\bar{w}(t) \le w(t) \qquad \text{and} \qquad \bar{w}_1(t,t,t) \le w_1(t,t,t)$$
for each $t \in [0, r_0)$. Notice that the definition of functions $\bar{w}$ and $\bar{w}_1$ depends on $w$ and $r_0$. Another way of extending our results is to consider, instead of $D_0$, the set $D_1 = D \cap U(x_1, r_0 - \|x_1 - x_0\|)$. Then, the corresponding functions $\bar{\bar{w}}$ and $\bar{\bar{w}}_1$ will be at least as small as $\bar{w}$ and $\bar{w}_1$, respectively, since $D_1 \subseteq D_0$. Notice that the construction of the "$w$" functions is based on the initial data $F, G, C, x_0, x_1$.
References

[1] I. K. Argyros, S. George and Á. A. Magreñán, Local convergence for multi-
point-parametric Chebyshev-Halley-type methods of high convergence or-
der, J. Comput. Appl. Math., 282, 215–224, 2015.
[2] I. K. Argyros and S. George, Ball comparison between two optimal eight-
order methods under weak conditions, SeMA, 73 (1), 1–1, 2015.
[3] J. P. Aubin, Lipschitz behaviour of solutions to convex minimization prob-
lems, Math. Oper. Res., 9, 87–111, 1984.
[4] J. P. Aubin and H. Frankowska, Set-Valued Analysis, Systems Control
Found. Appl. Boston, MA, Birkhauser Boston, Inc.,1990.
[5] E. Catinas, On some iterative methods for solving nonlinear equations, Rev. Anal. Numér. Théor. Approx., 23 (1), 47–53, 1994.
[6] A. L. Dontchev and W. W. Hager, An inverse mapping theorem for set
valued maps, Proc. Amer. Math. Soc., 121, 2, 481–489, 1994.
[7] A. L. Dontchev, Local convergence of the Newton method for generalized equations, C. R. Acad. Sci., Paris, Ser. I, 322 (4), 327–331, 1996.
[8] A. L. Dontchev, Uniform convergence of the Newton method for Aubin
continuous maps, Serdica Math. J., 22, 3, 385–398, 1996.
[9] A. L. Dontchev, M. Quincampoix and N. Zlateva, Aubin Criterion for Met-
ric Regularity, J. Convex Anal., 13, 2, 281–297 (2006).
[10] A. L. Dontchev and R. T. Rockafellar, Implicit functions and solution map-
pings. Springer Monographs in Mathematics. New York, Springer (2009).
[11] M. H. Geoffroy and A. Pietrus, Local convergence of some iterative meth-
ods for generalized equations, J. Math. Anal Appl., 290, 2, 497–505, 2004.

[12] M. Grau-Sanchez, M. Noguera and S. Amat, On the approximation of


derivatives using divided difference operators preserving the local conver-
gence order of iterative methods, J. Comput. Appl. Math., 237, 1, 363–372,
2013.
[13] L. V. Kantorovich, On Newton’s method, Tr. Mat. Inst Steklova, 28, 104–
144, 1949.
[14] E. S. Levitin and B. T. Polyak, Constrained minimization methods. Zh. Vy-
chisl. Mat. Mat. Fiz., 6, 5, 787–823, 1966 (in Russian).
[15] A. S. Lewis, Ill-conditioned convex processes and Conic linear systems,
Math. Oper. Res., 24, 4, 829–834, 1999.
[16] A. S. Lewis, Ill-Conditioned Inclusions, Set-Valued Anal., 9, 4, 375–381,
2001.
[17] B. S. Mordukhovich, Complete characterization of openness, metric regularity and Lipschitzian properties of multifunctions, Trans. Amer. Math. Soc.,
340, 1, 1–35, 1993.
[18] B. S. Mordukhovich, Variational analysis and generalized differentiation
I: Basic theory. Grundlehren der Mathematischen Wissenschaften [Funda-
mental Principles of Mathematical Sciences] Vol. 330, Berlin, Springer,
2006.
[19] A. Pietrus and C. J. Alexis, Newton-secant method for functions with values
in a cone, Serdica Math., 39, 271–296, 2013.
[20] W. C. Rheinboldt, A unified convergence theory for a class of iterative pro-
cesses, SIAM J. Numer. Anal., 5, 1, 42–63, 1968.
[21] S. M. Robinson, Normed convex processes, Trans. Amer. Math. Soc., 174,
127–140, 1972.
[22] S. M. Robinson, Extension of Newton’s method to non linear functions
with values in a cone, Numer. Math., 19, 341–347, 1972.
[23] S. M. Robinson, Generalized equations and their solutions, part I: basic
theory, Math. Program. Study, 10, 128–141, 1979.
[24] S. M. Robinson. Generalized equations and their solutions, part II: applica-
tion to nonlinear programming, Math. Program. Study, 19, 200–221, 1982.
[25] R. T. Rockafellar, Lipschitzian properties of multifunctions, Nonlinear
Anal., Theory Methods Appl., 9, 867–885, 1985.
[26] R. T. Rockafellar and R. J. B. Wets, Variational analysis. A Series of Com-
prehensives Studies in Mathematics Vol. 317, Berlin, Springer, 1998.
[27] R. T. Rockafellar, Monotone processes of convex and concave type, Mem. Am. Math. Soc., 77, 1967.
[28] R. T. Rockafellar, Convex Analysis. Princeton, Princeton University Press,
XVIII, 1970.
Chapter 18

Gauss-Newton method
with applications to
convex optimization

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
18.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
18.2 Gauss-Newton Algorithm and Quasi-Regularity condition . . . . . . . . . . . 283
18.2.1 Gauss-Newton Algorithm GNA . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
18.2.2 Quasi Regularity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
18.3 Semilocal convergence for GNA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
18.4 Specializations and numerical examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291

18.1 Introduction
In this chapter we will study the convex composite optimizations problem.
We present a convergence analysis based on the majorant function presented

in [17] of the Gauss-Newton method (defined by Algorithm (GNA) in Sec. 18.2). We will use the same formulation using the majorant function provided in [23]. In [3, 5, 8], a convergence study in a Banach space setting was shown for (GNM) defined by
$$x_{k+1} = x_k - \left(F'(x_k)^+ F'(x_k)\right)^{-1} F'(x_k)^+ F(x_k) \quad \text{for each } k = 0, 1, 2, \ldots,$$
where $x_0$ is an initial point and $F'(x)^+$ is the Moore-Penrose inverse [11–13, 19, 26] of operator $F'(x)$, with $F : \mathbb{R}^n \to \mathbb{R}^m$ being continuously differentiable. In [23], a semilocal convergence analysis using a combination of a majorant and a center majorant function was given with the advantages (A):

• tighter error estimates on the distances involved and

• information on the location of the solution that is at least as precise.

These advantages were obtained, under the same computational cost, using the same or weaker sufficient convergence hypotheses. In this chapter, we extend the same advantages (A) to GNA.

The chapter is organized as follows: In Section 18.2 the definition of GNA is presented. In Section 18.3, we provide the semilocal convergence analysis of GNA. Finally, some numerical examples and applications are provided in the concluding Section 18.4.

18.2 Gauss-Newton Algorithm and Quasi-Regularity condition

18.2.1 Gauss-Newton Algorithm GNA
From the idea of restricted convergence domains, we analyze the convex composite optimization problem defined as follows:
$$\min_{x \in \mathbb{R}^n} p(x) := h(F(x)), \quad (18.1)$$
where $h : \mathbb{R}^m \longrightarrow \mathbb{R}$ is convex, $F : \mathbb{R}^n \longrightarrow \mathbb{R}^m$ is a Fréchet-differentiable operator and $m, n \in \mathbb{N}$. The importance of (18.1) has been shown in [2, 10, 12, 19, 21–23, 25, 27]. We assume that the minimum $h_{\min}$ of the function $h$ is attained. Problem (18.1) is related to
$$F(x) \in C, \quad (18.2)$$
where
$$C = \operatorname{argmin} h \quad (18.3)$$
is the set of all minimum points of $h$.

Now, let $\rho \in [1, \infty)$, $\Delta \in (0, \infty]$ and for each $x \in \mathbb{R}^n$, define $D_\Delta(x)$ by
$$D_\Delta(x) = \{d \in \mathbb{R}^n : \|d\| \le \Delta,\; h(F(x) + F'(x)d) \le h(F(x) + F'(x)d') \text{ for all } d' \in \mathbb{R}^n \text{ with } \|d'\| \le \Delta\}. \quad (18.4)$$
Let $x_0 \in \mathbb{R}^n$ be an initial point. The GNA associated with $(\rho, \Delta, x_0)$ as defined in [12, 17] is:

Algorithm GNA: $(\rho, \Delta, x_0)$

INITIALIZATION. Take $\rho \in [1, \infty)$, $\Delta \in (0, \infty]$ and $x_0 \in \mathbb{R}^n$; set $k = 0$.

STOP CRITERION. Compute $D_\Delta(x_k)$. If $0 \in D_\Delta(x_k)$, STOP. Otherwise, go to the ITERATIVE STEP.

ITERATIVE STEP. Compute $d_k$ satisfying $d_k \in D_\Delta(x_k)$, $\|d_k\| \le \rho\, d(0, D_\Delta(x_k))$. Then, set $x_{k+1} = x_k + d_k$, $k = k + 1$ and GO TO STOP CRITERION.

Here, $d(x, W)$ denotes the distance from $x$ to $W$ in the finite dimensional Banach space containing $W$. It is worth noticing that the set $D_\Delta(x)$ ($x \in \mathbb{R}^n$) is nonempty and is the solution set of the following convex optimization problem
$$\min_{d \in \mathbb{R}^n,\; \|d\| \le \Delta} h(F(x) + F'(x)d), \quad (18.5)$$
which can be solved by well known methods such as the subgradient, cutting plane or bundle methods (see, e.g., [12, 19, 25–27]). Throughout the chapter we will denote by $U(x, r)$ the open ball in $\mathbb{R}^n$ (or $\mathbb{R}^m$) centered at $x$ and of radius $r > 0$, and by $\bar{U}(x, r)$ its closure. Let $W$ be a closed convex subset of $\mathbb{R}^n$ (or $\mathbb{R}^m$). The negative polar of $W$, denoted by $W^\circ$, is defined as
$$W^\circ = \{z : \langle z, w \rangle \le 0 \text{ for each } w \in W\}. \quad (18.6)$$
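Before turning to regularity, the following sketch shows the shape of one GNA run for the classical choice $h(y) = \frac{1}{2}\|y\|^2$, for which $C = \operatorname{argmin} h = \{0\}$ and subproblem (18.5) becomes a norm-constrained linear least-squares problem. The operator $F$, the parameter values, and the simple projection of the unconstrained least-squares step onto the $\Delta$-ball (rather than an exact solution of (18.5)) are illustrative assumptions.

```python
import numpy as np

# Sketch of GNA for h(y) = ||y||^2/2, so C = argmin h = {0}; the map F
# and the parameters are illustrative, and the Delta-ball projection is
# a simplification of the exact subproblem (18.5).
def F(x):  return np.array([x[0]**2 - 1.0, x[0] * x[1]])
def JF(x): return np.array([[2.0 * x[0], 0.0], [x[1], x[0]]])

Delta, x = 2.0, np.array([2.0, 0.5])
for k in range(30):
    d, *_ = np.linalg.lstsq(JF(x), -F(x), rcond=None)   # iterative step
    if np.linalg.norm(d) > Delta:
        d *= Delta / np.linalg.norm(d)                  # keep ||d|| <= Delta
    if np.linalg.norm(d) < 1e-12:                       # stop criterion
        break
    x = x + d
print(x)   # approaches x* = (1, 0), where F(x*) = 0, i.e., F(x*) in C
```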

18.2.2 Quasi Regularity

In this section, we mention some concepts and results on regularities which can be found in [12] (see also, e.g., [17, 21–23, 25]).

For a set-valued mapping $T : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ and for a set $A$ in $\mathbb{R}^n$ or $\mathbb{R}^m$, we denote
$$D(T) = \{x \in \mathbb{R}^n : Tx \neq \emptyset\}, \qquad R(T) = \bigcup_{x \in D(T)} Tx, \quad (18.7)$$
$$T^{-1}y = \{x \in \mathbb{R}^n : y \in Tx\} \qquad \text{and} \qquad \|A\| = \inf_{a \in A} \|a\|.$$
Consider the inclusion
$$F(x) \in C, \quad (18.8)$$
where $C$ is a closed convex set in $\mathbb{R}^m$. Let $x \in \mathbb{R}^n$ and
$$D(x) = \{d \in \mathbb{R}^n : F(x) + F'(x)d \in C\}. \quad (18.9)$$
Definition 18.2.1 Let $x_0 \in \mathbb{R}^n$.

(a) $x_0$ is a quasi-regular point of (18.8) if there exist $R_0 \in (0, +\infty)$ and an increasing positive function $\beta$ on $[0, R_0)$ such that
$$D(x) \neq \emptyset \text{ and } d(0, D(x)) \le \beta(\|x - x_0\|)\, d(F(x), C) \text{ for all } x \in U(x_0, R_0). \quad (18.10)$$
$\beta(\|x - x_0\|)$ is an "error bound" in determining how far the origin is away from the solution set of (18.8).

(b) $x_0$ is a regular point of (18.8) if
$$\ker(F'(x_0)^T) \cap (C - F(x_0))^\circ = \{0\}. \quad (18.11)$$

In order for us to make the chapter as self contained as possible, the notion of quasi-regularity is also re-introduced (see, e.g., [12, 17, 21]).
Proposition 18.2.2 (see, e.g., [12,17,21,25]) Let x0 be a regular point of (18.8).
Then, there are constants R0 > 0 and β > 0 such that (18.10) holds for R0 and
β (·) = β . Therefore, x0 is a quasi-regular point with the quasi-regular radius
Rx0 ≥ R0 and the quasi-regular bound function βx0 ≤ β on [0, R0 ].
Remark 18.2.3 (a) $D(x)$ can be considered as the solution set of the linearized problem associated to (18.8)
$$F(x) + F'(x)d \in C. \quad (18.12)$$

(b) If $C$ defined in (18.8) is the set of all minimum points of $h$ and if there exists $d_0 \in D(x)$ with $\|d_0\| \le \Delta$, then $d_0 \in D_\Delta(x)$ and for each $d \in \mathbb{R}^n$, we have the following equivalence
$$d \in D_\Delta(x) \Longleftrightarrow d \in D(x) \Longleftrightarrow d \in D_\infty(x). \quad (18.13)$$

(c) Let $R_{x_0}$ denote the supremum of $R_0$ such that (18.10) holds for some function $\beta$ defined in Definition 18.2.1. Let $R_0 \in [0, R_{x_0}]$ and let $B_{R_0}(x_0)$ denote the set of functions $\beta$ defined on $[0, R_0)$ such that (18.10) holds. Define
$$\beta_{x_0}(t) = \inf\{\beta(t) : \beta \in B_{R_{x_0}}(x_0)\} \text{ for each } t \in [0, R_{x_0}). \quad (18.14)$$
Each function $\beta \in B_{R_0}(x_0)$ with $\lim_{t \to R_0^-} \beta(t) < +\infty$ can be extended to an element of $B_{R_{x_0}}(x_0)$, and we have that
$$\beta_{x_0}(t) = \inf\{\beta(t) : \beta \in B_{R_0}(x_0)\} \text{ for each } t \in [0, R_0). \quad (18.15)$$
$R_{x_0}$ and $\beta_{x_0}$ are called the quasi-regular radius and the quasi-regular bound function of the quasi-regular point $x_0$, respectively.
Definition 18.2.4 (a) A set-valued mapping T : Rn ⇒ Rm is convex if the


following items hold
(i) T x + Ty ⊆ T (x + y) for all x, y ∈ Rn .
(ii) T λ x = λ T x for all λ > 0 and x ∈ Rn .
(iii) 0 ∈ T 0.

18.3 Semilocal convergence for GNA


In this section we provide the main semilocal convergence result for GNA. First,
we need some auxiliary definitions.

Definition 18.3.1 Let $R > 0$, $x_0 \in \mathbb{R}^n$ and let $F : \mathbb{R}^n \to \mathbb{R}^m$ be continuously Fréchet-differentiable. A twice-differentiable function $f_0 : [0, R) \to \mathbb{R}$ is called a center-majorant function for $F$ on $U(x_0, R)$ if, for each $x \in U(x_0, R)$,

(h00) $\|F'(x) - F'(x_0)\| \le f_0'(\|x - x_0\|) - f_0'(0)$;

(h01) $f_0(0) = 0$, $f_0'(0) = -1$; and

(h02) $f_0'$ is convex and strictly increasing.

Definition 18.3.2 [5, 8, 17] Let $x_0 \in \mathbb{R}^n$ and let $F : \mathbb{R}^n \to \mathbb{R}^m$ be continuously differentiable. Define $R_0 = \sup\{t \in [0, R) : f_0'(t) < 0\}$. A twice-differentiable function $f : [0, R_0) \to \mathbb{R}$ is called a majorant function for $F$ on $U(x_0, R_0)$ if, for each $x, y \in U(x_0, R_0)$ with $\|x - x_0\| + \|y - x\| < R_0$,

(h0) $\|F'(y) - F'(x)\| \le f'(\|y - x\| + \|x - x_0\|) - f'(\|x - x_0\|)$;

(h1) $f(0) = 0$, $f'(0) = -1$; and

(h2) $f'$ is convex and strictly increasing.

On the other hand, assume that

(h3) $f_0(t) \le f(t)$ and $f_0'(t) \le f'(t)$ for each $t \in [0, R_0)$.

Remark 18.3.3 Suppose that R0 < R. If R0 ≥ R, then we do not need to introduce


Definition 18.3.2.

Let $\rho > 0$ and $\alpha > 0$ be fixed and define the auxiliary function $\varphi : [0, R_0) \to \mathbb{R}$ by
$$\varphi(t) = \rho + (\alpha - 1)t + \alpha f(t). \quad (18.16)$$
We shall use the following hypotheses:

(h4) there exists $s^* \in (0, R)$ such that for each $t \in (0, s^*)$, $\varphi(t) > 0$ and $\varphi(s^*) = 0$;

(h5) $\varphi'(s^*) < 0$.

From now on we assume the hypotheses (h0)–(h4) and (h00)–(h02), which will be called the hypotheses (H). Now, we provide the main convergence result of the Gauss-Newton method generated by the Algorithm GNA for solving (18.1).

Theorem 18.3.4 Suppose that the (H) conditions are satisfied. Then:

(i) The sequence $\{s_k\}$ generated by the Gauss-Newton method for $s_0 = 0$, $s_{k+1} = s_k - \frac{\varphi(s_k)}{\varphi'(s_k)}$, for solving equation $\varphi(t) = 0$, is well defined, strictly increasing, remains in $[0, s^*)$ and converges Q-linearly to $s^*$.

Let $\eta \in [1, \infty)$, $\Delta \in (0, \infty]$ and let $h : \mathbb{R}^m \to \mathbb{R}$ be real-valued convex with minimizer set $C$ such that $C \neq \emptyset$.

(ii) Suppose that $x_0 \in \mathbb{R}^n$ is a quasi-regular point of the inclusion
$$F(x) \in C,$$
with the quasi-regular radius $r_{x_0}$ and the quasi-regular bound function $\beta_{x_0}$ defined by (18.14) and (18.15), respectively. If $d(F(x_0), C) > 0$, $s^* \le r_{x_0}$, $\Delta \ge \rho \ge \eta\,\beta_{x_0}(0)\,d(F(x_0), C)$ and
$$\alpha \ge \sup\left\{\frac{\eta\,\beta_{x_0}(t)}{\eta\,\beta_{x_0}(t)(1 + f'(t)) + 1} : \rho \le t < s^*\right\},$$
then the sequence $\{x_k\}$ generated by GNA is well defined, remains in $U(x_0, s^*)$ for each $k = 0, 1, 2, \ldots$, and satisfies
$$F(x_k) + F'(x_k)(x_{k+1} - x_k) \in C \text{ for each } k = 0, 1, 2, \ldots. \quad (18.17)$$
Moreover, the following estimates hold:
$$\|x_{k+1} - x_k\| \le s_{k+1} - s_k, \quad (18.18)$$
$$\|x_{k+1} - x_k\| \le \frac{s_{k+1} - s_k}{(s_k - s_{k-1})^2}\,\|x_k - x_{k-1}\|^2, \quad (18.19)$$
for each $k = 0, 1, 2, \ldots$ and $k = 1, 2, \ldots$, respectively, and $\{x_k\}$ converges to a point $x^* \in U(x_0, s^*)$ satisfying $F(x^*) \in C$ and
$$\|x^* - x_k\| \le s^* - s_k \text{ for each } k = 0, 1, 2, \ldots. \quad (18.20)$$
The convergence is R-linear. If hypothesis (h5) holds, then the sequences $\{s_k\}$ and $\{x_k\}$ converge Q-quadratically and R-quadratically to $s^*$ and $x^*$, respectively. Furthermore, if
$$\alpha > \bar{\alpha} := \sup\left\{\frac{\eta\,\beta_{x_0}(t)}{\eta\,\beta_{x_0}(t)(1 + f'(t)) + 1} : \rho \le t < s^*\right\},$$
then the sequence $\{x_k\}$ converges R-quadratically to $x^*$.


Proof: Simply replace function $g$ in [23] (see also [17]) by function $f$ in the proof, where $g$ is a majorant function for $F$ on $U(x_0, R)$. That is, we have instead of (h0):

(h0)' $\|F'(y) - F'(x)\| \le g'(\|y - x\| + \|x - x_0\|) - g'(\|x - x_0\|)$ $\quad (18.21)$

for each $x, y \in U(x_0, R)$ with $\|x - x_0\| + \|y - x\| < R$. The iterates $x_n$ lie in $U(x_0, R_0)$, which is a more precise location than $U(x_0, R)$. $\Box$
Remark 18.3.5 (a) If $f(t) = g(t) = f_0(t)$ for each $t \in [0, R_0)$ and $R_0 = R$, then Theorem 18.3.4 reduces to the corresponding theorem in [17]. Furthermore, if $f_0(t) \le f(t) = g(t)$ we obtain the results in [23]. It is worth noticing that we have
$$f_0'(t) \le g'(t) \text{ for each } t \in [0, R) \quad (18.22)$$
and
$$f'(t) \le g'(t) \text{ for each } t \in [0, R_0). \quad (18.23)$$
Hence, if
$$f_0'(t) \le f'(t) < g'(t) \text{ for each } t \in [0, R_0), \quad (18.24)$$
then the following advantages, denoted by (A), are obtained: weaker sufficient convergence criteria, tighter error bounds on the distances $\|x_n - x^*\|$, $\|x_{n+1} - x_n\|$, and an at least as precise information on the location of the solution $x^*$. These advantages are obtained using less computational cost, since in practice the computation of function $g$ requires the computation of functions $f_0$ and $f$ as special cases. Realize that under (h00) function $f_0$ is defined, and therefore so is $R_0$, which is at least as small as $R$. Therefore, the majorant function to be satisfied (i.e., $f'$) is at least as small as the majorant function satisfying (h0)' (i.e., $g'$), leading to the advantages of the new approach over the approach in [17] or [23]. Indeed, if function $\psi$ has a solution $t^*$, then, since $\varphi(t^*) \le \psi(t^*) = 0$ and $\varphi(0) = \psi(0) = \rho > 0$, function $\varphi$ has a solution $r^*$ such that
$$r^* \le t^*, \quad (18.25)$$
but not necessarily vice versa. It also follows from (18.25) that the new information about the location of the solution $x^*$ is at least as precise as the one given in [18].
(b) Let us specialize conditions (18.8)–(18.10) even further in the case when $L_0$, $K$ and $L$ are constant functions and $\alpha = 1$. Then, the functions corresponding to (18.16) reduce to
$$\psi(t) = \frac{L}{2}t^2 - t + \rho \quad (18.26)$$
[17, 23] and
$$\varphi(t) = \frac{K}{2}t^2 - t + \rho, \quad (18.27)$$
respectively. In this case the convergence criteria become, respectively,
$$h = L\rho \le \frac{1}{2} \quad (18.28)$$
and
$$h_1 = K\rho \le \frac{1}{2}. \quad (18.29)$$
Notice that
$$h \le \frac{1}{2} \Longrightarrow h_1 \le \frac{1}{2} \quad (18.30)$$
but not necessarily vice versa, unless $K = L$. Criterion (18.28) is the celebrated Kantorovich hypothesis for the semilocal convergence of Newton's method [20]. In the case of Wang's condition [29] we have
$$\varphi(t) = \frac{\gamma t^2}{1 - \gamma t} - t + \rho, \qquad \psi(t) = \frac{\beta t^2}{1 - \beta t} - t + \rho,$$
$$L(u) = \frac{2\gamma}{(1 - \gamma u)^3}, \quad \gamma > 0, \quad 0 \le u \le \frac{1}{\gamma}$$
and
$$K(u) = \frac{2\beta}{(1 - \beta u)^3}, \quad \beta > 0, \quad 0 \le u \le \frac{1}{\beta},$$
with convergence criteria given respectively by
$$H = \gamma\rho \le 3 - 2\sqrt{2} \quad (18.31)$$
and
$$H_1 = \beta\rho \le 3 - 2\sqrt{2}. \quad (18.32)$$
Then, we also have
$$H \le 3 - 2\sqrt{2} \Longrightarrow H_1 \le 3 - 2\sqrt{2},$$
but not necessarily vice versa, unless $\beta = \gamma$.


(c) Concerning the error bounds and the limit of the majorizing sequence, let us define the majorizing sequence $\{r_{\alpha,k}\}$ by
$$r_{\alpha,0} = 0; \qquad r_{\alpha,k+1} = r_{\alpha,k} - \frac{\varphi(r_{\alpha,k})}{\varphi_{\alpha,0}'(r_{\alpha,k})}$$
for each $k = 0, 1, 2, \ldots$, where
$$\varphi_{\alpha,0}(t) = \rho - t + \alpha \int_0^t L_0(u)(t - u)\,du.$$
Suppose that
$$-\frac{\varphi(r)}{\varphi_{\alpha,0}'(r)} \le -\frac{\varphi(s)}{\varphi'(s)}$$
for each $r, s \in [0, R_0]$ with $r \le s$. According to the proof of Theorem 18.3.4, sequence $\{r_{\alpha,k}\}$ is also a majorizing sequence for GNA. On the other hand, a simple inductive argument shows that
$$r_k \le s_k, \qquad r_{k+1} - r_k \le s_{k+1} - s_k$$
and
$$r^* = \lim_{k \longrightarrow \infty} r_k \le s^*.$$
Moreover, the first two preceding inequalities are strict for $k \ge 2$ if
$$L_0(u) < K(u) \text{ for each } u \in [0, R_0].$$
Similarly, suppose that
$$-\frac{\varphi(s)}{\varphi'(s)} \le -\frac{\psi(t)}{\psi'(t)} \quad (18.33)$$
for each $s, t \in [0, R_0]$ with $s \le t$. Then, we have that
$$s_{\alpha,k} \le t_{\alpha,k}, \qquad s_{\alpha,k+1} - s_{\alpha,k} \le t_{\alpha,k+1} - t_{\alpha,k}$$
and
$$s^* \le t^*.$$
The first two preceding inequalities are also strict for $k \ge 2$ if strict inequality holds in (18.33).
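To make part (i) of Theorem 18.3.4 concrete, the sketch below runs the scalar Newton iteration $s_{k+1} = s_k - \varphi(s_k)/\varphi'(s_k)$ for the Kantorovich specialization (18.27) with $\alpha = 1$; the values of $K$ and $\rho$ are illustrative assumptions chosen so that (18.29) holds.

```python
# Scalar sequence {s_k} of Theorem 18.3.4(i) for the specialization
# (18.27): phi(t) = (K/2)t^2 - t + rho. K and rho are sample values.
K, rho = 2.0, 0.2
assert K * rho <= 0.5                      # criterion (18.29)

phi  = lambda t: K * t**2 / 2.0 - t + rho
dphi = lambda t: K * t - 1.0

s = 0.0
for k in range(50):
    s_next = s - phi(s) / dphi(s)          # Newton step for phi(t) = 0
    if abs(s_next - s) < 1e-15:
        break
    s = s_next
print(s)   # s* = (1 - (1 - 2*K*rho)**0.5) / K, approximately 0.27639
```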
18.4 Specializations and numerical examples

Specializations of Theorem 18.3.4 to some interesting cases such as Smale's $\alpha$-theory (see also Wang's $\gamma$-theory) and Kantorovich theory have been reported in [17, 23, 25], if $f_0'(t) = f'(t) = g'(t)$ for each $t \in [0, R)$ with $R_0 = R$, and in [23], if $f_0'(t) < f'(t) = g'(t)$ for each $t \in [0, R_0)$. Next, we present examples where $f_0'(t) < f'(t) < g'(t)$ for each $t \in [0, R_0)$ to show the advantages of the new approach over the ones in [17, 24, 26]. We choose for simplicity $m = n = \alpha = 1$.

Example 18.4.1 Let $x_0 = 1$, $D = U(1, 1 - q)$, $q \in [0, \frac{1}{2})$, and define function $F$ on $D$ by
$$F(x) = x^3 - q. \quad (18.34)$$
Then, we have that $\rho = \frac{1}{3}(1 - q)$, $L_0 = 3 - q$, $L = 2(2 - q)$ and $K = 2(1 + \frac{1}{L_0})$. The Newton-Kantorovich condition (18.28) is not satisfied, since
$$h > \frac{1}{2} \text{ for each } q \in [0, \tfrac{1}{2}). \quad (18.35)$$
Hence, there is no guarantee by the Newton-Kantorovich Theorem [17] that Newton's method converges to a zero of operator $F$. Let us see what condition (18.29) gives: we have by (18.29) that
$$h_1 \le \frac{1}{2}, \text{ if } 0.461983163 < q < \frac{1}{2}. \quad (18.36)$$
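The claim in (18.35)–(18.36) is easy to verify numerically; the sketch below evaluates both criteria for a few sample values of $q$.

```python
# Numerical check of Example 18.4.1: h = L*rho never reaches 1/2 on
# [0, 1/2), while h1 = K*rho does once q > 0.461983163...
for q in (0.40, 0.47, 0.49):
    rho = (1.0 - q) / 3.0
    L, L0 = 2.0 * (2.0 - q), 3.0 - q
    K = 2.0 * (1.0 + 1.0 / L0)
    print(q, L * rho <= 0.5, K * rho <= 0.5)
# prints: 0.4 False False / 0.47 False True / 0.49 False True
```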
References

[1] S. Amat, S. Busquier and J. M. Gutiérrez, Geometric constructions of iter-


ative functions to solve nonlinear equations, J. Comput. Appl. Math., 157,
197–205, 2003.
[2] I. K. Argyros, Computational theory of iterative methods, Studies in Com-
putational Mathematics, 15, Editors: K. Chui and L. Wuytack. Elsevier,
New York, U.S.A., 2007.
[3] I. K. Argyros, Concerning the convergence of Newton’s method and
quadratic majorants, J. Appl. Math. Comput., 29, 391–400, 2009.
[4] I. K. Argyros and S. Hilout, On the Gauss-Newton method, J. Appl. Math.,
1–14, 2010.
[5] I. K. Argyros and S. Hilout, Extending the applicability of the Gauss-
Newton method under average Lipschitz-conditions, Numer. Algorithms,
58, 23–52, 2011.
[6] I. K. Argyros, A semilocal convergence analysis for directional Newton
methods, Math. Comput., AMS, 80, 327–343, 2011.
[7] I. K. Argyros, Y. J. Cho and S. Hilout, Numerical methods for equations and
its applications, CRC Press/Taylor and Francis Group, New York, 2012.
[8] I. K. Argyros and S. Hilout, Improved local convergence of Newton’s
method under weaker majorant condition, J. Comput. Appl. Math., 236 (7),
1892–1902, 2012.
[9] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method, J. Complexity, 28, 364–387 (2012).
[10] I. K. Argyros and S. Hilout, Computational Methods in Nonlinear Analysis,
World Scientific Publ. Comp., New Jersey, 2013.

[11] A. Ben-Israel and T. N. E. Greville, Generalized inverses. CMS Books in


Mathematics/Ouvrages de Mathematiques de la SMC, 15. Springer-Verlag,
New York, second edition, Theory and Applications, 2003.
[12] J. V. Burke and M. C. Ferris, A Gauss-Newton method for convex compos-
ite optimization, Math. Programming Ser A., 71, 179–194, 1995.
[13] J. E. Jr. Dennis and R. B. Schnabel, Numerical methods for unconstrained
optimization and nonlinear equations, Classics in Appl. Math., 16, SIAM,
Philadelphia, PA, 1996.
[14] O. P. Ferreira, M. L. N. Gonçalves and P. R. Oliveira, Local convergence
analysis of the Gauss–Newton method under a majorant condition, J. Com-
plexity, 27 (1), 111–125, 2011.
[15] O. P. Ferreira and B. F. Svaiter, Kantorovich’s theorem on Newton’s method
in Riemmanian manifolds, J. Complexity, 18 (1), 304–329, 2002.
[16] O. P. Ferreira and B. F. Svaiter, Kantorovich’s majorants principle for New-
ton’s method, Comput. Optim. Appl., 42, 213–229, 2009.
[17] O. P. Ferreira, M. L. N. Gonçalves and P. R. Oliveira, Convergence of the
Gauss-Newton method for convex composite optimization under a majo-
rant condition, SIAM J. Optim., 23 (3), 1757–1783, 2013.
[18] W. M. Häussler, A Kantorovich-type convergence analysis for the Gauss-
Newton method. Numer. Math., 48, 119–125, 1986.
[19] J. B. Hiriart-Urruty and C. Lemaréchal, Convex analysis and minimization
algorithms (two volumes). I. Fundamentals. II. Advanced theory and bundle
methods, 305 and 306, Springer-Verlag, Berlin, 1993.
[20] L. V. Kantorovich and G. P. Akilov, Functional Analysis, Pergamon Press,
Oxford, 1982.
[21] C. Li and K. F. Ng, Majorizing functions and convergence of the Gauss-
Newton method for convex composite optimization, SIAM J. Optim., 18
(2), 613–692, 2007.
[22] C. Li and X. H. Wang, On convergence of the Gauss-Newton method for
convex composite optimization, Math. Program. Ser A., 91, 349–356, 2002.
[23] A. Magrenán and I. K. Argyros, Expanding the applicability of the Gauss-
Newton method for convex optimization under a majorant condition, SeMA,
65, 37–56, 2014.
[24] F. A. Potra, Sharp error bounds for a class of Newton-like methods, Libertas
Mathematica, 5, 71–84, 1985.
[25] S. M. Robinson, Extension of Newton’s method to nonlinear functions with


values in a cone, Numer. Math., 19, 341–347, 1972.
[26] R. T. Rockafellar, Monotone processes of convex and concave type, Mem-
oirs of the American Mathematical Society, 77. American Mathematical
Society, Providence, R.I., 1967.
[27] R. T. Rockafellar, Convex analysis, Princeton Mathematical Series, 28,
Princeton University Press, Princeton, N.J., 1970.
[28] S. Smale, Newton’s method estimates from data at one point. The merging
of disciplines: new directions in pure, applied, and computational mathe-
matics (Laramie, Wyo., 1985), 185–196, Springer, New York, 1986.
[29] X. H. Wang, Convergence of Newton’s method and uniqueness of the so-
lution of equations in Banach space, IMA J. Numer. Anal., 20, 123–134,
2000.
Chapter 19

Directional Newton
methods and restricted
domains

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
19.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
19.2 Semilocal convergence analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298

19.1 Introduction
Let F : D ⊆ Rn → R be a differentiable function. In computer graphics, we often
need to compute and display the intersection C = A ∩ B of two surfaces A and
B in R3 [5], [6]. If the two surfaces are explicitly given by

A = {(u, v, w)T : w = F1 (u, v)}


and
$$B = \{(u, v, w)^T : w = F_2(u, v)\},$$
then the solution $x^* = (u^*, v^*, w^*)^T \in C$ must satisfy the nonlinear equations
$$F_1(u^*, v^*) = F_2(u^*, v^*)$$
and
$$w^* = F_1(u^*, v^*).$$
Therefore, we must solve a nonlinear equation in two variables $x = (u, v)^T$ of the form
$$F(x) = F_1(x) - F_2(x) = 0.$$
The marching method can be used to compute the intersection $C$. In this method, we first need to compute a starting point $x_0 = (u_0, v_0, w_0)^T \in C$, and then compute the succeeding intersection points by successive updating.

In mathematical programming [9], for an equality-constrained optimization problem, e.g.,
$$\min \psi(x) \quad \text{s.t.} \quad F(x) = 0,$$
where $\psi, F : D \subseteq \mathbb{R}^n \longrightarrow \mathbb{R}$ are nonlinear functions, we need a feasible point to start a numerical algorithm. That is, we must compute a solution of equation $F(x) = 0$.

In the case of a system of nonlinear equations $G(x) = 0$, with $G : D \subseteq \mathbb{R}^n \longrightarrow \mathbb{R}^n$, we may solve instead
$$\|G(x)\|^2 = 0,$$
if the zero of function $G$ is isolated or locally isolated and if the rounding error is neglected [3], [7], [10], [11], [12].

We use the directional Newton method (DNM) [5], given by
$$x_{k+1} = x_k - \frac{F(x_k)}{\nabla F(x_k) \cdot d_k}\,d_k \quad (k \ge 0),$$
to generate a sequence $\{x_k\}$ converging to $x^*$.

Now, we will explain DNM. First of all, we begin by choosing an initial guess $x_0 \in U_0$, where $F$ is differentiable, and a direction vector $d_0$.

Then, we restrict $F$ to the line $A = \{x_0 + \theta d_0,\ \theta \in \mathbb{R}\}$, on which it is a function of one variable, $f(\theta) = F(x_0 + \theta d_0)$.

Set $\theta_0 = 0$ to obtain the Newton iteration for $f$, that is, the next point
$$v_1 = -\frac{f(0)}{f'(0)}.$$
The corresponding iteration for $F$ is
$$x_1 = x_0 - \frac{F(x_0)}{\nabla F(x_0) \cdot d_0}\,d_0.$$
Note that $f(0) = F(x_0)$ and $f'(0)$ is the directional derivative
$$f'(0) = F'(x_0; d_0) = \nabla F(x_0) \cdot d_0.$$
By repeating this process we arrive at DNM.
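As a concrete illustration (tying back to the surface-intersection problem above), the sketch below applies DNM to $F(u,v) = F_1(u,v) - F_2(u,v)$, taking $d_k$ along the gradient. The surfaces $F_1(u,v) = u^2 + v^2$ and $F_2(u,v) = 4$ and all parameter choices are illustrative assumptions.

```python
import numpy as np

# A minimal DNM sketch for F(u, v) = F1(u, v) - F2(u, v) with the assumed
# surfaces F1(u, v) = u^2 + v^2 and F2(u, v) = 4 (a plane), so the
# intersection curve satisfies u^2 + v^2 = 4.
def F(x):     return x[0]**2 + x[1]**2 - 4.0
def gradF(x): return np.array([2.0 * x[0], 2.0 * x[1]])

x = np.array([1.0, 2.0])
for k in range(25):
    g = gradF(x)
    d = g / np.linalg.norm(g)            # one common choice: d_k = grad/||grad||
    x = x - F(x) / (g @ d) * d           # DNM step
    if abs(F(x)) < 1e-12:
        break
print(x)   # a point on the intersection curve u^2 + v^2 = 4
```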

If n = 1, DNM reduces to the classical Newton method [1]– [3], [7].

In [5], Levin and Ben–Israel provide a semilocal convergence analysis for the
DNM.
The quadratic convergence of the method was established for directions
dk sufficiently close to the gradients ∇F(xk ), and under standard Newton–
Kantorovich–type hypotheses [1]– [3], [7].

In this chapter, motivated by the aforementioned work [5], by optimization considerations, and using the new idea of restricted convergence domains, we find a more precise location where the iterates lie than in earlier studies such as [1, 5]. This way, we provide a semilocal convergence analysis with the following advantages:

1. Weaker hypotheses;
2. Larger convergence domain for DNM;
3. Finer error bounds on the distances $\|x_{k+1} - x_k\|$, $\|x_k - x^*\|$ ($k \ge 0$);
4. An at least as precise information on the location of the zero $x^*$.

A numerical example where our results apply, but the earlier ones in [5] are
violated is also provided at the end of this chapter.
Hereafter, we use the Euclidean inner product, the corresponding norm $\|x\|$, and the corresponding matrix norm $\|A\|$, except in Section 19.3, where the $\infty$-norm is used for vectors and matrices, denoted by $\|x\|_\infty$ and $\|A\|_\infty$, respectively.
19.2 Semilocal convergence analysis

First of all, we present a needed lemma on majorizing sequences for DNM.

Lemma 19.2.1 Suppose that there exist constants $L_0 > 0$, $L > 0$ and $\eta \ge 0$ such that
$$q_0 := \bar{L}\eta \le \frac{1}{2}, \quad (19.1)$$
where
$$\bar{L} = \frac{1}{8}\left(4L_0 + \sqrt{L_0 L + 8L_0^2} + \sqrt{L_0 L}\right). \quad (19.2)$$
Then, sequence $\{t_k\}$ $(k \ge 0)$ given by
$$t_0 = 0, \quad t_1 = \eta, \quad t_2 = t_1 + \frac{L_0(t_1 - t_0)^2}{2(1 - L_0 t_1)}, \quad t_{k+1} = t_k + \frac{L(t_k - t_{k-1})^2}{2(1 - L_0 t_k)} \quad (k \ge 2), \quad (19.3)$$
is well defined, nondecreasing, bounded above by $t^{**}$, and converges to its unique least upper bound $t^* \in [0, t^{**}]$, where
$$t^{**} = \left(1 + \frac{L_0\eta}{2(1 - \delta)(1 - L_0\eta)}\right)\eta \quad (19.4)$$
and
$$\delta = \frac{2L}{L + \sqrt{L^2 + 8L_0 L}}, \quad L_0, L \neq 0. \quad (19.5)$$
The proof is shown in [3]. Here, $\angle$ denotes the angle between two vectors $u$ and $v$, given by
$$\angle(u, v) = \arccos\frac{u \cdot v}{\|u\|\,\|v\|}, \quad u \neq 0, \; v \neq 0.$$
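The sketch below computes sequence (19.3) under criterion (19.1); the values of $L_0$, $L$ and $\eta$ are illustrative assumptions.

```python
# Majorizing sequence (19.3) of Lemma 19.2.1; L0, L, eta are sample
# values (assumptions) chosen so that q0 = Lbar * eta <= 1/2 in (19.1).
L0, L, eta = 1.0, 1.2, 0.3
Lbar = (4.0 * L0 + (L0 * L + 8.0 * L0**2)**0.5 + (L0 * L)**0.5) / 8.0
assert Lbar * eta <= 0.5                       # criterion (19.1)

t = [0.0, eta]
t.append(t[1] + L0 * (t[1] - t[0])**2 / (2.0 * (1.0 - L0 * t[1])))
for k in range(2, 40):
    t.append(t[k] + L * (t[k] - t[k-1])**2 / (2.0 * (1.0 - L0 * t[k])))
print(t[-1])   # approximates t*, which (19.4) bounds by t**
```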
Throughout the chapter, $B(x, \rho)$ and $\bar{B}(x, \rho)$ stand, respectively, for the open and closed balls in $\mathbb{R}^n$ with center $x \in \mathbb{R}^n$ and radius $\rho > 0$.

Now, we provide the main semilocal convergence theorem for DNM.

Theorem 19.2.2 Let $F : D \subseteq \mathbb{R}^n \longrightarrow \mathbb{R}$ be a differentiable function. Assume that:

(i) there exists a point $x_0 \in D$ such that
$$F(x_0) \neq 0, \qquad \nabla F(x_0) \neq 0.$$
Let $d_0 \in \mathbb{R}^n$ be such that $\|d_0\| = 1$, and set
$$h_0 = -\frac{F(x_0)}{\nabla F(x_0) \cdot d_0}\,d_0, \qquad x_1 = x_0 + h_0.$$

(ii) For $F \in C^2[D]$, there exist constants $M_0 > 0$ and $M > 0$ such that
$$\|\nabla F(x) - \nabla F(x_0)\| \le M_0\,\|x - x_0\|, \quad x \in D, \quad (19.6)$$
$$\sup_{x \in D_0} \|F''(x)\| = M, \qquad D_0 = D \cap B(x_0, 1/M_0), \quad (19.7)$$
$$p_0 = |F(x_0)|\,\bar{M}\,|\nabla F(x_0) \cdot d_0|^{-2} \le \frac{1}{2} \quad (19.8)$$
and
$$B_0 := \bar{B}(x_0, t^*) \subseteq D, \quad (19.9)$$
where
$$\bar{M} = \frac{1}{8}\left(4M_0 + \sqrt{M_0 M + 8M_0^2} + \sqrt{M_0 M}\right). \quad (19.10)$$

(iii) Sequence $\{x_k\}$ $(k \ge 0)$ given by
$$x_{k+1} = x_k + h_k, \quad (19.11)$$
where
$$h_k = -\frac{F(x_k)}{\nabla F(x_k) \cdot d_k}\,d_k, \quad (19.12)$$
satisfies
$$\angle(d_k, \nabla F(x_k)) \le \angle(d_0, \nabla F(x_0)), \quad k \ge 0, \quad (19.13)$$
where each $d_k \in \mathbb{R}^n$ is such that $\|d_k\| = 1$.

Then, sequence $\{x_k\}_{k \ge 0}$ remains in $B_0$ and converges to a zero $x^* \in B_0$ of function $F$. On the other hand, $\nabla F(x^*) \neq 0$ unless $\|x^* - x_0\| = t^*$.

Moreover, the following estimates hold for all $k \ge 0$:
$$\|x_{k+1} - x_k\| \le t_{k+1} - t_k \quad (19.14)$$
and
$$\|x_k - x^*\| \le t^* - t_k, \quad (19.15)$$
where iteration $\{t_k\}$ is given by (19.3), for
$$L_0 = |\nabla F(x_0) \cdot d_0|^{-1} M_0, \qquad L = |\nabla F(x_0) \cdot d_0|^{-1} M, \qquad \eta = |\nabla F(x_0) \cdot d_0|^{-1}\,|F(x_0)|.$$
Note that condition (19.13) is equivalent to
$$\frac{|\nabla F(x_k) \cdot d_k|}{\|\nabla F(x_k)\|} \ge \frac{|\nabla F(x_0) \cdot d_0|}{\|\nabla F(x_0)\|}, \quad k \ge 0. \quad (19.16)$$
Proof: The iterates $\{x_n\}$ are shown to be in $D_0$, which is a more precise location than the set $D$ used in [1, 5], since $D_0 \subseteq D$, leading to the advantages. We shall show, using mathematical induction on $k \ge 0$, that
$$\|x_{k+1} - x_k\| \le t_{k+1} - t_k \quad (19.17)$$
and
$$\bar{B}(x_{k+1}, t^* - t_{k+1}) \subseteq \bar{B}(x_k, t^* - t_k). \quad (19.18)$$
For each $z \in \bar{B}(x_1, t^* - t_1)$, we have
$$\|z - x_0\| \le \|z - x_1\| + \|x_1 - x_0\| \le t^* - t_1 + t_1 - t_0 = t^* - t_0,$$
which shows that $z \in \bar{B}(x_0, t^* - t_0)$. Since also
$$\|x_1 - x_0\| = \|h_0\| \le \eta = t_1 - t_0,$$
estimates (19.17) and (19.18) hold for $k = 0$.

Assume (19.17) and (19.18) hold for all $i \le k$. Then we have
$$\|x_{k+1} - x_0\| \le \|x_{k+1} - x_k\| + \|x_k - x_{k-1}\| + \cdots + \|x_1 - x_0\| \le (t_{k+1} - t_k) + (t_k - t_{k-1}) + \cdots + (t_1 - t_0) = t_{k+1} \quad (19.19)$$
and
$$\|x_k + t(x_{k+1} - x_k) - x_0\| \le t_k + t(t_{k+1} - t_k) \le t^*, \quad t \in [0, 1].$$
Using condition (19.6) for $x = x_k$, we obtain
$$\|\nabla F(x_k)\| \ge \|\nabla F(x_0)\| - \|\nabla F(x_k) - \nabla F(x_0)\| \ge \|\nabla F(x_0)\| - M_0\|x_k - x_0\| \ge \|\nabla F(x_0)\| - M_0(t_k - t_0) \ge \|\nabla F(x_0)\| - M_0 t_k > 0 \quad (19.20)$$
(by (19.18) and Lemma 19.2.1). We also get
$$\int_{x_{k-1}}^{x_k} (x_k - x)\,F''(x)\,dx = -(x_k - x_{k-1})\,\nabla F(x_{k-1}) + F(x_k) - F(x_{k-1}) = -h_{k-1}\,\nabla F(x_{k-1}) + F(x_k) - F(x_{k-1})$$
$$= \frac{F(x_{k-1})}{\nabla F(x_{k-1}) \cdot d_{k-1}}\,(d_{k-1} \cdot \nabla F(x_{k-1})) + F(x_k) - F(x_{k-1}) = F(x_k). \quad (19.21)$$
Now, we introduce the change of variable $x = x_{k-1} + t\,h_{k-1}$, $t \in [0, 1]$. We can write
$$x_k - x = x_k - x_{k-1} - t\,h_{k-1} = h_{k-1} - t\,h_{k-1} = (1 - t)\,h_{k-1}, \qquad dx = h_{k-1}\,dt.$$
Then (19.21) can be written as
$$F(x_k) = \int_0^1 (1 - t)\,h_{k-1}\,F''(x_{k-1} + t\,h_{k-1})\,h_{k-1}\,dt. \quad (19.22)$$

Using (19.7), (19.11)–(19.16), we obtain
$$\|x_{k+1} - x_k\| = \|h_k\| = \frac{|F(x_k)|}{|\nabla F(x_k) \cdot d_k|} \le \frac{\left|\int_0^1 (1 - t)\,h_{k-1}\,F''(x_{k-1} + t\,h_{k-1})\,h_{k-1}\,dt\right|}{|\nabla F(x_k) \cdot d_k|}$$
$$\le \frac{M\,\|h_{k-1}\|^2}{2\,|\nabla F(x_k) \cdot d_k|} \le \frac{M\,\|h_{k-1}\|^2\,\|\nabla F(x_0)\|}{2\,\|\nabla F(x_k)\|\,|\nabla F(x_0) \cdot d_0|}$$
$$\le \frac{M\,\|h_{k-1}\|^2\,\|\nabla F(x_0)\|}{2\,(\|\nabla F(x_0)\| - M_0 t_k)\,|\nabla F(x_0) \cdot d_0|} \le \frac{M\,\|h_{k-1}\|^2}{2\,(1 - \|\nabla F(x_0)\|^{-1} M_0 t_k)\,|\nabla F(x_0) \cdot d_0|}$$
$$\le \frac{M\,(t_k - t_{k-1})^2}{2\,(1 - |\nabla F(x_0) \cdot d_0|^{-1} M_0 t_k)\,|\nabla F(x_0) \cdot d_0|} = t_{k+1} - t_k,$$
which shows (19.17) for all $k \ge 0$.

Then, for each $w \in \bar{B}(x_{k+2}, t^* - t_{k+2})$, we get
$$\|w - x_{k+1}\| \le \|w - x_{k+2}\| + \|x_{k+2} - x_{k+1}\| \le t^* - t_{k+2} + t_{k+2} - t_{k+1} = t^* - t_{k+1},$$
showing (19.18) for all $k \ge 0$.

Lemma 19.2.1 implies that $\{t_n\}$ is a Cauchy sequence. It then follows from (19.17) and (19.18) that $\{x_n\}$ is a Cauchy sequence too. Since $B_0$ is a closed set, it follows that $\{x_n\}$ converges to some $x^* \in B_0$.

The point $x^*$ is a zero of $F$, since
$$0 \le |F(x_k)| \le \frac{M}{2}\,(t_k - t_{k-1})^2 \longrightarrow 0 \quad \text{as } k \to \infty.$$
On the other hand, we prove that $\nabla F(x^*) \neq 0$, except if $\|x^* - x_0\| = t^*$. Using (19.6) and the definition of constant $L_0$, we get
$$\|\nabla F(x) - \nabla F(x_0)\| \le M_0\,\|x - x_0\| \le M_0\,t^* \le |\nabla F(x_0) \cdot d_0| \le \|\nabla F(x_0)\|.$$
If $\|x^* - x_0\| < t^*$, then by (19.6) we obtain
$$\|\nabla F(x^*) - \nabla F(x_0)\| \le M_0\,\|x^* - x_0\| < M_0\,t^* \le \|\nabla F(x_0)\|,$$
or $\|\nabla F(x_0)\| > \|\nabla F(x^*) - \nabla F(x_0)\|$, which shows $\nabla F(x^*) \neq 0$. $\Box$
Now, we present the following very useful remark.
Remark 19.2.3 (a) $t^*$ in (19.9) can be replaced by $t^{**}$, given in closed form by (19.4).

(b) The proof followed the corresponding proof in [3], but the new sufficient convergence condition (19.8) is weaker than the one in [3].

To show the second part of the previous remark we need the following definition.

Definition 19.2.4 Let $F : D \subseteq \mathbb{R}^n \longrightarrow \mathbb{R}$ be a differentiable function, where $D$ is a convex region. Function $\nabla F$ is called Lipschitz continuous if there exists a constant $M_1 \ge 0$ such that
$$\|\nabla F(x) - \nabla F(y)\| \le M_1\,\|x - y\| \quad \text{for all } x, y \in D. \quad (19.23)$$
If $F$ is twice differentiable, then we can set
$$\sup_{x \in D} \|F''(x)\| := M_1. \quad (19.24)$$
Notice that
$$M_0 \le M_1 \quad (19.25)$$
and
$$M \le M_1 \quad (\text{since } D_0 \subseteq D) \quad (19.26)$$
hold in general, and $\frac{M_1}{M_0}$ can be arbitrarily large [1]–[3].
Remark 19.2.5 If $F$ is twice differentiable, then, in view of the proof of Theorem 19.2.2, constant $L_0$ can be defined by the more precise
$$L_0 = \|\nabla F(x_0)\|^{-1} M_0. \quad (19.27)$$

Remark 19.2.6 Our Theorem 19.2.2 improves Theorem 2.2 in [1]. Theorem 2.2 in [5] uses the condition
$$p = |F(x_0)|\,N\,|\nabla F(x_0) \cdot d_0|^{-2} \le \frac{1}{2}, \quad (19.28)$$
corresponding to our condition (19.8), where
$$N = \frac{1}{8}\left(4M_0 + M_1 + \sqrt{M_1^2 + 8M_0 M_1}\right).$$
But we have
$$\bar{M} < N. \quad (19.29)$$
That is,
$$p \le \frac{1}{2} \Longrightarrow p_0 \le \frac{1}{2}, \quad (19.30)$$
but not necessarily vice versa (unless $M_0 = M = M_1$). Moreover, the corresponding error bounds, as well as the information on the location of the zero $x^*$, are improved if $M_0 < M$ or $M < M_1$. Examples where $M_0 < M < M_1$ can be found in [1–4]. The rest of the results in [1] (and consequently the results in [5]) can be improved along the same lines. It is worth noting that in the proof of Theorem 2.2 in [3], $M_1$ was used instead of $M$ (i.e., $D$ instead of $D_0$). However, the iterates $x_n$ remain in $D_0$, which is a more precise domain than $D$, since $D_0 \subseteq D$.
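A quick numerical comparison of the two constants (with the formulas as reconstructed above, and sample values $M_0 < M < M_1$ chosen purely for illustration) makes (19.29) tangible:

```python
# Compare Mbar of (19.10) with N of (19.28) for sample constants
# M0 < M < M1 (illustrative assumptions, not data from the text).
M0, M, M1 = 1.0, 1.5, 2.0
Mbar = (4.0 * M0 + (M0 * M + 8.0 * M0**2)**0.5 + (M0 * M)**0.5) / 8.0
N    = (4.0 * M0 + M1 + (M1**2 + 8.0 * M0 * M1)**0.5) / 8.0
print(Mbar, N, Mbar < N)   # Mbar ~ 1.039 < N ~ 1.309: (19.8) is weaker
```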
References

[1] I. K. Argyros, A semilocal convergence analysis for directional Newton


methods, Math. Compt. AMS, 80, 327–343 (2011).
[2] I. K. Argyros, Convergence and applications of Newton-type iterations,
Springer–Verlag, New York, 2008.
[3] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method, J. Complexity, 28, 364–387, 2012.
[4] A. Ben–Israel and Y. Levin, Maple programs for directional Newton meth-
ods, are available at ftp://rutcor.rutgers.edu/pub/bisrael/Newton–Dir.mws.
[5] Y. Levin and A. Ben–Israel, Directional Newton methods in n variables,
Math. Comput. AMS, 71, 237, 251–262, 2002.
[6] G. Lukács, The generalized inverse matrix and the surface–surface inter-
section problem. Theory and practice of geometric modeling (Blaubeuren,
1988), 167–185, Springer, Berlin, 1989.
[7] J. M. Ortega and W. C. Rheinboldt, Iterative solution of nonlinear equations
in several variables, Academic Press, New York, 1970.
[8] V. Pereyra, Iterative methods for solving nonlinear least square problems,
SIAM J. Numer. Anal., 4, 27–36, 1967.
[9] B. T. Polyak, Introduction to optimization. Translated from the Russian.
With a foreword by D. P. Bertsekas. Translations series in Mathematics
and engineering, Optimization Software, Inc., Publications Division, New
York, 1987.
[10] A. Ralston and P. Rabinowitz, A first course in numerical analysis, 2nd
Edition, McGraw-Hill, 1978.

[11] J. Stoer and K. Bulirsch, Introduction to Numerical Analysis, Springer–


Verlag, 1976.
[12] H. F. Walker and L. T. Watson, Least–change Secant update methods, SIAM
J. Numer. Anal., 27, 1227–1262, 1990.
Chapter 20

Expanding the
applicability of the
Gauss-Newton method for
convex optimization
under restricted
convergence domains and
majorant conditions

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
20.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307


20.2 Gauss-Newton Algorithm and Quasi-Regularity condition . . . . . . . . . . . 307


20.2.1 Gauss-Newton Algorithm GNA . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
20.2.2 Quasi Regularity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
20.3 Semi-local convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
20.4 Numerical examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315

20.1 Introduction

In this chapter we are concerned with the convex composite optimization problem. This work is mainly motivated by the work in [17, 23]. We present a convergence analysis of the Gauss-Newton method (defined by Algorithm (GNA) in Sec. 20.2). The convergence of GNA is based on the majorant function in [17] (to be made precise in Sec. 20.2). We follow the same formulation using the majorant function provided in [23] (see [21, 23, 28, 29]). In [3, 5, 8], a convergence analysis in a Banach space setting was given for (GNM) defined by
$$x_{k+1} = x_k - \left(F'(x_k)^+ F'(x_k)\right)^{-1} F'(x_k)^+ F(x_k) \quad \text{for each } k = 0, 1, 2, \ldots,$$
where $x_0$ is an initial point and $F'(x)^+$ is the Moore-Penrose inverse [11–13, 19, 26] of operator $F'(x)$, with $F : \mathbb{R}^n \to \mathbb{R}^m$ being continuously differentiable. In [23], a semilocal convergence analysis using a combination of a majorant and a center majorant function was given with the advantages (A): tighter error estimates on the distances involved, and information on the location of the solution that is at least as precise. These advantages were obtained (under the same computational cost) using the same or weaker sufficient convergence hypotheses. Here, we extend the same advantages (A) to GNA.

The chapter is organized as follows: Section 20.2 contains the definition of GNA. In order for us to make the chapter as self contained as possible, the notion of quasi-regularity is also re-introduced (see, e.g., [12, 17, 21]). The semilocal convergence analysis of GNA is presented in Section 20.3. Numerical examples and applications of our theoretical results and favorable comparisons to earlier studies (see, e.g., [12, 17, 18, 21, 22]) are presented in Section 20.4.

20.2 Gauss-Newton Algorithm and Quasi-Regularity condition

20.2.1 Gauss-Newton Algorithm GNA
Using the idea of restricted convergence domains, we study the convex composite optimization problem
$$\min_{x \in \mathbb{R}^n} p(x) := h(F(x)), \quad (20.1)$$
where $h : \mathbb{R}^m \longrightarrow \mathbb{R}$ is convex, $F : \mathbb{R}^n \longrightarrow \mathbb{R}^m$ is a Fréchet-differentiable operator and $m, n \in \mathbb{N}$. The importance of (20.1) can be found in [2, 10, 12, 19, 21–23, 25, 27]. We assume that the minimum $h_{\min}$ of the function $h$ is attained. Problem (20.1) is related to
$$F(x) \in C, \quad (20.2)$$
where
$$C = \operatorname{argmin} h \quad (20.3)$$
is the set of all minimum points of $h$.

Let $\xi \in [1, \infty)$, $\Delta \in (0, \infty]$ and for each $x \in \mathbb{R}^n$, define $D_\Delta(x)$ by
$$D_\Delta(x) = \{d \in \mathbb{R}^n : \|d\| \le \Delta,\; h(F(x) + F'(x)d) \le h(F(x) + F'(x)d') \text{ for all } d' \in \mathbb{R}^n \text{ with } \|d'\| \le \Delta\}. \quad (20.4)$$
Let $x_0 \in \mathbb{R}^n$ be an initial point. The Gauss-Newton algorithm GNA associated with $(\xi, \Delta, x_0)$ as defined in [12] (see also [17]) is as follows:

Algorithm GNA: $(\xi, \Delta, x_0)$

INITIALIZATION. Take $\xi \in [1, \infty)$, $\Delta \in (0, \infty]$ and $x_0 \in \mathbb{R}^n$; set $k = 0$.

STOP CRITERION. Compute $D_\Delta(x_k)$. If $0 \in D_\Delta(x_k)$, STOP. Otherwise, go to the ITERATIVE STEP.

ITERATIVE STEP. Compute $d_k$ satisfying $d_k \in D_\Delta(x_k)$, $\|d_k\| \le \xi\, d(0, D_\Delta(x_k))$. Then, set $x_{k+1} = x_k + d_k$, $k = k + 1$ and GO TO STOP CRITERION.

Here, $d(x, W)$ denotes the distance from $x$ to $W$ in the finite dimensional Banach space containing $W$. Note that the set $D_\Delta(x)$ ($x \in \mathbb{R}^n$) is nonempty and is the solution set of the following convex optimization problem
$$\min_{d \in \mathbb{R}^n,\; \|d\| \le \Delta} h(F(x) + F'(x)d), \quad (20.5)$$
which can be solved by well known methods such as the subgradient, cutting plane or bundle methods (see, e.g., [12, 19, 25–27]).

Let $U(x, r)$ denote the open ball in $\mathbb{R}^n$ (or $\mathbb{R}^m$) centered at $x$ and of radius $r > 0$. By $\bar{U}(x, r)$ we denote its closure. Let $W$ be a closed convex subset of $\mathbb{R}^n$ (or $\mathbb{R}^m$). The negative polar of $W$, denoted by $W^\circ$, is defined as
$$W^\circ = \{z : \langle z, w \rangle \le 0 \text{ for each } w \in W\}. \quad (20.6)$$

20.2.2 Quasi Regularity

In this section, we mention some concepts and results on regularities which can be found in [12] (see also, e.g., [17, 21–23, 25]). For a set-valued mapping $T : \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ and for a set $A$ in $\mathbb{R}^n$ or $\mathbb{R}^m$, we denote
$$D(T) = \{x \in \mathbb{R}^n : Tx \neq \emptyset\}, \qquad R(T) = \bigcup_{x \in D(T)} Tx, \quad (20.7)$$
$$T^{-1}y = \{x \in \mathbb{R}^n : y \in Tx\} \qquad \text{and} \qquad \|A\| = \inf_{a \in A} \|a\|.$$
Consider the inclusion
$$F(x) \in C, \quad (20.8)$$
where $C$ is a closed convex set in $\mathbb{R}^m$. Let $x \in \mathbb{R}^n$ and
$$D(x) = \{d \in \mathbb{R}^n : F(x) + F'(x)d \in C\}. \quad (20.9)$$


Definition 20.2.1 Let $x_0 \in \mathbb{R}^n$.

(a) $x_0$ is a quasi-regular point of (20.8) if there exist $R_0 \in (0, +\infty)$ and an increasing positive function $\beta$ on $[0, R_0)$ such that
$$D(x) \neq \emptyset \text{ and } d(0, D(x)) \le \beta(\|x - x_0\|)\, d(F(x), C) \text{ for all } x \in U(x_0, R_0). \quad (20.10)$$
$\beta(\|x - x_0\|)$ is an "error bound" in determining how far the origin is away from the solution set of (20.8).

(b) $x_0$ is a regular point of (20.8) if
$$\ker(F'(x_0)^T) \cap (C - F(x_0))^\circ = \{0\}. \quad (20.11)$$

Proposition 20.2.2 (see, e.g., [12,17,21,25]) Let x0 be a regular point of (20.8).


Then, there are constants R0 > 0 and β > 0 such that (20.10) holds for R0 and
β (·) = β . Therefore, x0 is a quasi-regular point with the quasi-regular radius
Rx0 ≥ R0 and the quasi-regular bound function βx0 ≤ β on [0, R0 ].
Remark 20.2.3 (a) $D(x)$ can be considered as the solution set of the linearized problem associated to (20.8)
$$F(x) + F'(x)d \in C. \quad (20.12)$$

(b) If $C$ defined in (20.8) is the set of all minimum points of $h$ and if there exists $d_0 \in D(x)$ with $\|d_0\| \le \Delta$, then $d_0 \in D_\Delta(x)$ and for each $d \in \mathbb{R}^n$, we have the following equivalence
$$d \in D_\Delta(x) \Longleftrightarrow d \in D(x) \Longleftrightarrow d \in D_\infty(x). \quad (20.13)$$

(c) Let $R_{x_0}$ denote the supremum of $R_0$ such that (20.10) holds for some function $\beta$ defined in Definition 20.2.1. Let $R_0 \in [0, R_{x_0}]$ and let $B_{R_0}(x_0)$ denote the set of functions $\beta$ defined on $[0, R_0)$ such that (20.10) holds. Define
$$\beta_{x_0}(t) = \inf\{\beta(t) : \beta \in B_{R_{x_0}}(x_0)\} \text{ for each } t \in [0, R_{x_0}). \quad (20.14)$$
Each function $\beta \in B_{R_0}(x_0)$ with $\lim_{t \to R_0^-} \beta(t) < +\infty$ can be extended to an element of $B_{R_{x_0}}(x_0)$, and we have that
$$\beta_{x_0}(t) = \inf\{\beta(t) : \beta \in B_{R_0}(x_0)\} \text{ for each } t \in [0, R_0). \quad (20.15)$$
$R_{x_0}$ and $\beta_{x_0}$ are called the quasi-regular radius and the quasi-regular bound function of the quasi-regular point $x_0$, respectively.

Definition 20.2.4 (a) A set-valued mapping T : Rn ⇒ Rm is convex if the


following items hold
(i) T x + Ty ⊆ T (x + y) for all x, y ∈ Rn .
(ii) T λ x = λ T x for all λ > 0 and x ∈ Rn .
(iii) 0 ∈ T 0.

20.3 Semi-local convergence


In this section we present the semi-local convergence of GNA. First, we study the
convergence of majorizing sequences for GNA. Then, we study the convergence
of GNA. We need the definition of the center-majorant function and the definition
of the majorant function for F.

Definition 20.3.1 Let $R > 0$, $x_0 \in \mathbb{R}^n$ and let $F : \mathbb{R}^n \to \mathbb{R}^m$ be continuously Fréchet-differentiable. A twice-differentiable function $f_0 : [0, R) \to \mathbb{R}$ is called a center-majorant function for $F$ on $U(x_0, R)$ if, for each $x \in U(x_0, R)$,

(h00) $\|F'(x) - F'(x_0)\| \le f_0'(\|x - x_0\|) - f_0'(0)$;

(h01) $f_0(0) = 0$, $f_0'(0) = -1$; and

(h02) $f_0'$ is convex and strictly increasing.

Definition 20.3.2 [5, 8, 17] Let $x_0 \in \mathbb{R}^n$ and let $F : \mathbb{R}^n \to \mathbb{R}^m$ be continuously differentiable. Define $R_0 = \sup\{t \in [0, R) : f_0'(t) < 0\}$. A twice-differentiable function $f : [0, R_0) \to \mathbb{R}$ is called a majorant function for $F$ on $U(x_0, R_0)$ if, for each $x, y \in U(x_0, R_0)$ with $\|x - x_0\| + \|y - x\| < R_0$,

(h0) $\|F'(y) - F'(x)\| \le f'(\|y - x\| + \|x - x_0\|) - f'(\|x - x_0\|)$;

(h1) $f(0) = 0$, $f'(0) = -1$; and

(h2) $f'$ is convex and strictly increasing.

Moreover, assume that

(h3) $f_0(t) \le f(t)$ and $f_0'(t) \le f'(t)$ for each $t \in [0, R_0)$.

Remark 20.3.3 Suppose that R0 < R. If R0 ≥ R, then we do not need to introduce


Definition 20.3.2.
In Section 20.4, we present examples where hypothesis (h3) is satisfied. Let $\xi > 0$ and $\alpha > 0$ be fixed and define the auxiliary function $\varphi : [0, R_0) \to \mathbb{R}$ by
$$\varphi(t) = \xi + (\alpha - 1)t + \alpha f(t). \quad (20.16)$$
We shall use the following hypotheses:

(h4) there exists $s^* \in (0, R)$ such that for each $t \in (0, s^*)$, $\varphi(t) > 0$ and $\varphi(s^*) = 0$;

(h5) $\varphi'(s^*) < 0$.

From now on we assume the hypotheses (h0)–(h4) and (h00)–(h02), which will be called the hypotheses (H). Next, we present the main semilocal convergence result of the Gauss-Newton method generated by the Algorithm GNA for solving (20.1).

Theorem 20.3.4 Suppose that the (H) conditions are satisfied. Then:

(i) The sequence $\{s_k\}$ generated by the Gauss-Newton method for $s_0 = 0$, $s_{k+1} = s_k - \frac{\varphi(s_k)}{\varphi'(s_k)}$, for solving equation $\varphi(t) = 0$, is well defined, strictly increasing, remains in $[0, s^*)$ and converges Q-linearly to $s^*$.

Let $\eta \in [1, \infty)$, $\Delta \in (0, \infty]$ and let $h : \mathbb{R}^m \to \mathbb{R}$ be real-valued convex with minimizer set $C$ such that $C \neq \emptyset$.

(ii) Suppose that $x_0 \in \mathbb{R}^n$ is a quasi-regular point of the inclusion
$$F(x) \in C,$$
with the quasi-regular radius $r_{x_0}$ and the quasi-regular bound function $\beta_{x_0}$ defined by (20.14) and (20.15), respectively. If $d(F(x_0), C) > 0$, $s^* \le r_{x_0}$, $\Delta \ge \xi \ge \eta\,\beta_{x_0}(0)\,d(F(x_0), C)$ and
$$\alpha \ge \sup\left\{\frac{\eta\,\beta_{x_0}(t)}{\eta\,\beta_{x_0}(t)(1 + f'(t)) + 1} : \xi \le t < s^*\right\},$$
then the sequence $\{x_k\}$ generated by GNA is well defined, remains in $U(x_0, s^*)$ for each $k = 0, 1, 2, \ldots$, and satisfies
$$F(x_k) + F'(x_k)(x_{k+1} - x_k) \in C \text{ for each } k = 0, 1, 2, \ldots. \quad (20.17)$$
Moreover, the following estimates hold:
$$\|x_{k+1} - x_k\| \le s_{k+1} - s_k, \quad (20.18)$$
$$\|x_{k+1} - x_k\| \le \frac{s_{k+1} - s_k}{(s_k - s_{k-1})^2}\,\|x_k - x_{k-1}\|^2, \quad (20.19)$$
for each $k = 0, 1, 2, \ldots$ and $k = 1, 2, \ldots$, respectively, and $\{x_k\}$ converges to a point $x^* \in U(x_0, s^*)$ satisfying $F(x^*) \in C$ and
$$\|x^* - x_k\| \le s^* - s_k \text{ for each } k = 0, 1, 2, \ldots. \quad (20.20)$$
The convergence is R-linear. If hypothesis (h5) holds, then the sequences $\{s_k\}$ and $\{x_k\}$ converge Q-quadratically and R-quadratically to $s^*$ and $x^*$, respectively. Furthermore, if
$$\alpha > \bar{\alpha} := \sup\left\{\frac{\eta\,\beta_{x_0}(t)}{\eta\,\beta_{x_0}(t)(1 + f'(t)) + 1} : \xi \le t < s^*\right\},$$
then the sequence $\{x_k\}$ converges R-quadratically to $x^*$.

Proof: Simply replace function $g$ in [23] (see also [17]) by function $f$ in the proof, where $g$ is a majorant function for $F$ on $U(x_0, R)$. That is, we have instead of (h0):

(h0)' $\|F'(y) - F'(x)\| \le g'(\|y - x\| + \|x - x_0\|) - g'(\|x - x_0\|)$ $\quad (20.21)$

for each $x, y \in U(x_0, R)$ with $\|x - x_0\| + \|y - x\| < R$. The iterates $x_n$ lie in $U(x_0, R_0)$, which is a more precise location than $U(x_0, R)$. $\Box$
Remark 20.3.5 (a) If $f(t) = g(t) = f_0(t)$ for each $t \in [0, R_0)$ and $R_0 = R$, then Theorem 20.3.4 reduces to the corresponding theorem in [17]. Moreover, if $f_0(t) \le f(t) = g(t)$ we obtain the results in [23]. Notice that we have
$$f_0'(t) \le g'(t) \text{ for each } t \in [0, R) \quad (20.22)$$
and
$$f'(t) \le g'(t) \text{ for each } t \in [0, R_0). \quad (20.23)$$
Therefore, if
$$f_0'(t) \le f'(t) < g'(t) \text{ for each } t \in [0, R_0), \quad (20.24)$$
then the following advantages, denoted by (A), are obtained: weaker sufficient convergence criteria, tighter error bounds on the distances $\|x_n - x^*\|$, $\|x_{n+1} - x_n\|$, and an at least as precise information on the location of the solution $x^*$. These advantages are obtained using less computational cost, since in practice the computation of function $g$ requires the computation of functions $f_0$ and $f$ as special cases. It is also worth noticing that under (h00) function $f_0$ is defined, and therefore so is $R_0$, which is at least as small as $R$. Therefore, the majorant function to be satisfied (i.e., $f'$) is at least as small as the majorant function satisfying (h0)' (i.e., $g'$), leading to the advantages of the new approach over the approach in [17] or [23]. Indeed, if function $\psi$ has a solution $t^*$, then, since $\varphi(t^*) \le \psi(t^*) = 0$ and $\varphi(0) = \psi(0) = \xi > 0$, function $\varphi$ has a solution $r^*$ such that
$$r^* \le t^*, \quad (20.25)$$
but not necessarily vice versa. It also follows from (20.25) that the new information about the location of the solution $x^*$ is at least as precise as the one given in [18].
(b) Let us specialize conditions (20.8)–(20.10) even further in the case when $L_0$, $K$ and $L$ are constant functions and $\alpha = 1$. Then, the functions corresponding to (20.16) reduce to
$$\psi(t) = \frac{L}{2}t^2 - t + \xi \quad (20.26)$$
[17, 23] and
$$\varphi(t) = \frac{K}{2}t^2 - t + \xi, \quad (20.27)$$
respectively. In this case the convergence criteria become, respectively,
$$h = L\xi \le \frac{1}{2} \quad (20.28)$$
and
$$h_1 = K\xi \le \frac{1}{2}. \quad (20.29)$$
Notice that
$$h \le \frac{1}{2} \Longrightarrow h_1 \le \frac{1}{2} \quad (20.30)$$
but not necessarily vice versa, unless $K = L$. Criterion (20.28) is the Kantorovich hypothesis, famous for its simplicity and clarity, for the semilocal convergence of Newton's method to a solution $x^*$ of equation $F(x) = 0$ [20]. In the case of Wang's condition [29] we have
$$\varphi(t) = \frac{\gamma t^2}{1 - \gamma t} - t + \xi, \qquad \psi(t) = \frac{\beta t^2}{1 - \beta t} - t + \xi,$$
$$L(u) = \frac{2\gamma}{(1 - \gamma u)^3}, \quad \gamma > 0, \quad 0 \le u \le \frac{1}{\gamma}$$
and
$$K(u) = \frac{2\beta}{(1 - \beta u)^3}, \quad \beta > 0, \quad 0 \le u \le \frac{1}{\beta},$$
with convergence criteria given respectively by
$$H = \gamma\xi \le 3 - 2\sqrt{2} \quad (20.31)$$
and
$$H_1 = \beta\xi \le 3 - 2\sqrt{2}. \quad (20.32)$$
Then, again we have that
$$H \le 3 - 2\sqrt{2} \Longrightarrow H_1 \le 3 - 2\sqrt{2},$$
but not necessarily vice versa, unless $\beta = \gamma$.


(c) Concerning the error bounds and the limit of the majorizing sequence, let us define the majorizing sequence $\{r_{\alpha,k}\}$ by
$$r_{\alpha,0} = 0; \qquad r_{\alpha,k+1} = r_{\alpha,k} - \frac{\varphi(r_{\alpha,k})}{\varphi_{\alpha,0}'(r_{\alpha,k})}$$
for each $k = 0, 1, 2, \ldots$, where
$$\varphi_{\alpha,0}(t) = \xi - t + \alpha \int_0^t L_0(u)(t - u)\,du.$$
Suppose that
$$-\frac{\varphi(r)}{\varphi_{\alpha,0}'(r)} \le -\frac{\varphi(s)}{\varphi'(s)}$$
for each $r, s \in [0, R_0]$ with $r \le s$. According to the proof of Theorem 20.3.4, sequence $\{r_{\alpha,k}\}$ is also a majorizing sequence for GNA. Moreover, a simple inductive argument shows that
$$r_k \le s_k, \qquad r_{k+1} - r_k \le s_{k+1} - s_k$$
and
$$r^* = \lim_{k \longrightarrow \infty} r_k \le s^*.$$
Furthermore, the first two preceding inequalities are strict for $k \ge 2$ if
$$L_0(u) < K(u) \text{ for each } u \in [0, R_0].$$
Similarly, suppose that
$$-\frac{\varphi(s)}{\varphi'(s)} \le -\frac{\psi(t)}{\psi'(t)} \quad (20.33)$$
for each $s, t \in [0, R_0]$ with $s \le t$. Then, we have that
$$s_{\alpha,k} \le t_{\alpha,k}, \qquad s_{\alpha,k+1} - s_{\alpha,k} \le t_{\alpha,k+1} - t_{\alpha,k}$$
and
$$s^* \le t^*.$$
The first two preceding inequalities are also strict for $k \ge 2$ if strict inequality holds in (20.33).
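The inequality $s^* \le t^*$ above is easy to observe numerically. The sketch below compares the limits of the Newton majorizing sequences for the two quadratic specializations (20.26)–(20.27); the constants $K < L$ and $\xi$ are illustrative assumptions.

```python
# Limits of the Newton sequences for psi(t) = (L/2)t^2 - t + xi and
# phi(t) = (K/2)t^2 - t + xi; K < L and xi are sample values.
K, L, xi = 1.5, 2.0, 0.2

def newton_limit(c, t=0.0):
    for _ in range(60):                 # Newton iteration for (c/2)t^2 - t + xi
        t = t - (c * t**2 / 2.0 - t + xi) / (c * t - 1.0)
    return t

s_star, t_star = newton_limit(K), newton_limit(L)
print(s_star, t_star, s_star <= t_star)   # s* ~ 0.2450 <= t* ~ 0.2764
```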

20.4 Numerical examples


Specializations of Theorem 20.3.3 to some interesting cases such as Smale’s
α−theory (see also Wang’s γ−theory) and Kantorovich theory have been re-
ported in [17, 23, 25], if f0 (t) = f  (t) = g (t) for each t ∈ [0, R) with R0 = R and
in [23], if f0 (t) < f  (t) = g (t) for each t ∈ [0, R0 ). Next, we present examples
where f0 (t) < f  (t) < g (t) for each t ∈ [0, R0 ) to show the advantages of the new
approach over the ones in [17, 24, 26]. We choose for simplicity m = n = α = 1.

Example 20.4.1 Let x0 = 1, D = U(1, 1 − q), q ∈ [0, 1/2) and define function F on D by

F(x) = x³ − q. (20.34)

Then, we have that ξ = (1/3)(1 − q), L0 = 3 − q, L = 2(2 − q) and K = 2(1 + 1/L0 ). The Newton-Kantorovich condition (20.28) is not satisfied, since

h > 1/2 for each q ∈ [0, 1/2). (20.35)

Hence, there is no guarantee by the Newton-Kantorovich Theorem [17] that Newton's method (20.1) converges to a zero of operator F. Let us see what the new criterion gives: we have by (20.29) that

h1 ≤ 1/2, if 0.461983163 < q < 1/2. (20.36)

Hence, we have demonstrated the improvements using this example.
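The computation behind (20.35) and (20.36) is easy to automate. The following minimal Python sketch (ours, not part of the original text) evaluates both criteria for the constants of this example and recovers the threshold 0.461983163 by bisection.

    # Quick numerical check of criteria (20.28) and (20.29) for Example 20.4.1.
    def h(q):            # Kantorovich quantity h = L*xi, L = 2(2 - q), xi = (1 - q)/3
        return 2.0 * (2.0 - q) * (1.0 - q) / 3.0

    def h1(q):           # new quantity h1 = K*xi, K = 2(1 + 1/L0), L0 = 3 - q
        K = 2.0 * (1.0 + 1.0 / (3.0 - q))
        return K * (1.0 - q) / 3.0

    # (20.35): h > 1/2 on [0, 1/2), so the classical criterion always fails here.
    assert all(h(q) > 0.5 for q in [0.0, 0.25, 0.49])

    # (20.36): h1 <= 1/2 once q exceeds roughly 0.461983163; locate the crossover.
    lo, hi = 0.0, 0.5
    for _ in range(60):                     # bisection on h1(q) = 1/2 (h1 is decreasing)
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if h1(mid) > 0.5 else (lo, mid)
    print(0.5 * (lo + hi))                  # ~0.461983163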
References

[1] S. Amat, S. Busquier and J. M. Gutiérrez, Geometric constructions of iter-


ative functions to solve nonlinear equations, J. Comput. Appl. Math., 157,
197–205, 2003.
[2] I. K. Argyros, Computational theory of iterative methods, Studies in Com-
putational Mathematics, 15, Editors: K. Chui and L. Wuytack. Elsevier,
New York, U.S.A., 2007.
[3] I. K. Argyros, Concerning the convergence of Newton’s method and
quadratic majorants, J. Appl. Math. Comput., 29, 391–400, 2009.
[4] I. K. Argyros and S. Hilout, On the Gauss-Newton method, J. Appl. Math.,
1–14, 2010.
[5] I. K. Argyros and S. Hilout, Extending the applicability of the Gauss-
Newton method under average Lipschitz-conditions, Numer. Algorithms,
58, 23–52, 2011.
[6] I. K. Argyros, A semilocal convergence analysis for directional Newton
methods, Math. Comput., AMS, 80, 327–343, 2011.
[7] I. K. Argyros, Y. J. Cho and S. Hilout, Numerical methods for equations and
its applications, CRC Press/Taylor and Francis Group, New York, 2012.
[8] I. K. Argyros and S. Hilout, Improved local convergence of Newton’s
method under weaker majorant condition, J. Comput. Appl. Math., 236 (7),
1892–1902, 2012.
[9] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method, J. Complexity, 28, 364–387, 2012.
[10] I. K. Argyros and S. Hilout, Computational Methods in Nonlinear Analysis,
World Scientific Publ. Comp., New Jersey, 2013.

[11] A. Ben-Israel and T. N. E. Greville, Generalized inverses. CMS Books in


Mathematics/Ouvrages de Mathematiques de la SMC, 15. Springer-Verlag,
New York, second edition, Theory and Applications, 2003.
[12] J. V. Burke and M. C. Ferris, A Gauss-Newton method for convex compos-
ite optimization, Math. Programming Ser A., 71, 179–194, 1995.
[13] J. E. Jr. Dennis and R. B. Schnabel, Numerical methods for unconstrained
optimization and nonlinear equations, Classics in Appl. Math., 16, SIAM,
Philadelphia, PA, 1996.
[14] O. P. Ferreira, M. L. N. Gonçalves and P. R. Oliveira, Local convergence
analysis of the Gauss–Newton method under a majorant condition, J. Com-
plexity, 27 (1), 111–125, 2011.
[15] O. P. Ferreira and B. F. Svaiter, Kantorovich’s theorem on Newton’s method
in Riemmanian manifolds, J. Complexity, 18 (1), 304–329, 2002.
[16] O. P. Ferreira and B. F. Svaiter, Kantorovich’s majorants principle for New-
ton’s method, Comput. Optim. Appl., 42, 213–229, 2009.
[17] O. P. Ferreira, M. L. N. Gonçalves and P. R. Oliveira, Convergence of the
Gauss-Newton method for convex composite optimization under a majo-
rant condition, SIAM J. Optim., 23 (3), 1757–1783, 2013.
[18] W. M. Häussler, A Kantorovich-type convergence analysis for the Gauss-
Newton method. Numer. Math., 48, 119–125 , 1986.
[19] J. B. Hiriart-Urruty and C. Lemaréchal, Convex analysis and minimization
algorithms (two volumes). I. Fundamentals. II. Advanced theory and bundle
methods, 305 and 306, Springer–Verlag, Berlin, 1993.
[20] L. V. Kantorovich and G. P. Akilov, Functional Analysis, Pergamon Press,
Oxford, 1982.
[21] C. Li and K. F. Ng, Majorizing functions and convergence of the Gauss-
Newton method for convex composite optimization, SIAM J. Optim., 18,
2, 613–692 (2007).
[22] C. Li and X. H. Wang, On convergence of the Gauss-Newton method for
convex composite optimization, Math. Program. Ser A., 91, 349–356, 2002.
[23] Á. A. Magreñán and I. K. Argyros, Expanding the applicability of the Gauss-
Newton method for convex optimization under a majorant condition, SeMA,
65, 37–56, 2014.
[24] F. A. Potra, Sharp error bounds for a class of Newton-like methods, Libertas
Mathematica, 5, 71–84, 1985.

[25] S. M. Robinson, Extension of Newton’s method to nonlinear functions with


values in a cone, Numer. Math., 19, 341–347, 1972.
[26] R. T. Rockafellar, Monotone processes of convex and concave type, Mem-
oirs of the American Mathematical Society, 77. American Mathematical
Society, Providence, R.I., 1967.
[27] R. T. Rockafellar, Convex analysis, Princeton Mathematical Series, 28,
Princeton University Press, Princeton, N.J., 1970.
[28] S. Smale, Newton’s method estimates from data at one point. The merging
of disciplines: new directions in pure, applied, and computational mathe-
matics (Laramie, Wyo., 1985), 185–196, Springer, New York, 1986.
[29] X. H. Wang, Convergence of Newton’s method and uniqueness of the so-
lution of equations in Banach space, IMA J. Numer. Anal., 20, 123–134,
2000.
Chapter 21

Ball Convergence for


eighth order method

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
21.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
21.2 Local convergence analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
21.3 Numerical examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326

21.1 Introduction
Consider the problem of approximating a locally unique solution x∗ of the non-
linear equation
F(x) = 0, (21.1)
where F is a Fréchet-differentiable operator defined on a convex subset D of
a Banach space X with values in a Banach space Y. Equation (21.1) covers a
wide range of problems in classical analysis and applications [1–30]. Closed
form solutions of these nonlinear equations exist only for a few special cases, which

may not be of much practical value. Therefore, solutions of these nonlinear equations (21.1) are approximated by iterative methods.
Newton-like methods are undoubtedly the most popular methods for approximating a locally unique solution x∗ , provided that the initial point is close enough to the solution [1–30]. The number of function evaluations per step increases with the order of convergence. In the scalar case the efficiency index [3, 6, 21] EI = p^{1/m} provides a measure of balance, where p is the order of the method and m is the number of function evaluations.
It is well known that, according to the Kung-Traub conjecture, the convergence order of any multi-point method without memory cannot exceed the upper bound 2^{m−1} [21] (called the optimal order). Hence the optimal order for a method with three function evaluations per step is 4. The corresponding efficiency index is EI = 4^{1/3} = 1.58740 . . ., which is better than that of Newton's method, EI = 2^{1/2} = 1.414 . . .. Therefore, the study of new optimal methods of order four is important.
In the present paper, we consider the following method:

yn = xn − (1/2)F′(xn )⁻¹F(xn ),
zn = (1/3)(4yn − xn ),
un = yn + (F′(xn ) − 3F′(zn ))⁻¹F(xn ), (21.2)
vn = un + 2(F′(xn ) − 3F′(zn ))⁻¹F(un ),
xn+1 = vn + 2(F′(xn ) − 3F′(zn ))⁻¹F(vn ).
Using hypotheses only up to the first derivative of function F and contractions in a Banach space setting, we prove the convergence of the method. Moreover, we avoid Taylor expansions and use instead Lipschitz parameters. We do not have to use higher order derivatives to show the convergence of method (21.2). This way we expand the applicability of method (21.2).
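To make the five substeps concrete, the following minimal sketch (assuming numpy; the driver, test function and starting point are ours and not part of the original text) instantiates method (21.2) for the three-dimensional test function used in Section 21.3.

    import numpy as np

    def F(w):
        x, y, z = w
        return np.array([np.exp(x) - 1.0, (np.e - 1.0) / 2.0 * y**2 + y, z])

    def J(w):                                           # Frechet derivative of F
        x, y, z = w
        return np.diag([np.exp(x), (np.e - 1.0) * y + 1.0, 1.0])

    def step(x):
        y = x - 0.5 * np.linalg.solve(J(x), F(x))       # y_n = x_n - (1/2) F'(x_n)^{-1} F(x_n)
        z = (4.0 * y - x) / 3.0                         # z_n = (1/3)(4 y_n - x_n)
        A = J(x) - 3.0 * J(z)                           # A_n = F'(x_n) - 3 F'(z_n)
        u = y + np.linalg.solve(A, F(x))                # u_n = y_n + A_n^{-1} F(x_n)
        v = u + 2.0 * np.linalg.solve(A, F(u))          # v_n = u_n + 2 A_n^{-1} F(u_n)
        return v + 2.0 * np.linalg.solve(A, F(v))       # x_{n+1} = v_n + 2 A_n^{-1} F(v_n)

    x = np.array([0.005, 0.005, 0.005])                 # close to the solution x* = (0,0,0)
    for n in range(4):
        x = step(x)
        print(n, np.linalg.norm(F(x)))                  # residuals should drop very rapidly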
The rest of the paper is organized as follows. In Section 21.2 we present the
local convergence analysis. We also provide a radius of convergence, computable
error bounds and uniqueness result not given in the earlier studies using Taylor
expansions [13, 18, 20]. Special cases and numerical examples are presented in
the concluding Section 21.3.

21.2 Local convergence analysis


The local convergence analysis of method (21.2) that follows is based on some
scalar functions and parameters. Let v, w0 , w be continuous and non-decreasing
functions defined on [0,+∞) with values in [0,+∞) satisfying w0 (0) = w(0) = 0.
Let r0 be the smallest positive solution of the equation
w0 (t) = 1. (21.3)

Define functions g1 , g2 , p, h_p on the interval [0, r0 ) by

g1 (t) = [∫₀¹ w((1 − θ)t) dθ + (1/2) ∫₀¹ v(θt) dθ] / (1 − w0 (t)),

g2 (t) = (1/3)(1 + 4g1 (t)),

p(t) = (1/2)(w0 (t) + 3w0 (g2 (t)t)),

h_p (t) = p(t) − 1.
We have that h_p (0) = −1 < 0 and h_p (t) → +∞ as t → r0⁻. It follows from the intermediate value theorem that function h_p has zeros in the interval (0, r0 ). Denote by r_p the smallest such zero. Moreover, define functions gi , hi , i = 3, 4, 5 on the interval [0, r0 ) by

g3 (t) = ∫₀¹ w((1 − θ)t) dθ / (1 − w0 (t)) + (3/4)[w0 (t) + w0 (g2 (t)t)] ∫₀¹ v(θt) dθ / ((1 − w0 (t))(1 − p(t))),

g4 (t) = (1 + 2/(1 − p(t))) g3 (t),

g5 (t) = (1 + 2/(1 − p(t))) g4 (t)

and

hi (t) = gi (t) − 1.

Then, we get that hi (0) = −1 < 0 and hi (t) → +∞ as t → r_p⁻. Denote by ri the smallest zeros of the functions hi on the interval (0, r_p ). Furthermore, define the radius of convergence r by

r = min{ri }. (21.4)

Finally, define parameters r1 and r2 by r1 = g1 (r)r and r2 = g2 (r)r. Then, we have that for each t ∈ [0, r)

0 ≤ gi (t) < 1, i = 3, 4, 5. (21.5)
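Each of the radii above is the smallest positive zero of a scalar function, so they can all be located by bisection. The helper below is a minimal sketch (ours, not from the original text); t_max must be supplied so that w0(t_max) > 1, and the quadrature and bisection tolerances are arbitrary choices.

    def integral(f, t, m=200):                       # midpoint rule for integrals over [0,1]
        return sum(f((k + 0.5) / m, t) for k in range(m)) / m

    def smallest_zero(h, a, b, it=80):               # bisection, assumes h(a) < 0 < h(b)
        for _ in range(it):
            c = 0.5 * (a + b)
            a, b = (c, b) if h(c) < 0.0 else (a, c)
        return 0.5 * (a + b)

    def radii(w0, w, v, t_max):
        r0 = smallest_zero(lambda t: w0(t) - 1.0, 0.0, t_max)          # (21.3)
        g1 = lambda t: (integral(lambda s, t: w((1 - s) * t), t)
                        + 0.5 * integral(lambda s, t: v(s * t), t)) / (1.0 - w0(t))
        g2 = lambda t: (1.0 + 4.0 * g1(t)) / 3.0
        p  = lambda t: 0.5 * (w0(t) + 3.0 * w0(g2(t) * t))
        rp = smallest_zero(lambda t: p(t) - 1.0, 0.0, r0)
        g3 = lambda t: (integral(lambda s, t: w((1 - s) * t), t) / (1.0 - w0(t))
                        + 0.75 * (w0(t) + w0(g2(t) * t))
                        * integral(lambda s, t: v(s * t), t)
                        / ((1.0 - w0(t)) * (1.0 - p(t))))
        g4 = lambda t: (1.0 + 2.0 / (1.0 - p(t))) * g3(t)
        g5 = lambda t: (1.0 + 2.0 / (1.0 - p(t))) * g4(t)
        r3, r4, r5 = (smallest_zero(lambda t, g=g: g(t) - 1.0, 1e-12, rp)
                      for g in (g3, g4, g5))
        return min(r3, r4, r5), rp                   # r = min{r_i} as in (21.4)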
Let U(z, ρ), Ū(z, ρ) stand for the open and closed balls in X, respectively, with center z ∈ X and radius ρ > 0. Next, we present the local convergence analysis of method (21.2) using the preceding notation.

Theorem 21.2.1 Let F : D ⊂ X → Y be a Fréchet differentiable operator. Suppose:

(a) There exists x∗ ∈ D such that F(x∗ ) = 0 and F′(x∗ )⁻¹ ∈ L(Y, X);

(b) There exists a function w0 : [0, +∞) → [0, +∞), continuous and non-decreasing with w0 (0) = 0, such that for each x ∈ D,

‖F′(x∗ )⁻¹(F′(x) − F′(x∗ ))‖ ≤ w0 (‖x − x∗ ‖).

(c) Let R = max{r1 , r2 , r}. For r0 defined by (21.3), set D0 = D ∩ U(x∗ , r̄0 ), where r̄0 = max{r1 , r2 , r0 }. There exist functions w, v : [0, +∞) → [0, +∞), continuous and non-decreasing with w(0) = 0, such that for each x, y ∈ D0 ,

‖F′(x∗ )⁻¹(F′(x) − F′(y))‖ ≤ w(‖x − y‖)

and

‖F′(x∗ )⁻¹F′(x)‖ ≤ v(‖x − x∗ ‖).

(d) U(x∗ , R) ⊆ D.

Then, the sequence {xn } generated for x0 ∈ U(x∗ , r) − {x∗ } is well defined for each n = 0, 1, 2, . . . and converges to x∗ . Moreover, the following estimates hold for each n = 0, 1, . . .

‖yn − x∗ ‖ ≤ g1 (‖xn − x∗ ‖)‖xn − x∗ ‖ ≤ g1 (r)r = r1 , (21.6)

‖zn − x∗ ‖ ≤ g2 (‖xn − x∗ ‖)‖xn − x∗ ‖ ≤ g2 (r)r = r2 , (21.7)

‖un − x∗ ‖ ≤ g3 (‖xn − x∗ ‖)‖xn − x∗ ‖ ≤ ‖xn − x∗ ‖ < r, (21.8)

‖vn − x∗ ‖ ≤ g4 (‖xn − x∗ ‖)‖xn − x∗ ‖ ≤ ‖xn − x∗ ‖ (21.9)

and

‖xn+1 − x∗ ‖ ≤ g5 (‖xn − x∗ ‖)‖xn − x∗ ‖ ≤ ‖xn − x∗ ‖, (21.10)

where the functions gi , i = 1, 2, 3, 4, 5 are defined above the Theorem.
Furthermore, if there exists r∗ > r such that

∫₀¹ w0 (θ r∗ ) dθ < 1, (21.11)

then the point x∗ is the only solution of equation F(x) = 0 in D1 = D ∩ U(x∗ , r∗ ).

Proof: We shall show using mathematical induction that the sequence {xn } is well defined, converges to x∗ and that estimates (21.6)–(21.10) are satisfied. Let x ∈ U(x∗ , r). Using (21.3), (21.4) and (b) we have in turn that

‖F′(x∗ )⁻¹(F′(x) − F′(x∗ ))‖ ≤ w0 (‖x − x∗ ‖) ≤ w0 (r) < 1. (21.12)
It follows from (21.12) and the Banach Lemma on invertible operators [4, 5, 28] that F′(x)⁻¹ ∈ L(Y, X) and

‖F′(x)⁻¹F′(x∗ )‖ ≤ 1/(1 − w0 (‖x − x∗ ‖)). (21.13)

In particular, (21.13) holds for x = x0 , since by hypothesis x0 ∈ U(x∗ , r) − {x∗ }, so y0 , z0 are well defined by the first and second substeps of method (21.2) for n = 0. We can write by (a),

y0 − x∗ = (x0 − x∗ − F′(x0 )⁻¹F(x0 )) + (1/2)F′(x0 )⁻¹F(x0 ) (21.14)

and

F(x0 ) = F(x0 ) − F(x∗ ) = ∫₀¹ F′(x∗ + θ(x0 − x∗ ))(x0 − x∗ ) dθ . (21.15)

Notice that ‖x∗ + θ(x0 − x∗ ) − x∗ ‖ = θ‖x0 − x∗ ‖ < r for each θ ∈ [0, 1], so x∗ + θ(x0 − x∗ ) ∈ U(x∗ , r).
Then, using (21.4), (21.5), (c), (21.13), (21.14) and (21.15), we get in turn that

‖y0 − x∗ ‖ ≤ ‖F′(x0 )⁻¹F′(x∗ )‖ ‖∫₀¹ F′(x∗ )⁻¹[F′(x∗ + θ(x0 − x∗ )) − F′(x0 )] dθ‖ ‖x0 − x∗ ‖ + (1/2)‖F′(x0 )⁻¹F′(x∗ )‖ ‖F′(x∗ )⁻¹F(x0 )‖

≤ [∫₀¹ w((1 − θ)‖x0 − x∗ ‖) dθ + (1/2) ∫₀¹ v(θ‖x0 − x∗ ‖) dθ] ‖x0 − x∗ ‖ / (1 − w0 (‖x0 − x∗ ‖))

= g1 (‖x0 − x∗ ‖)‖x0 − x∗ ‖ ≤ g1 (r)r = r1 , (21.16)

which shows (21.6) for n = 0 and y0 ∈ U(x∗ , r1 ).

Then, from the second substep of method (21.2) and (21.16) we also get that

‖z0 − x∗ ‖ = (1/3)‖4(y0 − x∗ ) − (x0 − x∗ )‖

≤ (1/3)[4‖y0 − x∗ ‖ + ‖x0 − x∗ ‖]

≤ (1/3)[4g1 (‖x0 − x∗ ‖) + 1]‖x0 − x∗ ‖ ≤ g2 (r)r = r2 , (21.17)

which shows (21.7) for n = 0 and z0 ∈ U(x∗ , r2 ). Next, we must show that A0 := F′(x0 ) − 3F′(z0 ) is invertible with A0⁻¹ ∈ L(Y, X). Using (b), (21.4) and (21.17), we obtain in turn that

‖(−2F′(x∗ ))⁻¹[A0 − (−2F′(x∗ ))]‖

≤ (1/2)[‖F′(x∗ )⁻¹(F′(x0 ) − F′(x∗ ))‖ + 3‖F′(x∗ )⁻¹(F′(z0 ) − F′(x∗ ))‖]

≤ (1/2)[w0 (‖x0 − x∗ ‖) + 3w0 (‖z0 − x∗ ‖)]

≤ (1/2)[w0 (‖x0 − x∗ ‖) + 3w0 (g2 (‖x0 − x∗ ‖)‖x0 − x∗ ‖)] = p(‖x0 − x∗ ‖) ≤ p(r) < 1, (21.18)

so

‖A0⁻¹F′(x∗ )‖ ≤ 1/(2(1 − p(‖x0 − x∗ ‖))). (21.19)
Hence, u0 , v0 and x1 are well defined. Then, using (21.4), (21.5), (21.13), (21.16), (21.17), (c) and (21.19), we obtain in turn that

‖u0 − x∗ ‖ ≤ ‖x0 − x∗ − F′(x0 )⁻¹F(x0 )‖ + (3/2)‖F′(x0 )⁻¹[(F′(x0 ) − F′(x∗ )) + (F′(x∗ ) − F′(z0 ))]A0⁻¹F(x0 )‖

≤ ∫₀¹ w((1 − θ)‖x0 − x∗ ‖) dθ ‖x0 − x∗ ‖ / (1 − w0 (‖x0 − x∗ ‖))

+ (3/4)[w0 (‖x0 − x∗ ‖) + w0 (g2 (‖x0 − x∗ ‖)‖x0 − x∗ ‖)] ∫₀¹ v(θ‖x0 − x∗ ‖) dθ ‖x0 − x∗ ‖ / ((1 − w0 (‖x0 − x∗ ‖))(1 − p(‖x0 − x∗ ‖)))

= g3 (‖x0 − x∗ ‖)‖x0 − x∗ ‖ ≤ ‖x0 − x∗ ‖ < r, (21.20)

‖v0 − x∗ ‖ ≤ ‖u0 − x∗ ‖ + 2 ∫₀¹ v(θ‖x0 − x∗ ‖) dθ ‖x0 − x∗ ‖ / (1 − p(‖x0 − x∗ ‖))

≤ (1 + 2/(1 − p(‖x0 − x∗ ‖))) g3 (‖x0 − x∗ ‖)‖x0 − x∗ ‖ (21.21)

= g4 (‖x0 − x∗ ‖)‖x0 − x∗ ‖ ≤ ‖x0 − x∗ ‖ < r

and

‖x1 − x∗ ‖ ≤ ‖v0 − x∗ ‖ + 2‖v0 − x∗ ‖ / (1 − p(‖x0 − x∗ ‖))

≤ (1 + 2/(1 − p(‖x0 − x∗ ‖))) g4 (‖x0 − x∗ ‖)‖x0 − x∗ ‖ (21.22)

= g5 (‖x0 − x∗ ‖)‖x0 − x∗ ‖ ≤ ‖x0 − x∗ ‖ < r,
which show (21.8)–(21.10) for n = 0 and u0 , v0 , x1 ∈ U(x∗ , r). By simply replacing x0 , y0 , z0 , u0 , v0 , x1 by xk , yk , zk , uk , vk , xk+1 in the preceding estimates we arrive at (21.6)–(21.10). Then, in view of the estimate

‖xk+1 − x∗ ‖ ≤ c‖xk − x∗ ‖ < r, (21.23)

where c = g5 (‖x0 − x∗ ‖) ∈ [0, 1), we conclude that limk→∞ xk = x∗ and xk+1 ∈ U(x∗ , r). Let y∗ ∈ D1 with F(y∗ ) = 0 and set Q = ∫₀¹ F′(x∗ + θ(y∗ − x∗ )) dθ . Using (b) and (21.11), we get that

‖F′(x∗ )⁻¹(Q − F′(x∗ ))‖ ≤ ∫₀¹ w0 (θ‖x∗ − y∗ ‖) dθ ≤ ∫₀¹ w0 (θ r∗ ) dθ < 1, (21.24)

so Q⁻¹ ∈ L(Y, X). Then, from the identity 0 = F(y∗ ) − F(x∗ ) = Q(y∗ − x∗ ), we deduce that x∗ = y∗ . □
Remark 21.2.2 (a) The radius r1 was obtained by Argyros in [4] as the convergence radius for Newton's method under conditions (21.9)–(21.11). Notice that the convergence radius for Newton's method given independently by Rheinboldt [26] and Traub [28] is

ρ = 2/(3L) < r1 .

As an example, let us consider the function f (x) = eˣ − 1. Then x∗ = 0. Set Ω = U(0, 1). Then, we have that L0 = e − 1 < L = e, so ρ = 0.2452529 . . . < r1 = 0.324947231.
Moreover, the new error bounds [2, 14] are

‖xn+1 − x∗ ‖ ≤ L‖xn − x∗ ‖² / (1 − L0 ‖xn − x∗ ‖),

whereas the old ones [26, 28] are

‖xn+1 − x∗ ‖ ≤ L‖xn − x∗ ‖² / (1 − L‖xn − x∗ ‖).

Clearly, the new error bounds are more precise if L0 < L. Clearly, we do not expect the radius of convergence of method (21.2) given by r to be larger than r1 (see (21.5)).
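These two radii are one-line computations; the short sketch below (ours, with the closed form r1 = 2/(2L0 + L) taken from [4] as an assumption) reproduces the values just quoted.

    import math
    L0, L = math.e - 1.0, math.e
    rho = 2.0 / (3.0 * L)           # Rheinboldt/Traub radius: 0.2452529...
    r1  = 2.0 / (2.0 * L0 + L)      # Argyros radius:          0.3249472...
    print(rho, r1, rho < r1)        # prints True for the comparison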
(b) The local results can be used for projection methods such as Arnoldi’s
method, the generalized minimum residual method (GMREM), the gen-
eralized conjugate method (GCM) for combined Newton/finite projection
methods and in connection to the mesh independence principle in order to
develop the cheapest and most efficient mesh refinement strategy [4–7].

(c) The results can also be used to solve equations where the operator F′ satisfies the autonomous differential equation [5, 7, 15, 17]

F′(x) = P(F(x)),

where P : Y → Y is a known continuous operator. Since F′(x∗ ) = P(F(x∗ )) = P(0), we can apply the results without actually knowing the solution x∗ . As an example, let F(x) = eˣ − 1. Then, we can choose P(x) = x + 1 and x∗ = 0.
(d) It is worth noticing that method (21.2) does not change if we use the new instead of the old conditions [23]. Moreover, for the error bounds in practice we can use the computational order of convergence (COC)

ξ = ln(‖xn+2 − xn+1 ‖/‖xn+1 − xn ‖) / ln(‖xn+1 − xn ‖/‖xn − xn−1 ‖), for each n = 1, 2, . . .

or the approximate computational order of convergence (ACOC)

ξ∗ = ln(‖xn+2 − x∗ ‖/‖xn+1 − x∗ ‖) / ln(‖xn+1 − x∗ ‖/‖xn − x∗ ‖), for each n = 0, 1, 2, . . .

instead of the error bounds obtained in Theorem 21.2.1.
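Since the COC only requires successive iterates, it is easy to compute; the helper below is an illustrative sketch (ours, not from the original text), and the sample Newton run is only there to exercise it.

    import math

    def coc(xs):
        # needs at least five consecutive iterates x_{n-1}, ..., x_{n+2}
        d = [abs(xs[k + 1] - xs[k]) for k in range(len(xs) - 1)]
        return [math.log(d[k + 1] / d[k]) / math.log(d[k] / d[k - 1])
                for k in range(1, len(d) - 1)]

    # e.g. Newton iterates for x^2 - 2 = 0 (order 2 expected):
    xs = [1.5]
    for _ in range(4):
        xs.append(xs[-1] - (xs[-1] ** 2 - 2.0) / (2.0 * xs[-1]))
    print(coc(xs))   # entries approach 2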


(e) In view of (21.10) and the estimate

‖F′(x∗ )⁻¹F′(x)‖ = ‖F′(x∗ )⁻¹(F′(x) − F′(x∗ )) + I‖ ≤ 1 + ‖F′(x∗ )⁻¹(F′(x) − F′(x∗ ))‖ ≤ 1 + L0 ‖x − x∗ ‖,

condition (21.12) can be dropped and M can be replaced by

M(t) = 1 + L0 t

or

M(t) = M = 2,

since t ∈ [0, 1/L0 ).

21.3 Numerical examples


The numerical examples are presented in this section.
Example 21.3.1 Let X = Y = ℝ³, D = Ū(0, 1), x∗ = (0, 0, 0)ᵀ. Define function F on D for w = (x, y, z)ᵀ by

F(w) = (eˣ − 1, ((e − 1)/2)y² + y, z)ᵀ.

Then, the Fréchet-derivative is given by

F′(w) = diag(eˣ, (e − 1)y + 1, 1).

Notice that, using the conditions of Theorem 21.2.1, we get w0 (t) = L0 t, w(t) = Lt, v(t) = L, L0 = e − 1, L = e^{1/L0}. The parameters are

r p = 0.1165, r3 = 0.0577, r4 = 0.0261, r5 = 0.0107 = r, r1 = 0.0098, r2 = 0.0250.

Example 21.3.2 Let X = Y = C[0, 1], the space of continuous functions defined on [0, 1], equipped with the max norm. Let D = U(0, 1). Define function F on D by

F(ϕ)(x) = ϕ(x) − 5 ∫₀¹ xθ ϕ(θ)³ dθ . (21.25)

We have that

F′(ϕ)(ξ )(x) = ξ (x) − 15 ∫₀¹ xθ ϕ(θ)² ξ (θ) dθ , for each ξ ∈ D.

Then, we get that x∗ = 0, w0 (t) = L0 t, w(t) = Lt, v(t) = 2, L0 = 7.5, L = 15. The parameters for the method are

r p = 0.0054, r3 = 0.0022, r4 = 0.0229, r5 = 0.0115 = r, r1 = 0.0949, r2 = 0.02889.


References

[1] M. F. Abad, A. Cordero and J. R. Torregrosa, Fourth and Fifth-order meth-


ods for solving nonlinear systems of equations: An application to the global
positioning system, Abstract and Applied Analysis Volume 2013, Article ID
586708, 10 pages.
[2] S. Amat, M. A. Hernández and N. Romero, Semilocal convergence of a
sixth order iterative method for quadratic equations, Appl. Numer. Math.,
62, 833–841, 2012.
[3] I. K. Argyros, Computational theory of iterative methods. Series: Studies
in Computational Mathematics, 15, Editors: C. K. Chui and L. Wuytack,
Elsevier Publ. Co. New York, U.S.A., 2007.
[4] I. K. Argyros, A semilocal convergence analysis for directional Newton
methods, Math. Comput., 80, 327–343, 2011.
[5] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method, J. Complexity, 28, 364–387, 2012.
[6] I. K. Argyros and Said Hilout, Computational methods in nonlinear analy-
sis. Efficient algorithms, fixed point theory and applications, World Scien-
tific, 2013.
[7] I. K. Argyros and H. Ren, Improved local analysis for certain class of iter-
ative methods with cubic convergence, Numer. Algorithms, 59, 505–521,
2012.
[8] V. Candela and A. Marquina, Recurrence relations for rational cubic meth-
ods I: The Halley method, Computing, 44, 169–184, 1990.
[9] V. Candela and A. Marquina, Recurrence relations for rational cubic meth-
ods II: The Chebyshev method, Computing, 45 (4), 355–367, 1990.

[10] C. Chun, P. Stănică and B. Neta, Third-order family of methods in Banach


spaces, Comput. Appl. Math. with Appl., 61, 1665–1675, 2011.
[11] A. Cordero, F. Martinez and J. R. Torregrosa, Iterative methods of order
four and five for systems of nonlinear equations, J. Comput. Appl. Math.,
231, 541–551, 2009.
[12] A. Cordero, J. Hueso, E. Martinez and J. R. Torregrosa, A modified
Newton-Jarratt’s composition, Numer. Algorithms, 55, 87–99, 2010.
[13] A. Cordero, J. R. Torregrosa and M. P. Vasileva, Increasing the order of
convergence of iterative schemes for solving nonlinear systems, J. Comput.
Appl. Math., 252, 86–94, 2013.
[14] R. Ezzati and E. Azandegan, A simple iterative method with fifth order
convergence by using Potra and Ptak’s method, Math. Sci., 3, 2, 191–200,
2009.
[15] J. M. Gutiérrez, Á. A. Magreñán and N. Romero, On the semi-local conver-
gence of Newton-Kantorovich method under center-Lipschitz conditions,
Appl. Math. Comput., 221, 79–88, 2013.
[16] V. I. Hasanov, I. G. Ivanov and F. Nebzhibov, A new modification of New-
ton’s method, Appl. Math. Eng., 27, 278–286, 2002.
[17] M. A. Hernández and M. A. Salanova, Modification of the Kantorovich
assumptions for semi-local convergence of the Chebyshev method, J. Com-
put. Appl. Math., 126, 131–143, 2000.
[18] J. P. Jaiswal, Semilocal convergence of an eighth-order method in Banach
spaces and its computational efficiency, Numer. Algorithms, 71, 933–951,
2016.
[19] L. V. Kantorovich and G. P. Akilov, Functional Analysis, Pergamon Press,
Oxford, 1982.
[20] J. S. Kou, Y. T. Li and X. H. Wang, A modification of Newton method with
third-order convergence, Appl. Math. Comput., 181, 1106–1111, 2006.
[21] Á. A. Magreñán, Different anomalies in a Jarratt family of iterative root
finding methods, Appl. Math. Comput., 233, 29–38, 2014.
[22] Á. A. Magreñán, A new tool to study real dynamics: The convergence
plane, Appl. Math. Comput., 248, 29–38, 2014.
[23] M. S. Petkovic, B. Neta, L. Petkovic and J. Džunič, Multipoint methods for
solving nonlinear equations, Elsevier, 2013.

[24] F. A. Potra and V. Pták, Nondiscrete Induction and Iterative Processes, in:
Research Notes in Mathematics, Vol. 103, Pitman, Boston, 1984.
[25] H. Ren and Q. Wu, Convergence ball and error analysis of a family of
iterative methods with cubic convergence, Appl. Math. Comput., 209, 369–
378, 2009.
[26] W. C. Rheinboldt, An adaptive continuation process for solving systems of
nonlinear equations, In: Mathematical models and numerical methods (A.
N. Tikhonov et al. eds.) pub. 3, 129–142 Banach Center, Warsaw Poland,
1977.
[27] J. R. Sharma, P. K. Guha and R. Sharma, An efficient fourth order weighted-
Newton method for systems of nonlinear equations, Numer. Algorithms, 62
(2), 307–323, 2013.
[28] J. F. Traub, Iterative methods for the solution of equations, AMS Chelsea
Publishing, 1982.
[29] S. Weerakoon and T. G. I. Fernando, A variant of Newton’s method with
accelerated third-order convergence, Appl. Math. Let., 13, 87–93, 2000.
[30] T. Zhanlar, O. Chulunbaatar and G. Ankhbayar, On Newton-type methods
with fourth and fifth-order convergence, Bulletin of PFUR Series Mathe-
matics. Information Sciences, Physics, 2 (2), 29–34, 2010.
Chapter 22

Expanding Kantorovich’s
theorem for solving
generalized equations

Ioannis K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505,
USA, Email: iargyros@cameron.edu

Á. Alberto Magreñán


Universidad Internacional de La Rioja, Escuela Superior de Ingenierı́a y Tecnologı́a,
26002 Logroño, La Rioja, Spain, Email: alberto.magrenan@unir.net

CONTENTS
22.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
22.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
22.3 Semilocal convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333

22.1 Introduction
In [18], G. S. Silva considered the problem of approximating the solution of the
generalized equation
F(x) + Q(x)  0, (22.1)
where F : D −→ H is a Fréchet differentiable function, H is a Hilbert space
with inner product ., . and corresponding norm ., D ⊆ H an open set and T :
331
332  Iterative methods and their dynamics with applications: a contemporary study

H ⇒ H is set-valued and maximal monotone. It is well known that the system of


nonlinear equations and abstract inequality system can be modeled as equation of
the form (22.1) [17]. If ψ : H −→ (−∞, +∞] is a proper lower semi continuous
convex function and

Q(x) = ∂ ψ(x) = {u ∈ H : ψ(y) ≥ ψ(x) + u, y − x}, for all y ∈ H (22.2)

then (22.1) becomes the variational inequality problem

F(x) + ∂ ψ(x)  0,

including linear and nonlinear complementary problems. Newton’s method for


solving (22.1) for an initial guess x0 is defined by

F(xk ) + F  (xk )(xk+1 − xk ) + Q(xk+1 )  0, k = 0, 1, 2 . . . (22.3)

has been studied by several authors [1]– [24]. In [13], Kantorovich obtained a
convergence result for Newton’s method for solving the equation F(x) = 0 under
some assumptions on the derivative F  (x0 ) and F  (x0 )−1 F(x0 ). Kantorovich,
used the majorization principle to prove his results. Later in [16], Robinson con-
sidered generalization of the Kantorovich theorem of the type F(x) ∈ K, where
K is a nonempty closed and convex cone, and obtained convergence results and
error bounds for this method. Josephy [12], considered a semilocal Newton’s
method of the kind (22.3) in order to solving (22.1) with F = NC the normal
cone mapping of a convex set C ⊂ R2 .
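For intuition, a one-dimensional sketch of iteration (22.3) is given below; the choice Q = N_C with C = [0, +∞) and the test function are ours, and only illustrate how each linearized inclusion is solved.

    # The subproblem 0 in F(x_k) + F'(x_k)(y - x_k) + N_C(y) with C = [0, +inf)
    # is the scalar linear complementarity problem: y >= 0, g(y) >= 0, y*g(y) = 0,
    # where g(y) = F(x_k) + F'(x_k)(y - x_k).

    def F(x):  return (x + 1.0) ** 2 - 0.25     # illustrative; the solution is y* = 0
    def dF(x): return 2.0 * (x + 1.0)

    def josephy_newton_step(x):
        y = x - F(x) / dF(x)                    # unconstrained Newton point, g(y) = 0
        # for strictly increasing F the two cases below are exhaustive:
        # y < 0 happens exactly when g(0) = F(x) - dF(x)*x > 0, so 0 solves the LCP
        return y if y >= 0.0 else 0.0

    x = 1.0
    for k in range(5):
        x = josephy_newton_step(x)
        print(k, x)                             # 0.0625, then 0.0 (the solution)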
The rest of this paper is organized as follows. Preliminaries are given in Sec-
tion 22.2 and the main results are presented in the concluding Section 22.3.

22.2 Preliminaries
Let U(x, ρ) and Ū(x, ρ) stand, respectively, for the open and closed balls in H with center x ∈ H and radius ρ > 0. The following definitions and lemmas are used for proving our results. These items are stated briefly here in order to make the paper as self-contained as possible. More details can be found in [18].

Definition 22.2.1 Let D ⊆ H be an open nonempty subset of H, h : D −→ H be a Fréchet differentiable function with Fréchet derivative h′ and Q : H ⇒ H be a set-valued mapping. The partial linearization of the mapping h + Q at x ∈ H is the set-valued mapping L_h (x, ·) : H ⇒ H given by

L_h (x, y) := h(x) + h′(x)(y − x) + Q(y). (22.4)

For each x ∈ H, the inverse L_h (x, ·)⁻¹ : H ⇒ H of the mapping L_h (x, ·) at z ∈ H is defined by

L_h (x, ·)⁻¹(z) := {y ∈ H : z ∈ h(x) + h′(x)(y − x) + Q(y)}. (22.5)



Definition 22.2.2 Let Q : H ⇒ H be a set-valued operator. Q is said to be monotone if for any x, y ∈ dom Q, u ∈ Q(y) and v ∈ Q(x), the following inequality holds:

⟨u − v, y − x⟩ ≥ 0.

A subset of H × H is monotone if it is the graph of a monotone operator. If ϕ : H −→ (−∞, +∞] is a proper function, then the subgradient of ϕ is monotone.

Definition 22.2.3 Let Q : H ⇒ H be monotone. Then Q is maximal monotone if the following implication holds for all x, u ∈ H:

⟨u − v, x − y⟩ ≥ 0 for each y ∈ dom Q and v ∈ Q(y) ⇒ x ∈ dom Q and u ∈ Q(x). (22.6)

Lemma 22.2.4 ([22]) Let G be a positive operator (i.e., ⟨G(x), x⟩ ≥ 0). The following statements about G hold:

• ‖G²‖ = ‖G‖²;
• if G⁻¹ exists, then G⁻¹ is a positive operator.

Lemma 22.2.5 ([21, Lemma 2.2]) Let G be a positive operator. Suppose G⁻¹ exists; then for each x ∈ H we have

⟨G(x), x⟩ ≥ ‖x‖² / ‖G⁻¹‖.

22.3 Semilocal convergence


We present the semilocal convergence analysis of the generalized Newton's method using some more flexible scalar majorant functions than in [18].

Theorem 22.3.1 Let F : D ⊆ H −→ H be continuous with Fréchet derivative F′ continuous on D. Let also Q : H ⇒ H be a set-valued operator. Suppose that there exists x0 ∈ D such that F̄′(x0 ) is a positive operator and F̄′(x0 )⁻¹ exists, where F̄′(u) := (F′(u) + F′(u)∗ )/2 denotes the symmetrization of F′(u) (with ∗ the adjoint). Let R > 0 and ρ := sup{t ∈ [0, R) : U(x0 , t) ⊆ D}. Suppose that there exists f0 : [0, R) −→ ℝ twice continuously differentiable such that for each x ∈ U(x0 , ρ)

‖F̄′(x0 )⁻¹‖ ‖F′(x) − F′(x0 )‖ ≤ f0′(‖x − x0 ‖) − f0′(0). (22.7)

Moreover, suppose that:

(h01) f0 (0) > 0 and f0′(0) = −1.
(h02) f0′ is convex and strictly increasing.
(h03) f0 (t) = 0 for some t ∈ (0, R).

Then, the sequence {tn⁰} generated by t0⁰ = 0,

t⁰n+1 = tn⁰ − f0 (tn⁰)/f0′(tn⁰), n = 0, 1, 2, . . . ,

is strictly increasing, remains in (0, t0∗ ) and converges to t0∗ , where t0∗ is the smallest zero of function f0 in (0, R). Furthermore, suppose that there exists f1 : [0, ρ1 ) −→ ℝ, ρ1 = min{ρ, t0∗ }, such that for each x, y ∈ D1 := Ū(x0 , ρ) ∩ U(x0 , t0∗ )

‖F̄′(x0 )⁻¹‖ ‖F′(y) − F′(x)‖ ≤ f1′(‖x − y‖ + ‖x − x0 ‖) − f1′(‖x − x0 ‖) (22.8)

and ‖x1 − x0 ‖ ≤ f1 (0). Moreover, suppose that:

(h11) f1 (0) > 0 and f1′(0) = −1.
(h12) f1′ is convex and strictly increasing.
(h13) f1 (t) = 0 for some t ∈ (0, ρ1 ).
(h14) f0 (t) ≤ f1 (t) and f0′(t) ≤ f1′(t) for each t ∈ [0, ρ1 ).
Then, f1 has a smallest zero t1∗ ∈ (0, ρ1 ), and the sequences generated by the generalized Newton's method for solving the generalized equation F(x) + Q(x) ∋ 0 and the scalar equation f1 (t) = 0, with initial points x0 and t0¹ = 0 (or s0 = 0), respectively,

0 ∈ F(xn ) + F′(xn )(xn+1 − xn ) + Q(xn+1 ),

t¹n+1 = tn¹ − f1 (tn¹)/f1′(tn¹) (or sn+1 = sn − f1 (sn )/f0′(sn ))

are well defined, {tn¹} (or {sn }) is strictly increasing, remains in (0, t1∗ ) and converges to t1∗ . Moreover, the sequence {xn } generated by the generalized Newton's method (22.3) is well defined, remains in U(x0 , t1∗ ) and converges to a point x∗ ∈ Ū(x0 , t1∗ ), which is the unique solution of the generalized equation F(x) + Q(x) ∋ 0 in Ū(x0 , t1∗ ). Furthermore, the following estimates hold:

‖xn − x∗ ‖ ≤ t1∗ − tn¹, ‖xn − x∗ ‖ ≤ t1∗ − sn ,

‖xn+1 − x∗ ‖ ≤ [(t1∗ − t¹n+1 )/(t1∗ − tn¹)²] ‖xn − x∗ ‖²,

‖xn+1 − x∗ ‖ ≤ [(t1∗ − sn+1 )/(t1∗ − sn )²] ‖xn − x∗ ‖²,

sn+1 − sn ≤ t¹n+1 − tn¹,
and the sequences {tn¹}, {sn } and {xn } converge Q-linearly as follows:

‖xn+1 − x∗ ‖ ≤ (1/2)‖xn − x∗ ‖,

t1∗ − t¹n+1 ≤ (1/2)(t1∗ − tn¹)

and

t1∗ − sn+1 ≤ (1/2)(t1∗ − sn ).

Finally, if

(h15) f1′(t1∗ ) < 0,

then the sequences {tn¹}, {sn } and {xn } converge Q-quadratically as follows:

‖xn+1 − x∗ ‖ ≤ [D⁻f1′(t1∗ )/(−2 f1′(t1∗ ))] ‖xn − x∗ ‖²,

‖xn+1 − x∗ ‖ ≤ [D⁻f1′(t1∗ )/(−2 f0′(t1∗ ))] ‖xn − x∗ ‖²,

t1∗ − t¹n+1 ≤ [D⁻f1′(t1∗ )/(−2 f1′(t1∗ ))] (t1∗ − tn¹)²

and

t1∗ − sn+1 ≤ [D⁻f1′(t1∗ )/(−2 f0′(t1∗ ))] (t1∗ − sn )²,

where D⁻ stands for the left directional derivative of function f1′.
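The two scalar sequences {tn¹} and {sn } of the theorem are straightforward to generate; the sketch below uses placeholder quadratic majorants (our choice, with L1 = 2, L0 = 1.5, η = 0.2, picked only so the script runs) to make the recursions concrete.

    L1, L0, eta = 2.0, 1.5, 0.2
    f1  = lambda t: L1 / 2.0 * t * t - t + eta       # illustrative majorant f1
    df1 = lambda t: L1 * t - 1.0                     # f1'
    df0 = lambda t: L0 * t - 1.0                     # f0'

    t, s = 0.0, 0.0
    for n in range(8):
        t = t - f1(t) / df1(t)        # t^1_{n+1} = t^1_n - f1(t^1_n)/f1'(t^1_n)
        s = s - f1(s) / df0(s)        # s_{n+1}   = s_n   - f1(s_n)/f0'(s_n)
        print(n, t, s)                # both increase to the smallest zero t1* of f1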

Remark 22.3.2 (a) Suppose that there exists f : [0, R) −→ ℝ twice continuously differentiable such that for each x, y ∈ U(x0 , ρ)

‖F̄′(x0 )⁻¹‖ ‖F′(y) − F′(x)‖ ≤ f′(‖x − y‖ + ‖x − x0 ‖) − f′(‖x − x0 ‖). (22.9)

If f1 (t) = f0 (t) = f (t) for each t ∈ [0, R), then Theorem 22.3.1 specializes to Theorem 4 in [18]. Otherwise, i.e., if

f0 (t) ≤ f1 (t) ≤ f (t) for each t ∈ [0, ρ1 ), (22.10)

then our Theorem is an improvement of Theorem 4 under the same computational cost, since in practice the computation of function f requires the computation of the functions f0 or f1 as special cases. Moreover, we have that for each t ∈ [0, ρ1 )

f0 (t) ≤ f (t) (22.11)

and

f1 (t) ≤ f (t), (22.12)

leading to tn¹ ≤ tn , sn ≤ tn ,

t¹n+1 − tn¹ ≤ tn+1 − tn , (22.13)

sn+1 − sn ≤ tn+1 − tn (22.14)

and

t1∗ ≤ t∗ , (22.15)

where {tn } is defined by

t0 = 0, tn+1 = tn − f (tn )/f′(tn ),

t∗ = limn→∞ tn and t∗ is the smallest zero of function f in (0, R) (provided that the "h" conditions hold for function f replacing f1 and f0 ). Suppose further that

−f1 (t)/f1′(t) ≤ −f (s)/f′(s) (22.16)

or

−f1 (t)/f0′(t) ≤ −f (s)/f′(s), (22.17)

respectively, for each t ≤ s. Estimates (22.13) and (22.14) can be strict if (22.16) and (22.17) hold as strict inequalities.
(b) So far we have improved the error bounds and the location of the solution x∗ but not necessarily the convergence domain of the generalized Newton's method (22.3). We can also show that the convergence domain can be improved in some interesting special cases. Let Q ≡ {0},

f (t) = (L/2)t² − t + η,

f0 (t) = (L0 /2)t² − t + η

and

f1 (t) = (L1 /2)t² − t + η,

where ‖x1 − x0 ‖ ≤ η and L, L0 and L1 are Lipschitz constants satisfying

‖F̄′(x0 )⁻¹‖ ‖F′(y) − F′(x)‖ ≤ L‖y − x‖,

‖F̄′(x0 )⁻¹‖ ‖F′(x) − F′(x0 )‖ ≤ L0 ‖x − x0 ‖

and

‖F̄′(x0 )⁻¹‖ ‖F′(y) − F′(x)‖ ≤ L1 ‖y − x‖

on the corresponding balls. Then, we have that

L0 ≤ L

and

L1 ≤ L.

The corresponding majorizing sequences are

t0 = 0, t1 = η, tn+1 = tn − f (tn )/f′(tn ) = tn + L(tn − tn−1 )² / (2(1 − Ltn )), n = 1, 2, . . . ,

t0¹ = 0, t1¹ = η, t¹n+1 = tn¹ − f1 (tn¹)/f1′(tn¹) = tn¹ + L1 (tn¹ − t¹n−1 )² / (2(1 − L1 tn¹)), n = 1, 2, . . . ,

s0 = 0, s1 = η, sn+1 = sn − [f1 (sn ) − f1 (sn−1 ) − f1′(sn−1 )(sn − sn−1 )] / f0′(sn )
= sn + L1 (sn − sn−1 )² / (2(1 − L0 sn )), n = 1, 2, . . . .
Then, these sequences converge provided that, respectively,

q = Lη ≤ 1/2 (22.18)

and, for the last two,

q1 = L1 η ≤ 1/2,

so

q ≤ 1/2 =⇒ q1 ≤ 1/2.

It turns out from the proof of Theorem 22.3.1 that the sequence {rn } defined by [6]

r0 = 0, r1 = η, r2 = r1 + L0 (r1 − r0 )² / (2(1 − L0 r1 )),

rn+2 = rn+1 + L1 (rn+1 − rn )² / (2(1 − L0 rn+1 ))

is also a tighter majorizing sequence than the preceding ones for {xn }. The sufficient convergence condition for {rn } is given by [6]

q2 = Kη ≤ 1/2,

where

K = (1/8)(4L0 + √(L1 L0 + 8L0²) + √(L0 L1 )).
Then, we have that

q1 ≤ 1/2 =⇒ q2 ≤ 1/2.

Hence, the old results in [18] have been improved. Similar improvements follow for Smale's alpha theory [2, 6] or Wang's γ-theory [18, 22, 24]. Examples where L0 < L or L1 < L or L0 < L1 can be found in [6]. It is worth noticing that (22.18) is the Newton-Kantorovich hypothesis, famous for its simplicity and clarity, for solving nonlinear equations using Newton's method [13], employed as a sufficient convergence condition in all earlier studies other than ours.

(c) The introduction of (22.8) depends on (22.7) (i.e., f1 depends on f0 ). Such an introduction was not possible before (i.e., when f was used instead of f1 ).

Proof of Theorem 22.3.1: Simply notice that the iterates {xn } remain in D1 , which is a more precise location than the set Ū(x0 , ρ) used in [18], since D1 ⊆ Ū(x0 , ρ). Then, the definition of function f1 becomes possible and replaces f in the proof in [18], whereas for the computation of the upper bounds on ‖F̄′(x)⁻¹‖ we use the more precise function f0 instead of f, as shown in the next perturbation Banach lemma [13]. □

Lemma 22.3.3 Let x0 ∈ D be such that F̄′(x0 ) is a positive operator and F̄′(x0 )⁻¹ exists. If ‖x − x0 ‖ ≤ t < t0∗ , then F̄′(x) is a positive operator and F̄′(x)⁻¹ exists. Moreover,

‖F̄′(x)⁻¹‖ ≤ ‖F̄′(x0 )⁻¹‖ / (−f0′(t)). (22.19)
Proof: Observe that

‖F̄′(x) − F̄′(x0 )‖ ≤ (1/2)‖F′(x) − F′(x0 )‖ + (1/2)‖(F′(x) − F′(x0 ))∗ ‖ = ‖F′(x) − F′(x0 )‖. (22.20)

Let x ∈ Ū(x0 , t), 0 ≤ t < t0∗ . Thus f0′(t) < 0. Using (h01) and (h02), we obtain that

‖F̄′(x0 )⁻¹‖ ‖F̄′(x) − F̄′(x0 )‖ ≤ ‖F̄′(x0 )⁻¹‖ ‖F′(x) − F′(x0 )‖
≤ f0′(‖x − x0 ‖) − f0′(0)
≤ f0′(t) + 1 < 1. (22.21)

So, by the Banach Lemma on invertible operators, F̄′(x)⁻¹ exists. Moreover, by the above inequality,

‖F̄′(x)⁻¹‖ ≤ ‖F̄′(x0 )⁻¹‖ / (1 − ‖F̄′(x0 )⁻¹‖ ‖F̄′(x) − F̄′(x0 )‖) ≤ ‖F̄′(x0 )⁻¹‖ / (1 − (f0′(t) + 1)) = ‖F̄′(x0 )⁻¹‖ / (−f0′(t)).

Using (22.21) we have

‖F̄′(x) − F̄′(x0 )‖ ≤ 1 / ‖F̄′(x0 )⁻¹‖. (22.22)

Thus, we have

⟨(F̄′(x0 ) − F̄′(x))y, y⟩ ≤ ‖F̄′(x0 ) − F̄′(x)‖ ‖y‖² ≤ ‖y‖² / ‖F̄′(x0 )⁻¹‖,

which implies

⟨F̄′(x0 )y, y⟩ − ‖y‖² / ‖F̄′(x0 )⁻¹‖ ≤ ⟨F̄′(x)y, y⟩. (22.23)

Now, since F̄′(x0 ) is a positive operator and F̄′(x0 )⁻¹ exists by assumption, we obtain by Lemma 22.2.5 that

⟨F̄′(x0 )y, y⟩ ≥ ‖y‖² / ‖F̄′(x0 )⁻¹‖. (22.24)

The result now follows from (22.23) and (22.24). □
Remark 22.3.4 This result improves the corresponding one in [18, Lemma 8] (where function f was used instead of f0 or f1 ), leading to more precise estimates on the distances ‖xn+1 − x∗ ‖, which together with the idea of restricted convergence domains lead to the aforementioned advantages stated in Remark 22.3.2.
Next, we present an academic example to show that (22.11) and (22.12) hold as strict inequalities, that the old convergence criterion (22.18) does not hold while the new ones involving q1 and q2 do, and finally that the new majorizing sequences {sn } and {rn } are tighter than the old majorizing sequence {tn }. Hence, the advantages of our approach (as already stated in the abstract of this study) over the corresponding ones such as those in [18] follow.
Example 22.3.5 Let Q ≡ {0}, H = ℝ, D = U(x0 , 1 − p), x0 = 1, p ∈ (0, 1/2) and define function F on D by

F(x) = x³ − p. (22.25)

Then, we get using (22.25) that L = 2(2 − p), L0 = 3 − p, L1 = 2(1 + 1/L0 ) and η = (1/3)(1 − p), so we have

L0 < L1 < L

and

f0 (t) < f1 (t) < f (t) for each p ∈ (0, 1/2).

In particular, the sufficient Kantorovich convergence criterion (22.18) used in [18] is not satisfied, since

q = Lη > 1/2 for each p ∈ (0, 1/2). (22.26)

Hence, there is no guarantee under the results in [18] that Newton's method converges to the solution x∗ = ∛p. However, our convergence criteria are satisfied:

q1 = L1 η ≤ 1/2 for each p ∈ (0.4619831630, 1/2) (22.27)

and

q2 = Kη ≤ 1/2 for each p ∈ (0.2756943786, 1/2). (22.28)

In order for us to compare the majorizing sequences {tn }, {sn } and {rn }, let p = 0.6. Then, the old Kantorovich criterion q ≤ 1/2 as well as the new conditions (22.27) and (22.28) are satisfied. Then, we present the following table showing that the new error bounds are tighter. It is worth noticing that these advantages are obtained under the same computational cost, since the evaluation of the old Lipschitz constant L requires the computation of the new constants L0 , L1 and K as special cases.

Table 22.1: Error bound comparison.

n tn+1 − tn sn+1 − sn rn+1 − rn


2 0.0397 0.0370 0.0314
3 0.0043 0.0033 0.0023
4 5.1039e-05 2.6253e-05 1.2571e-05
5 7.2457e-09 1.6743e-09 3.7365e-10
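Table 22.1 can be regenerated with a few lines of code; the following sketch (ours, not from the original text) implements the three recursions of Remark 22.3.2 (b) for p = 0.6 and prints the increment columns.

    p = 0.6
    L, L0 = 2.0 * (2.0 - p), 3.0 - p
    L1, eta = 2.0 * (1.0 + 1.0 / L0), (1.0 - p) / 3.0

    t, s, r = [0.0, eta], [0.0, eta], [0.0, eta]
    for n in range(1, 6):
        t.append(t[n] + L  * (t[n] - t[n-1]) ** 2 / (2.0 * (1.0 - L  * t[n])))
        s.append(s[n] + L1 * (s[n] - s[n-1]) ** 2 / (2.0 * (1.0 - L0 * s[n])))
        c = L0 if n == 1 else L1                  # r2 uses L0, later steps use L1
        r.append(r[n] + c  * (r[n] - r[n-1]) ** 2 / (2.0 * (1.0 - L0 * r[n])))

    for n in range(2, 6):                         # rows n = 2, ..., 5 of Table 22.1
        print(n, t[n] - t[n-1], s[n] - s[n-1], r[n] - r[n-1])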
References

[1] I. K. Argyros, Concerning the convergence of Newton’s method and


quadratic majorants, J. Appl. Math. Comput., 29, 391–400, 2009.
[2] I. K. Argyros, A Kantorovich-type convergence analysis of the Newton-
Josephy method for solving variational inequalities, Numer. Algorithms, 55,
447–466, 2010.
[3] I. K. Argyros, Variational inequalities problems and fixed point problems,
Comput. Math. with Appl., 60, 2292–2301, 2010.
[4] I. K. Argyros, Improved local convergence of Newton’s method under weak
majorant condition, J. Comput. Appl. Math., 236, 1892–1902, 2012.
[5] I. K. Argyros, Improved local converge analysis of inexact Gauss-Newton
like methods under the majorant condition, J. Franklin Inst., 350 (6), 1531–
1544, 2013.
[6] I. K. Argyros and S. Hilout, Weaker conditions for the convergence of New-
ton’s method, J. Complexity, 28, 364–387, 2012.
[7] A. I. Dontchev and R. T. Rockafellar, Implicit functions and solution map-
pings, Springer Monographs in Mathematics, Springer, Dordrecht, 2009.
[8] O. Ferreira, A robust semi-local convergence analysis of Newtons method
for cone inclusion problems in Banach spaces under affine invariant majo-
rant condition, J. Comput. Appl. Math., 279, 318–335, 2015.
[9] O. P. Ferreira, M. L. N. Goncalves and P. R. Oliveria, Convergence of the
Gauss-Newton method for convex composite optimization under a majo-
rant condition, SIAM J. Optim., 23 (3), 1757–1783, 2013.
[10] O. P. Ferreira and G. N. Silva, Inexact Newton’s method to nonlinear func-
tions with values in a cone, arXiv: 1510.01947, 2015.

[11] O. P. Ferreira and B. F. Svaiter, Kantorovich’s majorants principle for New-


ton’s method, Comput. Optim. Appl., 42 (2), 213–229, 2009.
[12] N. Josephy, Newton’s method for generalized equations and the PIES en-
ergy model, University of Wisconsin-Madison, 1979.
[13] L. V. Kantorovic̆, On Newton’s method for functional equations, Doklady
Akad Nauk SSSR (N.S.), 59, 1237–1240, 1948.
[14] A. Pietrus and C. Jean-Alexis, Newton-secant method for functions with
values in a cone, Serdica Math. J., 39 (3-4), 271–286, 2013.
[15] F. A. Potra, The Kantorovich theorem and interior point methods, Math.
Programming, 102 (1), 47–70, 2005.
[16] S. M. Robinson, Strongly regular generalized equations, Math. Oper. Res.,
5 (1), 43–62, 1980.
[17] R. T. Rockafellar, Convex analysis, Princeton Mathematical Series, No. 28,
Princeton University Press, Princeton, N. J., 1970.
[18] G. N. Silva, On the Kantorovich’s theorem for Newton’s method for solving
generalized equations under the majorant condition, Appl. Math. Comput.,
286, 178–188, 2016.
[19] S. Smale, Newton's method estimates from data at one point. In R. Ewing,
K. Gross and C. Martin, editors, The Merging of Disciplines: New Direc-
tions in pure, Applied and Computational Mathematics, pages 185–196,
Springer New York, 1986.
[20] J. F. Traub and H. Woźniakowski, Convergence and complexity of Newton
iteration for operator equations, J. Assoc. Comput. Mach., 26 (2), 250–258,
1979.
[21] L. U. Uko and I. K. Argyros, Generalized equations, variational inequalities and a weak Kantorovich theorem, Numer. Algorithms, 52 (3), 321–333,
2009.
[22] J. Wang, Convergence ball of Newton’s method for generalized equation
and uniqueness of the solution, J. Nonlinear Convex Anal., 16 (9), 1847–
1859, 2015.
[23] P. P. Zabrejko and D. F. Nguen, The majorant method in the theory of
Newton-Kantorovich approximations and the Pták error estimates, Numer.
Funct. Anal. Optim., 9 (5-6), 671–684, 1987.
[24] Y. Zhang, J. Wang and S. M. Gau, Convergence criteria of the generalized
Newton method and uniqueness of solution for generalized equations, J.
Nonlinear Convex. Anal., 16 (7), 1485–1499, 2015.
Index

A basins of attraction 1, 3, 13–22, 72, 77, 80–84,


89, 97–103
A posteriori parameter choice 163 bifurcation diagram 73, 75
A priori parameter choice 162, 167 bundle methods 285, 309
adaptive algorithm 57
adaptive choice rule 46, 57 C
adaptive continuation process 107, 160, 192,
250, 330 Cauchy sequence 54, 303
adjoint 207, 262 center-Lipschitz condition 3, 139, 147, 150, 158,
affine invariant convergence 219, 248, 262, 264
algorithm 46, 50, 57, 273, 276, 283–285, 288, center-Lipschitz condition with L-average 147,
297, 308, 309, 312 150, 262, 264
Amat et al. method 218 Center-Lipschitz-type condition 209
ammonia 237, 238 center-majorant function 121, 124, 311
analysis 2, 3, 25, 27, 46, 50, 91, 93, 110, 121, Chebyshev-Halley-type, King method 178
131, 136, 145, 146, 149, 164, 167, 177–179, Chebyshev’s method 179
184, 187, 206, 217, 219, 227, 228, 232, 243, chemical engineering problems 239
253, 261, 263, 273, 283, 284, 296, 298, 299, classical Kantorovich criterion 116
308, 320–322, 334 classical Lipschitz condition 261
analytic 136 closed convex set 273, 286, 310
angle between two vectors 299 complementary problems 195, 206, 333
applications 120, 121, 135, 145, 157, 167, 178, complexity analysis 136
260, 283, 284, 308, 320 complexity of Newton iteration 203, 215, 270,
approximate computational order of convergence 342
327 complexity of Newton’s method 136
asymptotic error constant 70 complexity theory 269
attractor 72–74, 98 computational order of convergence 55, 70, 327
automatic computer 258 computer graphics 296
autonomous differential equation 327 concave type 281, 294, 318
average Lipschitz-conditions 212, 296, 316 cone inclusion problems 202, 214, 268, 341
consecutive polynomials 181
B conservative problems 239
continuous derivative 150, 153, 157, 158
balancing principle 56, 57 continuous functions 58, 92, 173, 190, 248, 328
ball convergence 320 convergence 1–3, 11, 13, 15, 25–27, 37, 38, 46,
Banach-lemma on invertible operators 5, 12, 28, 49, 50, 55, 70, 71, 76, 77, 89–91, 93, 109–
29, 33, 113, 184, 187, 221, 324 111, 114–116, 120, 121, 122, 124, 125, 132,
Banach perturbation lemma 277 136–139, 145, 146, 149, 150, 177–179, 182,
Banach space 1, 5, 10, 26, 31, 109, 120–122, 135, 184, 187, 189, 194–196, 199–202, 205–209,
136, 177, 179, 185, 217, 219, 225, 227, 217–219, 221, 227, 228, 232, 234–238, 242,
228, 232, 243, 272, 278, 284, 285, 308, 309, 243, 246–249, 252, 253, 256, 260–263, 268,
320, 321 272–275, 283, 284, 287–290, 296, 298, 299,
basic assumptions 162–164

303, 307, 308, 311–315, 320–322, 326, 327, Euclidean inner product 298
332–334, 337–341 Euler-Chebyshev method 90
convergence analysis 2, 3, 25, 27, 46, 50, 91, 93, exact Newton iteration map 130
121, 145, 149, 177–179, 184, 187, 206, 217, Ezquerro et al. method 218
219, 227, 228, 232, 243, 261, 263, 273, 283,
284, 296, 298, 299, 308, 320–322, 334 F
convergence ball 246–249, 252, 253, 256
Fatou set 72, 98
convergence hypotheses 284, 308
Fibonacci sequence 246, 254
convergence improvement 109, 111, 114
finite dimensional Banach space
convex composite optimization 283, 284, 308
fixed point problems
convex composite optimization problems 284,
fixed point theory
308
fractional conversion 237
convex optimization 283, 285, 307, 309
Fréchet-derivative 121, 173, 178, 189, 212, 328
convex subset 1, 5, 11, 12, 177, 217, 219, 227,
Fréchet-differentiable operator 1, 25, 26, 31, 109,
228, 232, 243, 285, 309, 320
120, 177, 184, 186, 187, 207, 217, 219, 228,
convex type 195, 333
243, 244, 246, 262, 284, 308, 320, 322
critical point 63, 72, 74–79, 98
full rank 147, 148, 150, 151, 153, 155, 157, 158
cutting method 285, 309
fundamental theorem of integration 165, 170
D
G
Derivative free 179, 242, 253
Gauss-Legendre quadrature formula 41
different multiplicity 13, 15
Gauss-Newton Algorithm 283, 284, 308, 309
differentiable operator 1, 25, 26, 31, 109, 120,
Gauss-Newton method 146, 153, 283, 284, 288,
177, 184, 186, 187, 207, 217, 219, 228, 243,
307, 308, 312
244, 246, 262, 284, 308, 320, 322
generalized conjugate method 326
directional derivative 198, 298, 336
Generalized equations 194, 205, 332
Directional Newton methods 296
generalized minimum residual methods 326
discrepancy principal 167
Generalized Newton method 260, 261
discrepancy principle 48, 167
geometric construction 23, 140, 292, 316
discretization process 41, 233
Green’s kernel 11
divided difference 218, 227, 232, 233, 243, 248,
Green’s function 190, 249
253, 272, 273, 278
divided difference of first order
H
Dynamical study 75
dynamics 63, 71, 72, 90, 91, 98 Halley’s method 1–3, 5, 8, 11, 13–22, 179
Hammerstein integral equation of the second kind
E 11, 39
Hilbert space 47, 163, 194, 206, 260, 266, 332
efficiency index
Holder type 167
efficient mesh refinement strategy
Eighth order method 320
I
element of least norm 277
equality-constraint optimization problem 297 ill-posed equations 46, 162
error 3, 39, 40, 42, 46, 48, 50, 55, 56, 66–68, 70, ill-posed problems 46, 164
116, 121, 123, 125, 127, 128, 135–137, 140, implicit functions 202, 213, 268, 279, 341
162–164, 166, 167, 195, 199, 206, 209, 211, inclusion 132, 209, 272, 273, 275, 285, 288, 310,
212, 219, 247, 248, 253, 256, 284, 286, 289, 312
291, 297, 298, 304, 308, 310, 313, 315, 321, Inexact Gauss-Newton method 146, 153
326, 327, 333, 337, 341 Inexact Gauss-Newton-like method 145, 146, 153
error bounds 3, 39, 40, 46, 50, 55, 116, 125, 137, Inexact Newton’s method 120, 123, 126, 131,
140, 166, 195, 199, 209, 212, 219, 247, 286, 135, 137, 139
289, 291, 298, 304, 310, 313, 315, 321, 326, initial points 2, 10, 90, 110, 123, 179, 197, 218,
327, 333, 337, 341 243, 265, 273, 284, 285, 308, 309, 321, 335
error estimates 42, 121, 162, 163, 166, 206, 209, inner product 47, 163, 194, 298, 332
219, 248, 284, 308 integral expression 95

interior point method 142, 203, 269, 342 majorizing sequence 26, 27, 29, 32, 34, 38, 41,
intermediate value theorem 92, 180, 322 42, 50, 54, 114, 115, 138, 177, 179, 200,
invariant Fatou component 74 219, 243–245, 263, 274, 291, 299, 311, 315,
invertible operators 5, 12, 28, 29, 33, 113, 124, 338, 340, 341
184, 187, 201, 221, 226, 324, 339 mathematical induction 8, 29, 93, 152, 156, 221,
iterative method 47, 63, 65, 66, 69–71, 75, 97, 230, 265, 301, 323
98, 110, 120, 121, 177, 178, 321 mathematical modelling 2
iterative regularization method 47 mathematical programming 297
iteratively regularized Lavrentiev method 47, 49 matrix form 41
matrix norm 298
J maximal monotone 194, 196, 206, 207, 260, 262,
265, 266, 333, 334
Julia set 72, 98
maximal monotone operator 207, 260, 262, 265
max-norm 11, 39, 58, 233, 235
K
mean value theorem 28
Kantorovich hypothesis 39, 110, 114, 138, 201, mesh independence principle 326
339 minimization algorithms 141, 293, 317
Kantorovich theorem 26, 110, 121, 195, 292, 316, monotone process of convex and concave type
332, 333 281
Kantorovich theory 40, 109, 111, 113, 115, 135, monotone processes 281
292, 316 monotonically decreasing 263
Kantorovich-type convergence criterion 116 monotonically increasing 48, 55, 149, 164, 263
k-Fréchet-differentiable 1, 26, 120, 136, 146, Moore-Penrose inverses 146
147, 177, 184, 186, 187, 217–219, 228, 243, Müller’s method 252, 253, 256
244, 246, 284, 308, 320
King-Werner-like methods 242 N
King-Werner-type method 177, 179, 182, 184,
necessary conditions 235
186, 187, 189
negative polar 285, 309
Kung and Traub method 65
Newton iteration mapping 211
Kung-Traub conjuncture 321
Newton-Kantorovich hypotheses 290, 314
Kurchatov method 218
Newton-Kantorovich theorem 26, 292, 316
Newton-secant cone 273
L
Newton-Secant method 272–274, 278
Laguerre’s method 90 Newton’s method 25–27, 29, 31, 39–41, 90,
Lavrentiev regularization 46, 162, 163, 167 110–115, 121, 137, 139, 146, 178, 179, 189,
Lavrentiev regularization method 162 194, 195, 197, 199, 201, 205–207, 219, 263,
Least-change Secant update methods 305 265, 267, 290, 292, 314, 316, 326, 333–335,
library ARPREC 71 337, 339, 341
linearized problem 286, 310 nitrogen-hydrogen feed 237
Lipschitz parameters 321 noisy data 47, 163
Lipschitz-type conditions 146, 219 nonlinear Hammerstein integral equation of the
local analysis 213, 328 second kind 11, 39
local convergence 89–91, 93, 145, 146, 149, 150, Nonlinear Ill-posed equations 46
178, 179, 187, 205–207, 217–219, 228, 232, nonlinear ill-posed operator equation 163
242, 243, 246, 308, 311, 312, 320–322 nonlinear least square problems 146
local convergence analysis 91, 93, 145, 149, 178, nonlinear monotone operators 46
187, 217, 219, 228, 232, 243, 320–322 norm of the maximum of the rows 236
local convergence study 206, 218, 219, 243 normal cone mapping 195, 333
local study 2 normed convex process 273
logarithmic type 167 numerical examples 1, 3, 10, 25, 27, 37, 39, 46,
50, 57, 90, 91, 102, 146, 162–164, 173, 177,
M 179, 189, 206, 217, 219, 233, 242, 243, 247,
252, 253, 257, 278, 283, 284, 292, 298, 308,
majorant function 121, 124, 125, 133–137, 196, 316, 320, 321, 327
210, 283, 284, 287, 289, 308, 311, 313, 134

O Riemann sphere 72, 98


Riemannian manifolds 140, 141
omega-conditions 27 Robinson’s strong regularity 213
optimal error estímate 163, 167 robust convergence 120
optimal order error estímate 48 robust semi local convergence 202, 214, 268, 341
optimization software 304 rounding error 297
orbit of a point 72, 98 R-quadratically 289, 313
order of convergence 55, 70, 110, 321, 327
Osada’s method 91 S
Ostrowski method 64
scheme USS 68–70
P Secant-like methods 217, 218
self-concordant functions 136
parabolic 72, 73, 98 semilocal convergence 1–3, 25, 27, 110, 114,
partial linearization 195, 333 120–122, 136, 194, 196, 208, 211, 212, 217,
partial linearization of a mapping 195, 333 219, 221, 227, 228, 242, 243, 253, 256, 260,
periodic point 72, 98 262, 283, 284, 287, 290, 296, 298, 299, 308,
PIES energy model 203, 214, 269, 342 314, 332, 334
positive integrable function 147–149 semilocal convergence analysis 2, 3, 25, 27, 121,
positive operator 196, 201, 202, 207, 210, 211, 217, 219, 227, 243, 284, 296, 298, 299, 308,
261, 264, 265, 334, 339, 340 334
Potra-Pták method 179 semilocal results 110, 244, 264
projection methods 326 sequences 34, 35, 39, 41, 42, 50, 131, 132, 177,
proper function 196, 207, 334 179, 197, 198, 200, 245, 248, 263, 264, 267,
proper lower semicontinuous 333 274, 288, 299, 311, 313, 335, 336, 338, 340,
proper lower semicontinuous convex function 341
333 set mapping 195, 333
Pták error estimates 143, 203, 270, 342 set of minimum points 284, 286, 309, 310
set-valued mapping 195, 285, 287, 309, 311, 333
Q
set-valued operator 196, 207, 272, 273, 334
Q-linearly convergence 123, 132, 133, 136, 137, sharp error bounds 239, 293, 317
288, 312 sign of s 254
Q-quadratically 289, 313 Sixth-order iterative methods 63, 65, 69
quadratic minimization problems 136 Smale’s \alpha-theory 201, 268, 339
quartic equation 237 solution mappings 202, 213, 268, 279, 341
quasi-regular bound 286, 288, 310, 312 space of continuous functions 173, 190, 248, 328
quasi-regular bound function 286, 288, 310, 312 stopping-index 47, 48
quasi-regular function 286, 310 VWUDQJH¿[HGSRLQW±
quasi-regular point 286, 288, 310, 312 Strongly regular generalized equations 203, 214,
quasi-regular radius 286, 288, 310, 312 269, 342
quasi-regularity condition 283, 284, 308 subgradient 196, 207, 285, 309, 334
sufficient convergence conditions
R 182, 200, 201, 219, 303, 338, 339
sufficient criterion
radius Lipschitz condition 148, 150, 153, 157, super attractor 72, 73, 98
158
radius of convergence ball 246–249, 253, 256 T
real Hilbert space 47, 163
recurrence relations 2 tangent methods 192, 249
regular point 286, 288, 310, 312 Taylor expansions 321
regularization methods 47, 162, 163 Taylor’s series 30, 34, 66, 67
regularization parameter 56, 163, 167 three-step uniparametric family 64
repulsor 72, 73, 98 transform equation 233
residual relative error tolerance 123, 135, 137 triangle inequality 128, 165
restricted domains 205, 296 two-point method 321

U W
uniqueness ball 11, 13 Wang's γ-theory
uniqueness of solution 25, 32 weak center Lipschitz condition 146
weak Lipschitz condition 147
V weaker conditions 48, 253
valued operator 196, 206, 207, 261, 272, 273, 334 Werner method 177–179, 181, 184, 186, 187,
variational inequalities 202, 203, 268, 270, 341, 189, 242
342
variational inequality problem 195, 206, 333
