
An analysis of action-minimizing strategies for narrow quality challenges

November 21, 2018

1 Introduction
Suppose that you’re raising your Master Thief quality as high as possible by
repeatedly robbing the Bazaar. Assume that your strategy is to:
1. Raise your Casing to a fixed level L and play the challenge.
2. If you succeed, you’re done. Otherwise, raise your Casing level back to
L, clear any menaces gained, and play the challenge again, repeating
until you succeed.
What choice of L will allow you to succeed at the challenge by spending
the least number of actions? On the one hand, playing the challenge at
the minimum required Casing level of 15 has a high chance of failing and
incurring an action penalty (in the sense that actions must be spent raising
Casing back up to 15 again) but the initial action investment needed to
reach a Casing level of 15 is much lower. On the other hand, playing the
challenge at a Casing level of 20 ensures that you will never fail, but the
action investment needed to reach this Casing level is much higher.
It turns out that for this particular challenge and strategy, attempting
to rob the Bazaar at a Casing level of 16 is optimal¹, in the sense that
it minimizes the number of actions needed to succeed at the challenge. The
rest of this writeup develops a framework for analyzing this and other narrow
quality challenges, which will allow us to determine the action-minimizing
quality level L for the strategy described above.
¹ However, depending on your appetite for risk (or, more precisely, your lack thereof), playing this challenge at a Casing level of 18 might be preferable. More on this later.

2 Analysis
Let E(A) be the expected number of actions we will have invested by the time we succeed at the challenge. Our goal is to derive an expression for E(A) in terms of L, the quality level at which we choose to play the strategy described in the introduction. Such an expression will allow us to determine the value of L that minimizes E(A), either analytically or computationally, so we'll start by building up that expression.

2.1 Initial action investment


We begin by noting that, from standard Fallen London mechanics, the number of change points C_L needed to achieve a quality level of L is

C_L = \frac{L(L+1)}{2}
Letting G be the number of change points gained per action while grinding, which we will assume to be deterministic (i.e., nonrandom and fixed), the initial number of actions A_I needed to achieve a quality level of L and to play the challenge is

A_I = \frac{C_L}{G} + 1 = \frac{L(L+1)}{2G} + 1

Since G is deterministic, so is A_I. Note that A_I was deliberately chosen to include the one action it takes to play the challenge; using this as our reference state will simplify the subsequent analysis a little.
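For concreteness, here is the arithmetic with the values that come up in the Bazaar example of Section 3.1 (L = 16, G ≈ 4.37); this is purely an illustration of the formula above, not a new result:

A_I = \frac{16 \cdot 17}{2 \cdot 4.37} + 1 = \frac{272}{8.74} + 1 \approx 32.1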

2.2 Action penalty for failures


Next, let F be the random variable representing the number of times we fail at the challenge before succeeding, and let A_P be the number of actions needed, after a failure, to return our player-state to the reference state of having just played the challenge; we assume A_P is constant. This means that A_P includes the actions needed to clear menaces gained by failing the challenge (e.g. Suspicion gained from failing to rob the Bazaar), to raise the quality being tested back to where it was before the failed attempt (e.g. regaining Casing lost when failing to rob the Bazaar), and to play the challenge again. Then clearly, after investing
A_I actions into initially raising the relevant quality level to L, the additional number of actions needed to succeed at the challenge is F · A_P, giving us the relation

A = A_I + A_P \cdot F    (1)

Here A is the total number of actions spent by the time we succeed at the challenge; note that since F is a random variable, so is A. Substituting in our earlier expression for A_I and using the linearity of expectation gives

E(A) = 1 + \frac{L(L+1)}{2G} + A_P \cdot E(F)

2.3 Failure probabilities and expectations


From the equation we just derived, we see that if we can find a way to express
E(F) in terms of L, we are nearly done. How are the possible outcomes of F
distributed probabilistically? Letting p_L be the probability of succeeding at the challenge when playing it at quality level L, we can deduce that:

• With probability p_L, we will succeed on the very first try, so that F = 0.

• With probability (1 − p_L) · p_L, we will fail on the first try but succeed on the second try², so that F = 1.

• With probability (1 − p_L)^2 · p_L, we will fail on the first two tries but succeed on the third, in which case F = 2.

In general, letting P(F = k) be the probability of experiencing k failures before finally succeeding, we have P(F = k) = (1 − p_L)^k · p_L, which is precisely the probability distribution of a geometric random variable. Consequently, it follows that E(F) = (1 − p_L)/p_L = 1/p_L − 1, giving us

E(A) = 1 + \frac{L(L+1)}{2G} + A_P \left( \frac{1}{p_L} - 1 \right)
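For completeness, the expectation E(F) used in that step follows from the standard identity \sum_{k \ge 0} k x^k = x/(1-x)^2 for |x| < 1, applied with x = 1 − p_L:

E(F) = \sum_{k=0}^{\infty} k (1 - p_L)^k p_L = p_L \cdot \frac{1 - p_L}{p_L^2} = \frac{1 - p_L}{p_L}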
Finally, letting L_C be the challenge level for the narrow quality challenge, standard Fallen London mechanics tell us that

p_L = \frac{(L - L_C) + 6}{10}

² Assuming the trials are independent.

This equation holds when L_C − 5 ≤ L ≤ L_C + 4, and we'll assume that this is always the case, since there is clearly no benefit to raising the quality level above L_C + 4, where success is guaranteed, and I haven't yet come across an interesting example of a narrow quality challenge that allows players to take the challenge with a quality level below L_C − 5 (in fact, in the initial example of robbing the Bazaar, the challenge can only be taken when L ≥ L_C − 1). Plugging this back into our earlier equation gives us the final form of our expression for E(A):

E(A) = 1 + \frac{L(L+1)}{2G} + A_P \left( \frac{10}{L - L_C + 6} - 1 \right)    (2)
Note that, other than L, everything on the right side of this equation is a constant. As a quick sanity check, we observe that when L = L_C + 4, where the challenge success rate is 100%, the term in parentheses above vanishes, so that E(A) = A_I, meaning that the only actions spent are those initially used to raise the quality level for the challenge, which is exactly what we would expect.
Given this closed-form expression for E(A) in terms of L and the range
of possible values that L can take on, it’s certainly possible to determine
the value of L that minimizes E(A) by using standard results from calculus
that involve computing the first and second derivatives of E(A) with respect
to L. This is not difficult, but is rather time-consuming, so we’ll leave that
derivation to the interested reader. Instead, in what follows, we’ll use compu-
tational methods (i.e. Wolfram Alpha, though this could also be done with
a spreadsheet) to identify the action-minimizing value of L, which is possible
because the range of allowed values of L is small.
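As an alternative to Wolfram Alpha, here is a minimal Python sketch of the same brute-force search. The function and parameter names (expected_actions, best_level, and so on) are my own conveniences rather than anything from the game or an existing tool; the sketch simply evaluates equation 2 over the allowed range of L.

def expected_actions(L, G, A_P, L_C):
    """Expected total actions E(A) from equation (2)."""
    p_success = (L - L_C + 6) / 10   # narrow-challenge success chance, valid for L_C - 5 <= L <= L_C + 4
    return 1 + L * (L + 1) / (2 * G) + A_P * (1 / p_success - 1)

def best_level(G, A_P, L_C, L_min, L_max):
    """The L in [L_min, L_max] minimizing E(A), together with that minimum."""
    candidates = {L: expected_actions(L, G, A_P, L_C) for L in range(L_min, L_max + 1)}
    L_star = min(candidates, key=candidates.get)
    return L_star, candidates[L_star]

# Example with the Bazaar constants derived in the next section:
# best_level(G=4.37, A_P=14.08, L_C=16, L_min=15, L_max=20) returns (16, 41.5...)

Since the allowed range of L contains at most ten values, an exhaustive search like this is all that's needed.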

3 Applications
The explicit expression for E(A) in terms of L that we derived above allows
us to determine the action-minimizing quality level for narrow quality chal-
lenges. We’ll work through a couple of different examples to show how this
can be done.

3.1 Robbing the Bazaar
As in the introduction, we seek the value of the Casing level L at which to
repeatedly play the Casing challenge such that the number of actions spent
is minimized. In order to apply equation 2, we’ll first need to compute the
value of all the constants in the equation. We can do this as follows:
• By combining the optimum EPA grind with the option to purchase Casing assistance from the Big Rat as described here, it's now possible to grind Casing at 4.37 change points per action, so that G ≈ 4.37 in equation 2 above.
• Failing to rob the Bazaar incurs a 20 CP penalty to Casing, while also increasing Suspicion by 10 CP and Nightmares by 36 CP. The Casing penalty can be paid off in 20/4.37 ≈ 4.58 actions, and assuming Suspicion and Nightmares are cleared using social actions at 4 and 6 CP per action respectively, an additional 8.5 actions are spent wiping menaces, so that A_P = 4.58 + 8.5 + 1 = 14.08. Notice that the additional 1 action needed to play the challenge again is included in A_P.
• Finally, we know that LC , the Casing challenge level, is 16 in this
scenario.
Substituting all of these values into equation 2 gives us
 
E(A) = 1 + \frac{L(L+1)}{8.74} + 14.08 \left( \frac{10}{L - 10} - 1 \right)
The Casing challenge can only be played when your Casing level is at
least 15, so using Wolfram Alpha to compute the values of E(A) as L varies
between 15 and 20 gives us

L 15 16 17 18 19 20
E(A) 42.54 41.51 42.05 43.65 46.04 49.05
We conclude that under the strategy described in the introduction, at-
tempting to rob the Bazaar at a Casing level of 16 minimizes the number of
actions required to succeed at the challenge, after accounting for the addi-
tional actions that need to be spent clearing menaces and re-raising Casing
on failure.
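As a quick self-contained check, the table above can be reproduced with a couple of lines of Python; the snippet below simply restates equation 2 with the Bazaar constants.

# Bazaar heist: G = 4.37, A_P = 14.08, L_C = 16, so L - L_C + 6 = L - 10.
E_A = lambda L: 1 + L * (L + 1) / (2 * 4.37) + 14.08 * (10 / (L - 10) - 1)
for L in range(15, 21):
    print(L, round(E_A(L), 2))   # 42.54, 41.51, 42.05, 43.65, 46.04, 49.05 -- minimized at L = 16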

3.2 Hunting a spider-council
This time, we're looking to determine the value of the Hunt is On level L that will minimize the number of actions spent hunting spider-councils. This wiki page suggests it's not possible to raise The Hunt is On more quickly than 3 CP per action, so that in equation 2 above, we have G = 3. The challenge level here is L_C = 16, and since failing at the challenge gives 5 CP of Wounds, 2 CP of Nightmares, and a 10 CP loss of The Hunt is On, clearing the menaces with social actions results in an action penalty on failure of A_P = 5/6 + 2/6 + 10/3 + 1 = 5.5. So, for this scenario, we have
 
E(A) = 1 + \frac{L(L+1)}{6} + 5.5 \left( \frac{10}{L - 10} - 1 \right)
This results in the following values for E(A) as L varies between 15 (the
minimum required The Hunt is On level to play the challenge) and 20:

L 15 16 17 18 19 20
E(A) 46.5 50 54.36 59.38 64.94 71

We conclude that hunting spider-councils at a Hunt is On level of 15 is optimal.
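The same quick check in Python, with the spider-council constants (G = 3, A_P = 5.5, L_C = 16), reproduces this table as well:

E_A = lambda L: 1 + L * (L + 1) / 6 + 5.5 * (10 / (L - 10) - 1)
for L in range(15, 21):
    print(L, round(E_A(L), 2))   # 46.5, 50.0, 54.36, 59.38, 64.94, 71.0 -- minimized at L = 15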
Intuitively, we can understand the difference between this example and the previous one by noting that the action penalty for failing to rob the Bazaar is much higher and Casing can be raised more quickly (the value of G is larger), both of which push the action-minimizing value of L higher. In this scenario, the action penalty for failing the challenge is much lower and The Hunt is On is raised more slowly, both of which push the action-minimizing value of L lower.

4 Ideas for future analysis


There are many incremental extensions that could be made to this analysis, such as computing the value of L that maximizes the EPA for grinds that involve taking a narrow quality challenge, while accounting for the Echoes gained from raising the quality itself (e.g. hunting goat demons in the Flit before advancing through the Labyrinth of Tigers). Additionally, it could be worthwhile to extend this analysis to account for more complex strategies beyond simply raising the challenge quality to a fixed level L every time: for example, raising the challenge quality to L, playing the challenge over and over until the quality has dropped below the minimum required level, and only then re-raising it back to level L.
However, what I'd really be interested in seeing more of are analyses that don't just optimize for expected rewards/returns (which, admittedly, this writeup does as well), but also try to optimize for risk-reward tradeoffs³. All else being equal, we should prefer a strategy with less variability to one with more, since the first strategy reduces uncertainty and the need to prepare for unexpected losses.

³ See, e.g., the Sharpe ratio in finance, which attempts to measure whether the additional returns provided by an investment justify the risk taken on.
Here's an example of what that might look like, as applied to the analysis above. For each possible choice of a Casing level L at which we might attempt to rob the Bazaar using the strategy described in the introduction, we'd like to determine not just the expected number of actions E(A) taken to succeed, but also how much variability there is in the number of actions taken to succeed, as measured by the variance Var(A). Equation 1 together with standard properties of the variance tells us that Var(A) = A_P^2 Var(F) (the constant A_I contributes nothing to the variance), and since F has a geometric distribution, we know that Var(F) = (1 − p_L)/p_L^2, so that in the scenario where we're robbing the Bazaar, we have

Var(A) = (14.08)^2 \left( \frac{1}{p_L^2} - \frac{1}{p_L} \right) = 198.2 \left( \left( \frac{10}{L - 10} \right)^2 - \frac{10}{L - 10} \right)
Using this expression to compute the standard deviation σ(A) = \sqrt{Var(A)} as L varies gives us

L 15 16 17 18 19 20
E(A) 42.54 41.51 42.05 43.65 46.04 49.05
σ(A) 19.91 14.83 11.02 7.87 4.94 0

Based on these numbers, a risk-averse player might reasonably conclude that it is better to attempt to rob the Bazaar at a Casing level of 18 rather than 16. Why? Using L = 20 as our risk-free reference point (since there is no chance of failing the Casing challenge with a Casing level of 20), we see that playing the challenge at a Casing level of 18 saves 5.4 actions in comparison to the risk-free reference point, with a standard deviation of 7.87. Playing the challenge at a Casing level of 16 saves only about 2 additional actions, but the variability in outcomes nearly doubles, so you could argue that, on the margin, the additional risk incurred does not justify the additional action savings. In fact, explicit computation shows that L = 18 maximizes the ratio of action savings to standard deviation, which is in some sense the risk-adjusted action savings.
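Here is a short self-contained Python sketch of that computation, reusing the Bazaar constants from earlier; the variable names are mine, and the rounding of σ(A) may differ from the table above in the last digit.

import math

A_P, G, L_C = 14.08, 4.37, 16                       # Bazaar constants from Section 3.1

def stats(L):
    p = (L - L_C + 6) / 10                          # success chance at Casing level L
    E_A = 1 + L * (L + 1) / (2 * G) + A_P * (1 / p - 1)
    sigma_A = A_P * math.sqrt((1 - p) / p ** 2)     # sigma(A) = A_P * sqrt(Var(F))
    return E_A, sigma_A

E_A_riskfree = stats(20)[0]                         # L = 20 never fails, so it is the reference point
for L in range(15, 20):                             # L = 20 itself has zero savings and zero sigma
    E_A, sigma_A = stats(L)
    savings = E_A_riskfree - E_A
    print(L, round(savings, 2), round(sigma_A, 2), round(savings / sigma_A, 3))
# The savings-to-sigma ratio peaks at L = 18 (about 0.69), consistent with the discussion above.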
A more thorough exploration of how best to weigh risk-reward tradeoffs
such as these is outside the scope of this writeup, but definitely something
I’d be interested in reading.
