SBF was right: we can't reject fanaticism
What to do about tiny probabilities in expected utility decisions.
Consider the following two gambles.
Option A - A guaranteed happy life. Say, 100 years full of all the things that make your life great.
Option B - A 1/10^20 (0.0000000000…1) chance of a much better life and a complementary [1 - 1/10^20] (i.e. 0.99999999999…) chance of instant death.
In other words, Option A is a great life with certainty. Option B is almost guaranteed death and a tiny chance of an extraordinarily good life. Not a very difficult choice, right?
Roughly speaking, fanaticism is the position that there is some conceivable “much better life” such that Option B is preferred. Anti-fanaticism is the position that there is no such possible life for which Option B is preferred. Stated in this way, fanaticism is one of the most unintuitive philosophical positions I have ever come across. Over the past few weeks, I have been reading every paper I can find on the topic of fanaticism. Incredibly, almost all philosophical literature on fanaticism defends the position and, along with it, the choice of Option B in the above thought experiment. In this blog post, I want to explore the two strongest arguments for why we cannot reject fanaticism.
In effective altruism circles, what we do about fanaticism is highly important in deciding how to do the most good. Anti-fanatics might be inclined to support interventions in global health or animal welfare which are almost guaranteed to save human or animal lives cost-effectively. In contrast, a fanatic might wish to work on or donate to AI research, knowing that their impact reduces the chance of AI doom by, say, 1/10^10, but that if they are successful, this will prevent untold suffering for current and future generations, thus creating a higher expected utility.
At first glance, fanaticism as a philosophical position seems like a non-starter. How could there possibly be an amount of utility that makes a bet worth taking with a 99.9999…% chance of losing everything and a tiny chance of getting that high utility?
Unfortunately, recent literature suggests the fanatic position cannot be dismissed easily. Philosophers make progress on this type of question by using maths and logic to show which other positions are implied by a position like fanaticism or anti-fanaticism. Put simply, if we reject fanaticism, philosophers can show what else we are obliged to believe in order to maintain a consistent philosophical position. The reason philosophers are reluctant to reject fanaticism is that, if they do, they must also reject many other philosophical axioms which appear self-evidently correct.
Before we start, here is some basic terminology. When I refer to utility, I mean the highest form of value. Higher utility means more of everything that is good; more wellbeing, freedom, justice or whatever you think makes an outcome better. Here’s one way to think about how utility works. One happy world full of freedom and justice implies X utility. 100 happy worlds full of the same freedom and justice implies 100X utility[1]. I’ll explain why this definition matters later but for now, just note that instead of talking about a “better life”, I will henceforth talk about higher utility. You could think of doubling utility from a personal viewpoint as doubling how good your life is, or from an impersonal viewpoint as doubling the number of good lives in the world. Both would make sense. I’ll say that death amounts to 0 utility. Finally, expected utility is calculated by multiplying the probability of each possible outcome by the utility of that outcome, and summing across all the outcomes.
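To make the expected utility calculation concrete, here is a minimal sketch in Python. The specific payoff figures (like the 10^30 utility I give Option B's good outcome) are purely illustrative, not part of the thought experiment:

```python
# A minimal sketch of an expected utility calculation.
# Each option is a list of (probability, utility) pairs; the numbers are illustrative.

def expected_utility(outcomes):
    """Sum of probability x utility over every possible outcome."""
    return sum(p * u for p, u in outcomes)

# Option A: a guaranteed happy life, which we'll call 100 utility.
option_a = [(1.0, 100)]

# Option B: a 1-in-10^20 chance of an enormously better outcome (10^30 utility,
# an arbitrary illustrative figure), otherwise instant death (0 utility).
option_b = [(1e-20, 1e30), (1 - 1e-20, 0)]

print(expected_utility(option_a))  # 100
print(expected_utility(option_b))  # 1e10 - vastly higher, which is the fanatic's point
```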
With all of that out of the way, here’s why we cannot easily reject fanaticism.
The continuity justification:
In my opinion, the strongest justification for not rejecting fanaticism is that, if you do, you must also reject a form of continuity. Consider the following two options.
Option 1:
1/10^20 chance of 0 utility (instant death)
Complementary [1 - 1/10^20] chance of 100 utility (your current life)
Option 2:
1/(10^20 - 1) - an ever so slightly higher probability of 0 utility
Complementary [1 - 1/(10^20 - 1)] chance of 100 trillion utility
It seems to me that Option 2 is clearly better than Option 1. The probability of instant death, though higher, is still extremely low in Option 2. And the utility from the good outcome is 1 trillion times higher, while that outcome remains almost guaranteed. Thus, if you trade Option 1 for Option 2, that would be a beneficial trade.
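For concreteness, here is a quick sketch of the two expected utilities using the figures above (the code is just the arithmetic spelled out):

```python
# Expected utilities for Option 1 and Option 2, using the figures from the text.
p_death_1 = 1 / 10**20
p_death_2 = 1 / (10**20 - 1)   # ever so slightly higher

eu_option_1 = (1 - p_death_1) * 100      # your current life, 100 utility
eu_option_2 = (1 - p_death_2) * 10**14   # 100 trillion utility

print(eu_option_1)  # ~100
print(eu_option_2)  # ~1e14: a trillion times higher, despite the slightly larger death risk
```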
Now consider another pair of gambles. I’ve simplified the notation a bit: p means probability.
Option 2 (same as before):
p = 1/(10^20 - 1) | Utility = 0
p = [1 - 1/(10^20 - 1)] | Utility = 100 trillion = 10^14
Option 3:
p = 1/(10^20 - 2) | Utility = 0
p = [1 - 1/(10^20 - 2)] | Utility = 10^26
Once again, it seems that accepting a barely noticeable increase in the chance of the bad outcome is worth it in exchange for increasing the value of the good outcome 1 trillion times. That is to say, moving from Option 2 to Option 3 is another beneficial trade.
This is where we encounter a problem. If we keep making these beneficial trades - exchanging an ever so slightly higher probability of death for a massive increase in utility - we end up at the problem from the start of the blog post. We end up with a tiny probability attached to a massive value. By the time we make enough trades to reach that point, the value we are dealing with will be unthinkably high, since we are multiplying our initial utility by a trillion many, many times.
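Here is a rough sketch of where the chain of trades ends up. For simplicity, I assume each trade adds a small fixed amount to the probability of death rather than the shrinking increments used above; epsilon and the other numbers are only illustrative:

```python
# A sketch of the chained trades: each one adds a barely noticeable amount to the
# probability of death and multiplies the utility of the good outcome by a trillion.
# (Illustrative numbers; any fixed increment gives the same endpoint eventually.)

epsilon = 1e-6                # extra death risk accepted per trade
p_death = 1 / 10**20          # starting death risk, as in Option 1
utility_exponent = 2          # starting utility of 100 = 10^2

trades = 0
while p_death < 0.999999:     # stop once death is all but certain
    p_death += epsilon
    utility_exponent += 12    # utility *= 10^12 per trade
    trades += 1

print(trades)                                            # roughly a million trades
print(p_death)                                           # ~0.999999 chance of death
print(f"utility if you survive: 10^{utility_exponent}")  # roughly 10^12,000,000
```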
But nonetheless, it appears that by making a series of many beneficial trades, we end up in a position much worse than the one we started in. The only escape from this conclusion is accepting fanaticism, which is to say that Option B from the original fanaticism gamble (p = 0.00000…1 | Utility = extremely high), with its almost certain prospect of instant death, is better than, say, Option 2 (p = 0.9999999999… | Utility = 100 trillion), which almost guarantees 100 trillion utility.
In summary, if you reject fanaticism, you have two options. You could just say that these trades actually make you worse off. This is a difficult position to defend, since you can always shrink the probability changes down to arbitrarily small amounts and increase the utilities to astronomically high levels. As long as the probabilities change by a finite epsilon after each trade, there will always be some number of trades which ends up leading to the original fanaticism problem.
If you accept that there is a probability-utility combination such that the trades make you better off, then rejecting fanaticism implies rejecting continuity. That is to say, if you reject fanaticism, you must reject the axiom that a series of beneficial trades always makes you better off. This is a pretty big bullet to bite. It is essentially saying that making a situation better and better can eventually make it much worse. Rejecting continuity means rejecting some of the most basic laws of logic.
Can’t we just assume diminishing marginal utility?
If you are an economist, you may be looking at these calculations and thinking: why can’t we just assume diminishing marginal utility at higher levels? Can’t we just say that the difference between 1 and 1 trillion utility is much bigger than the difference between 1 trillion and 2 trillion utility? Perhaps utility is even bounded and has a hard ceiling.
It is correct that if utility is bounded at high levels or has diminishing returns, then the fanaticism problem is easy to resolve. We could just say that the big utility numbers aren’t really so big due to diminishing returns. Unfortunately, the assumption of non-linear/diminishing utility is untenable. Make no mistake: when converting money, or anything else, into utility, we can assume diminishing returns or unusual functional forms. The money-utility relationship is famously thought of as concave - more money only brings more happiness up to a point.
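Here is a sketch of the move the economist has in mind, applied to money. A concave utility-of-money function (a log curve is just one illustrative choice, as are the figures) makes the huge monetary jackpot not worth the risk:

```python
import math

# The economist's move, sketched for money: apply a concave utility-of-money
# function before taking expectations. The log form is one illustrative choice.

def utility_of_money(m):
    return math.log(m + 1)   # concave: each extra pound adds less utility

p_win = 1e-20
sure_thing = 1_000_000       # a guaranteed million
jackpot = 10**30             # an absurdly large monetary prize

eu_sure = utility_of_money(sure_thing)         # ~13.8
eu_gamble = p_win * utility_of_money(jackpot)  # ~7e-19 (the losing branch adds 0)

print(eu_sure > eu_gamble)   # True: diminishing returns defuse the monetary gamble
```

But this only works because money is not the final unit of value, as the next paragraphs explain.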
But we can’t think of utility in this way without ending up in something of an infinite regress. If you say that utility has diminishing returns, the question is, diminishing returns to what? Utility? Utility, in our context, is supposed to incorporate all forms of value. In order to say that the utility scale has diminishing returns, you must come up with a higher form of value for utility to be measured against. Let me explain.
In saying that money has diminishing returns to utility, you are saying that there is a concave relationship with money on the X-axis and utility on the Y-axis. But if utility displays diminishing returns, then utility is on the X-axis and what exactly is on the Y-axis? If we define utility as the highest form of all-encompassing value, then there is nothing to put on the Y-axis. And whether or not this is your conception of utility, there still must be some highest form of value, after which there can be nothing on the Y-axis. And if we use this highest form of value in all of the thought experiments, we run into all the fanaticism problems just the same. We cannot escape fanaticism by assuming bounded utility.
To sum up, utility must be linear, by definition. Non-linearity might seem like a simple resolution to the fanaticism dilemma but once utility is properly defined, the resolution can’t possibly work. And so once again, it seems like we must accept the position of fanaticism.
The world isn’t ready to accept fanaticism
In 2022, former crypto billionaire Sam Bankman-Fried (SBF) was asked about fanaticism on the podcast Conversations with Tyler.
In case you don't know: it is widely accepted that SBF committed large-scale financial fraud as CEO of the cryptocurrency exchange FTX, a fraud that came to light in late 2022. Before it was uncovered, Bankman-Fried used FTX profits to make a number of large donations to charitable organisations whilst being a notable member of the Effective Altruism community. It is believed that had his fraud not been found out, he would have continued to make these donations.
Here’s an excerpt of the transcript where SBF talks about fanaticism:
COWEN: Should a Benthamite be risk-neutral with regard to social welfare?
BANKMAN-FRIED: Yes, that I feel very strongly about.
COWEN: Okay, but let’s say there’s a game: 51 percent, you double the Earth out somewhere else; 49 percent, it all disappears. Would you play that game? And would you keep on playing that, double or nothing?
BANKMAN-FRIED: With one caveat. Let me give the caveat first, just to be a party pooper, which is, I’m assuming these are noninteracting universes. Is that right? Because to the extent they’re in the same universe, then maybe duplicating doesn’t actually double the value because maybe they would have colonized the other one anyway, eventually.
COWEN: But holding all that constant, you’re actually getting two Earths, but you’re risking a 49 percent chance of it all disappearing.
BANKMAN-FRIED: Again, I feel compelled to say caveats here, like, “How do you really know that’s what’s happening?” Blah, blah, blah, whatever. But that aside, take the pure hypothetical.
COWEN: Then you keep on playing the game. So, what’s the chance we’re left with anything? Don’t I just St. Petersburg paradox you into nonexistence?
BANKMAN-FRIED: Well, not necessarily. Maybe you St. Petersburg paradox into an enormously valuable existence. That’s the other option.
Perhaps it isn’t as clear from reading the transcript, but SBF was very happy to repeatedly take the bet of a 51% chance of doubling utility and a 49% chance of 0 utility. If you repeat this bet enough times, we once again end up in the fanaticism world of a tiny probability of a huge value. In the words of Cowen, does this not “St. Petersburg paradox you into nonexistence”? Well, yes, it does.
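To see Cowen's point numerically, here is a quick sketch of what repeatedly taking the 51/49 double-or-nothing bet does to the numbers:

```python
# Repeatedly taking the 51% double / 49% nothing bet.
# After n rounds: P(anything survives) = 0.51**n, value if it survives = 2**n worlds.

for n in [1, 10, 100, 1000]:
    p_survive = 0.51 ** n
    expected_worlds = p_survive * 2 ** n   # equals 1.02**n, growing every round
    print(f"n={n}: P(survive) = {p_survive:.3g}, expected worlds = {expected_worlds:.3g}")

# Expected value keeps rising (x1.02 per bet), but the chance that anything is left
# collapses towards zero: Cowen's "St. Petersburg paradox you into nonexistence",
# and SBF's "enormously valuable existence" in expectation.
```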
In late 2022, after Bankman-Fried’s crypto exchange FTX was found to have been widely misusing customer funds and committing fraud, many people returned to this specific clip from the podcast as evidence of the problematic ethical system that SBF used to justify his actions. Cowen later said that SBF took on the double-or-nothing bet in the real world and that his conversation on the podcast demonstrates the philosophy that led him to do this. By taking on higher and higher levels of risk, including the risks associated with committing large-scale financial crime, SBF made it all but inevitable that his endeavours would fail in the long term. The position of fanaticism was widely condemned as a result of SBF’s actions.
However, from the rest of this blog post, it is hopefully clear that we can’t dismiss Bankman-Fried’s position very easily. In fact, what he said on the podcast is at worst a justifiable position to hold and probably the most tenable philosophical position one can hold regarding expected utility.
This is not to say, of course, that SBF’s actions as a whole were justifiable. He broke the law and took actions which were, by any estimation, very likely to cause harm. Effective altruists should not take such actions. I often talk about how many effective altruist ideas fail the ‘Sam Bankman-Fried test’. The test is something like: can you explain your idea to a non-EA without sounding like a maniac? SBF’s idea to defraud millions of people out of their savings in order to donate more money to charity clearly fails this test. I came up with the Sam Bankman-Fried test when an EA proposed an idea to deliberately detonate large sections of natural habitat in order to end the suffering of wild animals. This idea may well be good in expectation but it’s also a bit crazy.
Conclusion:
In this post, I have laid out my main reasons for not rejecting the philosophical position of fanaticism. This leads us to some unintuitive conclusions, like siding with Sam Bankman-Fried on the double-or-nothing bet. But we may not have any other choice. I don’t know that the world is ready to hear this yet. Earlier, I tried to argue that SBF was correct about fanaticism but that his actions fail the ‘don’t sound like a maniac’ test. Unfortunately, if you say that you are willing to choose Option B in the thought experiment at the start of this post, you also seem like a maniac.
And though we may never encounter a genie offering us a utility gamble of this nature, our position on fanaticism does matter. We need to know whether to work on AI risk, which has a tiny probability of helping the world substantially. We need to know whether to go out and vote when there’s only a tiny chance that our votes will mean anything at all. There are surprisingly many situations in the real world where we have to choose between high probabilities of small utility and tiny probabilities of massive utility. We can’t simply evaluate each on a case-by-case basis and give up philosophical consistency. However, if we aspire to philosophical consistency, we must either accept fanaticism or accept even wackier positions, such as the claim that a long string of beneficial trades can make you worse off.
As is often the case in the study of ethics, the price of philosophical consistency is high, but the value of a correct ethical system is surely higher.
[1] I am assuming here that the worlds are independent and non-interacting.