Optimal Decision-Making Strategies for Sergeant Schultz
What should you do when all you can say is: "I know nothing. Nothing!"?
I'm reading Michael Schwarz's very interesting "Decision Making under Extreme Uncertainty", with its fascinating result that "invariance restrictions alone are sufficient to pin down the agent’s choices in some decision problems":
Suppose an agent... has no information relevant for estimating the variable. In this case her actions in decision problems where payoff is contingent on different dimensional variables are the same, i.e., in this case the name of a variable is merely an uninformative label. (Effectively, provided that an agent has never heard of either tugric or dugric her strategy for selecting a guess from an interval [1,4] is the same regardless if she is guessing the value of a “tugric” or the length of a “dugric”.) This imposes a severe restriction on an agent’s choices. For instance, if an agent is asked to “guess” exchange rate between currencies A and B conditional on the rate being between a and b, a “guess” of (a + b)/2 is not “reasonable” because, if this decision problem is reformulated in terms of exchange rate between B and A the range becomes 1/b to 1/a and the “guess” in the mirror decision problem (1/b+1/a)/2 is not a reciprocal of the guess in the original problem. A remarkable property of invariant decision problems is that a strategy in the “image game” must be reciprocal of the strategy in the original game.
Surprisingly, in an information vacuum the invariance considerations alone are sufficient to uniquely pin down the strategy of an agent in some decision problems. We showed that if the payoff relevant range in an invariant decision problem is given by [a, b], then an agent’s strategy in such a decision problem is approximated by the geometric mean given by √(ab). Combining the results of this section with expected utility axioms one can show that the prior of an agent in an information vacuum corresponds to the Jeffreys’ prior 1/x...
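The reciprocity requirement in the quoted passage can be checked with a few lines of arithmetic. What follows is a minimal sketch and not anything taken from Schwarz's paper: the function names `arithmetic_guess`, `geometric_guess`, and `is_reciprocal_consistent` are made up for illustration, and the interval [1, 4] is borrowed from the tugric example.

```python
import math

def arithmetic_guess(a, b):
    """Midpoint rule: guess (a + b) / 2."""
    return (a + b) / 2

def geometric_guess(a, b):
    """Geometric-mean rule: guess sqrt(a * b)."""
    return math.sqrt(a * b)

def is_reciprocal_consistent(rule, a, b, tol=1e-12):
    """True if the guess in the mirror problem on [1/b, 1/a] is the
    reciprocal of the guess in the original problem on [a, b]."""
    original = rule(a, b)
    mirror = rule(1 / b, 1 / a)
    return abs(original * mirror - 1) < tol

a, b = 1.0, 4.0
print(arithmetic_guess(a, b), arithmetic_guess(1 / b, 1 / a))  # 2.5 0.625
print(geometric_guess(a, b), geometric_guess(1 / b, 1 / a))    # 2.0 0.5
print(is_reciprocal_consistent(arithmetic_guess, a, b))        # False
print(is_reciprocal_consistent(geometric_guess, a, b))         # True
```

The midpoint rule gives 2.5 one way and 0.625 the other, and 1/2.5 is 0.4, so the two answers disagree. The geometric mean gives 2 and 0.5, which are reciprocals.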
I find myself wondering if there isn't a connection between Schwarz's idea of "invariance" and the "Grass Is Greener" Switching-Envelopes Problem, but I'm not smart enough to see clearly why I have a hunch that there is a connection.

Suppose an agent... has no information relevant for estimating the variable.
I suppose the response is to torture the agent until information - relevant or not - is obtained.
Posted by: pebird | June 15, 2005 at 10:50 AM
It seems to me this has more to do with the old paradoxes of indifference that Keynes discussed in his book on probability. The simplest such problem is the following.
Factory A makes cubes with side lengths varying between 1cm and 3cm. What should you "guess" as to the side length of the next cube? Perhaps 2cm (the midpoint between 1 and 3)?
Factory B makes cubes with volumes varying between 1cm³ and 27cm³. What should you "guess" as to the volume of the next cube? Perhaps 14cm³ (the midpoint between 1 and 27)?
Of course we've just got the same problem described twice over here, but the principle "guess the midpoint" plus the change of description led to a change of answer. I think the same thing is going on in Schwarz's case.
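To make the clash concrete, here is a small sketch using only the numbers from the two factory descriptions. The variable names are illustrative.

```python
side_lo, side_hi = 1.0, 3.0               # side lengths in cm
vol_lo, vol_hi = side_lo**3, side_hi**3   # the same cubes, described by volume in cm^3

# "Guess the midpoint" applied to each description of the same factory.
side_guess = (side_lo + side_hi) / 2      # 2 cm
vol_guess = (vol_lo + vol_hi) / 2         # 14 cm^3

# Translate each guess into the other description.
print(side_guess ** 3)        # 8.0 cm^3, not 14
print(vol_guess ** (1 / 3))   # roughly 2.41 cm, not 2
```

A 2cm cube has a volume of 8cm³, and a 14cm³ cube has a side of about 2.41cm, so the two midpoint guesses describe different cubes.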
Bas van Fraassen discusses these cases at length, as well as what policies will be description-invariant, in his book Laws and Symmetry.
Posted by: Brian Weatherson | June 15, 2005 at 11:02 AM
This sounds to me (off the top of my head...) like a scaling argument. And, as the previous poster noted, scaling arguments will give different answers in different dimensions-- unless you assume 'indefinite' or 'infinite' dimensionality (limit as d goes to infinity).
Posted by: Matt | June 15, 2005 at 11:07 AM
The Keynes problem is such that you really can't have a reasonable invariant solution unless you pick the dimension.
The Schwarz problem does produce a reasonable invariant guess. If I was going to trade dollars for currency X and I knew currency X was worth 0 to 1 dollars, I think the geometric mean guess of 0 dollars is better than the arithmetic mean guess of 1/2 dollars.
Posted by: Joe O | June 15, 2005 at 11:27 AM
I think what's left out is that in all of these cases, we actually do know something about what we're guessing -- we've been given the units, and from that we can actually infer something about the distribution of possible values, or at least find a scale-invariant way to guess at them (which creates stability in the mirror image problem).
The volumes of cubes, obviously, vary with a cubic function of the edge length.
Almost all values that have to do with money have exponential probability distributions -- hence why they're always plotted on logarithmic graphs -- and the geometric mean is the obvious stable function in that domain.
Posted by: Auros | June 15, 2005 at 11:51 AM
"Invariance" is even more restrictive than this. For the currency example, the range [a,b] is incoherent for arbitrary choices of (a,b). The interval [1,4], for example, doesn't make sense - if that is the range of possible exchange rates from currency A to B, it can't be the range if you switch the labeling of the currencies. If the identities of the currencies can be switched, then the range has to be of the form [a,1/a]. Given that range, the only legitimate guess is 1 (=sqrt(a*1/a))
Posted by: guest | June 15, 2005 at 12:17 PM
Can someone explain this to me? Personally, if I lacked any information about an industrial process, I would assume a normal distribution for linear measurements. The upper and lower bounds would be +3 sigma and -3 sigma.
I'd have to graph the distribution for the volume or math it out, but the mean would be 8 cm^3.
I think I'm missing something.
Posted by: anon | June 15, 2005 at 12:28 PM
I am unconvinced. The geometric mean is invariant under some transformations, but so what? What if I ask you to guess the log of the exchange rate, saying only that it is between log(A) and log(B)? Schultz would have you guess root(log(A)*log(B)), which is not (log(A)+log(B))/2, which is log(root(AB)), which is also what Schultz would have you guess.
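The mismatch described here is easy to see with numbers. A minimal sketch, with A = 2 and B = 8 picked purely for illustration (they do not come from the thread):

```python
import math

A, B = 2.0, 8.0  # illustrative values; both exceed 1 so the logs are positive

# Geometric-mean rule applied directly to the interval [log(A), log(B)].
geo_of_logs = math.sqrt(math.log(A) * math.log(B))

# Geometric-mean rule applied to [A, B], then translated into logs.
log_of_geo = math.log(math.sqrt(A * B))

# Arithmetic mean of the logs, which coincides with log_of_geo.
mean_of_logs = (math.log(A) + math.log(B)) / 2

print(geo_of_logs)                             # about 1.2006
print(log_of_geo)                              # about 1.3863
print(math.isclose(log_of_geo, mean_of_logs))  # True
```

So the same rule, applied to the same problem under two labelings, gives roughly 1.20 in one case and 1.39 in the other.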
Nothing comes from nothing. Our ability to learn from data depends on our being lucky enough to use the right words, not perhaps the first time or the second, but much sooner than is conceivable. In this case, success depends on guessing correctly whether the definition of the exchange rate which will make the world seem simple is A, 1/A, log(A), logistic(A), probit(A), or what.
Recall the bleen book (Fact, Fiction, and Forecast by Nelson Goodman). My sense is that our guesses about the way to look at the world work surprisingly well because our brains are physical objects, and the processes which occur when we try to imagine the world, the processes which occur when we perceive the world, and the world itself all follow the same physical laws. This, I think, is enough for our intuition to be, if not good, at least better than it could conceivably be.
Proofs that the only rational choice is the geometric mean reflect, I think, limited imagination and not the limits on conceivable universes.
Posted by: Robert Waldmann | June 15, 2005 at 01:19 PM
Robert,
I don't think Schwarz means that the geometric mean is always invariant. In your example, the arithmetic mean is invariant.
I do agree that definitions of "rational" as shorthand for "consistent" are not really persuasive. People are not consistency checking machines.
Posted by: Joe O | June 15, 2005 at 02:01 PM
Guest: If I'm guessing the exchange rate from dollars to foobars, and I'm told that one dollar is worth between 1 and 4 foobars (endpoint inclusive), then I know that the foobar is worth between 0.25 and 1 dollars. Using the geo-mean method, we'd guess a dollar is worth 2 foobars, and a foobar is worth 0.5 dollars. This is consistent.
Robert: Like Joe O says, I don't think the method is intended to be universally applicable -- it's just that it applies well to money because geo-means provide stable reflections in the kind of scale that generally applies in problems relating to money. It's the same reason that the geo-mean is the right figure for giving the average per-year growth of a stock price over a decade, from the growth for each year.
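A quick sketch of the growth-rate point. The yearly growth factors below are hypothetical, invented for illustration, and cover four years rather than a decade to keep it short.

```python
import math

# Hypothetical yearly growth factors (1.10 means the price rose 10% that year).
growth = [1.10, 0.95, 1.20, 1.05]

total = math.prod(growth)                  # overall growth across all years
geo_mean = total ** (1 / len(growth))      # geometric mean of the yearly factors
arith_mean = sum(growth) / len(growth)     # arithmetic mean of the yearly factors

# Compounding the geometric mean reproduces the total growth;
# compounding the arithmetic mean overshoots it.
print(math.isclose(geo_mean ** len(growth), total))  # True
print(arith_mean ** len(growth) > total)             # True
```

Compounding the arithmetic average would overstate how much the price actually grew, which is why the geo-mean is the one to quote.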
Posted by: Auros | June 15, 2005 at 03:46 PM
BTW, I've been intrigued by the way interesting things fall out of scale invariance since I first learned about Benford's Law:
http://plus.maths.org/issue9/features/benford/
Posted by: Auros | June 15, 2005 at 03:48 PM
OK OK I haven't read Schultz and I seem to be in a bad mood.
I don't concede that it is reasonable to decide that we want a number and that we want it to be invariant under this and that. That seems to me to be close to assuming the answer. However, I am interested by the argument that invariance under inversion is right for exchange rates. Hmmm, well, you could say that the choice between dollars/euro and euros/dollar is arbitrary. I certainly don't accept that there is a number which all people (say, those deciding whether to visit across the Atlantic next year) should use. If I am risk averse and you are risk neutral, there shouldn't be any function from the extremes of the distribution which works for both of us. I care about log(dollars/euros), that is, the log of what I can buy in the USA (actually more like -($/euro)^(-3) but I'm trying not to be picky). Note I live in Europe and earn euros. You care about euros/dollars. Any rule which gives us the same certainty equivalent must be wrong. There can't be a rule for summarizing an interval with a number which works for the brave and for cowards.
I still see little progress in discovering that if you want a function f(a,b) so that 1/f(1/a,1/b) = f(a,b) and f(a,b) = f(b,a) and a<= f(a,b) <= b or b<=f(a,b)<=a then f(a,b) = root(ab). That's a lot of ifs. I could just say that if when faced with two numbers you want to take the geometric mean then f(a,b) = root(ab). I don't see what is gained in making an implausible assumption which isn't the conclusion but rather is removed by a bit of math.
It is important that Schultz assumes that we want to choose something like a mean. I would say that, if you don't know, you don't know and don't convince yourself that you do. If someone presents you with a problem, you know more than absolutely zero because you are human too and you can try to imagine what this person is up to. If a Martian presents you with such a problem, it's probably hopeless.
Consider this problem presented by (Schelling, was it?) to a class: go to New York, go to one spot, and wait there to meet someone else from the class who has just received the same instruction. The claim is that half said they would go to union station and wait under the big clock. No way is invariant under inverse gonna do that well ever. I don't know if this story is true, but I heard it from Brad DeLong so in this thread it is true.
Posted by: Robert Waldmann | June 15, 2005 at 04:55 PM
Paul would not like this. Why do guys who like strange rules like to take the mean of logs then go back ? What is the source of the strange charm of that kind of mean ? Why do guys try and try to solve things when they know it can not be done ? Why ask why.
This post sure sounds dumb, but just try to write "geometric mean" with words of just one syllable.
Also, Brad, you know you mention Sergeant Schultz a lot. Some of the culturally deprived youths who read this might never have seen Hogan's Heroes. Also, I always thought you never watched TV. Finally, given that Google says that in your Semi-Daily Journal Sergeant Schultz loses to Charles Schulz only 2 to 5, do you think you want to reconsider this post http://delong.typepad.com/sdj/2005/06/matthew_yglesia_1.html?
Posted by: Robert Waldmann | June 15, 2005 at 06:14 PM
Although this note is off-subject, I think some of you may be interested in it. I just watched on CSPAN-2 a very remarkable Senate Budget Committee hearing on the subject of "Solvency of the Pension Benefit Guaranty Corporation." As you know, the PBGC, which is responsible for the private pensions of large corporations that have failed, has recently taken on some large liabilities, including the pension obligations of bankrupt airlines like United. From a small surplus in 2000, the PBGC is currently facing a shortfall of $23 billion. The two guests at the hearing were Douglas Holtz-Eakin, CBO head, and Bradley Belt, executive director of the PBGC. Some very sharp questioning by the best financial minds on Capitol Hill, like Sen. Kent Conrad, Sen. Mike Enzi, Sen. Gregg, Sen. Byrd, et al., covered a lot of ground and offered insights into what is wrong with the design of private pensions from a regulatory point of view. You may be able to access this hearing later today or on the web page of CSPAN with your real audio software.
Posted by: Ralph | June 15, 2005 at 07:10 PM
Robert W seems obviously right to me. I don't see why invariance is in any way a desirable property here. If the only information you have about a choice is the type of units it is being offered in, then why on earth would it be a good idea to throw away the only information that you have?
I also think it's extremely perceptive to bring Nelson Goodman's book into the discussion; the whole point of Goodman's work is that there aren't any general-purpose inference rules because everything is dependent on our (possibly arbitrary) decision of what constitutes a natural kind of thing. So it makes no sense to try to arbitrarily construct one out of the invariance principle. (It also strikes me that the index number problem is relevant here; it looks like an analogous problem is definitely insoluble, so why should this one be soluble?)
Posted by: dsquared | June 16, 2005 at 05:34 AM
Curiously enough, I caught some of the PBGC hearing as well--both Holtz-Eakin and Belt did very good jobs...
Posted by: Brad DeLong | June 16, 2005 at 06:18 AM
I'm not sure, dsquared.
I look at it something like this:
(1) Suppose the problem is guess Joe's age; it's between a and b.
One possibility is to say I'll guess (a+b)/2 because this minimizes the maximum possible error. This reasoning survives translation into the forex problem. BUT
(2) A second possibility (and one that I think is more reasonable to those of us who are scientists, as opposed to folk reasoning) is to view the problem as one of a probability distribution. In the absence of any further info, our best guess for the pdf of Joe's age distribution is flat between a and b, and with this pdf we can calculate expectation values, variances and anything else.
(3) OK now switch to the forex problem. In this case we now have an additional (though implicit) piece of information which is that we are dealing with exchange rates. Mathematically this translates into saying that the pdf of X has no reason to be different from the pdf of 1/X; what we want is something that is as near to flat as possible, in some sense, but that ALSO preserves its structure under this transformation. Fortunately this isn't too hard a problem. As people have pointed out, if we switch to log X, and use a pdf that is flat between log a and log b we have what we want. (This sort of pdf, when translated back into X space [or 1/X space] has the form k/X between a and b, if anyone cares.) One can now ask for things like the expectation value of this pdf in its natural space, and of course one gets the sqrt value already discussed.
(4) This is not especially original reasoning. Remember that Maxwell derived the velocity distribution for gases using the same sort of ideas, stating a few properties that he felt his pdf had to fulfill and then finding a distribution compatible with those properties.
(5) Regarding Keynes' problem, I can't think of a way to describe it in natural language, but if you have the sort of problem described, and so have a situation that is kinda like what we have but now you feel that the pdf you want should have the same structure both in the range [a,b] and as X^2 or X^3, you can presumably again transform to logs and everything follows as before --- the natural best guess/expectation value/whatever is sqrt(ab).
(6) OK smartypants, think up an equivalent problem whose answer would be the harmonic mean. Hmm, isn't that interesting?
The purely math parts of the problem are trivial --- one would transform the random variable X to 1/X, assume a flat distribution for 1/X between 1/a and 1/b and there you are; again if you care the equiv pdf for X is now k/X^2 between a and b.
The interesting problem is more philosophical/psychological; what's a wording that implies that the natural formulation for the problem should be a flat distribution in the reciprocal, not the given variables? What springs to mind is rates, of course; something like "this computer can do x calculations per second where x is between a and b. Guess x." But we're now at a pretty geeky place, and a more natural example doesn't spring to mind. Perhaps the reality is that harmonic means really are pretty geeky, that they're not going to appear out of some folk/intuition argument.
(7) Perhaps the most interesting thing this whole train of reasoning shows is that there is real physical content to statistical mechanics. In particular the founding axiom of statistical mechanics, that possible states of a fixed energy system have a flat distribution in energy, is not content-free --- one could imagine that the appropriate variable in which the distribution is flat could be not energy but log energy or 1/energy or some other weirdness.
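Points (2), (3) and (6) can be put side by side in a few lines. This is only a sketch under the assumption made in those points: the pdf is flat in some transformed coordinate, and the guess is the expectation in that coordinate (for a flat pdf, the midpoint) mapped back to X. The helper name `guess_from_flat_prior` and the interval [1, 4] are illustrative.

```python
import math

def guess_from_flat_prior(a, b, to_flat, from_flat):
    """Take the midpoint in the coordinate where the pdf is assumed flat,
    then map it back to the original variable."""
    lo, hi = to_flat(a), to_flat(b)
    return from_flat((lo + hi) / 2)

a, b = 1.0, 4.0

# Flat in X: arithmetic mean.
arithmetic = guess_from_flat_prior(a, b, lambda x: x, lambda y: y)
# Flat in log X (pdf k/X in X space): geometric mean.
geometric = guess_from_flat_prior(a, b, math.log, math.exp)
# Flat in 1/X (pdf k/X^2 in X space): harmonic mean.
harmonic = guess_from_flat_prior(a, b, lambda x: 1 / x, lambda y: 1 / y)

print(arithmetic)  # 2.5, i.e. (a + b) / 2
print(geometric)   # 2.0 up to rounding, i.e. sqrt(a * b)
print(harmonic)    # 1.6, i.e. 2ab / (a + b)
```

The three classical means fall out of the same recipe. The only thing that changes is the coordinate in which ignorance is taken to be flat, which is where the real content of the assumption sits.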
Posted by: Maynard Handley | June 16, 2005 at 06:46 AM
I'm with Ben.
The writer could just have said "if you don't know what to do don't do anything extreme" but he gets paid by the word, or possibly by the letter.
Good advice when considering Kyoto.
Posted by: Neil Craig | June 20, 2005 at 02:35 PM
In the original post Brad mentioned that he intuits a connection between invariance considerations and the "grass is greener" problem (http://www.j-bradford-delong.net/movable_type/archives/001395.html). Let me offer an argument that suggests that his intuition is correct.
A naive agent in the "grass is greener" problem thinks that there is a 50/50 chance that the remaining envelope has half the money or double the money. A naive agent forgets that he has some sense about the budget involved in the experiment. What if the agent has no preconceived notion about the size of the budget involved? In that case it may be sensible to assign probability 50% to the event that the other envelope has twice the money. What is the probability distribution that could support such a belief? The simple calculation that follows shows that such a distribution is proportional to 1/x. Intuitively we can say that such a distribution represents the lack of preconceived notions about how much money is in play. I use invariance to axiomatically derive beliefs that are free of preconceived notions about what is a lot and what is little. Perhaps it is not a coincidence that in "the information vacuum" invariance axioms imply a posterior proportional to 1/x.
Now let me formulate a modified "Grass is Greener" problem and show where the distribution 1/x pops up. Consider an agent who plays the following game: an amount x is drawn from a distribution r(.), and that amount is placed in an envelope. Then a fair coin is flipped; if the outcome is heads the amount 2x is placed in another envelope, otherwise x/2 is placed in the other envelope. Which of the two envelopes is given to the agent is decided by a second coin flip: the agent receives the envelope containing x if the outcome of the second coin flip is heads, otherwise the agent receives the other envelope. What is the distribution r(.) such that a Bayesian agent who received an envelope containing some amount z would assign probability 50% to the event that the other envelope contains 2z?
Note that the state of the world is a triplet containing the information about the draw from the distribution and the outcomes of both coin flips. For instance in a state of the world (x,coin_1=H,coin_2=T) the agent receives an envelope containing amount 2x and the remaining envelope contains amount x.
Suppose an agent finds an amount in [z,z+d] in his envelope, where d is small. There are four distinct subsets of states of the world that correspond to this event:
(A) x∈[z,z+d] and coin_1=H, coin_2=H
In this case the agent actually gets the envelope containing the draw from the distribution, and the second envelope contains ≈2z. The probability of this is r(z)d/4.
(B) x∈[z,z+d] and coin_1=T, coin_2=H
Here the agent also gets the envelope containing the draw from the distribution; the other envelope contains ≈z/2. The probability of this is r(z)d/4.
(C) x∈[z/2,(z+d)/2] and coin_1=H, coin_2=T
The draw from r(.) is x≈z/2, and the agent receives the envelope containing double the draw from the distribution. The probability of this is r(z/2)d/8.
(D) x∈[2z,2(z+d)] and coin_1=T, coin_2=T
The draw from r(.) is x≈2z, and the agent receives the envelope containing half that amount. The probability of this is r(2z)d/2.
Note that in events A and D the agent will double the money by switching envelopes, and in events B and C the agent will find half as much in the second envelope as in the first. The probability of doubling the money by switching envelopes is therefore
(r(z)+2r(2z))/(r(z)+2r(2z)+r(z)+r(z/2)/2).
It is easy to see that for r(z)=1/z the numerator is 2/z and the denominator is 4/z, so that probability equals 0.5.
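For readers who prefer to check the arithmetic by machine, here is a minimal sketch of the ratio above. The function name `prob_double` and the two comparison densities (flat and exponentially decaying) are illustrative additions, not part of the argument. They are there only to show that other choices of r(.) do not give 0.5.

```python
import math

def prob_double(r, z):
    """Probability that the other envelope holds 2z given that the agent
    sees z, from cases (A)-(D) with the common factor d/4 cancelled."""
    doubles = r(z) + 2 * r(2 * z)      # cases A and D
    halves = r(z) + r(z / 2) / 2       # cases B and C
    return doubles / (doubles + halves)

def inverse(x):
    """r(x) proportional to 1/x, the distribution discussed above."""
    return 1 / x

def flat(x):
    """Illustrative alternative: a flat (improper) density."""
    return 1.0

def decaying(x):
    """Illustrative alternative: an exponentially decaying density."""
    return math.exp(-x)

for z in (0.5, 1.0, 10.0):
    # Under 1/x the first probability is 0.5 for every z; the others are not.
    print(z, prob_double(inverse, z), prob_double(flat, z), prob_double(decaying, z))
```

With r(x)=1/x the answer is 0.5 whatever amount z the agent sees. With the flat density it is 2/3 for every z, and with the decaying density it depends on z.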
P.S. Some of the follow-up commenters wondered about the connection between my paper and earlier applications of invariance in statistics. This question is extensively discussed in my paper. http://www-stat.wharton.upenn.edu/Seminars/Seminars-Fall2003/schwarz.pdf
Michael Schwarz
Posted by: Michael Schwarz | June 25, 2005 at 08:21 PM