Home > Chow-Robbins

The Chow-Robbins Problem

This page details examination of the Chow-Robbins problem, performed purely for interest.
Jump to: Summary, Intro, Analysis, Computation, Results, References,

Summary

Theoretical analysis produced no useful additional data or insight. Prior analysis by others had suggested EV(0,0)=0.79295350640… [1]. Machine calculation to higher precision was performed and for a larger maximum number of tosses n; the results confirmed previous calculation, but suggest only EV(0,0)=0.7929535064… with uncertainty over the next digit, as the ratio of deltas between successive EV(0,0) estimates with increasing n does not as yet seem to be converging sufficiently.

Introduction

The Chow-Robbins problem came to my attention from here [1]. The problem is interesting because while being very simple to describe, there is intruiging complexity in determining the solution.

	Statement of the Problem
	A fair coin is tossed repeatedly, stopping at any chosen time; the score achieved is the proportion of tosses which are heads. What is the optimal strategy for deciding when to stop, and what is the expected value for the score achieved if an optimal strategy is used?

Note that this case is a subset of the more general optimal stopping problem, where each step could have many possible outcomes each with different probabilities themselves, such as rolling a weighted die.

Limited game solution

The following table shows the final score S(h,t) achieved if the player decides to stop at any number of tosses up to six, where the tosses have generated h heads and t tails:

	t=0	1	2	3	4	5	6
h=0	-	0.00	0.00	0.00	0.00	0.00	0.00
1	1.00	0.50	0.33	0.25	0.20	0.17
2	1.00	0.67	0.50	0.40	0.33
3	1.00	0.75	0.60	0.50
4	1.00	0.80	0.67
5	1.00	0.83
6	1.00

Final score on stopping at any position up to 6 tosses

The player must therefore decide at any time whether to continue tossing or stop, by considering whether they would on average improve their final score by continuing, or be better off stopping at this point. For a game with limited maximum number of tosses n, the expected value EV(h,t) at each position can then therefore be determined:

For the positions where h+t=n, EV(h,t)=S(h,t) since we have reached the maximum number of tosses For every position where h+t<n If we stop, then EV(h,t)=S(h,t) If we continue, then EV(h,t) = (EV(h+1,t)+EV(h,t+1))/2 We therefore decide to stop only when there is no benefit in continuing, when S(h,t)>=(EV(h+1,t)+EV(h,t+1))/2

Using this rule, the expected values for all positions in our limited game from above become:

	t=0	1	2	3	4	5	6
h=0	0.72	0.44	0.34	0.25	0.17	0.08	0.00
1	1.00	0.54	0.42	0.33	0.25	0.17
2		0.67	0.51	0.42	0.33
3			0.60	0.50

Expected Value EV(h,t) for all positions with optimal stopping for a maximum of 6 tosses

Full problem solution

To consider a game with an unlimited number of tosses, the following assumption is made: that EV(h,t) for any position is a mininum of 0.5, based on the justification that at any such position the player could decide to continue tossing forever until this value was achieved with probability 1.

In the full problem there is no limit on the number of tosses. In order to estimate EV(h,t), the following assumption is made: that in any case where EV(h,t)<0.5 the player would continue to toss until h/(h+t)=0.5 which will occur with probability 1; EV(h,t) can therefore be assumed to have a minimum value of 0.5 in any position.

In order to make an estimation for EV(0,0) for the unlimited game, we can therefore consider the position for up to n tosses; Using n=6 as above, the previous table now becomes:

	t=0	1	2	3	4	5	6
h=0	0.78	0.55	0.51	0.50	0.50	0.50	0.50
1	1.00	0.60	0.53	0.50	0.50	0.50
2		0.67	0.55	0.50	0.50
3			0.60	0.50

Expected Value EV(h,t) with optimal stopping for unlimited n, calculated to maximum of N=6

Thus it can be determined that EV(0,0)>=0.777 for the unlimited case by calculation for n=6. As n increases, the estimate for EV(0,0) appears to tend to a limit [1].

Note that while this example suggests a simple rule of "stop when h>t", when the maximum number of tosses is increased it is found that this is no longer sufficient.

Analysis

Initial theoretical examination of the problem suggested a possible solution in the form of a summation based on factors from increasing series of binomial expansion coefficients; However on further examination of the pseudo-chaotic determination of stopping points, this seemed less viable so the analysis was re-focussed on creation of a multiple-precision fixed point implementation for large numbers of tosses. Specifically, the primary goal became to enable higher accuracy result calculation; this would then also facilitate further examination of whether the ratio of the difference between successive EV(0,0) estimates tends towards a constant.

Computation

Implementions of the Chow-Robbins computation were created in C while examining the problem; initially starting with a basic double-precision floating point implementation. Subsequently a multiple-precision fixed point implementation was created and optimised, with several goals:

to validate and compare with existing results from [1]
to validate the apparent convergence using higher-accuracy results
to take this further for higher n and see if the trend continues

In order to achieve this, the following optimisations were implemented:

Implementation steps:

Double-precision floating point implementation
Fixed-point arbitrary-precision model for maximum n=2ⁱ tosses
Optimisations:

inline calculation functions for speed
fixed precision function calculations optimised for this application
compiler optimisation (for speed, calculations validated against previous to i=20)
modified calculation from diagonals to vertical method (calculation by column, with decreasing t)
reduced range of each column calculated to none below machine precision
reduced range of each column calculated to avoid calculation where h is greater than the known stop point from columns with greater t
dual (ping-pong) column segment buffers to (dramatically) reduce memory requirement
ability to save/resume calculation

In particular, memory optimisation was performed to limit the buffer sizes required for the calculation; By examination of the column buffer sizes required for calculation where i<=20, it was determined that a buffer size of 11*1.41ⁱ values was sufficient for calculation to 28 decimal places of accuracy for i<20, and for all i>=20 with an increasing spare margin. Using a pair of buffers, and with each value requiring 16 bytes, this becomes a total of 352*1.41ⁱ; For i=28 the memory required is only just over 5MB and therefore easily manageable.

In addition, with high i runs taking several days, the ability to automatically save current state and to resume from previous saved state was added to facilitate restart following any power-cut or interruption for other cause (e.g. machine needed for other purposes).

Results

The following results show the estimate for EV(0,0) as calculated using U32.96 bit precision (i.e. 96 bits of precision after the decimal point), which is equivalent to 28 decimal places. The rounding stratgies implemented should provide accuracy to this last bit, so at least 27 decimal places of calculation accuracy.

The PARI/GP calculation engine [3] was then used to calculate at high precision the difference between successive estimates, the ratio between successive differences, and thereby the estimated limit, where Limit=EV+Diff/(1/Ratio–1) as the final result provided that successive values of Ratio continue to tend towards a constant limit.

i	n	EV(0,0)	Delta	Ratio	Limit (est.)
3	8	0.7799479166666666666666666666
4	16	0.7852476574125744047619047618
6	64	0.7900232244244767745603163299	0.0018323254529310023873224968	0.6225535404478935284432033117	0.7930454297442156636962305979
7	128	0.7911968421606349052304977492	0.0011736177361581306701814193	0.6405072495613715307068802118	0.7932878736690753445432831873
8	256	0.7918975370793603120599453450	0.0007006949187254068294475958	0.5970384539510714573933760190	0.7929357051504059609469791952
9	512	0.7923182242915051320773415975	0.0004206872121448200173962525	0.6003857041093826333882485148	0.7929502702184669538814051416
10	1024	0.7925704383240074703561009768	0.0002522140325023382787593793	0.5995286408076376958376989896	0.7929480172260385348986090149
11	2048	0.7927224938475998836189902076	0.0001520555235924132628892308	0.6028828851582774429669050632	0.7929533367635597859212903275
12	4096	0.7928144143569131534117117336	0.0000919205093132697927215260	0.6045193699090068587248183068	0.7929549211836545482919368694
13	8192	0.7928697526753007425659962106	0.0000553383183875891542844770	0.6020236267294096198433094927	0.7929534636123571788875491481
14	16384	0.7929030566167815956792589990	0.0000333039414808531132627884	0.6018242413438110737743002611	0.7929533939838355745956930173
15	32768	0.7929231096159555050505617958	0.0000200529991739093713027968	0.6021208986761555520945535044	0.7929534563465165527778279834
16	65536	0.7929351959569924273442376235	0.0000120863410369222936758277	0.6027198690880930164616999313	0.7929535323330674321068765712
17	131072	0.7929424763794650187451862936	0.0000072804224725914009486701	0.6023677844560732408696133799	0.7929535053951817662225597556
18	262144	0.7929468624282204411140251513	0.0000043860487554223688388577	0.6024442636309255556928772146	0.7929535089174318474794079856
19	524288	0.7929495040796001778699219954	0.0000026416513797367558968441	0.6022850011575782130683964980	0.7929535044995213110156491231
20	1048576	0.7929510953144074217685469331	0.0000015912348072438986249377	0.6023636651867618155403571598	0.7929535058135161472687355962
21	2097152	0.7929520539320792875105459335	0.0000009586176718657419990004	0.6024363390630686501130662810	0.7929535065450254480349344705
22	4194304	0.7929526313911985760810804784	0.0000005774591192885705345449	0.6023873085551106452156438011	0.7929535062476912216647779680
23	8388608	0.7929529792942063975497508676	0.0000003479030078214686703892	0.6024720992372327133757216090	0.7929535065574627651366755916
24	16777216	0.7929531888720279025765418135	0.0000002095778215050267909459	0.6024030169137675363734485429	0.7929535064054029620686870211
25	33554432	0.7929533151208519469761595019	0.0000001262488240443996176884	0.6023959173627133983027588045	0.7929535063959909539565080734
26	67108864	0.7929533911744300846845356856	0.0000000760535781377083761837	0.6024101904581827553162828355	0.7929535064073897617973420030
27	134217728	0.7929534369899290286017952583	0.0000000458154989439172595727	0.6024108275479194774451910410	0.7929535064076962759040097845
28	268435456	0.7929534645897849993267635433	0.0000000275998559707249682850	0.6024130830597293036631187494	0.7929535064083499945796124577
29	536870912	0.7929534812164250961971926337	0.0000000166266400968704290904	0.6024176399509557080891900088	0.7929535064091456339872856045

Machine calculation of EV(0,0) and limit estimation up to i=29

From the above, the following points are noted:

Agreement to the stated 14 digits of accuracy with [1] for calculation of EV(0,0), plus extension of these results to 28 digit accuracy (note when comparing that n is used on this page to represent the maximum number of tosses, while on [1] it is defined as the maximum number of heads where calculation starts at h=t, and is therefore half the value for the same test case).
The results suggest EV(0,0)=0.7929535064… Note that this is one less digit of precision than in the conjecture from [1] ("the expected value ≈0.79295350640…, with the next digit probably being not more than 1 different from 7") as despite the increased precision and larger n computation, the ratio of deltas between successive EV(0,0) estimates with increasing n does not as yet seem to be converging sufficiently.
Additional precision on calculation has not provided increase accuracy over the current estimate from [1]. The additional precision would likely be useful for longer runs for much higher i as would be required for increase accuracy of calculation.
Calculation for i=29 at 28 digit precision will likely be the last result provided as the calculation at this precision took 32 days, and for each increment to i the calculation time increases by a factor of approximately 2.8.

Phil W.
Sep 2013.

References

[1] "The expected value of sn/n ≈0.79295350640…", J. D. A. Wiseman
(http://www.jdawiseman.com/papers/easymath/coin-stopping.html)

[2] "On optimal stopping rules for S_{n}/n", Y. S. Chow and Herbert Robbins, Illinois J. Math. Volume 9, Issue 3 (1965)

[3] PARI/GP calculation engine
(http://pari.math.u-bordeaux.fr/)