notes_20on_20the_20use_20of_20rnd

*by Richard Russell, February 2010, revised May 2013*

This article expands on the description of RND in the main documentation, discusses some of the limitations and pitfalls in its use, and provides some code examples.

The **RND** function comes in five variants, as follows:

**RND** on its own returns a signed 32-bit pseudo-random integer value, i.e. a value in the range -2147483648 to +2147483647. The sequence length is 2^33-1 (8589934591), which means that after this number of values have been returned the sequence repeats.

If you call RND precisely 8589934591 times, each of the non-zero integer values (&00000001 to &FFFFFFFF) will be returned **exactly twice** and the value zero (&00000000) will be returned **exactly once**. Therefore, the statistical likelihood of RND returning zero is half that of it returning any specified non-zero value.

**RND(n)**, where n is a positive integer greater than 1, returns a pseudo-random integer value in the range 1 to n. RND(n) is equivalent to the following function:

DEF FNrnd(n%) LOCAL R R = RND IF R < 0 R += 2^32 = R - n% * INT(R / n%) + 1

Note that the statistical distribution of the returned values depends on the value of **n**. For small values of **n** the probabilities of each of the possible returned values (1 to n) are very nearly equal. For example if n=7 the likelihoods of the different values are as follows:

1: 1227133513 / 8589934591 = 0.142857142857 2: 1227133514 / 8589934591 = 0.142857142974 3: 1227133514 / 8589934591 = 0.142857142974 4: 1227133514 / 8589934591 = 0.142857142974 5: 1227133512 / 8589934591 = 0.142857142741 6: 1227133512 / 8589934591 = 0.142857142741 7: 1227133512 / 8589934591 = 0.142857142741

So returned values **5**, **6** and **7** are very slightly less probable than values **2**, **3** and **4**, with returned value **1** between these two probabilities. Although this variation is unlikely to be significant, it should be borne in mind for critical applications.

**RND(1)** returns a floating-point value in the range 0 ⇐ R < 1, that is the returned value is **greater than or equal to** zero but **less than** one. RND(1) is equivalent to the following function:

DEF FNrnd1 LOCAL R R = RND R = (R >>> 16) OR (R << 16) IF R < 0 R += 2^32 = R/2^32

Note that the distribution of returned values reflects the distribution of the values returned by **RND**, so the likelihood of the value zero being returned is less than other values. Irrespective of the ***FLOAT** mode in use, RND(1) returns a 40-bit floating point number (32-bit mantissa).

**RND(0)** returns the previous random number in RND(1) format. In other words, the pseudo-random-number generator does not advance to the next number in the sequence, but repeats the previous result.

Setting the parameter of **RND** to a negative integer (i.e. a value in the range -2147483648 to -1) **seeds** the generator (see below) and returns that same value.

As mentioned above, the sequence-length of BBC BASIC's pseudo-random number generator is 2^33-1 (8589934591), so unless your program calls RND at least that number of times (unlikely!) only part of the sequence will be utilised. Determining the starting point in the sequence is called **seeding** the generator.

When *BBC BASIC for Windows* is executed the random number generator is seeded from the value of TIME, but since this counts the number of centiseconds since the PC was last restarted it is likely to be quite small compared with the sequence length. Suppose for example the PC has been running for 24 hours, so TIME is approximately 8640000, this corresponds to only about one-thousandth of the overall sequence length! So relying on the automatic seeding will not make good use of the potential performance of RND.

To improve the performance you should seed the generator yourself, using the **RND(-n)** option. To do better than the automatic seeding you need to choose a value which is highly variable, not predictable and can range over most or all of the available range (-2147483648 to -1). On a Windows PC a suitable source of such a seed is the Performance Counter: a 64-bit integer value which counts at a rate up to the clock speed of the CPU. The following code may be used:

DIM pc{l%,h%} SYS "QueryPerformanceCounter", pc{} seed% = RND(-ABS(pc.l%)-1)

It should be noted, however, that the rate at which the Performance Counter increments is extremely variable between systems, and indeed one isn't guaranteed to exist at all. Therefore if you are lucky enough to have a better source of seed available you should use that instead.

However good your source of seed, since RND(-n) can only take a maximum of 2^31 different values (compared with the total sequence length of 2^33-1) only about **one quarter** of the overall sequence is likely to be exploited. If this is important to you, the pseudo-random number generator can alternatively be seeded by writing values directly to its memory locations, see Interpreter internal variables:

!412 = seedl% ?416 = seedh%

Here **seedl%** is a 32-bit value (&00000000 to &FFFFFFFF) and **seedh%** is a 1-bit value (0 or 1). Note that they **must not both be zero**. Using this technique you can initialise RND to any point in its sequence.

The built-in RND function is likely to be good enough for most requirements, but it is very important to appreciate that it is **not** good enough for some applications. For example, suppose one is using RND to select six lottery numbers, each in the range 1 to 49. The maximum number of possible combinations is (49*48*47*46*45*44)/(6*5*4*3*2*1) which is approximately 14 million and therefore significantly fewer than the sequence length of RND (n.b. the order of the numbers is not important). Therefore RND should be suitable for this kind of task.

However suppose that instead we want to shuffle a pack of 52 playing cards. Now the number of possible permutations is 52x51x50x….x2 or about 8*10^67! This is a massively larger number than the length of the RND sequence, so only a minuscule fraction of all the possible shuffles can be achieved using RND. There are no 'workarounds' to this; RND just isn't suitable for shuffling a pack of cards in critical applications (such as an electronic gaming machine or online casino).

There is no shortage of pseudo-random number generator algorithms with a longer sequence length than RND. For example the assembler-code algorithm here has a sequence length of 2^64-1 (1.8E19), which although far longer than RND is still hugely less than the number of permutations of a pack of 52 cards. The Mersenne Twister has a mind-boggling sequence length of 2^19937−1 (about 10^6000), although the BBC BASIC code at that link can only take a 32-bit seed which somewhat defeats the object. Two modern, high-performance algorithms are listed at High Quality Random Number Generation.

One thing not to attempt is to write your own algorithm, unless you're a top-notch mathematician! An algorithm with an unintentional bias is only too easy to create and the consequences could be serious.

Using an improved algorithm is only part of the solution. Even if the sequence length is adequate for the task in hand, you must be able to seed it so that a big enough portion of this length is exploited. For example, in the card shuffling case you need to be able to generate a seed with around 230 bits to ensure all possible permutations of cards could, in principle, be generated. Finding an external source of 'randomness' with that number of bits is by no means easy. For example the Performance Counter, as suggested above, is woefully inadequate with only 64 bits of precision.

The Windows API function CryptGenRandom (available on Windows 2000 and later) provides a *cryptographically secure* random number of a specified length, in bytes. The function below returns a random 32-bit integer using this method:

DEF FN_RndSecure LOCAL R%, P% SYS "CryptAcquireContext", ^P%, 0, 0, 1, 0 SYS "CryptGenRandom", P%, 4, ^R% SYS "CryptReleaseContext", P% = R%

CryptGenRandom uses various sources of *entropy* to seed the generator, for example the tick-count since boot time, the current clock time and the Performance Counter.

The following routine selects and prints six lottery numbers, each in the range 1 to 49:

max = 49 num = 6 DIM lotto(max) FOR I = 1 TO max lotto(I) = I NEXT I FOR choice = 1 TO num R = RND(max) PRINT lotto(R) lotto(R) = lotto(max) max = max-1 NEXT choice

If you want to display the selection in ascending sequence, you can put the numbers in a second array and sort it:

INSTALL @lib$+"SORTLIB" sort% = FN_sortinit(0,0) max = 49 num = 6 DIM lotto(max), choices(num) FOR I = 1 TO max lotto(I) = I NEXT I FOR choice = 1 TO num R = RND(max) choices(choice) = lotto(R) lotto(R) = lotto(max) max = max-1 NEXT choice C% = num CALL sort%,choices(1) FOR choice = 1 TO num PRINT choices(choice) NEXT

Notwithstanding the comments made above, this routine uses RND for the convenience of the example. In practice a better random number generator ought to be used:

cards = 52 DIM pack(cards) FOR I = 1 TO cards pack(I) = I NEXT I FOR N = cards TO 2 STEP -1 SWAP pack(N),pack(RND(N)) NEXT N FOR I = 1 TO cards PRINT pack(I); NEXT I PRINT

This uses Durstenfeld's algorithm for the Fisher–Yates shuffle.

This website uses cookies for visitor traffic analysis. By using the website, you agree with storing the cookies on your computer.More information

notes_20on_20the_20use_20of_20rnd.txt · Last modified: 2020/07/27 13:49 by richardrussell

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International