Buffon’s needle is a classic Monte Carlo simulation that we can conduct in a classroom. We give the students, say 10 needles each, and have them drop the needles on a paper that we provide also. The paper is special, in that it has parallel lines that are separated by the length of a needle. Each student records how many needles intersect one of the lines, then we tally their numbers to arrive at a very special number. (Guess who?)
In a class of 30, we have 300 samples to determine this special number. If the students had repeated the experiment ten times (any more than this risks frustration and wrath), we would have 3,000 samples at best. But with a computer program, we can make a million samples easily. In less than a second!
So today, we will learn how to write a program that generates needles (line segments). Our program needs to determine if a needle intersects a line. In general, checking if two line segments intersect is not an easy task. It’s rather amazing how we can look at many line segments (like the ones in the picture above) and instantly tell which lines intersect with which others. In our problem, fortunately, the lines on the paper are parallel and are equally spaced apart. We can use this fact to come up with a simple solution. Lastly, we will learn how to vectorize our code. Vectorization allows us to arrive at the answer quickly.
1. Problem description
Consider a paper with parallel lines that are spaced
units apart and a needle that is
units long. When we drop the needle on the paper, what is the probability that the needle intersects one of the lines?
Think about what the input and output of a simulation are. Without loss of generality, we can assume that the lines are horizontal and spaced one unit apart, i.e. and
. “WLOG” assumptions help us by simplifying the problem and leading us to the same conclusion had we not made these assumptions.
Why can we, you ask? If the lines are at an angle to the horizontal axis (horizontal from our viewpoint), we can move our body around the paper so that the lines now appear to be horizontal from our new viewpoint. If the distance between two lines appears less than one unit, we can “scooch in” until the lines look one unit apart and a needle looks one unit long. If the distance is greater than one, we can “scooch out.” When we move and scooch around, we do not influence the paper, the lines, or the needle. We are simply an observer.
2. A single needle
Two points create a needle (line segment). The points and
cannot be arbitrary, however, as the needle needs to be 1 unit long.
But not to worry. We can meet the length requirement by expressing the coordinates of the second point in polar form with respect to the coordinates of the first point:
These equations show that we need three numbers to create a needle: ,
, and
.
% Create the first point x1 = 9 * rand + 1; y1 = 9 * rand + 1; % Set the angle theta = 2*pi * rand; % Create the second point x2 = x1 + cos(theta); y2 = y1 + sin(theta);
Matlab’s rand (“random”) function selects a number between 0 and 1. It considers a uniform distribution of , so all real numbers between 0 and 1 are equally likely to be selected. (By “real numbers,” we really mean floating-point numbers.)
Lines 2-3 and 6 are examples of an affine transformation. In the present context, this means the following: If we multiply a uniform distribution by a constant (e.g. 9, 2) and add a constant (e.g. 1, 0) to the product, then the resulting distribution is uniform also. This is a very powerful idea, since we can now generate a number between any two numbers
and
in an unbiased manner. Simply multiply the interval
by
, which stretches it to
, then add
, which moves it to
.
We will consider affine transformations again when we study linear transformations (linear algebra). For now, it is sufficient to know that the code above, altogether, drops a needle on a paper that is 11 units long and 11 units wide. The horizontal lines occur at . Think about why before heading further.
Next, let’s determine if the needle intersects a line. The lines are horizontal and spaced 1 unit apart, so let’s see what happens if we use the ceil (“ceiling”) and floor functions to move the endpoints to the nearest horizontal line.
From the diagrams above, we see that a needle intersects a line, if and only if,
, assuming
.
If , we can switch the names of the endpoints. In other words,
in the equation above really represents the smaller between the two coordinates, and
the larger one. We arrive at our criterion by inserting the min (“minimum”) and max (“maximum”) functions:
% Check if the needle intersects a line criterion = (ceil(min(y1, y2)) == floor(max(y1, y2)));
The variable criterion equals 1 if the needle intersects a line, and 0 otherwise.
3. Monte Carlo simulation
When we tried to find the probability of winning craps and Penney’s game, we used a for loop to create an input and find the output one at a time. We can similarly create the needles one at a time and examine if each one intersects a line:
tic; % Set the number of simulations N = 10^6; % Count how many needles intersect a line numWins = 0; % Run the simulations for i = 1 : N % Create the first point x1 = 9 * rand + 1; y1 = 9 * rand + 1; % Set the angle theta = 2*pi * rand; % Create the second point x2 = x1 + cos(theta); y2 = y1 + sin(theta); % Check if the needle intersects a line criterion = (ceil(min(y1, y2)) == floor(max(y1, y2))); % Update the count numWins = numWins + criterion; end % Return the probability that a needle intersects a line p = numWins / N; toc;
The problem is speed. Notice that I surrounded the code with tic and toc functions. They give us an estimate of how efficient our code is. On my laptop, the code takes about 0.25 seconds to run. The duration seems small, until we consider this:
tic; % Set the number of simulations N = 10^6; % Run the simulations % Create the first point x1 = 9 * rand(1, N) + 1; y1 = 9 * rand(1, N) + 1; % Set the angle theta = 2*pi * rand(1, N); % Create the second point x2 = x1 + cos(theta); y2 = y1 + sin(theta); % Check if a needle intersects a line criterion = (ceil(min(y1, y2)) == floor(max(y1, y2))); % Count how many needles intersect a line numWins = sum(criterion); % Return the probability that a needle intersects a line p = numWins / N; toc;
The second code finishes in about 0.081 seconds, almost a third of the time incurred by the first. The key is vectorization, which is a form of parallelization. We do not use a for loop, which considers only one needle at a time. Instead, we consider all by creating a vector of coordinates, a vector of
coordinates, and a vector of
coordinates. Then, we perform calculations on these vectors by calling functions that are designed to work great with vectors.
Calling such functions is easy in Matlab. Most of the time, we use the same functions that we use for scalars. Compare lines 19-20, 23 in the first code to lines 15-16, 19 in the second. There is no difference on the outside! However, we know that the variables x1, y1, and theta are vectors in the second code. This means, functions such as cos, sin, min, max, ceil, and floor (even the addition + and the comparison ==) can handle both scalar and vector inputs and know what to do in each case. We call the ability of a function to work with different inputs, polymorphism. As we see above, polymorphism makes coding easy.
With a million needles (300 of which are shown below), I found the probability that a needle intersects a line to be,
We get a special number by considering the following quantity:
4. Mathematical proof
Let us find the probability that a needle of length intersects one of the parallel lines, which are separated by a distance of
. We want to know this probability for all values of
and
, not just when
. Martin Aigner, in his book “Proofs from the Book,” provides a simple explanation using calculus. (Well, the idea is simple but calculations get a bit hairy.) I will expand upon his proof here.
First, consider the slope of the needle. Let be the angle that the needle makes with the horizontal axis. We can ignore needles with a negative slope due to symmetry. Just look at the paper from the other side to see a positive slope.
A needle that lies at an angle has height
, while the horizontal lines are
units apart. So when the needle is short (
), the probability that it intersects a line is,
This number is for a particular angle . We get the probability for a particular
and
by averaging over all possible angles:
Now, when the needle is long (), we get the same probability
as long as
, that is, for the angles
. At a larger angle, the needle always crosses a line, so the probability is 1. Hence,
Again, we find the probability for a particular and
by averaging over all possible angles:
In conclusion, the probability that a needle intersects a line is,
We can understand better how the distance between the lines and the length of the needle affect the probability by considering a normalized ratio,
Clearly, a short needle corresponds to , and a long needle to
. In terms of this ratio
, we can write,
We see that, for short needles, the probability that a needle intersects a line increases linearly as its (relative) length increases. When the needle is as long as the distance between the lines, there is a 63.7% chance that the needle intersects one of them. For long needles, the probability increases sublinearly. Even when the needle is 3 times as long as the distance between the lines, there is still about 10% chance that the needle does not intersect any of the lines. How’s that for a slice of fried gold?
Notes
For long needles, the criterion that I came up with using the ceiling and floor functions will not work. Imagine what would happen if a needle intersects more than one line.
However, with a slight modification, we can detect whether a needle intersects a line. This works for both short and long needles.
% Set the distance d and the length l % Create the second point x2 = x1 + l * cos(theta); y2 = y1 + l * sin(theta); % Check if the needle intersects a line criterion = (ceil(min(y1, y2) / d) <= floor(max(y1, y2) / d));
Keep in mind that vectorization does come at a cost. If a vector is too long to fit inside memory, a vectorized code may take more time to run than a serial code (with a for loop). Knowing how far we can take vectors is an art by itself.
You can find the code in its entirety and in other languages here:
Download from GitHub