Mike's Notes
A great introduction to using Microsoft Excel for Monte Carlo simulation by Jon Wittwer. The 4-part series of articles is combined into a single article here. Vetex42 is an excellent free resource for learning to use Excel.
I recently discovered that Pipi has used Markov chain Monte Carlo (MCMC) since version
6, but the method was unnamed, as is much of Pipi's internal functioning
(A disadvantage of working visually 😇 and finding ideas from anywhere to solve a problem)
Resources
References
Repository
-
Home > Ajabbi Research > Library >
-
Home > Handbook >
Last Updated
15/02/2026
A Practical Guide to Monte Carlo Simulation
By: Jon Wittwer
Vertex42: 1/06/2004
Jon Wittwer started Vertex42 in 2003 while working on a PhD in mechanical
engineering. After finishing his degree, he worked for Sandia National
Laboratories where he kept Vertex42.com running on the side. In 2008, he
left the labs to work on Vertex42 full time. In addition to his expertise
with Excel, Dr. Wittwer is respected in multiple fields for his
development of financial tools such as the Debt Reduction Calculator,
project management tools like the Gantt Chart Template, statistical tools
such as the Monte Carlo Simulator, and a large collection of business
productivity and time management tools.
A Monte Carlo method is a technique that involves using random
numbers and probability to solve problems. The term Monte Carlo Method was
coined by S. Ulam and Nicholas Metropolis in reference to games of chance, a
popular attraction in Monte Carlo, Monaco (Hoffman, 1998; Metropolis and
Ulam, 1949).
Computer simulation has to do with using computer models to imitate
real life or make predictions. When you create a model with a spreadsheet
like Excel, you have a certain number of input parameters and a few
equations that use those inputs to give you a set of outputs (or response
variables).
This type of model is usually deterministic, meaning that you get the same
results no matter how many times you re-calculate.
Example 1: A Deterministic Model for Compound Interest
Deterministic Model
Figure 1: A parametric deterministic model maps a set of input variables
to a set of output variables.
Monte Carlo simulation is a method for iteratively evaluating
a deterministic model using sets of random numbers as inputs. This method is
often used when the model is complex, nonlinear, or involves more than just
a couple uncertain parameters. A simulation can typically involve over
10,000 evaluations of the model, a task which in the past was only practical
using super computers.
By using random inputs, you are essentially turning the deterministic
model into a stochastic model. Example 2 demonstrates this concept with a
very simple problem.
Example 2: A Stochastic Model for a Hinge Assembly
In Example 2, we used simple uniform random numbers as the inputs to the
model. However, a uniform distribution is not the only way to represent
uncertainty. Before describing the steps of the general MC simulation in
detail, a little word about uncertainty propagation:
The Monte Carlo method is just one of many methods for analyzing
uncertainty propagation, where the goal is to determine how random
variation, lack of knowledge, or error affects the sensitivity, performance,
or reliability of the system that is being modeled.
Monte Carlo simulation is categorized as a sampling method because
the inputs are randomly generated from probability distributions to simulate
the process of sampling from an actual population. So, we try to choose a
distribution for the inputs that most closely matches data we already have,
or best represents our current state of knowledge. The data generated from
the simulation can be represented as probability distributions (or
histograms) or converted to error bars, reliability predictions, tolerance
zones, and confidence intervals. (See Figure 2).
Uncertainty Propagation
Monte Carlo Analysis
Figure 2: Schematic showing the principal of stochastic uncertainty
propagation. (The basic principle behind Monte Carlo simulation.)
If you have made it this far, congratulations! Now for the fun part!
The steps in Monte Carlo simulation corresponding to the uncertainty
propagation shown in Figure 2 are fairly simple, and can be easily
implemented in Excel for simple models. All we need to do is follow the
five simple steps listed below:
-
Step 1: Create a parametric model,
y = f(x1, x2, ..., xq).
-
Step 2: Generate a set of random inputs, xi1, xi2, ..., xiq.
-
Step 3: Evaluate the model and store the results as yi.
-
Step 4: Repeat steps 2 and 3 for i = 1 to n.
-
Step 5:
Analyze the results using histograms, summary statistics, confidence
intervals, etc.
On to an example problem ...
REFERENCES:
-
Hoffman, P., 1998, The Man Who Loved Only Numbers: The Story of Paul
Erdos and the Search for Mathematical Truth. New York: Hyperion, pp.
238-239.
-
Metropolis, N. and Ulam, S., 1949, "The Monte Carlo Method." J. Amer.
Stat. Assoc. 44, 335-341.
-
Eric W. Weisstein. "Monte Carlo Method." From MathWorld--A Wolfram Web
Resource.
-
Paul Coddington. "Monte Carlo Simulation for Statistical Physics."
Northeast Parallel Architectures Center at Syracuse University.
http://www.npac.syr.edu/users/paulc/lectures/montecarlo/p_montecarlo.html
-
Decisioneering.com. "What is Monte Carlo Simulation?"" Part of: Risk
Analysis Overview -
http://www.decisioneering.com/risk-analysis-start.html
Our example of Monte Carlo simulation in Excel will be a simplified sales forecast model. Each step of the analysis will be described in detail.
The Scenario: Company XYZ wants to know how profitable it will be to
market their new gadget, realizing there are many uncertainties associated
with market size, expenses, and revenue.
The Method: Use a Monte Carlo Simulation to estimate profit and
evaluate risk.
You can download the example spreadsheet by following the
instructions below. You will probably want to refer to the spreadsheet
occasionally as we proceed with this example.
Download the Sales Forecast Example
Step 1: Creating the Model
We are going to use a top-down approach to create the sales forecast model,
starting with:
Profit = Income - Expenses
Both income and expenses are uncertain parameters, but we aren't going to
stop here, because one of the purposes of developing a model is to try to
break the problem down into more fundamental quantities. Ideally, we want
all the inputs to be independent. Does income depend on expenses? If
so, our model needs to take this into account somehow.
We'll say that Income comes solely from the number of sales (S) multiplied
by the profit per sale (P) resulting from an individual purchase of a
gadget, so Income = S*P. The profit per sale takes into account the sale
price, the initial cost to manufacturer or purchase the product wholesale,
and other transaction fees (credit cards, shipping, etc.). For our purposes,
we'll say the P may fluctuate between $47 and $53.
We could just leave the number of sales as one of the primary variables,
but for this example, Company XYZ generates sales through purchasing leads.
The number of sales per month is the number of leads per month (L)
multiplied by the conversion rate (R) (the percentage of leads that result
in sales). So our final equation for Income is:
Income = L*R*P
We'll consider the Expenses to be a combination of fixed overhead (H) plus
the total cost of the leads. For this model, the cost of a single lead (C)
varies between $0.20 and $0.80. Based upon some market research, Company XYZ
expects the number of leads per month (L) to vary between 1200 and 1800. Our
final model for Company XYZ's sales forecast is:
Profit = L*R*P - (H + L*C)
Y = Profits
X1 = L
X2 = C
X3 = R
X4 = P
Notice that H is also part of the equation, but we are going to treat it as
a constant in this example. The inputs to the Monte Carlo simulation are
just the uncertain parameters (Xi).
This is not a comprehensive treatment of modeling methods, but I used this
example to demonstrate an important concept in uncertainty propagation,
namely correlation. After breaking Income and Expenses down into more
fundamental and measurable quantities, we found that the number of leads (L)
affected both income and expenses. Therefore, income and expenses are not
independent. We could probably break the problem down even further, but we
won't in this example. We'll assume that L, R, P, H, and C are all
independent.
Note: In my opinion, it is easier to decompose a model into
independent variables (when possible) than to try to mess with
correlation between random inputs.
Step 2: Generating Random Inputs
The key to Monte Carlo simulation is generating the set of random inputs.
As with any modeling and prediction method, the "garbage in equals garbage
out" principle applies. For now, I am going to avoid the questions "How do I
know what distribution to use for my inputs?" and "How do I make sure I am
using a good random number generator?" and get right to the details of how
to implement the method in Excel.
For this example, we're going to use a Uniform Distribution to represent
the four uncertain parameters. The inputs are summarized in the table shown
below. (If you haven't already, Download the example spreadsheet).
Sales Forecast Input Table
Figure 1: Screen capture from the example sales forecast spreadsheet.
The table above uses "Min" and "Max" to indicate the uncertainty in L, C,
R, and P. To generate a random number between "Min" and "Max", we use
the following formula in Excel (Replacing "min" and "max" with cell
references):
= min + RAND()*(max-min)
You can also use the Random Number Generation tool in Excel's
Analysis ToolPak Add-In to kick out a bunch of static random numbers for a
few distributions. However, in this example we are going to make use of
Excel's RAND() formula so that every
time the worksheet recalculates, a new random number is generated.
Let's say we want to run n=5000
evaluations of our model. This is a
fairly low number when it comes to Monte Carlo simulation, and you will see
why once we begin to analyze the results.
A very convenient way to organize the data in Excel is to make a column for
each variable as shown in the screen capture below.
Random Inputs in Column Format
Figure 2: Screen capture from the example sales forecast spreadsheet.
Cell A2 contains the formula:
=Model!$F$14+RAND()*(Model!$G$14-Model!$F$14)
Note that the reference
Model!$F$14 refers to the corresponding
Min value for the variable L on the Model worksheet, as shown in Figure 1.
(Hopefully you have downloaded the example spreadsheet and are following
along.)
To generate 5000 random numbers for L, you simply copy the formula down
5000 rows. You repeat the process for the other variables (except for H,
which is constant).
Step 3: Evaluating the Model
Our model is very simple, so to evaluate the output of our model (the
Profit) for each run of the simulation, we just put the equation in another
column next to the inputs, as shown in Figure 2.
Cell G2 contains the formula:
=A2*C2*D2-(E2+A2*B2)
Step 4: Running the Simulation
To iteratively evaluate our model, we don't need to write a fancy macro for
this example. We simply copy the formula for profit down 5000 rows, making
sure that we use relative references in the formula (no $ signs). Each row
represents a single evaluation of the model, with columns A-E as inputs and
the Profit as the output.
Re-run the Simulation: F9
Although we still need to analyze the data, we have essentially completed a
Monte Carlo simulation. We have used the volatile RAND() function. So, to
re-run the entire simulation all we have to do is recalculate the worksheet
(F9 is the shortcut).
This may seem like a strange way to implement Monte Carlo simulation, but
think about what is going on behind the scenes every time the Worksheet
recalculates: (1) 5000 sets of random inputs are generated (2) The model is
evaluated for all 5000 sets. Excel is handling all of the iteration.
If your model is not simple enough to include in a single formula, you can
create your own custom Excel function (see my article on user-defined
functions), or you can create a macro to iteratively evaluate your model and
dump the data into a worksheet in a similar format to this example (Update
9/8/2014: See my new Monte Carlo Simulation template).
In practice, it is usually more convenient to buy an add-on for Excel than
to do a Monte Carlo analysis from scratch every time. But not everyone has
the money to spend, and hopefully the skills you will learn from this
example will aid in future data analysis and modeling.
A Few Other Distributions
My new Monte Carlo Simulation template includes a worksheet that calculates
inputs sampled from a variety of distributions. Some of the formulas are
listed below.
Normal (Gaussian) distribution
To generate a random number from a Normal distribution you would use the
following formula in Excel:
=NORMINV(rand(),mean,standard_dev)
Ex: =NORMINV(RAND(),$D$4,$D$5)
Excel 2010+: =NORM.INV(RAND(),$D$4,$D$5)
Lognormal distribution
To generate a random number from a Lognormal distribution with median =
exp(meanlog), and shape = sdlog, you would use the following formula in
Excel:
=LOGINV(RAND(),meanlog,sdlog)
Ex: =LOGINV(RAND(),$D$6,$D$5)
Excel 2010+: =LOGNORM.INV(RAND(),$D$4,$D$5)
Weibull distribution
There isn't an inverse Weibull function in Excel, but the formula is quite
simple, so to generate a random number from a (2-parameter) Weibull
distribution with scale = c, and shape = m, you would use the following
formula in Excel:
=c*(-LN(1-RAND()))^(1/m)
Ex: $C$5*(-LN(1-RAND()))^(1/$C$6)
Beta distribution
This distribution can be used for variables with finite bounds (A,B). It
uses two shape parameters, alpha and beta. When alpha=beta=1, you get a
Uniform distribution. When alpha=beta=2, you get a dome-shaped distribution
which is often used in place of the Triangular distribution. When
alpha=beta=5 (or higher), you get a bell-shaped distribution. When
alpha<>beta (not equal), you get a variety of skewed shapes.
Excel 2010+: =BETA.INV(RAND(),alpha,beta,A,B)
MORE Distribution Functions: Dr. Roger Myerson provides a free
downloadable Excel add-in, Simtools.xla, that includes many other
distribution functions for generating random numbers in Excel.
Creating a histogram is an essential part of doing a statistical
analysis because it provides a visual representation of data.
In Part 3 of this Monte Carlo Simulation example, we iteratively ran a
stochastic sales forecast model to end up with 5000 possible values
(observations) for our single response variable, profit. If you have not
already, download the Sales Forecast Example Spreadsheet.
The last step is to analyze the results to figure out how much the profit
might be expected to vary based on our uncertainty in the values used as
inputs for our model. We will start off by creating a histogram in Excel.
The image below shows the end result. Keep reading below to learn how to
make the histogram.
Histogram With Excel
Figure 1: A Histogram in Excel for the response variable Profit, created
using a Bar Chart.
(From a Monte Carlo simulation using n = 5000 points and 40 bins).
We can glean a lot of information from this histogram:
-
It looks like profit will be positive, most of the time.
-
The uncertainty is quite large, varying between -1000 to 3400.
-
The distribution does not look like a perfect Normal distribution.
-
There doesn't appear to be outliers, truncation, multiple modes,
etc.
The histogram tells a good story, but in many cases, we want to
estimate the probability of being below or above some value, or between a
set of specification limits. To skip ahead to the next step in our analysis,
move on to Summary Statistics, or continue reading below to learn how to
create the histogram in Excel.
Creating a Histogram in Excel
Update 7/2/15: A Histogram chart is one of the new built-in chart types in
Excel 2016, finally! (Read about it).
Method 1: Using the Histogram Tool in the Analysis Tool-Pak.
This is probably the easiest method, but you have to re-run the tool each
to you do a new simulation. AND, you still need to create an array of bins
(which will be discussed below).
Method 2: Using the
FREQUENCY function in Excel.
This is the method used in the spreadsheet for the sales forecast example.
One of the reasons I like this method is that you can make the histogram
dynamic, meaning that every time you re-run the MC simulation, the chart
will automatically update. This is how you do it:
Step 1: Create an array of bins
The figure below shows how to easily create a dynamic array of bins. This
is a basic technique for creating an array of N evenly spaced numbers.
To create the dynamic array, enter the following formulas:
B6 = $B$2
B7 = B6+($B$3-$B$2)/5
Then, copy cell B7 down to B11
Array of Bins in Excel
Figure 2: A dynamic array of 5 bins.
After you create the array of bins, you can go ahead and use the Histogram
tool, or you can proceed with the next step.
Step 2: Use Excel's
FREQUENCY formula
The next figure is a screen shot from the example Monte Carlo simulation.
I'm not going to explain the
FREQUENCY function in detail since you
can look it up in the Excel's help file. But, one thing to remember is that
it is an array function, and after you enter the formula, you will need to
press Ctrl+Shift+Enter. Note that the simulation results (Profit) are in
column G and there are 5000 data points
( Points: J5=COUNT(G:G) ).
The Formula for the Count column:
FREQUENCY(data_array,bins_array)
a) Select cells J8:J48
b) Enter the array formula: =FREQUENCY(G:G,I8:I48)
c) Press Ctrl+Shift+Enter
Layout for Creating a Scaled Histogram
Figure 3: Layout in Excel for Creating a Dynamic Scaled Histogram.
Creating a Scaled Histogram
If you want to compare your histogram with a probability distribution, you
will need to scale the histogram so that the area under the curve is equal
to 1 (one of the properties of probability distributions). Histograms
normally include the count of the data points that fall into each bin on the
y-axis, but after scaling, the y-axis will be the frequency (a
not-so-easy-to-interpret number that in all practicality you can just not
worry about). The frequency doesn't represent probability!
To scale the histogram, use the following method:
Scaled = (Count/Points) / (BinSize)
a) K8 = (J8/$J$5)/($I$9-$I$8)
b) Copy cell K8 down to K48
c) Press F9 to force a recalculation (may take a while)
Step 3: Create the Histogram Chart
Bar Chart, Line Chart, or Area Chart:
To create the histogram, just create a bar chart using the Bins column for
the Labels and the Count or Scaled column as the Values. Tip: To reduce the
spacing between the bars, right-click on the bars and select "Format Data
Series...". Then go to the Options tab and reduce the Gap. Figure 1 above
was created this way.
A More Flexible Histogram Chart
One of the problems with using bar charts and area charts is that the
numbers on the x-axis are just labels. This can make it very difficult to
overlay data that uses a different number of points or to show the proper
scale when bins are not all the same size. However, you CAN use a scatter
plot to create a histogram. After creating a line using the Bins column for
the X Values and Count or Scaled column for the Y Values, add Y Error Bars
to the line that extend down to the x-axis (by setting the Percentage to
100%). You can right-click on these error bars to change the line widths,
color, etc.
Histogram Via Error Bars
Figure 4: Example Histogram Created Using a Scatter Plot and Error
Bars.
REFERENCES:
CITE THIS PAGE AS:
Wittwer, J.W., "Creating a Histogram In Excel" From Vertex42.com, June 1,
2004