Modeling and Simulation
How to Create Monte Carlo Models and Forecasts Using Excel Data
Tables
Data Tables provide a quick and easy way to implement the Monte
Carlo method. Here's how to do it.
by Charley Kyd, MBA Microsoft Excel MVP, 20052014
The Father of Spreadsheet Dashboard Reports

(This is the second article in a series. The first article is,
Introduction to Probabilistic Simulations in Excel.)
During more than three decades of working with spreadsheets
in business, most of the spreadsheet models and forecasts I've seen have used what
statisticians call a deterministic method.
To illustrate, if I were forecasting profits for a period, a
deterministic model would use one number for my forecast of sales,
another number for my forecast of operating expenses, and so on.
Instead,
a more useful method would take a probabilistic approach,
supported by the Monte Carlo method.
In the past, without thinking much about it, I
wrote macros to support the Monte
Carlo method.
But I was wrong back then. There's a much easier way.
The Monte Carlo Method
The Monte Carlo method supports a "stochastic" or
"probabilistic" system.
According to
Wikipedia, "a purely stochastic system is one whose state is
randomly determined, having a random probability distribution or
pattern that may be analyzed statistically but may not be
predicted precisely."
This figure Illustrates a normal probability distribution,
which probably is the best approach for most business use.
To use this method, you first set up your key assumptions to be defined by this curve.
The models in your workbook then randomly select values from
this normal distribution, use those
values in their calculations, record key results, repeat the process many times, and then summarize
your record of results. This process is known as the Monte Carlo
method.
In this article, I'm going to show you how to do this using
Excel Data Tables to record the results produced by each
calculation.
Before I get into the details, however, I need to mention
two things.
First, whenever you open a Monte Carlo analysis that uses
data tables, make sure
that the Monte Carlo workbook is the only workbook open. This is
because it will need to recalculate many times, and if you have
other workbooks open they also will recalculate, needlessly. And
that could make your simulation VERY SLOW.
Second, you might grumble about my examples.
If you work in Finance, you'll probably grumble that the
following examples vastly oversimplify what is typically a complex
financialmodeling process. In this case,
I'll respond that I'm making the model very simple so you
can understand the changes I'm suggesting to your standard
modeling process.
And if you don't work in Finance, you'll probably grumble
that I should have used an example from your own specialty, not
Finance. In this case,
I'll respond that I'm using a simple income statement as the
example because even if you work in engineering, or operations,
or marketing, or wherever, you understand a simple income
statement. Therefore, you'll be able to understand what the
model is doing, and you'll be able to adapt my techniques to your own
models and forecasts.
The Standard Forecasting Method
This
figure illustrates a deterministic forecast...the standard
method in my experience. It essentially asks, "If we have Sales of 100, Cost of
Goods Sold of 35%, Operating Expenses of 45, and Taxes of 25%,
what will our Profits be?"
(Click here to get a copy of this workbook with all the Monte Carlo tables and reports
described on this page. This will allow you to concentrate on how to
adapt the Monte Carlo method to your own company.)
Unlike most such forecasts, this one states its assumptions
explicitly...in column F.
In real life, of course, each assumption might be supported
by separate analyses and forecasts.
Again, the problem with this approach is that we know the
forecast will be incorrect, because most forecasts are
incorrect, and we have no way to express how far wrong the
profit forecast might reasonably be.
So let's fix this forecast...
Begin the Stats Table
This figure illustrates the Stats Table, which I've set up in
a new workbook.
This table translates our four key assumptions into five results that we can use for each iteration of our
forecast.
Let's look at the Sales assumptions...
To calculate a random number from a normal curve of potential sales, we need to know the mean and
standard deviation of our sales curve. If you can calculate
those values directly, you could enter them into cells E5 and F5
directly.
However,
the yellow cells illustrate a less rigorous way to find these
numbers, a way that works pretty well.
As the numbers at the bottom of this figure illustrate, one
standard deviation from the mean represents about 68% of the
potential results from a normal curve. And two standard
deviations represent about 95%.
Therefore, if we estimate the highestfeasible amount of
sales, we could say that the number represents the second
standard deviation above the mean, and enter it in cell C5 in
the Stats Table, repeated below. And we could say that our estimate of the
lowestfeasible amount of sales represents the second standard
deviation below the mean, and enter that number in cell D5 in
the table.
And just to doublecheck our assumptions, we're saying that
there's a 95% chance that our actual sales will be somewhere
between these two numbers.
So now, the average of the max and min values is the mean, as
calculated by this formula:
E5: =AVERAGE(C5:D5)
And the standard deviation is merely onefourth the range
between the max and min values, as
calculated by this formula:
F5: =(C5D5)/4
And now, we need Excel to return a random number from the
normal distribution that's defined by the mean value in cell E5 and the
standard deviation in cell F5. To do so, we use this formula:
H5: =NORM.INV(RAND(),E5,F5)
Now copy the range E5:H5 downward, as shown in the table.
Then enter the labels shown in column I.
To assign these labels as names for the adjacent cells in
column H, first select the range H5:I8. Choose Formulas, Defined
Names, Create from Selection, or press Ctrl+Shift+F3. In the Create Names dialog, make
sure that only Right Column is checked, then choose OK.
For convenience, these four names begin with "c.", which
stands for Current Results. Because Excel sorts names
alphabetically in most lists, this will group those four names
together in those lists, so you can find them easily. Also, when we use these names in
formulas, we'll have no doubt that they came from the Current
Results section of the Stats Table.
Before we can set up the last two columns of the Stats Table, we
need to complete the model.
Set Up the Stochastic Model
This
figure illustrates the model we'll use. Each of the four formulas in the
Assumptions data column references one of the four values you just named in
the Stats Table.
To create this figure, add a new worksheet to your Monte Carlo workbook, and
name the worksheet Model.
Now enter these formulas in column D:
D5: =c.Sales
D6: =c.PctCOGS
D9: =c.OpExp
D12: =c.TaxRate
The formulas in the Amounts section rely on the assumptions in
column D:
G5: =D5
G6: =D6*G5
G7: =G5G6
G9: =D9
G10: =G7G9
G12: =D12*G10
G13: =G10G12
Finally, to finish the figure, use the Create Names dialog to
assign the labels in columns E to the cells on their left, and
then do the same for column H.
Notice that each time you recalculate your workbook, the
model generates different results. We now need to capture those
results automatically for many recalculations.
Set Up the Data Table
We're
now going to set up a Data Table.
This table will automatically recalculate Excel,
return the values for the
items named in row 2 of this figure, record those values
in row 4, recalculate, record the current items
in row 5, and so on...through the last row of the table.
To
begin the Data Table, add a new sheet to your Monte Carlo workbook and name
it Data. Then enter the labels, which are shown in bold
in the preceding figure.
The Seq (sequence) column is convenient for several reasons. To
create the column...
 Enter the value 1 in cell B4.
 With cell B4 selected, press Ctrl+Shift+DownArrow to
select cell B4 through the last cell in column B.
 Choose Home, Editing, Fill, Series to launch the Series
dialog.
 Accept the default Step Value of 1, but enter 5000 as
the Stop Value.
 Choose OK.
You now have the beginnings of a
table with 5000 rows.
Now enter the following formulas in the cells shown:
C3: =m.NetProfit
D3: =m.Sales
E3: =m.PctCOGS
F3: =m.OpExp
G3: =m.TaxRate
It's now time to set up the Data Table. After you do so,
Excel will calculate the workbook 5000 times, because the Data
Table will contain 5000 rows. Each row of the completed table
will contain the values returned to row 3 of the figure after
each calculation.
Before you start your Data Table, however, set your
calculation option to manual. To do so, choose Formulas,
Calculation, Calculation Options, Manual. By taking this step,
make sure that your Data Table will calculate only when you
press the F9 key or save your workbook.
Now, to set up the table so that it displays results that
start in cell C4 of the figure below...
 Select cell B3.
 Press Ctrl+Shift+DownArrow to slide down the Seq column
to the 5000th row of your table.
 Press Ctrl+. (period), which won't appear to do
anything...yet.
 Hold down your Shift key and then press your right
arrow
once. When you do so, Excel will extend your selection by
one column and also will display the top of your table, as shown here, not the bottom. (This was why you
pressed Ctrl plus the period key.) Now tap Shift+RightArrow
several times, until all calculations in row 3 are
selected, through cell G3.
 Make sure that all other workbooks are closed, because
otherwise, the next few steps could take a LONG time.
 Choose Data, Data Tools, WhatIf Analysis, Data Table.
 In the Data Table dialog, click in the Column Input Cell
edit box; click on any empty cell outside of the area of your
table; and then choose OK.
 If you forgot to close other workbooks, you can press
Esc to interrupt the Data Table's calculations.
 After a delay of perhaps ten seconds (assuming you don't
press Esc), your Data Table
should be completed.
Here's what's happened...
Strictly speaking, Data Tables are intended to enter a value
into either one or two cells that your model uses for each calculation.
But in this case, your model ignores the values entered in the
empty cell. Instead, it merely recalculates the model each time
a new value is written. The model's values change each time because of the random numbers that the model contains.
And then the Data Table captures the results we've specified
using
the formulas in
row 3.
Name the Data Table Columns
Here's
the top of the completed Data Table. So that we can reference
the columns of this table easily, we need to name them.
To name the columns,
we can't use the Create Names dialog because we need to exclude
row 3 from the defined names. Therefore...
 Select cell C4.
 Press Ctrl+Shift+DownArrow to extend the selection
through the bottom row of the table. When you do so, Excel
will display the bottom of the table.
 Press Ctrl+Back to display your active cell, which still
is cell C4.
 Press Ctrl+Alt+F3, or choose Formulas, Defined Names,
Define Name, to launch the New Name dialog, which should show
the selected range as the default Refers To range.
 To name that range Profits, type the
name into the Name box, and then choose OK.
Now repeat this process for the other columns shown in the
figure above, from Sales through the TaxRate.
Summarize the Results of the Data Table
We'll summarize our results in a section to the right of the
Data Table, in the area shown here.
To begin, enter the labels shown in this figure. Then use the
Create Names dialog to assign the labels in the range I1:I3 to
the cells on their right. Then do the same with the labels in
the range M1:M2.
The Seq (Sequence) columns, beginning in cells I7 and M7, count the
number of bins into which we want to summarize our data. Use the
Series dialog (Home, Editing, Fill, Series) to set up a series
from 1 through 21 for both sets of sequences.
Now enter these formulas found in the first three rows:
J1: =ROUNDUP(MAX(Profits),0)
J2: =ROUNDDOWN(MIN(Profits),0)
J3: =COUNT(I:I)
N1: =ROUNDUP(MAX(Sales),0)
N2: =ROUNDDOWN(MIN(Sales),0)
The two Frequency sections at the bottom of the figure above
generate the data we'll use for the histograms in our report.
The ProfBins column
contains data that defines the beginning and ending values for
the profit bins. Here are the first two formulas in this column:
J7: =MinProfits
J8:
=(MaxProfitsMinProfits)/(NumBins1)+J7
Copy the formula in cell J8 downward to cell J27. Notice that
the value in cell J27 approximately equals the MaxProfits value.
Now select the range J6:K27 and use the Create Names dialog
to assign the labels at the top of this range as the names for the two columns.
Similarly, here are the first two formulas in the SalesBins
column:
N7: =MinSales
N8:
=(MaxSalesMinSales)/(NumBins1)+N7
Copy the formula in cell N8 downward to cell N27. The value
in cell N27 should approximately equal the MaxSales value.
And as before, select the range N6:O27 and assign the labels
at the top of the selection to the two columns below the top row
in the selection.
We now can use the FREQUENCY function to return the number of
items found in each Profit Bin.
First, select the range K7:K27. Then type in this formula:
=FREQUENCY(Profits,ProfBins)
Now hold down Ctrl and Shift and then press Enter. This key
combination arrayenters the formula in the selected range. And
this formula returns the number of values in the Profits range
that fall into each bin in the ProfBins range.
Similarly, select the range O7:O27, then arrayenter this
formula:
=FREQUENCY(Sales,SalesBins)
Now, with our results summarized, we must take one more step
before we can create our Monte
Carlo forecast.
Complete the Stats Table
You now can complete columns K and L of the Stats Table, shown
here.
To complete this table, first enter the following formulas in
the cells shown:
K5: =AVERAGE(Sales)
K6: =AVERAGE(PctCOGS)
K7: =AVERAGE(OpExp)
K8: =AVERAGE(TaxRate)
K9: =AVERAGE(Profits)
(All of these formulas return values from named ranges in the
Data Table in the Data worksheet.)
Now enter the labels shown in column L, next to the values.
After you've done so, use the Create Names dialog to assign the
five labels in
column L to the cells with the five
formulas.
And now, finally, we can set up the Monte Carlo forecast.
An Overview of the Monte Carlo Forecast
This figure shows our forecast generated by the work done so
far. Each time you recalculate your workbook, it might change
slightly...but it shouldn't change significantly.
The forecast is divided into four sections...
 The Projected Net Profit shows the
simple income statement, which uses the average values for
each item shown.
 The Key Percentiles section shows the
likely values for both sales and profits. It shows that
there's a 25% chance that sales and profits will be equal to
the values shown in the 25% row, or less. It shows that
there's a 50% chance that sales and profits will be equal to
the values shown in the 50% row, or less...and so on.
Also, the
Loss Percentile value shows that
there's a 17% chance that a loss will be incurred.
 The Sales Histogram shows the
distribution of sales results among the 5000 sales forecasts that our
model performed.
 The Profits Histogram shows the
distribution of the 5000 profits forecasts.
Now let's create this report...
Create the Projected Net Profit Section
To
begin, add another worksheet to your workbook and name the sheet
Report. And then enter the labels shown in this figure.
Now enter the formulas shown here:
D4: =s.Sales
D5: =s.PctCOGS*D4
D6: =D4D5
D8: =s.OpExp
D9: =D6D8
D10: =s.TaxRate*D9
D11: =D9D10
Whenever you press the F9 key, Excel recalculates 5000 times
and calculates new averages, which this forecast displays. But
notice that this simple forecast changes very little each time
you recalculate.
However, in a more complex model, you'll probably get more
variation. Therefore, you might need to use more than 5000 rows
in your Data Table of the Data worksheet.
Create the Key Percentiles Section
This
section is easy to set up. To begin, just enter the few labels
in the figure, and then enter the percentages shown in the range B15:B18.
Now enter these formulas:
C15: =PERCENTILE.INC(Sales,$B15)
D15: =PERCENTILE.INC(Profits,$B15)
Now copy each of these formulas down their column as
illustrated in
the figure.
In these formulas, the PERCENTILE.INC function...
=PERCENTILE.INC(array,k)
...returns the Sales and Profit percentiles for each of the k
values (the percentages) shown. For example, 75% of the time profits were 14 or
lower.
And from another perspective,
when the PERCENTILE.INC function uses
a k value of 50%, it returns the median value. That is, like the
MEDIAN worksheet function, it returns
the number in the middle of a set of numbers.
Finally for this section, the Loss Percentile
calculation is basically the
mirror image of
the PERCENTILE.INC function.
An exact mirror image
would begin with a particular value in our column of profit
results, and then return the percentile for that value. To
illustrate, its formula for profits of zero or less would be...
=COUNTIFS(Profits,"<=0")/COUNT(Profits)
...where the COUNTIFS function's arguments are:
=COUNTIFS(criteria_range, criteria,...)
However, in this case, we want to know the percentile for
profits LESS than zero. So its formula is:
D21:
=COUNTIFS(Profits,"<0")/COUNT(Profits)
That is, the formula in cell D21 tells us that 15% of the numbers in the
Profits range have values that are less than zero.
Create the Histograms
Here are the two histograms again.
These two charts summarize the number of calculations (out of
5000) that produced the sales results shown in the top chart and
the profit results shown in the bottom chart.
You'll find the data for these charts in the Data worksheet
in the Sales Frequency and Profits Frequency sections.
Logically, you'd create the Sales Histogram by first
selecting the range N6 through the bottom of the SalesItems
column, as shown here...
...and then choosing Insert, Charts, Column Chart, 2D, Clustered
Column. But if you do so, Excel won't treat the SalesBins column
as data for the X axis. Instead, it treats SalesBins as a second
data series and plots both columns of data.
The
easiest way I've found to get around this problem is to give
Excel a clue that you want the SalesBins column to be used as
the X axis, and not as a separate data series. To do so,
temporarily remove the label in cell N6 from your worksheet. The way I do it is to
select cell N6, select the text in my formula bar, press Ctrl+X
to cut it, then press Enter.
Now you can select the range N6 through the last cell in the
SalesItems column, and choose Insert, Charts, Column Chart, 2D,
Clustered Column. When you do so, Excel will generate a
histogram as you expect.
And then, once your chart is created, select cell N6 again and
press Ctrl+v to paste the SalesBins text back into that cell.
Now you can select your chart, press Ctrl+X to cut it, select
the Report worksheet, and then paste it where you want it. You
then can format the chart to make it pretty.
Wrapping Up the Monte Carlo Analysis
As you think about ways to apply Monte Carlo analysis to your
own data, you might wonder what you could do to reduce the
amount of uncertainty in your forecast. Another look at the Stats Table points
the way.
The yellow cells show our honest opinion about the maximum and
minimum values that we expect for each of our key assumptions.
Therefore, if you want to reduce the uncertainty in your forecast, you'll
need to find realistic ways to narrow the distance between each
set of those Max and Min
values.
But to do that, unfortunately, you'll need to spend a lot
more time and effort to understand what has influenced
performance for those items in the past and what's likely to affect performance in the
future...and to what degree.
Good luck with that!
And again,
you
can click here to get a copy of the Monte Carlo Forecast, with
the formulas and other content I've described in this article.
Tags: #excel, #normal distribution, #stochastic,
#probabilistic, #deterministic, #simulation, #model, #random number, #forecast, #norm.inv, #rand,
#Monte Carlo, #PERCENTILE.INC
