How to create a Microsoft Excel Pivot Table lookalike Crosstab with Texts in the Value Area using Array Formulas
Pivot Tables are probably the most powerful built-in data analysis feature of Microsoft Excel. With only a couple of mouse clicks, you can summarize and analyze your data from different perspectives. Well, I do not want to carry coals to Newcastle. I am sure you know all about the power of Pivot Tables.
However, there is one thing you can’t do with Pivot Tables: create a cross-tabular view with texts in the value area.
A real-life example should help to understand what I am talking about: Let’s assume you are managing a big project and you have a list of all existing project risks. The table includes the descriptions of the risks and you classified each of them using different categories like the risk impact (high, medium, low), the probability (very high, high, medium, low) and maybe the type of the risk, the subproject or the name of the responsible project member. If it is a large project with a considerable number of risks, you may be interested in a cross-tabular view of your risk list with e.g. the impact in the rows, the probabilities in the columns and maybe the option to filter the view by a third category like the risk type.
At first sight this looks like the perfect use case for Pivot Tables, but a Pivot Table aggregates data, i.e. Pivot Tables always have numerical results in the value area. In other words, all you can show in the value area is the count of risks for each combination. But what if the count is not enough and you prefer seeing the description of all risks for all combinations of impact and probability?
Today’s post is the first of two articles showing how to emulate a Pivot Table with texts in the value area. There are several ways of doing this, but there are 2 general approaches: with or without VBA. Today's article demonstrates a way to achieve this with Excel’s array formulas only. The next article will provide a VBA based solution which overcomes most of the weaknesses of today’s formula based approach.
As usual the post includes an example Excel workbook for free download.
The Challenge
Today’s challenge is creating a Pivot Table lookalike crosstab with texts in the value area using formulas only, i.e. no VBA. I hope I already described the challenge sufficiently in the introduction to this article.
The Preparation of the Workbook
Before we start to develop the array formulas, we have to do a couple of preparation steps:
- Bring your data into your workbook. In our example workbook (download link see below), we have a simple list of 150 data rows with 5 columns: the index, the risk description, the type, the impact and the probability.
- Assign names to the 5 data columns (myIndex, myRisks, myTypes, myImpacts, myProbabilities).
- Insert a new worksheet [Control], define the lists of possible entries for all three categories (types, impacts, probabilities) and assign range names to these lists.
- Add an entry “All” to the list of the risk types (which will be the filter in our example) and define a target cell for the number of the selected type (to be used as the target cell of the filter dropdown later).
- Insert another worksheet [Pivot Table Risks] and define the size of the crosstab. This is the problematic part. Since we are restricted to a formula based approach today, we have to decide in advance, which category will be shown in the rows and which will be shown in the columns.
Even more problematic is the fact that we have to define a sufficient number of rows for each entry of the row category, although we do not yet know how many entries we need here. If we insert too few rows, some risks will not be displayed, if we insert too many, we will probably have a couple of empty rows. - Add a dropdown for the filter (form control combo box) and define the input range and the target cell.
- Finally format the crosstab the way you like.
The Implementation – The Array Formulas
To make the formula more readable and easier to handle, we insert 2 helper columns on the worksheet [Control] from row 35 downwards. The formulas in those helper columns are simple: they create a row index and the impact in the row. For sure, you can create formulas without the helper columns, but from my point of view this is the easiest way of implementing this. Additionally we add further names for the number of the “All” entry of the filter (myAll) and the number of the actual selection (mySelectedTypeNo). You do not necessarily need those either, but they improve the readability of the formulas.
Finally, here is the beating heart of the implementation, the array formula within the value area of our crosstab:
INDEX(myRisks,
SMALL(
IF((myImpacts=Control!$C39)*(myProbabilities=E$10)*
IF(mySelectedTypeNo=myAll,TRUE,(myTypes=mySelectedType)),
myIndex),
Control!$B39)))
In a nutshell, the formula combines INDEX and SMALL. The SMALL function returns the nth position of a risk belonging to the impact category and the probability category and fulfilling the filter criterion. The INDEX function then fetches the description of this risk from the risk table.
Finally the INDEX formula shown above has to be embedded into an error handling procedure. If you are using Excel 2007/2010 you can do this with a simple IFERROR statement. If you are using Excel 2003 or earlier, you have to include a more lengthy IF(IFERROR(...) statement, but this just repeats parts of the formula described above.
That’s it.
The Pros
- It works, at least within the predefined static structure
- No VBA necessary
The Cons
Besides the fact that it takes a lot of preparation work to set it up and the complex array formulas, the static layout of the crosstab is the main disadvantage of the formula based approach. You have to define in advance the size of the crosstab and this leads to a couple of undesired effects:
- As already described above, the number of rows has to be static. You will always either see some empty rows or – even worse – the number of rows is too small and some risks will not be displayed. For me, this is the main disadvantage.
- The row and column headers always include all entries of those categories. E.g. if you do not have any risks in your data with a probability of “high”, the crosstab will still include this column. A real Pivot Table would not show this empty column.
- There is no easy way of changing the layout of the crosstab. If you want to e.g. swap rows and columns, you have to change the whole layout and the formulas.
- If you add new risks with new category entries (e.g. you add a new type of impact) you have to change the layout of the crosstab, to copy formulas and to change the worksheet [Control].
- The height of the rows will not be adapted automatically. In other words you have to change the row height manually each time you selected another filter.
In a nutshell: the formula approach works, but from my point of view it is way too static.
The Download Link
Here is the download link to the Excel workbook:
Download Pivot Tables_with_Texts - Formula Approach (Excel 2003 workbook, 150K)
The list of risks is taken from a template provided by the California Department of Transportation. The impacts and probabilities are arbitrarily assigned. The data included in the workbook does not mean anything. It is just an example to demonstrate the technique.
What’s next?
The formula based approach has a lot of drawbacks (see above) and it is nothing else than a very simple workaround. The next article will provide a more flexible VBA based solution which overcomes most of those disadvantages.
Stay tuned.