Create dynamic Word Clouds / Tag Clouds in Microsoft Excel
Unlike Tableau Software, Microsoft Excel provides no native feature to create a word cloud (aka tag cloud), i.e. a visual representation of text data where the font size of a word depicts the frequency of this word in a text.
Workarounds for creating word clouds in Microsoft Excel are available for free, some of them for almost 10 years already.
I also made my contribution with the article Wordle Tag Clouds in Microsoft Excel in 2010. All of the workarounds come with some considerable limitations and mine is probably the worst: it requires an Internet connection, needs another click in the webbrowser control and – worst of all – it doesn’t work anymore with Excel 2013, because Microsoft disabled scriptable ActiveX controls on worksheets in Office 2013.
I am fully aware of the fact that word clouds are a very poor tool for data analysis and business data visualizations. However, they definitely have their place in infographics. Moreover, realizing a word cloud in Excel is an interesting VBA challenge.
Hence, I decided to create another workaround for word clouds in Microsoft Excel.
Today’s article introduces my new implementation of word clouds in Excel including the workbook(s) for free download. I will go into the details of my approach in a follow-up post.
The Basic Approaches and the Existing Excel Word Cloud Solutions
Basically a word cloud algorithm isn’t much more than
- inserting and typesetting the words
- formatting the font sizes according to the frequency of the words
- arranging the words in the word cloud
I can think of several approaches how to create a word cloud in Excel:
- Use the cell grid of a worksheet: either set all words in one big cell or use a cell range with one word per cell and change the font size according to the frequency of the words by VBA. This is the most common of the existing solutions, demonstrated by Chris Gemignani of Juice Analytics, my good friend and Excel MVP Chandoo, my compatriot Alois Eckl of Excel-Inside Solutions (website is in German) and Melih Met of Beat Excel, to name just a few.
- Use a Pivot Table and linked images like MrExcel and Excel MVP Bill Jelen demonstrates in this video. This is the only way I can think of how to do this without VBA.
- Use an XY scatter chart to arrange the words, make the markers invisible (no fill and no border) assign the words as data labels and change the font size of the data labels by VBA. I haven’t seen such an implementation yet (and haven’t tried myself), but I do think this could be a possible approach
- Use an external website like Wordle and an ActiveX Webbrowser control like I did in Wordle Tag Clouds in Microsoft Excel
- Use freeform textboxes and a VBA algorithm to position the shapes and resize the font type
All of these approaches and solutions are workarounds and as such they have their disadvantages and limitations. That’s why I wanted to give this another shot. I decided to go with the freeform textboxes. Although it requires extensive use of VBA and may result in some performance issues, it definitely seemed to be the most flexible approach.
The Goals and the Requirements
The goal was to implement a VBA routine which creates a dynamic word cloud inside a Microsoft Excel workbook and meets the following requirements:
- The algorithm works stand-alone, i.e. without any add-in, external software or a website
- The word cloud can be used on any worksheet and does not require a fixed row height or column width of the cell grid
- The size of the word cloud on a worksheet or dashboard can be defined by the user
- Besides the size, the implementation also allows for a minimum of user defined settings like font type, font color, maximum and minimum font size, maximum number of words in the cloud, rotating words and mouse over tooltips for each word in the cloud
- The word cloud can be used on an interactive dashboard, i.e. the user can switch to another data source (list of words) with one single click and the cloud will be recreated automatically
- The word cloud can be used in production. Therefore the algorithm has to be as fast as possible, i.e. creating the word cloud should not take more than 2 or 3 seconds
- Setting up the word cloud in a new or an existing Excel workbook should be as easy as possible
The Result
I am planning to go into the details of the approach and implementation in a follow-up post. Thus, today we will only have a look at the result and the features.
Here is a screenshot of a workbook visualizing the well-known example I used several times before: the lyrics of Bruce Springsteen’s studio albums shown in a word cloud:
The dashboard is divided into three sections:
Section 1 (top left): Inputs
You can select an album using the drop down combo box. After changing the combo box, the word cloud will automatically be updated.
Furthermore you have the option to define some basic parameters in this input section:
- the maximum number of words to be displayed in the cloud
- the font type and color (simply format the font of this cell as you like)
- the maximum and minimum font size used in the cloud
- the screen updating: checking this box “animates” the creation of the cloud, but considerably increases the runtime
- the tooltips: checking this box will create a mouse over tooltip for each word in the cloud displaying the word and the count
- create a group: checking this box will group all words (textboxes) and the cloud frame to one object
- rotation: every nth word will be rotated by 270°. Set this parameter to 0 if all words shall be displayed horizontally
Please note: changing the parameters will not automatically recreate the word cloud, i.e. the changes will only take effect after you ran the routine again, e.g. by selecting another album using the combo box.
Section 2 (bottom left): Results
This is for informational purposes only:
- the number of words on the selected album (without the excluded words)
- the number of words actually displayed in the cloud. If you defined a relatively big maximum number of words (see above) or you defined large font sizes, it can happen that the defined word cloud frame does not have enough real estate to display all words. This result informs you, how many words are shown in the cloud
- the last runtime of the algorithm in seconds
Section 3 (right): The Word Cloud
Well, not much to explain here, I guess. Only two hints:
- If you checked the tooltips check box in the input section (see above), hover over the word cloud with your mouse and you will see a tooltip showing the word and the word count, e.g. “night (19 times)”. When hovering over, the mouse cursor will change to a finger pointer, indicating a click would do something. Please be advised that this is not the case. The tooltips are a workaround exploiting the ScreenTip property of a hyperlink: a hyperlink is assigned to each textbox, but it doesn’t point to an address, so clicking on it won’t do anything.
- You can easily resize the word cloud by simply selecting it, clicking on a handle and dragging it to the size you want. You will definitely distort the display of the word cloud by resizing. Please be aware that the cloud will only be recreated after you ran the algorithm again by e.g. selecting another album or manually starting the VBA sub “CreateWordCloud”.
The Disadvantages of the Approach
My approach comes with a couple of disadvantages and limitations:
- The algorithm is static. The arrangement of the words will always be the same: the word with the largest count will be positioned in the middle of the frame and all other words will be arranged concentrically around this center.
- The font sizes will not be automatically adjusted in order to optimize the use of real estate in the cloud. The algorithm will mindlessly use the maximum and minimum font size you defined in the input section, although this may either leave a lot of real estate unused or even lead to missing words in the cloud (see also next bullet point).
- Depending on the number of words, the size of the cloud frame and the defined font sizes, it can happen that the cloud does not display all words. If a word doesn’t fit into the real estate, it will simply be skipped.
- There is no option to use more than one font type or more than one color to e.g. visualize an additional variable by color coding.
- The performance: it always depends on the number of words and the size of the cloud frame. In the example workbook shown above (and posted for download below), the algorithm needs 2 to 3 seconds to update the word cloud. Not really fast, but in my book fast enough even for a word cloud on an interactive dashboard.
Agreed, a lot of disadvantages. But still: it works and creates a word cloud inside a stand alone Excel workbook.
If you do not like the drawbacks and limitations, please remember what Frank Underwood says in House of Cards:
“It is so refreshing to work with people who throw a saddle on a gift horse rather than look into its mouth.”
So, give it a try and if you have suggestions how to overcome the limitations, please share them with us in a comment or – even better - post your improved version for download somewhere.
The Download Links
Word clouds of the lyrics of Bruce Springsteen’s studio albums:
Download Word Clouds Bruce Springsteen (Microsoft Excel 2007-2013, 518.4K)
A generic template for creating a word cloud in Microsoft Excel:
Download Word Cloud Generic Template (Microsoft Excel 2007-2013, 78.7K)
An example how to create the word cloud with a User Defined Function:
Download Word Cloud UDF (Microsoft Excel 2007-2013, 72.4K)
Acknowledgements
Many thanks go to my good friend and Microsoft Excel MVP Jordan Goldmeier of the Option Explicit VBA Blog for reviewing my workbook prior to publishing. As always, Jordan’s thorough review and thoughtful feedback were invaluable. Thanks, Jordan.
Furthermore I would like to thank my good friend and Excel MVP Daniel Ferry (Excel Hero Academy) for the EHA Timer and Excel MVP Chip Pearson (Pearson Software Consulting) for providing the quick sort algorithm I used in this project. Thanks, gentlemen.
What’s Next?
As already mentioned above, the next article will go into the details of the approach, the algorithm and the implementation in VBA.
Stay tuned.