We can deal with data issues and make our visualization more interactive by adding filters. In this section we will resolve the "3 unknown" message at the bottom of our map and add an interactive Category filter.
At its most basic level, a filter allows us to specify what values to include or exclude in our visualization. This is how we can deal with many data issues, like null (non-existent) values. It is also possible to add an interactive filter. This kind of filter (generally) does not add or remove anything from the visualization to start with, but viewers of the finished project can use the filter to specify or exclude values temporarily.
For example: We could add a filter to include only the United States in our visualization, meaning only mods created in the US would be shown. Alternatively, we could add a generic country filter. A user could then choose the United States to only see mods created in the US, but they could also choose any other country and would be able to go back to viewing all of the countries at any time.
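The difference between a fixed filter and an interactive one can be sketched in a few lines of Python. This is only an illustration of the idea, not how Tableau works internally; the sample rows and the `apply_filter` helper are hypothetical and are not taken from the tutorial's dataset.

```python
# Hypothetical sample rows; the names and countries are illustrative only.
mods = [
    {"name": "Mod A", "country": "United States"},
    {"name": "Mod B", "country": "Germany"},
    {"name": "Mod C", "country": "United States"},
    {"name": "Mod D", "country": "Canada"},
]


def apply_filter(rows, field, include=None):
    """Keep rows whose field value is in include; None means no restriction."""
    if include is None:
        return list(rows)
    return [row for row in rows if row[field] in include]


# Fixed filter: built into the visualization, so viewers cannot change it.
us_only = apply_filter(mods, "country", include={"United States"})

# Interactive filter: starts with everything, and the viewer's choice is temporary.
viewer_choice = None  # initial state: all countries are shown
all_rows = apply_filter(mods, "country", include=viewer_choice)

viewer_choice = {"Germany"}  # the viewer picks a single country
german_rows = apply_filter(mods, "country", include=viewer_choice)
```

Resetting `viewer_choice` to `None` corresponds to the viewer going back to all of the countries; the underlying rows are never modified.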
Since the moment we added Country to the worksheet, this "3 unknown" message at the bottom of the map has been bothering me.
Note that we did not actually change the data: the Country fields for these records will still state England or Wales, but now Tableau will know how to map these values. This is not a solution for cleaning messy data. For that, a good option is OpenRefine, a free, open-source tool that makes it easy to investigate and clean data. OU Digital Scholarship maintains an OpenRefine tutorial if you are interested.
There is no good value to map our unspecified countries to. We do want to remove them, however, because they are being counted as a country in their own right, which can interfere with our visualization. The easiest way to see this is to hover over the United States: it is ranked #2 for number of mods although it should be ranked #1. Because this filter solves a data issue, it does not need to be interactive.
Note: An alternative to this would be to find Country on the Marks shelf, right-click it, and choose Filter...
Note: An alternative would be to select every value except for Not Specified and to leave Exclude unchecked. Both options would have the same result.
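To see why the two options give the same result, and why removing Not Specified changes the ranking, here is a minimal Python sketch. The counts are made up for illustration and do not come from the real dataset; the `ranking` helper is also hypothetical.

```python
from collections import Counter

# Hypothetical country values, one per mod; the real dataset differs.
countries = (
    ["Not Specified"] * 5
    + ["United States"] * 4
    + ["Germany"] * 2
    + ["Canada"]
)


def ranking(values):
    """Return the distinct values ordered from most to fewest mods."""
    return [name for name, _ in Counter(values).most_common()]


# Unfiltered, "Not Specified" is ranked as if it were a real country.
unfiltered_rank = ranking(countries).index("United States") + 1

# Option 1: select Not Specified and check Exclude.
excluded = [c for c in countries if c not in {"Not Specified"}]

# Option 2: select every other value and leave Exclude unchecked.
keep = set(countries) - {"Not Specified"}
included = [c for c in countries if c in keep]

# Rank of the United States once the unspecified records are gone.
filtered_rank = ranking(excluded).index("United States") + 1
```

With these made-up counts, the United States moves from second place to first once the unspecified records are filtered out, and `excluded` and `included` contain exactly the same values.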
Now we can add an interactive filter that will allow users to view the distribution of mods for the various mod categories. This kind of filter will not make any changes to our data. Instead, users will be able to manipulate the finished visualization themselves.
Make sure you can see the orange triangle we noticed earlier when adding Country. If you do not, that may be because Tableau thinks you want to replace the Country filter with Category. The solution is to make sure not to drag Category on top of Country.
By default, this filter will be a long list of checkboxes. Note that the first option is (All) and that the second is Null.
The second option on our new filter is Null, which is a placeholder for a non-existent value. As a category, Null is meaningless, so it would be best to filter it out. More than that, we do not want to offer it as an option in the interactive filter provided to users. This is a data issue that should be taken care of behind the scenes. Unfortunately, since we already have a Category filter, removing null values without changing that filter is a little complicated. Simply unchecking the box on the filter would still leave it visible to end users, so we have to add a second Category filter. Dragging Category to Filters again will not accomplish this: it lets us edit our current filter, but it does not add a second filter for the same dimension. Instead, we will have to use a workaround.
If you are curious, there are two mods with no value for category. This was not an error in data collection: I looked up the mod pages for those two mods, and somehow they do not have categories.
You may notice that Null is still displayed on the worksheet's Category filter. The issue is that the Category filter is currently displaying every possible value from the dataset, including values that no longer match any records in the view and are therefore irrelevant.
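The behaviour described above can be sketched in Python as the difference between building the list of filter options from the whole dataset and building it from only the rows that survive the other filters. This is an illustration under assumed data, not Tableau's actual implementation; the category names and the `options` helper are hypothetical.

```python
# Hypothetical rows; None stands in for a Null category.
mods = [
    {"name": "Mod A", "category": "Weapons"},
    {"name": "Mod B", "category": None},
    {"name": "Mod C", "category": "Armor"},
    {"name": "Mod D", "category": "Weapons"},
]

# Behind-the-scenes filter that removes records with a Null category.
visible = [row for row in mods if row["category"] is not None]


def options(rows):
    """List the distinct category values a filter control would offer."""
    seen = []
    for row in rows:
        if row["category"] not in seen:
            seen.append(row["category"])
    return seen


# Built from the whole dataset: the Null entry is still offered.
all_values = options(mods)

# Built from the remaining rows only: the Null entry disappears.
relevant_values = options(visible)
```

Here `all_values` still contains `None` even though no visible row uses it, which mirrors the leftover Null entry on the filter.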
As you can see, the dropdown takes up a lot less space than the list of checkboxes.