//Readme.txt for Parts 1 and 2 of Clearly and Simply post about exporting data in Tableau views to a database using R. //DISCLAIMER: The author makes no warranty that the information below is complete or correct. //The material below is unsupported: USE AT YOUR OWN RISK //NOTE: Please read the complete Clearly and Simply Tableau | RODBC post (parts 1 and 2) BEFORE following the configuration / Tableau Workbook usage steps discussed in this file. //The Post narrative assumes you have good working knowledge of Tableau. //If you are a Tableau novice, the author suggests FIRST educating yourself about connecting to data sources from within Tableau, editing and refreshing these data sources, //Table Calculations (a special type of Calculation within Tableau), writing "normal" calculations (i.e. formulas), //important built-in Tableau functions (SUM, AVG, COUNT, COUNTD, etc.) used when authoring calculations, data extracts, and building Tableau Views. //Get to know the Help system. //Then have a go at working your way through the material from parts one and two of the post. This section of this file describes the steps to follow to configure the provided resources (the supplied data, Tableau Workbook, and Tableau Calculations) Before working through the examples in the post (please read both parts of the entire post FIRST): 1. Set up the back end databaase (in SQL Server, Access, or other Database Server.) A Microsoft SQL Server Backup of the database is supplied (Tableau_Superstore_Orders.bak) if you have Microsoft SQL Server, you can restore this backup under SSMS. An Access Database (Tableau_Superstore_Orders.mdb) is supplied if you have MS Access but not MS SQL Server. If you have 32 bit MS Access, and are running 64 bit Windows, be sure to read the Appendix to part 1 re: setting up a 32 bit ODBC data source under 64 bit Windows. You can easily upsize the Access database to SQL Server (if you have SQL Server) or export the data from the Access database into another DB Server if you don't have Microsoft SQL Server. Confirm that you can browse the tables in the supplied database on your system using your database server. NOTE: Whatever your Database Server will be, it needs to support the ODBC protocol. This means that an ODBC Driver for your Database Server must be installed on your system. If your database server DOES NOT support ODBC, the examples described in the post will not work. You can still hopefully gain something from reading the post. 2. Download and install R and RStudio as described in the post. Then create an ODBC datasouce as explained in the post. 3. Create the desired file paths for the text files to be written to your system by the relevant Tableau Calculations as described in the post. 4. Place the provided sample text files (OrdersQuery.csv, FilteredOrdersQuery.csv) in the desired destination folder created in step 3 above. 5. Using the supplied Tableau Packaged Workbook (requires Tableau 9.3 or higher). If your Tableau version is before 9.3, you will need to download version 9.3 or higher from the Tableau Customer Portal - or download a trial version of Tableau 10 from the Tableau website, which will open the supplied example .twbx workbook. Note: BEFORE opening the Packaged Workbook for the first time, open RStudio and load the RODBC library and start Rserve as described in the post. When you open the packaged workbook, Tableau will show you a dialog box explaining that the workbook has scripts that would be used by a third part source (your R platform installation). You will need to give Tableau permission to access the R Platform by clicking on "Yes" (or your local language equivilent) within the dialog. You can either: Open the packaged workbook and try to re-create the examples in the .twbx file within new sheets within in a NEW Tableau Workbook (.twb file). You can keep the supplied .twbx file open for guidance as you work through creating the example scenarios in your new workbook. OR Open the the packaged workbook and create a new sheet that uses a direct data connection to data on your system for each of the examples - i.e. re-create each of the examples within the packaged workbook by inserting new sheets with DIRECT data connections (not extracts) to the supplied data resources. After you open the included packaged workbook for the first time, you will need to IMMEDIATELY edit the data source connection information for each sheet and ensure that the data source connection references databases servers and / or files (i.e. the supplied text files, the restored SQL Server backup file, or the MS Access database on your file system if you are using Access as your database). If you haven't edited a Tableau data source before, see the advice aimed at people getting started with Tableau near the start of this file. Many sheets in the supplied .twbx require a manual update to display as explained in the post. After you have edited the data source connection metadata in Tableau, you can manually update the data view in sheets that are set to require a manual update. The sheets requiring a manual update are named: Supterstore Basic Query Setup Filtered Orders Setup (after filtering has been applied as described in the post) INSERT Into SQL Table (after filtering has been applied as described in the post) Orders Across (after filtering has been applied as described in the post) Orders Down (after filtering has been applied as described in the post) DLL Queries Update Query Other sheets will require a refresh of the data source after writing text files, loading data, or editing data so that you see the lastest information. This is also desribed in the post. . Whether you create a new workbook or create new sheets in the supplied.twbx workbookWork through each example in a new sheet. As stated previously, you will need to edit the data source connection for each sheet as most (but not all) sheets connect to a different data source. Use the Data Source metadata in the supplied packaged workbook as a guide. The .twbx was provided so that you can see the metadata for the various data sources, and the fully realized examples in Tableau once views that neeed to be manually updated have been updated. You will want to work with direct connections to all datasources on your system when trying to emulate the supplied examples in the .twbx file so that so that the results of refreshing data connections are immediately visible (i.e. visible without having to regeneerate an extract). NOTE: two of the Tables in the supplied databases (SQL Server and Access) contain data that you may want to DELETE before you start re-creating the examples in the packaged workbook. These two tables are "Data_From_Tableau" and "Data_From_Tableau_Nr_2". These two tables are loaded by INSERT commands from within a Tableau Calculation. By deleting the rows in this table first, you will only see the data you yourself INSERTED when connecting to these Tables from Tableau (to see the results of your INSERTS) as described in the post. 6. Below are the Tableau Calculations required to duplicate the examples in the supplied packaged workbook in the order presented in the Clearly and Simply post. You can copy and paste them into the sheets you create in order to emulate the examples in the supplied .twbx file. Some of these calculations include references to other Tableau Calculations, Tableau Parameters, or Field Names in the underlying database. Create the calculations and parameters in the order presented in the post. The name of each calculation is surrounded by quotation marks: for example - "This is the Calculation name" Calculation "Orders SQL String" 'SELECT * FROM Orders' Calculation "RunOrdersQuery" - a wrapper to run "Orders SQL String" SCRIPT_STR(" library(RODBC) crs$odbc <- odbcConnect('Tableau_Superstore_Orders', believeNRows=TRUE) crs$odbc.tables <- sqlTables(crs$odbc)$TABLE_NAME crs$dataset <- sqlQuery(crs$odbc,.arg1) write.table(crs$dataset, file = 'C:\\Data\\Clearly_And_Simply\\RODBC_Article\\OrdersQuery.csv', row.names = FALSE, quote = FALSE, sep = '|') close(crs$odbc) 'Query Results from refreshed Data Connection'" ,[Orders SQL String] ) The three following calculations reference Tableau Parameters - create them FIRST as described in the post, then create the Calculations that reference them. //Calculation "Category WHERE" 'Category =' + "'" + [Category Condition] + "'" //Calculation "Regional WHERE" 'Region =' + "'" + [Regional Condition] + "'" //Calculation "Customer Segment WHERE" 'Segment =' + "'" + [Customer Segment Condition] + "'" //Calculation "Full SQL Statement" - based on the three previous calculations that reference Tableau Parameters 'SELECT * FROM Orders WHERE ' + [Category WHERE] + ' AND ' + [Customer Segment WHERE] + ' AND ' + [Regional WHERE] //Calculation "RunFilteredOrdersQuery" - a wrapper to run calculation "Full SQL Statement" SCRIPT_STR(" library(RODBC) crs$odbc <- odbcConnect('Tableau_Superstore_Orders', believeNRows=TRUE) crs$odbc.tables <- sqlTables(crs$odbc)$TABLE_NAME crs$dataset <- sqlQuery(crs$odbc,.arg1) write.table(crs$dataset, file = 'C:\\Data\\Clearly_And_Simply\\RODBC_Article\\FilteredOrdersQuery.csv', row.names = FALSE, quote = FALSE, sep = '|') close(crs$odbc) 'Query Ran - Connect to Data If Not Aready Connected'" ,[Full SQL Statement] ) //Calculation "SQL INSERT Statement" 'INSERT INTO Data_From_Tableau ([Category],[Segment],[Region],[Sales]) VALUES (' + "'" + ATTR([Category]) + "'" + ',' + "'" + ATTR([Segment]) + "'" + ',' + "'" + ATTR([Region]) + "'" + ',' + [CONVERT Header] + STR(SUM([Sales])) + ')' + ')' Alternative suggested by Leonid Koyfman: 'INSERT INTO Data_From_Tableau ([Category],[Segment],[Region],[Sales]) VALUES (' + "'" + ATTR([Category]) + "'" + ',' + "'" + ATTR([Segment]) + "'" + ',' + "'" + ATTR([Region]) + "'" + ',' + "1.0*" + STR(SUM([Sales])) + ')' //Calculation "Insert INTO Data From Tableau" - wrapper to run calculation "SQL INSERT Statement" SCRIPT_STR(" library(RODBC) crs$odbc <- odbcConnect('Tableau_Superstore_Orders', believeNRows=TRUE) crs$odbc.tables <- sqlTables(crs$odbc)$TABLE_NAME crs$dataset <- sqlQuery(crs$odbc,.arg1) close(crs$odbc) 'Query Ran - Connect to Data If Not Aready Connected'" ,[SQL INSERT Statement] ) //calculation "SQL INSERT Statement Nr 2" // Configure this table calculation as described in the body of the post. 'INSERT INTO Data_From_Tableau_Nr_2 ([Year],[Region],[Category],[Segment],[Sales],[Profit],[Discount], [OrderCount],[ProductCount],[AvgItemsPerOrder], [AvgYearlySalesAcrossCategories],[Comment]) VALUES (' + "'" + STR(AVG(YEAR([Order Date]))) + "'" + ',' + "'" + ATTR([Region]) + "'" + ',' + "'" + ATTR([Category]) + "'" + ',' + "'" + ATTR([Segment]) + "'" + ',' + [CONVERT Header] + STR(SUM([Sales])) + ')' + ',' + [CONVERT Header] + STR(SUM([Profit])) + ')' + ',' + [CONVERT Header] + STR(AVG([Discount])) + ')' + ',' + [CONVERT Header] + STR([OrderCount]) + ')' + ',' + [CONVERT Header] + STR([ProductCount]) + ')' + ',' + [CONVERT Header] + STR([AvgItemsPerOrder]) + ')' + ',' + [CONVERT Header] + STR([AvgYearlySales]) + ')' + ',' + "'" + [Comment] + "'" + ')' //alternate version suggested by Leonid Koyfman // Configure this table calculation as described in the post. 'INSERT INTO Data_From_Tableau_Nr_2 ([Year],[Region],[Category],[Segment],[Sales],[Profit],[Discount], [OrderCount],[ProductCount],[AvgItemsPerOrder], [AvgYearlySalesAcrossCategories],[Comment]) VALUES (' + "'" + STR(AVG(YEAR([Order Date]))) + "'" + ',' + "'" + ATTR([Region]) + "'" + ',' + "'" + ATTR([Category]) + "'" + ',' + "'" + ATTR([Segment]) + "'" + ',' + "1.0*" + STR(SUM([Sales])) + ',' + "1.0*" + STR(SUM([Profit])) + ',' + "1.0*" + STR(AVG([Discount])) + ',' + "1.0*" + STR([OrderCount]) + ',' + "1.0*" + STR([ProductCount]) + ',' + "1.0*" + STR([AvgItemsPerOrder]) + ',' + "1.0*" + STR([AvgYearlySales]) + ',' + "'" + [Comment] + "'" + ')' // Calculation "Insert INTO Data From Tableau Nr 2 PANE ACROSS | DOWN" - wrapper to run calculation "SQL INSERT Statemnt Nr 2" // Configure this table calculation as described in the post. SCRIPT_STR(" library(RODBC) crs$odbc <- odbcConnect('Tableau_Superstore_Orders', believeNRows=TRUE) crs$odbc.tables <- sqlTables(crs$odbc)$TABLE_NAME crs$dataset <- sqlQuery(crs$odbc,.arg1) close(crs$odbc) 'INSERT Query Ran - Connect to Data If Not Aready Connected'" ,[SQL INSERT Statement Nr 2] ) //Calculation "Insert INTO Data From Tableau Nr 2 PANE DOWN | ACROSS" // Configure this table calculation as described in the post. SCRIPT_STR(" library(RODBC) crs$odbc <- odbcConnect('Tableau_Superstore_Orders', believeNRows=TRUE) crs$odbc.tables <- sqlTables(crs$odbc)$TABLE_NAME crs$dataset <- sqlQuery(crs$odbc,.arg1) close(crs$odbc) 'INSERT Query Ran - Connect to Data If Not Aready Connected'" ,[SQL INSERT Statement Nr 2] ) //Calculation "DDL_Queries" "DROP TABLE Superstore_Sales_WEST;DROP TABLE Superstore_Sales_SOUTH;DROP TABLE Superstore_Sales_CENTRAL;DROP TABLE Superstore_Sales_EAST;" + "SELECT * INTO Superstore_Sales_WEST FROM Orders WHERE Region=" + "'" + "West" + "'" + ';' + "SELECT * INTO Superstore_Sales_SOUTH FROM Orders WHERE Region=" + "'" + "South" + "'" + ';' + "SELECT * INTO Superstore_Sales_CENTRAL FROM Orders WHERE Region=" + "'" + "Central" + "'" + ';' + "SELECT * INTO Superstore_Sales_EAST FROM Orders WHERE Region=" + "'" + "East" + "'" Alternative version suggested by Leonid Koyfman: " DROP TABLE Superstore_Sales_WEST; DROP TABLE Superstore_Sales_SOUTH; DROP TABLE Superstore_Sales_CENTRAL; DROP TABLE Superstore_Sales_EAST; SELECT * INTO Superstore_Sales_WEST FROM Orders WHERE Region='West'; SELECT * INTO Superstore_Sales_SOUTH FROM Orders WHERE Region='South'; SELECT * INTO Superstore_Sales_CENTRAL FROM Orders WHERE Region='Central'; SELECT * INTO Superstore_Sales_EAST FROM Orders WHERE Region='East' " //Calculation "Run_DDL_Queries" - wrapper function to run the previous calculation SCRIPT_STR(" library(RODBC) crs$odbc <- odbcConnect('Tableau_Superstore_Orders', believeNRows=TRUE) crs$odbc.tables <- sqlTables(crs$odbc)$TABLE_NAME crs$dataset <- sqlQuery(crs$odbc,.arg1) close(crs$odbc) 'DDL Queries Ran – Check Database Structure'" ,[DDL_Queries] ) //Calculation "UPDATE Query SQL" "UPDATE [East SS] SET UpdatedComment=" + "'" + "This field was updated from Tableau" + "'" // Calculation "Run UPDATE Query" - wrapper to run previous calculation. SCRIPT_STR(" library(RODBC) crs$odbc <- odbcConnect('Tableau_Superstore_Orders', believeNRows=TRUE) crs$odbc.tables <- sqlTables(crs$odbc)$TABLE_NAME crs$dataset <- sqlQuery(crs$odbc,.arg1) close(crs$odbc) 'Update Query Ran - Check the Database Table'" ,[UPDATE Query SQL] ) //END