You are on the Home/Publications & Training/Business Data Management/Designing an Analysis Template page
Google
Web This Site

Designing an Analysis Template

I have used this technique (or a variant of it) very successfully over the past few years to share with clients an analysis and presentation template that involves PivotTables , PivotCharts, QueryTables, Often, the PT or the QT is simply a way of analyzing data that are subsequently further analyzed and presented in much more visually informative fashion such as an interactive chart.

This method finds use in various different scenarios.  One is a series of web-based educational simulations that I have written over the years.  The client (professor or other instructor) runs the simulation with students.  When done, the data for the exercise are downloaded from the database as an SpreadsheetML document for further analysis in Excel.  This constitutes the Data block in the diagram below.  I distribute the same template and add-in (everything to the right of the red line below) to each client.  S/he establishes a connection to the correct data source and gets the customized results.

In another case, a client had multiple data sets (each representing different geographic and temporal data).  The company wanted a solution that it could use with any data set.  So, I created a template and an add-in that would work with a generic data set.  Then, the client could use the add-in to open a new template, point out the new data source, and get the custom results without any further intervention.

The rest of this document describes the technique.

The Multiple Roles of an Excel Worksheet

At its core, an Excel worksheet can perform four different roles: data entry and acquisition, data storage, data analysis, and, finally, information presentation.  Unfortunately, the ease of use of the generic free-form spreadsheet often leads to a blurring of the roles.  The solution below relies on a strict separation of the analysis (and presentation in some cases) from the storage roles.  Consider the diagram below.  The data (in green) are in one workbook.  The analysis template is a separate workbook.  Though not relevant to this discussion, the code needed to make the template work is, of course, in an add-in.

One of the functions of the code is to link the data connections in the template to the appropriate workbook.   Once that is done, all that remains to be done is to refresh each of the tables and the analysis will be complete!

Designing an Analysis Template

The way to approach the design of a new system is to start with the data set that has to be analyzed.  This should be in a workbook, with one worksheet containing one table – essentially, a format consistent with a relational database layout.

Next, create a new workbook that uses the data workbook as the source for its analysis.  My preferred method for extracting data is MS Query, which lets Excel do all the heavy lifting.  The result is either a QueryTable or a PivotTable.  On several occasions, there is further analysis, which treats the PivotTable or QueryTable as the source.

Once the analysis works satisfactorily, write the code that will search out all the query tables and pivot tables, find their associated data connections, and adapt each to a new workbook of the user’s choosing.

Now the add-in and template are ready for distribution.

Connecting the template to a new data source

The code below is specific to 2007 but can be easily adapted to earlier versions.  It asks the user for a new workbook name (the file that contains the data to be analyzed), goes through each worksheet in the active workbook, and updates the data connection string and the SQL query string to use the new data workbook name.

Option Explicit

 

Function replaceOneToken(ByVal sString As String, ByVal TargetToken As String, _

        ByVal NewString As String, ByRef OldString As String)

    Dim TokenPos As Integer, SemiColonPos As Integer

    TokenPos = InStr(1, sString, TargetToken)

    If TokenPos = 0 Then replaceOneToken = sString: Exit Function

    SemiColonPos = InStr(TokenPos, sString, ";")

    OldString = Mid(sString, TokenPos + Len(TargetToken), _

        SemiColonPos - (TokenPos + Len(TargetToken)))

    replaceOneToken = Left(sString, TokenPos + Len(TargetToken) - 1) & NewString _

        & Mid(sString, SemiColonPos, Len(sString))    End Function

Sub fixOneQT(aQT As Object, Newfilename As String)

        'aQT is either a listobject.querytable or a pivottable.pivotcache

    Dim newDir As String, S As String, Arr() As String, Oldfilename As String, OldDir As String

    On Error GoTo ErrXIT

    S = aQT.Connection

    If InStr(1, S, "DSN=Excel") > 0 Then

        S = replaceOneToken(S, "DBQ=", Newfilename, Oldfilename)

        newDir = Left(Newfilename, InStrRev(Newfilename, Application.PathSeparator) - 1)

        S = replaceOneToken(S, "DefaultDir=", newDir, OldDir)

        aQT.Connection = S

       

        Debug.Print "Old CommandText" & vbNewLine & aQT.CommandText & vbNewLine

        aQT.CommandText = Replace(aQT.CommandText, Oldfilename, Newfilename)

        Debug.Print "New CommandText" & vbNewLine & aQT.CommandText & vbNewLine

        End If

    Exit Sub

ErrXIT:

    MsgBox "fixOneQT Error :" & Err.Description & " (" & Err.Number & ")"

    End Sub

Sub doOneSheet(aWS As Worksheet, Newfilename As String)

    Dim aListObj As ListObject

    For Each aListObj In aWS.ListObjects

        fixOneQT aListObj.QueryTable, Newfilename

        Next aListObj

    Dim aPT As PivotTable

    For Each aPT In aWS.PivotTables

        fixOneQT aPT.PivotCache, Newfilename

        Next aPT

    End Sub

Sub fixOldReferences()

    Dim Newfilename As String, aWS As Worksheet

    Newfilename = Application.GetOpenFilename()

    For Each aWS In ActiveWorkbook.Worksheets

        doOneSheet aWS, Newfilename

        Next aWS

    End Sub

The result of one running the code on one sample workbook is shown below in the form of Debug.Print messages.  The old filename is in yellow, the new in turquoise.

Old connection:

ODBC;DSN=Excel Files;DBQ=C:\Tushar\Office Software\Academic Games\iCMS\02507_palm_springs.xls;DefaultDir=C:\Tushar\Office Software\Academic Games\iCMS;DriverId=790;MaxBufferSize=2048;PageTimeout=5;

 

New connection:

ODBC;DSN=Excel Files;DBQ=C:\Tushar\Office Software\Academic Games\iCMS\02305_san_francisco.xls;DefaultDir=C:\Tushar\Office Software\Academic Games\iCMS;DriverId=790;MaxBufferSize=2048;PageTimeout=5;

 

Old CommandText

SELECT `Player$`.PlayerName, `Player$`.PlayerID, `Player$`.CurrentMarket, `PeriodInfo$`.PeriodID, `PeriodInfo$`.Ei, `PeriodInfo$`.Ri

FROM `PeriodInfo$` `PeriodInfo$`, `Player$` `Player$`

WHERE `PeriodInfo$`.PlayerID = `Player$`.PlayerID

 

New CommandText

SELECT `Player$`.PlayerName, `Player$`.PlayerID, `Player$`.CurrentMarket, `PeriodInfo$`.PeriodID, `PeriodInfo$`.Ei, `PeriodInfo$`.Ri

FROM `PeriodInfo$` `PeriodInfo$`, `Player$` `Player$`

WHERE `PeriodInfo$`.PlayerID = `Player$`.PlayerID

 

Old connection:

ODBC;DSN=Excel Files;DBQ=C:\Tushar\Office Software\Academic Games\iCMS\02507_palm_springs.xls;DefaultDir=C:\Tushar\Office Software\Academic Games\iCMS;DriverId=790;MaxBufferSize=2048;PageTimeout=5;

 

New connection:

ODBC;DSN=Excel Files;DBQ=C:\Tushar\Office Software\Academic Games\iCMS\02305_san_francisco.xls;DefaultDir=C:\Tushar\Office Software\Academic Games\iCMS;DriverId=790;MaxBufferSize=2048;PageTimeout=5;

 

Old CommandText

SELECT `PeriodInfo$`.GameID, `PeriodInfo$`.MarketID, `PeriodInfo$`.PeriodID, `PeriodInfo$`.Ei, `PeriodInfo$`.Xi, `PeriodInfo$`.Yi, `PeriodInfo$`.Custi, `PeriodInfo$_1`.Ei, `PeriodInfo$_1`.Xi, `PeriodInfo$_1`.Yi, `PeriodInfo$_2`.Ei, `PeriodInfo$_2`.Xi, `PeriodInfo$_2`.Yi

FROM `C:\Tushar\Office Software\Academic Games\iCMS\02507_palm_springs.xls`.`PeriodInfo$` `PeriodInfo$`, `C:\Tushar\Office Software\Academic Games\iCMS\02507_palm_springs.xls`.`PeriodInfo$` `PeriodInfo$_1`, `C:\Tushar\Office Software\Academic Games\iCMS\02507_palm_springs.xls`.`PeriodInfo$` `PeriodInfo$_2`

WHERE `PeriodInfo$`.MarketID = `PeriodInfo$_1`.MarketID AND `PeriodInfo$`.PeriodID = `PeriodInfo$_1`.PeriodID AND `PeriodInfo$`.PeriodID = `PeriodInfo$_2`.PeriodID AND `PeriodInfo$`.MarketID = `PeriodInfo$_2`.MarketID AND ((`PeriodInfo$`.MarketID=?) AND (`PeriodInfo$`.PlayerID=?) AND (`PeriodInfo$_1`.PlayerID=?) AND (`PeriodInfo$_2`.PlayerID=?))

 

New CommandText

SELECT `PeriodInfo$`.GameID, `PeriodInfo$`.MarketID, `PeriodInfo$`.PeriodID, `PeriodInfo$`.Ei, `PeriodInfo$`.Xi, `PeriodInfo$`.Yi, `PeriodInfo$`.Custi, `PeriodInfo$_1`.Ei, `PeriodInfo$_1`.Xi, `PeriodInfo$_1`.Yi, `PeriodInfo$_2`.Ei, `PeriodInfo$_2`.Xi, `PeriodInfo$_2`.Yi

FROM `C:\Tushar\Office Software\Academic Games\iCMS\02305_san_francisco.xls`.`PeriodInfo$` `PeriodInfo$`, `C:\Tushar\Office Software\Academic Games\iCMS\02305_san_francisco.xls`.`PeriodInfo$` `PeriodInfo$_1`, `C:\Tushar\Office Software\Academic Games\iCMS\02305_san_francisco.xls`.`PeriodInfo$` `PeriodInfo$_2`

WHERE `PeriodInfo$`.MarketID = `PeriodInfo$_1`.MarketID AND `PeriodInfo$`.PeriodID = `PeriodInfo$_1`.PeriodID AND `PeriodInfo$`.PeriodID = `PeriodInfo$_2`.PeriodID AND `PeriodInfo$`.MarketID = `PeriodInfo$_2`.MarketID AND ((`PeriodInfo$`.MarketID=?) AND (`PeriodInfo$`.PlayerID=?) AND (`PeriodInfo$_1`.PlayerID=?) AND (`PeriodInfo$_2`.PlayerID=?))