Onion - the missing ingredient for Sage Line 50 / Sage Instant accounts packs in Excel

Onion - the missing ingredient for Sage Line 50 / Sage Instant accounts packs in Excel
Full audit trails to underlying transactions from P&Ls, Balance Sheets, graphs, PivotTables, and departmental TBs in a stand-alone Excel file. Aged Debtors and Aged Creditor files too. Free 30 day trials. Download today at www.onionrs.co.uk

Friday 26 August 2011

Running Totals using SQL and the Text File ODBC driver

With a data file containing detailed dated transactions how can you use SQL to record a running total?

Here's the RTData.txt data file:

Date
Amount
A
01/01/2011
12.30
A
15/01/2011
16.00
B
16/01/2011
5.69
A
01/02/2011
65.12
B
01/02/2011
0.58
A
14/02/2011
8.91
A
28/02/2011
1.00
A
15/03/2011
9.00
B
01/04/2011
31.45
B
01/05/2011
738.00
B
21/05/2011
9.11
A
21/05/2011
10.93


Schema.ini looks as follows:

[RTData.txt]
Format=TabDelimited
ColNameHeader=True
MaxScanRows=0
Col1=Item Text width 255
Col2=Date Date
Col3=Amount Currency


The SQL written to show the running totals is as follows:

SELECT Cur.Item, Cur.Date, Cur.Amount, Sum(Cum.Amount) AS 'Running Total'
FROM RTData.txt Cum, RTData.txt Cur
WHERE Cum.Item = Cur.Item AND Cum.Date <= Cur.Date
GROUP BY Cur.Item, Cur.Date, Cur.Amount
ORDER BY Cur.Item, Cur.Date


The output is as follows:

ItemDateAmountRunning Total
A01/01/201112.3012.30
A15/01/201116.0028.30
A01/02/201165.1293.42
A14/02/20118.91102.33
A28/02/20111.00103.33
A15/03/20119.00112.33
A21/05/201110.93123.26
B16/01/20115.695.69
B01/02/20110.586.27
B01/04/201131.4537.72
B01/05/2011738.00775.72
B21/05/20119.11784.83


It adds another layer of complexity if we want to generate monthly totals and also keep a running total.  This SQL will do it:

SELECT Cur.item, Cur.month, sum(Cur.Amount)/count(Cur.Amount) AS Amount, sum(Cum.Amount) AS 'Running Total'
FROM (SELECT RTD.Item, (year(RTD.Date)*100)+month(RTD.Date) AS Month, Sum(RTD.Amount) AS Amount
FROM RTData.txt RTD
Group By RTD.Item, (year(RTD.Date)*100)+month(RTD.Date)) as Cur,
(SELECT RTD.Item, (year(RTD.Date)*100)+month(RTD.Date) AS Month, Sum(RTD.Amount) AS Amount
FROM RTData.txt RTD
Group By RTD.Item, (year(RTD.Date)*100)+month(RTD.Date)) as Cum
WHERE Cum.Item = Cur.Item AND Cum.Month <= Cur.Month
Group By Cur.Item, Cur.Month
ORDER BY Cur.Item, Cur.Month


The output is as follows:

ItemMonthAmountRunning Total
A20110128.3028.30
A20110275.03103.33
A2011039.00112.33
A20110510.93123.26
B2011015.695.69
B2011020.586.27
B20110431.4537.72
B201105747.11784.83


The trick here is to realise that the Amount column inflates artificially as we progress and we need to correct for this by dividing by count(Cur.Amount) which returns 1, 2, 3 ... with successive records.

Comments welcome.



Tuesday 2 August 2011

Dealing with mixed data types in Excel ODBC queries (2)

Shortly after posting the first message on this topic I had cause to use it in a real life situation.  All was fine until a cell in my text field had more than 255 (?) characters.  It broke the solution!

I haven't had time to fully investigate exact limits in all of this but I think the Excel ODBC driver will only accept up to about 255 characters in a "Text" field.  If the driver comes across more than 255 characters it interprets it as a "Memo" type content (up to 32k characters?) and the attempt to query fails at the record that contains the offending data.  How to work around?

Using the IMEX=2 approach from the previous article, firstly, I appended 256 spaces into the first rows of the field that needs to be "Memo".  This was to force the majority type interpretation of this field to be "Memo".  Once this had been done I discovered that the query no longer failed but it only returned the first 253 characters of the "Memo" field.  Why I'm getting 253 I'm not sure but it seems this is a limitation of the Excel ODBC driver when dealing with "Memo" fields.  However, I discovered that if I constructed additional calculated fields in my SQL I could bring back the entire contents of the "Memo" field in 253 character chunks.  The SQL select clause entries look as follows:

''+F1 as [Memo Field],
mid(''+F1,254,253) as [Memo Field 2],
mid(''+F1,507,253) as [Memo Field 3], ... and so on.

I'd be really interested if anyone is able to put some better flesh on these bones.