Monday, 26 August 2013

Correctly Performing Calculations SQL

Correctly Performing Calculations SQL

I have a table called #rtn1 with four columns: dt, indx_nm, cusip_id, and
lrtn and another table called #rtn3 with four columns: dt, ticker,
cusip_id, and lrtn. I have bunch of different indx_nm and ticker each with
a bunch of different lrtn for different dates. I have a query written that
looks like:
insert into [CORP\Eng].etf_to_indx_cor (ticker, cusip_id, indx_nm,
cor , dt_pnts)
select b.ticker, b.cusip_id, a.indx_nm,
(AVG(a.lrtn*b.lrtn) - AVG(a.lrtn)*AVG(b.lrtn))/
STDEVP(a.lrtn)/STDEVP(b.lrtn),
COUNT(*)
from #rtn1 a, #rtn3 b
where a.cusip_id < b.cusip_id and a.dt = b.dt
group by b.cusip_id, b.ticker, a.indx_nm
having STDEVP(a.lrtn) > 0 and STDEVP(b.lrtn) > 0
order by 1,2
This is supposed to take each lrtn for the corresponding indx_nm from
#rtn1 and perform (AVG(a.lrtn*b.lrtn) - AVG(a.lrtn)*AVG(b.lrtn))/
STDEVP(a.lrtn)/STDEVP(b.lrtn) against each lrtn for the corresponding
cusip_id. Since the dates only random from todays date and exactly one
year from now, there should be a maximum of around 250 dt_pnts (only
business days). However, some of my results have 1000+ dt_pnts, meaning
that the calculation was run way too many times. How can I fix this?

#rtn1 looks like
dt ticker cusip_id lrtn
2013-08-14 ABQI 33736Q104 -0.0117821120522711
2013-08-13 ABQI 33736Q104 -0.00173822840021087
2013-08-14 ACNACTR 18385P101 -0.00183689770083951
2013-08-13 ACNACTR 18385P101 -0.00132366141379037
2013-08-12 ACNACTR 18385P101 0.018715153518994
and #rtn3 looks like
dt ticker cusip_id lrtn
2013-08-14 CEW 97717W133 0
2013-08-13 CEW 97717W133 -0.010598181
2013-08-14 EDV 921910709 0.015324696
2013-08-13 EDV 921910709 -0.013134064
both with a lot more indx_nm/ticker values and dates and lrtn

No comments:

Post a Comment