> Dear Splusers,
>
>> A while ago there was some discussion on using dyn.load to bring in
>> subroutines written in C++. I have been using Splus4.0 on a pentium pc
>> in conjunction with ADModel Builder, a non-linear optimization
>> programming language created by Dave Fournier that works with C++. I
>> asked Dave how I might take my ADModel programs (since they are
>> effectively written in C++) into Splus4.0 for execution and analysis.
>> He
>> was able to provide me with a straightforward description of how to do
>> this using dyn.load. I thought it might be useful to some out there to
>> see his approach to this by soliciting his comments on this topic.
>> Dave?
>>
>> Pat
>>
>> Patrick J. Sullivan
>> International Pacific Halibut Commission
>> P. O. Box 95009
>> Seattle, WA 98145-2009 USA
>> Tel:(206) 634-1838
>> pat@iphc.washington.edu
>>
>> -----------------------------------------------------------------------
>>
>> This message was distributed by s-news@wubios.wustl.edu. To
>> unsubscribe
>> send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
>> message: unsubscribe s-news
>
>
> Dear Splusers,
Well first thing Pat -- for Splus version4 release 3 on 95/NT it isn't
dyn.load that you use, it is dll.load. The point is that with this
version of Splus it is extremely easy to create Dll's
that can be dynamically linked to Splus. This has been possible with
small C functions for
some time but more difficult with large C++ programs which have their own
libraries.
My main interest in this is in promotiing my own software which is
desinged for the rapid
development of nonlinear statistical models.
I have noticed how on this list people who have questions about graphics
in splus or
questions about the standard statistical packages generally get a very
quick response
to their questions while people who have a less generic nonlinear
statistical modeling
question whose solution involves the minimization of some complicated
nonlinear function
often receive few responses.
It is somewhat difficult to write nonlinear statistical models in Splus
and the execution of them is extremely slow.
The example I worked up with Pat was the estimation
of the parameters describing a mixture of two bivariate normal
distributions. The Splus
code took about 2.5 hours to reach the minimum. The DLL written in AD
Model Builder
took 20 seconds. The two codes are of about equal complexity although
the
AD Model Builder code has more safeguards to ensure that parameter
estimates for the covariance matrices remain positive definite. While
it is surely possible to
make the splus code more effiecient and there may be better approaches
(such as the
EM algorithm) for this problem the point was to write a simple piece of
code to do the
job as quickly as possible. I think that with simple dynamic linking
tools like Ad Model Builder
can greatly extend the usefullness of splus into the area of general
nonlinear models.
More information about this example and a demonstration version of the
software which
can create the DLL (if you have visual C++ version 5.0) can be found
at
http://otter-rsch.com/admodel.htm
A version using the ecgs (freely available) compiler which can also
make DLL's
will be available soon.
The splus code for the problem is
if(T){
x1 <-
rmvnorm(125,mean=c(0,0),cov=matrix(c(1.78,4.74,4.74,14.4),nrow=2,byrow=T))
x2 <-
rmvnorm(375,mean=c(1,0),cov=matrix(c(1.78,2.37,2.37,4.94),nrow=2,byrow=T))
x.data <- rbind(x1,x2)
}
det <- function(x) prod(eigen(x)$values)
fmix <- function(theta,data=x.data){
p1 <- theta[1]
p2 <- 1-theta[1]
m1 <- theta[2:3]
m2 <- theta[4:5]
v1 <- matrix(c(theta[6],theta[7],theta[7],theta[8]),nrow=2,byrow=T)
v2 <- matrix(c(theta[9],theta[10],theta[10],theta[11]),nrow=2,byrow=T)
v1 <- solve(v1)
v2 <- solve(v2)
fmatprod <- function(data,m=m1,v=v1) t(data-m)%*%v%*%(data-m)
a1 <- apply(data,1,fmatprod,m=m1,v=v1)
a2 <- apply(data,1,fmatprod,m=m2,v=v2)
loglike <- sum(log(p1*sqrt(det(v1))*exp(-0.5*a1)
+p2*sqrt(det(v2))*exp(-0.5*a2)))
return(-loglike)
}
param1 <- c(.25,0,0,1,0,1.78,4.74,14.4,1.78,2.37,4.94)
param2 <- c(.50,-1,0,2,0,1,0,1,1,0,1)
plower <- c(.10,-Inf,-Inf,-Inf,-Inf,0.1,-Inf,0.1,0.1,-Inf,0.1)
pupper <- c(Inf,Inf,Inf,Inf,Inf,Inf,Inf,Inf,Inf,Inf,Inf)
#
if(F){
dos.time(junk <- nlminb(start=param2,objective=fmix,
lower=plower,upper=pupper,max.iter=50))
}
The AD Model Builder code is
DATA_SECTION
splus_int nobs
splus_matrix obs(1,nobs,1,2)
PARAMETER_SECTION
splus_init_bounded_vector pcoff(1,2,.02,1.1,3);
splus_init_bounded_vector C1(1,3,-10.0,10.0,1)
splus_init_bounded_vector C2(1,3,-10.0,10.0,1)
splus_init_vector mu1(1,2,2)
splus_init_vector mu2(1,2,2)
splus_matrix S1(1,2,1,2)
splus_matrix S2(1,2,1,2)
splus_vector p(1,2)
objective_function_value f
PROCEDURE_SECTION
dvariable psum=sum(pcoff);
f+=100.*square(log(psum+1.e-20));
p=pcoff/(psum+1.e-20); // so p's satisfy constraints
dvar_matrix tmp1(1,2,1,2);
dvar_matrix tmp2(1,2,1,2);
tmp1.initialize();
tmp2.initialize();
int ii=1;
int i=0;
for (i=1;i<=2;i++) { // fill lower triangle
for (int j=1;j<=i;j++) {
tmp1(i,j)=C1(ii);
tmp2(i,j)=C2(ii);
ii++;
}
}
S1=tmp1*trans(tmp1); // form S1 S2 from Choleski decomp.
S2=tmp2*trans(tmp2);
for (i=1;i<=2;i++) { // to make positive definite
S1(i,i)+=0.1;
S2(i,i)+=0.1;
}
dvariable det1=sqrt(det(S1));
dvariable det2=sqrt(det(S2));
dvar_matrix S1inv=inv(S1);
dvar_matrix S2inv=inv(S2);
for (i=1;i<=nobs;i++) // add up minus log-likelihood
{
// add the 1.e-60 to avoid log(0)
f-= log(1.e-60+p(1)/det1*exp(-.5*(obs(i)-mu1)*(S1inv*(obs(i)-mu1)))
+p(2)/det2*exp(-.5*(obs(i)-mu2)*(S2inv*(obs(i)-mu2))));
}
RUNTIME_SECTION
maximum_function_evaluations 50,100,10000
This can be turned into a DLL using Visual C++ with the following commands
assuming
that the name of the above file is bimix.tpl.
tpl2dll bimix
Tpl2dll is an AD Model Builder command which creates a C++ file named
bimix.cpp
The command
cl -DOPT_LIB -MD -LD -GX -D__MSVC32__ -nologo admod32.lib adt32.lib
ado32.lib bimix.cpp
which (assuming that the C++ compiler has been properly configured to find
the necessary files)
invokes the visual C++ compiler to produce a DLL named bimix.dll
Now assuming that the data for the problem are contained in a file named
bimix.dat somewhere that
Splus can find it then the following Splus command file will load the DLL
and run the program
and print out the results afterwards. (This is where the dll.load()
function comes in.)
nobs<-scan("bimix.dat",n=1)
x<-matrix(scan("bimix.dat",skip=1),nrow=nobs,ncol=2,byrow=TRUE)
pcoff<-rep(.5,2)
C1<-rep(1,3)
C2<-rep(1,3)
C1[2]<-0
C2[2]<-0
p<-rep(0,2)
mu1<-rep(0,2)
mu1[1]<--1
mu2<-rep(0,2)
mu2[1]<-2
S1<-matrix(0,nrow=2,ncol=2)
S2<-matrix(0,nrow=2,ncol=2)
dll.load("bimix.dll",symbol="bimix")
ans<-.C("bimix",nobs=as.integer(nobs),as.double(x),pcoff=as.double(pcoff),
C1=as.double(C1),C2=as.double(C2),mu1=as.double(mu1),mu2=as.double(mu2),
S1=as.double(S1),S2=as.double(S2),p=as.double(p)," -sp -nohess ")
dll.unload("bimix.dll")
S1<-matrix(ans$S1,nrow=2)
S2<-matrix(ans$S2,nrow=2)
mu1<-ans$mu1
mu2<-ans$mu2
p<-ans$p
print ("Estimated proportions")
print(p)
print ("Estimated mean for component 1")
print(mu1)
print ("Estimated mean for component 2")
print(mu2)
print ("Estimated covariance matrix for component 1")
print(S1)
print ("Estimated covariance matrix for component 2")
print(S2)
>
David A. Fournier
Otter Research Ltd.
PO Box 265, Station A
Nanaimo, B.C. V9R 5K9, Canada
voice/fax (250) 756-0956
http://otter-rsch.com
-----------------------------------------------------------------------
This message was distributed by s-news@wubios.wustl.edu. To unsubscribe
send e-mail to s-news-request@wubios.wustl.edu with the BODY of the
message: unsubscribe s-news