I saw a recent
post on OR-Exchange about what programming language is best to for optimization. While I agree with
@JFPuget that languages are just a wrapper for various solvers, there is a learning curve behind how to use each wrapper. My previous entries are about how to program in SAS using optmodel. I took this opportunity to write up the same example inside of R and using
lpsolve.
First up, let's setup the data.
products<-data.frame(
product=c("TRAVEL","CASH Rewards","HOTEL"),
volume=rep(500,3))
prices<-data.frame(
price=c("10.99"),
volume=c(500))
set.seed(123)
customers<-data.frame(
id=seq(1,1000),
customer_status=rbinom(1000,1,0.25))
set.seed(123)
require("triangle")
model.scores<-rbind(
data.frame(
id=seq(1,1000),
price=rep("10.99",1000),
product=rep("TRAVEL",1000),
expected.profit=runif(1000,1,100),
likelihood.to.apply=rtriangle(1000,0.0,1.0,0.6)),
data.frame(
id=seq(1,1000),
price=rep("10.99",1000),
product=rep("CASH Rewards",1000),
expected.profit=runif(1000,1,100),
likeliho
od.to.apply=rtriangle(1000,0.0,1.0,0.6)),
data.frame(
id=seq(1,1000),
price=rep("10.99",1000),
product=rep("HOTEL",1000),
expected.profit=runif(1000,1,100),
likelihood.to.apply=rtriangle(1000,0.0,1.0,0.6)))
Next Up, the optimization code. I will note that this is somewhat confusing but I will break it up into sections similar to what I did with the SAS version. I will be using these two libraries for lpSolve. The first one provides access to the solver itself but the API is what actually makes this relatively easy to code.
require("lpSolve");require("lpSolveAPI")
The first step is making an empty optimization problem named ip. I start with zero constraints and add in the number of decision va
riables required.
ip <- make.lp(0,nrow(customers)*nrow(products)*nrow(prices))
Next I declare each decision variable as binary.
set.type(ip,seq(1,nrow(customers)*nrow(products)*nrow(prices)),type="binary")
This next part will seem like a relatively pointless step, but it will help in adding constraints. This data.frame contains every combination of customer, product, and price available. This will provide the index for assigning values in constraints and the objective function.
combos <- expand.grid(prices$price,products$product,customers$id)
names(combos)<-c('price','product','id')
rownames(combos)<-do.call(paste, c(combos[c("id", "product","price")], sep = "_"))
rownames(model.scores)<-do.call(paste, c(model.scores[c("id", "product","price")], sep = "_"))
model.scores.ordered<-model.scores[order(match(rownames(model.scores),rownames(combos))),]
By default, the objective function is set to minimize the problem. First I will set the coefficients to the decision variables in the objective function and then change it to be a maximization problem.
set.objfn(ip,model.scores.ordered$expected.profit*model.scores.ordered$likelihood.to.apply)
lp.control(ip,sense="max")
Here are two constants we will use in defining the constraints.
prod.per.person<-1
price.per.product<-1
First up, let's add in the constraints around the number of products per person.
for (c in customers$id) {
add.constraint(ip,
rep(1,nrow(products)*nrow(prices)),
"<=",
prod.per.person,
which(combos$id==c)
)
}
The next set of constraints will limit the number of price points per product per customer.
for (c in customers$id) {
for (p in products$product) {
add.constraint(ip,
rep(1,nrow(prices)),
"<=",
price.per.product,
which(combos$product==p & combos$id==c)
)
}
}
The next set of constraints assign the volume constraints for price points.
for (q in prices$price) {
add.constraint(ip,
rep(1,nrow(customers)*nrow(products)),
"<=",
prices$volume[which(q==prices$price)],
which(combos$price==q)
)
}
The next set of constraints assign the volume constraints for products.
for (p in products$product) {
add.constraint(ip,
rep(1,length(which(combos$product==p))),
"<=",
products$volume[which(p==products$product)],
which(combos$product==p)
)
}
Finally, let's clean up the formulation a little and assign names to the decision variables and constraints.
colnames <- character()
for (c in customers$id) {
for (p in products$product) {
for (q in prices$price) {
colnames<-c(colnames,paste(c,p,q,sep="_"))
}
}
}
rownames <- character()
for (c in customers$id) {
rownames<-c(rownames,paste("prod_per_cust",c,sep="_"))
}
for (c in customers$id) {
for (p in products$product) {
rownames<-c(rownames,paste("price_per_prod",c,p,sep="_"))
}
}
for (q in prices$price) {
rownames<-c(rownames,paste("price_vol",q,sep="_"))
}
for (p in products$product) {
rownames<-c(rownames,paste("prod_vol",p,sep="_"))
}
dimnames(ip) <- list(rownames, colnames)
It's possible to write out the formulation in case you would like to review it. Note that this can be very overwhelming with the increase in any of the three dimensions to the problem.
write.lp(ip,"modelformulation.txt",type="lp",use.names=T)
Last, but not least, let's solve the problem. Following the ability to solve the problem are commands to view the objective value, decision variable values and constraint values.
solve(ip)
get.objective(ip)
get.variables(ip)
get.constraints(ip)
I'm sure there are more elegant ways for writing up this code, but it still remains a relatively painful way to code an optimization problem. I'll skip the visualization part for now. I'm going to spend time writing up a Python version next and then hope to write a comparison between the three versions.