Tuesday, July 1, 2014

Fantasy Football Optimization Part 1

I decided to break the problem into baby steps. This first part builds the initial structure of the optimization problem. For those who read my other post on optimization in R, I'll be using the same libraries and style to set up this problem.

First up, let's read in the data we created in the last post. We'll add a simple column that creates a numeric ID per player.
d <- read.csv(file=paste(getwd(),"/Data/ESPN-Projections.csv", sep=""))
d$id <- as.integer(factor(paste(d$name,d$team)))
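
As a quick illustration of how that ID column behaves, here is a minimal sketch on a made-up data frame (the `toy` name and its rows are hypothetical, not taken from the projections file). Pasting name and team together means two players who share a name but play for different teams still get distinct IDs.

```r
# Toy stand-in for the projections file; these rows are invented.
toy <- data.frame(
  name = c("Player A", "Player B", "Player A"),
  team = c("DEN", "GB", "SEA"),
  stringsAsFactors = FALSE
)

# factor() creates one level per distinct "name team" string,
# and as.integer() turns those levels into IDs running from 1 to n.
toy$id <- as.integer(factor(paste(toy$name, toy$team)))
toy
```

The IDs follow the sorted order of the pasted strings, so they are stable as long as the names and teams in the file don't change.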

Now that the data is all set, we can load the required solver libraries.
library(lpSolve)
library(lpSolveAPI)

Next, we set the number of teams in the league and build a vector of team IDs from it.
num.teams <- 10
teams <- seq(1,num.teams)

Similarly, we can grab the number of players in our dataset and create a vector of the ids.
num.players <- length(unique(d$id))
players <- unique(d$id)

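I'm going to create a data frame with the decision variables for our problem. First up is creating the cross product of all players and teams. We'll then merge in our player data and add in a team ID. Because merge() sorts its result by player ID by default, each player ends up in num.teams consecutive rows, so the team IDs can simply cycle from 1 to num.teams.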
vars <- data.frame(player.id=rep(players,num.teams))
vars <- merge(x=vars,y=d,by.y="id",by.x="player.id")
vars <- vars[,c("player.id","pos","name")]
vars$team.id <- rep(seq(1,num.teams),num.players)
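
To see what that layout looks like, here is a small sketch that repeats the same steps on a hypothetical 3-player, 2-team league (all `toy.*` names and values are invented for illustration) and checks that every player/team pair shows up exactly once.

```r
# Toy data: 3 players and 2 teams (hypothetical values).
toy <- data.frame(id = c(3, 1, 2),
                  name = c("C", "A", "B"),
                  pos = c("QB", "RB", "WR"),
                  stringsAsFactors = FALSE)
toy.teams <- 2
toy.players <- unique(toy$id)

toy.vars <- data.frame(player.id = rep(toy.players, toy.teams))
# merge() sorts by the key by default, so each player lands in
# toy.teams consecutive rows regardless of the order in toy.
toy.vars <- merge(x = toy.vars, y = toy, by.x = "player.id", by.y = "id")
toy.vars <- toy.vars[, c("player.id", "pos", "name")]
toy.vars$team.id <- rep(seq(1, toy.teams), length(toy.players))

# Every (player, team) pair should appear exactly once.
stopifnot(!any(duplicated(toy.vars[, c("player.id", "team.id")])))
stopifnot(nrow(toy.vars) == length(toy.players) * toy.teams)
toy.vars
```

Each row of the result corresponds to one column of the integer program, so the row order here is the order every later coefficient vector has to follow.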

The data is set up and it's time to create the actual Integer Programming problem. Note that the decision variables are binary: either a player is assigned to a given team or he isn't.
ip <- make.lp(0,num.players*num.teams)
set.type(ip,seq(1,num.players*num.teams),type="binary")

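The objective function is simply to maximize the total number of projected points. The coefficients have to follow the row order of vars (sorted by player, with one row per team), so we look up each row's points by player ID rather than repeating the column from d as-is.
set.objfn(ip,d$total.points[match(vars$player.id,d$id)])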
lp.control(ip,sense="max")
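
The objective coefficients only make sense if they line up with the decision variables, so it's worth a quick sanity check. The sketch below uses made-up numbers (the `toy` frames are hypothetical) to show one way of pulling each row's projected points by player ID with match(), and how that differs from simply tiling the points column.

```r
# Toy player data, deliberately not sorted by id (hypothetical values).
toy <- data.frame(id = c(3, 1, 2), total.points = c(300, 100, 200))

# Decision variables laid out by player, then by team.
toy.vars <- data.frame(player.id = c(1, 1, 2, 2, 3, 3),
                       team.id   = c(1, 2, 1, 2, 1, 2))

# Look up the points for each row's player.
aligned <- toy$total.points[match(toy.vars$player.id, toy$id)]

# Tiling the column instead follows the order of toy, not toy.vars.
tiled <- rep(toy$total.points, 2)

data.frame(toy.vars, aligned, tiled)
```

In the `aligned` column both rows for a given player carry that player's points, while in the `tiled` column they don't.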

We need a constraint for each player ensuring that he is assigned to at most one team.
for (p in players) {
  add.constraint(ip,
                 rep(1,num.teams),
                 "<=",
                 1,
                 which(vars$player.id==p)
                 )
}
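
Here is a minimal sketch of the same kind of constraint on a tiny, made-up model with 2 players and 2 teams (the `toy.*` names and point values are hypothetical). With nothing else in the model, the solver assigns each player to exactly one team, since every assignment adds points and the constraint blocks a second one.

```r
library(lpSolveAPI)

# One row per (player, team), in the same layout as vars.
toy.vars <- data.frame(player.id = c(1, 1, 2, 2),
                       team.id   = c(1, 2, 1, 2))

toy.ip <- make.lp(0, nrow(toy.vars))
set.type(toy.ip, seq(1, nrow(toy.vars)), type = "binary")
# Player 1 is worth 10 points, player 2 is worth 5 (invented values).
set.objfn(toy.ip, c(10, 10, 5, 5))
invisible(lp.control(toy.ip, sense = "max"))

# Each player's assignments across teams must sum to at most 1.
for (p in unique(toy.vars$player.id)) {
  idx <- which(toy.vars$player.id == p)
  add.constraint(toy.ip, rep(1, length(idx)), "<=", 1, idx)
}

status <- solve(toy.ip)
picked <- get.variables(toy.ip)
# Number of teams each player was assigned to.
assigned <- tapply(picked, toy.vars$player.id, sum)
assigned
```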

Now for the team constraints. First up, the positions required for each team. For simplicity, I'm using the lineup that ESPN uses in its standard league. Here is the minimum number of players to be drafted at each position:

  • 1 QB
  • 2 RB
  • 2 WR
  • 1 RB/WR/TE (Flex player)
  • 1 TE
  • 1 DEF
  • 1 K


for (t in teams) {
  #This constraint covers having at least 1 QB  
  add.constraint(ip,
                 rep(1,sum(vars$pos=="QB")/num.teams),
                 ">=",
                 1,
                 which(vars$team.id==t & vars$pos=="QB")
  )
  #This constraint covers having at least 2 WR
  add.constraint(ip,
                 rep(1,sum(vars$pos=="WR")/num.teams),
                 ">=",
                 2,
                 which(vars$team.id==t & vars$pos=="WR")
  )
  #This constraint covers having at least 2 RB
  add.constraint(ip,
                 rep(1,sum(vars$pos=="RB")/num.teams),
                 ">=",
                 2,
                 which(vars$team.id==t & vars$pos=="RB")
  )
  #This constraint covers having at least 1 DEF
  add.constraint(ip,
                 rep(1,sum(vars$pos=="DEF")/num.teams),
                 ">=",
                 1,
                 which(vars$team.id==t & vars$pos=="DEF")
  )
  #This constraint covers having at least 1 K
  add.constraint(ip,
                 rep(1,sum(vars$pos=="K")/num.teams),
                 ">=",
                 1,
                 which(vars$team.id==t & vars$pos=="K")
  )
  #This constraint covers having at least 1 TE
  add.constraint(ip,
                 rep(1,sum(vars$pos=="TE")/num.teams),
                 ">=",
                 1,
                 which(vars$team.id==t & vars$pos=="TE")
  )
  #This constraint covers having at least 1 flex player. Note that the other constraints require at least 1 TE, 2 RB, 2 WR. In order to cover a flex player, the total sum of players from those positions needs to be at least 6.
  add.constraint(ip,
                 rep(1,sum(vars$pos=="TE",vars$pos=="RB",vars$pos=="WR")/num.teams),
                 ">=",
                 6,
                 which(vars$team.id==t & (vars$pos=="TE" | vars$pos=="RB" | vars$pos=="WR"))
  )
  #This constraint covers each team having 16 players
  add.constraint(ip,
                 rep(1,num.players),
                 "=",
                 16,
                 which(vars$team.id==t)
  )
}
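
The flex rule is the least obvious of these, so here is a small base R sketch of the same logic outside the solver. The helper name `meets.minimums` is made up for this example, and it only checks the position minimums, not the 16-player roster size.

```r
# Hypothetical helper: does a vector of positions satisfy the minimums?
meets.minimums <- function(pos) {
  n <- function(p) sum(pos %in% p)
  n("QB") >= 1 && n("RB") >= 2 && n("WR") >= 2 && n("TE") >= 1 &&
    n("DEF") >= 1 && n("K") >= 1 &&
    # 2 RB + 2 WR + 1 TE + 1 flex means at least 6 from these positions.
    n(c("RB", "WR", "TE")) >= 6
}

# Meets every single-position minimum, but has no flex player.
no.flex <- c("QB", "RB", "RB", "WR", "WR", "TE", "DEF", "K")
# Adding a third RB fills the flex spot.
with.flex <- c(no.flex, "RB")

meets.minimums(no.flex)
meets.minimums(with.flex)
```

The first roster fails only because its RB, WR and TE count is 5, which is exactly the case the "at least 6" constraint is there to rule out.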

Well that's it for our basic set of constraints. If you're interested in seeing what the model formulation looks like, execute the "write.lp" statement below.
write.lp(ip,file.path(getwd(),"modelformulation.txt"),type="lp",use.names=TRUE)

Now the fun part: solving the integer program. Provided the problem is feasible (and it is), we can then get the objective function value and the solution.
solve(ip)
get.objective(ip)
get.variables(ip)
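
It can be worth capturing what solve() returns before reading off the results. In lpSolveAPI a status of 0 means an optimal solution was found, and the package documentation lists 2 as the code for an infeasible model. The sketch below uses a deliberately impossible one-variable model (`bad.ip` is a made-up name) to show the pattern.

```r
library(lpSolveAPI)

# A model that cannot be satisfied: a binary x with x >= 2.
bad.ip <- make.lp(0, 1)
set.type(bad.ip, 1, type = "binary")
set.objfn(bad.ip, 1)
add.constraint(bad.ip, 1, ">=", 2, 1)

status <- solve(bad.ip)
if (status == 0) {
  # Only trust the objective and variables after a successful solve.
  print(get.objective(bad.ip))
} else {
  message("No optimal solution; solve() returned status ", status)
}
```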

The raw solution vector is hard to read on its own, so we can keep just the assignments and print them out.

sol<-vars[get.variables(ip)==1,c("name","team.id","pos")]
View(sol[order(sol$team.id,sol$pos),])

One huge downside to this approach is that it ignores actual drafting strategy and its complications. The model simply maximizes total projected points across the league; nothing in it accounts for draft order or balances talent between teams. I particularly dislike that some teams end up with more than one kicker. No one should ever own more than one kicker.
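
One possible way to rule out the extra kickers, sketched here on a made-up toy league rather than the real data, is to add an upper bound alongside the minimums. Everything named `toy.*` below is hypothetical, and this is only one option, not necessarily where the formulation will end up.

```r
library(lpSolveAPI)

# Hypothetical toy league: 2 kickers, 2 quarterbacks, 2 teams of 2 players.
toy <- data.frame(id = 1:4,
                  pos = c("K", "K", "QB", "QB"),
                  total.points = c(9, 8, 1, 1),
                  stringsAsFactors = FALSE)
toy.teams <- 1:2

# Same layout as vars: one row per (player, team), sorted by player.
toy.vars <- merge(data.frame(player.id = rep(toy$id, length(toy.teams))),
                  toy, by.x = "player.id", by.y = "id")
toy.vars$team.id <- rep(toy.teams, nrow(toy))

toy.ip <- make.lp(0, nrow(toy.vars))
set.type(toy.ip, seq(1, nrow(toy.vars)), type = "binary")
set.objfn(toy.ip, toy.vars$total.points)
invisible(lp.control(toy.ip, sense = "max"))

# Each player goes to at most one team.
for (p in toy$id) {
  idx <- which(toy.vars$player.id == p)
  add.constraint(toy.ip, rep(1, length(idx)), "<=", 1, idx)
}
for (t in toy.teams) {
  # Each team holds exactly 2 players.
  idx <- which(toy.vars$team.id == t)
  add.constraint(toy.ip, rep(1, length(idx)), "=", 2, idx)
  # The cap: at most 1 kicker per team.
  idx <- which(toy.vars$team.id == t & toy.vars$pos == "K")
  add.constraint(toy.ip, rep(1, length(idx)), "<=", 1, idx)
}

status <- solve(toy.ip)
toy.sol <- toy.vars[get.variables(toy.ip) > 0.5, ]
kickers.per.team <- tapply(toy.sol$pos == "K", toy.sol$team.id, sum)
kickers.per.team
```

Without the cap, a team holding both kickers would be just as good for the objective, so the solver would be free to return it.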

My next step is to either improve the formulation of this problem, probably by using some options mentioned in this Fantasy Football Analytics post, or to look at applying a different algorithm to solving this problem.
