Wednesday, July 30, 2014

LocalSolver License

I spent the last few days translating the LPSolve code from my previous optimization problem to using the LocalSolver API. I'll admit, I like the syntax and the accompanying R package to the solver quite a bit. I found it incredibly easy to import data, setup constraints and solve the problem. However, there was one big downside, I'm not a student or teacher and can't afford to buy a commercial license.

Anyway, here's the code in case you have a license that exceeds the limitations for the free license.

Monday, July 28, 2014

Better Fantasy Football Data for Optimization

It's been awhile, but finally got around to parsing a better data source. FFToday provides projections for multiple statistics per position. This fills the gap for Special Teams and Kickers that ESPN was not providing. In addition, you can filter to get the projected total points according to several different websites default scoring and rules. The link I used calculates total points for standard ESPN leagues, but you can easily adjust and pull for your own league or site.

Well, let's get into the fun part. In general, all of the positions are relatively simple in pulling the data but vary in the columns that are pulled. We'll only use two libraries for parsing the website and then combining the data frames.

require("XML")
require("plyr")
options(stringsAsFactors = FALSE)

Next up, let's pull the raw data. As you'll note, the first column and row are not useful for us so we'll drop it and then change the column names. The last piece looks at removing some weird characters that ended up inside of the Name field.

link<-"http://fftoday.com/rankings/playerproj.php?Season=2014&PosID=10&LeagueID=26955"

x<-readHTMLTable(link)
qb <- x[11]$'NULL'
qb <- qb[-1,-1]
names(qb)<-c("Name","Team","Bye.Week","Passing.Completions","Passing.Attempts",
             "Passing.Yards","Passing.TDs","Passing.INTs",
             "Rushing.Attempts","Rushing.Yards","Rushing.TDs","FFPTs")

qb$Name<-sapply(qb$Name,function(x) substring(x,3))

We then repeat this for the other positions.

link<-"http://fftoday.com/rankings/playerproj.php?Season=2014&PosID=20&LeagueID=26955"
x<-readHTMLTable(link)
rb <- x[11]$'NULL'
rb <- rb[-1,-1]
names(rb)<-c("Name","Team","Bye.Week","Rushing.Attempts","Rushing.Yards","Rushing.TDs",
             "Receiving.Catches","Receiving.Yards","Receiving.TDs","FFPTs")
rb$Name<-sapply(rb$Name,function(x) substring(x,3))

link<-"http://fftoday.com/rankings/playerproj.php?Season=2014&PosID=30&LeagueID=26955"
x<-readHTMLTable(link)
wr <- x[11]$'NULL'
wr <- wr[-1,-1]
names(wr)<-c("Name","Team","Bye.Week","Receiving.Catches","Receiving.Yards","Receiving.TDs",
             "Rushing.Attempts","Rushing.Yards","Rushing.TDs","FFPTs")
wr$Name<-sapply(wr$Name,function(x) substring(x,3))

link<-"http://fftoday.com/rankings/playerproj.php?Season=2014&PosID=40&LeagueID=26955"
x<-readHTMLTable(link)
te <- x[11]$'NULL'
te <- te[-1,-1]
names(te)<-c("Name","Team","Bye.Week","Receiving.Catches","Receiving.Yards","Receiving.TDs","FFPTs")
te$Name<-sapply(te$Name,function(x) substring(x,3))

link<-"http://fftoday.com/rankings/playerproj.php?Season=2014&PosID=80&LeagueID=26955"
x<-readHTMLTable(link)
k <- x[11]$'NULL'
k <- k[-1,-1]
names(k)<-c("Name","Team","Bye.Week","Field.Goals.Made","Field.Goals.Attempts","Field.Goals.Pct",
             "Extra.Points.Made","Extra.Points.Attempts","FFPTs")
k$Name<-sapply(k$Name,function(x) substring(x,3))

link<-"http://fftoday.com/rankings/playerproj.php?Season=2014&PosID=99&LeagueID=26955"
x<-readHTMLTable(link)
def <- x[11]$'NULL'
def <- def[-1,-1]
names(def)<-c("Name","Bye.Week","Sacks","Fumble.Recovered","Interceptions",
            "Defensive.TDs","Points.Against","Passing.Yards.per.Game","Rushing.Yards.per.Game",
            "Safety","Kicking.TDs","FFPTs")
def$Name<-sapply(def$Name,function(x) substring(x,3))

Lastly, we'll add a field with the position and then combine the datasets. Note since each of the data frames has different columns, we need to use a function from the PLYR package to combine the data frames. This leaves NAs where the column wasn't in the data frame. We can easily fix that with the last statement.

qb$pos <- as.factor("QB")
rb$pos <- as.factor("RB")
wr$pos <- as.factor("WR")
te$pos <- as.factor("TE")
def$pos <- as.factor("DEF")
k$pos <- as.factor("K")

d <- rbind.fill(qb,rb,wr,te,def,k)
d[is.na(d)]<-0

Hooray, better data to work with in our optimization. I'm looking into using a different solver that looks to be a bit more user friendly. Keep an eye out for a blog on localsolver up next!