Well, let's get into the fun part. Pulling the data is roughly the same for every position; only the columns differ. We'll use just two libraries: one for parsing the website and one for combining the data frames.
library("XML")   # readHTMLTable() for scraping the projection tables
library("plyr")  # rbind.fill() for combining data frames with different columns
options(stringsAsFactors = FALSE)
Next up, let's pull the raw data. As you'll note, the first row and column aren't useful to us, so we'll drop them and then set our own column names. The last line strips some stray characters that end up at the start of the Name field.
link <- "http://fftoday.com/rankings/playerproj.php?Season=2014&PosID=10&LeagueID=26955"
x <- readHTMLTable(link)
qb <- x[[11]]       # the projections are the 11th table on the page
qb <- qb[-1, -1]    # drop the header row and the rank column
names(qb) <- c("Name", "Team", "Bye.Week", "Passing.Completions", "Passing.Attempts",
               "Passing.Yards", "Passing.TDs", "Passing.INTs",
               "Rushing.Attempts", "Rushing.Yards", "Rushing.TDs", "FFPTs")
qb$Name <- sapply(qb$Name, function(x) substring(x, 3))  # strip stray leading characters
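To see what that last line is doing, here's a toy example (a made-up name string, not actual site output; the code above implies the stray prefix is exactly two characters, which is what substring(x, 3) drops):

```r
# Hypothetical raw value with a two-character prefix, as substring(x, 3) assumes
raw_name <- "1.Peyton Manning"
clean_name <- substring(raw_name, 3)
clean_name  # "Peyton Manning"
```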
We then repeat this for the other positions, changing only the PosID in the URL and the column names.
link <- "http://fftoday.com/rankings/playerproj.php?Season=2014&PosID=20&LeagueID=26955"
x <- readHTMLTable(link)
rb <- x[[11]]
rb <- rb[-1, -1]
names(rb) <- c("Name", "Team", "Bye.Week", "Rushing.Attempts", "Rushing.Yards", "Rushing.TDs",
               "Receiving.Catches", "Receiving.Yards", "Receiving.TDs", "FFPTs")
rb$Name <- sapply(rb$Name, function(x) substring(x, 3))
link <- "http://fftoday.com/rankings/playerproj.php?Season=2014&PosID=30&LeagueID=26955"
x <- readHTMLTable(link)
wr <- x[[11]]
wr <- wr[-1, -1]
names(wr) <- c("Name", "Team", "Bye.Week", "Receiving.Catches", "Receiving.Yards", "Receiving.TDs",
               "Rushing.Attempts", "Rushing.Yards", "Rushing.TDs", "FFPTs")
wr$Name <- sapply(wr$Name, function(x) substring(x, 3))
link <- "http://fftoday.com/rankings/playerproj.php?Season=2014&PosID=40&LeagueID=26955"
x <- readHTMLTable(link)
te <- x[[11]]
te <- te[-1, -1]
names(te) <- c("Name", "Team", "Bye.Week", "Receiving.Catches", "Receiving.Yards",
               "Receiving.TDs", "FFPTs")
te$Name <- sapply(te$Name, function(x) substring(x, 3))
link <- "http://fftoday.com/rankings/playerproj.php?Season=2014&PosID=80&LeagueID=26955"
x <- readHTMLTable(link)
k <- x[[11]]
k <- k[-1, -1]
names(k) <- c("Name", "Team", "Bye.Week", "Field.Goals.Made", "Field.Goals.Attempts",
              "Field.Goals.Pct", "Extra.Points.Made", "Extra.Points.Attempts", "FFPTs")
k$Name <- sapply(k$Name, function(x) substring(x, 3))
link <- "http://fftoday.com/rankings/playerproj.php?Season=2014&PosID=99&LeagueID=26955"
x <- readHTMLTable(link)
def <- x[[11]]
def <- def[-1, -1]
names(def) <- c("Name", "Bye.Week", "Sacks", "Fumble.Recovered", "Interceptions",
                "Defensive.TDs", "Points.Against", "Passing.Yards.per.Game",
                "Rushing.Yards.per.Game", "Safety", "Kicking.TDs", "FFPTs")
def$Name <- sapply(def$Name, function(x) substring(x, 3))
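Since each block differs only in the PosID and the column names, the repetition above could be folded into a small helper. This is just a sketch of the same steps (the URL, the table index 11, and the two-character name prefix all come from the code above; the function names are my own):

```r
# Hypothetical helper: fetch one position's projection table and clean it.
fetch_proj <- function(pos_id, col_names) {
  link <- paste0("http://fftoday.com/rankings/playerproj.php?Season=2014",
                 "&PosID=", pos_id, "&LeagueID=26955")
  tbl <- readHTMLTable(link)[[11]]  # projections are the 11th table on the page
  clean_proj(tbl, col_names)
}

# Pure cleaning step: drop the header row and rank column, rename the
# columns, and strip the two stray leading characters from Name.
clean_proj <- function(tbl, col_names) {
  tbl <- tbl[-1, -1]
  names(tbl) <- col_names
  tbl$Name <- sapply(tbl$Name, function(x) substring(x, 3), USE.NAMES = FALSE)
  tbl
}
```

With that in place, each position becomes a one-liner, e.g. qb <- fetch_proj(10, c("Name", "Team", "Bye.Week", ...)).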
Lastly, we'll add a field with the position and then combine the datasets. Since the data frames have different columns, a plain rbind() would fail, so we use rbind.fill() from the plyr package, which takes the union of the columns and leaves NA wherever a column was missing from a frame. The last statement replaces those NAs with zeros.
qb$pos <- as.factor("QB")
rb$pos <- as.factor("RB")
wr$pos <- as.factor("WR")
te$pos <- as.factor("TE")
def$pos <- as.factor("DEF")
k$pos <- as.factor("K")
d <- rbind.fill(qb, rb, wr, te, def, k)
d[is.na(d)] <- 0
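The behavior of rbind.fill() is easy to see on toy data (made-up players and numbers, not the projections):

```r
library("plyr")

a <- data.frame(Name = "QB One", Passing.Yards = 5000, pos = "QB")
b <- data.frame(Name = "RB One", Rushing.Yards = 1500, pos = "RB")

d <- rbind.fill(a, b)  # union of columns; cells with no source column become NA
d[is.na(d)] <- 0       # zero out the NAs, as above
# d$Passing.Yards is now c(5000, 0) and d$Rushing.Yards is c(0, 1500)
```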
Hooray, better data to work with in our optimization. I'm also looking into a different solver that seems a bit more user friendly, so keep an eye out for a post on LocalSolver next!