EDUCATION
George Mason University, Fairfax, VA, December 2010
Master of Science, Operations Research
Concentration in Decision Analysis
Worcester Polytechnic Institute, Worcester, MA, May 2006
Bachelor of Science, Mathematics
Majority of Coursework in Applied Statistics
Relevant Coursework:
- Deterministic Models
- Stochastic Models
- Numerical Methods
- Discrete System Simulation
- Applied Statistics
- Probability and Mathematical Statistics
- Decision and Risk Analysis
- Applied Probability
- Categorical Data Analysis
- Linear Programming
- Integer Programming
- Dynamic Programming
- Judgement and Choice Processing
SKILLS
Operating Systems:
- Windows (7, XP)
- Mac OS X
- Unix
- Linux
Software Packages:
- MS EXCEL
- MS ACCESS
- SAS BASE/EG
- Netezza
- Tableau
- ARENA
- AnyLogic
- ArcGIS
- Decision Lens
- MPL
- RStudio
- PostgreSQL
Programming Languages:
- R (5+ years experience)
- SAS (STAT/GRAPH/MACRO/SQL/OR) (10+ years experience)
- Python (3+ years experience)
- Visual Basic for Applications (8+ years experience)
- HTML (3+ years experience)
- JavaScript (1+ years experience)
SECURITY CLEARANCE
Inquire for more details.
CERTIFICATIONS
SAS Certified Base Programmer for SAS 9
SAS Certified Advanced Programmer for SAS 9
ACADEMIC PROJECTS
Senior Major Qualifying Project: Worcester Polytechnic Institute, Worcester, MA, August 2005 - March 2006
Worked with two partners to create a measure of the importance of defense across Major League Baseball. Parsed ten years of a play-by-play database provided by Retrosheet in SAS to extract significant situational statistics. Applied factor analysis to the significant statistics specific to each position, team, and year, and fed the resulting factors into a generalized linear model; cluster analysis and logistic regression were explored as alternative approaches (a minimal version of the factor-analysis-plus-GLM idea is sketched below).
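A minimal R sketch of the factor-analysis-plus-GLM workflow. The fielding statistics, column names, and simulated data are illustrative stand-ins, not the actual Retrosheet variables used in the project.

    # Illustrative only: simulated stand-ins for situational fielding statistics
    set.seed(1)
    n <- 300
    skill <- rnorm(n)                                   # latent defensive ability
    def <- data.frame(
      assists     =  skill + rnorm(n, sd = 0.5),
      putouts     =  skill + rnorm(n, sd = 0.5),
      errors      = -skill + rnorm(n, sd = 0.5),
      dp_turned   =  skill + rnorm(n, sd = 0.7),
      range_plays =  skill + rnorm(n, sd = 0.7)
    )
    def$runs_allowed <- 4.5 - 0.8 * skill + rnorm(n, sd = 0.3)

    # Reduce the correlated statistics to a single defensive factor
    fa <- factanal(def[, 1:5], factors = 1, scores = "regression")
    def$def_factor <- fa$scores[, 1]

    # Relate the latent defensive factor to runs allowed with a GLM
    fit <- glm(runs_allowed ~ def_factor, data = def, family = gaussian())
    summary(fit)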
Operations Research Project Course: George Mason University, Fairfax, VA, August 2008 - December 2010
Worked with three partners to create an optimal assignment of players to teams given constraints and goals set by a softball league. Using lpsolve, the team encoded the measures of team equality as goal constraints within an integer programming model (a minimal sketch of the formulation follows). The team also analyzed the underlying statistics used within the model to provide insight on the importance of the goals and to provide predictive metrics on the model results, and delivered a simple interface to run the model, set options, and produce results.
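A minimal R sketch of the goal-constrained assignment formulation using the lpSolve package. The number of players, team size, and skill ratings are made up, and only a single team-equality goal is shown.

    library(lpSolve)   # assumes the lpSolve R package is installed

    # Made-up league: 12 players, 3 teams of 4, skill ratings 1-10
    set.seed(2)
    n_players <- 12; n_teams <- 3; team_size <- 4
    rating <- sample(1:10, n_players, replace = TRUE)
    target <- sum(rating) / n_teams              # ideal total rating per team

    # Variables: x[p,t] binary assignments, then d+ and d- deviations per team
    n_x   <- n_players * n_teams
    n_var <- n_x + 2 * n_teams
    obj   <- c(rep(0, n_x), rep(1, 2 * n_teams)) # minimize total goal deviation
    col   <- function(p, t) (t - 1) * n_players + p

    A <- NULL; dirs <- character(); rhs <- numeric()
    for (p in 1:n_players) {                     # each player on exactly one team
      r <- rep(0, n_var); r[col(p, 1:n_teams)] <- 1
      A <- rbind(A, r); dirs <- c(dirs, "="); rhs <- c(rhs, 1)
    }
    for (t in 1:n_teams) {                       # equal roster sizes
      r <- rep(0, n_var); r[col(1:n_players, t)] <- 1
      A <- rbind(A, r); dirs <- c(dirs, "="); rhs <- c(rhs, team_size)
    }
    for (t in 1:n_teams) {                       # goal: team strength near average
      r <- rep(0, n_var); r[col(1:n_players, t)] <- rating
      r[n_x + t] <- -1; r[n_x + n_teams + t] <- 1
      A <- rbind(A, r); dirs <- c(dirs, "="); rhs <- c(rhs, target)
    }

    sol <- lp("min", obj, A, dirs, rhs, binary.vec = 1:n_x)
    matrix(sol$solution[1:n_x], n_players, n_teams)   # rows = players, cols = teams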
AWARDS
NASA Systems Engineering Award: Campaign Analysis Team, October 2010
The Systems Engineering “Techniques/Methodologies” Activity Award acknowledges the superior achievement of an individual or team for an endeavor that has set new standards of excellence at NASA and has advanced the field of systems engineering. Examples of the basis for selection include activities utilizing research and development of new systems engineering techniques or methodologies that elevate SE understanding, leadership development, or the development of new or modified systems engineering best practices.
EXPERIENCE
Operations Research Analyst, Science Applications International Corporation, McLean, VA, June 2006 - October 2010
- Served as the data manager for analysis of counter-terrorism defense measures in place for several mass transit systems under the Transit Risk Assessment Module (TRAM). Assisted in recording data at meetings and on-site visits. Tracked changes to and finalized documents for delivery to the customer. Performed cost-benefit analysis (CBA) for several jurisdictions using the Risk Management Tool (RMT). Prepared materials for TRAM workshops and adjusted data during the workshops to provide accurate results. Acted as the CBA specialist during the TRAM workshops and instructed participants on proper use of the RMT software.
- Updated the TRAM methodology to include multi-hazards and aggregated decision tactics. Built the new methodology into EXCEL through worksheet functions and VBA.
- Using specific projects developed by the Port Authority of New York and New Jersey (PANYNJ), adjusted pieces of the risk formula and calculated a ten-year lifecycle cost in order to analyze and update the priority of projects to be implemented. Developed a specialized toolkit in EXCEL to run a cost-benefit analysis and select optimal points along a Pareto frontier (a brief sketch of the Pareto selection appears after these bullets).
- Cooperated with a team to build a system dynamics model displaying response and recovery capabilities following a specific hazard in a downtown metropolitan area. Modified flat-earth data for an improvised nuclear attack to account for the effects of buildings and terrain on the weapon. Created a user interface spanning the several programs used to prepare inputs, run the model, and view results. Assisted in creating a graphical representation of results incorporating EXCEL and Google Earth.
- Used the ARENA software package to build six models of points of entry into and departure from the United States, and provided analysis of architecture performance across several categories. Used VBA to write values produced by ARENA into EXCEL. Created a toolkit in MS EXCEL for a final analysis of the results across each of the models built.
- Used the SAS software packages (BASE, MACRO, OR) to develop a model assigning recruits to schools in the U.S. Marine Corps as part of the Total Force Marine Manpower Review (TFMMR) project. Parsed a large database of individual recruit information to extract useful data and anticipate errors produced by missing values.
- Updated and expanded the capabilities of an EXCEL-based tool used to evaluate potential risk for several possible lunar surface campaigns for NASA. Developed several modules to handle new scenarios developed by NASA, including lunar surface exploration. Consolidated several different approaches into a single workbook and improved the efficiency of the overall Monte Carlo simulation.
- Used AnyLogic to build a combined discrete event / agent-based model to determine the effects of introducing delays to a seaport's operations. Built reports from the results of sensitivity analysis and Monte Carlo versions of the model. Compiled an applet version of the model and combined it with an HTML interface for delivery to the customer.
- Analyzed daily PDF reports on ISS Crew Activities posted by NASA and created an EXCEL workbook that extracts the necessary information from each PDF into a usable format. Calculated several statistics through VBA and EXCEL workbook functions.
- Assisted in data collection and manipulation for several jurisdictions across the United States. Using EXCEL and VBA, extracted and validated the necessary information from several PDFs. Created an EXCEL tool for future data dumps from PDFs to facilitate the overall data collection. Each piece of data feeds a complex risk calculation that determines each jurisdiction's priority for receiving grant money from FEMA.
- Used a combination of Java and MS ACCESS to create a program that illustrates connections between documents. Created the relational database containing all the information required by the Java program, including specific information for each document and detailed linkages between documents. Assisted with the Java code used to connect the program to the database for maximum efficiency.
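A brief R sketch of the Pareto frontier selection referenced in the PANYNJ bullet above. The project costs and risk-reduction benefits below are simulated, not PANYNJ data.

    # A project is Pareto-optimal if no other project costs the same or less
    # while delivering at least as much benefit.
    set.seed(3)
    projects <- data.frame(name    = paste0("P", 1:20),
                           cost    = round(runif(20, 1, 10), 2),   # $M lifecycle cost
                           benefit = round(runif(20, 0, 100), 1))  # risk reduction

    on_frontier <- sapply(seq_len(nrow(projects)), function(i) {
      sum(projects$cost <= projects$cost[i] &
          projects$benefit >= projects$benefit[i]) == 1   # dominated only by itself
    })
    frontier <- projects[on_frontier, ]
    frontier[order(frontier$cost), ]                      # candidate projects to fund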
Advanced Analytics Senior Consultant, IBM, Herndon, VA, November 2010 - July 2012
- Worked with a team to determine the best use of IBM’s analytical capabilities to help Aetna improve its business. Modified a SAS multiplicative regression model to be more flexible with data and to improve efficiency. Determined the important factors in improving care management efficiency for existing programs at Aetna.
- Supported JIEDDO using various analytical techniques, including the Analytic Hierarchy Process (AHP) and regression analysis (a small sketch of the AHP weight calculation appears after these bullets). Created and tested a metric to help support decision making for various groups working with JIEDDO. Improved existing products in Excel and Access using SAS code. Created SAS Stored Processes to help streamline report generation. Improved raw data cleansing and formatting using regular expression parsing. Streamlined a process to parse XML files and create new databases from the results. Developed SAS Stored Processes to support business intelligence and analytics. Designed a database to enhance reporting and help determine an optimal solution to a resource allocation problem.
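A small R sketch of the AHP priority-weight calculation referenced above. The criteria and pairwise comparison values are invented for illustration, not taken from the JIEDDO work.

    # Invented 3x3 pairwise comparison matrix on Saaty's 1-9 scale
    criteria <- c("Effectiveness", "Cost", "Timeliness")
    A <- matrix(c(1,   3,   5,
                  1/3, 1,   2,
                  1/5, 1/2, 1),
                nrow = 3, byrow = TRUE, dimnames = list(criteria, criteria))

    ev      <- eigen(A)
    weights <- Re(ev$vectors[, 1]) / sum(Re(ev$vectors[, 1]))  # priority weights

    # Consistency ratio; CR < 0.10 is the usual acceptance rule
    lambda_max <- Re(ev$values[1])
    CI <- (lambda_max - nrow(A)) / (nrow(A) - 1)
    CR <- CI / 0.58                                            # random index for n = 3
    round(weights, 3)
    round(CR, 3)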
Senior Analytic Consultant, Epsilon, Wakefield, MA, August 2012 - June 2014
- Provided support on multiple business development opportunities that required expertise in optimization, machine learning, and/or data mining techniques.
- Assisted in model development by providing input on various techniques, including integer optimization, Bayesian networks, sentiment analysis, social network analysis, and natural language processing.
- Supported the client by improving data analysis techniques through SQL and DATA step processing in SAS. Improved processing speed by maximizing SQL pass-through to Netezza or writing native Netezza SQL. Decreased query run times in SAS by utilizing Netezza Analytic functions and Netezza iClass functions.
- Evaluated several internal algorithms for determining the sentiment of tweets in order to pick the best one to use across multiple clients. Developed a semi-supervised learning algorithm to efficiently improve accuracy in scoring sentiment (a self-training sketch appears after these bullets).
- Collaborated in building a Bayesian network on large data sources in Netezza to predict the likelihood of someone closing an account. Compared the model results and modeling process against logistic regression to determine the best modeling approach.
- Provided new designs for displaying model results and data analysis. Connected data analysis and model results from SAS and Netezza to Data-Driven Documents (D3). Created dashboards in Tableau to effectively display multiple pieces of information in one space.
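A rough R sketch of a self-training loop of the kind referenced in the sentiment bullet above. This is not Epsilon's internal algorithm, and the numeric features f1 and f2 stand in for real text features of tweets.

    # Generic self-training: fit on labeled tweets, pseudo-label the most
    # confident unlabeled tweets, add them to the training set, and refit.
    set.seed(4)
    n <- 1000
    x <- data.frame(f1 = rnorm(n), f2 = rnorm(n))
    y <- rbinom(n, 1, plogis(1.5 * x$f1 - x$f2))        # 1 = positive sentiment
    labeled   <- 1:200                                   # small hand-labeled set
    unlabeled <- setdiff(1:n, labeled)

    train_x <- x[labeled, ]; train_y <- y[labeled]
    for (iter in 1:5) {
      fit  <- glm(train_y ~ ., data = train_x, family = binomial())
      prob <- predict(fit, x[unlabeled, ], type = "response")
      sure <- which(prob > 0.95 | prob < 0.05)           # high-confidence tweets
      if (length(sure) == 0) break
      train_x <- rbind(train_x, x[unlabeled[sure], ])    # pseudo-label and add them
      train_y <- c(train_y, as.integer(prob[sure] > 0.5))
      unlabeled <- unlabeled[-sure]
    }
    mean((predict(fit, x, type = "response") > 0.5) == y)  # rough accuracy check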
Lead Data Scientist, MITRE, Bedford, MA, July 2014 - Present
- Assisted in explaining Python/SAS code that cleans data from a PostgreSQL database. Investigated potential improvements by analyzing the time spent at a particular node in a discrete event simulation. Using a survival analysis approach in R, created an accelerated failure-time model to determine factors that increase the likelihood of spending time in that node. The program iterates through values of a categorical variable to subset the data, fit the model, assess its fit, visualize the results with ggplot2, and decide whether to keep the results given thresholds on AIC and a chi-squared statistic (a minimal version of this loop is sketched after these bullets). Code and results were written up in R Markdown for transition to the sponsor.
- Provide support using Python and PostgreSQL to analyze and prepare data.
- Trained a Naive Bayes classifier in SAS on a small training dataset. Updated a SAS macro to produce metrics on classifier performance and graph them in SAS.
- Working with senior leadership, gather and determine staffing event rates to use inside an agent-based model that forecasts staffing capacities for the next five years. Tableau ingests the model results to display the impact of various hiring policies and to show how uncertainty in the data can affect the capacity forecasts.
- Leading a small team, collected data from a local Health and Human Services agency to bring together and understand data quality issues that might exist in other HHS agencies. Used CARET (an R library) to identify which factors are most important in determining which children are at higher risk of fatality.
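A minimal R sketch of the subset-fit-screen loop described in the first MITRE bullet above. The variable names, simulated data, and screening thresholds are hypothetical, not the sponsor's.

    library(survival)
    library(ggplot2)

    # Simulated stand-in for the cleaned simulation data
    set.seed(5)
    n <- 600
    workload  <- rnorm(n)
    priority  <- factor(sample(c("low", "med", "high"), n, replace = TRUE))
    category  <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
    node_time <- rexp(n, rate = 0.2 * exp(0.8 * workload))  # time spent in the node
    event     <- rbinom(n, 1, 0.9)                          # 1 = exit observed
    dat <- data.frame(node_time, event, workload, priority, category)

    kept <- list()
    for (lvl in levels(dat$category)) {
      sub <- subset(dat, category == lvl)
      fit <- survreg(Surv(node_time, event) ~ workload + priority,
                     data = sub, dist = "weibull")          # accelerated failure time
      k        <- length(coef(fit)) - 1                     # covariate coefficients
      lr_p     <- pchisq(2 * diff(fit$loglik), k, lower.tail = FALSE)  # vs. null model
      aic_gain <- 2 * diff(fit$loglik) - 2 * k              # AIC(null) - AIC(fit)
      if (lr_p < 0.05 && aic_gain > 0) kept[[lvl]] <- fit   # illustrative screen
    }

    # Quick ggplot2 look at effect estimates from the retained models
    est <- do.call(rbind, lapply(names(kept), function(lvl)
      data.frame(category = lvl,
                 term     = names(coef(kept[[lvl]])),
                 estimate = unname(coef(kept[[lvl]])))))
    ggplot(est, aes(term, estimate)) +
      geom_col() + coord_flip() + facet_wrap(~ category)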