Chapter 2, Exercise 5.
Pitcher Strikeout / Walk Ratios
Analyzing Baseball Data with R, Introduction to R, page 58
(a) Read the Lahman "pitching.csv" data file into R as a data frame named Pitching.
Pitching <- read.csv("pitching.csv")
(b) The following function computes the cumulative strikeouts, cumulative walks, mid career year, and the total innings pitched (measured in terms of outs) for a pitcher whose season statistics are stored in the data frame d.
stats <- function(d){
  c.SO <- sum(d$SO, na.rm=TRUE)
  c.BB <- sum(d$BB, na.rm=TRUE)
  c.IPouts <- sum(d$IPouts, na.rm=TRUE)
  c.midYear <- median(d$yearID, na.rm=TRUE)
  data.frame(SO=c.SO, BB=c.BB, IPouts=c.IPouts, midYear=c.midYear)
}
Using the function ddply (plyr package) together with the function stats, find the career statistics for all pitchers in the pitching dataset. Call this new data frame career.pitching.
library(plyr)  # provides ddply

career.pitching <- ddply(Pitching, "playerID", stats)

# or
# Note the use of the '.' function to allow
# playerID to be used without quoting
career.pitching <- ddply(Pitching, .(playerID), stats)
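To see what ddply is doing in this step, the same split-apply-combine idea can be sketched in base R on a small made-up data frame. The data frame toy and its values are invented for illustration and are not Lahman data, and stats is repeated only so the block runs on its own. This is a rough equivalent, not how plyr is implemented.

```r
# stats() as defined in part (b), repeated so this block is self-contained
stats <- function(d) {
  c.SO <- sum(d$SO, na.rm = TRUE)
  c.BB <- sum(d$BB, na.rm = TRUE)
  c.IPouts <- sum(d$IPouts, na.rm = TRUE)
  c.midYear <- median(d$yearID, na.rm = TRUE)
  data.frame(SO = c.SO, BB = c.BB, IPouts = c.IPouts, midYear = c.midYear)
}

# Invented season lines for two hypothetical pitchers (not real data)
toy <- data.frame(
  playerID = c("aaa01", "aaa01", "aaa01", "bbb01", "bbb01"),
  yearID   = c(2001, 2002, 2003, 1990, 1991),
  SO       = c(100, 150, NA, 80, 90),
  BB       = c(40, 50, 30, 60, 30),
  IPouts   = c(500, 600, 300, 450, 480),
  stringsAsFactors = FALSE
)

# Split by pitcher, apply stats() to each piece, then stack the results
pieces <- split(toy, toy$playerID)
per.player <- lapply(pieces, stats)
toy.career <- do.call(rbind, per.player)

# Put playerID back as an ordinary column and drop the row names
toy.career <- cbind(playerID = names(per.player), toy.career)
rownames(toy.career) <- NULL

print(toy.career)
```

Each row of toy.career is one pitcher's career line. The missing SO value is skipped because stats sums with na.rm=TRUE, and midYear is the median of that pitcher's seasons.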
(c) Use the merge function to merge the Pitching and career.pitching data frames.
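# Merge on playerID only. Without 'by', merge would also match on the
# shared SO, BB, and IPouts columns and drop most season rows.
# merged.pitching is a name chosen here; the exercise does not specify one.
merged.pitching <- merge(Pitching, career.pitching, by = "playerID")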
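The choice of key columns matters in this merge. When no by argument is given, merge joins on every column name the two data frames share, and for these two data frames that includes SO, BB, and IPouts as well as playerID. The sketch below uses two small invented data frames (season and career are hypothetical names and values, not Lahman data) to show the difference.

```r
# Invented season-level rows for two hypothetical pitchers
season <- data.frame(
  playerID = c("aaa01", "aaa01", "bbb01"),
  yearID   = c(2001, 2002, 1990),
  SO       = c(100, 150, 80),
  stringsAsFactors = FALSE
)

# Matching career totals, one row per pitcher
career <- data.frame(
  playerID = c("aaa01", "bbb01"),
  SO       = c(250, 80),
  stringsAsFactors = FALSE
)

# Default keys: every shared column name (playerID and SO here), so only
# rows whose season SO equals the career SO survive
by.default <- merge(season, career)

# Explicit key: every season row is kept and gains the career total;
# the clashing SO columns become SO.x (season) and SO.y (career)
by.player <- merge(season, career, by = "playerID")

print(nrow(by.default))   # 1
print(names(by.player))   # "playerID" "yearID" "SO.x" "SO.y"
```

If the .x and .y suffixes are inconvenient, merge also accepts a suffixes argument for choosing them.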
(d) Use the subset function to construct a new data frame career.10000 consisting of data for only those pitchers with at least 10,000 career IPouts.
career.10000 <- subset(career.pitching, IPouts >= 10000)
(e) For the pitchers with at least 10,000 career IPouts, construct a scatterplot of mid career year and ratio of strikeouts to walks. Comment on the general pattern in this scatterplot.
with(career.10000, plot(midYear, SO/BB))
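To make the general pattern easier to see, one option is to overlay a smooth curve on the scatterplot. The sketch below does this with base R's lowess on invented numbers. toy.10000 and its values are hypothetical stand-ins for career.10000, so the shape of the curve here says nothing about the real data.

```r
# Invented career lines standing in for career.10000 (not real data)
toy.10000 <- data.frame(
  midYear = c(1905, 1920, 1935, 1950, 1965, 1980, 1995, 2005),
  SO      = c(1500, 1800, 1600, 2000, 2500, 2300, 2800, 3000),
  BB      = c(900, 1000, 800, 1100, 1000, 900, 950, 900)
)

# Strikeout-to-walk ratio for each pitcher
toy.10000$SO.BB <- with(toy.10000, SO / BB)

# Scatterplot with labelled axes
with(toy.10000, plot(midYear, SO.BB,
                     xlab = "Mid career year",
                     ylab = "Strikeouts per walk"))

# lowess() returns the smoothed x and y values; lines() draws them
fit <- with(toy.10000, lowess(midYear, SO.BB))
lines(fit, col = "red")
```

Running the same two steps on career.10000 (plot, then lines over a lowess fit) should show whether the ratio drifts up or down across eras.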