I have a data set of motor vehicle crashes happening daily in NYC from 1 Jan 2014 to 31 Dec 2014. I want to plot a time series of the number of injured cyclists and motorists, monthly, in a single plot.
My data looks like this:
date      time   location   cyclists injured  motorists injured
2014-1-1  12:05  bronx      0                 1
2014-1-1  12:34  bronx      1                 2
2014-1-2  6:05   bronx      0                 0
2014-1-3  8:01   bronx      1                 2
2014-1-3  12:05  manhattan  0                 1
2014-1-3  12:56  manhattan  0                 2
and so on till 31 Dec 2014.
Now, to plot a monthly time series of this, I understand that I first need to total each of the sums for each month, and then plot the monthly totals. I do not know how I can do this.
I used the aggregate function with the code below, but it gives me the sum for each day and not for each month. Please help.
cyclist <- aggregate(number.of.cyclist.injured ~ date, data = final_data,sum)
Thank you :)
Mannat, here is an answer using the data.table package to aggregate. Run install.packages("data.table") first to install it in R.
library(data.table)

# Others: I copied the data to a CSV file. Mannat, you will not need this step.
# Other helpers: look at the data in the Data section below.
final_data <- as.data.table(read.csv(file.path(mypath, "soaccidents.csv"), header = TRUE, stringsAsFactors = FALSE))

# Mannat, you need to convert your existing data.frame to a data.table
final_data <- as.data.table(final_data)

# Check the data formats: the dates are strings, so the date field is not a Date
str(final_data)
# Adjust the format string if your dates are stored differently
final_data$date <- as.Date(final_data$date, "%m/%d/%Y")

# Use data.table to aggregate on months
# First, add a field plotdate holding year and month as YYYYMM, e.g. 201401
final_data[, plotdate := as.numeric(format(date, "%Y%m"))]

# Key by plot date
setkeyv(final_data, "plotdate")

# Second, aggregate and label the columns
plotdata <- final_data[, .(cyclists.monthly = sum(cyclists.injured), motorists.monthly = sum(motorists.injured)), by = plotdate]

#    plotdate cyclists.monthly motorists.monthly
# 1:   201401                2                 8

# Now you can plot (this makes more sense with more data)
# For example, cyclists
plot(plotdata$plotdate, plotdata$cyclists.monthly)
Mannat, if you are not familiar with data.table, please see the cheatsheet.
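If you would rather stay with the aggregate function from the question, the same monthly totals can probably be obtained in base R by grouping on a month label instead of the date. This is only a minimal sketch: the toy values and the helper column `month` are made up for illustration, and the column names follow the Data section rather than the question's `number.of.cyclist.injured`.

```r
# Toy data in the same shape as the Data section (values are illustrative only)
final_data <- data.frame(
  date = c("01/01/2014", "01/01/2014", "1/19/2014", "02/03/2014"),
  cyclists.injured = c(0L, 1L, 0L, 2L),
  motorists.injured = c(1L, 2L, 1L, 3L),
  stringsAsFactors = FALSE
)

# Parse the strings into Date objects, then derive a "YYYY-MM" month label
final_data$date <- as.Date(final_data$date, format = "%m/%d/%Y")
final_data$month <- format(final_data$date, "%Y-%m")  # hypothetical helper column

# Sum both injury columns per month in a single aggregate() call
monthly <- aggregate(cbind(cyclists.injured, motorists.injured) ~ month,
                     data = final_data, FUN = sum)
print(monthly)
```

The only change from the attempt in the question is the grouping variable: `~ month` instead of `~ date`, which is why the sums come out per month rather than per day.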
Data
For others looking to work on this, here is the result of dput:
final_data <- data.table(
  date = c("01/01/2014", "01/01/2014", "01/01/2014", "01/01/2014", "1/19/2014", "1/19/2014"),
  time = c("12:05", "12:34", "06:05", "08:01", "12:05", "12:56"),
  location = c("bronx", "bronx", "bronx", "bronx", "manhattan", "manhattan"),
  cyclists.injured = c(0L, 1L, 0L, 1L, 0L, 0L),
  motorists.injured = c(1L, 2L, 0L, 2L, 1L, 2L)
)
Plots
Either use the ggplot2 package, or for base plots please see "Plot multiple lines (data series) each with unique color in R" for plotting help.
# I do not have the full data, and line charts do not work with 1 point
# I needed another month for testing, so I added a fake February
testfeb <- data.table(plotdate = 201402, cyclists.monthly = 4, motorists.monthly = 10)
plotdata <- rbindlist(list(plotdata, testfeb))

#    plotdate cyclists.monthly motorists.monthly
# 1:   201401                2                 8
# 2:   201402                4                10

# Plot code; modify the limits as you see fit
plot(1, type = "n", xlim = c(201401, 201412), ylim = c(0, max(plotdata$motorists.monthly)), ylab = "monthly accidents", xlab = "months")
lines(plotdata$plotdate, plotdata$motorists.monthly, col = "blue")
lines(plotdata$plotdate, plotdata$cyclists.monthly, col = "red")

# Add a legend
legend(x = "topright", legend = c("motorists", "cyclists"), lty = c(1, 1), lwd = c(2.5, 2.5), col = c("blue", "red"))
# Or set the legend's x position to e.g. "bottom" or "bottomleft"
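For the ggplot2 route mentioned above, something along these lines should work. It is a rough sketch, not tested against the real crash data: the monthly totals are made up, and the names `long`, `month`, `series` and `injured` are just ones I picked for illustration.

```r
library(ggplot2)

# Monthly totals in the shape produced by the aggregation step
# (values are made up for illustration)
plotdata <- data.frame(
  plotdate = c(201401, 201402, 201403),
  cyclists.monthly = c(2, 4, 3),
  motorists.monthly = c(8, 10, 7)
)

# ggplot2 works best with long data: one row per month and series.
# YYYYMM is turned into the first day of the month so the x axis is a real date.
long <- data.frame(
  month = as.Date(paste0(rep(plotdata$plotdate, 2), "01"), format = "%Y%m%d"),
  series = rep(c("cyclists", "motorists"), each = nrow(plotdata)),
  injured = c(plotdata$cyclists.monthly, plotdata$motorists.monthly)
)

# One coloured line per series, with the legend generated automatically
p <- ggplot(long, aes(x = month, y = injured, colour = series)) +
  geom_line() +
  geom_point() +
  labs(x = "Month", y = "Monthly injuries", colour = NULL)
print(p)
```

Mapping `series` to `colour` gives you both lines and the legend in one go. Using a Date on the x axis instead of the numeric YYYYMM value also keeps the months evenly spaced if your data ever runs past December.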