STAT 412/612 Week 12: Homework
forcats and lubridate
YOUR NAME
2020-04-10
Instructions
- Submit your R Markdown file and your PDF, knitted directly from R Markdown, on Blackboard. Only include the necessary code, not any extraneous code, to answer the questions.
- Learning outcomes:
– Manipulate factors with forcats.
– Manipulate dates with lubridate.
Question 1: Capital Bikeshare Data
1. Load in the data containing trip information from the Capital Bikeshare program. Also load in the station information. Rename variables that have spaces in their names.
- trip data
- station data
Note: These data were originally from http://data.codefordc.org/group/transportation.
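For the renaming step in item 1, one possible approach is sketched below on a toy tibble. The column names `Start date` and `Member Type` are assumptions made for illustration, not the actual headers of the trip file.

```r
library(dplyr)
library(stringr)

# Toy stand-in for the trip data; these column names are hypothetical.
trips_toy <- tibble(
  `Start date`  = c("3/31/2016 23:59", "3/31/2016 23:57"),
  `Member Type` = c("Registered", "Casual")
)

# Replace every space with an underscore and lower-case each name.
trips_clean <- trips_toy %>%
  rename_with(~ str_to_lower(str_replace_all(.x, " ", "_")))

names(trips_clean)
```

Renaming all columns with one function avoids typing backtick-quoted names one at a time, although `rename()` with explicit names works just as well for a handful of columns.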
2. Parse the date-time information from the trip data. Recall that the times are recorded in the America/New_York time zone, not UTC. Specify that in your parser.
## # A tibble: 6 x 9
##   duration start_time          end_time            start_station_n~
##      <dbl> <dttm>              <dttm>                         <dbl>
## 1   301295 2016-03-31 23:59:00 2016-04-01 00:04:00            31280
## 2   557887 2016-03-31 23:59:00 2016-04-01 00:08:00            31275
## 3   555944 2016-03-31 23:59:00 2016-04-01 00:08:00            31101
## 4   766916 2016-03-31 23:57:00 2016-04-01 00:09:00            31226
## 5   139656 2016-03-31 23:57:00 2016-03-31 23:59:00            31011
## 6   967713 2016-03-31 23:57:00 2016-04-01 00:13:00            31266
## # ... with 5 more variables: start_station <chr>, end_station_number <dbl>,
## #   end_station <chr>, bike_number <chr>, member_type <chr>
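As a minimal sketch of specifying the time zone in a lubridate parser: the raw strings below are hypothetical, and the month/day/year layout is an assumption, so the parser function may need to change to match the real file.

```r
library(lubridate)

# Hypothetical raw strings; the real file may use a different layout.
raw <- c("3/31/2016 23:59", "4/1/2016 0:04")

# mdy_hm() reads month/day/year hour:minute. The tz argument tells the
# parser which zone the clock times belong to, instead of the default UTC.
parsed <- mdy_hm(raw, tz = "America/New_York")

parsed
tz(parsed)
```

Passing `tz` at parse time differs from calling `with_tz()` afterwards: the former interprets the clock time as New York time, while the latter would first read it as UTC and then shift the displayed time.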
3. Calculate the average number of trips for each weekday (Sunday, Monday, Tuesday, ...), counting only days that have trips. There are several days with no trips.
- Save the resulting days of week and corresponding average number of trips as a data frame called sumdf and print it out.
- It should look like this:
```
## # A tibble: 7 x 2
##   wday  mean_num_trips
##   <ord>          <dbl>
## 1 Sun            5111.
## 2 Mon            6057.
## 3 Tue            6617.
## 4 Wed            6846.
## 5 Thu            7309.
## 6 Fri            6358.
## 7 Sat            6027
```
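The two-stage idea behind this question (count trips per calendar day, then average those counts within each weekday) can be sketched on toy data. The six timestamps below are made up; only the column name `start_time` comes from the output shown earlier.

```r
library(dplyr)
library(lubridate)

# Toy trips: 3 on Sunday 2016-04-03, 1 on Sunday 2016-04-10,
# and 1 on Monday 2016-04-04. These values are invented.
toy <- tibble(
  start_time = ymd_hm(
    c("2016-04-03 08:00", "2016-04-03 09:00", "2016-04-03 10:00",
      "2016-04-10 08:00", "2016-04-04 08:00"),
    tz = "America/New_York"
  )
)

toy_summary <- toy %>%
  # Stage 1: one row per calendar day that has at least one trip.
  count(day = as_date(start_time), name = "num_trips") %>%
  # Stage 2: average the daily counts within each weekday.
  group_by(wday = wday(day, label = TRUE)) %>%
  summarize(mean_num_trips = mean(num_trips))

toy_summary
```

Because `count()` only produces rows for days that appear in the data, days with no trips never enter the average, which matches the "given the day has trips" condition.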
4. Reproduce this plot in R:
5. In a stunning show of contempt, the IEEE Computer Society decided to add a new weekday called “Fooday” with abbreviation “Foo”. Fooday was declared the first day of the week (ahead of Sunday).
On the first Fooday ever, people used Capital Bikeshare in record numbers, yielding 15567 trips. Add Fooday as the first level to the wday variable in sumdf and add its average number of trips (now 15567 since there has only been one Fooday so far).
Hint: Create a new data frame that contains the Fooday trips and use bind_rows().
Your final data frame should look like this:
```
## # A tibble: 8 x 2
##   wday  mean_num_trips
##   <fct>          <dbl>
## 1 Foo            15567
## 2 Sun            5111.
## 3 Mon            6057.
## 4 Tue            6617.
## 5 Wed            6846.
## 6 Thu            7309.
## 7 Fri            6358.
## 8 Sat            6027
```
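One way to follow the `bind_rows()` hint is sketched here on a two-row stand-in for sumdf. The names `toy_sumdf`, `foo_df`, and `combined` are hypothetical, and the Sunday and Monday values are invented; only the Fooday count of 15567 comes from the question.

```r
library(dplyr)
library(forcats)

# Toy stand-in for sumdf with two weekdays and made-up averages.
toy_sumdf <- tibble(
  wday = factor(c("Sun", "Mon"), levels = c("Sun", "Mon"), ordered = TRUE),
  mean_num_trips = c(10, 20)
)

# New one-row data frame holding the Fooday observation.
foo_df <- tibble(wday = "Foo", mean_num_trips = 15567)

# Convert the ordered factor to character so both wday columns share a
# type, put the Fooday row first, then rebuild the factor with levels
# in order of appearance.
combined <- bind_rows(foo_df, mutate(toy_sumdf, wday = as.character(wday)))
combined <- mutate(combined, wday = fct_inorder(wday))

levels(combined$wday)
```

This relies on the rows already being in weekday order, so that `fct_inorder()` reproduces the intended level order. An alternative under the same assumptions would be to add the level with `fct_expand()` and move it to the front with `fct_relevel()`.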
6. In another stunning show of contempt, the IEEE Computer Society decided to change the abbreviations from three letters to two letters. Change the levels of wday so that each day uses only two-letter abbreviations. Your final data frame should look like this:
## # A tibble: 8 x 2
##   wday  mean_num_trips
##   <fct>          <dbl>
## 1 Fo             15567
## 2 Su             5111.
## 3 Mo             6057.
## 4 Tu             6617.
## 5 We             6846.
## 6 Th             7309.
## 7 Fr             6358.
## 8 Sa             6027
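Relabeling every level with a function is one way to shorten the abbreviations. The sketch below uses a three-level toy factor (`wday_toy` is a hypothetical name) and truncates each label to its first two characters.

```r
library(forcats)

# Toy factor with three of the eight levels.
wday_toy <- factor(c("Foo", "Sun", "Mon"), levels = c("Foo", "Sun", "Mon"))

# fct_relabel() applies the function to the levels, not the values,
# so the level order is preserved.
wday_two <- fct_relabel(wday_toy, ~ substr(.x, 1, 2))

levels(wday_two)
```

Truncation works here because the first two letters of the eight abbreviations are all distinct; if two levels collapsed to the same label, `fct_relabel()` would merge them.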
7. In the stations data frame, it seems that installDate is populated by the number of milliseconds since January 1, 1970, 00:00:00 (in the America/New_York time zone). Parse this into a date-time and make a histogram of the install dates. It should look something like this:
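Separately from the target plot, the parsing step can be sketched as follows. The millisecond values are invented, and the origin is taken literally from the question as midnight in New York; if the counts were instead measured from midnight UTC, `as_datetime(install_ms / 1000, tz = "America/New_York")` would be the analogous call.

```r
library(lubridate)

# Hypothetical millisecond counts; real installDate values will differ.
install_ms <- c(0, 86400000, 1285000000000)

# Origin as stated in the question: 1970-01-01 00:00:00, New York time.
origin_ny <- ymd_hms("1970-01-01 00:00:00", tz = "America/New_York")

# Convert milliseconds to seconds, then add them to the origin.
install_time <- origin_ny + dseconds(install_ms / 1000)

install_time
```

The resulting date-time vector can then be handed to a histogram function, for example ggplot2's `geom_histogram()`.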