- https://github.com/papers-we-love/papers-we-love/blob/master/data_science/tidy_data.pdf
- Hadley Wickham
- Data Structure
○ A dataset is a collection of values, usually either numbers or strings
○ Every value belongs to a variable and an observation.
§ A variable contains all values that measure the same underlying attribute (e.g. height, duration) across units
§ An observation contains all values measured on the same unit (e.g. a person, a day) across attributes
- In tidy data:
○ Each variable forms a column.
○ Each observation forms a row.
○ Each type of observational unit forms a table.
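A minimal sketch of the first two rules in plain Python, assuming a wide layout where the week number is stored in the column names. The `wk1`..`wk3` column names and the `melt` helper are hypothetical names chosen for illustration; the rank values are taken from the Billboard example in these notes.

```python
# Wide (untidy) layout: the week number lives in the column names,
# so a single row holds several observations.
wide = [
    {"artist": "2 Pac", "track": "Baby Don’t Cry", "wk1": 87, "wk2": 82, "wk3": 72},
    {"artist": "3 Doors Down", "track": "Kryptonite", "wk1": 81, "wk2": 70, "wk3": 68},
]


def melt(rows, id_cols, prefix, var_name, value_name):
    """Hypothetical helper: turn every column starting with `prefix` into its own row."""
    tidy = []
    for row in rows:
        for col, value in row.items():
            if col in id_cols or not col.startswith(prefix):
                continue
            if value is None:  # no measurement recorded for this week
                continue
            record = {c: row[c] for c in id_cols}
            record[var_name] = int(col[len(prefix):])  # "wk2" -> 2
            record[value_name] = value
            tidy.append(record)
    return tidy


# Tidy layout: one column per variable, one row per (song, week) observation.
tidy = melt(wide, id_cols=["artist", "track"], prefix="wk",
            var_name="week", value_name="rank")
for record in tidy:
    print(record)
```

After melting, `week` and `rank` are ordinary columns, so filtering or aggregating by week no longer depends on parsing column names.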
- Example:
year  artist        time  track                    date        week  rank
2000  2 Pac         4:22  Baby Don’t Cry           2000-02-26  1     87
2000  2 Pac         4:22  Baby Don’t Cry           2000-03-04  2     82
2000  2 Pac         4:22  Baby Don’t Cry           2000-03-11  3     72
2000  2 Pac         4:22  Baby Don’t Cry           2000-03-18  4     77
2000  2 Pac         4:22  Baby Don’t Cry           2000-03-25  5     87
2000  2 Pac         4:22  Baby Don’t Cry           2000-04-01  6     94
2000  2 Pac         4:22  Baby Don’t Cry           2000-04-08  7     99
2000  2Ge+her       3:15  The Hardest Part Of ...  2000-09-02  1     91
2000  2Ge+her       3:15  The Hardest Part Of ...  2000-09-09  2     87
2000  2Ge+her       3:15  The Hardest Part Of ...  2000-09-16  3     92
2000  3 Doors Down  3:53  Kryptonite               2000-04-08  1     81
2000  3 Doors Down  3:53  Kryptonite               2000-04-15  2     70
2000  3 Doors Down  3:53  Kryptonite               2000-04-22  3     68
2000  3 Doors Down  3:53  Kryptonite               2000-04-29  4     67
2000  3 Doors Down  3:53  Kryptonite               2000-05-06  5     66
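The example still repeats year, artist, time and track on every row, because it mixes two observational units: the song and its rank in a given week. A rough sketch of how the third rule (one table per type of observational unit) could be applied, using a few rows copied from the example. The `song_id` key and the variable names are assumptions made for illustration only.

```python
# A few rows copied from the example table:
# (year, artist, time, track, date, week, rank)
rows = [
    (2000, "2 Pac", "4:22", "Baby Don’t Cry", "2000-02-26", 1, 87),
    (2000, "2 Pac", "4:22", "Baby Don’t Cry", "2000-03-04", 2, 82),
    (2000, "3 Doors Down", "3:53", "Kryptonite", "2000-04-08", 1, 81),
    (2000, "3 Doors Down", "3:53", "Kryptonite", "2000-04-15", 2, 70),
]

songs = {}  # (artist, track) -> song record, one per distinct song
ranks = []  # one record per (song, week) observation

for year, artist, time, track, date, week, rank in rows:
    key = (artist, track)
    if key not in songs:
        # song_id is a hypothetical surrogate key linking the two tables
        songs[key] = {
            "song_id": len(songs) + 1,
            "year": year,
            "artist": artist,
            "time": time,
            "track": track,
        }
    ranks.append({
        "song_id": songs[key]["song_id"],
        "date": date,
        "week": week,
        "rank": rank,
    })

song_table = list(songs.values())

print(song_table)
print(ranks)
```

Each song fact is now stored once, and the rank table carries only the key plus the values that change from week to week.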