R for Biologists
The tidyverse is a collection of R packages aimed at data scientists. The packages share the same "grammar" and so work seamlessly together. They form a powerful suite of functions that has quickly become an essential toolkit for biologists too.
The core packages include ggplot2, dplyr, tidyr, readr, purrr, tibble, stringr and forcats.
Today we will run through a small project that will help us get familiar with the tidyverse. To do so, we will use some of the available datasets from datacarpentry.
Let's get started by creating a new notebook! Let's call it "Mammal Body Sizes".
The very first thing we have to do when starting any project is to set the working directory! When working with a notebook, this will automatically be the folder where the notebook is saved.
# set the working directory (adjust the path to your own machine)
setwd("~/Documents/git_projects/RforBiologists_UDSM/")
# to check where you currently are:
getwd()
## [1] "/Users/christophliedtke/Documents/git_projects/RforBiologists_UDSM"
We can also use shortcuts and relative paths instead of writing out the full path:
# home directory
"~/"
# one directory back (the parent of the working directory)
"../"
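As a minimal sketch of how R resolves these shortcuts, the following uses only base R functions. The folder names ("project", "notebooks") are made up, and a temporary directory is used so that nothing on your machine is changed.

```r
# Base R only; the directories are temporary and purely illustrative.
old_wd <- getwd()                        # remember where we started
demo_dir <- file.path(tempdir(), "project", "notebooks")
dir.create(demo_dir, recursive = TRUE, showWarnings = FALSE)
setwd(demo_dir)

# "../" is resolved relative to the current working directory
parent <- normalizePath("..")
basename(parent)

# "~" is expanded to your home directory
home <- path.expand("~")
home

setwd(old_wd)                            # go back to where we were
```

Because relative paths depend on the working directory, it is worth checking getwd() whenever a file cannot be found.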
Usually we will work with functions from outside the base R packages, such as those in the tidyverse, so we need to load them. If the packages are already installed, we can simply "attach" them with library(); if not, we have to install them first with install.packages().
library("tidyverse")
library("readxl")
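One possible way to check what still needs installing is sketched below. The helper missing_packages() is hypothetical (it is not part of any package) and relies only on the base R function requireNamespace(). The install line is left commented out because it needs an internet connection.

```r
# Hypothetical helper: which of these packages are not yet installed?
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("tidyverse", "readxl"))

if (length(to_install) > 0) {
  # run this line once, with an internet connection:
  # install.packages(to_install)
  message("Not yet installed: ", toString(to_install))
}
```

Installing is only needed once per computer, whereas library() has to be called in every new R session.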
Your data will usually be stored in some sort of rectangular format, usually a table! You will tend to store it as an Excel spreadsheet, or as a .csv or .tsv file. Excel is best avoided, because it usually comes with a lot of formatting that can sometimes cause problems. It also requires an extra library. For completeness, we will look at an Excel example too.
Let's load our first dataset, which is "tab-separated data".
mammals<-read_tsv(file = "../data/mammals_anage.tsv")
# print the first few rows of this table:
mammals
## # A tibble: 1,349 × 27
## Order Family Genus Species `Common name` Female maturity (day…¹
## <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 Afrosoricida Tenrecidae Echi… telfai… Lesser hedge… 365
## 2 Afrosoricida Tenrecidae Geog… aurita Large-eared … NA
## 3 Afrosoricida Tenrecidae Hemi… semisp… Streaked ten… 35
## 4 Afrosoricida Tenrecidae Micr… dobsoni Dobson's shr… 669
## 5 Afrosoricida Tenrecidae Micr… talaza… Talazac's sh… 639
## 6 Afrosoricida Tenrecidae Seti… setosus Greater hedg… 198
## 7 Afrosoricida Tenrecidae Tenr… ecauda… Tailess tenr… 182
## 8 Artiodactyla Antilocaprid… Anti… americ… Pronghorn 547
## 9 Artiodactyla Bovidae Addax nasoma… Addax 1065
## 10 Artiodactyla Bovidae Aepy… melamp… Impala 456
## # ℹ 1,339 more rows
## # ℹ abbreviated name: ¹`Female maturity (days)`
## # ℹ 21 more variables: `Male maturity (days)` <dbl>,
## # `Gestation/Incubation (days)` <dbl>, `Weaning (days)` <dbl>,
## # `Litter/Clutch size` <dbl>, `Litters/Clutches per year` <dbl>,
## # `Inter-litter/Interbirth interval` <dbl>, `Birth weight (g)` <dbl>,
## # `Weaning weight (g)` <dbl>, `Adult weight (g)` <dbl>, …
What kind of data do we see? How many variables (columns) are there and how many entries (rows)?
This is a "tibble" (a table on steroids!)
Structure: 1,349 rows (entries) and 27 columns (variables) of different data types
Biological data
Some columns are strings (character), others are numeric (double)
Messy, with lots of missing data (NA)
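These questions can also be answered with code. The sketch below uses a small toy tibble with made-up values (not the AnAge table), so it runs on its own; the same calls should work on any tibble.

```r
library(dplyr)

# toy data with the same kinds of columns as the mammal table (values are made up)
toy <- tibble(
  Genus = c("Aus", "Bus", "Cus"),
  `Adult weight (g)` = c(12.5, NA, 300)
)

dim(toy)             # number of rows, then number of columns
glimpse(toy)         # one line per column: name, type and first values
colSums(is.na(toy))  # number of missing values in each column
```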
In the same way we can load "comma separated" tables.
dat2<-read_csv("../data/mammal_life_history.csv")
Excel tables require an extra package (readxl).
library(readxl)
dat3<-read_xlsx("../data/bird_sizes.xlsx")
# printing a whole variable
mammals$Family
## [1] "Tenrecidae" "Tenrecidae" "Tenrecidae"
## [4] "Tenrecidae" "Tenrecidae" "Tenrecidae"
## [7] "Tenrecidae" "Antilocapridae" "Bovidae"
## [10] "Bovidae" "Bovidae" "Bovidae"
## [13] "Bovidae" "Bovidae" "Bovidae"
## [16] "Bovidae" "Bovidae" "Bovidae"
## [19] "Bovidae" "Bovidae" "Bovidae"
## [22] "Bovidae" "Bovidae" "Bovidae"
## [25] "Bovidae" "Bovidae" "Bovidae"
## [28] "Bovidae" "Bovidae" "Bovidae"
## [31] "Bovidae" "Bovidae" "Bovidae"
## [34] "Bovidae" "Bovidae" "Bovidae"
## [37] "Bovidae" "Bovidae" "Bovidae"
## [40] "Bovidae" "Bovidae" "Bovidae"
## [43] "Bovidae" "Bovidae" "Bovidae"
## [46] "Bovidae" "Bovidae" "Bovidae"
## [49] "Bovidae" "Bovidae" "Bovidae"
## [52] "Bovidae" "Bovidae" "Bovidae"
## [55] "Bovidae" "Bovidae" "Bovidae"
## [58] "Bovidae" "Bovidae" "Bovidae"
## [61] "Bovidae" "Bovidae" "Bovidae"
## [64] "Bovidae" "Bovidae" "Bovidae"
## [67] "Bovidae" "Bovidae" "Bovidae"
## [70] "Bovidae" "Bovidae" "Bovidae"
## [73] "Bovidae" "Bovidae" "Bovidae"
## [76] "Bovidae" "Bovidae" "Bovidae"
## [79] "Bovidae" "Bovidae" "Bovidae"
## [82] "Bovidae" "Bovidae" "Bovidae"
## [85] "Bovidae" "Bovidae" "Bovidae"
## [88] "Bovidae" "Bovidae" "Bovidae"
## [91] "Bovidae" "Bovidae" "Bovidae"
## [94] "Bovidae" "Bovidae" "Bovidae"
## [97] "Bovidae" "Bovidae" "Bovidae"
## [100] "Bovidae" "Bovidae" "Bovidae"
## [103] "Bovidae" "Bovidae" "Bovidae"
## [106] "Bovidae" "Bovidae" "Bovidae"
## [109] "Bovidae" "Bovidae" "Bovidae"
## [112] "Bovidae" "Bovidae" "Bovidae"
## [115] "Bovidae" "Bovidae" "Bovidae"
## [118] "Camelidae" "Camelidae" "Camelidae"
## [121] "Camelidae" "Camelidae" "Camelidae"
## [124] "Cervidae" "Cervidae" "Cervidae"
## [127] "Cervidae" "Cervidae" "Cervidae"
## [130] "Cervidae" "Cervidae" "Cervidae"
## [133] "Cervidae" "Cervidae" "Cervidae"
## [136] "Cervidae" "Cervidae" "Cervidae"
## [139] "Cervidae" "Cervidae" "Cervidae"
## [142] "Cervidae" "Cervidae" "Cervidae"
## [145] "Cervidae" "Cervidae" "Cervidae"
## [148] "Cervidae" "Cervidae" "Cervidae"
## [151] "Cervidae" "Cervidae" "Cervidae"
## [154] "Cervidae" "Cervidae" "Cervidae"
## [157] "Cervidae" "Giraffidae" "Giraffidae"
## [160] "Hippopotamidae" "Hippopotamidae" "Moschidae"
## [163] "Moschidae" "Moschidae" "Suidae"
## [166] "Suidae" "Suidae" "Suidae"
## [169] "Suidae" "Suidae" "Suidae"
## [172] "Suidae" "Suidae" "Tayassuidae"
## [175] "Tayassuidae" "Tayassuidae" "Tragulidae"
## [178] "Tragulidae" "Tragulidae" "Tragulidae"
## [181] "Ailuridae" "Canidae" "Canidae"
## [184] "Canidae" "Canidae" "Canidae"
## [187] "Canidae" "Canidae" "Canidae"
## [190] "Canidae" "Canidae" "Canidae"
## [193] "Canidae" "Canidae" "Canidae"
## [196] "Canidae" "Canidae" "Canidae"
## [199] "Canidae" "Canidae" "Canidae"
## [202] "Canidae" "Canidae" "Canidae"
## [205] "Canidae" "Canidae" "Canidae"
## [208] "Canidae" "Canidae" "Canidae"
## [211] "Canidae" "Canidae" "Canidae"
## [214] "Eupleridae" "Eupleridae" "Eupleridae"
## [217] "Eupleridae" "Felidae" "Felidae"
## [220] "Felidae" "Felidae" "Felidae"
## [223] "Felidae" "Felidae" "Felidae"
## [226] "Felidae" "Felidae" "Felidae"
## [229] "Felidae" "Felidae" "Felidae"
## [232] "Felidae" "Felidae" "Felidae"
## [235] "Felidae" "Felidae" "Felidae"
## [238] "Felidae" "Felidae" "Felidae"
## [241] "Felidae" "Felidae" "Felidae"
## [244] "Felidae" "Felidae" "Felidae"
## [247] "Felidae" "Felidae" "Felidae"
## [250] "Felidae" "Felidae" "Herpestidae"
## [253] "Herpestidae" "Herpestidae" "Herpestidae"
## [256] "Herpestidae" "Herpestidae" "Herpestidae"
## [259] "Herpestidae" "Herpestidae" "Herpestidae"
## [262] "Herpestidae" "Herpestidae" "Herpestidae"
## [265] "Herpestidae" "Herpestidae" "Herpestidae"
## [268] "Herpestidae" "Herpestidae" "Hyaenidae"
## [271] "Hyaenidae" "Hyaenidae" "Hyaenidae"
## [274] "Mephitidae" "Mephitidae" "Mephitidae"
## [277] "Mephitidae" "Mephitidae" "Mustelidae"
## [280] "Mustelidae" "Mustelidae" "Mustelidae"
## [283] "Mustelidae" "Mustelidae" "Mustelidae"
## [286] "Mustelidae" "Mustelidae" "Mustelidae"
## [289] "Mustelidae" "Mustelidae" "Mustelidae"
## [292] "Mustelidae" "Mustelidae" "Mustelidae"
## [295] "Mustelidae" "Mustelidae" "Mustelidae"
## [298] "Mustelidae" "Mustelidae" "Mustelidae"
## [301] "Mustelidae" "Mustelidae" "Mustelidae"
## [304] "Mustelidae" "Mustelidae" "Mustelidae"
## [307] "Mustelidae" "Mustelidae" "Mustelidae"
## [310] "Mustelidae" "Mustelidae" "Mustelidae"
## [313] "Mustelidae" "Mustelidae" "Mustelidae"
## [316] "Mustelidae" "Mustelidae" "Nandiniidae"
## [319] "Odobenidae" "Otariidae" "Otariidae"
## [322] "Otariidae" "Otariidae" "Otariidae"
## [325] "Otariidae" "Otariidae" "Otariidae"
## [328] "Otariidae" "Otariidae" "Otariidae"
## [331] "Otariidae" "Phocidae" "Phocidae"
## [334] "Phocidae" "Phocidae" "Phocidae"
## [337] "Phocidae" "Phocidae" "Phocidae"
## [340] "Phocidae" "Phocidae" "Phocidae"
## [343] "Phocidae" "Phocidae" "Phocidae"
## [346] "Phocidae" "Phocidae" "Phocidae"
## [349] "Phocidae" "Procyonidae" "Procyonidae"
## [352] "Procyonidae" "Procyonidae" "Procyonidae"
## [355] "Procyonidae" "Procyonidae" "Procyonidae"
## [358] "Procyonidae" "Ursidae" "Ursidae"
## [361] "Ursidae" "Ursidae" "Ursidae"
## [364] "Ursidae" "Ursidae" "Ursidae"
## [367] "Viverridae" "Viverridae" "Viverridae"
## [370] "Viverridae" "Viverridae" "Viverridae"
## [373] "Viverridae" "Viverridae" "Viverridae"
## [376] "Viverridae" "Viverridae" "Viverridae"
## [379] "Viverridae" "Viverridae" "Viverridae"
## [382] "Viverridae" "Viverridae" "Viverridae"
## [385] "Viverridae" "Balaenidae" "Balaenidae"
## [388] "Balaenidae" "Balaenopteridae" "Balaenopteridae"
## [391] "Balaenopteridae" "Balaenopteridae" "Balaenopteridae"
## [394] "Balaenopteridae" "Delphinidae" "Delphinidae"
## [397] "Delphinidae" "Delphinidae" "Delphinidae"
## [400] "Delphinidae" "Delphinidae" "Delphinidae"
## [403] "Delphinidae" "Delphinidae" "Delphinidae"
## [406] "Delphinidae" "Delphinidae" "Delphinidae"
## [409] "Delphinidae" "Delphinidae" "Delphinidae"
## [412] "Delphinidae" "Delphinidae" "Delphinidae"
## [415] "Eschrichtiidae" "Hyperoodontidae" "Hyperoodontidae"
## [418] "Hyperoodontidae" "Hyperoodontidae" "Hyperoodontidae"
## [421] "Iniidae" "Kogiidae" "Lipotidae"
## [424] "Monodontidae" "Monodontidae" "Phocoenidae"
## [427] "Phocoenidae" "Phocoenidae" "Physeteridae"
## [430] "Platanistidae" "Pontoporiidae" "Emballonuridae"
## [433] "Emballonuridae" "Emballonuridae" "Hipposideridae"
## [436] "Megadermatidae" "Megadermatidae" "Miniopteridae"
## [439] "Miniopteridae" "Molossidae" "Molossidae"
## [442] "Molossidae" "Molossidae" "Mystacinidae"
## [445] "Noctilionidae" "Phyllostomidae" "Phyllostomidae"
## [448] "Phyllostomidae" "Phyllostomidae" "Phyllostomidae"
## [451] "Phyllostomidae" "Phyllostomidae" "Phyllostomidae"
## [454] "Phyllostomidae" "Phyllostomidae" "Phyllostomidae"
## [457] "Phyllostomidae" "Phyllostomidae" "Phyllostomidae"
## [460] "Pteropodidae" "Pteropodidae" "Pteropodidae"
## [463] "Pteropodidae" "Pteropodidae" "Pteropodidae"
## [466] "Pteropodidae" "Pteropodidae" "Pteropodidae"
## [469] "Pteropodidae" "Pteropodidae" "Pteropodidae"
## [472] "Pteropodidae" "Pteropodidae" "Pteropodidae"
## [475] "Pteropodidae" "Pteropodidae" "Pteropodidae"
## [478] "Pteropodidae" "Pteropodidae" "Pteropodidae"
## [481] "Pteropodidae" "Pteropodidae" "Pteropodidae"
## [484] "Pteropodidae" "Pteropodidae" "Rhinolophidae"
## [487] "Rhinolophidae" "Rhinolophidae" "Rhinolophidae"
## [490] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [493] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [496] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [499] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [502] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [505] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [508] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [511] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [514] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [517] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [520] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [523] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [526] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [529] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [532] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [535] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [538] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [541] "Vespertilionidae" "Vespertilionidae" "Vespertilionidae"
## [544] "Dasypodidae" "Dasypodidae" "Dasypodidae"
## [547] "Dasypodidae" "Dasypodidae" "Dasypodidae"
## [550] "Dasypodidae" "Dasypodidae" "Dasypodidae"
## [553] "Dasypodidae" "Dasypodidae" "Dasypodidae"
## [556] "Dasypodidae" "Dasyuridae" "Dasyuridae"
## [559] "Dasyuridae" "Dasyuridae" "Dasyuridae"
## [562] "Dasyuridae" "Dasyuridae" "Dasyuridae"
## [565] "Dasyuridae" "Dasyuridae" "Dasyuridae"
## [568] "Dasyuridae" "Dasyuridae" "Dasyuridae"
## [571] "Dasyuridae" "Dasyuridae" "Dasyuridae"
## [574] "Dasyuridae" "Dasyuridae" "Dasyuridae"
## [577] "Dasyuridae" "Dasyuridae" "Dasyuridae"
## [580] "Dasyuridae" "Dasyuridae" "Dasyuridae"
## [583] "Dasyuridae" "Dasyuridae" "Dasyuridae"
## [586] "Dasyuridae" "Dasyuridae" "Dasyuridae"
## [589] "Dasyuridae" "Dasyuridae" "Dasyuridae"
## [592] "Dasyuridae" "Dasyuridae" "Myrmecobiidae"
## [595] "Cynocephalidae" "Didelphidae" "Didelphidae"
## [598] "Didelphidae" "Didelphidae" "Didelphidae"
## [601] "Didelphidae" "Didelphidae" "Didelphidae"
## [604] "Didelphidae" "Didelphidae" "Didelphidae"
## [607] "Didelphidae" "Didelphidae" "Didelphidae"
## [610] "Didelphidae" "Didelphidae" "Didelphidae"
## [613] "Didelphidae" "Didelphidae" "Didelphidae"
## [616] "Acrobatidae" "Acrobatidae" "Burramyidae"
## [619] "Burramyidae" "Burramyidae" "Hypsiprymnodontidae"
## [622] "Macropodidae" "Macropodidae" "Macropodidae"
## [625] "Macropodidae" "Macropodidae" "Macropodidae"
## [628] "Macropodidae" "Macropodidae" "Macropodidae"
## [631] "Macropodidae" "Macropodidae" "Macropodidae"
## [634] "Macropodidae" "Macropodidae" "Macropodidae"
## [637] "Macropodidae" "Macropodidae" "Macropodidae"
## [640] "Macropodidae" "Macropodidae" "Macropodidae"
## [643] "Macropodidae" "Macropodidae" "Macropodidae"
## [646] "Macropodidae" "Macropodidae" "Macropodidae"
## [649] "Macropodidae" "Macropodidae" "Macropodidae"
## [652] "Macropodidae" "Macropodidae" "Macropodidae"
## [655] "Macropodidae" "Macropodidae" "Macropodidae"
## [658] "Macropodidae" "Macropodidae" "Petauridae"
## [661] "Petauridae" "Petauridae" "Petauridae"
## [664] "Petauridae" "Phalangeridae" "Phalangeridae"
## [667] "Phalangeridae" "Phalangeridae" "Phalangeridae"
## [670] "Phalangeridae" "Phalangeridae" "Phalangeridae"
## [673] "Phascolarctidae" "Potoroidae" "Potoroidae"
## [676] "Potoroidae" "Potoroidae" "Potoroidae"
## [679] "Potoroidae" "Potoroidae" "Pseudocheiridae"
## [682] "Pseudocheiridae" "Pseudocheiridae" "Tarsipedidae"
## [685] "Vombatidae" "Vombatidae" "Vombatidae"
## [688] "Erinaceidae" "Erinaceidae" "Erinaceidae"
## [691] "Erinaceidae" "Erinaceidae" "Erinaceidae"
## [694] "Erinaceidae" "Erinaceidae" "Erinaceidae"
## [697] "Erinaceidae" "Procaviidae" "Procaviidae"
## [700] "Procaviidae" "Leporidae" "Leporidae"
## [703] "Leporidae" "Leporidae" "Leporidae"
## [706] "Leporidae" "Leporidae" "Leporidae"
## [709] "Leporidae" "Leporidae" "Leporidae"
## [712] "Leporidae" "Leporidae" "Leporidae"
## [715] "Leporidae" "Leporidae" "Ochotonidae"
## [718] "Ochotonidae" "Ochotonidae" "Ochotonidae"
## [721] "Macroscelididae" "Macroscelididae" "Macroscelididae"
## [724] "Macroscelididae" "Macroscelididae" "Macroscelididae"
## [727] "Macroscelididae" "Macroscelididae" "Macroscelididae"
## [730] "Microbiotheriidae" "Ornithorhynchidae" "Tachyglossidae"
## [733] "Tachyglossidae" "Notoryctidae" "Peramelidae"
## [736] "Peramelidae" "Peramelidae" "Peramelidae"
## [739] "Peramelidae" "Peramelidae" "Peramelidae"
## [742] "Peramelidae" "Peramelidae" "Thylacomyidae"
## [745] "Equidae" "Equidae" "Equidae"
## [748] "Equidae" "Equidae" "Equidae"
## [751] "Equidae" "Rhinocerotidae" "Rhinocerotidae"
## [754] "Rhinocerotidae" "Rhinocerotidae" "Rhinocerotidae"
## [757] "Tapiridae" "Tapiridae" "Tapiridae"
## [760] "Tapiridae" "Manidae" "Manidae"
## [763] "Bradypodidae" "Bradypodidae" "Cyclopedidae"
## [766] "Megalonychidae" "Megalonychidae" "Myrmecophagidae"
## [769] "Myrmecophagidae" "Myrmecophagidae" "Aotidae"
## [772] "Aotidae" "Aotidae" "Aotidae"
## [775] "Atelidae" "Atelidae" "Atelidae"
## [778] "Atelidae" "Atelidae" "Atelidae"
## [781] "Atelidae" "Atelidae" "Atelidae"
## [784] "Atelidae" "Atelidae" "Atelidae"
## [787] "Atelidae" "Atelidae" "Callitrichidae"
## [790] "Callitrichidae" "Callitrichidae" "Callitrichidae"
## [793] "Callitrichidae" "Callitrichidae" "Callitrichidae"
## [796] "Callitrichidae" "Callitrichidae" "Callitrichidae"
## [799] "Callitrichidae" "Callitrichidae" "Callitrichidae"
## [802] "Callitrichidae" "Callitrichidae" "Callitrichidae"
## [805] "Callitrichidae" "Callitrichidae" "Callitrichidae"
## [808] "Callitrichidae" "Callitrichidae" "Cebidae"
## [811] "Cebidae" "Cebidae" "Cebidae"
## [814] "Cebidae" "Cebidae" "Cebidae"
## [817] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [820] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [823] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [826] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [829] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [832] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [835] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [838] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [841] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [844] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [847] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [850] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [853] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [856] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [859] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [862] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [865] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [868] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [871] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [874] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [877] "Cercopithecidae" "Cercopithecidae" "Cercopithecidae"
## [880] "Cheirogaleidae" "Cheirogaleidae" "Cheirogaleidae"
## [883] "Cheirogaleidae" "Cheirogaleidae" "Cheirogaleidae"
## [886] "Daubentoniidae" "Galagidae" "Galagidae"
## [889] "Galagidae" "Galagidae" "Galagidae"
## [892] "Galagidae" "Galagidae" "Galagidae"
## [895] "Hominidae" "Hominidae" "Hominidae"
## [898] "Hominidae" "Hominidae" "Hylobatidae"
## [901] "Hylobatidae" "Hylobatidae" "Hylobatidae"
## [904] "Hylobatidae" "Hylobatidae" "Hylobatidae"
## [907] "Hylobatidae" "Hylobatidae" "Hylobatidae"
## [910] "Indriidae" "Indriidae" "Indriidae"
## [913] "Lemuridae" "Lemuridae" "Lemuridae"
## [916] "Lemuridae" "Lemuridae" "Lemuridae"
## [919] "Lemuridae" "Lemuridae" "Lemuridae"
## [922] "Lemuridae" "Lemuridae" "Lepilemuridae"
## [925] "Lepilemuridae" "Lorisidae" "Lorisidae"
## [928] "Lorisidae" "Lorisidae" "Lorisidae"
## [931] "Pitheciidae" "Pitheciidae" "Pitheciidae"
## [934] "Pitheciidae" "Pitheciidae" "Pitheciidae"
## [937] "Pitheciidae" "Pitheciidae" "Pitheciidae"
## [940] "Pitheciidae" "Pitheciidae" "Tarsiidae"
## [943] "Tarsiidae" "Tarsiidae" "Elephantidae"
## [946] "Elephantidae" "Abrocomidae" "Aplodontiidae"
## [949] "Bathyergidae" "Bathyergidae" "Bathyergidae"
## [952] "Bathyergidae" "Bathyergidae" "Bathyergidae"
## [955] "Bathyergidae" "Calomyscidae" "Calomyscidae"
## [958] "Capromyidae" "Capromyidae" "Capromyidae"
## [961] "Capromyidae" "Castoridae" "Castoridae"
## [964] "Caviidae" "Caviidae" "Caviidae"
## [967] "Caviidae" "Caviidae" "Caviidae"
## [970] "Caviidae" "Caviidae" "Chinchillidae"
## [973] "Chinchillidae" "Chinchillidae" "Chinchillidae"
## [976] "Cricetidae" "Cricetidae" "Cricetidae"
## [979] "Cricetidae" "Cricetidae" "Cricetidae"
## [982] "Cricetidae" "Cricetidae" "Cricetidae"
## [985] "Cricetidae" "Cricetidae" "Cricetidae"
## [988] "Cricetidae" "Cricetidae" "Cricetidae"
## [991] "Cricetidae" "Cricetidae" "Cricetidae"
## [994] "Cricetidae" "Cricetidae" "Cricetidae"
## [997] "Cricetidae" "Cricetidae" "Cricetidae"
## [1000] "Cricetidae" "Cricetidae" "Cricetidae"
## [1003] "Cricetidae" "Cricetidae" "Cricetidae"
## [1006] "Cricetidae" "Cricetidae" "Cricetidae"
## [1009] "Cricetidae" "Cricetidae" "Cricetidae"
## [1012] "Cricetidae" "Cricetidae" "Cricetidae"
## [1015] "Cricetidae" "Cricetidae" "Cricetidae"
## [1018] "Cricetidae" "Cricetidae" "Cricetidae"
## [1021] "Cricetidae" "Cricetidae" "Cricetidae"
## [1024] "Cricetidae" "Cricetidae" "Cricetidae"
## [1027] "Cricetidae" "Cricetidae" "Cricetidae"
## [1030] "Cricetidae" "Cricetidae" "Cricetidae"
## [1033] "Cricetidae" "Cricetidae" "Cricetidae"
## [1036] "Cricetidae" "Cricetidae" "Cricetidae"
## [1039] "Cricetidae" "Cricetidae" "Cricetidae"
## [1042] "Cricetidae" "Ctenodactylidae" "Ctenodactylidae"
## [1045] "Ctenomyidae" "Ctenomyidae" "Cuniculidae"
## [1048] "Cuniculidae" "Dasyproctidae" "Dasyproctidae"
## [1051] "Dasyproctidae" "Dasyproctidae" "Dasyproctidae"
## [1054] "Dasyproctidae" "Dasyproctidae" "Dasyproctidae"
## [1057] "Dasyproctidae" "Dinomyidae" "Dipodidae"
## [1060] "Dipodidae" "Dipodidae" "Dipodidae"
## [1063] "Dipodidae" "Dipodidae" "Dipodidae"
## [1066] "Dipodidae" "Dipodidae" "Dipodidae"
## [1069] "Echimyidae" "Echimyidae" "Echimyidae"
## [1072] "Echimyidae" "Echimyidae" "Echimyidae"
## [1075] "Echimyidae" "Erethizontidae" "Erethizontidae"
## [1078] "Erethizontidae" "Erethizontidae" "Geomyidae"
## [1081] "Geomyidae" "Geomyidae" "Geomyidae"
## [1084] "Geomyidae" "Geomyidae" "Gliridae"
## [1087] "Gliridae" "Gliridae" "Gliridae"
## [1090] "Gliridae" "Gliridae" "Gliridae"
## [1093] "Gliridae" "Heteromyidae" "Heteromyidae"
## [1096] "Heteromyidae" "Heteromyidae" "Heteromyidae"
## [1099] "Heteromyidae" "Heteromyidae" "Heteromyidae"
## [1102] "Heteromyidae" "Heteromyidae" "Heteromyidae"
## [1105] "Heteromyidae" "Heteromyidae" "Heteromyidae"
## [1108] "Heteromyidae" "Heteromyidae" "Heteromyidae"
## [1111] "Heteromyidae" "Hystricidae" "Hystricidae"
## [1114] "Hystricidae" "Hystricidae" "Hystricidae"
## [1117] "Hystricidae" "Hystricidae" "Hystricidae"
## [1120] "Hystricidae" "Muridae" "Muridae"
## [1123] "Muridae" "Muridae" "Muridae"
## [1126] "Muridae" "Muridae" "Muridae"
## [1129] "Muridae" "Muridae" "Muridae"
## [1132] "Muridae" "Muridae" "Muridae"
## [1135] "Muridae" "Muridae" "Muridae"
## [1138] "Muridae" "Muridae" "Muridae"
## [1141] "Muridae" "Muridae" "Muridae"
## [1144] "Muridae" "Muridae" "Muridae"
## [1147] "Muridae" "Muridae" "Muridae"
## [1150] "Muridae" "Muridae" "Muridae"
## [1153] "Muridae" "Muridae" "Muridae"
## [1156] "Muridae" "Muridae" "Muridae"
## [1159] "Muridae" "Muridae" "Muridae"
## [1162] "Muridae" "Muridae" "Muridae"
## [1165] "Muridae" "Muridae" "Muridae"
## [1168] "Muridae" "Muridae" "Muridae"
## [1171] "Muridae" "Muridae" "Muridae"
## [1174] "Muridae" "Muridae" "Muridae"
## [1177] "Muridae" "Muridae" "Muridae"
## [1180] "Muridae" "Muridae" "Muridae"
## [1183] "Muridae" "Muridae" "Muridae"
## [1186] "Muridae" "Muridae" "Muridae"
## [1189] "Muridae" "Muridae" "Muridae"
## [1192] "Muridae" "Muridae" "Muridae"
## [1195] "Muridae" "Muridae" "Muridae"
## [1198] "Muridae" "Muridae" "Muridae"
## [1201] "Muridae" "Muridae" "Muridae"
## [1204] "Myocastoridae" "Nesomyidae" "Nesomyidae"
## [1207] "Nesomyidae" "Nesomyidae" "Nesomyidae"
## [1210] "Nesomyidae" "Nesomyidae" "Nesomyidae"
## [1213] "Octodontidae" "Octodontidae" "Octodontidae"
## [1216] "Pedetidae" "Petromuridae" "Platacanthomyidae"
## [1219] "Sciuridae" "Sciuridae" "Sciuridae"
## [1222] "Sciuridae" "Sciuridae" "Sciuridae"
## [1225] "Sciuridae" "Sciuridae" "Sciuridae"
## [1228] "Sciuridae" "Sciuridae" "Sciuridae"
## [1231] "Sciuridae" "Sciuridae" "Sciuridae"
## [1234] "Sciuridae" "Sciuridae" "Sciuridae"
## [1237] "Sciuridae" "Sciuridae" "Sciuridae"
## [1240] "Sciuridae" "Sciuridae" "Sciuridae"
## [1243] "Sciuridae" "Sciuridae" "Sciuridae"
## [1246] "Sciuridae" "Sciuridae" "Sciuridae"
## [1249] "Sciuridae" "Sciuridae" "Sciuridae"
## [1252] "Sciuridae" "Sciuridae" "Sciuridae"
## [1255] "Sciuridae" "Sciuridae" "Sciuridae"
## [1258] "Sciuridae" "Sciuridae" "Sciuridae"
## [1261] "Sciuridae" "Sciuridae" "Sciuridae"
## [1264] "Sciuridae" "Sciuridae" "Sciuridae"
## [1267] "Sciuridae" "Sciuridae" "Sciuridae"
## [1270] "Sciuridae" "Sciuridae" "Sciuridae"
## [1273] "Sciuridae" "Sciuridae" "Sciuridae"
## [1276] "Sciuridae" "Sciuridae" "Sciuridae"
## [1279] "Sciuridae" "Sciuridae" "Sciuridae"
## [1282] "Sciuridae" "Sciuridae" "Sciuridae"
## [1285] "Sciuridae" "Sciuridae" "Sciuridae"
## [1288] "Sciuridae" "Sciuridae" "Sciuridae"
## [1291] "Sciuridae" "Sciuridae" "Sciuridae"
## [1294] "Sciuridae" "Sciuridae" "Sciuridae"
## [1297] "Sciuridae" "Sciuridae" "Sciuridae"
## [1300] "Sciuridae" "Spalacidae" "Spalacidae"
## [1303] "Spalacidae" "Spalacidae" "Spalacidae"
## [1306] "Thryonomyidae" "Ptilocercidae" "Tupaiidae"
## [1309] "Tupaiidae" "Tupaiidae" "Tupaiidae"
## [1312] "Tupaiidae" "Dugongidae" "Trichechidae"
## [1315] "Trichechidae" "Solenodontidae" "Solenodontidae"
## [1318] "Soricidae" "Soricidae" "Soricidae"
## [1321] "Soricidae" "Soricidae" "Soricidae"
## [1324] "Soricidae" "Soricidae" "Soricidae"
## [1327] "Soricidae" "Soricidae" "Soricidae"
## [1330] "Soricidae" "Soricidae" "Soricidae"
## [1333] "Soricidae" "Soricidae" "Soricidae"
## [1336] "Soricidae" "Soricidae" "Soricidae"
## [1339] "Soricidae" "Soricidae" "Soricidae"
## [1342] "Soricidae" "Talpidae" "Talpidae"
## [1345] "Talpidae" "Talpidae" "Talpidae"
## [1348] "Talpidae" "Orycteropodidae"
# printing just the unique values
unique(mammals$Family)
## [1] "Tenrecidae" "Antilocapridae" "Bovidae"
## [4] "Camelidae" "Cervidae" "Giraffidae"
## [7] "Hippopotamidae" "Moschidae" "Suidae"
## [10] "Tayassuidae" "Tragulidae" "Ailuridae"
## [13] "Canidae" "Eupleridae" "Felidae"
## [16] "Herpestidae" "Hyaenidae" "Mephitidae"
## [19] "Mustelidae" "Nandiniidae" "Odobenidae"
## [22] "Otariidae" "Phocidae" "Procyonidae"
## [25] "Ursidae" "Viverridae" "Balaenidae"
## [28] "Balaenopteridae" "Delphinidae" "Eschrichtiidae"
## [31] "Hyperoodontidae" "Iniidae" "Kogiidae"
## [34] "Lipotidae" "Monodontidae" "Phocoenidae"
## [37] "Physeteridae" "Platanistidae" "Pontoporiidae"
## [40] "Emballonuridae" "Hipposideridae" "Megadermatidae"
## [43] "Miniopteridae" "Molossidae" "Mystacinidae"
## [46] "Noctilionidae" "Phyllostomidae" "Pteropodidae"
## [49] "Rhinolophidae" "Vespertilionidae" "Dasypodidae"
## [52] "Dasyuridae" "Myrmecobiidae" "Cynocephalidae"
## [55] "Didelphidae" "Acrobatidae" "Burramyidae"
## [58] "Hypsiprymnodontidae" "Macropodidae" "Petauridae"
## [61] "Phalangeridae" "Phascolarctidae" "Potoroidae"
## [64] "Pseudocheiridae" "Tarsipedidae" "Vombatidae"
## [67] "Erinaceidae" "Procaviidae" "Leporidae"
## [70] "Ochotonidae" "Macroscelididae" "Microbiotheriidae"
## [73] "Ornithorhynchidae" "Tachyglossidae" "Notoryctidae"
## [76] "Peramelidae" "Thylacomyidae" "Equidae"
## [79] "Rhinocerotidae" "Tapiridae" "Manidae"
## [82] "Bradypodidae" "Cyclopedidae" "Megalonychidae"
## [85] "Myrmecophagidae" "Aotidae" "Atelidae"
## [88] "Callitrichidae" "Cebidae" "Cercopithecidae"
## [91] "Cheirogaleidae" "Daubentoniidae" "Galagidae"
## [94] "Hominidae" "Hylobatidae" "Indriidae"
## [97] "Lemuridae" "Lepilemuridae" "Lorisidae"
## [100] "Pitheciidae" "Tarsiidae" "Elephantidae"
## [103] "Abrocomidae" "Aplodontiidae" "Bathyergidae"
## [106] "Calomyscidae" "Capromyidae" "Castoridae"
## [109] "Caviidae" "Chinchillidae" "Cricetidae"
## [112] "Ctenodactylidae" "Ctenomyidae" "Cuniculidae"
## [115] "Dasyproctidae" "Dinomyidae" "Dipodidae"
## [118] "Echimyidae" "Erethizontidae" "Geomyidae"
## [121] "Gliridae" "Heteromyidae" "Hystricidae"
## [124] "Muridae" "Myocastoridae" "Nesomyidae"
## [127] "Octodontidae" "Pedetidae" "Petromuridae"
## [130] "Platacanthomyidae" "Sciuridae" "Spalacidae"
## [133] "Thryonomyidae" "Ptilocercidae" "Tupaiidae"
## [136] "Dugongidae" "Trichechidae" "Solenodontidae"
## [139] "Soricidae" "Talpidae" "Orycteropodidae"
# checking how many entries there are!
## length of variable
length(mammals$Family)
## [1] 1349
## number of rows
nrow(mammals)
## [1] 1349
So far, we have applied only single functions, but we may want to do two things at once. For example, getting only the unique values from a variable, and then checking how many there are (i.e. unique() and then length()). The "traditional" way to do this in R is to nest functions:
length(unique(mammals$Family))
## [1] 141
This is not a very intuitive way to code, because you have to read from the inside out, and this can get messy quickly. The "tidy" way to do this instead is to use "pipes". The pipe %>% passes the result on its left to the function on its right as its first argument.
mammals$Family %>% unique() %>% length()
## [1] 141
# pipes don't stop at the end of a line!
mammals$Family %>%
  unique() %>%
  length()
## [1] 141
Isn't this a much nicer way to code?
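To make the equivalence concrete, here is a small self-contained sketch with a made-up vector of family names. It assumes dplyr is installed; attaching dplyr also makes the %>% pipe available.

```r
library(dplyr)  # attaching dplyr also makes the %>% pipe available

# toy vector of family names (made up for illustration)
families <- c("Bovidae", "Felidae", "Bovidae", "Canidae", "Felidae")

nested <- length(unique(families))            # read from the inside out
piped  <- families %>% unique() %>% length()  # read from left to right

nested == piped
```

Both versions do exactly the same work; the pipe only changes the order in which the steps are written.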
This is a large table and you may only want to work with a few columns.
mammals %>%
  select(Genus, Species, `Maximum longevity (yrs)`)
## # A tibble: 1,349 × 3
## Genus Species `Maximum longevity (yrs)`
## <chr> <chr> <dbl>
## 1 Echinops telfairi 19
## 2 Geogale aurita NA
## 3 Hemicentetes semispinosus 2.7
## 4 Microgale dobsoni 5.6
## 5 Microgale talazaci 5.8
## 6 Setifer setosus 14.1
## 7 Tenrec ecaudatus 8.7
## 8 Antilocapra americana 17
## 9 Addax nasomaculatus 28
## 10 Aepyceros melampus 25.6
## # ℹ 1,339 more rows
You may also want to sort data based on specific columns.
# single variable
mammals %>%
  select(Genus, Species, `Maximum longevity (yrs)`) %>%
  arrange(`Maximum longevity (yrs)`)
## # A tibble: 1,349 × 3
## Genus Species `Maximum longevity (yrs)`
## <chr> <chr> <dbl>
## 1 Myodes rutilus 2.1
## 2 Myosorex varius 2.1
## 3 Blarina brevicauda 2.2
## 4 Crocidura flavescens 2.2
## 5 Abrocoma bennettii 2.3
## 6 Arvicola amphibius 2.5
## 7 Condylura cristata 2.5
## 8 Hemicentetes semispinosus 2.7
## 9 Blarina hylophaga 2.8
## 10 Crocidura suaveolens 2.8
## # ℹ 1,339 more rows
# or multiple variables, one of them in descending order
mammals %>%
  select(Genus, Species, `Maximum longevity (yrs)`) %>%
  arrange(Genus, desc(`Maximum longevity (yrs)`))
## # A tibble: 1,349 × 3
## Genus Species `Maximum longevity (yrs)`
## <chr> <chr> <dbl>
## 1 Abrocoma bennettii 2.3
## 2 Acinonyx jubatus 20.5
## 3 Acomys cahirinus 5.9
## 4 Acomys wilsoni 5.6
## 5 Acomys cilicicus 4
## 6 Acomys minous NA
## 7 Acomys russatus NA
## 8 Acrobates pygmaeus 8.8
## 9 Acrocodia indica 36.5
## 10 Addax nasomaculatus 28
## # ℹ 1,339 more rows
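Notice that in the output above the Acomys species without a longevity value come last, even though the column is sorted in descending order. arrange() places missing values at the end in both directions. The sketch below shows this with a toy tibble (made-up genus names and values).

```r
library(dplyr)

toy <- tibble(
  Genus = c("Aus", "Bus", "Cus", "Dus"),
  longevity = c(5.9, NA, 4, 5.6)
)

ascending  <- toy %>% arrange(longevity)
descending <- toy %>% arrange(desc(longevity))

# in both cases the row with the missing value ("Bus") is the last one
ascending$Genus
descending$Genus
```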
You can also use select() to move columns around.
mammals %>%
  select(Genus, Species, `Maximum longevity (yrs)`) %>%
  select(`Maximum longevity (yrs)`, everything())
## # A tibble: 1,349 × 3
## `Maximum longevity (yrs)` Genus Species
## <dbl> <chr> <chr>
## 1 19 Echinops telfairi
## 2 NA Geogale aurita
## 3 2.7 Hemicentetes semispinosus
## 4 5.6 Microgale dobsoni
## 5 5.8 Microgale talazaci
## 6 14.1 Setifer setosus
## 7 8.7 Tenrec ecaudatus
## 8 17 Antilocapra americana
## 9 28 Addax nasomaculatus
## 10 25.6 Aepyceros melampus
## # ℹ 1,339 more rows
Lastly, we may only want to work with one specific family.
mammals %>%
  filter(Family == "Galagidae") %>%
  select(Genus, Species, `Maximum longevity (yrs)`)
## # A tibble: 8 × 3
## Genus Species `Maximum longevity (yrs)`
## <chr> <chr> <dbl>
## 1 Euoticus elegantulus NA
## 2 Galago moholi 16.6
## 3 Galago senegalensis 17.1
## 4 Galagoides demidoff 13.4
## 5 Otolemur crassicaudatus 22.7
## 6 Otolemur garnettii 20
## 7 Paragalago zanzibaricus NA
## 8 Sciurocheirus alleni NA
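filter() can also combine several conditions, and it is a convenient way to drop rows with missing values. The following is a minimal sketch on a toy tibble; the genus names and longevity values are made up.

```r
library(dplyr)

toy <- tibble(
  Family = c("Galagidae", "Galagidae", "Felidae", "Galagidae"),
  Genus = c("Aus", "Bus", "Cus", "Dus"),
  longevity = c(16.6, NA, 20.5, 22.7)
)

# conditions separated by commas must all be true for a row to be kept
long_lived <- toy %>%
  filter(Family == "Galagidae", longevity > 17)

# drop rows where longevity is missing
known <- toy %>%
  filter(Family == "Galagidae", !is.na(longevity))
```

Rows where a condition evaluates to NA (here, the missing longevity of "Bus") are dropped, just like rows where it is FALSE.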
Usually we will want to get some sort of summary statistics from our data. Some functions exist to do very specific things, like count().
mammals %>%
  count(Order, sort = TRUE)
## # A tibble: 28 × 2
## Order n
## <chr> <int>
## 1 Rodentia 360
## 2 Carnivora 205
## 3 Primates 174
## 4 Artiodactyla 173
## 5 Chiroptera 112
## 6 Diprotodontia 72
## 7 Cetacea 46
## 8 Dasyuromorphia 38
## 9 Soricomorpha 33
## 10 Didelphimorphia 20
## # ℹ 18 more rows
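count() is a shortcut for a pattern that can also be written out by hand: group the rows, count them, then sort. The sketch below compares the two on a toy tibble (made-up rows), assuming dplyr is installed.

```r
library(dplyr)

toy <- tibble(
  Order = c("Rodentia", "Primates", "Rodentia", "Carnivora", "Rodentia", "Primates")
)

# the shortcut
by_count <- toy %>% count(Order, sort = TRUE)

# the same result, step by step
by_hand <- toy %>%
  group_by(Order) %>%
  summarise(n = n()) %>%
  arrange(desc(n))
```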
However, the "summarize" family of functions is extremely useful because it can be applied in a very flexible manner. (summarise() and summarize() are the same function; both spellings work.)
## on a single variable with a single statistic
mammals %>%
  summarise(mean_longevity = mean(`Maximum longevity (yrs)`, na.rm = TRUE))
## # A tibble: 1 × 1
## mean_longevity
## <dbl>
## 1 19.8
## on a single variable with multiple statistics
mammals %>%
  filter(Family == "Galagidae") %>%
  summarize(mean_longevity = mean(`Maximum longevity (yrs)`, na.rm = TRUE),
            sd_longevity = sd(`Maximum longevity (yrs)`, na.rm = TRUE),
            N = n())
## # A tibble: 1 × 3
## mean_longevity sd_longevity N
## <dbl> <dbl> <int>
## 1 18.0 3.54 8
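Note that N is 8 in the Galagidae summary although three of the eight longevity values are NA: n() counts rows, not known values. The sketch below, on a toy tibble with made-up values, shows what na.rm does and one way to count only the non-missing values.

```r
library(dplyr)

toy <- tibble(longevity = c(16.6, 17.1, NA, 22.7))

toy_summary <- toy %>%
  summarise(
    mean_with_na = mean(longevity),                # NA, because one value is missing
    mean_no_na   = mean(longevity, na.rm = TRUE),  # missing value dropped first
    n_rows       = n(),                            # counts all rows, including the NA
    n_known      = sum(!is.na(longevity))          # counts only non-missing values
  )

toy_summary
```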
We may also want to do this for groups of data. For this we use group_by(), which is extremely powerful, especially when combined with summarise().
mammals %>%
  group_by(Family) %>%
  summarize(mean_longevity = mean(`Maximum longevity (yrs)`, na.rm = TRUE),
            sd_longevity = sd(`Maximum longevity (yrs)`, na.rm = TRUE),
            N = n())
## # A tibble: 141 × 4
## Family mean_longevity sd_longevity N
## <chr> <dbl> <dbl> <int>
## 1 Abrocomidae 2.3 NA 1
## 2 Acrobatidae 8.8 NA 2
## 3 Ailuridae 19 NA 1
## 4 Antilocapridae 17 NA 1
## 5 Aotidae 29 4.93 4
## 6 Aplodontiidae NaN NA 1
## 7 Atelidae 37.8 8.41 14
## 8 Balaenidae 116 82.3 3
## 9 Balaenopteridae 85.8 24.8 6
## 10 Bathyergidae 18.2 7.50 7
## # ℹ 131 more rows
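The NA and NaN entries in this table are not errors. A standard deviation cannot be computed from fewer than two known values, and the mean of a group in which every value is missing is NaN. The sketch below reproduces both cases with toy families "A", "B" and "C" (made-up data).

```r
library(dplyr)

toy <- tibble(
  Family = c("A", "A", "B", "C"),
  longevity = c(10, 14, 2.3, NA)
)

toy_by_family <- toy %>%
  group_by(Family) %>%
  summarise(
    mean_longevity = mean(longevity, na.rm = TRUE),
    sd_longevity   = sd(longevity, na.rm = TRUE),
    N              = n()
  )

# A: two known values, so both mean and sd are available
# B: one known value, so sd is NA
# C: no known values, so mean is NaN and sd is NA
toy_by_family
```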
"Mutate" is the last important piece of vocabulary we will address. It is extremely useful for making a new variable. The genus and species names are in separate columns, but we will often want to work with binomial names.
mammals %>%
  mutate(Sp = paste(Genus, Species, sep = "_")) %>%
  select(Genus, Species, Sp)
## # A tibble: 1,349 × 3
## Genus Species Sp
## <chr> <chr> <chr>
## 1 Echinops telfairi Echinops_telfairi
## 2 Geogale aurita Geogale_aurita
## 3 Hemicentetes semispinosus Hemicentetes_semispinosus
## 4 Microgale dobsoni Microgale_dobsoni
## 5 Microgale talazaci Microgale_talazaci
## 6 Setifer setosus Setifer_setosus
## 7 Tenrec ecaudatus Tenrec_ecaudatus
## 8 Antilocapra americana Antilocapra_americana
## 9 Addax nasomaculatus Addax_nasomaculatus
## 10 Aepyceros melampus Aepyceros_melampus
## # ℹ 1,339 more rows
NOTE: although we just made this new variable, we did not save it. If we went back and looked at our original dataset, we would not see it. To keep the change, we have to overwrite the dataset.
mammals <- mammals %>%
  mutate(Sp = paste(Genus, Species, sep = "_"))
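The difference between printing a result and saving it can be checked on a toy tibble (made-up names), as in this minimal sketch:

```r
library(dplyr)

toy <- tibble(Genus = c("Aus", "Bus"), Species = c("alba", "nigra"))

# without assignment, the result is printed but toy itself is unchanged
toy %>% mutate(Sp = paste(Genus, Species, sep = "_"))
names_before <- names(toy)

# with assignment, the new column is kept
toy <- toy %>% mutate(Sp = paste(Genus, Species, sep = "_"))
names_after <- names(toy)
```

The same applies to every verb we have used: select(), filter(), arrange() and summarise() all return a new table and leave the original untouched.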
At the end of every project, you will most likely want to export some data. The preferred way to do this for rectangular data (i.e. a table) is to save it as a .csv or .tsv file.
# make a new object
mammals_export <-
  mammals %>%
  group_by(Family) %>%
  summarize(mean_longevity = mean(`Maximum longevity (yrs)`, na.rm = TRUE),
            sd_longevity = sd(`Maximum longevity (yrs)`, na.rm = TRUE),
            N = n())
# write as .csv (uncomment the next line to save the file)
#write_csv(mammals_export, file="mammal_longevity.csv")
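A quick way to check an export is to read the file straight back in. The sketch below does this with a toy table (made-up values standing in for mammals_export) and a temporary file, so it does not write anything into your project folder. It assumes readr 2.0 or later for the show_col_types argument.

```r
library(readr)

# toy summary table (made-up values) standing in for mammals_export
toy_export <- data.frame(
  Family = c("A", "B"),
  mean_longevity = c(12, 2.3)
)

out_file <- file.path(tempdir(), "toy_longevity.csv")  # temporary, illustrative path
write_csv(toy_export, file = out_file)

# read it back to confirm that nothing was lost
toy_check <- read_csv(out_file, show_col_types = FALSE)
toy_check
```

For tab-separated output, write_tsv() and read_tsv() can be used in the same way.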