6.2 Summarising data
6.2.1 Table summaries
# To generate a table with the full descriptions
species.description.table <- with(species.annotated,
  table(NCA.description, Endemicity.description))
head(species.description.table[,1:2],10)
##                                  Endemicity.description
## NCA.description                  Australian endemic Australian vagrant
##   Australian endemic                              0                  0
##   Australian vagrant                              0                  0
##   Conservation dependent                          0                  0
##   Critically endangered                           0                  0
##   Critically endangered wildlife                  0                  0
##   Endangered                                      0                  0
##   Endangered wildlife                            24                  0
##   Extinct                                         0                  0
##   Extinct in the wild wildlife                    5                  0
##   International vagrant                           0                  0
But this table lists endemicity descriptions under the NCA status, and vice versa. This is because when we used cbind or merge earlier, those two columns were created as factors. The levels of those factors were not just the values present in the list, but every value in the original annotation file (i.e. all NCA, EPBC and Endemicity descriptions), and table() reports every level, including those with a count of zero.
# To get rid of all the irrelevant or inappropriate factor levels, we can
# re-factor those two columns
species.annotated$NCA.description <- factor(species.annotated$NCA.description)
species.annotated$Endemicity.description <-
  factor(species.annotated$Endemicity.description)
# Then repeat the table generation, and check again
species.description.table <- with(species.annotated,
  table(NCA.description, Endemicity.description))
species.description.table[,1:2]
##                                  Endemicity.description
## NCA.description                  Australian endemic Australian vagrant
##   Endangered wildlife                            24                  0
##   Extinct in the wild wildlife                    5                  0
##   Least concern wildlife                       1968                  2
##   Near threatened wildlife                       30                  0
##   Special least concern wildlife                  4                  0
##   Vulnerable wildlife                            53                  0
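The behaviour of unused levels can be reproduced on a small, made-up factor. The sketch below does not use the species data: `status`, `kept`, `refactored` and `dropped` are hypothetical names chosen for illustration, and `droplevels()` is shown only as a base R alternative to calling `factor()` again.

```r
# Toy data (hypothetical): a factor with three levels
status <- factor(c("Endangered", "Least concern", "Vulnerable", "Least concern"))

# Subsetting removes the values but keeps every original level
kept <- status[status != "Vulnerable"]
levels(kept)
table(kept)   # "Vulnerable" still appears, with a count of zero

# Either call discards levels that no longer occur in the data
refactored <- factor(kept)
dropped <- droplevels(kept)
levels(refactored)
identical(levels(refactored), levels(dropped))
```

Both approaches should give the same set of levels here; `droplevels()` also accepts a whole data frame, which may be convenient when several columns need cleaning at once.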
# The levels of NCA.description are, as for any factor, sorted alphabetically by default.
# To put them in a custom order, we can rebuild the factor with its levels reordered
levels(species.annotated$NCA.description)
## [1] "Endangered wildlife" "Extinct in the wild wildlife"
## [3] "Least concern wildlife" "Near threatened wildlife"
## [5] "Special least concern wildlife" "Vulnerable wildlife"
species.annotated$NCA.description <- factor(species.annotated$NCA.description,
  levels = levels(species.annotated$NCA.description)[c(2,1,6,4,5,3)])
levels(species.annotated$NCA.description)
## [1] "Extinct in the wild wildlife" "Endangered wildlife"
## [3] "Vulnerable wildlife" "Near threatened wildlife"
## [5] "Special least concern wildlife" "Least concern wildlife"
# And now produce the final, non-redundant and ordered table
species.description.table <- with(species.annotated,
  table(NCA.description, Endemicity.description))
species.description.table[,1:2]
##                                  Endemicity.description
## NCA.description                  Australian endemic Australian vagrant
##   Extinct in the wild wildlife                    5                  0
##   Endangered wildlife                            24                  0
##   Vulnerable wildlife                            53                  0
##   Near threatened wildlife                       30                  0
##   Special least concern wildlife                  4                  0
##   Least concern wildlife                       1968                  2
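Reordering by position, as with `c(2,1,6,4,5,3)`, depends on the current alphabetical order of the levels, so it can silently go wrong if a level is added or removed. One possible alternative is to write the desired order out by name. This is only a sketch on a made-up factor: `status` and `severity` are hypothetical names, not objects from the species data.

```r
# Toy data (hypothetical): levels start out in alphabetical order
status <- factor(c("Least concern", "Endangered", "Vulnerable", "Endangered"))
levels(status)

# Desired order, written out by name rather than by position
severity <- c("Endangered", "Vulnerable", "Least concern")

# Guard against typos: a value missing from `levels` would become NA
stopifnot(setequal(severity, levels(status)))

status <- factor(status, levels = severity)
levels(status)
table(status)   # rows now follow the custom order
```

The `setequal()` check is there because `factor()` does not warn when a value is absent from `levels`; it simply converts that value to `NA`.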