6.1 Combining two files with identical row identifiers
The first situation is where our measurement platform produces one output file per sample, and these files then need to be merged so that the results for all samples are in the same matrix or dataframe for analysis. In the simplest case, the files are identical in order and length, so they can simply be joined as columns of the same dataset. You can do this using the cbind() function.
# Create several vectors of equal length. Each represents the data from one sample
vec.one <- (1:10)
vec.two <- (2:11)
vec.three <- c(rep(1,3), rep(4,3), rep(8,4))
# Then join them together to make a matrix with one column per sample vector
joined.data <- cbind(vec.one, vec.two, vec.three)
joined.data
##       vec.one vec.two vec.three
##  [1,]       1       2         1
##  [2,]       2       3         1
##  [3,]       3       4         1
##  [4,]       4       5         4
##  [5,]       5       6         4
##  [6,]       6       7         4
##  [7,]       7       8         8
##  [8,]       8       9         8
##  [9,]       9      10         8
## [10,]      10      11         8
Depending on the needs of your analysis, you can alternatively use the rbind() function to join them as rows.
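As a minimal sketch of the row-wise alternative, the example below rebuilds the same three vectors and joins them with rbind(). The object name joined.rows is introduced here purely for illustration.

```r
# Recreate the three sample vectors used in the cbind() example
vec.one <- 1:10
vec.two <- 2:11
vec.three <- c(rep(1, 3), rep(4, 3), rep(8, 4))

# Join them as rows: one row per sample vector, one column per measurement
joined.rows <- rbind(vec.one, vec.two, vec.three)

dim(joined.rows)       # number of rows, then number of columns
rownames(joined.rows)  # row names are taken from the vector names

# The values match the transpose of the cbind() version
all(joined.rows == t(cbind(vec.one, vec.two, vec.three)))
```

Which orientation is more convenient depends on whether the functions you plan to use later expect samples in rows or in columns.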
cbind() will work to join matrices as well as vectors, as long as they all have the same number of rows. Try this using some pre-prepared subsets of the Golub data.
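Before moving to the Golub subsets, the row-count requirement can be illustrated with a small sketch. The matrices set.a, set.b and set.c are made-up examples, not part of the course data.

```r
# Two small made-up matrices with the same number of rows (4 "genes")
set.a <- matrix(1:8, nrow = 4, dimnames = list(NULL, c("a1", "a2")))
set.b <- matrix(101:112, nrow = 4, dimnames = list(NULL, c("b1", "b2", "b3")))

# cbind() places the columns of set.b to the right of those of set.a
combined <- cbind(set.a, set.b)
dim(combined)        # 4 rows, 2 + 3 = 5 columns
colnames(combined)

# Matrices with different numbers of rows cannot be joined this way
set.c <- matrix(1:6, nrow = 3)
result <- try(cbind(set.a, set.c), silent = TRUE)
inherits(result, "try-error")   # TRUE: cbind() stops with an error
```

Note that cbind() only checks the number of rows. It does not check that the rows are in the same order, so that is something to confirm yourself before joining.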
- Load in the three datasets stored in file golub_cbind.RData using the load() function. This file contains three objects:
  - golub.names: a vector of the gene names,
  - golub.all: a dataframe of the ALL sample data, and
  - golub.aml: a dataframe of the AML sample data.
- Use cbind() to create a duplicate of the Golub dataframe from these three objects.
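If you have not used load() before, the following sketch shows the general pattern on stand-in data. The toy.* objects and the temporary file are hypothetical and created only for this illustration. They are not the contents of golub_cbind.RData.

```r
# Stand-in objects (hypothetical names and values, not the real Golub data)
toy.names <- c("geneA", "geneB", "geneC")
toy.all <- data.frame(all.1 = c(1.2, 0.4, 2.2), all.2 = c(1.0, 0.6, 2.5))
toy.aml <- data.frame(aml.1 = c(0.3, 1.9, 2.1))

# Save them to a temporary .RData file, then remove them from the workspace
tmp.file <- tempfile(fileext = ".RData")
save(toy.names, toy.all, toy.aml, file = tmp.file)
rm(toy.names, toy.all, toy.aml)

# load() restores the objects under their saved names and
# (invisibly) returns a character vector of those names
loaded <- load(tmp.file)
loaded

# Check that everything has the same number of rows before joining
c(length(toy.names), nrow(toy.all), nrow(toy.aml))

# With a dataframe among the arguments, cbind() returns a dataframe
toy.joined <- cbind(toy.names, toy.all, toy.aml)
class(toy.joined)
dim(toy.joined)
```

Capturing the return value of load() is a convenient way to see which objects a file added to your workspace.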