
Splits a dataframe row-wise or column-wise into an arbitrary number of dataframes
Source: R/unjoin.R
This function splits a dataframe into any number of dataframes such that they
can be rejoined by using rbind()/dplyr::bind_rows() for unrbind() or
cbind()/dplyr::bind_cols() for uncbind(). The user may find it appropriate to
go on and apply messy() to each new dataframe independently to impede rejoining.
Usage
unrbind(data, sizes = NULL, probs = NULL, names = NULL, shuffle = TRUE)
uncbind(data, sizes = NULL, probs = NULL, names = NULL, shuffle = TRUE)
Arguments
- data
Input dataframe.
- sizes
A numeric vector summing to nrow(data) for unrbind() or ncol(data) for uncbind(); the number of rows (or columns) in each resulting dataframe. See probs for an alternative approach. If neither is provided, the dataframe will be split roughly in half.
- probs
A numeric vector summing to 1; the proportion of rows/columns in each resulting dataframe. An alternative to sizes.
- names
The names of the output list. If NULL, the list will be unnamed.
- shuffle
Shuffle rows in unrbind() or columns in uncbind()? Defaults to TRUE.
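As a brief illustration of how sizes and names fit together, the sketch below splits mtcars into three named pieces of fixed size. It assumes that unrbind() is available from an installed package called messy; the piece names are made up for the example.

```r
# Assumption: unrbind() is exported by the messy package, which must be installed.
library(messy)

# mtcars has 32 rows, so the requested sizes must sum to 32.
parts <- unrbind(
  mtcars,
  sizes = c(12, 10, 10),
  names = c("lab_a", "lab_b", "lab_c")  # illustrative names only
)

# Inspect the names and the number of rows of each piece.
names(parts)
sapply(parts, nrow)
```

Passing probs instead of sizes, for example probs = c(0.4, 0.3, 0.3), requests proportions rather than exact counts.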
Details
Real data can often be found in disparate files. For example, data reports
may come in monthly and require row-binding together to obtain a complete
annual time series. Scientific results may arrive from different laboratories
and require binding together for further analysis and comparisons. This
function may simulate a single dataframe having come from different sources
and requiring binding back together. Base R's split() offers an alternative
to unrbind(), but requires a pre-existing factor column to split by and
cannot as easily create random splits in the data.
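To make the comparison with split() concrete, here is a small base R sketch. The first call splits by an existing column; the second shows that a random split of chosen sizes needs a grouping vector built by hand. The seed and the group sizes are arbitrary choices for the illustration.

```r
# Splitting by an existing column: one data frame per distinct value of cyl.
by_cyl <- split(mtcars, mtcars$cyl)
sapply(by_cyl, nrow)
#>  4  6  8
#> 11  7 14

# A random split with split() requires constructing the grouping vector first.
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible
group <- sample(rep(1:3, times = c(16, 10, 6)))  # 16 + 10 + 6 = nrow(mtcars)
random_parts <- split(mtcars, group)
sapply(random_parts, nrow)
#>  1  2  3
#> 16 10  6
```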
See also
Other data deconstructors:
unjoin()
Examples
unrbind(dplyr::tibble(mtcars), probs = c(0.5, 0.3, 0.2))
#> [[1]]
#> # A tibble: 16 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
#> 2 17.8 6 168. 123 3.92 3.44 18.9 1 0 4 4
#> 3 30.4 4 95.1 113 3.77 1.51 16.9 1 1 5 2
#> 4 26 4 120. 91 4.43 2.14 16.7 0 1 5 2
#> 5 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 6 13.3 8 350 245 3.73 3.84 15.4 0 0 3 4
#> 7 19.7 6 145 175 3.62 2.77 15.5 0 1 5 6
#> 8 19.2 8 400 175 3.08 3.84 17.0 0 0 3 2
#> 9 27.3 4 79 66 4.08 1.94 18.9 1 1 4 1
#> 10 15.2 8 276. 180 3.07 3.78 18 0 0 3 3
#> 11 15.2 8 304 150 3.15 3.44 17.3 0 0 3 2
#> 12 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 13 10.4 8 460 215 3 5.42 17.8 0 0 3 4
#> 14 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
#> 15 14.7 8 440 230 3.23 5.34 17.4 0 0 3 4
#> 16 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
#>
#> [[2]]
#> # A tibble: 10 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 30.4 4 75.7 52 4.93 1.62 18.5 1 1 4 2
#> 2 16.4 8 276. 180 3.07 4.07 17.4 0 0 3 3
#> 3 21.5 4 120. 97 3.7 2.46 20.0 1 0 3 1
#> 4 33.9 4 71.1 65 4.22 1.84 19.9 1 1 4 1
#> 5 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
#> 6 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> 7 10.4 8 472 205 2.93 5.25 18.0 0 0 3 4
#> 8 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
#> 9 32.4 4 78.7 66 4.08 2.2 19.5 1 1 4 1
#> 10 21.4 4 121 109 4.11 2.78 18.6 1 1 4 2
#>
#> [[3]]
#> # A tibble: 6 × 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 15.5 8 318 150 2.76 3.52 16.9 0 0 3 2
#> 2 15.8 8 351 264 4.22 3.17 14.5 0 1 5 4
#> 3 17.3 8 276. 180 3.07 3.73 17.6 0 0 3 3
#> 4 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
#> 5 15 8 301 335 3.54 3.57 14.6 0 1 5 8
#> 6 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
#>
uncbind(dplyr::tibble(mtcars), probs = c(0.5, 0.3, 0.2))
#> [[1]]
#> # A tibble: 32 × 6
#> hp wt gear disp am vs
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 110 2.62 4 160 1 0
#> 2 110 2.88 4 160 1 0
#> 3 93 2.32 4 108 1 1
#> 4 110 3.22 3 258 0 1
#> 5 175 3.44 3 360 0 0
#> 6 105 3.46 3 225 0 1
#> 7 245 3.57 3 360 0 0
#> 8 62 3.19 4 147. 0 1
#> 9 95 3.15 4 141. 0 1
#> 10 123 3.44 4 168. 0 1
#> # ℹ 22 more rows
#>
#> [[2]]
#> # A tibble: 32 × 3
#> qsec cyl carb
#> <dbl> <dbl> <dbl>
#> 1 16.5 6 4
#> 2 17.0 6 4
#> 3 18.6 4 1
#> 4 19.4 6 1
#> 5 17.0 8 2
#> 6 20.2 6 1
#> 7 15.8 8 4
#> 8 20 4 2
#> 9 22.9 4 2
#> 10 18.3 6 4
#> # ℹ 22 more rows
#>
#> [[3]]
#> # A tibble: 32 × 2
#> mpg drat
#> <dbl> <dbl>
#> 1 21 3.9
#> 2 21 3.9
#> 3 22.8 3.85
#> 4 21.4 3.08
#> 5 18.7 3.15
#> 6 18.1 2.76
#> 7 14.3 3.21
#> 8 24.4 3.69
#> 9 22.8 3.92
#> 10 19.2 3.92
#> # ℹ 22 more rows
#>
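The pieces returned above are meant to be rejoinable. The following sketch checks a round trip for both functions using dplyr::bind_rows() and dplyr::bind_cols(). It assumes that the messy package (taken here to be the source of unrbind() and uncbind()) and dplyr are installed. Because shuffle defaults to TRUE, the rejoined rows or columns will generally be in a different order from the original, so only dimensions and column names are compared.

```r
# Assumption: unrbind() and uncbind() come from the messy package.
library(messy)

cars <- dplyr::tibble(mtcars)

# Row-wise split, then rejoin with bind_rows().
row_parts <- unrbind(cars, probs = c(0.5, 0.3, 0.2))
rejoined_rows <- dplyr::bind_rows(row_parts)
dim(rejoined_rows)

# Column-wise split, then rejoin with bind_cols().
col_parts <- uncbind(cars, probs = c(0.5, 0.3, 0.2))
rejoined_cols <- dplyr::bind_cols(col_parts)
dim(rejoined_cols)

# The same columns should be present, possibly in a different order.
setequal(names(rejoined_cols), names(cars))
```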