I'd like to split a data frame into 4 equal parts, because I'd like to use the 4 cores of my computer.
I did this:
df2 <- split(df, 1:4)
unsplit(df2, f=1:4)
and also this:
df2 <- split(df, 1:4)
unsplit(df2, f=c('1','2','3','4'))
But the unsplit function did not work, and I get these warning messages:
1: In split.default(seq_along(x), f, drop = drop, ...) : data length is not a multiple of split variable ...
Does anyone have an idea of the reason?
How many rows are in df? You will get that warning if the number of rows in your table is not divisible by 4. I think you are using the split factor f incorrectly, unless what you want to do is put each subsequent row into a different split data.frame.
If you want to split your data into 4 data frames, one row after the other, then make your splitting factor the same size as the number of rows in your data frame using rep_len, like this:
## Split like this:
split(df , f = rep_len(1:4, nrow(df) ) )
## Unsplit like this:
unsplit( split(df , f = rep_len(1:4, nrow(df) ) ) , f = rep_len(1:4,nrow(df) ) )
Hopefully this example illustrates why the warning occurs and how to avoid it (i.e. use a proper splitting factor!).
## We want to split our data.frame into 2 halves, but the number of rows is not divisible by 2
df <- data.frame( x = runif(5) )
df

## Splitting still works, but...
## we get a warning because the split factor 'f' cannot be recycled a whole number of times
## (the number of rows is not a multiple of its length)
split( df , f = 1:2 )
#$`1`
#          x
#1 0.6970968
#3 0.5614762
#5 0.5910995
#
#$`2`
#          x
#2 0.6206521
#4 0.1798006
#
#Warning message:
#In split.default(x = seq_len(nrow(x)), f = f, drop = drop, ...) :
#  data length is not a multiple of split variable

## Instead, let's use the same split levels (1:2)...
## but make the factor as long as the number of rows in the table:
splt <- rep_len( 1:2 , nrow(df) )
splt
#[1] 1 2 1 2 1

## Split works, and f is not recycled because there are
## the same number of values in 'f' as rows in the table
split( df , f = splt )
#$`1`
#          x
#1 0.6970968
#3 0.5614762
#5 0.5910995
#
#$`2`
#          x
#2 0.6206521
#4 0.1798006

## And unsplitting then works as expected and reconstructs our original data.frame
unsplit( split( df , f = splt ) , f = splt )
#          x
#1 0.6970968
#2 0.6206521
#3 0.5614762
#4 0.1798006
#5 0.5910995
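
For the 4-core case from the question, the same idea might look like the sketch below. The data frame here is made up for illustration (10 rows, deliberately not divisible by 4), and the names f, parts, restored, f_blocks and blocks are my own. The last part, which sorts the factor to get contiguous blocks instead of interleaved rows, is an optional variation and not something the approach above requires.

```r
## Hypothetical example data: 10 rows, which is not divisible by 4
df <- data.frame(id = 1:10, x = (1:10) / 10)

## One factor value per row, cycling through 1, 2, 3, 4
f <- rep_len(1:4, nrow(df))

## Split into 4 parts; no warning, because length(f) == nrow(df)
parts <- split(df, f)
sapply(parts, nrow)
## 1 2 3 4
## 3 3 2 2

## Unsplit with the very same factor to get the rows back in their original order
restored <- unsplit(parts, f)
all(restored$id == df$id)
## [1] TRUE

## Optional variation (an assumption about what you might prefer):
## sorting the factor gives 4 contiguous blocks of rows instead of interleaved rows
f_blocks <- sort(f)
blocks <- split(df, f_blocks)
blocks[[1]]$id
## [1] 1 2 3
```

Either way, the part sizes differ by at most one row, so the work should be spread fairly evenly over the 4 cores. Just keep the factor around, since unsplit needs the same f that was used for split.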