[R-package] cannot use slices of the same Dataset as training and validation datasets #6008

Open
@mocista

Description

I define an lgb.Dataset with free_raw_data = FALSE, slice it into two parts, and use the slices as the training and validation sets in lightgbm (R). The call to lgb.train(...) fails with an error:

please set ‘free_raw_data = FALSE’ when you construct lgb.Dataset

I don't understand why, since the Dataset was constructed with free_raw_data = FALSE. Can anyone help, please?

Reproducible example

library(lightgbm)

# Boston housing data; column 14 (medv) is the regression target
boston = MASS::Boston
str(boston)
dim(boston)

set.seed(12)
# Keep the raw data so the Dataset can be sliced later
boston_lgb_dataset = lgb.Dataset(scale(boston[, -14]), label = boston[, 14], free_raw_data = FALSE)

# Rows 1-350 for training, rows 351-506 for validation
dtrain = lightgbm::slice(boston_lgb_dataset, c(1:350))
dtest = lightgbm::slice(boston_lgb_dataset, c(351:506))

params = list(
  objective = "regression"
  , metric = "l2"
  , min_data = 1L
  , learning_rate = .3
)

# This call raises the error quoted above
model = lgb.train(
  params = params
  , data = dtrain
  , nrounds = 20L
  , valids = list(test = dtest)
)
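
For comparison, here is a minimal sketch of one possible workaround. It has not been confirmed anywhere in this issue. It assumes the error comes from the slices not carrying their own raw data, so it splits the raw matrix first and builds the validation set with lgb.Dataset.create.valid(). The names X, y, train_idx and test_idx are introduced here for illustration only.

```r
library(lightgbm)

# Assumption: same data as the example above; column 14 (medv) is the label
boston <- MASS::Boston
X <- scale(as.matrix(boston[, -14]))
y <- boston[, 14]

# Hypothetical index vectors matching the slices in the original example
train_idx <- 1:350
test_idx <- 351:506

# Build the training Dataset directly from the training rows
dtrain <- lgb.Dataset(X[train_idx, ], label = y[train_idx])

# Build the validation Dataset from its own raw rows, using dtrain as reference
dtest <- lgb.Dataset.create.valid(dtrain, X[test_idx, ], label = y[test_idx])

params <- list(
  objective = "regression"
  , metric = "l2"
  , min_data = 1L
  , learning_rate = 0.3
)

model <- lgb.train(
  params = params
  , data = dtrain
  , nrounds = 20L
  , valids = list(test = dtest)
)
```

This avoids slicing a constructed Dataset altogether. Whether slicing itself should work in this situation is still the open question of this issue.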

Environment info

R version 4.2.0

LightGBM version: 3.3.5
