Skorch inference in cross validation #1062

@faridehm

Hello everybody,
I'm using cross-validation (CV) for a classification problem. I split my data into train and test sets; I use the train set for the CV model and the test set for the inference step.
My code for running CV and early stopping at the same time is:

import torch
from torch import nn
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import cross_validate
from skorch import NeuralNetClassifier
from skorch.callbacks import EarlyStopping

# SimpleNN, pos_weight, device, X_train and y_train are defined earlier.
net = NeuralNetClassifier(
    module=SimpleNN,
    max_epochs=300,
    lr=0.001,
    train_split=False,  # no internal validation split; CV handles the folds
    # train_split=predefined_split(valid_ds)
    # module__dropout=0.2,
    iterator_train__batch_size=10,
    iterator_train__shuffle=True,
    iterator_valid__batch_size=10,
    iterator_valid__shuffle=False,
    criterion=nn.BCEWithLogitsLoss(weight=pos_weight),
    optimizer=torch.optim.AdamW,
    optimizer__weight_decay=0.01,
    # With train_split=False there is no valid_loss, so monitor train_loss
    callbacks=[EarlyStopping(patience=5, monitor='train_loss')],
    device=device,
)
# Train the model
print('Using...', device)
print("Training started...")
# Note: this scoring dict is defined but not passed to cross_validate below
scoring = {'prec_macro': 'precision_macro',
           'rec_macro': make_scorer(recall_score, average='macro')}
scores = cross_validate(net, X_train.to(torch.float64), y_train, scoring='accuracy',
                        return_train_score=True, cv=5, error_score='raise')
sorted(scores.keys())
scores
print("Training completed!")

Now, when I try to save the model with the following code, it returns the error "NotInitializedError: Cannot save state of an un-initialized model. Please initialize first by calling .initialize() or by fitting the model with .fit(...)."

net.save_params(
    f_params='model.pkl', f_optimizer='opt.pkl', f_history='history.json')
new_net = NeuralNetClassifier(
    module = SimpleNN,
    max_epochs = 300,
    lr = 0.001,
    train_split=False,
    # train_split=predefined_split(valid_ds)
    # module__dropout=0.2,
    # train_split=predefined_split(dataset_valid),
    iterator_train__batch_size = 10,
    iterator_train__shuffle = True,
    iterator_valid__batch_size =10,
    iterator_valid__shuffle = False,
    criterion = nn.BCEWithLogitsLoss(weight=pos_weight),
    optimizer = torch.optim.AdamW,
    optimizer__weight_decay=0.01,
    callbacks = [EarlyStopping(patience=5, monitor='train_loss')],
    device = device
)

new_net.initialize() # This is important!
new_net.load_params(
    f_params='model.pkl', f_optimizer='opt.pkl', f_history='history.json')

new_net.fit(np.array(X_test, dtype=float), y_test) 

Is this code reliable for CV? I want to save 5 separate models for the 5-fold CV, but I could not find any related documentation. I would appreciate any advice.
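
For context on the error and the "5 separate models" goal, here is a minimal sketch rather than a definitive recipe. scikit-learn's cross_validate fits clones of the estimator it is given, so the original object stays unfitted afterwards, which would explain the NotInitializedError raised by net.save_params. Passing return_estimator=True hands back the fitted clone for each fold. The sketch uses a LogisticRegression stand-in, synthetic data, and hypothetical file names (fold_0.pkl, ...) so that it runs without skorch; these are assumptions, not part of the original code.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.utils.validation import check_is_fitted

# Synthetic binary data (assumption: stand-in for X_train / y_train).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Stand-in estimator (assumption); cross_validate clones any
# scikit-learn compatible estimator before fitting it on each fold.
est = LogisticRegression()

scores = cross_validate(
    est, X, y,
    scoring="accuracy",
    cv=5,
    return_train_score=True,
    return_estimator=True,  # keep the fitted clone from every fold
)

# The object passed in was never fitted itself; only its clones were.
try:
    check_is_fitted(est)
    original_is_fitted = True
except NotFittedError:
    original_is_fitted = False

# One fitted model per fold.
fold_models = scores["estimator"]

# Save each fold's model to its own file (hypothetical file names).
out_dir = Path(tempfile.mkdtemp())
saved_paths = []
for i, model in enumerate(fold_models):
    path = out_dir / f"fold_{i}.pkl"
    with open(path, "wb") as f:
        pickle.dump(model, f)
    saved_paths.append(path)
```

With a skorch net in place of the stand-in, each entry of scores["estimator"] would be a fitted net, so calling save_params(f_params=...) on it with a per-fold file name should work in place of pickle. Separately, calling fit on the test data after load_params retrains the model; for inference, predict or predict_proba on the loaded net is probably what is intended.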
