Issue via Email

B. Machine Learning, Chapter 3, Regression & Model Assessment, Section 3.6.1 k-Fold Cross-Validation. In the code snippets that compute the 10-fold cross-validation error, the training set is assembled incorrectly and contains redundant data.

Instead of


train = pd.concat([soldata[splits[i]:], soldata[splits[i + 1]:]])

it should be

train = pd.concat([soldata[:splits[i]], soldata[splits[i + 1] :]])

The effect is fairly minor when the whole dataset is used, but when only a subset of the data is considered, the error varies significantly with the choice of k.
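
To make the difference between the two slices concrete, here is a minimal sketch on a toy DataFrame. The 20-row `soldata` stand-in, the `id` column, the value `k = 4`, the way `splits` is built, and the helper `fold_report` are all assumptions for illustration, not the book's actual data or code. It counts, for one fold, how many training rows are duplicated and how many also appear in the test fold.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for soldata: 20 rows, each with a unique id.
soldata = pd.DataFrame({"id": np.arange(20)})

k = 4  # assumed number of folds for this toy example
N = len(soldata)
# Assumed fold boundaries: fold i covers rows splits[i] up to splits[i + 1].
splits = list(range(0, N + N // k, N // k))


def fold_report(i):
    """Compare the original and the proposed training slices for fold i."""
    test = soldata[splits[i]:splits[i + 1]]
    # Slice as quoted in the issue: starts at the test fold instead of ending there.
    buggy = pd.concat([soldata[splits[i]:], soldata[splits[i + 1]:]])
    # Proposed fix: everything before the test fold plus everything after it.
    fixed = pd.concat([soldata[:splits[i]], soldata[splits[i + 1]:]])
    report = {}
    for name, train in (("buggy", buggy), ("fixed", fixed)):
        report[name + "_rows"] = len(train)
        report[name + "_duplicates"] = int(train["id"].duplicated().sum())
        report[name + "_test_overlap"] = int(train["id"].isin(test["id"]).sum())
    return report


if __name__ == "__main__":
    for key, value in fold_report(1).items():
        print(key, value)
```

In this toy setup the original slice for fold 1 includes the test fold itself, repeats every row after it, and drops the rows before it, while the corrected slice has exactly `N - N // k` unique rows and no overlap with the test fold.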


Issue via Email #250

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue via Email #250

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions