@Mytherin

In this PR we remove the per-vector cardinality fields (sel_vector and count). Instead, each vector holds a reference to a VectorCardinality object that stores its cardinality information. DataChunk inherits from VectorCardinality, and the vectors that live within a chunk all reference their parent chunk as their cardinality.
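
The relationship described above can be sketched roughly as follows. This is a simplified illustration, not the actual DuckDB code: the member names (`cardinality_`, `data`), the `SetCardinality` helper, the `sel_t` alias, and the `kVectorCapacity` constant are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the types named in this PR.
using sel_t = uint16_t;
constexpr size_t kVectorCapacity = 1024; // assumed capacity for this sketch

// Shared cardinality: how many rows are valid and which rows are selected.
struct VectorCardinality {
    size_t count = 0;
    sel_t *sel_vector = nullptr; // nullptr means rows 0..count-1 in order
};

// A vector no longer stores count/sel_vector itself; it only refers to
// the cardinality object it was created with.
class Vector {
public:
    explicit Vector(const VectorCardinality &cardinality) : cardinality_(cardinality) {}
    size_t size() const { return cardinality_.count; }
    const sel_t *sel_vector() const { return cardinality_.sel_vector; }

private:
    const VectorCardinality &cardinality_;
};

// The chunk itself is the cardinality of every vector it owns.
class DataChunk : public VectorCardinality {
public:
    explicit DataChunk(size_t column_count) {
        data.reserve(column_count);
        for (size_t i = 0; i < column_count; i++) {
            data.emplace_back(*this); // each vector references the parent chunk
        }
    }
    // Copying would leave the copied vectors referencing the old chunk.
    DataChunk(const DataChunk &) = delete;
    DataChunk &operator=(const DataChunk &) = delete;

    // The only place the cardinality of the chunk's vectors can change.
    void SetCardinality(size_t new_count) {
        assert(new_count <= kVectorCapacity);
        count = new_count;
    }

    std::vector<Vector> data;
};
```

In this sketch, changing the count on the chunk is immediately visible through every vector in `data`, since none of them keeps a copy of it.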

This change removes a lot of duplicate information: in a chunk, all vectors are supposed to have the same count and selection vector. However, previously this was not enforced directly, but rather continuously checked using the DataChunk::Verify method. Now, all the vectors within a DataChunk are enforced to have the same cardinality.

As a consequence, the cardinality of an individual vector can no longer be set; instead, the parent VectorCardinality must be modified. This cleans up the ExpressionExecutor as well: the vectors that live in the expression executor now all have the same cardinality, so a function can no longer (by accident) return a vector with a different cardinality.
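
A minimal sketch of what "modify the parent" means in practice, assuming simplified types that are not DuckDB's real definitions (the `values` member and the `Sum` helper are hypothetical and exist only for the example):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified types; not DuckDB's actual definitions.
struct VectorCardinality {
    size_t count = 0;
    const uint16_t *sel_vector = nullptr; // nullptr: rows 0..count-1 in order
};

struct Vector {
    const VectorCardinality &cardinality; // shared, read-only view
    std::vector<int32_t> values;

    // Sums the selected rows. There is deliberately no setter for count
    // or sel_vector on the vector itself.
    int64_t Sum() const {
        int64_t total = 0;
        for (size_t i = 0; i < cardinality.count; i++) {
            size_t row = cardinality.sel_vector ? cardinality.sel_vector[i] : i;
            assert(row < values.size());
            total += values[row];
        }
        return total;
    }
};
```

With two vectors `a` and `b` built over the same `VectorCardinality`, installing a selection vector on the shared object changes which rows both of them visit, so the two cannot drift apart.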

Vectors that have their own cardinality still exist: they are now called FlatVector. This vector inherits from the regular Vector, but has its own VectorCardinality property. It is needed in surprisingly few places, however, and should generally be avoided unless there is a good reason to use it.
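
One way to picture the difference between a regular Vector and a FlatVector is the sketch below. It is an illustration under assumptions, not the implementation in this PR: storing the cardinality as a pointer, the protected default constructor, and the `SetCount` helper are all hypothetical choices made so the example stays self-contained.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified types; not DuckDB's actual definitions.
struct VectorCardinality {
    size_t count = 0;
};

// Regular vector: borrows its cardinality from someone else (e.g. a chunk).
class Vector {
public:
    explicit Vector(const VectorCardinality &cardinality) : cardinality_(&cardinality) {}
    size_t size() const {
        assert(cardinality_ != nullptr);
        return cardinality_->count;
    }

protected:
    Vector() = default; // for subclasses that supply the cardinality themselves
    const VectorCardinality *cardinality_ = nullptr;
};

// FlatVector: a Vector that owns its cardinality instead of borrowing it.
class FlatVector : public Vector {
public:
    explicit FlatVector(size_t count = 0) {
        owned_cardinality_.count = count;
        cardinality_ = &owned_cardinality_; // point the base at our own member
    }
    // Copying would leave the copy pointing at the original's cardinality.
    FlatVector(const FlatVector &) = delete;
    FlatVector &operator=(const FlatVector &) = delete;

    void SetCount(size_t count) { owned_cardinality_.count = count; }

private:
    VectorCardinality owned_cardinality_;
};
```

Here a `FlatVector` can change its count independently, while a plain `Vector` only ever reflects whatever its shared `VectorCardinality` says.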

@Mytherin Mytherin merged commit 8930a4b into master Feb 18, 2020
@Mytherin Mytherin deleted the removedupinfofromvector branch February 18, 2020 14:40