Common method bias in PLS-SEM: A full collinearity
assessment approach
Ned Kock
Full reference:
Kock, N. (2015). Common method bias in PLS-SEM: A full collinearity assessment approach.
International Journal of e-Collaboration, 11(4), 1-10.
Abstract
We discuss common method bias in the context of structural equation modeling employing the
partial least squares method (PLS-SEM). Two datasets were created through a Monte Carlo
simulation to illustrate the discussion: one contaminated by common method bias, and the other
not contaminated. A practical approach is presented for the identification of common method
bias based on variance inflation factors generated via a full collinearity test. Our discussion
builds on an illustrative model in the field of e-collaboration, with outputs generated by the
software WarpPLS. We demonstrate that the full collinearity test is successful in the
identification of common method bias with a model that nevertheless passes standard convergent
and discriminant validity assessment criteria based on a confirmatory factor analysis.
KEYWORDS: Partial Least Squares; Structural Equation Modeling; Common Method Bias;
Monte Carlo Simulation
Introduction
The method of path analysis was developed by Wright (1934; 1960) to study causal
assumptions in the field of evolutionary biology (Kock, 2011), and now provides the foundation
on which structural equation modeling (SEM) rests. Both path analysis and SEM rely on the
creation of models expressing causal relationships through links among variables.
Two main types of SEM exist today: covariance-based and PLS-based SEM. While the former
relies on the minimization of differences between covariance matrices, the latter employs the
partial least squares method (PLS) developed by Herman Wold (Wold, 1980). PLS-based SEM
is often referred to simply as PLS-SEM, and is widely used in the field of e-collaboration and
many other fields.
Regardless of SEM flavor, models expressing causal assumptions include latent variables.
These latent variables are measured indirectly through other variables generally known as
indicators (Maruyama, 1998; Mueller, 1996). Indicator values are usually obtained from
questionnaires where answers are provided on numeric scales, of which the most commonly used
are Likert-type scales (Cohen et al., 2003).
Using questionnaires answered on Likert-type scales constitutes an integral part of an SEM
study’s measurement method. Common method bias is a phenomenon that is caused by the
measurement method used in an SEM study, and not by the network of causes and effects among
latent variables in the model being studied.
We provide a discussion of common method bias in PLS-SEM, and of a method for its
identification based on full collinearity tests (Kock & Lynn, 2012). Our discussion builds on an
illustrative model in the field of e-collaboration, with outputs from the software WarpPLS,
version 5.0 (Kock, 2015).
The algorithm used to generate latent variable scores based on indicators was PLS Mode A,
employing the path weighting scheme. While this is the algorithm-scheme combination most
commonly used in PLS-SEM, it is by no means the only combination available. The recent
emergence of factor-based PLS-SEM algorithms further broadened the space of existing
combinations (Kock, 2014).
We created two datasets based on a Monte Carlo simulation (Robert & Casella, 2005; Paxton
et al., 2001). One of the two datasets was contaminated by common method bias; the other was
not. We demonstrate that the full collinearity test is successful in the identification of common
method bias with a model that nevertheless passes standard validity assessment criteria based on
a confirmatory factor analysis.
In our discussion all variables are assumed to be standardized; i.e., scaled to have a mean of
zero and standard deviation of one. This has no impact on the generality of the discussion.
Standardization of any variable is accomplished by subtraction of its mean and division by its
standard deviation. A standardized variable can be rescaled back to its original scale by reversing
these operations.
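To make the preceding description concrete, a minimal sketch in Python follows (the function names are ours, purely for illustration; this is not WarpPLS code), assuming a one-dimensional NumPy array:

```python
import numpy as np

def standardize(x):
    """Scale a variable to a mean of zero and a standard deviation of one."""
    return (x - x.mean()) / x.std()

def rescale(z, mean, std):
    """Reverse standardization, recovering the original scale."""
    return z * std + mean
```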
What is common method bias?
Common method bias, in the context of PLS-SEM, is a phenomenon that is caused by the
measurement method used in an SEM study, and not by the network of causes and effects in the
model being studied. For example, the instructions at the top of a questionnaire may influence
the answers provided by different respondents in the same general direction, causing the
indicators to share a certain amount of common variation. Another possible cause of common
method bias is the implicit social desirability associated with answering questions in a
questionnaire in a particular way, again causing the indicators to share a certain amount of
common variation.
A mathematical understanding of common method bias can clarify some aspects of its nature.
The adoption of an illustrative model can help reduce the level of abstraction of a mathematical
exposition. Therefore, our discussion is based on the illustrative model depicted in Figure 1,
which is inspired by an actual empirical study in the field of e-collaboration (Kock, 2005; 2008;
Kock & Lynn, 2012). The illustrative model incorporates three latent variables, each measured
through six indicators. It assumes that the unit of analysis is the firm.
Figure 1. Illustrative model
The latent variables are: collaborative culture ($F_1$), the perceived degree to which a firm’s culture promotes continuous collaboration among its members to improve the firm’s productivity and the quality of the firm’s products; e-collaboration technology use ($F_2$), the perceived degree of use of e-collaboration technologies by the members of a firm; and competitive advantage ($F_3$), the perceived degree of competitive advantage that a firm possesses when compared with firms that compete with it.
Mathematically, if our model were not contaminated with common method bias, each of the six indicators $x_{ij}$ would be derived from its latent variable $F_i$ (of which there are three in the model) according to (1), where: $\lambda_{ij}$ is the loading of indicator $x_{ij}$ on $F_i$, $\theta_{ij}$ is the standardized indicator error term, and $\omega_{ij}$ is the weight of $\theta_{ij}$ with respect to $x_{ij}$.

$$x_{ij} = \lambda_{ij} F_i + \omega_{ij} \theta_{ij}, \quad i = 1 \ldots 3, \; j = 1 \ldots 6. \quad (1)$$

Since $\theta_{ij}$ and $F_i$ are assumed to be uncorrelated, the value of $\omega_{ij}$ in this scenario can be easily obtained as:

$$\omega_{ij} = \sqrt{1 - \lambda_{ij}^2}.$$
If our model were contaminated with common method bias, each of the six indicators $x_{ij}$ would be derived from its latent variable $F_i$ according to (2), where: $M$ is a standardized variable that represents common method variation, and $\omega_M$ is the common method weight (a.k.a. common method loading, or the positive square root of the common method variance).

$$x_{ij} = \lambda_{ij} F_i + \omega_M M + \omega_{ij} \theta_{ij}, \quad i = 1 \ldots 3, \; j = 1 \ldots 6. \quad (2)$$

In this scenario, the value of $\omega_{ij}$ can be obtained as:

$$\omega_{ij} = \sqrt{1 - \lambda_{ij}^2 - \omega_M^2}.$$
In (2) we assume that the common method weight $\omega_M$ is the same for all indicators. An alternative perspective assumes that the common method weight $\omega_M$ is not the same for all indicators, varying based on a number of factors. Two terms are used to refer to these different perspectives, namely congeneric and noncongeneric, although there is some confusion in the literature as to which term refers to what perspective.
Note that the term $\omega_M M$ introduces common variation that is shared by all indicators in the
model. Since latent variables aggregate indicators in PLS-SEM, this shared variation has the
effect of artificially increasing the level of collinearity among latent variables. As we will see
later, this also has the predictable effect of artificially increasing path coefficients.
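To reduce the level of abstraction further, the sketch below generates the six indicators of one latent variable according to equations (1) and (2). This is a hypothetical Python fragment of our own (not WarpPLS code): setting omega_m to zero reproduces the uncontaminated case of (1), and the use of normal random error terms is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(42)

def make_indicators(F, M, lam=0.7, omega_m=0.0, n_ind=6):
    """Generate n_ind indicators per equation (2); omega_m = 0 gives equation (1).

    F (latent variable scores) and M (common method variation) are assumed
    to be standardized and mutually uncorrelated."""
    # Error weight chosen so each indicator has unit population variance.
    omega_err = np.sqrt(1.0 - lam**2 - omega_m**2)
    errors = rng.standard_normal((len(F), n_ind))
    return lam * F[:, None] + omega_m * M[:, None] + omega_err * errors
```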
Data used in the analysis
We created two datasets of 300 rows of data, equivalent to 300 returned questionnaires, with
answers provided on Likert-type scales going from 1 to 7. This was done based on a Monte Carlo
simulation (Robert & Casella, 2005; Paxton et al., 2001). The data was created for the three
latent variables and the eighteen indicators (six per latent variable) in our illustrative model.
Using this method we started from a “true” model, which is a model for which we know the
nature and magnitude of all of the relationships among variables beforehand. One of the two
datasets was contaminated by common method bias; the other was not. In both datasets path
coefficients and loadings were set as follows:
$$\beta_{21} = \beta_{31} = \beta_{32} = .45.$$

$$\lambda_{ij} = .7, \quad i = 1 \ldots 3, \; j = 1 \ldots 6.$$

That is, all path coefficients were set as .45 and all indicator loadings as .7. In the dataset contaminated by common method bias, the common method weight was set to a value slightly lower than the indicator loadings:

$$\omega_M = .6.$$
In Monte Carlo simulations where samples of finite size are created, true sample coefficients
vary. Usually true sample coefficients vary according to a normal distribution centered on the
true population value. Given this, and since we created a single sample of simulated data, our
true sample coefficients differed from the true population coefficients.
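The sketch below shows one way such a sample could be generated, reusing the rng and make_indicators helper from the earlier fragment. It is a simplified illustration under our own assumptions: it produces continuous standardized data, omits the discretization onto 1-to-7 Likert-type scales, and is not the exact procedure used to create the datasets analyzed here.

```python
import numpy as np

def simulate_dataset(n=300, beta=0.45, omega_m=0.0):
    """Simulate one sample for the illustrative model.

    Structure: F1 -> F2, F1 -> F3, F2 -> F3, all paths set to beta;
    omega_m > 0 adds common method variation to every indicator."""
    M = rng.standard_normal(n)  # common method factor, shared by all indicators
    F1 = rng.standard_normal(n)
    F2 = beta * F1 + np.sqrt(1 - beta**2) * rng.standard_normal(n)
    # Var(beta*F1 + beta*F2) = beta^2 * (2 + 2*corr(F1, F2)), with corr = beta.
    explained = beta**2 * (2 + 2 * beta)
    F3 = beta * F1 + beta * F2 + np.sqrt(1 - explained) * rng.standard_normal(n)
    # Eighteen indicators: six per latent variable.
    return np.hstack([make_indicators(F, M, omega_m=omega_m) for F in (F1, F2, F3)])

X_no_cmb = simulate_dataset()             # uncontaminated dataset
X_cmb = simulate_dataset(omega_m=0.6)     # contaminated dataset
```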
Nevertheless, when we compared certain coefficients obtained via a PLS-SEM analysis for the
two datasets, with and without contamination, the effects of common method bias became
visible. This is particularly true for path coefficients, which tend to be inflated by common
method bias. As noted earlier, path coefficient inflation is a predictable outcome of shared
variation among latent variables.
Path coefficient inflation
Table 1 shows the path coefficients for the models not contaminated by common method bias
(No CMB) and contaminated (CMB). As we can see, all three path coefficients were greater in
the model contaminated by common method bias. The differences among path coefficients
ranged from a little over 20 to nearly 40 percent.
Table 1. Path coefficients
𝜷𝟐𝟏 𝜷𝟑𝟏 𝜷𝟑𝟐
No CMB .447 .409 .357
CMB .625 .512 .435
Note: CMB = common method bias.
This path coefficient inflation effect is one of the key reasons why researchers are concerned
about common method bias, as it may cause type I errors (false positives). Nevertheless,
common method bias may also be associated with path coefficient deflation, potentially leading
to type II errors (false negatives).
As we can see, the inflation effect can lead to marked differences in path coefficients. In the case of the path coefficient $\beta_{21}$, the difference is approximately 39.82 percent: $(.625 - .447)/.447 \approx .3982$. As noted
earlier, path coefficient inflation occurs because common variation is introduced, being shared
by all indicators in the model. As latent variables aggregate indicators, they also incorporate the
common variation, leading to an increase in the level of collinearity among latent variables.
Greater collinearity levels in turn lead to inflated path coefficients.
One of the goals of a confirmatory factor analysis is to assess two main types of validity in a
model: convergent and discriminant validity. Acceptable convergent validity occurs when
indicators load strongly on their corresponding latent variables. Acceptable discriminant validity
occurs when the correlations between a latent variable and the other latent variables in a model are
lower than a measure of communality among the latent variable indicators.
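Under the usual operationalizations of these two criteria (loadings compared against a fixed threshold, and the square root of the AVE compared against latent variable correlations, often referred to as the Fornell-Larcker criterion), the checks can be expressed compactly as in the Python sketch below; the function names are ours and the fragment is illustrative only.

```python
import numpy as np

def ave(loadings):
    """Average variance extracted: the mean squared loading of a latent variable."""
    return np.mean(np.asarray(loadings) ** 2)

def convergent_ok(loadings, threshold=0.5):
    """Convergent validity check: all indicator loadings at or above the threshold."""
    return bool(np.all(np.asarray(loadings) >= threshold))

def discriminant_ok(lv_corr, aves):
    """Discriminant validity check: the square root of each latent variable's
    AVE must exceed its correlations with all other latent variables."""
    root_aves = np.sqrt(np.asarray(aves))
    corr = np.asarray(lv_corr, dtype=float).copy()
    np.fill_diagonal(corr, 0.0)  # ignore each variable's self-correlation
    return all(root_aves[i] > np.abs(corr[i]).max() for i in range(len(root_aves)))
```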
Given these expectations underlying acceptable convergent and discriminant validity, one
could expect that a confirmatory factor analysis would allow for the identification of common
method bias. In fact, many researchers in the past have proposed the use of confirmatory factor
analysis as a more desirable alternative to Harman’s single-factor test – a widely used common
method bias test that relies on exploratory factor analysis. Unfortunately, as we will see in the
next section, conducting a confirmatory factor analysis is not a very effective way of identifying
common method bias. Models may pass criteria for acceptable convergent and discriminant
validity, and still be contaminated by common method bias.
Confirmatory factor analysis
Table 2 is a combined display showing loadings and cross-loadings. Loadings, shown in
shaded cells, are unrotated. Cross-loadings are oblique-rotated. Acceptable convergent validity
would normally be assumed if the loadings were all above a certain threshold, typically .5. As we
can see, all loadings pass this test. This is the case for both models, with and without common
method bias contamination. That is, both models present acceptable convergent validity.
These results highlight one interesting aspect of the common method bias phenomenon in the
context of PLS-SEM. There appears to be a marked inflation in loadings, similar to what was observed for path coefficients. Since convergent validity relies on the comparison of loadings against a fixed threshold, it follows that common method bias would tend to artificially
increase the level of convergent validity of a model.
Table 3 shows correlations among latent variables and square roots of average variances
extracted (AVEs). The latter are shown in shaded cells, along diagonals. Acceptable discriminant
validity would typically be assumed if the number in the diagonal cell for each column is greater
than any of the other numbers in the same column.
Table 2. Assessing convergent validity
No CMB CMB
𝑭𝟏 𝑭𝟐 𝑭𝟑 𝑭𝟏 𝑭𝟐 𝑭𝟑
𝒙𝟏𝟏 .742 .010 -.095 .902 .072 -.075
𝒙𝟏𝟐 .730 .029 .010 .912 .060 -.100
𝒙𝟏𝟑 .772 .051 -.043 .900 -.075 .054
𝒙𝟏𝟒 .771 -.061 .109 .891 .004 -.064
𝒙𝟏𝟓 .766 .004 .042 .913 -.085 .176
𝒙𝟏𝟔 .729 -.033 -.044 .890 .026 .001
𝒙𝟐𝟏 .022 .690 -.102 .011 .900 .031
𝒙𝟐𝟐 -.060 .709 -.027 -.003 .892 -.063
𝒙𝟐𝟑 .049 .701 .005 .080 .893 -.113
𝒙𝟐𝟒 .018 .766 .031 -.068 .921 .077
𝒙𝟐𝟓 -.106 .731 .040 .020 .905 .002
𝒙𝟐𝟔 .055 .766 .033 -.036 .924 .057
𝒙𝟑𝟏 .022 -.003 .721 .020 -.005 .911
𝒙𝟑𝟐 -.039 .029 .712 .052 -.013 .908
𝒙𝟑𝟑 -.029 -.063 .693 -.003 -.012 .913
𝒙𝟑𝟒 -.018 -.008 .724 -.037 .035 .909
𝒙𝟑𝟓 .013 -.060 .754 -.065 -.072 .920
𝒙𝟑𝟔 .041 .088 .762 .030 .065 .903
Notes: CMB = common method bias; loadings are unrotated and cross-loadings are oblique-rotated; loadings shown
in shaded cells.
Table 3. Assessing discriminant validity
No CMB CMB
𝑭𝟏 𝑭𝟐 𝑭𝟑 𝑭𝟏 𝑭𝟐 𝑭𝟑
𝑭𝟏 .752 .447 .568 .901 .625 .785
𝑭𝟐 .447 .728 .540 .625 .906 .756
𝑭𝟑 .568 .540 .728 .785 .756 .911
Notes: Square roots of average variances extracted (AVEs) shown on shaded diagonal.
That is, if the square root of the AVE for a given latent variable is greater than any correlation
involving that latent variable, and this applies to all latent variables in a model, then the model
presents acceptable discriminant validity. As we can see, this is the case for both of our models,
with and without common method bias contamination. Both models can thus be assumed to
display acceptable discriminant validity.
Here we see another interesting aspect of the common method bias phenomenon in the context
of PLS-SEM. While correlations among latent variables increase, the same happens with the
AVEs. This simultaneous increase in correlations and AVEs is what undermines the potential of
a discriminant validity check in the identification of common method bias.
In summary, two key elements of a traditional confirmatory factor analysis are a convergent validity test and a discriminant validity test. According to our analysis, neither test seems to be
very effective in the identification of common method bias. An analogous analysis was
conducted by Kock & Lynn (2012), which prompted them to offer the full collinearity test as an
effective alternative for the identification of common method bias.
The full collinearity test
Collinearity has classically been defined as a predictor-predictor phenomenon in multiple
regression models. In this traditional perspective, when two or more predictors measure the same
underlying construct, or a facet of such a construct, they are said to be collinear. This definition is
restricted to classic, or vertical, collinearity.
Lateral collinearity is defined as a predictor-criterion phenomenon, whereby a predictor
variable measures the same underlying construct, or a facet of such a construct, as a variable to
which it points in a model. The latter is the criterion variable in the predictor-criterion
relationship of interest.
Kock & Lynn (2012) proposed the full collinearity test as a comprehensive procedure for the
simultaneous assessment of both vertical and lateral collinearity (see, also, Kock & Gaskins,
2014). Through this procedure, which is fully automated by the software WarpPLS, variance
inflation factors (VIFs) are generated for all latent variables in a model. The occurrence of a VIF
greater than 3.3 is proposed as an indication of pathological collinearity, and also as an
indication that a model may be contaminated by common method bias. Therefore, if all VIFs
resulting from a full collinearity test are equal to or lower than 3.3, the model can be considered
free of common method bias.
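Computationally, the test is simple once latent variable scores are available: each latent variable is regressed on all of the others, and its VIF is computed as $1/(1 - R^2)$. A minimal Python sketch of this computation follows; in an actual study the scores would come from a PLS-SEM run, and WarpPLS automates the whole procedure.

```python
import numpy as np

def full_collinearity_vifs(scores):
    """Full collinearity VIFs in the spirit of Kock & Lynn (2012).

    scores: (n, k) array of latent variable scores. Each variable is
    regressed on all the others, and VIF = 1 / (1 - R^2)."""
    n, k = scores.shape
    vifs = np.empty(k)
    for i in range(k):
        y = scores[:, i]
        X = np.column_stack([np.ones(n), np.delete(scores, i, axis=1)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        r2 = 1.0 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()
        vifs[i] = 1.0 / (1.0 - r2)
    return vifs

# Flag possible common method bias if any VIF exceeds the 3.3 threshold
# (lv_scores is a hypothetical array of scores from a PLS-SEM run):
# contaminated = bool(np.any(full_collinearity_vifs(lv_scores) > 3.3))
```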
Table 4 shows the VIFs obtained for all the latent variables in both of our models, based on a
full collinearity test. As we can see, the model contaminated with common method bias includes
a latent variable ($F_3$) with a VIF greater than 3.3, which is shown in a shaded cell. That is, the common
method bias test proposed by Kock & Lynn (2012), based on the full collinearity test procedure,
seems to succeed in the identification of common method bias.
Table 4. Full collinearity VIFs
𝑭𝟏 𝑭𝟐 𝑭𝟑
No CMB 1.541 1.472 1.739
CMB 2.619 2.347 3.720
Note: CMB = common method bias.
While it is noteworthy that the full collinearity test was successful in the identification of common method bias in a situation where a confirmatory factor analysis was not, this success is not entirely surprising given our previous discussion based on the mathematics underlying common method bias. That discussion clearly points to an increase in the overall level of collinearity in a model, corresponding to an increase in the full collinearity VIFs for the latent variables in the model, as a clear outcome of common method bias.
Discussion and conclusion
There is disagreement among methodological researchers about the nature of common method
bias, how it should be addressed, and even whether it should be addressed at all. Richardson et al.
(2009) discuss various perspectives about common method bias, including the perspective put
forth by Spector (1987) that common method bias is an “urban legend”. Assuming that the
problem is real, what can we do to avoid common method bias in the first place? A seminal
source in this respect is Podsakoff et al. (2003), who provide a number of suggestions on how to
avoid the introduction of common method bias during data collection.
Our discussion focuses on the identification of common method bias based on full collinearity
assessment, whereby a model is checked for the existence of both vertical and lateral collinearity
(Kock & Gaskins, 2014; Kock & Lynn, 2012). If we find evidence of common method bias, is
there anything we can do to eliminate or at least reduce it? The answer is arguably “yes”, and,
given the focus of our discussion, the steps discussed by Kock & Lynn (2012) for dealing with
collinearity are an obvious choice: indicator removal, indicator re-assignment, latent variable
removal, latent variable aggregation, and hierarchical analysis. Readers are referred to that
publication for details on how and when to implement these steps.
Full collinearity VIFs tend to increase with model complexity, in terms of the number of latent
variables in the model, because: (a) the likelihood that questions associated with different
indicators will overlap in perceived meaning goes up as the size of a questionnaire increases,
which should happen as the number of constructs covered grows; and (b) the likelihood that
latent variables will overlap in terms of the facets of the constructs to which they refer goes up as
more latent variables are added to a model.
Models found in empirical research studies in the field of e-collaboration typically contain
more than three latent variables. This applies to many other fields where path analysis and SEM
are employed. Therefore, we can reasonably conclude that our illustration of the full collinearity
test of common method bias discussed here is conservative in its demonstration of the likely
effectiveness of the test in actual empirical studies.
Kock & Lynn (2012) pointed out that classic PLS-SEM algorithms are particularly effective at
reducing model-wide collinearity, because those algorithms maximize the variance explained in
latent variables by their indicators. Such maximization is due in part to classic PLS-SEM
algorithms not modeling measurement error, essentially assuming that it is zero. As such, the
indicators associated with a latent variable always explain 100 percent of the variance in the
latent variable.
Nevertheless, one of the key downsides of classic PLS-SEM algorithms is that path
coefficients tend to be attenuated (Kock, 2015b). In a sense, they reduce collinearity levels “too
much”. The recently proposed factor-based PLS-SEM algorithms (Kock, 2014) address this
problem. Given this, one should expect the use of factor-based PLS-SEM algorithms to yield
slightly higher full collinearity VIFs than classic PLS-SEM algorithms, with those slightly higher
VIFs being a better reflection of the true values.
Consequently, the VIF threshold used in common method bias tests should arguably be
somewhat higher than 3.3 when factor-based PLS-SEM algorithms are used. In their discussion
of possible thresholds, Kock & Lynn (2012) note that a VIF of 5 could be employed when
algorithms that incorporate measurement error are used. Even though they made this remark in
reference to covariance-based SEM algorithms, the remark also applies to factor-based PLS-
SEM algorithms, as both types of algorithms incorporate measurement error.
Our goal here is to help empirical researchers who need practical and straightforward
methodological solutions to assess the overall quality of their measurement frameworks. To that
end, we discussed and demonstrated a practical approach whereby researchers can conduct
common method bias assessment based on a full collinearity test of a model. Our discussion was
illustrated with outputs of the software WarpPLS (Kock, 2015), in the context of e-collaboration
research. Nevertheless, our discussion arguably applies to any field where path analysis and
SEM can be used.
Acknowledgments
The author is the developer of the software WarpPLS, which has over 7,000 users in more
than 33 different countries at the time of this writing, and moderator of the PLS-SEM e-mail
distribution list. He is grateful to those users, and to the members of the PLS-SEM e-mail
distribution list, for questions, comments, and discussions on topics related to SEM and to the
use of WarpPLS.
References
Cohen, J., Cohen, P., West, S.G., & Aiken, L.S. (2003). Applied multiple regression/correlation
analysis for the behavioral sciences. Mahwah, N.J.: L. Erlbaum Associates.
Kock, N. (2005). What is e-collaboration? International Journal of e-Collaboration, 1(1), 1-7.
Kock, N. (2008). E-collaboration and e-commerce in virtual worlds: The potential of Second
Life and World of Warcraft. International Journal of e-Collaboration, 4(3), 1-13.
Kock, N. (2011). A mathematical analysis of the evolution of human mate choice traits:
Implications for evolutionary psychologists. Journal of Evolutionary Psychology, 9(3), 219-
247.
Kock, N. (2014). A note on how to conduct a factor-based PLS-SEM analysis. Laredo, TX:
ScriptWarp Systems.
Kock, N. (2015). WarpPLS 5.0 User Manual. Laredo, TX: ScriptWarp Systems.
Kock, N. (2015b). One-tailed or two-tailed P values in PLS-SEM? International Journal of e-
Collaboration, 11(2), 1-7.
Kock, N., & Gaskins, L. (2014). The mediating role of voice and accountability in the
relationship between Internet diffusion and government corruption in Latin America and
Sub-Saharan Africa. Information Technology for Development, 20(1), 23-43.
Kock, N., & Lynn, G.S. (2012). Lateral collinearity and misleading results in variance-based
SEM: An illustration and recommendations. Journal of the Association for Information
Systems, 13(7), 546-580.
Maruyama, G.M. (1998). Basics of structural equation modeling. Thousand Oaks, CA: Sage
Publications.
Mueller, R.O. (1996). Basic principles of structural equation modeling. New York, NY:
Springer.
Paxton, P., Curran, P.J., Bollen, K.A., Kirby, J., & Chen, F. (2001). Monte Carlo experiments:
Design and implementation. Structural Equation Modeling, 8(2), 287-312.
Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., & Podsakoff, N.P. (2003). Common method biases
in behavioral research: A critical review of the literature and recommended remedies.
Journal of Applied Psychology, 88(5), 879-903.
Richardson, H.A., Simmering, M.J., & Sturman, M.C. (2009). A tale of three perspectives:
Examining post hoc statistical techniques for detection and correction of common method
variance. Organizational Research Methods, 12(4), 762-800.
Robert, C.P., & Casella, G. (2005). Monte Carlo statistical methods. New York, NY: Springer.
Spector, P.E. (1987). Method variance as an artifact in self-reported affect and perceptions at
work: Myth or significant problem? Journal of Applied Psychology, 72(3), 438-443.
Wold, H. (1980). Model construction and evaluation when theoretical knowledge is scarce. In J.
Kmenta and J. B. Ramsey (Eds.), Evaluation of econometric models (pp. 47-74). Waltham,
MA: Academic Press.
Wright, S. (1934). The method of path coefficients. The Annals of Mathematical Statistics, 5(3),
161-215.
Wright, S. (1960). Path coefficients and path regressions: Alternative or complementary
concepts? Biometrics, 16(2), 189-202.