Incentive Mechanisms For Federated Learning: From Economic and Game Theoretic Perspective
Incentive Mechanisms For Federated Learning: From Economic and Game Theoretic Perspective
Abstract—Federated learning (FL) becomes popular and has computing and storage capabilities [5] of end devices and edge
shown great potentials in training large-scale machine learning servers are utilized to make model training closer to where data
(ML) models without exposing the owners’ raw data. In FL, the is generated. This paradigm has promoted the emergence of a
data owners can train ML models based on their local data and
only send the model updates rather than raw data to the model new class of ML techniques that exploit the participation of
owner for aggregation. To improve learning performance in terms numerous clients without exposing the original data. One most
of model accuracy and training completion time, it is essential to famous distributed machine learning framework is Federated
recruit sufficient participants. Meanwhile, the data owners are Learning (FL). FL allows each client with data source, i.e.,
rational and may be unwilling to participate in the collaborative data owner, to train a local model, and a global model is
learning process due to the resource consumption. To address
the issues, there have been various works recently proposed to obtained through model aggregation. This process will be
motivate the data owners to contribute their resources. In this repeated until an accuracy target of the model is reached. FL
paper, we provide a comprehensive review for the economic and usually adopts client-server architecture for this interaction. By
game theoretic approaches proposed in the literature to design this way, it decouples ML from acquiring, storing, and training
various schemes for incentivizing data owners to participate in FL data in data centers, overcoming the limitations of traditional
training process. In particular, we first present the fundamentals
and background of FL, economic theories commonly used in approaches.
incentive mechanism design. Then, we review applications of In spite of the aforementioned great benefits of FL, it
game theory and economic approaches applied for incentive still faces several critical challenges. On the one hand, the
mechanisms design of FL. Finally, we highlight some open issues data owners, i.e., clients, typically consume their resources,
and future research directions concerning incentive mechanism e.g., computing and communication resources, for the local
design of FL.
training. This prevent self-interested clients from contributing
Index Terms—Federated learning, incentive mechanisms, eco- their resources for FL model training, unless they can obtain
nomic theories, game theoretic models. sufficient economic compensation. On the other hand, some
unreliable clients may perform undesirable behaviours, which
I. I NTRODUCTION affect the performance of global model of a FL task. In
particular, the client may launch a poisoning attack [6]–[8]
I N the past few years, we have witnessed the rapid devel-
opment of machine learning (ML) in the field of artificial
intelligence (AI) applications, such as computer vision, au-
that sends malicious updates to mislead the global model
parameters leading to the failure of current collaborative
tomatic speech recognition, natural language processing and learning. In addition, the privacy can not be guaranteed, i.e,
recommendation system [1]–[3]. The success of these machine participants information is still in danger of exposure. For
learning technologies, especially deep learning (DL), builds example, a generative adversarial network attack [9] could
on large volume of data (i.e., big data). With the advent be launched in FL, in which an adversary can pretend to be
of the Internet of Things (IoT), massive data is collected a participant engaging in the model training and learn other
by Internet connected smart devices with limited resources participants’ data.
(e.g., smartphones, sensors, etc.). In most traditional ML As a result, designing an efficient incentive mechanism for
technologies, the local data collected by smart devices need to FL is of crucial importance due to the following reasons.
be transmitted and processed at a cloud or data center to train First, to train and update the local models, the clients need
effective inference models. However, this causes excessive to consume the computing resource and network resource.
computation and storage costs, and the smart devices also Thus, the clients need to paid by reward or money to motivate
suffer from serious privacy leakage risk [4]. them for their contributions. Second, the clients are different
To address the aforementioned issues, mobile edge com- in behaviour, and thus they may be free to join and drop out
puting (MEC) has been proposed as a solution where the the FL platform. This significantly impacts the accuracy and
the delay of the training. Thus, a high incentive motivates
X. Tu, K. Zhu, Y. Zhang and J. Li are with the College of Computer Science more clients to join the FL platform and improves the training
and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing
210016, China (email: {tuxz98, zhukun, yangzhang, juanli}@nuaa.edu.cn). performance.
N. C. Luong is with the Faculty of Computer Science, PHENIKAA Univer- There are two challenging tasks for incentive mechanism
sity, Hanoi 12116, Vietnam (email: luong.nguyencong@phenikaa-uni.edu.vn). designs in FL. The first is how to motivate and maintain more
D. Niyato is with School of Computer Science and Engineering, Nanyang
Technological University, Singapore 639798 (email: dniyato@ntu.edu.sg). participants from the participants’ perspective. That is, how
K. Zhu is the corresponding author. to provide a participating opportunity that a high reward is
2
allocated to the clients and the privacy of the clients can be TABLE I
guaranteed. The second challenge is how to evaluate the par- M AJOR A BBREVIATIONS
ticipants’ contribution from the FL task publisher/platform’s Abbreviation Description
perspective. The demands of different FL task are various, thus BB Budge Balance
how to maximize the sustainable operation of the federation CE Computational Efficiency
while minimizing the incentive cost is challenging. DO(s) Data Owner(s)
To address the aforementioned challenges, many work has ED(s) Edge Node(s)
EMD Earth Mover’s Distance
been done from different aspects. Also, there are related
IC Incentive Compatibility
surveys including economic and pricing models for edge IR Individual Rationality
computing [10], FL [11], and incentive mechanism design for KKT Karush-Kuhn-Tucker
FL [12], [13], but they do not focus on how to adopt economic MDP(s) Mobile Device Group(s)
and game models to determine optimal interactions among MO(s) Model Owner(s)
buyers and sellers. To the best of our knowledge, there is no NE Nash Equilibrium
such a survey discussing the incentive mechanism design of FL SE Stackelberg Equilibrium
SP(s) Service Provider(s)
specifically from economic and game aspects. This motivates SV Shapley Value
us to deliver the survey with the aim of providing a guide for SWM Social Welfare Maximization
the relevant researchers engaged in the study of FL incentive WBB Weak Budge Balance
mechanism. We hope the reader will understand how economic
and game theoretic approaches can be adopted to effectively
design incentive mechanisms for FL. new global model is sent to participants for the next
The remainder of this paper is structured as follows. Sec- global training.
tion II reviews the background of FL such as the defini-
tion, architecture, advantages and incentive mechanisms in The step 2 and 3 are repeated until the global model reaches
FL. Fundamentals of the economic and game theory for an accuracy target.
incentive mechanisms in FL are given in Section III. In As an emerging distributed training framework, FL has been
Section IV, different game theoretic models used for incentive attracting great attentions. Multiple architectures of FL have
mechanisms in FL are discussed, including Stackelberg game, been proposed based on different networks, such as client-
non-cooperative game, etc. Section V reviews the auction server network and P2P network. However, for the works in
approaches for incentive mechanisms in FL. Applications of this survey, the client-server network based architecture (as
contract and matching theory for incentive mechanisms in FL shown in Fig. 1) is mainly considered, which consists of the
are discussed in Section VI. Section VII summarizes potential following entities:
future research directions. Finally, conclusions are drawn in • Clients: This layer is composed of devices, such as mo-
Section VIII. The list of abbreviations frequently used in this bile devices, sensors, and vehicles. Their functions are to
paper is given in Table I. contribute resources, i.e, training data and computational
resource to the FL process. Then, they act as the data
II. I NCENTIVE M ECHANISM D ESIGN FOR F EDERATED owners (DOs) to train a model from the FL service
L EARNING platform, i.e., the model owner (MO), by using their
A. Fundamentals of Federated Learning collected data.
• Communications and networking: This layer includes
The concept of FL was proposed by Google [14], [15]– data communication and networking infrastructures for
[17] which aims to build ML models based on distributed delivering the model parameters between clients and FL
datasets across multiple devices while protecting data privacy. service platform.
In general, FL refers to a collaborative learning method that • FL service platform: This layer consists of sufficient data
enables participants to interact and cooperate with each other storage and computational resource to firstly aggregate
for generating a global model without data sharing. The FL the local model update from the clients, and then broad-
process generally consists of three steps: casts the updated global model to all clients. Meanwhile,
• Step 1 (Model initialization): In this phase, the central the platform provides FL services to service requesters
server broadcasts the initialized global model to the or service users, e.g., through application interfaces.
selected local clients that are known as participants. • Service users/requesters: This layer contains the users
• Step 2 (Local model update): Each client trains its local who request FL service.
model based on the shared global model and local dataset.
After the training, the clients upload their local models With the above architecture, FL brings following benefits
to the server. compared with conventional centralized learning scheme and
• Step 3 (Model aggregation): Upon receiving the local
individual learning scheme:
models from the participants, the server uses model ag- • Privacy protection: Since the FL process does not involve
gregation algorithms, e.g., Federated Averaging algorithm the direct exchange or collection of raw data among
(FedAvg) which averages the parameters of the local participants, the privacy of participants is guaranteed to
models, to generate an updated global model. Then, the a certain extent. In a long-run, the privacy protection can
3
behaviors, the authors in [29] proposed a scoring rule based of this approach is to solve a value-minus regret drift
framework to incentivize DOs to upload their model updates optimization problem, which achieves contribution fair-
in a trustworthy way. ness, regret distribution fairness, and expectation fairness,
The major contribution evaluation strategies in existing FL simultaneously.
system can be divided into the following categories: Summary: In this section, we provide a brief introduction
• Self-report based Contribution Evaluation: Self-report of FL that consists of a typical FL training process, general
based contribution evaluation is the most straight-forward architecture, and its advantages. In addition, the fundamentals
way, in which the DO actively reports the amount of of incentive mechanism for FL are discussed, e.g., concepts,
its contributed resources to the MO. In the context definitions and motivations. The next section presents some
of self-reporting, there are multiple metrics to evaluate basic and fundamentals of economic and game models.
DOs’ contribution, such as the capacity of computational
resource [30] and the data size [31]. III. OVERVIEW AND F UNDAMENTALS OF E CONOMIC AND
• Shapley Value based Contribution Evaluation: Shapley
G AME T HEORY FOR I NCENTIVE M ECHANISMS D ESIGN
value [32] is a marginal method based on contribution
which takes into account the impact of the participation Economic and game theoretic approaches based incentive
order of DOs, to fairly evaluate their marginal contribu- mechanism design for FL have received significant attentions.
tion to the federation. This method is usually adopted in This section presents the fundamentals and background of
cooperative game. The SV of DO is denoted as: the economic and game models commonly used for incentive
X
mechanism design. A taxonomy of the economic and game
|S|!(|N | − |S| − 1)!
ψi (N, v) =
N!
(v(S ∪ {i}) − v(S)), (1) models is provided in Fig.2.
S⊆N \{i}
Fig. 2. A taxonomy of economic and game approaches for incentive mechanism design in FL.
strategies of player i, and ui is the payoff of player i given game, BNE can be obtained where each player selects a
i’s chosen strategy and the others’ strategies. Let pi ∈ Pi be strategy that maximizes its expected payoff given its beliefs
any possible strategy of player i. about the type and strategies of other players. Because the
non-cooperative game models the conflict of selfish players,
Definition 1: Denote p∗i as the best response of player i to the the game in the FL market where the DOs are competitive
chosen strategies of other N − 1 players, i.e., p∗−i . Then, the and the budget of the MO is limited can be formulated as the
set of strategies (p∗i , ......, p∗N ) is termed a Nash equilibrium non-cooperative game. The corresponding Nash equilibrium
(NE) of this game, meaning that no player can gain higher allows DOs to have an optimal participation strategy. In the
payoff by deviating its current strategy when the strategies of context of FL, it is used for the computation resource trading
the other players remain the same, that is [47], with competitive prediction service provider as proposed in
[54].
ui p∗i , p∗−i > ui pi , p∗−i , ∀pi ∈ Pi .
(2)
2) Stackelberg Game: Stackelberg game [55] is a
The inequality (2) implies that NE is a stable outcome of sequential-move game in which the players acting as the
the game in which the sellers at the equilibrium have no an leaders move first and then other players acting as followers
incentive to change their strategies since they cannot improve move after observing leaders’ moves. Therefore, it is also
their payoffs unilaterally. Note that, there could be no NE termed as leader-follower game [53]. The Stackelberg game
or multiple NE in a game. This makes players difficult to aims to model multi-agent decision making processes and
predict the outcome of the game. Therefore, it is important maximize the utility of both the leader and the followers given
to check the existence and uniqueness of the NE when using the leader’s strategy.
the non-cooperative game. It has been proven that a unique Consider again the scenario in Section III-A with two
NE exists if the strategy space of each player is a convex mobile devices as players, i.e., computational resource sellers.
set which is non-empty, closed, and bounded, and if its P1 and P2 denote the sets of the pricing strategies of players
payoff function is a continuous and quasi-concave function 1 and 2, respectively. Both mobile devices 1 and 2 aim to
[48]–[50]. The existence and uniqueness of NE can also be maximize its own utility ui (p1 , p2 ) , i ∈ 1, 2, where p1 and
proved by adopting the Kakutani’s fixed point theorem [50] p2 are their chosen strategies from P1 and P2 , respectively.
or the Brouwer’s fixed point theorem [51]. Because the non- Assume that player 1 chooses its strategy at stage 1, and thus
cooperative game formulates the conflict relationship among acts as the leader. Player 2 chooses its strategy at stage 2, and
self-interest players, the incentive problem among competitive acts as the follower. The optimization problem of the leader
participants of collaborative learning under limited budget was and the follower together form the Stackelberg game, and their
modeled as non-cooperative game [52]. solutions construct the Stackelberg equilibrium (SE).
The non-cooperative game assumes that information such
Definition 2: Let p∗1 and p∗2 denote the solutions of the
as feature, payoff function and strategy of the players are
optimization problems of the leader and the follower, respec-
the common knowledge among players, which is termed a
tively. Then, the point (p∗1 , p∗2 ) is the SE for the Stackelberg
complete information game. However, in practice, a player
game if any (p1 , p2 ) with p1 > 0 and p2 > 0, we have
may not be fully aware of this information of other players,
u1 (p∗1 , p∗2 ) > u1 (p1 , p∗2 ) and u2 (p∗1 , p∗2 ) > u2 (p∗1 , p2 ).
only knowing the occurrence probability of each type. Such a
game is known as an incomplete information game. In FL mar- The backward induction method [56] is commonly used to
ket, the prior knowledge about Dos’ reliability or reputation derive the SE. In the above example, given p1 , the optimization
that helps allocate rewards may be unknown for the MO. A problem of the follower is solved first to find p∗2 , and then the
typical example is the Bayesian game [53] where the outcome p∗1 is obtained through substituting p∗2 in the leader problem.
of the game can be predicted using Bayesian game analysis. Since player 1 takes advantage of knowing player 2 knows
The equilibrium solution of such a game is Bayesian Nash p1 , player 1 imposes a solution which will facilitate itself.
equilibrium (BNE). Similar to NE in a complete information Therefore, the utility of the leader at the SE is higher than that
6
of the follower, called the first mover advantage. Accordingly, • Utility: The buyer’s utility is the difference between
when reaching SE, the leader could achieve a payoff at least its valuation of the requested commodity and its final
as high as the one obtained from the corresponding NE. payment. The seller’s utility, i.e., revenue, refers to the
This feature makes the Stackelberg game suitable for the total payment received from the buyers. In FL markets,
incentive mechanism design in FL. For example, it allows the utility of the buyer, e.g., model owner, can be propor-
the mobile devices (i.e., the followers) to determine optimal tional to the accuracy of the global model and inversely
computational resource prices after knowing the amount of proportional to the total payment that the model owner
CPU resources that the MO (i.e., the leader) needs as proposed pays the data owners.
in [30] or the reward offered by the MO as proposed in [57]. • Social welfare: It refers to the sum of utilities of the users
3) Coalitional Game: In a cooperative game, players coop- (i.e., both buyers and sellers) in an auction.
erate with each other with the aim of maximizing a common
Auction mechanism has been widely applied in many fields,
objective of the coalition. Moreover, enforceable contracts
such as resource allocation in wireless systems [59], secure
are made among the players. In this case, the players can
data offloading [60], and network security [61]. For the rest of
coordinate strategies and reach an agreement on how to assign
this section, we introduce the detail of auction types commonly
the total payoff to the players in a coalition. The objective of
applied to incentive mechanism design in FL.
a coalitional game is to find a stable solution which ensures
that the outcome of the game is immune to changes of groups 1) Sealed-bid Auction: Different from the open-cry auction
of players (i.e., each player has no incentive to move from its (e.g., English auction and Dutch auction), the bids of the
current coalition to another coalition). buyers are open to each other during the auction, in a sealed-
bid auction, the buyers submit sealed bids simultaneously to
the auctioneer. Accordingly, no bidder can know the bidding
B. Auction information of others and cannot change its own bid. There
An auction is an economic mechanism, the goals of which three types of sealed-bid auctions.
are to allocation commodities (e.g., training data, computa- • First-price sealed-bid (FPSB) auction: The bidder with
tional resource, and bandwidth) and establish corresponding the highest bid is the winner who can receive the item
prices via a process known as bidding [58]. An auction consists and pays the highest bid.
of an explicit set of rules which determine resource allocation • Second-price sealed-bid (SPSB) auction or Vickrey auc-
and prices of the basis from market participants [58]. There tion: In this auction, the winner only pays the second-
are following terminologies used in the auction: highest bid rather than the highest bid that it submitted
• Bidder: A bidder is a buyer who wants to purchase items [62]. Since the winner pays the price less than its ex-
in the auction. In the FL market, bidders can be model pected price, the Vickrey auction motivates buyers to bid
owners or FL service requesters. truthfully. The auction thus achieves truthfulness. This
• Seller: A seller offers services or resources to the buyers. feature enables Vickrey auction to be widely used for
In FL markets, the sellers are typically data owners or the incentive mechanism design in FL to prevent the
clients who use their local data to train the shared model misbehaviors of unreliable clients.
required by the model owner. • Vickrey-Clarke-Groves (VCG) auction: A VCG auction is
• Auctioneer: An auctioneer is an intermediate agent which a generalized Vickrey auction with multiple commodities.
conducts the auction, i.e., price and winner determination. In the VCG auction, the commodities are allocated so-
In many cases, the seller itself is the auctioneer. cially optimal manner, and the winner pays for the loss
• Price: A price in an auction may be an asking price of the social value owing to winning the commodities.
or a bidding price. The asking price is the price of a Such payment rule enables bidders to give their true value
commodity that the seller wants to obtain, and the bidding for the commodities. The VCG is thus a strategy-proof
price is the price that the buyer wants to pay for his or truthful mechanism. In FL, VCG mechanism can be
requested commodity. Hammer price is the price at which used to motivate IoT devices (the Dos) to report their true
the buyer and the seller agree to make a deal, i.e., final valuations to the network operator for maximizing social
payment. welfare as proposed in [63].
• Commodity: An auction commodity refers to the object
2) Forward, Reverse and Double Auction: The auction
traded between a buyer and a seller. Each commodity has
mentioned above are classified as the forward auctions from
a value at which the buyer/seller wants to buy/sell. In FL
the seller’s side. Considering the buyer’s side, there are re-
markets, the commodity can be a data unit (a training
verse and double auctions. In particular, there are following
data sample) or a computing resource unit that the data
definitions:
owner offers.
• Valuation: In an auction, the valuation refers to monetary • Forward auction: In the forward auction, multiple buyers
valuation of commodities. Different buyers and sellers submit their bids, i.e., bidding price, to compete for the
may value commodities with different valuations depend- requested items offered by one seller.
ing on their preferences. The valuation of a participant • Reverse auction: In the reverse auction, multiple sellers
can be private that is unknown to the other’s participants submit their asks, i.e, asking price, to compete for selling
or public that is known to the others. the items to the single buyer. The reverse auction is
7
often used together with other auction mechanisms, e.g., have been widely applied in incentive mechanism design for
sealed-bid reverse auctions. FL. A three-dimensional contract incentive mechanism that
• Double auction: In FL market, there may exists multiple jointly considers the task expenditure and privacy issue is
MOs and multiple DOs, the double auction can be used to design as proposed in [70]. A two-period incentive mechanism
match the supply and demand. In a double auction, buyers based on dynamic contract is proposed in [71] to incentivize
and sellers simultaneously submit their bids, i.e., bidding users with different willingness to participate. The contract-
price, and asks, i.e, asking price, to an auctioneer [64]. based personalized privacy-preserving incentive for FL in [72]
The auctioneer determines a price p, i.e., the transaction can provide customized payments for workers with different
price, to clear the market, at which the asking prices privacy preferences.
from sellers are less than p while the bidding prices from 2) Matching Theory: Matching theory aims to optimally
buyers are more than p. The transaction price is typically match two disjoint sets of agents together, given their individ-
set as p = (pb + ps ) /2, where pb is the bidding price ual utilities. In the general allocation game model, there could
of one buyer and ps is the asking price of one seller. be multiple agents in both sides of the matching, and agents
The buyers receive resources, and the sellers gain the from one side have transactions with agents in the opposite
transaction price. The process is repeated until no more side. Therefore, such a game is called two-sided matching. In
transactions occur or a predetermined end time achieves. matching theory, agents compete with each other to maximize
3) Combinatorial Auction: In combinatorial auction, each their own utility, i.e, selfishness, and always make decisions
bid of a buyer indicates a bundle of multiple commodities that can increase their utilities, i.e, rationality. In FL, it is used
rather than an individual commodity [65]. Based on the for the task allocation withe the aim of minimizing the system
information included in the bid as well as the capacity latency of multiple-task FL in MEC [73], [74].
of commodities from sellers, the auctioneer determines the Summary: This section has introduced the basis of eco-
optimal allocation strategy as well as the winner of the nomic and game theoretic approaches proposed to design
auction. However, solving winner determination problem is incentive mechanisms for FL. Specifically, we provide the
a challenge for the combinatorial auction since the problem is definitions, mechanism descriptions, and rationality behind
generally NP-hard, which means that there is no polynomial- the use of these approaches for incentive mechanisms. In the
time algorithm to find the optimal allocation. There are many subsequent sections, we provide comprehensive reviews on the
algorithms that have been proposed to obtain the approximate applications of economic and game theoretic approaches for
solutions for the problem, such as the Lagrangian relaxation incentive mechanism design in FL.
approach [66]. In FL, the combinatorial auction is used to
allocate network operator’s bandwidth to multiple FL SP as
proposed in [67].
IV. A PPLICATIONS OF G AME T HEORY FOR I NCENTIVE
M ECHANISM D ESIGN IN FL
C. Contract and Matching Theory
Contract theory [68] and matching theory [69] have been In the FL service market, there are multiple partici-
regarded as two powerful tools to model the dynamic and pants/stakeholders which may belong to different entities, i.e.,
mutually beneficial relations among different types of rational service users, FL service providers (also MOs), and data
and selfish agents. In particular, they can effectively deal with providers (also DOs). Each participant determines the optimal
the high dynamics of trading market, selfish and competitive strategy and through constantly interacting with other agents
participants. In the following, the brief introductions of con- to achieve different objectives. The objectives include the
tract theory and matching theory which have been used to revenue, utility, cost, and system performance. The interaction
design incentive mechanism in FL is presented. among entities is complex and their objectives often conflict
1) Contract Theory: Contract theory is an economical with each other, which makes game-theoretic approaches
theory that regards all transaction and institutions as a kind of become effective tools for designing incentive mechanisms
contract [22]. It is widely used where the asymmetric infor- with low complexity for FL. With the traditional methods,
mation is available between employer and employees (i.e., the it is difficult to incorporate economic implication into the
futures of an employee is not known exactly to the employer). solutions. Therefore, when the rationality of the stakeholders
In the FL market, since the employees are selfish and they in the FL system are important, the traditional methods may
may not reveal their true bids as well as the property of not be suitable.
privacy protection of FL mechanism, there exists information Game models that are commonly adopted in incentive mech-
asymmetry. Contract theory can design the optimal contract anism of FL consists of Stackelberg game, non-cooperative
to reduce the moral hazard, adverse selection, and extortion game, and coalition game. Specifically, the Stackelberg game
of the parties amid information asymmetry. This characteristic is used to maximize the utility of the MOs and the DOs.
makes the contract theory suitable for the incentive mechanism Otherwise, the non-cooperative game is used in the case each
designs in FL. In the context of FL, an employer can be a player, i.e., the DO or the MO, aims to maximize its own
model owner who wants to recruit workers to complete the FL utility. To form a stable DO federation, the coalition game
model training. Similarly, an employee can be any client (data can be used. The game approaches are reviewed, discussed,
owner) who wants to participate in the FL. The contract theory and analyzed in the following sections.
8
A. Incentive Mechanisms Based on Stackelberg Game Fig. 4. Market-oriented architecture for motivating information trading with
federated learning based on hierarchical game.
In an FL system, the MO hires the DOs for model training.
Thus, the MO acts as a buyer, and the DOs are sellers. The
MO can first set a reward, and then the DOs decide its level Newton-Raphson method. The simulation results show that the
of participation. To stimulate both the MO and the DOs to proposed game approach outperforms the heuristic approach
participate in the FL system, the Stackelberg game is adopted. up to 22% gain in the offered reward while achieving the
The first work of Stackelberg game can be found in [75]. same target accuracy. However, the proposed game approach
The system model is shown in Fig. 3 in which the BS, i.e., is constrained to a single leader, i.e., an MEC server.
the MO, is the leader, and the user equipments (UEs), i.e., Opposite to the model in [76], in [77], the MEC servers
the DOs, are the followers. The BS as the buyer first sets of operators act as sellers (leaders), that cooperatively train a
a monetary reward to maximize its utility. Given the BS’s global model from a coordinator at the cloud, i.e., the buyer
reward, each UE determines its local training strategy, i.e., the (the follower), based on the sensing data from IoT sensors
amount of CPU resources, to maximize its utility. Here, the that are distributed by the coordinator. The interaction process
UE’s utility is a concave function of its local training accuracy can be summarized as follows. First, each MEC operator in-
and the BS’s reward, and the BS’s utility is a strictly concave dividually sets its optimal price to maximize its utility, i.e, the
function of the number of global iterations required to reach payment from the coordinator minus its power consumption
the global accuracy and the reward. A unique NE among UEs for local training, accounting its residual computational re-
can be obtained by taking the first-order condition. Given source. Given the price and configuration profile of operators,
the UEs’ best responses, the BS updates the global model the coordinator allocates the sensing data from IoT sensors
and adjusts its reward to maximize its utility. The backward to maximize its utility while satisfying the latency constraint
induction method is applied to solve the game. The simulation of the distributed learning process. The coordinator’s utility is
results show that an increase in the reward incentivizes the UEs defined as the difference between the gain owing to the model
to generate more local models that leads to a higher global training and the payment to operators. To find an optimal data
accuracy. However, the BS needs to pay a higher incentive distribution strategy for the coordinator, the KKT is used. The
cost to the UEs. theoretical analysis proves the existence of a unique SE for the
The same system model and Stackelberg game approach two-level game. The simulation results show that the operators
can also be in [76] where the MO is an MEC server, i.e., can increase the prices to improve their utilities. However, if
the buyer, that offers the reward. Also, the utility of the UE the prices are too high, their utilities seem not to increase,
is the difference between the reward offered by the server since the coordinator reduces the amount of distributing input
and the local training cost. To obtain the best responses of the sensing data.
UEs, Karush-Kuhn-Tucker (KKT) and the first-order condition Consider a general scenario, the authors in [78] proposed to
are used. Given the best responses, the server determines adopt a two-layer Stackelberg game to model the interactions
the reward to maximize its utility under the limitation of a among an FL SP (SP), users, and DOs. The system model
fixed number of global iterations. Given the constraint, to is shown in Fig. 4. In the lower-layer game, the SP is the
achieve the desired training result, a threshold accuracy of leader, and the users are the followers. Each user determines
the local training at the UEs is set. The UEs whose local its optimal demand value of learning service to maximize its
accuracies are higher than a threshold are selected for the utility. The competition among users is modeled as a non-
training. The threshold is optimized using the Lagrangian and cooperative game. The NE solution of the game is obtained
9
TABLE II
A PPLICATIONS OF S TACKELBERG G AME FOR I NCENTIVE M ECHANISM D ESIGN IN FL
by taking the first-order derivative. The utility of user is the maximize its profit. Here, the profit is the difference between
benefit gained from the direct network effect caused by all the the revenue obtained from providing the learning service to the
users in the system and the benefit gained from the indirect MO and the relay service to other mobile users and the energy
effect caused by other participants minus the negative impact consumption cost. The experimental results show that the route
on service quality as well as the payments from the users. In selection of model transmission among the mobile devices
the upper-layer game, the SP acts as the leader and the DOs can significantly improve the performance of the model and
act as the followers. Based on the DOs’ required charging make mobile users gain the best benefit. In the future, power
price and users’ service demand, the SP decides its service optimization can be considered to construct a energy-efficient
price to maximize its utility, i.e., its profit, which is defined cooperative FL system.
as the payment obtained from users minus its fixed processing Different from the aforementioned works, the authors in
cost and rewards to the workers. Given the SP’s price, each [79] considered an FL network in which the MO is an un-
DO determines its charging price to maximize its utility, i.e., manned aerial vehicle (UAV) as the leader, and the data owners
profit, which is the difference between the total remuneration are ground clients as the followers. The interaction between
received from SP and the additional rewards obtained owing to the leader and the followers is as follows. First, the UAV,
joining FL training and its cost for perceiving and collecting i.e., the buyer, offers a reward when it publishes the FL task.
data. The first order derivative is used again to obtain the op- Then, each client decides its participating level to maximize its
timal strategies of the SP and the DOs. The optimal solutions utility that is the difference between the reward and the cost for
of all three entities constitute the SE. However, this work does training FL task. The clients’ optimal strategies are determined
not consider constrained computing resources of DOs. using the first-order and the second-order derivative. Given the
In a practical FL network, a mobile user can be out of the clients’ responses, UAV determines its reward to maximize
coverage range of the MO. In this case, the mobile user can ask its utility. Here, the utility is defined as the total benefit
other mobile users (i.e., relay node) in the network to forward gained from the training of clients minus its payment to
its local model to the MO. In this scenario, the authors in [31] these clients. The theoretical analysis has proved the existence
adopted the Stackelberg game in which the MO is the follower and uniqueness of NE among the clients as well as the SE
(i.e., the buyer) and the mobile users are the leaders (i.e., the between the UAV and the clients. To further adapt to the
sellers). The MO determines the size of dataset provided by dynamics of air-ground network, the authors design a dynamic
the mobile users to maximize its utility. The utility of the MO incentive mechanism to adaptively select the optimal number
is a function of the total data size of all the users and the prices of clients, taking into account the limitation of drone coverage.
paid to the mobile users. Given the data size, each mobile user The difference between the two incentive mechanisms is the
determines the optimal price for one training data sample to definition of the clients’ loss. The simulation results show
10
that the social welfare obtained by the dynamic incentive Service request
FL service
mechanism is higher than that obtained by the static incentive ĂĂ
Charging reward
Charging price
mechanism. The reason is that dynamic incentive mechanism AI service requesters
Energy supply
Energy payment
can always select the optimal clients adapting to the time- Energy supplier MEC node
varying environment. (leader)
WPT node
Considering the privacy leakage issue in FL, a Stackelberg (follower)
game-based incentive mechanism to motivate users with sen- Wireless
Renewable
sitive data to participate in FL with guaranteed privacy is charging
energy
channel
proposed in [80]. The system model includes a cloud server, Renewable
energy Stationary
WPT node
i.e., the MO, and mobile users, i.e., the data owners. Specif- (follower) charing mode
ically, the server publishes an FL task and announces a total Edge device Mobile device
reward for attracting users. Given the server’s reward, each Dynamic
user determines its desired privacy budget, i.e, the monetary charging mode
TABLE III
A PPLICATIONS OF N ON - COOPERATIVE G AME AND O THER G AMES FOR I NCENTIVE M ECHANISM D ESIGN IN FL
Market Structure
Ref. Scenario Game Model Mechanism Objective Solution
Seller Buyer Item
Individual utility
Evolutionary Hamilton
MDGs’ Given the training price offered by the maximization for
General game, solution,
[85] MDGs MOs learning MDGs, the evolutionary equilibrium MDG buyers, and the
FL system differential evolutionary
[86] service selection strategy for the MOs is determined. accumulative profits
game equilibrium
maximization for sellers
The training strategies selection problem of
Mobile the mobile devices is formulated as
General Evolutionary Mobile Sellers’ individual utility Evolutionary
MOs evolutionary game. The uniqueness and
devices’
[88] FL system game devices maximization equilibrium
datastability of the evolutionary equilibrium
solution is proved.
The strategic behaviors of participating
Truthfulness, individual
FL Non- The providers in federated prediction is modeled
rationality, budget Bayesian
prediction cooperative Prediction providers’ as a Bayesian game. The Bayesian Nash
Users feasibility, fair reward, Nash
[54] serving Bayesian SPs prediction equilibrium participation strategy is obtained
and sellers’ utility equilibrium
framework game service applying the idea of divergence-based BTS
maximization
method.
Sellers provide good or bad updates for the
Mixed
Clients’ buyer, and the buyer selects to accept or Payoff maximization for
Mix-strategy Central strategy
Robust FL Clients learning reject these updates. The probability that the both the sellers and the
[92] game server Nash
service clients provide good updates is calculated by buyer.
equilibrium
using mix-strategy game.
Once receiving the participated round and
monetary transfer from the organization, the Efficiency, individual
Non- Organizations’
Cross-silo central serve calculates them. The sellers rationality, budget Nash
cooperative Organizations Organizations learning
[95] FL determine their optimal decisions through balance, and sellers’ equilibrium
game service
solving the social welfare maximization payoff maximization
problem using the Lagrangian.
Global profit
Edge The equilibrium solution for the devices’
maximization for the
Cooperative Edge Edge devices’ participating strategy is derived through Equilibrium
FEL buyer, and guarantee
[97] game devices server learning decomposing the original problem and solution
incentive compatibility
service solving it.
for the sellers
Agents selects an optimum cluster that
Data Agents’
General Hedonic enable their utility to be improved to join, Stable cluster formation Nash
owner MAE learning
[98] FL system game and the utility of each players is allocated by and budget balance equilibrium
agents service
MAE according the marginal contribution.
Guarantee individual
The sellers choose optimum contract bundles
rationality, incentive
Contract for utility maximization and contribute the
FL in Workers’ compatibility and
theory, Model corresponding data to the training. The Optimal
mobile Workers training revenue improvement
[99] coalition owners buyers adjust their reward strategies and solution
networks data for workers, buyers’
game collaborative with other owners to form
marginal contribution
federation to maximize their payoffs.
maximization
Given the sellers’ data pricing, and then the
Data
buyers determine their optimum bid. The The utility maximization
Proof of Reverse Data Pool providers’ Optimal
Euler-Lagrange equation and variational for both buyers and
[100] FL game providers members training solution
method are used to obtained the best sellers
data
strategies of both parties.
contributing to federated training process, in [98], the hedonic In [100], a novel consensus mechanism, proof of FL, is
game is used to simulate the problem where the agents act proposed, in which the pool members are the buyers to coop-
as the players of the game and the clusters as the coalitions. eratively train a global model from the pool manager based
The system model includes multiple agents (i.e., sellers) which on their privacy data purchased from the data providers (i.e.,
cooperate with other agents selectively to form clusters and sellers). To guard against the risk of privacy data leakage in
train model, and a model aggregating entity which as a utility FL, a reverse game-based data trading mechanism is proposed
allocator (i.e., buyers) is responsible for assigning a utility to which enables a pool member (miner) to maximize its utility
all possible clusters. The utility of any cluster is the sum of only when it trains the model without any leakage. The
the raw utility and learning utility. The strategy of the agent strategy of the data provider is to determine an optimal data
is to decide which cluster to join with the aim of maximizing pricing rules to maximize its utility, which is the final price of
its utility, and the utility is the sum of the minimum payment the traded data minus the expected loss brought by sensitive
price that any agent can obtain if it join a cluster and the gain data leakage. Given the data provider’s pricing rules, the pool
owing to joining a specific cluster. To ensure the Nash-stability member calculates its optimal bid according to its real privacy
of the hedonic game, the definition of the Nash-stable set is information, i.e, the IC property, for maximizing its utility. The
introduced to design an effective utility allocation method for utility is the legally expected net profit from this data trading
agents. For all clusters, the utility allocation method satisfies and the expected profit of leaking sensitive data minus the
the property of budget balance since the overall utility of all final price of the traded data. The optimal strategies of both
agent in one cluster is not greater than the cluster’s utility. parties can be obtained through solving the Euler-Lagrange
The theoretical analysis prove that when the utility function equation of their utility by adopting the variational method.
of a cluster is superadditive and preferences of a play are The simulation results show that the higher reputation value
additively separable and symmetric, it always admit a Nash- of the pools , the bigger probability of successful data trading
stable coalition partition. and the higher utility of both parties.
Unlike [98] and [101] which studied the federation among Summary: In this section, we have reviewed the applica-
DOs, the federation of multiple MOs is considered in [99]. tions of game theory approaches for the incentive mechanism
Firstly, to motivate the worker to participate in FL, a contract design in FL. The objective is to find the equilibrium solution
theory-based incentive mechanism in mobile crowdsensing among players to achieve utility maximization, while satisfy-
network is proposed. In this system, the MOs (buyers) train ing IR, IC and BB. The Stackelberg game-based approaches
local models based on the data collected by workers (sellers), are summarized along with the references in Table II, and
and collaborate with other MOs selectively to form federations Table III summarizes the approaches based on non-cooperative
to maximize their own profits. Given the prevailing federation game and other games. As observed from the tables, the
parameters (i.e., others’ strategies for workers’ contribution), Stackelberg game are used to maximize the utilities for both
every MO’s contract-theoretic incentive mechanism is de- the MO and the DO while non-cooperative game is mostly
signed separately with the aim of maximizing their marginal adopted to maximize the utility of DOs with competitive
contribution while minimizing incentive cost paid to workers. relationship. In fact, auction is also an effective incentive
The strategy of the MO is to determine an optimal reward method to motivate the players to join in the FL, especially in
to maximize its profit, which is defined as its marginal a high competition environment with limited resources. The
contribution minus the incentive cost and resources cost for next section discusses how to apply auctions for incentive
model training. The marginal contribution of the MO is the mechanism design in FL.
utility it produces after joining the FL. Moreover, the strategy
of each worker is to determine its data quantity contribution V. A PPLICATIONS OF AUCTION FOR I NCENTIVE
to maximize its individual utility, i.e., the difference between M ECHANISM D ESIGN IN FL
the reward from the MO and the cost incurred by collecting Auctions have some good properties. For example, IC
data. Given a set of feasible data contribution subsequences, encourages the clients to report true cost that gain the utility of
the optimal reward strategy is obtained by using the approach the model owner or encourages the model owner to report its
in [102]. Using the optimization tool, i.e., cvxpy [103] or the true value of the computing resource, while IR guarantees that
Bunching and Ironing algorithm [104], an optimal set of data every client joining the FL has a non-negative utility. Thus,
contribution is obtained. Then, a coalitional game is designed auction is a very effective solution for the incentive mechanism
to model the problem of federation formation, and the merge in FL.
and split algorithm proposed in [105] is adopted to study
the stable federation partition given different characteristics
of data quantity and quality the MOs have. The numerical A. Incentive Mechanisms Based on Sealed-bid Auctions
results show that the reward of workers increases as its data In a sealed-bid auction, sealed bids from bidders are si-
contribution increases and there exists the federation formation multaneously sent to the auctioneer. This approach is mainly
equilibrium in the proposed mechanism design. However, applied in participant selections for FL. The MO sends a
there may be competing or malicious MOs which can affect request of FL task to the DOs. After receiving the service
the performance of a federation adversely. Furthermore, the request, each DO (as a bidder) submits a message to the MO
reputation mechanisms [106] can be introduced to choose (as the buyer/auctioneer), which includes data such as bidding
reliable MOs. information, cost, and computation capacity. Particularly, the
15
bidding information submitted by each DO is not open in the Buyer Devices Worker Devices
auction and FL trading process. In each round of auction, the (Bidders) (Sellers/Auctioneers)
MO selects the DO which meets the training conditions of
FL at this stage through solving a winner selection problem.
The selection process is repeated until the FL task can be
completed. Bids & Resource demand
1) FPSB Auction: To motivate data owners to contribute & Task delay deadline
their idle resources for the performance improvement of Phase 1
computation-intensive application, the authors in [39] designed Candidate result & Claimed
prices & Execution time
a double auction-based FEderated learning and Subjective
logic-driven distributed Task allocation system in mobile de- Selection result
vice cloud, namely FEST. In the FL market, it is necessary & Computational task
to match the bidders and the sellers since there are multiple Phase 2
bidders (model owners) and sellers (data owners). In FEST, Computational result
buyer devices as bidders recruit high-quality worker devices, Phase 3
i.e., sellers, with reliable and experienced execution perfor- Payment
mance to achieve the best system performance. The FEST-
auction process consists of three phases: bidder selection, task Fig. 9. Double auction based distributed task allocation in mobile device
allocation and worker payment assignment, as shown in Fig. 9. cloud.
In the first phase, after receiving the task requests (including
buyers’ task bid, resource requirement and delay requirement
of the task) from the bidder, each seller computes the priority to maximize the overall system performance or are selfish to
factor of each bidder, which is an integrated metric with maximize their own performance. For the cooperative FL SPs,
related to task bid, execution time and preference to seller. intra-service and inter-service resource allocations are consid-
Then each seller greedily selects bidders sorted based on ered. The optimal FL round length is obtained in intra-service
the descending order of priority factor until it satisfies the problem through optimizing the bandwidth allocation among
resource, energy and maximum workload constraints. In the the clients of each FL service. The optimal solution to the
second phase, a bidder only recruits one seller for a computing intra-service bandwidth problem is then used to solve the inter-
task. In fact, a bidder can win bids from multiple sellers. service problem, and the bandwidth allocated to each FL SPs
Therefore, each bidder chooses the best one from multiple is computed by using dual decomposition [110]. In addition,
candidate sellers which can maximize its utility. The utility of to address the selfishness issue among FL SPs, a multi-bid
the bidder is a composite function of the bidder’s valuation for auction mechanism is designed to allocate bandwidth to SPs
resources and the seller’s ask price for resources as well as its so as to maximize the SPs’ utilities. The network operator
execution time and reputation value. In addition, a payment (seller) finds an optimal allocation scheme to maximize each
policy based on the approach in [107] is designed where SP’s (buyer) utility through the use of the KKT condition.
the seller providing high-quality service is rewarded whereas Each SP is then charged according to the second-price auction
the seller failing to rendering the required task quality is payment policy. The simulation results show that the proposed
punished. Theoretical analysis proved that the proposed FEST two bandwidth allocation schemes are superior to benchmark
framework achieves the properties of CC, IR, truthfulness, and algorithms in terms of inherent fairness by design, where band-
BB. Simulation results imply that the FEST system achieve width is equally or proportionally allocated among clients and
performance improvement as high as 30% and 25% in terms SPs. However, when a client can simultaneously participate
of quality-of-experience and utility of the bidder, respectively, in multiple FL service, resource allocation should take into
compared with the truthful incentive mechanism (TIM) [108] account both bandwidth resource and client computational
and distributed truthful auction mechanism (DTAM) [109] resources.
schemes. FEST performs better than the two existing ap- Given the growing complexity and capacity of the future
proaches since they only consider the resource availability of network, it is difficult to completely model or solve the
sellers while not considering the quality-of-experience of the dynamic conditions in the network with the standard ap-
bidders. However, the growth of the number of FL worker will proaches. The authors in [111] proposed deep learning-based
lead to much higher computational complexity, which makes (DL) optimal auction incentive scheme, where multiple MOs
the FEST based auction approach not suitable to large-scale (buyers) intend to buy the 3C-L resources (i.e, communi-
FL applications. cation, computation, caching, and learning resources) from
2) SPSB Auction: The aforementioned works are for re- a worker (seller). In this incentive mechanism, the concept
sources allocation in which only a single FL service is consid- of information freshness, i.e., Age of Information (AoI) is
ered. However, as the diversification of ML-driven applications adopted. Meanwhile, the DL-based pricing scheme is proposed
develop and grow, it is predicted that the wireless network to price the workers contribution in terms of AoI, in which
will host multiple co-existing FL services. To fill the gap, both conditions of truthfulness and revenue maximization of
the authors in [67] considered a multi-FL service scenario, the seller are ensured. Although the proposed incentive scheme
where FL service providers (SPs) cooperate with each other uses the second-price auction to determine the winner, it allows
16
the worker to earn higher revenue than that of the conventional Model owner
second-price auction scheme. The reason is the information (Buyer)
asymmetry among the buyer and the sellers can affect the
performance of the latter scheme, but the former can avoid
this issue through learning from the historical data so as to
improve the performance.
To overcome the issue of the failure of the communication Task
link and missing nodes that may affect the performance of Task
execution
Bids Distribution Payments
FL, a communication-efficient FL scenario where the UAVs results
ś
as wireless relays to improve the communication between IoV Ś Ŝ ŝ
components (workers, i.e., DOs) and the FL server is consid- Data owners
ered as proposed in [112] and [113]. The authors presented (Sellers)
a joint auction-coalition formation framework to allocate the
UAVs coalitions to cells of workers. Note that the payment that
a cell receives depends on the total time needed to complete
the FL task. Therefore, the workers in a cell have higher
incentive to bid for the UAV coalitions that can maximize
the utility of the cell(i.e., the difference between the valuation Fig. 10. Reverse auction based incentive mechanism for client selection in
of the cell for a coalition and its payment price), and then and FL.
pay them based on the second-price auction. Furthermore, the
UAVs aim to maximize sum of their individual profits through model for cooperative production of virtual products is pre-
finding a stable partition. To achieve the objective of both sented in [115] and [116]. Specifically, a mechanism similar to
cells and UAVs, a merge-and-split algorithm is proposed to FVCG, Procurement-VCG (PVCG) is proposed on the supply
find the optimal coalitional structure. Simulation results show side, and the Crémer-Mclean mechanism [117] is used to
that the proposed join auction-coalition scheme achieves the overcome the demand-side information asymmetry (the model
highest profit of coalition, compared with the merge-and-split users’ valuations on the trained FL model). Differently, in
with random allocation and random partitioning with second- PVCG, if one producer (i.e., the seller) cannot offer the data it
price auction. In the future, a joint trajectory optimization and claims at the time of bidding, the coordinator (the intermediary
energy efficient coalition formation game can be considered between producers and model users, i.e., the buyer) will
to complete the FL tasks more efficiently. impose a high punishment on it. The theoretical analysis
3) VCG Auction: In [114], the authors proposed Fair- demonstrated that the Crémer-Mclean mechanism, together
VCG (FVCG) as an improvement of VCG mechanism for with the procurement auction, maximizes producer surplus by
incentivizing DOs to contribute their data and truthfully report motivating producers to offer all their data to the federation
their cost in FL training processes. FVCG aims not only to and truthfully report their private information. However, when
maximize the social welfare but also minimize unfairness of the number of producers increase, more advanced algorithms
the learning federation without the FL server (buyer) knowing is required for FVCG and PVCG to learn the two payments.
the cost type and data quality of DOs (sellers). Guaranteeing In addition, the effectiveness and superiority of FVCG and
the fairness among DOs can encourage them to improve PVCG need to be verified compared with other sharing rule
their participation level. The proposed FVCG approach mainly (e.g., SV, [118], [119] and Labour Union [120]).
includes two steps: 1) the calculation of the data acceptance
vector, which illustrates the FL server decides how much data
to accept from each DO; 2) the calculation of the payment B. Incentive Mechanisms Based on Reverse Auctions
vector (including the VCG payment vector and the adjustment For reverse auction, sellers bid for the prices at which they
payment vector), which indicates the payment each DO can are willing to sell their resources. One of major advantages
receive from the server. In particular, the adjustment vector is of the reverse auction is to incentivize the participants, e.g.,
defined as the solution of a functional optimization problem, the data owners, to report their types, e.g., the cost of FL
which is difficult to solve. FVCG adopts an unsupervised training. This advantage is important to the model owner
and composite neural network based approach to solve the since it helps to reduce the incentive cost. As shown in Fig.
optimization problem since the neural network can approx- 10, there are four steps in a typical reverse auction based
imate continuous function with an arbitrary precision. The incentive mechanism including bidding, winner selection and
experimental evaluation results reveal the reasonableness of task assignment, feedback of the task results, and payment
FVCG and the proposed neural network-based approach. determination.
The proposed incentive mechanism in [114], i.e., FVCG, In [121], the authors modeled the incentive mechanism
overcomes the supply-side information asymmetry and ensures between the BS and multiple mobile users as a reverse
the fairness among the DOs in the FL process. In practice, combinatorial auction game, where the BS acts as the buyer
the information asymmetry is multidimensional including the (also the auctioneer) and the mobile users as the sellers are
demand side and the supply side, both of which affect the allowed to bid for the combination of resources. In particular,
sustainability of federation. For this reason, a game-theoretical each mobile user sends a message consisting of the required
17
amount of resources, local accuracy, and the corresponding with the benchmark scheme only depending on the bidders’
energy cost to the BS. Given the maximum tolerable time of bid prices, and the social welfare is improved in DRLA.
FL (the upper bound of the total time of one global iteration), However, there is a drawback in the approaches mentioned
each mobile user determines the optimal bid, which includes in [122] and [124], i.e., it needs to obtain the global distribu-
the uplink transmission power, the CPU cycle frequency and tion of all data before calculating EMD, which is impractical
the local accuracy level, to minimize its energy cost. And a under the setting of FL. Therefore, the authors in [125] adopted
low-complexity iterative algorithm is proposed to obtain the the reputation mechanism to indirectly represent data quality.
optimal solution of the problem. Once receiving the bids from Based on this method, they designed an reverse auction-
all mobile users, the BS performs winner selection and pay- based incentive mechanism (RRAFL), which helps the task
ment determination by solving an social welfare maximization requester (i.e., the buyer) choose high-quality and reliable
(SWM) problem using the primal-dual based greedy algorithm. mobile devices (i.e., the sellers) for performing the task.
Note that the utility of BS is the difference between the BS’s RRAFL is mainly composed of four parts. In the first phase,
satisfaction level (based on the local accuracy that the user can the comprehensive reputation of mobile devices is calculated.
offer in the bid) and the payment for mobile users. Simulation The next phase is winner selection and payment. Based on the
results show that the social welfare obtained by the proposed reputation value, the candidates calculate their unit reputation
mechanism is 400% larger than by the fixed price scheme since bid price, and then they are sorted in a non-decreasing order
the fixed price mechanism heavily depends on the prices of of their price. The requester select K candidates as winners
the resources. from the ranking form from front to back which can maximize
To avoid the collusion between the untrustworthy third party its utility, and pay to them. The utility of the request is a
and the ENs, the authors in [122] presented a a decentralized function of the comprehensive reputation. After the model
and transparent blockchain-based resource trading system for training, the winners’ contribution is measured to reevaluate
FL in edge computing (EC). To encourage both the resource and update to their reputation value. The experimental results
requester (individuals or companies who need model training) show the effectiveness of RRAFL. In the future, the further
and ENs (the owner of data and computing capacity) to join in study on how to measure the participants’ contributions more
the system, the incentive problem among them is formulated reasonably can be considered.
as a data quality-driven reverse auction (DQDRA) with the Paying attention to the same problem as that in [125],
aim of maximizing the valuation of the requester about the the author in [126] proposed a cluster-based clients selection
model accuracy. Accordingly, ENs, i.e, sellers, determine the method. The system model consists of a cloud aggregation
optimal bids which can maximize their utilities (including data server and multiple edge clients, i.e., SPs. Accordingly, the
size, the EMD metrics, i.e., a metric for data distribution, clients with the similar initial gradient are divided to the same
and prices) and submit them to the requester, i.e, the only cluster firstly. Second, to overcome the excessive consumption
one buyer. Once receiving the ENs’ information, the requester caused by random selection, an auction-based clients selection
decides the winner and the payment to them based on the method is designed in each cluster, in which the clients as
proposed monotone greedy algorithm by marginal density bidders provide data services and computing services, and the
for its valuation maximization. The service valuation of the cloud server as an auctioneer (also the buyer) trains a global
requester is a function of training data size. The simulation model. In the auction, each client determines its optimal bid
results show that the valuation of the requester in DQDRA price that maximize the expected revenue through using the
is larger than the valuation of the requester in the GREEDY- first-order optimal condition. Then, the sever randomly selects
SM and RANDOM-SM mechanisms proposed in [123]. The a group, and selects K clients with the lowest bid in this group
trading market with multiple competitive requesters can be as winners. The minimum local data size among winners as
taken into account in the future. the threshold. The winner selection process is repeated among
The same system model and auction mechanism to that in clients with the data size greater than the threshold in each
[122] can be also found in [124]. However, in [124], the buyer group. The simulation results show that the auction-based
is an FL platform, and the sellers are DOs, i.e., FL workers, clients selection scheme can show a faster convergence rate
which report the requested wireless channel to communicate compared to randomly selecting a client from each groups.
with the platform. The FL training problem between the To motivate power users to participate in the interruptible
platform and DOs is converted to a SWM problem where the load management, the authors in [127] proposed a multi-
utility of the platform is the data utility minus the total cost and attribute sealed auction game to simulate the interaction be-
the total payments to workers. To tackle the SWM problem, tween the power company and the power users, with the aim of
the original auction is decomposed into a set of sub-auction maximizing the power company’s revenue and finally decide
in terms of EMD. After that, each group greedily selects the the list of participating users. The auction model consists
workers and determines the payments based on the marginal of two main phases, i.e., user bidding and user selection.
virtual social welfare density. To further improve the social In the first phase, each power user reports the amount of
welfare, the author proposed a DRL-based auction (DRLA) interruption and its quotation to the company. In the second
which use the graph neural network to extract the features from phase, the K-means clustering method is used to cluster the
workers’ reported types and automatically decides the service users according to each users’ load features. Based on the
allocation and payment. The experimental results show that the objective of minimizing the total expenditure, the company
DRLA and RMA can achieve higher social welfare compared chooses the users from various clusters considering the users’
18
TABLE IV
A PPLICATIONS OF S EALED -B ID AUCTION M ODEL FOR I NCENTIVE M ECHANISM D ESIGN IN FL
Market Structure
Ref. Auction Model Mechanism Objective
Seller Buyer Item
Based on buyers’ bids, the sellers determines buyer Execution time of FL task
Sealed-bid Worker Buyer Computational winning candidates by using the greedy approach. The minimization, Buyer devices’ utility
[39] double auction devices devices resources seller that can maximize the utility of the buyer is maximization, CC, IR, BB, and
selected. truthfulness
Buyers submit to the auctioneer their bids including the
Second-price
Network FL service requested bandwidth and the unit price, the auctioneer Utility maximization for SPs, IR, IC
sealed-bid Bandwidth
[67] operator providers allocates bandwidth to buyers based on the market and fairness
auction
cleaning price.
MOs’ bids are transformed to new bids satisfying IR
deep
3C-L and IC properties, the winners are decided based on Revenue maximization for the
learning-based Worker MOs
[111] resources second-price auction and the payment price of them is seller, IR, and IC, best matching
optimal auction
derived using ReLU.
Second-price Buyers submit their bids to UAV coalition with
Train Total profit maximization for sellers,
[112] sealed-bid UAVs Workers sufficient energy, and then each coalition is allocated to
resources IC,IR
[113] auction cells of workers.
Based on the data quality and privacy cost of sellers,
Training the buyer determine the amount of acceptable data to
VCG auction DOs FL server IR,IC, SWM, WBB and fairness
[114] data maximize its revenue. The payment function is derived
by using composite neural networks.
Sellers submit to the auctioneer their bids including
SWM, truthfulness, ex-post
Virtual their capacity and type parameters, the coordinator
VCG auction Producers Consumers allocative efficiency, ex-post IR,
[115] products determines the acceptance ratio and transfer payment to
WBB
sellers.
TABLE V
A PPLICATIONS OF R EVERSE AUCTION M ODEL FOR I NCENTIVE M ECHANISM D ESIGN IN FL
Market Structure
Ref. Scenario Mechanism Objective
Seller Buyer Item
Wireless Mobile devices submit their optimal bids for these
Mobile Training SWM, energy cost minimization for
cellular BS resources, and the winner selection problem is then
[121] users resources sellers, truthfulness, IR and CE
network solved by using primal-dual based greedy algorithm.
Service valuation maximization for
Edge Data The winner and their payment are determined based the buyer, utility maximization for
ENs Requester
[122] computing resources on the monotone greedy algorithm. sellers, budget feasibility,
truthfulness, and CE.
Wireless FL FL Data Same as [122], but the improvement of social welfare
DOs SWM, IC, IR, and CE
[124] service market platform resources is further studied based on DRL.
Mobile Data The winner selection and payment are determined Utility maximization for the buyer,
Horizontal FL Requester
[125] devices resources based on greedy algorithm. IC, IR, budget feasibility, and CE
The optimal bid prices for sellers can be determined
Edge Cloud Training
MEC using the first-order condition, and the K sellers with Revenue maximization for sellers
[126] clients server resources
lowest bid are selected as the winners.
Power demand Power Power The participating users are determined based on the The total cost minimization for the
Power
[127] response users company price and the expected response rate of users. power company, and fairness
The optimal uplink transmission power and CPU
Cellular Utility maximization for sellers,
Computational frequency of sellers can be obtained using iteration
wireless Users BS social cost minimization for the
[128] resources algorithm, and then the randomize auction scheme is
network buyer, truthfulness, IR, and CE.
used to tackle the winner selection problem.
Given the buyer’s scoring rule, sellers determine their Profit maximization for both sellers
Training optimal bids. Then, the buyer selects K sellers with and buyer, lightweight incentive
MEC ENs Aggregator
[129] resources the highest scores as winners by using Lagrange mechanism, IR, IC, and Pareto
multiplier method. efficiency
The learning tasks allocation and payment
The sum of quality of aggregated
Mobile FL Training determination are determined by solving the sum of
Horizontal FL model updates maximization,
[130] users platform resources quality of aggregated model updates maximization
truthfulness, IR and CE
problem based on the greedy algorithm.
bids and the expected response rate. Meanwhile, the fairness cost, to maximize its utility. The utility of the user is the
mechanism in the scenario where users have participated difference between the reward and the true cost. The optima
in multiple consecutive times is considered. The simulation uplink transmission power and CPU frequency can be obtained
results show that the proposed greedy approach can reduce using the iteration algorithm. The winner selection problem for
the expenses of power company and improve the selected the BS is formulated as a social cost minimization problem.
probability of small business and residential users. However, To deal with the NP-hard minimization knapsack problem,
the bidding sequence should be considered since it may affect the framework of randomized auction [131], [132] is adopted.
the user selection of long-term demand respond process. The simulation results demonstrate the proposed randomized
The system model in [75] is also considered in [128]. auction scheme can guarantee the approximation factor of the
However, the difference is that the interaction between the integrality to the optimal minimum cost. However, the work
BS and users is modeled as a randomized auction where the does not consider the privacy protection of bidders.
former is the buyer and the latters are the sellers. In particular, In [129], FMore, a multi-dimensional procurement auction-
each users determines its bid, consisting of the resources based incentive mechanism with only one buyer, was proposed
(uplink transmission power and CPU frequency) and training with the aim of motivating high-quality ENs with low cost
19
to join in FL. The system model includes an aggregator is to maximize the SWM, the revenue of resources providers,
at the remote cloud, i.e., a buyer, which rents the data, and the FL system performance. The reviewed approaches
computation and communication resources for FL from ENs are summarized along with the references in Table IV and
i.e., sellers. Specifically, when receiving a bid request with Table V. As seen, the auction models are mostly used to
scoring function from the buyer, the ENs decide whether select high-quality participants to maximize the performance
to bid or not according to their available resources. The of FL system. However, the collusion problem is unfavourable
scoring function is related to the resources quality and sellers’ for auction-based incentive mechanisms. Thus, it is crucial to
expected payment. If one EN determines to join in FL, it will develop more effective methods to prevent dishonest bidding.
reply the aggregator the response message (ask) containing In the next section, we will discuss how to apply the contract
the resources quality and the corresponding prices according and matching theory for incentive mechanism design.
to sealed-bid auction. The K ENs with the best scores are
chosen for participating in collaborative learning, and the VI. A PPLICATIONS OF C ONTRACT AND M ATCHING
payment for these nodes is implemented according to the first- T HEORY FOR I NCENTIVE M ECHANISM D ESIGN IN FL
price auction policy. Note that the proposed multi-dimensional Due to the clear decision tree of duties and obligations,
auction has multiple winners which is different from [39]. contract theory is widely applied in FL systems. For the
Extensive simulations show that FMore can speed up federated contract theory based incentive scheme, the MO proposes a
learning via reducing training rounds by almost 51.3 %, and list of contracts to DOs. In particular, the MO is not informed
improve the model accuracy by 28% for the LSTM model about the private type of DOs when designing the contracts.
compared with the classic FL scheme, which enables FMore as Each DO then proactively selects the appropriate contract type
a lightweight resource auction approach for FL. However, the that maximizes its own utility based on the evaluation of their
budget constraint of the aggregator is not considered in FMore, own abilities. Besides, the matching theory is used to match
which is left to be extended for the future work. Moreover, the MO with the DO optimally for the efficient FL training.
for each EN, whether the selected probability should be the
same or different remains to be further studied.
Although the existing works that have applied the reputation A. Incentive Mechanisms Based on Contract Theory
mechanism can select the high-quality DOs to join in FL, none The contract theory-based incentive mechanism aims at
of them considers the learning quality of participants, which maximizing the payoff or utility of the MO. Generally, the
can significantly affect the performance of FL and the direction problem is to maximize the objective function, i.e., the MO’s
of incentive. To bridge this gap, the authors in [130] proposed utility, subject to IR constraint and IC constraint. The first
a novel distributed learning system named FAIR, i.e, Federated constraint indicates that the payoff gained by the DO under
leArning with qualIty awaReness. Particularly, FAIR integrates this contract is not less than its reservation payoff if not
the reverse auction-based incentive mechanism design and the participating. The second constraint means that the DO’s
quality-aware model aggregation algorithm, which allows it expected payoff is maximized when signing the contract.
to facilitate precise user incentive and model aggregation. The The authors in [134] proposed a contract-theoretic task-
system mainly consists of three components: the learning qual- aware incentive mechanism to model the tradeoff between
ity evaluation, incentive mechanism and model aggregation. preferences for service latency and AoI of different training
Specifically, the historical quality records are firstly leveraged tasks. The problem of training resource trading for the MO
to estimate the current learning quality. Based on the estimated having different preferences to the service latency and AoI is
quality, a reverse auction incentive mechanism is then estab- considered. The MO acts as the employer to set the contract for
lished to encourage mobile users participation, in which the the different FL tasks as (reward, the number of update cycles),
mobile users (the sellers) send their bids to the cloud platform and the workers as the employees which select one optimal
(the auctioneer). To maximize the collective learning quality of contract to participate in FL. Further, the authors considered
all the participants in each iteration, within the limited budget, to accurately design the reward structure, the update expense,
a learning quality maximization (LQM) problem is formulated. e.g., the energy consumption incurred in data collection and
A greedy algorithm based on Myerson’s theorem [133] of model training is required to be known. This asymmetric
truthfulness is designed to solve this NP-hard problem. Thus, information is tackled by using the contract theory. To obtained
the learning task allocation and reward assignment is deter- a feasible contract, the profit maximization problem for the
mined. What’s more, a new aggregation algorithm is proposed MO is built, subject to IR and IC constraints for the utility of
to filters out non-ideal model updates for the enhancement of worker. For the future work, the deviation from an ideal cache
the global model. The performance evaluation results show can be considered and then a caching scheme can be designed
that the performance of the ResNet model can be improved to better manage the AoI-service latency tradeoff.
by 68.9% using FAIR compared with the knapsack greedy and Considering the challenges when implementing the FL in
bid price first mechanisms. For the future work, integrating IoV networks (such as dynamic activities and the limited pay-
the communication and computation performance into FAIR ment budget), the authors in [135] proposed a novel economic
to enhance its robustness when employed in practical systems framework to address these issues. A multi-principal one-
can be considered. agent contract is designed where the best smart vehicles (SV,
Summary: In this section, we introduce the application of as principals) non-collaboratively offer contract agreements
auction model for incentive mechanisms in FL. The objective (containing information significance and offered payment) to
20
the vehicular service provider (VSP). The VSP (as the agent) Unlike the contract-based incentive mechanisms that con-
is then in charge of optimizing the offered contracts. The sider single-dimensional privacy information, the work in
payment budget of the VSP is referred as the asymmetric [140] proposed the multi-dimensional contract theoretic ap-
information. Specifically, the VSP perform an SV selection proach which considers the user’s two-dimensional privacy
method to decide a set of the best SVs for the FL process information including training cost and communication delay.
according the significance of their current locations and in- The interaction between the FL serve and users is modeled
formation history. Then, each selected SV can collect on- as a labor market, in which the former act as the employer
road information and offer a payment contract to the VSP and the latter act as the employees. The key contribution
based on its collected QoI. This contract model is modeled of this approaches is summarizing users’ multi-dimensional
as a profit maximization problem for the VSP and SVs is private information into a one-dimensional criterion (i.e., the
formulated amid the limited payment budget and common server’s preference on different user’ types) that allows a
constraints (i.e., IR and IC). To find the optimal contracts for complete order of users. Three scenarios are considered in-
the SVs, the problem is transformed into an equivalent low- cluding complete informational scenario, weakly incomplete
complexity problem. The equilibrium solution is the found information scenario, and strongly incomplete information
based on the proposed iterative algorithm. The experiment scenario is considered. In the first scenario, the serve knows
results demonstrate that the proposed scheme can improve the each user’s type, the contract design is to minimize the server’s
social welfare of the network up to 27.2 times and convergence cost with the IR constraint. In the second scenario, the same
57% faster compared with those of random and round-robin objective with IR and IC constraints is considered since the
scheduling methods. server does not know which user belongs to which type. In the
The economic approach in [135] is also applied into the third scenario, the contract design is to minimize the expect
electric vehicle network as proposed in [136]. In this network, cost of the server, where the objective function is similar to
multiple charging stations (CSs), i.e., principals (employers), that in the second scenario excepted the expectation part. The
firstly implement the energy demand prediction method lever- simulation results show that all the designed contract in the
aging FL, and then reserve energy from the smart grid provider three information scenarios can achieve up to 73.72% cost
(SGP) in advance based on the prediction results to maximize reduction of uniform contract when the number of users is
their own profits. The profit of the CS is the difference between large.
the revenue for serving electric vehicles and the energy transfer The approach in [140] assumed that the data of user is IID,
payments to the SGP. Note that the competition among the CSs in fact, data distribution may have an impact on the server’s
and the willingness to transfer energy of the SGP is unknown cost. Therefore, the authors further studied the case where
for each other. The CS’s profit maximization problem is users have non-IID data in [141]. And the EMD metric is
modeled as a non-collaborative energy contract problem under used again. The author derived the server’s optimal contract
the unknown information and the common constraints of the with the non-IID data under complete and weakly incomplete
SGP. To find the optimal contracts for the CSs, the method of information scenarios, respectively. Compared with the case
solution in [135] is also used . The simulation results show IID in which the server only selects the most preferred type
that the proposed framework can outperform other economic under complete information, in the non-IID case, the server
models by 48% and 36% in terms of the CSs’ utilities and may choose more than one type. Meanwhile, they find that
social welfare (i.e., the total profits of all participating entities) weakly incomplete information does not increase the server’s
of the network, respectively. cost (comparing with the complete information scenario) when
To simulate the mobile devices with high-quality data training data is IID, but it in general does when data is
to participate in FL, the authors in [137] designed a con- non-IID. The experimental results of the non-IID case show
tract theory-based incentive mechanism, where mobile devices that the server’s cost in the proposed mechanism is lower
(DOs) are the employees and the monopolist operator (i.e., task than that in optimal uniform contract (OUC), reverse multi-
publisher) is the employer. The task publisher design specific dimensional auction mechanism (RMA) [124] and Stackelberg
contracts for different types of DOs with different levels of game mechanism (SBG) [31]. For the future work, deriving
data quality to maximize its profits, which is a function of the the optimal contract for the strongly incomplete information
time of the global iteration and the reward paid to DOs. Under scenario, and expanding existing researches to a general sce-
the constraints of IR and IC, the optimal CPU resources and nario with multiple competitive servers can be considered.
the corresponding reward can be derived using the substitution The authors in [142] leveraged on multi-dimensional con-
method and the convex optimization tools. The simulation tract theory to design a trustworthy incentive mechanism for
results show that the the task publisher can obtain higher profit the UAV-enable IoV network. The system model consists of
using the proposed contract-based method compared to the a MO, i.e., the employer, perform a time-sensitive task, and
Stackelberg game method in [138]. Based on this approach, multiple UAVs, i.e., the employees, implement sensing task
the reputation mechanism based on blockchain is introduced and transmit the collected data to the MO. The objective
in [139] to measure the reliability and trustworthiness of of this incentive is to maximize the profit for both the MO
the mobile devices, so as to motivate high-reputation DOs and UAVs. The profit of UAV is the difference between the
with high-quality data to join in model learning. To further contractual rewards and energy cost, and the profit of the
improve the accuracy of reputation computation, more weight MO is the total revenue obtained from UAV and the reward
parameters can be taken into consideration. expenses. In this incentive scheme, a UAV is categorized into
21
a multi-dimensional type-(x, y, z, q), where x is traversal cost economic and game theoretic models have been widely applied
type, y is sensing cost type, z is computation cost type, and to design various incentive scheme in different scenario. How-
q is transmission cost type. The proposed contract design ever, incentive mechanism design is in its infancy. Thus, there
includes two phase, i.e, multi-dimensional contract design and are still open issues and new research directions on incentive
traversal cost compensation. In the first phase, the sensing cost mechanisms for FL as discussed in the following.
and computation cost is transformed into a single-dimensional 1) Online-Auction based Mechanism Design: Existing mon-
contract design problem, same as the method in [140]. In etary incentive schemes based on traditional auction method
the second, a fixed compensation is added to derive the final mostly are in offline, i.e., the auction starts only if a sufficient
contract bundles by considering additional traversal cost and number of bidders must be available. For example, in [129],
transmission cost. After obtaining the optimal contract, the the model aggragator starts to determine the winner when the
matching-based UAV-subregion assignment using the Gale- number of the bids from the ENs is sufficient or a predefined
Shapley algorithm [143] is considered to match the lowest users’ bids have to be synchronized. In such an auction, each
cost UAV to each subregion, i.e., target sensing region. The bidder may need to wait for a long time even if the bidder
simulation results demonstrate the IC property of the proposed does not win the auction. This discourages the bidder to join
contract design and show the efficiency of the matching. in the FL platform. Different from the offline auctions, online
auctions allow the auctioneer or the seller to makes decisions,
B. Incentive Mechanisms Based on Matching Theory i.e., on winner and prices, upon a bidder joins the auction.
The online auction can break the time and space constraints
Matching theory is widely used in the real market for its and save costs. Some recent works, e.g., [147] and [148], have
optimal two-side matching rule. This property is beneficial for investigated the online auction for mobile crowdsensing. Thus,
the worker selection of the requester of the FL service. the online auctions is a promising solution for the incentive
The author in [144] considered the network slicing place- mechanism design for FL.
ment strategy to offering customize service for different users’ 2) Advanced Contribution Evaluation Methods: To evaluate
heterogeneous requirements. In particular, the network slice the contribution of the data owners, Shapley value have been
controller (NSC), i.e., the seller, collects the requests from end commonly used. Based on the concept of the SV, many works
users (EU), i.e., buyers, and then deploys the virtual network have been done such as the federated SV [36] based on
functions (VNFs) to provider different communication services local model updates and contribution index [35] based on
for each of them. The proposed VNFs placement policy the intermediate results of the training process of FL. The
includes three modules, i.e., FL module, EUs behavior module, existing works demonstrate that the SV method in FL setting
and VNFs placement module. The first module is deployed has comparable performance. In addtion, a computation-and
among NSC and EUs designed to predict EUs’ VNFs demand communication-efficient estimation method is proposed in
information for each type of service. Based on the prediction [149]. The main focus of these works is to reduce the time
results, the network slice controller provide and place the complexity of the computing SV. Note that they assume that a
corresponding VNF to the area of EUs. A two-phase placement trusted server will honestly evaluate the contribution of each
strategy based on the matching theory is proposed to maximize participant. To establish a transparent evaluation mechanism,
the SP’s revenue. The SP revenue is the difference between blockchain-based SV [27] is adopted to overcome the mo-
the price payed by the EU based on its level of quality nopolistic behavior. However, all of the aforementioned works
of experience and the cost borne by the SP in provisioning do not consider the adversarial behaviours of the participants,
and placing the VNF in the service area. As shown in which may affect the the Shapley value calculation and lead
the simulation results, the proposed matching theory-based to the unfairness. More research works need to be investigated
assignment approach can reach a superior revenue compared for designing comprehensive and transparent contribution eval-
with the method based on Kolkata game [145] and potential uation mechanisms.
game [146]. 3) Incentive Mechanism with Privacy Protection: Existing
Summary: In this section, we discuss the applications of incentive mechanisms largely ignore the protection of pri-
contract and matching theory for incentive mechanism design vacy information, e.g., individual preferences, of data owners.
in FL. The objective is to maximize the seller’s utility or What’s worse, with the most current incentive mechanisms,
revenue, the buyer’s profit, and to minimize the latency of data owners who participate in the bidding process directly
FL training. The related works are summarized along with the reveal their privacy information. Meanwhile, the data owners
references in Table VI. As seen, the contract design mostly who lose in the bidding process receive no compensation at all
depend on the traditional optimization method, in the future, for their privacy revelation, which may reduce the participation
the DRL-based contract design can be considered to better enthusiasm of data owners. To protect data owners’ privacy
adapt to the changing user types. while maintaining high utility of trained models, some works
that consider the privacy problem have been done. The works
VII. S UMMARY, C HALLENGES , AND F UTURE R ESEARCH in [52], [150] proposed the game theoretic models for the FL
D IRECTIONS market, in which the data owners can obtain compensation
Different approaches reviewed in this survey show the according to their privacy loss quantified by local differential
effectiveness of economic and game theoretic models for the privacy. However, these frameworks may not be suitable
incentive mechanism design for the FL systems. Evidently, for some specific learning tasks. Thus, more research works
22
TABLE VI
A PPLICATIONS OF C ONTRACT AND M ATCHING T HEORY FOR I NCENTIVE M ECHANISM D ESIGN IN FL
Market Structure
Ref. Scenario Mechanism Objective
Seller Buyer Item
For the tradeoff between the service latency and AoI,
Smart Training Profit maximization for the MO, IR
Workers MO the frequency of data update are stipulated in the
[134] industries resource and IC
optimal contract
Vehicles’ The optimal contract problem is converted a equivalent
Profit maximization for both buyer
Vehicular local low-complexity problem, and then the equilibrium
IoV network Vehicles and sellers, QoI improvement for
[135] SP training solution for vehicles is obtained using the iteration
sellers, IR and IC
results algorithm.
Electric vehicle Profit maximization for buyers,
SGP CSs Power Same as [135]
[136] networks IC,and IR
Profit maximization for the buyer,
Mobile Task Data The optimal contributed data size and reward are
[137] Mobile network guarantee latency requirement, IR
devices publisher resources obtained using the substitution method.
[139] and IC
Payoff maximization for sellers and
The The optimal contract can be derived by converting the
Data cost minimization for the buyer,
[140] Efficient FL Users central two-dimensional information into a single-dimensional
resources guarantee latency requirement, IR
[141] server criterion.
and IC
The optimal contract can be obtained using the same
UAV-enable Sensing Profit maximization for both sellers
UAVs MO method in [140], and the lowest cost UAV is allocated
[142] IoV network task and the buyer, IR and IC.
each subregion based on the Gale-Shapley algorithm.
Network Revenue maximization for sellers,
The NVFs placement strategy is modeled as a
Network slicing NSC EUs slice and guarantee the QoE requirement
[144] two-phase matching game
service of the buyer
need to be investigated for both incentiving data owners and and background of FL and incentive mechanism design. Then,
protecting their privacy. we have introduced fundamentals of various economic and
4) Comprehensive Incentive Scheme: The most of current game models with the aim to understand the motivations of
incentive mechanisms aim to optimize a single objective. For using the economic and game models in incentive mechanism
example, the works [57], [75], [151] aim to select the optimal design of FL. After that, we have provided detailed reviews,
participants for maximizing the performance of the FL system. analyses, and comparisons of the economic and game theoretic
Also, some works related to the Shapley value aim to evaluate approaches in designing variety of incentive mechanisms of
the contribution of the participants to guarantee the fairness. FL. Finally, we have discussed some open issues and future
In fact, multiple objectives should be considered in the FL. research directions.
For example, the participant selection and the contribution
evaluation should be jointly investigated to maximize both the
FL performance while guaranteeing the fairness. Therefore, R EFERENCES
designing incentive mechanisms with multiple objectives are
[1] S. Pouyanfar, S. Sadiq, Y. Yan, H. Tian, Y. Tao, M. Reyes, M. Shyu,
needed. S. Chen, and S. S. Iyengar, “A survey on deep learning: Algorithms,
5) Integration of Incentive Mechanism and Cutting-edge techniques, and applications,” ACM Comput. Surv., vol. 51, no. 5, pp.
Technologies: Recently, DRL-based incentive mechanism de- 1–36, 2018.
[2] W. Hatcher and W. Yu, “A survey of deep learning: Platforms,
signs have attracted great attention and achieved good results, applications and emerging research trends,” IEEE Access, vol. 6, pp.
such as [83], [57]. Thus, some cutting-edge technologies, such 24 411–24 432, 2018.
as graph neural networks, generative adversarial networks, [3] I. Goodfellow, Y. Bengio and A. Courville, Deep Learning. MIT
multi-agent reinforcement learning, etc., might find its poten- Press, 2016, http://www.deeplearningbook.org.
[4] X. Zhu, H. Li and Y. Yang, “Blockchain-based privacy preserving deep
tial application in the incentive design of new scenarios like learning,” in Proc. Int. Conf. Information Security and Cryptology.
mobile edge computing and 5G/B5G. Springer, 2018, pp. 370–383.
6) Multiple Incentive Schemes Co-Existence: Different par- [5] Y. Mao, C. You, J. Zhang, K. Huang, and K. B. Letaief, “A survey
on mobile edge computing: The communication perspective,” IEEE
ticipants might prefer different kinds of incentives, however Commun. Surveys Tuts., vol. 19, no. 4, pp. 2322–2358, 2017.
the current FL systems only adopt one single incentive mech- [6] Y. Zhao, J. Chen, J. Zhang, D. Wu, J. Teng and S. Yu, “Pdgan: A novel
anism (i.e., monetary incentive mechanism). Multiple incentive poisoning defense method in federated learning using generative adver-
sarial network,” in Proc. Int. Conf. Algor. Archi. Parallel Processing,
schemes allow different participants simultaneously obtain dif- ICA3PP. Springer, 2019, pp. 595–609.
ferent types of rewards. This idea faces some challenges, such [7] J. Zhang, B. Chen, X, Cheng, H. T. T. Binh and S. Yu, “PoisonGAN:
as how to determine what type of reward should be offered to Generative poisoning attacks against federated learning in edge com-
puting systems,” IEEE Internet Things J., vol. 8, no. 5, pp. 3310–3322,
which participant, and who make this decisions. In addition, 2021.
when a new participant arrives, the FL platform should be able [8] J. Zhang, J. Chen, D. Wu, B. Chen and S. Yu, “Poisoning attack in
to flexibly design a new scheme that is individualized to the federated learning using generative adversarial nets,” in Proc. 18th
new participant according to its preferences. IEEE Int. Conf. Trust Security Privacy Comput. Commun./13th IEEE
Int. Conf. Big Data Sci. Eng. (TrustCom/BigDataSE), Rotorua, New
Zealand, 2019, August 5-8, 2019, 2019, pp. 374–380.
[9] Z. Chen, A. Fu, Y. Zhang, Z. Liu, F. Zeng and R. H. Deng, “Secure
VIII. C ONCLUSION collaborative deep learning against gan attacks in the internet of things,”
This paper has presented a comprehensive survey on the IEEE Internet Things J., vol. 8, no. 7, pp. 5839–5849, 2020.
[10] H. Qiu, K. Zhu, N. C. Luong, C. Yi, D. Niyato, and D. I. Kim,
applications of economic and game theories to incentive mech- “Applications of auction and mechanism design in edge computing:
anism design in FL. Firstly, we have reviewed the fundamental A survey,” arXiv preprint arXiv:2105.03559, 2021.
23
[11] W. Y. B. Lim, N. C. Luong, D. T. Hoang, Y. Jiao, Y. Liang, Q. Yang, [33] G. Wang, C. X. Dang and Z. Zhou, “Measure contribution of partici-
D. Niyato, and C. Miao, “Federated learning in mobile edge networks: pants in federated learning,” in Proc. 2019 IEEE Int. Conf. Big Data
A comprehensive survey,” IEEE Commun. Surveys Tuts., vol. 22, no. 3, (Big Data), Los Angeles, CA, USA, December 9-12, 2019, pp. 2597–
pp. 2031–2063, 2020. 2604.
[12] Y. Zhan, J. Zhang, Z. Hong, L. Wu, P. Li and S. Guo, “A survey of [34] R. H. L. Sim, Y. Zhang, M. C. Chan and B. K. H. Low, “Collaborative
incentive mechanism design for federated learning,” IEEE Trans. Eme. machine learning with incentive-aware model rewards,” in Proc. Pro-
Topics Comput., vol. PP, pp. 1–1, 2021. ceedings of the 37th Int. Conf. Machine Learning, ICML 2020, Virtual
[13] R. Zeng, C. Zeng, X. Wang, B. Li and X. Chu, “A comprehensive Event, 13-18 July 2020. PMLR, pp. 8927–8936.
survey of incentive mechanism for federated learning,” arXiv preprint [35] T. Song, Y. Tong and S. Wei, “Profit allocation for federated learning,”
arXiv:2106.15406, 2021. in Proc. 2019 IEEE Int. Conf. Big Data (Big Data), Los Angeles, CA,
[14] Q. Yang, Y. Liu, T.Chen and Y. Tong, “Federated machine learning: USA, December 9-12, 2019, pp. 2577–2586.
Concept and applications,” ACM Trans. Intell. Sys. Technol., vol. 10, [36] T. Wang, J. Rausch, C. Zhang, R. Jia and D. Song, “A principled ap-
no. 2, pp. 1–19, 2019. proach to data valuation for federated learning,” in Federated Learning
[15] J. Konečnỳ, H. B. McMahan, D. Ramage, and P. Richtárik, “Federated - Privacy and Incentive. Springer, 2020, vol. 12500, pp. 153–167.
optimization: Distributed machine learning for on-device intelligence,” [37] A. Richardson, A. F.-Ratsikas and B. Faltings, “Budget-bounded in-
arXiv preprint arXiv:1610.02527, 2016. centives for federated learning,” in Federated Learning - Privacy and
[16] J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh and D. Incentive. Springer, 2020, vol. 12500, pp. 176–188.
Bacon, “Federated learning: Strategies for improving communication [38] Y. Xue, C. Niu, Z. Zheng, S. Tang, C. Lv, F. Wu, G. Chen, “Toward
efficiency,” CoRR, vol. abs/1610.05492, 2016. [Online]. Available: understanding the influence of individual clients in federated learning,”
http://arxiv.org/abs/1610.05492 in Proc. The AAAI Conf. Arti. Intell., vol. 35, no. 12, 2021, pp. 10 560–
[17] H. B. McMahan, E. Moore, D. Ramage and B. A. y Arcas, “Federated 10 567.
learning of deep networks using model averaging.” arXiv preprint [39] P. Roy, S. Sarker, Md. A. Razzaque, Md. M. Rashid, M. M. Hassan
arXiv:1602.05629, 2016. and G. Fortino, “Distributed task allocation in mobile device cloud
[18] J. Weng, J. Weng, M. Li, Y. Zhang and W. Luo, “Deepchain: Auditable exploiting federated learning and subjective logic,” J. Syst. Archit., vol.
and privacy-preserving deep learning with blockchain-based incentive,” 113, p. 101972, 2021.
IEEE Trans. Dependable Secur. Comput., vol. 18, no. 5, pp. 2438–2455, [40] J. Kang, Z. Xiong, D. Niyato, Y. Zou, Y. Zhang and M. Guizani, “Re-
2021. liable federated learning for mobile networks,” IEEE Wirel. Commun.,
[19] H. Yu, Z. Liu, Y. Liu, T. Chen, M. Cong, X. Weng and D. Niyato and vol. 27, no. 2, pp. 72–80, 2020.
Q. Yang, “A fairness-aware incentive scheme for federated learning,” [41] L. Feng, Z. Yang, S. Guo, X. Qiu, W. Li and P. Yu, “Two-layered
in Proc. AIES ’20: AAAI/ACM Conference on AI, Ethics, and Society, blockchain architecture for federated learning over mobile edge net-
New York, NY, USA, February 7-8, 2020, pp. 393–399. work,” IEEE Network, vol. PP, pp. 1–14, 2021.
[20] H.Yu, Z. Liu, Y. Liu, T. Chen, M. Cong, X. Weng and D. Niyato and [42] L. Lyu, X. Xu, Q. Wang and H. Yu, “Collaborative fairness in federated
Q. Yang, “A sustainable incentive scheme for federated learning,” IEEE learning,” in Federated Learning - Privacy and Incentive. Springer,
Intell. Syst., vol. 35, no. 4, pp. 58–69, 2020. 2020, vol. 12500, pp. 189–204.
[21] J. Zhang,C. Li, A. R. Kelly and M. S. Kankanhalli, “Hierarchically [43] X. Huang, R. Yu, J. Kang, Z. Xia and Y. Zhang, “Software defined
fair federated learning,” CoRR, vol. abs/2004.10386, 2020. [Online]. networking for energy harvesting internet of things,” IEEE Internet
Available: https://arxiv.org/abs/2004.10386 Things J., vol. 5, no. 3, pp. 1389–1399, 2018.
[22] J. Huang, R. Talbi, Z. Zhao, S. Boucchenak, L. Y. Chen, and S. Roos, [44] W. Zhang, Q. Lu, Q. Yu, Z. Li, Y. Liu, S. K. Lo, S. Chen, X. Xu
“An exploratory analysis on users’ contributions in federated learning,” and L. Zhu, “Blockchain-based federated learning for device failure
in Proc. 2th IEEE Int.l Conf. Trust, Priv. Secur. Intell. Sys. Appl. (TPS- detection in industrial iot,” IEEE Internet Things J., vol. 8, no. 7, pp.
ISA),2020, pp. 20–29. 5926–5937, 2021.
[23] B. Liu, B. Yan, Y. Zhou, Z. Liang and C. Xu, “Fedcm: A real- [45] K. Toyoda and A. N. Zhang, “Mechanism design for an incentive-
time contribution measurement method for participants in federated aware blockchain-enabled federated learning platform,” in Proc. IEEE
learning,” in Porc. Int. Joi. Con. Neur. Netw., IJCNN 2021, Shenzhen, Int. Conf. Big Data, Los Angeles, CA, USA, December 9-12, 2019,
China, July 18-22, 2021, 2021, pp. 1–8. 2019, pp. 395–403.
[24] T. Nishio, R. Shinkuma and Na. B. Mandayam, “Estimation of individ- [46] X. Bao, C. Su, Y. Xiong, W. Huang and Y. Hu, “Flchain: A blockchain
ual device contributions for incentivizing federated learning,” in Proc. for auditable federated learning with trust and incentive,” in Proc.
IEEE Globecom Workshops, GLOBECOM Workshops, Virtual Event, 5th Int. Conf. Big Data Comput. Commun., IEEE BIGCOM 2019,
Taiwan, December 7-11, 2020, pp. 1–6. QingDao, China, August 9-11, 2019, 2019, pp. 151–159.
[25] J. Zhao, X. Zhu, J. Wang, and J. Xiao, “Efficient client contribution [47] J. Nash, “Non-cooperative games,” Ann. Math., vol. 54, pp. 286–295,
evaluation for horizontal federated learning,” in Proc. 2021 IEEE Int. 1951.
Conf. Acoust., Speech Signal Process. (ICASSP), Toronto, Canada, [48] G. Debreu, “A social equilibrium existence theorem,” Nat. Aca. Sci.,
June 6-11, 2021, pp. 3060–3064. vol. 38, no. 10, pp. 886–893, 1952.
[26] H. Lv, Z. Zheng, T. Luo, F. Wu, S. Tang, L. Hua, R. Jia and C. Lv, [49] K. Fan, “Fixed-point and minimax theorems in locally convex topolog-
“Data-free evaluation of user contributions in federated learning,” in ical linear spaces,” Nat. Aca. Sci., vol. 38, no. 2, pp. 121–126, 1952.
Proc. 19th Int. Symp. Model. Optim. Mob., Ad hoc, Wirel. Networks, [50] I. L Glicksberg, “A further generalization of the kakutani fixed point
WiOpt 2021, Virtual Conference, October 18-21, 2021, pp. 81–88. theorem, with application to nash equilibrium points,” Ame. Math. Soc.,
[27] S. Ma, Y. Cao and L. Xiong, “Transparent contribution evaluation vol. 3, no. 1, pp. 170–174, 1952.
for secure federated learning on blockchain,” in Proc.37th IEEE Int. [51] S. Kakutani, “A generalization of brouwer’s fixed point theorem,” Duke
Conf. Data Eng. Workshops,ICDE Workshops,Chania,Greece,April 19- Math. J., vol. 8, no. 3, pp. 457 – 459, 1941.
22, 2021, pp. 88–91. [52] B. Pejo, Q. Tang and G. Biczók, “Together or alone: The price of
[28] Y. Liu, Z. Ai, S. Sun, S. Zhang, Z. Liu and H. Yu, “Fedcoin: A peer- privacy in collaborative learning,” Proc. Priv. Enhancing Technol.,
to-peer payment system for federated learning,” in Federated Learning no. 2, pp. 47–65, 2019.
- Privacy and Incentive. Springer, 2020, vol. 12500, pp. 125–138. [53] D. Fudenberg and J. Tirole, Game Theory. The MIT Press, 1991.
[29] Y. Liu and J. Wei, “Incentives for federated learning: a hypothesis [54] J. Weng, J. Weng, H. Huang, C. Cai and C. Wang, “Fedserving: A fed-
elicitation approach,” CoRR, vol. abs/2007.10596, 2020. [Online]. erated prediction serving framework based on incentive mechanism,” in
Available: https://arxiv.org/abs/2007.10596 Proc. 40th IEEE Conf. Compu.Commun., INFOCOM 2021, pp. 1–10.
[30] Y. Sarikaya and O. Ercetin, “Motivating workers in federated learning: [55] R. Amir and I. Grilo, “Stackelberg versus cournot equilibrium,” Games
A stackelberg game perspective,” IEEE Netw. Lett., vol. 2, pp. 23–27. Eco. Beha., vol. 26, no. 1, pp. 1–21, 1999.
[31] S. Feng, D. Niyato, P. Wang, D. I. Kim and Y. Liang, “Joint service [56] R. J Aumann, “Backward induction and common knowledge of ratio-
pricing and cooperative relay communication for federated learning,” nality,” Games Eco. Beha., vol. 8, no. 1, pp. 6–19, 1995.
in Proc. 2019 Int. Conf. Internet Things (iThings) IEEE Green Comput. [57] Y. Zhan and J. Zhang, “An incentive mechanism design for efficient
Commun. (GreenCom) IEEE Cyber, Phys. Social Comput. (CPSCom) edge learning by deep reinforcement learning approach,” in Proc.
IEEE Smart Data (SmartData), 2019, pp. 815–820. 39th IEEE Conf. Comput. Commun., INFOCOM 2020, Toronto, ON,
[32] L. S. Shapley, A Value for n-Person Games. Princeton Canada, July 6-9, 2020, pp. 2489–2498.
University Press, 2016, pp. 307–318. [Online]. Available: [58] R P. McAfee and J. McMillan, “Auctions and Bidding,” J. Eco. Lit.,
https://doi.org/10.1515/9781400881970-018 vol. 25, no. 2, pp. 699–738, June 1987.
24
[59] Y. Zhang,C. Lee, D. Niyato and P. Wang, “Auction approaches for [84] Y. Ye, “Combining binary search and newton′s method to compute
resource allocation in wireless systems: A survey,” IEEE Commun. real roots for a class of real functions,” J. Compl., vol. 10, no. 3, pp.
Surveys Tuts., vol. 15, no. 3, pp. 1020–1041, 2013. 271–280, sep 1994.
[60] R. Rayan and R. Ramalavanya, “A survey on secure data offloading [85] Y. Zou, S. Feng, J. Xu, S. Gong, D. Niyato and W. Cheng, “Dynamic
using auction based mechanism,” in Proc. 2017 Int. Conf. Comput. games in federated learning training service market,” in Proc. IEEE
Power, Energy Informat. Commu. (ICCPEIC), 2017, pp. 335–341. Pacific Rim Conf. Commun., Comput.Sign. Process., PACRIM 2019,
[61] N. C. Luong, D. T. Hoang, P. Wang, D. Niyato and Z. Han, “Applica- Victoria, BC, Canada, August 21-23, 2019, pp. 1–6.
tions of economic and pricing models for wireless network security: A [86] W. Cheng, Y. Zou, J. Xu and W. Liu, “Dynamic games for social model
survey,” IEEE Commun. Surveys Tuts., vol. 19, no. 4, pp. 2735–2767, training service market via federated learning approach,” IEEE Trans.
2017. Comput. Social Syst., 2021.
[62] D. L.Reiley, “Vickrey auctions in practice: From nineteenth-century [87] TABAK, DANTEL, “Numerical solutions of differential game prob-
philately to twenty-first-century e-commerce,” J. Eco. Pers., vol. 14, lems,” Int. J. Syst. Sci., vol. 6, no. 6, pp. 591–599, 1975.
no. 3, pp. 183–192, September 2000. [88] Y. Zou, S. Feng, D. Niyato, Y.o Jiao, S. Gong and W. Cheng,
[63] S. Kim, “Incentive design and differential privacy based federated “Mobile device training strategies in federated learning: An evolu-
learning: A mechanism design perspective,” IEEE Access, vol. 8, pp. tionary game approach,” in Proc. 2019 Int. Conf. Internet Things
187 317–187 325, 2020. (iThings) IEEE Green Comput. Commun. (GreenCom) IEEE Cy-
[64] F. Daniel, R.John , The double auction market: Institutions, theories, ber, Phy. Soc. Comput. (CPSCom) IEEE Smart Data (SmartData),
and evidence. Westview Press, 1993, vol. 14. iThings/GreenCom/CPSCom/SmartData 2019, pp. 874–879.
[89] R. J DiPerna and P. Lions, “Ordinary differential equations, transport
[65] P. Cramton, Y. Shoham and RSteinberg, “Combinatorial auctions,” MIT
theory and sobolev spaces,” Inventiones mathematicae, vol. 98, no. 3,
Press Books, vol. 1, 2010.
pp. 511–547, 1989.
[66] F. Hsieh, “Combinatorial reverse auction based on revelation of la- [90] Sastry, Shankar, “Lyapunov stability theory,” in Nonlinear systems.
grangian multipliers,” Deci. Supp. Sys., vol. 48, no. 2, pp. 323–330, Springer, 1999, pp. 182–234.
2010. [91] G. Radanovic and B. Faltings, “Incentives for truthful information
[67] J. Xu, H. Wang and L. Chen, “Bandwidth allocation for multiple elicitation of continuous signals,” in Proc. The 28th AAAI Conf. Arti.
federated learning services in wireless edge networks,” CoRR, 2021. Intell., July 27 -31, 2014, Québec City, Québec, Canada, C. E. Brodley
[Online]. Available: https://arxiv.org/abs/2101.03627 and P. Stone, Eds., 2014, pp. 770–776.
[68] P. Bolton, M. Dewatripont,etc., Contract theory. MIT press, 2005. [92] E. Tahanian, M. Amouei, H. Fateh and M. Rezvani, “A game-theoretic
[69] A. E Roth and M. Sotomayor, “Two-sided matching,” Hand. game approach for robust federated learning,” Int. J. Eng., vol. 34, pp. 832–
theory eco. appl., vol. 1, pp. 485–541, 1992. 842, 2021.
[70] M. Wu, D. Ye, J. Ding, Y. Guo, R. Yu and M. Pan, “Incentivizing [93] P. Blanchard, R. Guerraoui, J. Stainer, “Machine learning with ad-
differentially private federated learning: A multi-dimensional contract versaries: Byzantine tolerant gradient descent,” in Proc. Neural Inf.
approach,” IEEE Internet Things J., pp. 1–1, 2021. Process. Syst. (NeurIPS), Long Beach, CA, USA,December 4-9, 2017,
[71] W. Y. B. Lim, S. Garg, Z. Xiong, D. Niyato, C. Leung, C. Miao and pp. 119–129.
M. Guizani, “Dynamic contract design for federated learning in smart [94] B. McMahan, E. Moore, D. Ramage, S. Hampson and B. A. y Arcas,
healthcare applications,” IEEE Internet Things J., pp. 1–1, 2020. “Communication-efficient learning of deep networks from decentral-
[72] P. Sun, H. Che, Z. Wangm Y. Wang, T. Wang, L. Wu and H. ized data,” in Proc. The 20th Int. Conf. Arti. Intell. Stats, AISTATS, 20-
Shao, “Pain-fl: Personalized privacy-preserving incentive for federated 22 April 2017, Fort Lauderdale, FL, USA, ser. Proceedings of Machine
learning,” IEEE J. Sel. Areas Commun., 2021. Learning Research, vol. 54, pp. 1273–1282.
[73] D. Chen, C. S. Hong, L. Wang, Y. Zha, Y. Zhang, X. Liu and Z. Han, [95] M. Tang and V. W. S. Wong, “An incentive mechanism for cross-silo
“Matching theory based low-latency scheme for multi-task federated federated learning: A public goods perspective,” in Proc. 40th IEEE
learning in mec networks,” IEEE Internet Things J., 2021. Conf. Compu. Commun., INFOCOM 2021, pp. 1–10.
[74] J. Kang, Z. Xiong, D. Niyato, Z. Cao and A. Leshem, “Training task [96] N. Chatzipanagiotis, D. Dentcheva and M. M. Zavlanos, “An aug-
allocation in federated edge learning: A matching-theoretic approach,” mented lagrangian method for distributed optimization,” Math. Pro-
in Proc. 2020 IEEE 17th Annual Cons. Commun. Netw. Conf. (CCNC). gram., vol. 152, no. 1-2, pp. 405–434, 2015.
IEEE, 2020, pp. 1–6. [97] Q. Hu, F. Li, X. Zou and Y. Xiao, “Correlated participation decision
[75] L. U. Khan, S. R. Pandey, N. H. Tran, W. Saad, Z. Han, M. N. making for federated edge learning,” in Proc. IEEE Global Commun.
H. Nguyen and C. S. Hong, “Federated learning for edge networks: Conf., GLOBECOM 2020, Virtual Event, Taiwan, December 7-11,
Resource optimization and incentive mechanism,” IEEE Commun. 2020, pp. 1–6.
Mag., vol. 58, no. 10, pp. 88–93, 2020. [98] C. Hasan, “Incentive mechanism design for federated learning:
[76] S. R. Pandey, N. H. Tran , M. Bennis, Y. K. Tun, Z. Han and C. S. Hedonic game approach,” CoRR, 2021. [Online]. Available:
Hong, “Incentivize to build: A crowdsourcing framework for federated https://arxiv.org/abs/2101.09673
learning,” in Proc. IEEE Global Commun. Conf., (GLOBECOM), [99] W. Y. B. Lim, Z. Xiong, C. Miao, D. Niyato, Q. Yang, C. Leung and
Waikoloa, HI, USA, December 9-13, 2019, pp. 1–6. H. V. Poor, “Hierarchical incentive mechanism design for federated
machine learning in mobile networks,” IEEE Internet Things J., vol. 7,
[77] J. Lee, D. Kim and D. Niyato, “Market analysis of distributed learn-
pp. 9575–9588, 2020.
ing resource management for internet of things: A game-theoretic
[100] X. Qu, S. Wangm Q. Hu and X. Cheng, “Proof of federated learning:
approach,” IEEE Internet Things J., vol. 7, no. 9, pp. 8430–8439, 2020.
A novel energy-recycling consensus algorithm,” IEEE Trans. Parallel
[78] L. Dong and Y. Zhang, “Federated learning service market: A game Distrib. Syst., vol. 32, no. 8, pp. 2074–2085, 2021.
theoretic analysis,” in Proc.IEEE Int. Conf. Wirel. Commun. Signal [101] K. Donahue and J. M. Kleinberg, “Model-sharing games: Analyzing
Process. (WCSP), Nanjing, China, October 21-23, 2020, 2020, pp. federated learning under,” in Proc. The 35th AAAI Conf. Arti. Intell.,
227–232. IAAI 2021, Virtual Event, February 2-9, 2021, pp. 5303–5311.
[79] W. Sun, N. Xu, L. Wang, H. Zhang and Y. Zhang, “Dynamic digital [102] D. H. N. Nguyen, Y. Zhang and Z. Han, “A contract-theoretic approach
twin and federated learning with incentives for air-ground networks,” to spectrum resource allocation in wireless virtualization,” in in Proc.
IEEE Trans. Netw. Sci. and Eng., vol. PP, pp. 1–1, 2020. 2016 IEEE Global Commun. Conf. (GLOBECOM). IEEE, 2016, pp.
[80] R. Hu and Y. Gong, “Trading data for learning: Incentive mechanism 1–6.
for on-device federated learning,” in Proc. IEEE Global Commun. [103] S. Diamond, S. Boyd, “Cvxpy: A python-embedded modeling language
Conf., (GLOBECOM). for convex optimization,” J. Mach. Learn. Res., vol. 17, no. 1, p.
[81] H. Chai, S. Leng, Y. Chen and K. Zhang, “A hierarchical blockchain- 2909–2913, 2016.
enabled federated learning algorithm for knowledge sharing in internet [104] L. Gao, X. Wang, Y. Xu and Q. Zhang, “Spectrum trading in cognitive
of vehicles,” IEEE Trans. Intell. Transp. Syst., vol. PP, pp. 1–12, 2020. radio networks: A contract-theoretic modeling approach,” IEEE J. Sel.
[82] X. Lin, J. Wu, A. K. Bashir, J. Li, W. Yang and J. Piran, “Blockchain- Areas Commun., vol. 29, pp. 843–855, 2011.
based incentive energy-knowledge trading in iot: Joint power transfer [105] K. R. Apt and T. Radzik, “Stable partitions in coalitional games,”
and ai design,” IEEE Internet Things J., vol. PP, pp. 1–1, 2020. Computer ence, 2006.
[83] Y. Zhan, P. Li, Z. Qu, D. Zeng and S. Guo, “A learning-based incentive [106] J. Jaramillo and R. Srikant, “A game theory based reputation mecha-
mechanism for federated learning,” IEEE Internet Things J., vol. 7, nism to incentivize cooperation in wireless ad hoc networks,” Ad Hoc
no. 7, pp. 6360–6368, 2020. Netw., vol. 8, no. 4, pp. 416–429, 2010.
25
[107] J. Ren, Y. Zhang, K. Zhang and X. Shen, “SACRM: social aware cellular wireless networks,” in Proc. 2020 IEEE Wirel. Commun. Netw.
crowdsourcing with reputation management in mobile sensing,” Com- Conf., WCNC 2020, Seoul, Korea (South), May 25-28, 2020.
put. Commun., vol. 65, pp. 55–65. [129] R. Zeng, S. Zhang, J. Wang and X. Chu, “Fmore: An incentive scheme
[108] A. Jin, W. Song, P. Wang, D. Niyato and P. Ju, “Auction mechanisms of multi-dimensional auction for federated learning in MEC,” in Proc.
toward efficient resource sharing for cloudlets in mobile cloud com- 40th IEEE Int. Conf. Distrib. Comput. Sys., ICDCS 2020, Singapore,
puting,” IEEE Trans. Serv. Comput., vol. 9, no. 6, pp. 895–909, 2016. Nov. 29 - Dec. 1, 2020, pp. 278–288.
[109] X. Wang, Y. Sui, J. Wang, C. Yuen and W. Wu, “A distributed truthful [130] Y. Deng, F. Lyu, J. Ren, Y. Chen, P. Yang, Y. Zhou, and Y. Zhang, “Fair:
auction mechanism for task allocation in mobile cloud computing,” Quality-aware federated learning with precise user incentive and model
IEEE Trans. Serv. Comput., vol. 14, no. 3, pp. 628–638, 2021. aggregation,” in Proc. IEEE INFOCOM 2021-IEEE Conf. Comput.
[110] D.P. Palomar and C. Mung , “A tutorial on decomposition methods for Commun., pp. 1–10.
network utility maximization,” IEEE J. Sel. Areas Commun., vol. 24, [131] J. Li, Y. Zhu, Y. Hua and J. Yu, “Crowdsourcing sensing to smart-
no. 8, pp. 1439–1451, 2006. phones: A randomized auction approach,” IEEE Trans. Mobile Com-
[111] W. Y. B. Lim, J. S. Ng, Z. Xiong, D. Niyato, C. Leung, C. Miao put., vol. 16, pp. 2764–2777, 2017.
and Q. Yang, “Incentive mechanism design for resource sharing [132] W. Zhong, K. Xie, Y. Liu, C. Yang and Sh. Xie, “Efficient auction
in collaborative edge learning,” CoRR, 2020. [Online]. Available: mechanisms for two-layer vehicle-to-grid energy trading in smart grid,”
https://arxiv.org/abs/2006.00511 in Proc.2017 IEEE Int. Conf. Commun. (ICC), 2017.
[112] J. S. Ng, W. Y. B. Lim, H. Dai, Z/ Xiong, J. Huang, D. Niyato, X. Hua, [133] R. B Myerson, “Optimal auction design,” Mathematics of operations
C. Leung and C. Miao, “Communication-efficient federated learning research, vol. 6, no. 1, pp. 58–73, 1981.
in uav-enabled iov: A joint auction-coalition approach,” in Proc. IEEE [134] W. Y. B. Lim, Z. Xiong, J. Kang, D. Niyato, C. S Leung, C. Miao and S.
GLOBECOM 2020 - 2020 IEEE Global Commun. Conf. Shen, “When information freshness meets service latency in federated
[113] J. S. Ng, W. Y. B. Lim, H. Dai, Z. Xiong, J. Huang, D. Niyato, X. Hua, learning: A task-aware incentive scheme for smart industries,” IEEE
C. Leung and C. Miao, “Joint auction-coalition formation framework Trans. Ind. Informat., 2020.
for communication-efficient federated learning in uav-enabled internet [135] Y. M. Saputra, D. T. Hoang, D. N. Nguyen and E.
of vehicles,” IEEE Trans. Intell. Transp. Syst., vol. 22, no. 4, pp. 2326– Dutkiewicz, “Dynamic federated learning-based economic framework
2344, 2021. for internet-of-vehicles,” CoRR, 2021. [Online]. Available:
[114] M. Cong, H. Yu, X. Weng, J. Qu, Y. Liu and S. Yiu, “A vcg-based fair https://arxiv.org/abs/2101.00191
incentive mechanism for federated learning,” CoRR, 2020. [Online]. [136] Y. M. Saputra, D. Nguyen, H. T. Dinh, T. X. Vu, E. Dutkiewicz and
Available: https://arxiv.org/abs/2008.06680 S. Chatzinotas, “Federated learning meets contract theory: Economic-
[115] M. Cong, X. Weng, H. Yu, J. Qu and S. Yiu, “Optimal procurement efficiency framework for electric vehicle networks,” IEEE Trans. Mo-
auction for cooperative production of virtual products: Vickrey- bile Comput., pp. 1–1, 2020.
clarke-groves meet cremer-mclean,” CoRR, 2020. [Online]. Available: [137] J. Kang, Z. Xiong, D. Niyato, H. Yu, Y. Liang and D. I. Kim, “Incentive
https://arxiv.org/abs/2007.14780 design for efficient federated learning in mobile networks: A contract
[116] M. Cong, H. Yu, X. Weng, and S. M. Yiu, “A game-theoretic framework theory approach,” in Porc. IEEE VTS Asia Pac. Wirel. Commun. Symp.,
for incentive mechanism design in federated learning,” in Federated APWCS 2019, Singapore, August 28-30, 2019, pp. 1–5.
Learning. Springer, 2020, pp. 205–222. [138] J. Kang, Z. Xiong, D. Niyato, D. Ye, D. I. Kim and J. Zhao, “Toward
[117] J. Crémer and R. P. McLean, “Optimal selling strategies under uncer- secure blockchain-enabled internet of vehicles: Optimizing consensus
tainty for a discriminating monopolist when demands are interdepen- management using reputation and contract theory,” IEEE Trans. Veh.
dent,” Econometrica, vol. 53, pp. 345–361, 1985. Technol., vol. 68, pp. 2906–2920, 2019.
[118] R. Jia, D. Dao, B. Wang, F. A. Hubis, N. Hynes, N. M. Gürel, B. Li, [139] J. Kang, Z. Xiong, D. Niyato, S. Xie and J. Zhang, “Incentive mech-
C. Zhang, D. Song and C. J. Spanos, “Towards efficient data valuation anism for reliable federated learning: A joint optimization approach
based on the shapley value,” in Proc. The 22nd Int. Conf. Arti. Intell. to combining reputation and contract theory,” IEEE Internet Things J.,
Stat., AISTATS 2019, 16-18 April 2019, Naha, Okinawa, Japan, vol. 89, vol. 6, no. 6, pp. 10 700–10 714, 2019.
pp. 1167–1176. [140] N, Ding, Z. Fang and J. Huang, “Incentive mechanism design for
[119] O. Ohrimenko, S. Tople, S. Tschiatschek, “Collaborative federated learning with multi-dimensional private information,” in
machine learning markets with data-replication-robust pay- Proc. 18th Int. Symp. Model. Optim. Mob., Ad Hoc, Wirel. Networks,
ments,” CoRR, vol. abs/1911.09052, 2019. [Online]. Available: WiOPT 2020, Volos, Greece, June 15-19, 2020, pp. 1–8.
http://arxiv.org/abs/1911.09052 [141] N. Ding, Z. Fang and J. Huang, “Optimal contract design for efficient
[120] S. Gollapudi, K. Kollias, D. Panigrahi and V. Pliatsika, “Profit sharing federated learning with multi-dimensional private information,” IEEE
and efficiency in utility games,” in Proc. Eur. Symp. Alg. (ESA 2017). J. Sel. Areas Commun., vol. 39, pp. 186–200, 2021.
Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2017. [142] W. Y. B. Lim, J. Huang, Z. Xiong, J. Kang, D. Niyato, X. Hua, C.
[121] T. H. T. Le, N. H. Tran, Y. K. Tun, M. N. H. Nguyen, S. R. Pandey, Z. Leung and C.Miao, “Multi-dimensional contract-matching for federated
Han and C. S. Hong, “An incentive mechanism for federated learning learning in uav-enabled internet of vehicles,” in Proc. GLOBECOM
in wireless cellular networks: An auction approach,” IEEE Trans. Wirel. 2020 - 2020 IEEE Global Commun. Conf., pp. 1–6.
Commun., vol. 20, no. 8, pp. 4874–4887, 2021. [143] W. Y. B. Lim, J. Huang, Z. Xiong, J. Kang, D. Niyato, X. Hua,
[122] S. Fan, H. Zhang, Y. Zeng and W. Cai, “Hybrid blockchain-based C. Leung and C. Miao, “Towards federated learning in uav-enabled
resource trading system for federated learning in edge computing,” internet of vehicles: A multi-dimensional contract-matching approach,”
IEEE Internet Things J., vol. 8, pp. 2252–2264, 2021. IEEE Trans. Intell. Transp. Sys., vol. PP, pp. 1–15, 2021.
[123] N. Chen, N. Gravin and P. Lu, “On the approximability of budget [144] R. Fantacci and B. Picano, “When network slicing meets prospect
feasible mechanisms,” in Proc. The 22th ACM-SIAM Symp. Discrete theory: A service provider revenue maximization framework,” IEEE
Alg., SODA 2011, San Francisco, California, USA, January 23-25, Trans. Veh. Technol., vol. 69, no. 3, pp. 3179–3189, 2020.
2011, pp. 685–699. [145] A. S. Chakrabarti, B.K.Chakrabarti, A. Chatterjee and M. Mitra, “The
[124] Y. Jiao, P. Wang, D. Niyato, B. Lin and D. I. Kim, “Toward an kolkata paise restaurant problem and resource utilization,” Phy. A:
automated auction framework for wireless federated learning services Statis. Mech.its Appl., vol. 388, no. 12, pp. 2420–2426, 2009.
market,” IEEE Trans. Mob. Comput., vol. 20, no. 10, pp. 3034–3048, [146] “Distributed abs-slot access in dense heterogeneous networks: A poten-
2021. tial game approach with generalized interference model,” IEEE Access,
[125] J. Zhang, Y. Wu and R. Pan, “Incentive mechanism for horizontal vol. 5, pp. 94–104, 2016.
federated learning based on reputation and reverse auction,” in Proc. [147] X. Zhang, Z. Yang, Z. Zhou, H. Cai, L. Chen and X. Li, “Free market
Web Conf. 2021, 2021, pp. 947–956. of crowdsourcing: Incentive mechanism design for mobile sensing,”
[126] R. Lu, W. Zhang, Q. Li, X. Zhong, A. V. Vasilakos, “Auction based IEEE Trans. Parallel Distributed Syst., vol. 25, no. 12, pp. 3190–3200,
clustered federated learning in mobile edge computing system,” CoRR, 2014.
2021. [Online]. Available: https://arxiv.org/abs/2103.07150 [148] D. Zhao, X.Yang and H. Ma, “How to crowdsource tasks truthfully
[127] K. Zhang, Y. Shi, Y. Liu and Z. Yan, “Power demand response incentive without sacrificing utility: Online incentive mechanisms with budget
pricing model,” in Proc. 2019 IEEE Int. Conf. Big Data , Los Angeles, constraint,” in Proc. IEEE INFOCOM 2014 - IEEE Conf. Comput.
CA, USA, December 9-12, 2019, pp. 2605–2614. Commun., 2014, pp. 1213–1221.
[128] T. H. T. Le, N. H. Tran, Y. K. Tun, Z. Han and C. S. Hong, [149] T. Nishio, R. Shinkuma, and N. B. Mandayam, “Estimation of individ-
“Auction based incentive design for efficient federated learning in ual device contributions for incentivizing federated learning,” in Proc.
26