A Reinforcement Learning Algorithm Using Temporal Difference Error in Ant Model

Lee, SeungGwan; Chung, TaeChoong

doi:10.1007/11494669_27

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3512)

Included in the following conference series:

International Work-Conference on Artificial Neural Networks

Abstract

In reinforcement learning, when an agent chooses an action and makes a state transition from its current state, an important question is how to assign a reward to the chosen action. In this paper, we propose a multi-colony interaction ant reinforcement learning model that applies the temporal difference error (TD-error) to the original Ant-Q learning. The method is a hybrid of multi-colony interaction using an elite strategy and reinforcement learning that applies the TD-error to Ant-Q. Experiments show that the proposed reinforcement learning method converges to the optimal solution faster than the original ACS and Ant-Q.
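
As a rough illustration of what an Ant-Q value update looks like when written in terms of a TD-error, the sketch below uses the commonly cited Ant-Q rule AQ(r,s) ← (1−α)·AQ(r,s) + α·(reward + γ·max AQ(s,z)), which is algebraically the same as AQ(r,s) ← AQ(r,s) + α·δ with δ = reward + γ·max AQ(s,z) − AQ(r,s). This is not the update rule of this paper, which the abstract does not state, and it leaves out the multi-colony interaction and elite strategy. The function names, the nested-list table layout, and the default values of alpha and gamma are assumptions made for the example.

```python
from typing import List, Sequence


def td_error(aq: List[List[float]], r: int, s: int, reward: float,
             allowed: Sequence[int], gamma: float = 0.3) -> float:
    """TD-error for moving from city r to city s.

    delta = reward + gamma * max_z AQ(s, z) - AQ(r, s),
    where z ranges over the cities still allowed after reaching s.
    """
    # If no city is left to visit, the bootstrap term is taken as 0 (assumption).
    best_next = max((aq[s][z] for z in allowed), default=0.0)
    return reward + gamma * best_next - aq[r][s]


def ant_q_update(aq: List[List[float]], r: int, s: int, reward: float,
                 allowed: Sequence[int], alpha: float = 0.1,
                 gamma: float = 0.3) -> float:
    """Apply AQ(r, s) += alpha * delta in place and return delta."""
    delta = td_error(aq, r, s, reward, allowed, gamma)
    aq[r][s] += alpha * delta
    return delta


# Hypothetical 3-city AQ table, used only to exercise the update.
aq_table = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 4.0],
    [2.0, 4.0, 0.0],
]
delta = ant_q_update(aq_table, r=0, s=1, reward=0.5, allowed=[2],
                     alpha=0.1, gamma=0.5)
```

With these made-up numbers, δ = 0.5 + 0.5·4 − 1 = 1.5, so AQ(0,1) moves from 1.0 to 1.15.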

Author information

Authors and Affiliations

School of Computer Science and Information Engineering, Catholic University, 43-1, Yeokgok 2-Dong, Wonmi-Gu, Bucheon-Si, Gyeonggi-Do, 420-743, Korea
SeungGwan Lee
School of Electronics and Information, KyungHee University, 1 Seocheon-Ri, Kiheung-Up, Yongin-Si, Gyeonggi-Do, 449-701, Korea
TaeChoong Chung

Editor information

Editors and Affiliations

Departamento de Ingeniería Electrónica, Universitat Politècnica de Catalunya (UPC). E.T.S.I. de Telecomunicación, Campus Norte, Edificio C4, C/ Jordi Girona, 1-3, E08034, Barcelona, Spain
Joan Cabestany
Department of Computer Architecture and Computer Technology, University of Granada,
Alberto Prieto
Grupo ISIS, Dpto. Tecnología Electrónica ETSI Telecomunicación, Universidad de Málaga, Campus de Teatinos, 29071, Málaga, Spain
Francisco Sandoval

About this paper

Cite this paper

Lee, S., Chung, T. (2005). A Reinforcement Learning Algorithm Using Temporal Difference Error in Ant Model. In: Cabestany, J., Prieto, A., Sandoval, F. (eds) Computational Intelligence and Bioinspired Systems. IWANN 2005. Lecture Notes in Computer Science, vol 3512. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11494669_27

DOI: https://doi.org/10.1007/11494669_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26208-4
Online ISBN: 978-3-540-32106-4
eBook Packages: Computer Science, Computer Science (R0)
