Fuli Luo (罗福莉), Peking University. Verified email at pku.edu.cn. Cited by 22669.
Incorporating glosses into neural word sense disambiguation
Word Sense Disambiguation (WSD) aims to identify the correct meaning of polysemous words
in a particular context. Lexical resources like WordNet, which have proved to be of great …
DeepSeek-V2: A strong, economical, and efficient mixture-of-experts language model
We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized
by economical training and efficient inference. It comprises 236B total parameters, of which …
DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning
General reasoning represents a long-standing and formidable challenge in artificial
intelligence. Recent breakthroughs, exemplified by large language models (LLMs) and chain-of-…
DeepSeek-Coder: When the large language model meets programming -- the rise of code intelligence
The rapid development of large language models has revolutionized code intelligence in
software development. However, the predominance of closed-source models has restricted …
Raise a child in large language model: Towards effective and generalizable fine-tuning
Recent pretrained language models extend from millions to billions of parameters. Thus the
need to fine-tune an extremely large pretrained model with a limited training corpus arises in …
DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models
In the era of large language models, Mixture-of-Experts (MoE) is a promising architecture for
managing computational costs when scaling up model parameters. However, conventional …
DeepSeek LLM: Scaling open-source language models with longtermism
The rapid development of open-source large language models (LLMs) has been truly remarkable.
However, the scaling law described in previous literature presents varying conclusions…
DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning
General reasoning represents a long-standing and formidable challenge in artificial intelligence
(AI). Recent breakthroughs, exemplified by large language models (LLMs)1,2 and …
DeepSeek-V3 technical report
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B
total parameters, of which 37B are activated for each token. To achieve efficient inference and cost-…
DeepSeek-Coder-V2: Breaking the barrier of closed-source models in code intelligence
We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language
model that achieves performance comparable to GPT4-Turbo in code-specific tasks. …