Yang Li

Mountain View, California, United States
693 followers 500+ connections

About

Experienced engineering lead in Machine Learning and AI, with a demonstrated record of leading…


Experience

  • Meta

    San Francisco Bay Area

  • San Francisco Bay Area

  • Santa Barbara, California Area

  • Greater Seattle Area

  • Mountain View

  • Cupertino, CA

  • Beijing, China

Education

Publications

  • Guess Me If You Can: Acronym Disambiguation for Enterprises

    Proc. of the 56th Annual Meeting of the Association for Computational Linguistics (ACL'2018)

    Acronyms are abbreviations formed from the initial components of words or phrases. In enterprises, people often use acronyms to make communications more efficient. However, acronyms could be difficult to understand for people who are not familiar with the subject matter (new employees, etc.), thereby affecting productivity. To alleviate such troubles, we study how to automatically resolve the true meanings of acronyms in a given context. Acronym disambiguation for enterprises is challenging for several reasons. First, acronyms may be highly ambiguous since an acronym used in the enterprise could have multiple internal and external meanings. Second, there are usually no comprehensive knowledge bases such as Wikipedia available in enterprises. Finally, the system should be generic to work for any enterprise. In this work we propose an end-to-end framework to tackle all these challenges. The framework takes the enterprise corpus as input and produces a high-quality acronym disambiguation system as output. Our disambiguation models are trained via distant supervised learning, without requiring any manually labeled training examples. Therefore, our proposed framework can be deployed to any enterprise to support high-quality acronym disambiguation. Experimental results on real world data justified the effectiveness of our system.

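The distant-supervision idea above can be illustrated with a toy heuristic (not the paper's actual pipeline): when an expansion is written out next to its parenthesized acronym, e.g. "Latent Dirichlet Allocation (LDA)", the text itself yields labeled acronym/meaning pairs without manual annotation. A minimal sketch, with the function name and matching rule being illustrative assumptions:

```python
import re

def mine_acronym_pairs(text):
    """Harvest (acronym, expansion) training pairs from raw text.

    Distant-supervision heuristic: when the preceding words' initials
    spell a parenthesized acronym, take those words as its meaning.
    (Toy stand-in; the paper's end-to-end framework is more involved.)
    """
    pairs = []
    for m in re.finditer(r"\(([A-Z]{2,})\)", text):
        acro = m.group(1)
        words = text[: m.start()].split()
        cand = words[-len(acro):]
        if len(cand) == len(acro) and all(
            w[0].upper() == c for w, c in zip(cand, acro)
        ):
            pairs.append((acro, " ".join(cand)))
    return pairs
```

For example, `mine_acronym_pairs("We use Latent Dirichlet Allocation (LDA) here")` yields `[("LDA", "Latent Dirichlet Allocation")]`; pairs mined this way can then train a context classifier without hand-labeled examples.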
  • Entity Disambiguation with Linkless Knowledge Bases

    Proc. of the 25th International World Wide Web Conference (WWW'2016)

    Named Entity Disambiguation is the task of disambiguating named entity mentions in natural language text and linking them to their corresponding entries in a reference knowledge base (e.g. Wikipedia). Such disambiguation can help add semantics to plain text and distinguish homonymous entities. Previous research has tackled this problem by making use of two types of context-aware features derived from the reference knowledge base, namely, the context similarity and the semantic relatedness. Both features heavily rely on the cross-document hyperlinks within the knowledge base: the semantic relatedness feature is directly measured via those hyperlinks, while the context similarity feature implicitly makes use of those hyperlinks to expand entity candidates' descriptions and then compares them against the query context. Unfortunately, cross-document hyperlinks are rarely available in many closed-domain knowledge bases, and it is very expensive to manually add such links. Therefore, few algorithms can work well on linkless knowledge bases. In this work, we propose the challenging Named Entity Disambiguation with Linkless Knowledge Bases (LNED) problem and tackle it by leveraging the useful disambiguation evidences scattered across the reference knowledge base. We propose a generative model to automatically mine such evidences out of noisy information. The mined evidences can mimic the role of the missing links and help boost the LNED performance. Experimental results show that our proposed method substantially improves the disambiguation accuracy over the baseline approaches.

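The context-similarity feature the abstract refers to is easy to illustrate: compare the words around a mention against each candidate entity's description and pick the closest. A minimal bag-of-words sketch (the candidate names and descriptions are invented for illustration; the paper's features are richer and the link-based relatedness is omitted here):

```python
import math
from collections import Counter

def cosine(text_a, text_b):
    # bag-of-words cosine similarity between two texts
    ca, cb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def disambiguate(mention_context, candidates):
    # rank candidate entries by similarity between the query context
    # and each candidate's knowledge-base description
    return max(candidates, key=lambda name: cosine(mention_context, candidates[name]))
```

With two toy candidates for "Michael Jordan", a context such as "a talk on machine learning and statistics" selects the scientist's entry over the athlete's.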
  • Answering Elementary Science Questions by Constructing Coherent Scenes Using Background Knowledge

    Proc. of the Conference on Empirical Methods in Natural Language Processing (EMNLP'2015)

    Much of what we understand from text is not explicitly stated. Rather, the reader uses his/her knowledge to fill in gaps and create a coherent mental picture or "scene" depicting what the text appears to convey. The scene constitutes an understanding of the text, and can be used to answer questions that go beyond the text. Our goal is to answer elementary science questions, where this requirement is pervasive: a question will often give a partial description of a scene and ask the student about implicit information. We show that by using a simple "knowledge graph" representation of the question, we can leverage several large-scale linguistic resources to provide missing background knowledge, somewhat alleviating the knowledge bottleneck in previous approaches. The coherence of the best resulting scene, built from a question/answer-candidate pair, reflects the confidence that the answer candidate is correct, and thus can be used to answer multiple choice questions. Our experiments show that this approach outperforms competitive algorithms on several datasets tested. The significance of this work is thus to show that a simple "knowledge graph" representation allows a version of "interpretation as scene construction" to be made viable.

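The "interpretation as scene construction" idea can be sketched with a toy background graph: an answer candidate scores higher when background triples knit it into the scene the question describes. All triples, terms, and function names below are invented for illustration:

```python
# toy background knowledge: (subject, relation, object) triples standing in
# for the large-scale linguistic resources the abstract mentions
BACKGROUND = {
    ("water", "changes-to", "vapor"),
    ("heat", "causes", "evaporation"),
    ("evaporation", "produces", "vapor"),
    ("cold", "causes", "freezing"),
    ("freezing", "produces", "ice"),
}

def coherence(question_terms, answer):
    # score a question/answer-candidate pair by how many background edges
    # connect the candidate into the scene built from the question terms
    terms = set(question_terms) | {answer}
    return sum(1 for s, _, o in BACKGROUND if s in terms and o in terms)

def answer_question(question_terms, candidates):
    # pick the multiple-choice option whose resulting scene is most coherent
    return max(candidates, key=lambda a: coherence(question_terms, a))
```

For a question about heating water, the candidate "vapor" is tied into the scene by three background edges while "ice" connects through only one, so the more coherent scene selects the correct answer.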
  • Interpreting the Public Sentiment Variations on Twitter

    Transactions on Knowledge and Data Engineering (TKDE'2013)

    Millions of users share their opinions on Twitter, making it a valuable platform for tracking and analyzing public sentiment. Such tracking and analysis can provide critical information for decision making in various domains. Therefore it has attracted attention in both academia and industry. Previous research mainly focused on modeling and tracking public sentiment. In this work, we move one step further to interpret sentiment variations. We observed that emerging topics (named foreground topics) within the sentiment variation periods are highly related to the genuine reasons behind the variations. Based on this observation, we propose a Latent Dirichlet Allocation (LDA) based model, Foreground and Background LDA (FB-LDA), to distill foreground topics and filter out longstanding background topics. These foreground topics can give potential interpretations of the sentiment variations. To further enhance the readability of the mined reasons, we select the most representative tweets for foreground topics and develop another generative model called Reason Candidate and Background LDA (RCB-LDA) to rank them with respect to their "popularity" within the variation period. Experimental results show that our methods can effectively find foreground topics and rank reason candidates. The proposed models can also be applied to other tasks such as finding topic differences between two sets of documents.

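A crude stand-in for the foreground/background split can show the intuition: terms overrepresented during the sentiment-variation period, relative to the longstanding background, point at the reason for the shift. This is frequency-ratio scoring rather than the FB-LDA generative model, and the toy tweets are invented:

```python
from collections import Counter

def foreground_terms(variation_docs, background_docs, top=2):
    # score each word seen in the variation period by how overrepresented
    # it is relative to the background corpus; FB-LDA does this properly
    # with a generative topic model, this is only a frequency-ratio sketch
    fg = Counter(w for d in variation_docs for w in d.lower().split())
    bg = Counter(w for d in background_docs for w in d.lower().split())
    score = {w: c / (1 + bg[w]) for w, c in fg.items()}
    return sorted(score, key=score.get, reverse=True)[:top]
```

Given tweets from a variation period dominated by an oil spill and an ordinary background stream, the top-scoring terms surface the emerging (foreground) topic.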
  • Memory Efficient Minimum Substring Partitioning

    Proc. of the 39th International Conference on Very Large Databases (VLDB'2013)

    Massively parallel DNA sequencing technologies are revolutionizing genomics research. Billions of short reads generated at low costs can be assembled for reconstructing the whole genomes. Unfortunately, the large memory footprint of the existing de novo assembly algorithms makes it challenging to get the assembly done for higher eukaryotes like mammals. In this work, we investigate the memory issue of constructing the de Bruijn graph, a core task in leading assembly algorithms, which often consumes several hundred gigabytes of memory for large genomes. We propose a disk-based partition method, called Minimum Substring Partitioning (MSP), to complete the task using less than 10 gigabytes of memory, without runtime slowdown. MSP breaks the short reads into multiple small disjoint partitions so that each partition can be loaded into memory, processed individually and later merged with others to form a de Bruijn graph. By leveraging the overlaps among the k-mers (substrings of length k), MSP achieves an astonishing compression ratio: the total size of partitions is reduced from Θ(kn) to Θ(n), where n is the size of the short read database, and k is the length of a k-mer. Experimental results show that our method can build de Bruijn graphs using a commodity computer for any large-volume sequence dataset.

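The compression claim above rests on the minimum-substring (minimizer) trick: consecutive k-mers of a read usually share the same minimum p-substring, so they fall in the same partition and can be stored as one stretch of the read instead of k-fold duplicated text. A minimal in-memory sketch of the partitioning step (function names are illustrative; the paper's implementation is disk-based and far more engineered):

```python
def minimum_substring(kmer, p):
    # lexicographically smallest length-p substring of a k-mer;
    # this is the partition key in MSP
    return min(kmer[i : i + p] for i in range(len(kmer) - p + 1))

def partition_kmers(read, k, p):
    # route each k-mer of the read to the partition named by its minimum
    # p-substring; runs of consecutive k-mers sharing a key could be
    # emitted as one longer substring, which is where the
    # Theta(kn) -> Theta(n) compression comes from
    partitions = {}
    for i in range(len(read) - k + 1):
        kmer = read[i : i + k]
        partitions.setdefault(minimum_substring(kmer, p), []).append(kmer)
    return partitions
```

On the read "ACGTACGT" with k = 4 and p = 2, four of the five k-mers share the minimizer "AC" and land in one partition, illustrating how neighboring k-mers cluster together.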
  • Mining Evidences for Named Entity Disambiguation

    Proc. of the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'2013)

    Named entity disambiguation is the task of disambiguating named entity mentions in natural language text and linking them to their corresponding entries in a knowledge base such as Wikipedia. Such disambiguation can help enhance readability and add semantics to plain text. It is also a central step in constructing a high-quality information network or knowledge graph from unstructured text. Previous research has tackled this problem by making use of various textual and structural features from a knowledge base. Most of the proposed algorithms assume that a knowledge base can provide enough explicit and useful information to help disambiguate a mention to the right entity. However, the existing knowledge bases are rarely complete (and likely never will be), thus leading to poor performance on short queries whose contexts are not well known. In such cases, we need to collect additional evidences scattered in internal and external corpora to augment the knowledge bases and enhance their disambiguation power. In this work, we propose a generative model and an incremental algorithm to automatically mine useful evidences across documents. With a specific modeling of "background topic" and "unknown entities", our model is able to harvest useful evidences out of noisy information. Experimental results show that our proposed method outperforms the state-of-the-art approaches significantly: boosting the disambiguation accuracy from 43% (baseline) to 86% on short queries derived from tweets.

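The evidence-mining step can be caricatured with simple frequency counting: harvest recurring context words from documents that mention the entity and append them to its short knowledge-base description, giving later context matching more to work with. A toy sketch (the entity and word choices are invented; the paper uses a generative model with background-topic and unknown-entity components, not raw counts):

```python
from collections import Counter

def augment_description(description, evidence_docs, min_count=2):
    # append words that recur across the entity's evidence documents and
    # are not already present in the description (a simple substring test
    # is used here to keep the sketch short)
    counts = Counter(w for d in evidence_docs for w in d.lower().split())
    extra = [w for w, c in counts.most_common()
             if c >= min_count and w not in description]
    return description + " " + " ".join(extra) if extra else description
```

A description as terse as "jaguar car" gains the recurring evidence word "engine", so a later short query mentioning engines can still match this entry.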

Languages

  • Mandarin: Native or bilingual proficiency

  • English: Professional working proficiency
