Xun Liang, Shichao Song, Zifan Zheng, Hanyu Wang, Qingchen Yu, Xunkai Li, Rong-Hua Li, Feiyu Xiong, Zhiyu Li
Large language models (LLMs) are expected to respond accurately but often exhibit deficient reasoning or generate hallucinatory content. To address these issues, studies prefixed with "Self-", such as Self-Consistency, Self-Improve, and Self-Refine, have been initiated. They share a commonality: each involves an LLM evaluating and updating itself to mitigate the issues. Nonetheless, these efforts lack a unified summarizing perspective, as existing surveys predominantly focus on categorization without examining the motivations behind these works. In this paper, we summarize a theoretical framework, termed Internal Consistency, which offers unified explanations for phenomena such as the lack of reasoning and the presence of hallucinations. Internal Consistency assesses the coherence among an LLM's latent layer, decoding layer, and response layer based on sampling methodologies. Expanding upon the Internal Consistency framework, we introduce a streamlined yet effective theoretical framework capable of mining Internal Consistency, named Self-Feedback. The Self-Feedback framework consists of two modules: Self-Evaluation and Self-Update. This framework has been employed in numerous studies. We systematically classify these studies by tasks and lines of work; summarize relevant evaluation methods and benchmarks; and delve into the concern, "Does Self-Feedback Really Work?" We propose several critical viewpoints, including the "Hourglass Evolution of Internal Consistency", the "Consistency Is (Almost) Correctness" hypothesis, and "The Paradox of Latent and Explicit Reasoning". Furthermore, we outline promising directions for future research. We have open-sourced the experimental code, reference list, and statistical data, available at this https URL.
URL: https://huggingface.co/papers/2407.14507
Here is a summary of the paper "Internal Consistency and Self-Feedback in Large Language Models: A Survey" in bullet points:
Problem:
- Large Language Models (LLMs) often exhibit deficient reasoning and generate hallucinatory content.
- Existing surveys on "Self-" prefixed methods (e.g., Self-Consistency, Self-Improve) lack a unified perspective.
Proposed Solutions:
- Internal Consistency Framework: A unified framework to explain reasoning deficiencies and hallucinations. It assesses coherence across the LLM's latent, decoding, and response layers using sampling methodologies.
- Self-Feedback Framework: A streamlined framework for Internal Consistency Mining (a minimal sketch follows this list). It consists of two modules:
- Self-Evaluation: Captures Internal Consistency signals.
- Self-Update: Leverages signals to enhance the model's response or the model itself.
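Taken together, the two modules form a simple evaluate-then-update loop. Below is a minimal, illustrative Python sketch of such a loop; the `generate` stub, the answer-agreement signal, and the revise-prompt update are assumptions chosen for illustration, not the specific implementations surveyed in the paper.

```python
from collections import Counter


def generate(prompt: str, n: int = 1, temperature: float = 0.7) -> list[str]:
    """Hypothetical LLM call returning n sampled responses; plug in your model or API here."""
    raise NotImplementedError


def self_evaluate(prompt: str, k: int = 5) -> tuple[str, float]:
    """Self-Evaluation: sample k responses and use answer agreement as a response-layer consistency signal."""
    samples = generate(prompt, n=k)
    # In practice you would compare extracted final answers, not raw response strings.
    majority, count = Counter(samples).most_common(1)[0]
    return majority, count / k  # signal in (0, 1]


def self_update(prompt: str, draft: str) -> str:
    """Self-Update: ask the model to revise its own low-consistency draft."""
    revise_prompt = (
        f"Question: {prompt}\n"
        f"Draft answer: {draft}\n"
        "The draft may be inconsistent or wrong. Revise it and give a final answer."
    )
    return generate(revise_prompt, n=1, temperature=0.0)[0]


def self_feedback(prompt: str, threshold: float = 0.8, max_rounds: int = 3) -> str:
    """Loop Self-Evaluation -> Self-Update until the consistency signal clears the threshold."""
    answer, signal = self_evaluate(prompt)
    for _ in range(max_rounds):
        if signal >= threshold:
            break
        answer = self_update(prompt, answer)
        # Re-estimate consistency: how often do fresh samples agree with the revised answer?
        samples = generate(prompt, n=5)
        signal = sum(s == answer for s in samples) / len(samples)
    return answer
```

The same loop structure accommodates other evaluation signals (e.g., confidence scores or verbal critiques) and other update targets (revising the response versus fine-tuning the model itself).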
Key Contributions:
- Unified Perspective: Internal Consistency Mining provides a unified perspective for reasoning elevation and hallucination alleviation tasks.
- Self-Feedback Framework: A simple and comprehensive framework for improving Internal Consistency.
- Taxonomy: Classifies Self-Feedback studies by tasks and lines of work, summarizing relevant evaluation methods and benchmarks.
- Critical Viewpoints: Addresses the question "Does Self-Feedback Really Work?" with insights like the "Hourglass Evolution of Internal Consistency" and the "Consistency Is (Almost) Correctness" hypothesis.
- Future Directions: Outlines promising research directions, including textual self-awareness, the paradox of latent and explicit reasoning, and deeper exploration of decoding and latent layers.
Other Key Points:
- The paper argues that deficient reasoning and hallucination share the same essence: both stem from low Internal Consistency.
- It highlights the importance of Internal Consistency for AI safety and robustness.
- The paper provides a comprehensive taxonomy of Self-Feedback methods, categorized by tasks and lines of work.
- It discusses various methods for acquiring consistency signals, including uncertainty estimation, confidence estimation, hallucination detection, verbal critiquing, contrastive optimization, and external feedback (an entropy-based example follows this list).
- The paper explores different lines of work for reasoning elevation and hallucination alleviation, summarizing their strengths and weaknesses.
- It addresses the question of whether Self-Feedback truly works, presenting conflicting viewpoints and a discussion on the "Consistency Is (Almost) Correctness" hypothesis.
- The paper concludes by outlining future research directions and challenges in the field.
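As one concrete example of the signal-acquisition methods listed above, the sketch below estimates confidence at the decoding layer from the average next-token entropy. The `step_probs` input and the entropy-to-confidence mapping are illustrative assumptions, not an implementation prescribed by the survey.

```python
import math


def mean_token_entropy(step_probs: list[list[float]]) -> float:
    """Average Shannon entropy (in nats) of the per-step next-token distributions."""
    entropies = [
        -sum(p * math.log(p) for p in probs if p > 0.0) for probs in step_probs
    ]
    return sum(entropies) / len(entropies)


def confidence_signal(step_probs: list[list[float]], vocab_size: int) -> float:
    """Map average entropy into a [0, 1] confidence score (1 = maximally confident)."""
    max_entropy = math.log(vocab_size)  # entropy of a uniform next-token distribution
    return 1.0 - min(mean_token_entropy(step_probs) / max_entropy, 1.0)


# Example: two decoding steps over a toy 4-token vocabulary.
probs = [[0.97, 0.01, 0.01, 0.01], [0.70, 0.10, 0.10, 0.10]]
print(confidence_signal(probs, vocab_size=4))  # ~0.60: moderately confident decoding
```

A low score from such a signal could then trigger a Self-Update step, such as the revision loop sketched earlier in this summary.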