Table of Contents
- Original Paper
- Motivation
- Contribution
- Methods
- Semantically-controlled text generation
- GPT2-based simulated user
- Datasets
- Qulac and ClariQ
- Multi-turn conversational data
- Future Work
- Knowledge
Original Paper
Evaluating Mixed-initiative Conversational Search Systems via User Simulation
Motivation
Propose a conversational user simulator, called USi, for the automatic evaluation of mixed-initiative conversational search systems.
Contribution
- propose a user simulator, USi, for conversational search system evaluation, capable of answering clarifying questions prompted by the search system
- perform an extensive set of experiments to evaluate the feasibility of substituting real users with the user simulator
- release a dataset of multi-turn interactions acquired through crowdsourcing
Methods
Semantically-controlled text generation
We define the task of generating answers to clarifying questions as a sequence generation task.
Current SOTA language models formulate the task as a next-word prediction task:
- generated text is prone to hallucination and generally lacks semantic guidance
Answer generation needs to be conditioned on the underlying information need:
- $a_i$ is the current token of the answer
- $a_{<i}$ are all the previous tokens
- $in$, $q$, $cq$ correspond to the information need, the initial query, and the current clarifying question, respectively
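Putting these symbols together, the conditioned objective can be sketched as below. This is a reconstruction from the definitions above, not a formula quoted from the paper; the left-hand side is the standard next-word factorisation, and the right-hand side adds the conditioning on $in$, $q$, and $cq$.

```latex
p(a) = \prod_{i} p(a_i \mid a_{<i})
\quad\longrightarrow\quad
p(a \mid in, q, cq) = \prod_{i} p(a_i \mid a_{<i},\, in,\, q,\, cq)
```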
GPT2-based simulated user
- base USi on the GPT-2 model with language modelling and classification losses (DoubleHead GPT-2)
- learn to generate the appropriate sequence through the language modelling loss
- learn to distinguish the correct answer from the distractor answer through the classification loss
- the two losses are linearly combined
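As a minimal sketch of how two such losses can be linearly combined, the snippet below computes a cross-entropy term and mixes a language modelling loss with a classification loss. The coefficient names `lm_coef` and `mc_coef` and their default values are assumptions for illustration; the notes only say the combination is linear.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of index `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def combined_loss(lm_loss, mc_loss, lm_coef=1.0, mc_coef=1.0):
    """Linear combination of the two losses (coefficients are hypothetical)."""
    return lm_coef * lm_loss + mc_coef * mc_loss


# Toy numbers: one next-token prediction and one two-way candidate classification.
lm_loss = cross_entropy([2.0, 0.5, -1.0], target=0)
mc_loss = cross_entropy([0.3, 1.2], target=1)
total = combined_loss(lm_loss, mc_loss)
```

In practice the language modelling term would be averaged over all answer tokens; a single token is used here only to keep the example short.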
Single-turn responses:
- GPT-2 input:
- accepts two sequences as input: one ending with the original target answer, the other with a distractor answer
- the distractor answer is sampled from the ClariQ dataset.
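The two-sequence input described above might be assembled roughly as follows. The separator tokens `[SEP]`, `[bos]`, and `[eos]`, the field order, and the helper name `build_candidates` are all assumptions, since the notes do not spell out the exact layout; the example strings are made up.

```python
import random

# Hypothetical special tokens; the exact input layout is not given in the notes.
SEP, BOS, EOS = "[SEP]", "[bos]", "[eos]"


def build_candidates(info_need, query, question, answer, distractor_pool, rng):
    """Return (candidates, label): two full sequences and the index of the correct one."""
    # Draw a distractor that differs from the target answer.
    distractor = rng.choice([d for d in distractor_pool if d != answer])
    prefix = f"{info_need} {SEP} {query} {SEP} {question} {BOS}"
    candidates = [f"{prefix} {distractor} {EOS}", f"{prefix} {answer} {EOS}"]
    return candidates, 1  # the sequence with the target answer is placed last


pool = ["no", "yes, a list of symptoms", "i don't know"]
candidates, label = build_candidates(
    "Find the symptoms of diabetes",
    "diabetes",
    "Are you looking for symptoms?",
    "yes, a list of symptoms",
    pool,
    random.Random(0),
)
```

The classification head would then be trained to pick `candidates[label]`, while the language modelling loss is computed on the answer tokens of that same sequence.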
Conversation history-aware model:
- history-aware GPT-2 input:
- [user] and [system] mark the turns of the user and of the conversational system, respectively.
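One way the conversation history could be flattened with these two markers is sketched below. The ordering (system question first, then user answer) and the helper name `format_history` are assumptions; only the `[user]` and `[system]` tokens come from the notes.

```python
def format_history(turns):
    """Flatten (clarifying_question, answer) pairs into one marked-up string."""
    parts = []
    for question, answer in turns:
        parts.append(f"[system] {question}")  # the system asks the clarifying question
        parts.append(f"[user] {answer}")      # the user answers it
    return " ".join(parts)


history = format_history([
    ("Are you looking for symptoms?", "no"),
    ("Do you want treatment options?", "yes, for type 2"),
])
```

The resulting string would sit between the query and the current clarifying question in the model input, so the simulator can see what has already been asked and answered.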
Inference:
- omit the answer $a$ from the input sequence.
- to generate an answer, we sample a textual sequence from the trained model using a combination of SOTA sampling techniques
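The notes do not name the sampling techniques. A common combination, shown here purely as an assumption, is temperature scaling followed by top-k and top-p (nucleus) filtering; the function names and default values below are illustrative.

```python
import math
import random


def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_id: prob} after temperature, top-k and top-p filtering."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Token ids sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]
    # Renormalise over the top-k survivors before applying the nucleus cut.
    mass_k = sum(probs[i] for i in order)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i] / mass_k
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, rng, **kwargs):
    """Draw one token id from the filtered distribution."""
    dist = filter_probs(logits, **kwargs)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]


next_id = sample_token([2.0, 1.0, 0.0, -1.0], random.Random(0), top_k=3, top_p=0.9)
```

Generation would repeat this step token by token, appending each sampled token to the input until an end-of-sequence token is produced.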
The reported results mainly cover the single-turn setting; only some qualitative analysis is provided for the multi-turn setting.
Datasets
Qulac and ClariQ
Both are built for single-turn offline evaluation.
Qulac: (topic, facet, clarifying_question, answer). ClariQ is an extension of Qulac and contains additional non-ambiguous topics.
The facet from Qulac and ClariQ represents the underlying information need, as it describes in detail the intent behind the issued query. Moreover, question represents the clarifying question currently being asked, while answer is our language modelling target.
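The role of each field can be made concrete with a small sketch. The record fields follow the tuple given above; treating topic as the initial query, as well as the class and function names, are assumptions made for illustration, and the sample record is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QulacRecord:
    topic: str
    facet: str
    clarifying_question: str
    answer: str


def to_training_example(record):
    """Map a record onto the roles used by the simulator."""
    return {
        "information_need": record.facet,        # what the user actually wants
        "query": record.topic,                   # assumption: topic acts as the initial query
        "question": record.clarifying_question,  # question currently being asked
        "target": record.answer,                 # language modelling target
    }


example = to_training_example(
    QulacRecord(
        topic="diabetes",
        facet="Find the symptoms of diabetes",
        clarifying_question="Are you looking for symptoms?",
        answer="yes, a list of symptoms",
    )
)
```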
Multi-turn conversational data
A major drawback of the above datasets is that they are both built for single-turn offline evaluation.
We construct multi-turn data that resembles a more realistic interaction between a user and the system. Our user simulator USi is then further fine-tuned on this data.
- collect human-to-human interactions through crowdsourcing
- construct 500 conversations up to a depth of three
- construct edge cases: provide answers to an additional 500 clarifying questions of poor quality, up to a depth of two
Future Work
- a pair-wise comparison of multi-turn conversations.
- aim to observe user simulator behaviour in unexpected, edge case scenarios
- for example, people will repeat their answer if the clarifying question is repeated. We want USi to do the same.
Knowledge
Multi-turn passage retrieval: The system needs to understand the conversational context and retrieve appropriate passages from the collection.
Document-retrieval task with the answer to the prompted clarifying question: the initial query is expanded with the text of the clarifying question and the user’s answer and then fed into a retrieval model.
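The query expansion described above amounts to concatenating three pieces of text. A minimal sketch follows; the function name `expand_query` and the plain space-joined concatenation are assumptions, and the retrieval model itself is left out.

```python
def expand_query(query, clarifying_question, answer):
    """Append the clarifying question and the user's answer to the initial query."""
    parts = (query, clarifying_question, answer)
    # Skip empty pieces so a missing answer does not leave stray whitespace.
    return " ".join(p.strip() for p in parts if p.strip())


expanded = expand_query("diabetes", "Are you looking for symptoms?", "yes, a list of symptoms")
```

The expanded string is what would be passed to the retrieval model in place of the original query.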