2022
Conference article  Open Access

A dependency-aware utterances permutation strategy to improve conversational evaluation

Faggioli G., Ferrante M., Ferro N., Perego R., Tonellotto N.

Evaluation, Conversational search

The rapid growth in the number and complexity of conversational agents has highlighted the need for suitable evaluation tools to describe their performance. The main evaluation paradigms rely on analyzing conversations in which the user explores information needs by following a scripted dialogue with the agent. We argue that this is not a realistic setting: different users ask different questions, and in different orders, obtaining distinct answers and changing the conversation path. We analyze what happens to the performance of conversational systems when we change the order of the utterances in a scripted conversation while respecting the temporal dependencies between them. Our results show that system performance varies widely, and our experiments show that different utterance orders produce completely different rankings of systems by performance. The current way of evaluating conversational systems is thus biased. Motivated by these observations, we propose a new evaluation approach based on dependency-aware utterance permutations to increase the power of our evaluation tools.
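The idea of reordering utterances while respecting their dependencies can be illustrated with a minimal sketch. It assumes the dependencies are available as a mapping from each utterance to the utterances that must precede it; the function name `dependency_aware_permutations`, the identifiers `u1` to `u4`, and the example dependencies are all hypothetical and are not taken from the paper, which may build and sample permutations differently.

```python
from typing import Dict, Iterator, List, Set


def dependency_aware_permutations(
    utterances: List[str], depends_on: Dict[str, Set[str]]
) -> Iterator[List[str]]:
    """Yield every ordering in which each utterance follows all its prerequisites."""

    def extend(prefix: List[str], placed: Set[str]) -> Iterator[List[str]]:
        # A complete ordering has been built: emit a copy of it.
        if len(prefix) == len(utterances):
            yield list(prefix)
            return
        for u in utterances:
            # An utterance can be placed only once, and only after
            # everything it depends on has already been placed.
            if u not in placed and depends_on.get(u, set()) <= placed:
                prefix.append(u)
                placed.add(u)
                yield from extend(prefix, placed)
                prefix.pop()
                placed.remove(u)

    yield from extend([], set())


# Hypothetical four-utterance conversation (an assumption for illustration):
# u2 and u3 both refer back to u1, while u4 is self-contained.
utterances = ["u1", "u2", "u3", "u4"]
depends_on = {"u2": {"u1"}, "u3": {"u1"}}
orders = list(dependency_aware_permutations(utterances, depends_on))
```

Under these assumed dependencies only 8 of the 24 possible orderings are valid, and each could be replayed against a system to see how its measured performance changes with the order. Exhaustive enumeration grows quickly with conversation length, so a practical setup would likely sample valid orderings instead.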

Source: ECIR 2022 - 44th European Conference on IR Research, pp. 184–198, Stavanger, Norway, 10-14/04/2022


BibTeX entry
@inproceedings{oai:it.cnr:prodotti:468997,
	title = {A dependency-aware utterances permutation strategy to improve conversational evaluation},
	author = {Faggioli G. and Ferrante M. and Ferro N. and Perego R. and Tonellotto N.},
	doi = {10.1007/978-3-030-99736-6_13},
	booktitle = {ECIR 2022 - 44th European Conference on IR Research},
	pages = {184--198},
	address = {Stavanger, Norway},
	month = apr,
	year = {2022}
}