2017

Conference article Restricted

Towards a dataset for natural language requirements processing

Ferrari A., Spagnolo G. O., Gnesi S.

Natural language processing, Natural language requirements, Requirements classifications, Requirements document

[Context and motivation] The current breakthrough of natural language processing (NLP) techniques can provide the requirements engineering (RE) community with powerful tools that can help address specific tasks of natural language (NL) requirements analysis, such as traceability, ambiguity detection and requirements classification, to name a few. [Question/problem] However, modern NLP techniques are mainly statistical and need large NL requirements datasets to support appropriate training, testing and validation. The RE community has experimented with NLP for a long time, but datasets were often proprietary, or limited to the few software projects for which requirements were publicly available. Hence, replication of the experiments and generalization of their results have always been an issue. [Principal idea/results] Our near-future commitment is to provide a publicly available NL requirements dataset. [Contribution] To this end, we are collecting requirements documents from the Web and representing them in a common XML format. In this paper, we present the current version of the dataset, together with our agenda concerning its formatting, extension, and annotation.

Source: joint REFSQ Workshops, Doctoral Symposium, Research Method Track, and Poster Track, 27/02/2017

Cite as

BibTeX entry

@inproceedings{oai:it.cnr:prodotti:382379,
	title = {Towards a dataset for natural language requirements processing},
	author = {Ferrari A. and Spagnolo G. O. and Gnesi S.},
	booktitle = {joint REFSQ Workshops, Doctoral Symposium, Research Method Track, and Poster Track, 27/02/2017},
	year = {2017}
}

CNR authors and affiliations

CNR authors

Ferrari, Alessio
0000-0002-0636-5663
Gnesi, Stefania
0000-0002-0139-0421
Spagnolo, Giorgio Oronzo
0000-0002-7771-0882

Laboratories

Formal Methods and Tools (2002-ongoing)
System and Software Evaluation (2002-ongoing)

Download

CNR ExploRA

Bibliographic record

Also available from

ceur-ws.org
