Report  Unknown

Joint modeling of arrival process and length distribution of queries in Web search engines

Cassarà P., Colucci M., Gotta A., Tonellotto N.

Web search engine; Batch arrival process; Kernel-based probability distribution models; Generalized cross entropy; Computer-communication networks

This paper proposes a novel fitting procedure, based on non-parametric kernel-based models, for the probability mass function of a discrete arrival process derived from real traffic traces of queries to a Web search engine. Most estimation techniques adopted for probability mass functions rely on estimating the parameters of a given family of probability distribution functions. Conversely, the proposed procedure, together with a kernel-based model of the probability distribution function, requires no assumption about membership in a family of distributions or about its parameters. The fitting procedure, based on the Generalized Cross-Entropy, solves a quadratic programming problem. Furthermore, the estimated probability mass function can be expressed in closed form as a weighted sum of kernel functions. We also examine the performance of the proposed procedure via numerical experiments and present an example of traffic analysis with real traffic data. Results show that our estimate of the probability mass function closely matches the empirical probability mass function. In particular, the procedure approximates well both temporal and statistical characteristics, such as autocorrelation, long-range dependence, and skewness.
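
The idea of a probability mass function written as a weighted sum of kernels, with the weights obtained from a quadratic program, can be illustrated with a minimal sketch. It is not the procedure of the report: the objective below is a plain least-squares fit to the empirical probability mass function rather than the Generalized Cross-Entropy, and the discrete Gaussian kernel, the bandwidth, the projected-gradient solver, and all function names are assumptions made only for illustration.

```python
import math


def discrete_gaussian_kernel(support, center, bandwidth):
    """Gaussian bump at `center`, renormalized to sum to 1 over `support` (assumed kernel)."""
    raw = [math.exp(-0.5 * ((x - center) / bandwidth) ** 2) for x in support]
    total = sum(raw)
    return [r / total for r in raw]


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(sorted(v, reverse=True), start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


def squared_error(a, b):
    """Sum of squared differences between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_kernel_pmf(samples, bandwidth=1.0, iterations=2000):
    """Fit weights w (w >= 0, sum w = 1) so that sum_j w_j * K_j approximates the empirical PMF.

    This is a small quadratic program, solved here by projected gradient descent.
    """
    support = list(range(min(samples), max(samples) + 1))
    m = len(support)
    empirical = [samples.count(x) / len(samples) for x in support]
    # One kernel centered on every support point.
    kernels = [discrete_gaussian_kernel(support, c, bandwidth) for c in support]

    def mixture(weights):
        return [sum(w * k[i] for w, k in zip(weights, kernels)) for i in range(m)]

    # Each kernel sums to 1, so the largest row sum bounds the Lipschitz
    # constant of the gradient; its inverse is a safe step size.
    step = 1.0 / max(sum(k[i] for k in kernels) for i in range(m))
    weights = [1.0 / m] * m
    for _ in range(iterations):
        residual = [f - e for f, e in zip(mixture(weights), empirical)]
        gradient = [sum(k[i] * residual[i] for i in range(m)) for k in kernels]
        weights = project_to_simplex([w - step * g for w, g in zip(weights, gradient)])
    return support, empirical, weights, mixture(weights)
```

Calling `fit_kernel_pmf` on a list of integer batch sizes returns the support, the empirical probability mass function, the fitted weights, and the fitted mixture. Because the weights are non-negative and sum to one, and each kernel sums to one over the support, the fitted mixture is itself a valid probability mass function.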

Source: ISTI Technical reports, 2016

BibTeX entry
	title = {Joint modeling of arrival process and length distribution of queries in Web search engines},
	author = {Cassarà P. and Colucci M. and Gotta A. and Tonellotto N.},
	institution = {ISTI Technical reports, 2016},
	year = {2016}