Web traffic modeling and its application in the design of caching and prefetching systems
Author: Balamash, Abdullah S.
Publisher: The University of Arizona.
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract: Network congestion remains one of the main barriers to the continuing success of the Internet. For web users, congestion manifests itself in unacceptably long response times. One possible remedy to the latency problem is to use caching at the client, at the proxy server, or even within the Internet. However, documents on the World Wide Web (WWW) are becoming increasingly dynamic (i.e., have short lifetimes), which limits the potential benefit of caching. The performance of a WWW caching system can be dramatically increased by integrating document prefetching (a.k.a. "proactive caching") into its design. Prefetching reduces the perceived user response time, but it also increases the network load, which in turn may increase the response time. One main goal of this dissertation is to investigate this tradeoff through a mathematical model of a WWW caching/prefetching system, and to demonstrate how such a model can be used in building a real prefetching system. In our model, the client cache consists of a "regular" cache for on-demand requests and a "prefetching cache" for prefetched requests. A pool of clients connects to a proxy server through bandwidth-limited dedicated lines (e.g., dialup phone lines). The proxy server implements its own caching system. Forecasting of future documents is performed at the client based on the client's access profile and using hints from servers. Our analysis sheds light on the interesting tradeoff between aggressive and conservative prefetching, and can be used to optimize the parameters of a combined caching/prefetching system. We validate our model through simulation.
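The aggressive-versus-conservative tradeoff described above can be illustrated with a minimal sketch (not the dissertation's actual algorithm): a threshold policy prefetches every document whose estimated access probability exceeds a cutoff, and the expected wasted bandwidth grows with the number of low-probability documents admitted. The document names and sizes below are hypothetical.

```python
def select_prefetch(access_probs, threshold):
    """Threshold policy: prefetch every document whose estimated
    access probability exceeds the given cutoff."""
    return {doc for doc, p in access_probs.items() if p > threshold}

def expected_wasted_bandwidth(access_probs, threshold, doc_size=1.0):
    """Expected bandwidth spent on prefetched documents that are
    never actually requested: each contributes (1 - p) * size."""
    chosen = select_prefetch(access_probs, threshold)
    return sum((1.0 - access_probs[d]) * doc_size for d in chosen)

# Hypothetical per-document access probabilities.
probs = {"a.html": 0.9, "b.html": 0.6, "c.html": 0.2}
print(select_prefetch(probs, 0.5))             # {'a.html', 'b.html'}
print(expected_wasted_bandwidth(probs, 0.5))   # 0.1 + 0.4 = 0.5
```

Lowering the threshold prefetches more documents but inflates the wasted-bandwidth term, which is one intuition behind the dissertation's finding that prefetching everything above a fixed threshold can degrade delay performance under load.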
From the analysis and/or simulation, we find that: (1) prefetching all documents whose access probabilities exceed a given threshold value may, surprisingly, degrade the delay performance; (2) the variability of WWW file sizes has a detrimental impact on the effectiveness of prefetching; and (3) coexistence between caching and prefetching is, in general, beneficial for the overall performance of the system, especially under heavy load. Ideally, a caching/prefetching system should account for the intrinsic characteristics of WWW traffic, which include temporal locality, spatial locality, and popularity. A second contribution of this dissertation is in constructing a stochastic model that accurately captures these three characteristics. Such a model can be used to generate synthetic WWW traces and assess WWW caching/prefetching designs. To capture temporal and spatial localities, we use a modified version of Riedi et al.'s multifractal model, reducing the complexity of the original model from O(N) to O(1), where N is the length of the synthetic trace. Our model is attractive in that it is parsimonious (characterized by few parameters) and avoids the need to apply a transformation to a self-similar model (as is often done in previously proposed models), thus retaining the temporal locality of the fitted traffic. Furthermore, because of the scale-dependent nature of multifractal processes, the proposed model is more flexible than monofractal (self-similar) models in describing irregularities in the traffic at various time scales.
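The flavor of a multifractal model such as Riedi et al.'s can be conveyed with a toy multiplicative cascade (a simplification, not the dissertation's fitted model): total traffic volume is split recursively with random multipliers, producing a trace whose burstiness differs across time scales. The Beta(2, 2) multiplier is an illustrative choice, not a fitted parameter.

```python
import random

def cascade(total, depth, seed=0):
    """Toy binomial multiplicative cascade: split the total traffic
    volume recursively with random multipliers in (0, 1), yielding
    2**depth intervals whose sum equals the original total (mass is
    conserved at every split: x*r + x*(1-r) == x)."""
    rng = random.Random(seed)
    trace = [float(total)]
    for _ in range(depth):
        nxt = []
        for x in trace:
            r = rng.betavariate(2.0, 2.0)  # symmetric random multiplier
            nxt.extend([x * r, x * (1.0 - r)])
        trace = nxt
    return trace

t = cascade(1000.0, 10)      # 2**10 = 1024 time intervals
print(len(t), round(sum(t)))  # 1024 intervals, total volume preserved
```

Because each split preserves mass, the synthetic trace keeps the coarse-scale volume while generating scale-dependent variability at fine scales, which is the qualitative property that makes multifractal models more flexible than monofractal ones.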
Degree Program: Graduate College
Electrical and Computer Engineering