2010 | Vol. 98, no. 1 | 49-70
Article title

Predicting Website Audience Demographics for Web Advertising Targeting Using Multi-Website Clickstream Data

Several recent studies have explored the virtues of behavioral targeting and personalization for online advertising. In this paper, we add to this literature by proposing a cost-effective methodology for the prediction of demographic website visitor profiles that can be used for web advertising targeting purposes. The methodology involves the transformation of website visitors' clickstream patterns to a set of features and the training of Random Forest classifiers that generate predictions for gender, age, level of education and occupation category. These demographic predictions can support online advertisement targeting (i) as an additional input in personalized advertising or behavioral targeting, or (ii) as an input for aggregated demographic website visitor profiles that support marketing managers in selecting websites and achieving an optimal correspondence between target groups and website audience composition. The proposed methodology is validated using data from a Belgian web metrics company. The results reveal that Random Forests demonstrate superior classification performance over a set of benchmark algorithms. Further, the ability of the model set to generate representative demographic website audience profiles is assessed. The stability of the models over time is demonstrated using out-of-period data.
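
The two data-handling steps around the classifier can be illustrated with a minimal sketch. It assumes a hypothetical feature set (each visitor's share of clicks per website) and uses a hand-written dictionary as a stand-in for classifier output; the abstract does not specify the actual features, and the Random Forest training step that would sit between the two functions is omitted. All names and toy data below are illustrative assumptions.

```python
from collections import Counter, defaultdict


def clickstream_to_features(clicks, sites):
    """Turn (visitor_id, site) click records into per-visitor feature vectors.

    Hypothetical feature choice: the share of a visitor's clicks on each site.
    """
    counts = defaultdict(Counter)
    for visitor, site in clicks:
        counts[visitor][site] += 1
    features = {}
    for visitor, per_site in counts.items():
        total = sum(per_site.values())
        features[visitor] = [per_site[s] / total for s in sites]
    return features


def audience_profile(clicks, predicted_label):
    """Aggregate per-visitor predictions into label shares per website.

    Each unique visitor of a site counts once, regardless of click volume.
    """
    visitors_by_site = defaultdict(set)
    for visitor, site in clicks:
        visitors_by_site[site].add(visitor)
    profile = {}
    for site, visitors in visitors_by_site.items():
        labels = Counter(predicted_label[v] for v in visitors)
        profile[site] = {lab: n / len(visitors) for lab, n in labels.items()}
    return profile


# Toy data (assumed, not from the paper).
clicks = [
    ("u1", "news.example"), ("u1", "news.example"), ("u1", "sports.example"),
    ("u2", "sports.example"),
    ("u3", "news.example"), ("u3", "shop.example"),
]
sites = ["news.example", "shop.example", "sports.example"]

# Stand-in for the output of a trained gender classifier.
predicted_gender = {"u1": "female", "u2": "male", "u3": "female"}

features = clickstream_to_features(clicks, sites)
profile = audience_profile(clicks, predicted_gender)
```

In this sketch, `features` would be the classifier input, while `profile` corresponds to use (ii) in the abstract: an aggregated demographic profile per website that a marketing manager could compare against a target group.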

Physical description
Bibliography: 39 items, tables.
  • Department of Marketing, Faculty of Economics and Business Administration, Ghent University, Tweekerkenstraat 2, B-9000 Ghent, Belgium
  • [1] American Advertising Federation: 2006 AAF Survey of Industry Leaders on Advertising Industry and New Media Trends, 2006.
  • [2] Adtech: Click Through Rates - Up and Down, 2009.
  • [3] Amiri, A. and Menon, S.: Scheduling web banner advertisements with multiple display frequencies, IEEE Transactions on Systems Man and Cybernetics Part A-Systems and Humans, 36(2), 2006, 245-251.
  • [4] Baglioni, M., Ferrara, U., Romei, A., Ruggieri, S. and Turini, F.: Preprocessing and mining web log data for web personalization, Proc. 8th Congress of the Italian-Association-for-Artificial-Intelligence (Cappelli, A. and Turini, F., Ed.), LNCS 2829, 2003.
  • [5] Bilchev, G. and Marston, D.: Personalised advertising - exploiting the distributed user profile, BT Technology Journal, 21(1), 2003, 84-90.
  • [6] Breiman, L.: Bagging predictors, Machine Learning, 24(2), 1996, 123-140.
  • [7] Breiman, L.: Random forests, Machine Learning, 45(1), 2001, 5-32.
  • [8] Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J.: Classification and regression trees, Chapman & Hall / CRC, 1984.
  • [9] Cannon, H. M.: The naive approach to demographic media selection, Journal of Advertising Research, 24(3), 1984, 21-25.
  • [10] Cannon, H. M. and Rashid, A.: When do demographics help in media planning, Journal of Advertising Research, 30(6), 1991, 20-26.
  • [11] Chandon, J. L., Chtourou, M. S. and Fortin, D. R.: Effects of configuration and exposure levels on responses to web advertisements, Journal of Advertising Research, 43(2), 2003, 217-229.
  • [12] Eirinaki, M. and Vazirgiannis, M.: Web mining for web personalization, ACM Transactions on Internet Technology, 3(1), 2003, 1-27.
  • [13] Faber, R. J., Lee, M. and Nan, X. L.: Advertising and the consumer information environment online, American Behavioral Scientist, 48(4), 2004, 447-466.
  • [14] Frank, E., Holmes, G., Pfahringer, B., Reutemann, P. and Witten, I. H.: The WEKA Data Mining Software: An Update, SIGKDD Explorations, 11(1), 2009.
  • [15] Freund, Y. and Schapire, R. E.: Experiments with a new boosting algorithm, Proc. Thirteenth International Conference on Machine Learning (Saitta, L., Ed.), Morgan Kaufmann, San Francisco, CA, 1996.
  • [16] Freund, Y. and Schapire, R. E.: A decision-theoretic generalization of on-line learning and an application to boosting, Journal of Computer and System Sciences, 55(1), 1997, 119-139.
  • [17] Gallagher, K. and Parsons, J.: A framework for targeting banner advertising on the Internet, Proc. 30th Hawaii International Conference on System Sciences (HICSS 30) (Nunamaker, J. F. and Sprague, R. H., Ed.), 1997.
  • [18] Ha, S. H.: An intelligent system for personalized advertising on the Internet, Proc. 5th International Conference on E-Commerce and Web Technology (Bauknecht, K., Bichler, M. and Proll, B., Ed.), LNCS 3182, 2004.
  • [19] Hand, D. J. and Till, R. J.: A simple generalisation of the area under the ROC curve for multiple class classification problems, Machine Learning, 45(2), 2001, 171-186.
  • [20] Hanley, J. A. and McNeil, B. J.: The meaning and use of the Area under a Receiver Operating Characteristic (ROC) Curve, Radiology, 143(1), 1982, 29-36.
  • [21] Hollis, N.: Ten years of learning on how online advertising builds brands, Journal of Advertising Research, 45(2), 2005, 255-268.
  • [22] Huang, C. Y. and Lin, C. S.: Modeling the audience's banner ad exposure for Internet advertising planning, Journal of Advertising, 35(2), 2006, 123-136.
  • [23] Interactive Advertising Bureau Europe: European Internet advertising expenditure report 2008, 2008.
  • [24] Kass, G. V.: An exploratory technique for investigating large quantities of categorical data, Applied Statistics, 29(2), 1980, 119-127.
  • [25] Kazienko, P. and Adamski, M.: AdROSA - Adaptive personalization of web advertising, Information Sciences, 177(11), 2007, 2269-2295.
  • [26] Kumar, S., Dawande, M. and Mookerjee, V. S.: Optimal scheduling and placement of internet banner advertisements, IEEE Transactions on Knowledge and Data Engineering, 19(11), 2007, 1571-1584.
  • [27] Kwan, I. S. Y., Fong, J. and Wong, H. K.: An e-customer behavior model with online analytical mining for Internet marketing planning, Decision Support Systems, 41(1), 2005, 189-204.
  • [28] Larivière, B. and Van den Poel, D.: Predicting customer retention and profitability by using random forests and regression forests techniques, Expert Systems with Applications, 29(2), 2005, 472-484.
  • [29] Liaw, A. and Wiener, M.: Classification and Regression by randomForest, R News, 2(3), 2002, 18-22.
  • [30] Menon, S. and Amiri, A.: Scheduling banner advertisements on the web, INFORMS Journal on Computing, 16(1), 2004, 95-105.
  • [31] Milani, A.: Minimal knowledge anonymous user profiling for personalized services, Proc. 18th International Industrial and Engineering Applications of Artificial Intelligence and Expert Systems (Ali, M. and Esposito, F., Ed.), LNCS 3533, 2005.
  • [32] Moe, W. W. and Fader, P. S.: Dynamic conversion behavior at e-commerce sites, Management Science, 50(3), 2004, 326-335.
  • [33] Murray, D. and Durrell, K.: Inferring demographic attributes of anonymous Internet users, Proc. International Workshop on Web Usage Analysis and User Profiling (Masand, B. and Spiliopoulou, M., Ed.), LNCS 1836, 2000.
  • [34] Ngai, E. W. T.: Selection of web sites for online advertising using the AHP, Information & Management, 40(4), 2003, 233-242.
  • [35] Prinzie, A. and Van den Poel, D.: Random forests for multiclass classification: Random MultiNomial Logit, Expert Systems with Applications, 34(3), 2008, 1721-1732.
  • [36] Quinlan, R.: C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers, 1993.
  • [37] R Development Core Team: R: A Language and Environment for Statistical Computing, Vienna, Austria, 2009.
  • [38] Robinson, H., Wysocka, A. and Hand, C.: Internet advertising effectiveness - The effect of design on clickthrough rates for banner ads, International Journal of Advertising, 26(4), 2005, 527-541.
  • [39] WCA: Web characterization terminology and definitions sheet, 1999.