
Author (dc.contributor.author): Jaimes, A.
Author (dc.contributor.author): Ruiz del Solar, Javier
Author (dc.contributor.author): Verschae, Rodrigo
Author (dc.contributor.author): Baeza Yates, Ricardo
Author (dc.contributor.author): Castillo, C.
Author (dc.contributor.author): Yaksic, D.
Author (dc.contributor.author): Davis, E.
Accession date (dc.date.accessioned): 2012-12-18T14:07:28Z
Available date (dc.date.available): 2012-12-18T14:07:28Z
Publication date (dc.date.issued): 2004
Item citation (dc.identifier.citation): Journal of Web Engineering, Vol. 3, No. 2 (2004) 153-168
Identifier (dc.identifier.uri): https://repositorio.uchile.cl/handle/2250/125696
Abstract (dc.description.abstract): We propose a methodology to characterize the image contents of a web segment, and we present an analysis of the contents of a segment of the Chilean web (.cl domain). Our framework uses an efficient web-crawling architecture, standard content-based analysis tools (to extract low-level features such as color, shape, and texture), and novel skin and face detection algorithms. In an automated process, we start by examining all websites within a domain (e.g., .cl websites), obtaining links to images, and downloading a large number of those images (across all of our experiments, approximately 383,000 images, corresponding to about 35 billion pixels). Once the images are downloaded to a local server, our process automatically extracts several low-level visual features (color, texture, shape, etc.) and then performs skin and face detection using the novel algorithms. The results of visual feature extraction and of skin and face detection are then used to characterize the contents of the web segment. We tested our methodology on a segment of the Chilean web (.cl) by automatically downloading and processing 183,000 images in 2003 and 200,000 images in 2004. We present statistics derived from both sets of images, which should be of use to anyone concerned with the image content of the web in Chile. Our study is the first to use content-based tools to determine the image contents of a given web segment.
Sponsor (dc.description.sponsorship): This research was funded by Millennium Nucleus Center for Web Research, Grant P01-029-F, Mideplan, Chile.
Language (dc.language.iso): en
Publisher (dc.publisher): Rinton Press
Keywords (dc.subject): Web characterization
Title (dc.title): On the image content of a web segment: Chile as a case study
Document type (dc.type): Journal article

