Contextual Information Integration in Web Page Classification
| Document type | Current location | Call number | Status | Due date | Barcode | Holds |
|---|---|---|---|---|---|---|
| University thesis | La bibliothèque des sciences de l'ingénieur | TH-006.312 BEL | Available | | 0000000028525 | |
Ph.D., Université Mohammed V, 2017
The web is one of the most important data sources and contains data from almost every field. Its content grows at a remarkable rate, and its size increases every day. Faced with this overwhelming amount of data, web page classification and organization become necessary, since users struggle to find what they are looking for even when they use search engines.
Classification is a supervised learning process in which a classifier is trained on a set of examples belonging to predefined categories and then applied to classify future cases. It plays a significant role in many essential information retrieval tasks, and many systems that search or manage web information can benefit from advanced classification approaches.
In this dissertation, we propose techniques for using the contextual information of web pages at three levels of classification. First, because the representation of web pages affects categorization results, we use contextual information before classification (pre-classification) to represent the pages. Second, when classifiers build their models during learning, they use only the elements of the vector representation and do not benefit from data that is absent from the vectors. We therefore incorporate contextual information into the model building of three classifiers so that they can leverage it during learning. Finally, classifiers misclassify some web pages while predicting others correctly. We therefore use the links between web pages to correct wrongly assigned classes based on the categories that appear in each page's neighborhood; that is, we use contextual information after classification (post-classification) to adjust incorrect class assignments.
The effectiveness of the proposed approaches is demonstrated experimentally on real-world datasets.
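
The abstract does not give the details of the post-classification step, so the following is only a minimal sketch of the general idea of adjusting a page's class from the categories in its link neighborhood. It assumes a simple strict-majority vote among linked pages; the function name `relabel_by_neighbors`, the per-page `confidence` scores, and the `threshold` parameter are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def relabel_by_neighbors(predicted, links, confidence, threshold=0.6):
    """Return adjusted labels (illustrative sketch, not the thesis method).

    predicted:  dict page -> label assigned by some base classifier
    links:      dict page -> iterable of linked (neighbor) pages
    confidence: dict page -> classifier confidence in [0, 1] (assumed available)
    threshold:  pages at or above this confidence keep their label (assumption)
    """
    adjusted = dict(predicted)
    for page in predicted:
        if confidence[page] >= threshold:
            continue  # trust confident predictions as they are
        # Collect the original predictions of the linked pages.
        neighbor_labels = [predicted[n] for n in links.get(page, ()) if n in predicted]
        if not neighbor_labels:
            continue  # no neighborhood information to use
        top_label, top_count = Counter(neighbor_labels).most_common(1)[0]
        # Relabel only when a strict majority of neighbors agree.
        if top_count * 2 > len(neighbor_labels):
            adjusted[page] = top_label
    return adjusted


# Toy data: page "c" was labeled "politics" with low confidence,
# but every page it links to is labeled "sports".
predicted = {"a": "sports", "b": "sports", "c": "politics", "d": "sports"}
links = {"a": ["b"], "c": ["a", "b", "d"]}
confidence = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.95}

adjusted = relabel_by_neighbors(predicted, links, confidence)
print(adjusted["c"])  # sports
```

Votes are read from the original predictions rather than from the labels being adjusted, so the result does not depend on the order in which pages are visited.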

