· Document mining (content and/or structure) and its use
1. Ontology extraction from web pages
We proposed an approach for ontology
construction from Web pages which is based on a contextual and incremental
clustering of terms. Our approach defines and evaluates a context-based
clustering algorithm for ontology learning included in a global architecture
for knowledge discovery for the semantic Web. This algorithm is based on an
incremental use of the partitioning K-means algorithm and is guided by a
structural context. This context is based on the HTML structure and the
location of words in the documents. This contextual representation guides the
clustering algorithm to delimit the context of each word by improving the word
weighting, the similarity of word pairs, and the selection of the semantically closest
co-occurring words for each word. Our algorithm refines the context of each word cluster
and improves the conceptual quality of the resulting clusters and consequently
of the extracted concepts. We have defined a set of criteria for evaluating the
ontological concepts. We tested the contextual clustering algorithm on a corpus of HTML
documents related to the tourism domain (in French) and evaluated the
ontological concepts it extracted. The results show
that the appropriate context definition and the successive refinements of
clusters improve the relevance of the extracted concepts in comparison with a
simple K-means algorithm. Our evaluation of ontological concepts can be applied
to any domain and provides qualitative and quantitative criteria.
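The idea of letting the HTML structure guide word weighting can be illustrated with a minimal sketch. It is not the algorithm described above: the tag weights, the `ContextWeigher` class and the `weigh_terms` function are hypothetical names and values chosen for illustration, and the clustering step itself is left out.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical weights: the text above does not give actual values.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.5, "h2": 2.0, "b": 1.5}
DEFAULT_WEIGHT = 1.0


class ContextWeigher(HTMLParser):
    """Accumulates a weight per word based on the enclosing HTML tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.weights = Counter()

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the most recent matching tag; tolerate unclosed inner tags.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i] == tag:
                del self.open_tags[i:]
                break

    def handle_data(self, data):
        # Use the strongest weight among the currently open tags.
        weight = max(
            (TAG_WEIGHTS.get(t, DEFAULT_WEIGHT) for t in self.open_tags),
            default=DEFAULT_WEIGHT,
        )
        for word in data.lower().split():
            self.weights[word] += weight


def weigh_terms(html):
    """Return a dict mapping each word to its structure-aware weight."""
    parser = ContextWeigher()
    parser.feed(html)
    parser.close()
    return dict(parser.weights)
```

In such a sketch, a word that appears in a title or heading contributes more to its term vector than the same word in body text, so a clustering algorithm such as K-means run on these vectors would be biased towards structurally prominent terms.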
2. Contextual Information Retrieval and Extraction – Social Networks analysis – visualisation paradigms for large data sets
In this work, we define an
information retrieval methodology which uses Formal Concept Analysis in
conjunction with semantics to provide contextual answers to Web queries. The
conceptual context can be global (i.e. stable) or instantaneous (i.e.
bounded by the global context). Our methodology consists first of a
preprocessing step that provides the global conceptual context and then of an online
contextual processing of user requests, each associated with an instantaneous
context. The preprocessing step computes offline a conceptual lattice
from data sources in order to build an overall conceptual context. Then, the
information retrieval is performed in real-time: users formulate their query
with terms from the thesaurus/ontology. Users may then navigate within the
lattice by generalizing or, conversely, by refining their query. A
similarity measure has been defined to find the closest concepts starting from
an entry point of the lattice, in order to help the user navigate. Our information
retrieval process was illustrated through experimentation results in the
tourism domain. One benefit of our approach is a more relevant and
refined information retrieval, closer to the user’s expectations. We add a
semantic layer to the conceptual and data layers. The similarity measure helps
the user navigate large lattices by ranking the neighbouring concepts.
This method is generic and can be applied to any heterogeneous data sources
(Web data, personal data, social networks, etc.). We also define conceptual and
visual footprints for characterizing online social networks. We experiment with
visualization methods for large data sets based on pixel-oriented techniques and
compare them with traditional visualisation methods based on
Multi-Dimensional Scaling.
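The offline lattice computation and the similarity-based ranking of neighbouring concepts can be sketched on a toy example. Everything here is an assumption made for illustration: the `CONTEXT` data, the brute-force enumeration (only practical for tiny contexts), and the averaged Jaccard measure, which stands in for the similarity measure mentioned above without claiming to reproduce it.

```python
from itertools import combinations

# Hypothetical toy context: objects (documents) -> attributes (ontology terms).
CONTEXT = {
    "doc1": {"hotel", "paris"},
    "doc2": {"hotel", "beach"},
    "doc3": {"museum", "paris"},
}


def extent(attrs, context):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, a in context.items() if attrs <= a)


def intent(objs, context):
    """Attributes shared by every object in objs."""
    all_attrs = set().union(*context.values())
    return frozenset(a for a in all_attrs if all(a in context[o] for o in objs))


def concepts(context):
    """Enumerate formal concepts by closing every attribute subset (brute force)."""
    all_attrs = sorted(set().union(*context.values()))
    found = set()
    for r in range(len(all_attrs) + 1):
        for subset in combinations(all_attrs, r):
            objs = extent(set(subset), context)
            found.add((objs, intent(objs, context)))
    return found


def similarity(c1, c2):
    """Average Jaccard overlap of extents and intents (an assumed measure)."""
    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 1.0
    return 0.5 * (jaccard(c1[0], c2[0]) + jaccard(c1[1], c2[1]))


def ranked_neighbours(entry, context):
    """Other concepts sorted from most to least similar to the entry concept."""
    others = [c for c in concepts(context) if c != entry]
    return sorted(others, key=lambda c: similarity(entry, c), reverse=True)
```

Starting from the concept for the query {hotel, paris}, the two most similar concepts in this toy lattice are its generalizations {hotel} and {paris}, which is the kind of ranking that could guide a user through a large lattice.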
Collaboration: University
· Knowledge Management
1. A Knowledge Base for Ontology Building: application to Semantic Information Retrieval
Our objective here is to propose a
semi-automatic construction of ontologies from web pages. To achieve this
objective, we build a knowledge base that represents web knowledge and is
specified using a metaontology containing the knowledge related to the task of
domain knowledge extraction. Our architecture is based on ontological
components, defined by the metaontology, and related to the content, the
structure and the services of a given domain. In this architecture, we
specify three interrelated ontologies: the domain ontology, the structure
ontology and the services ontology. Our metaontology is able to store the
knowledge related to different techniques and methods for ontology
construction. We have defined a semantic on-line information retrieval system
using this web knowledge architecture. This on-line information retrieval
system enriches the user query with domain concepts and classifies the web
documents according to the concepts and the services; it also gives the user
the opportunity to detect a set of services related to a given concept. The
comparison with other systems shows that the precision is improved.
Collaboration: ENSI Tunis (RIADI Lab.)
Funded by STIC INRIA-
2. Ontological Knowledge Evolution Methodology
Ontologies play a key role in
semantic modelling, offering a consensual and formal knowledge specification.
They are increasingly applied to open and dynamic environments, where they model
knowledge that evolves continuously. To take all evolving aspects into account,
ontologies have to be adapted to change requirements. In this work, we propose
a methodological approach for ontological knowledge maintenance focusing
particularly on OWL ontologies. Several problems arise from ontology
evolution: capturing change requirements, change specification, change
application, change traceability, change propagation to dependent artefacts, etc.
The goal of the methodology is to manage ontology evolution in a systematic and
optimized manner while maintaining consistency and evaluating change impact on
ontology quality. We propose a pattern-oriented ontology change
management approach, named Onto-Evoal.
The modelled patterns correspond to changes, inconsistencies and resolution
alternatives. Based on these patterns and the links between them, we propose an
optimized and automated change management process guiding and controlling change
application while maintaining consistency of the evolved ontology. In addition,
a quality model is proposed to evaluate the impact of the different
alternatives on ontology quality and to guide the user in resolving
inconsistencies.
Because change management depends on the
ontology representation model, we focus on the OWL model and take into account
change impacts on logical consistency with respect to OWL DL constraints.
Funded by RNTL Dafoe Project
· Semantic Information Retrieval and Personalisation
1. Semantic Information Retrieval using personal fuzzy ontologies
An ontology can be seen as a semantic layer
that helps find documents that are more relevant to a user’s query. Fuzzy
logic is used in IR to address ambiguity and vagueness, by defining
flexible queries or fuzzy indexes. In this work, we have extended an existing
prototype (see Knowledge Management section) with fuzzy ontologies. SIROF uses the fuzzy ontology for query
reformulation and for indexing documents and queries. Each user owns a
personalized fuzzy ontology whose weights are updated according to that user’s queries.
The main contributions of our system
are: (1) automatic fuzzification of a domain ontology, taking into account both
taxonomic and non-taxonomic relations, (2) query reformulation based on the
weights associated with all the relations in the fuzzy ontology, and (3)
use of this fuzzy ontology to classify documents by services.
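Weight-based query reformulation and per-user weight updates can be sketched as follows. This is an illustration under stated assumptions, not SIROF's implementation: the `FUZZY_ONTOLOGY` data, the threshold, the update rule in `reinforce` and all function names are hypothetical.

```python
# Hypothetical fuzzy ontology: term -> {related term: membership degree in [0, 1]}.
FUZZY_ONTOLOGY = {
    "hotel": {"accommodation": 0.9, "hostel": 0.6, "restaurant": 0.2},
    "museum": {"exhibition": 0.8, "monument": 0.5},
}


def reformulate(query_terms, ontology, threshold=0.5):
    """Expand a query with related terms whose degree reaches the threshold."""
    expanded = {term: 1.0 for term in query_terms}
    for term in query_terms:
        for related, degree in ontology.get(term, {}).items():
            if degree >= threshold:
                # Keep the strongest degree if a term is reached several ways.
                expanded[related] = max(expanded.get(related, 0.0), degree)
    return expanded


def reinforce(ontology, term, related, rate=0.1):
    """Move one relation's degree towards 1 (an assumed update rule)."""
    old = ontology[term][related]
    ontology[term][related] = old + rate * (1.0 - old)
```

Calling `reformulate(["hotel"], FUZZY_ONTOLOGY)` keeps "accommodation" and "hostel" and drops "restaurant". Giving each user a private copy of the ontology and calling `reinforce` after their queries is one simple way to obtain personalized weights.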
2. Integration of spatial constraints in a personalised information retrieval system
We propose a new approach to
personalizing information, especially spatial information, by considering
the spatial and semantic contexts together. This approach can serve
as a navigation aid, for example while travelling through an urban space, by
highlighting the locations that might be of interest while respecting the
spatial constraints. The personalization approach analyses users’
navigation to build a user profile describing their interests. The user profile
is used to filter the contents and extract the semantically relevant
information. The proposed approach develops a model oriented towards the
representation and approximation of users' profiles and preferences. We build a
network of users and use
both the spatial information of the network peers and information on their semantic
similarity. A prototype was developed and applied to the tourism domain.
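One way to combine a spatial constraint with semantic relevance is sketched below. It is a hypothetical scoring scheme, not the prototype's model: the distance cut-off `max_km`, the mixing factor `alpha`, the use of cosine similarity over interest vectors and all function names are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cosine(u, v):
    """Cosine similarity between two sparse interest vectors (dicts)."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def relevance(user_pos, user_profile, place_pos, place_topics,
              max_km=2.0, alpha=0.5):
    """Score a place for a user; 0.0 when the spatial constraint is violated."""
    distance = haversine_km(*user_pos, *place_pos)
    if distance > max_km:
        return 0.0
    spatial = 1.0 - distance / max_km
    return alpha * spatial + (1.0 - alpha) * cosine(user_profile, place_topics)
```

Places beyond `max_km` are filtered out regardless of how well they match the profile, while nearby places are ranked by a mix of proximity and semantic similarity.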
Funded by STIC INRIA-
3. Personalized web content retrieval based on web usage mining
This work is part of the Eiffel
project, which aims at developing a semantic search engine dedicated to
tourism. We address the exploratory search problem by adding
personalization facilities to our solution according to user preferences and
profile. These user preferences and profile are part of a user model that
represents the whole context of the user’s navigation in tourism websites.
This user model is enriched with information extracted from log files using Web
Usage Mining techniques. We have defined a methodology for processing web logs
acquired from many sources. The extracted information, as well as additional
information (such as spatial localization), is stored in a data
warehouse.
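A common first step in Web Usage Mining is to parse server logs and group requests into sessions. The sketch below shows that step only and is not the methodology defined in the project: it assumes logs in Common Log Format, identifies users by host, and uses a 30-minute inactivity timeout; `parse_line` and `sessionize` are hypothetical names.

```python
import re
from datetime import datetime, timedelta

# Assumes Common Log Format; real logs from several sources may differ.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?:GET|POST) (?P<url>\S+)[^"]*" (?P<status>\d{3})'
)


def parse_line(line):
    """Return (host, timestamp, url) for a log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if not match:
        return None
    stamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("host"), stamp, match.group("url")


def sessionize(lines, timeout=timedelta(minutes=30)):
    """Group URLs per host into sessions split by periods of inactivity."""
    sessions = {}   # host -> list of sessions, each a list of URLs
    last_seen = {}  # host -> timestamp of the previous request
    records = sorted(filter(None, map(parse_line, lines)), key=lambda r: r[1])
    for host, stamp, url in records:
        if host not in last_seen or stamp - last_seen[host] > timeout:
            sessions.setdefault(host, []).append([])
        sessions[host][-1].append(url)
        last_seen[host] = stamp
    return sessions
```

The resulting sessions could then be loaded into a warehouse table keyed by user and time, alongside any additional attributes such as location.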
Funded by RNTL Eiffel project “Semantic Web and e-Tourism”
· PhD students
Myriam Hadjouni (PhD ENSI Tunis & University Paris 11 – 3rd year): Spatial Web Personalisation
Nesrine Ben Mustapha (PhD ENSI Tunis & co-direction Centrale Paris – 2nd year): Collaborative ontology learning and Semantic Search
Rania Soussi (PhD ENSI Tunis & Centrale Paris – 2nd year): Social Network extraction from relational databases
Raphaël Thollot (PhD Centrale Paris and SAP Business Objects – 1st year): A situational platform for Business Intelligence
Micheline Elias (PhD Centrale Paris and SAP Business Objects – 1st year): Human Computer Information Retrieval in BI dashboards
Nicolas Beauger (PhD Centrale Paris and SAP Business Objects – 1st year): Query & Answering in a Business Intelligence Context
· Post doctoral students
Rim Djedidi: CSDL Project (Complex Systems Design Lab): decision-making collaborative environment for complex systems
· Past students
Rim Djedidi (PhD University Paris 11): Ontological Knowledge Evolution Methodology (2009)
Riadh Trad (ENSI Tunis – internship): Visualisation and interpretation of large conceptual graphs (2009)
Ramzi Haddad (ENSI Tunis - internship): Integration of spatial constraints in a personalisation system (2008)
Paul Barbotin and François Thisse (
Lobna Karoui (PhD Supelec & University Paris 11): Ontology extraction from web pages (2008)
Zied Boulila (internship – ENSI Tunis): Ontology Evolution (2008)
Nesrine Ben Mustapha (master - ENSI Tunis): A framework for ontology building: application to the Semantic Web (2007)
Thomas Monjo and Lucile Beguin (
Hassane Abboute (master - Supelec): Evolution and Enrichment of a domain ontology (2006)
Christine Bonhomme (PhD student): A visual language for querying Geographical Information Systems (2000)
Ahmed Lbath (PhD student): A Visual Tool Case for Geographic Information Systems (1997)