Difference between revisions of "MCD 2008"

From Openresearch
(Event created)
 
 
Line 4: Line 4:
 
  | Type = Workshop
 
  | Type = Workshop
 
  | Series =  
 
  | Series =  
 +
| Superevent = ICDM 2008
 
  | Field = Data mining
 
  | Field = Data mining
 
  | Homepage = eric.univ-lyon2.fr/~mcd
 
  | Homepage = eric.univ-lyon2.fr/~mcd
Line 17: Line 18:
 
}}
 
}}
  
<pre>
 
*********************************************************************************
 
 
4th International Workshop on Mining Complex Data - MCD'08 -
 
4th International Workshop on Mining Complex Data - MCD'08 -
 
In  Conjonction with IEEE Int. Conf. on Data Mining 2008
 
In  Conjonction with IEEE Int. Conf. on Data Mining 2008
 
Pisa, Italy, 15th Dec. 2008  
 
Pisa, Italy, 15th Dec. 2008  
*********************************************************************************
 
  
CALL FOR PAPERS
 
-------------------
 
 
Data mining and knowledge discovery can today be considered as stable fields with numerous efficient methods and studies that have been proposed to extract knowledge from data. Nevertheless, the famous golden nugget is still challenging. Actually, the context evolved since the first definition of the KDD process and knowledge has now to be extracted from data getting more and more complex. The structure of the data, for instance, doesn't match the attribute-value format when considering the web, texts or videos.  
 
Data mining and knowledge discovery can today be considered as stable fields with numerous efficient methods and studies that have been proposed to extract knowledge from data. Nevertheless, the famous golden nugget is still challenging. Actually, the context evolved since the first definition of the KDD process and knowledge has now to be extracted from data getting more and more complex. The structure of the data, for instance, doesn't match the attribute-value format when considering the web, texts or videos.  
  
Line 32: Line 28:
 
However, in a large number of application domains, this unimodal approach appears to be too restrictive. Consider for instance a corpus of medical files. Each file can contain tabular data such as results of biological analyzes, textual data coming from clinical reports, image data such as radiographies, echograms, or electrocardiograms. In a decision making framework, treating each type of information separately has serious drawbacks. It appears therefore more and more necessary to consider these different data simultaneously, thereby encompassing all their complexity. Many examples of complex data can thus be found in potential knowledge extraction processes. These data can be:
 
However, in a large number of application domains, this unimodal approach appears to be too restrictive. Consider for instance a corpus of medical files. Each file can contain tabular data such as results of biological analyzes, textual data coming from clinical reports, image data such as radiographies, echograms, or electrocardiograms. In a decision making framework, treating each type of information separately has serious drawbacks. It appears therefore more and more necessary to consider these different data simultaneously, thereby encompassing all their complexity. Many examples of complex data can thus be found in potential knowledge extraction processes. These data can be:
  
- Semi-structured or unstructured  
+
* Semi-structured or unstructured  
- Sensor data such as scientific or medical data  
+
* Sensor data such as scientific or medical data  
- Representing the same information at different periods  
+
* Representing the same information at different periods  
- Grouping different kinds of information (images, text, ontologies, etc.)  
+
* Grouping different kinds of information (images, text, ontologies, etc.)  
- Hence, a natural question arises: how could one combine information of different nature and associate them with a same semantic unit, which is for instance the patient? On a methodological level, one could also wonder how to compare such complex units via similarity measures. The classical approach consists in aggregating partial dissimilarities computed on components of the same type. However, this approach tends to make superposed layers of information. It considers that the whole entity is the sum of its components. By analogy with the analysis of complex systems, it appears that knowledge discovery in complex data can not simply consist of the concatenation of the partial information obtained from each part of the object. The aim would rather be to discover more "global" knowledge giving a meaning to the components and associating them with the semantic unit. This fundamental information cannot be extracted by the currently considered approaches and the available tools.
+
* Hence, a natural question arises: how could one combine information of different nature and associate them with a same semantic unit, which is for instance the patient? On a methodological level, one could also wonder how to compare such complex units via similarity measures. The classical approach consists in aggregating partial dissimilarities computed on components of the same type. However, this approach tends to make superposed layers of information. It considers that the whole entity is the sum of its components. By analogy with the analysis of complex systems, it appears that knowledge discovery in complex data can not simply consist of the concatenation of the partial information obtained from each part of the object. The aim would rather be to discover more "global" knowledge giving a meaning to the components and associating them with the semantic unit. This fundamental information cannot be extracted by the currently considered approaches and the available tools.
  
 
The new data mining strategies shall take into account the specificities of complex objects (units with which are associated the complex data). These specificities are summarized hereafter:  
 
The new data mining strategies shall take into account the specificities of complex objects (units with which are associated the complex data). These specificities are summarized hereafter:  
Line 49: Line 45:
 
The aim of this workshop is to address issues related to the concept of mining complex data. The whole knowledge discovery process being involved, our goal will be to attract papers dealing with each step of this process. Actually, managing complex data within the KDD process implies to work on every step, starting from the pre-processing (e.g. structuring and organizing) to the visualization and interpretation (e.g. sorting or filtering) of the results, via the data mining methods themselves (e.g. classification, clustering, frequent patterns extraction, etc.). Papers are invited in all KDD fields that involve complex data, including, but not limited to:
 
The aim of this workshop is to address issues related to the concept of mining complex data. The whole knowledge discovery process being involved, our goal will be to attract papers dealing with each step of this process. Actually, managing complex data within the KDD process implies to work on every step, starting from the pre-processing (e.g. structuring and organizing) to the visualization and interpretation (e.g. sorting or filtering) of the results, via the data mining methods themselves (e.g. classification, clustering, frequent patterns extraction, etc.). Papers are invited in all KDD fields that involve complex data, including, but not limited to:
  
- Pre-processing, structuring and organizing complex data  
+
* Pre-processing, structuring and organizing complex data  
- Handling missing or wrong values  
+
* Handling missing or wrong values  
- Data fusion, result fusion  
+
* Data fusion, result fusion  
- Methods and algorithms for mining complex data  
+
* Methods and algorithms for mining complex data  
- Mining heterogeneous data  
+
* Mining heterogeneous data  
- Knowledge integration into the KDD process  
+
* Knowledge integration into the KDD process  
- Post-processing, visualization and interpretation support  
+
* Post-processing, visualization and interpretation support  
- Applications and experience feedback  
+
* Applications and experience feedback  
- Information retrieval in complex data bases  
+
* Information retrieval in complex data bases  
- Ontology and metadata
+
* Ontology and metadata
 
   
 
   
 
The workshop will consist in a series of communications (oral presentations or poster). A reasonable time will be left for the discussion after each presentation. All the articles will be reviewed by at least two references with a double aim of improving quality and giving advice to the authors. A dedicated place will be given to the young researchers with a session (Position paper) grouping the works in progress in various teams. This can be a good occasion for a PhD student or a young researcher to present his/her starting project. This session will be particularly significant for works on their beginning and for the establishment of research groups on shared topics. Demonstrations of research results could be associated with the poster presentations.
 
The workshop will consist in a series of communications (oral presentations or poster). A reasonable time will be left for the discussion after each presentation. All the articles will be reviewed by at least two references with a double aim of improving quality and giving advice to the authors. A dedicated place will be given to the young researchers with a session (Position paper) grouping the works in progress in various teams. This can be a good occasion for a PhD student or a young researcher to present his/her starting project. This session will be particularly significant for works on their beginning and for the establishment of research groups on shared topics. Demonstrations of research results could be associated with the poster presentations.
Line 64: Line 60:
 
The submitted manuscript should closely reflect the final paper as it will appear in the Proceedings.
 
The submitted manuscript should closely reflect the final paper as it will appear in the Proceedings.
  
INSTRUCTIONS FOR AUTHORS
+
==INSTRUCTIONS FOR AUTHORS==
-------------------------------
 
  
 
Papers should not exceed 8 pages (pdf or MS-Word) in the IEEE 2-column format (see the IEEE Computer Society Press Proceedings Author Guidelines )
 
Papers should not exceed 8 pages (pdf or MS-Word) in the IEEE 2-column format (see the IEEE Computer Society Press Proceedings Author Guidelines )
Line 71: Line 66:
 
Submitted papers will be evaluated by at least two reviewers. Any submission that exceeds length limits or deviates from formatting requirements may be rejected without review.
 
Submitted papers will be evaluated by at least two reviewers. Any submission that exceeds length limits or deviates from formatting requirements may be rejected without review.
  
IMPORTANT DATES
+
==IMPORTANT DATES==
-------------------------------
 
  
Abstract and paper submission: August 7, 2008   
+
* Abstract and paper submission: August 7, 2008   
Notifications: September 12, 2008   
+
* Notifications: September 12, 2008   
Camera-ready version: September 29, 2008   
+
* Camera-ready version: September 29, 2008   
Workshop: December 15, 2008
+
* Workshop: December 15, 2008
 
   
 
   
COMMITTEES
+
==COMMITTEES==
-----------------------
 
  
- Workshop chairs
+
* Workshop chairs
- Djamel A. Zighed, University of Lyon 2, France  
+
** Djamel A. Zighed, University of Lyon 2, France  
- Zbigniew W. Ras, University of North Carolina, Charlotte  
+
** Zbigniew W. Ras, University of North Carolina, Charlotte  
- Shusaku Tsumoto, Shimane University, School of Medicine, Japan  
+
** Shusaku Tsumoto, Shimane University, School of Medicine, Japan  
 
+
* Organizing Committee
- Organizing Committee
+
** Hakim Hacid, University of new South Wales, Australia  
- Hakim Hacid, University of new South Wales, Australia  
+
* Program Committee (partial list)
 
+
** [[has PC member::Aijun An]] York University, Canada  
- Program Committee (partial list)
+
** [[has PC member::Younes Benani]], University Paris 13, France  
- Aijun An, York University, Canada  
+
** [[has PC member::Petr Berka]], University of Economics, Prague, Czech Republic  
- Younès Benani, University Paris 13, France  
+
** [[has PC member::Elisa Bertino]], Purdue University, USA  
- Petr Berka, University of Economics, Prague, Czech Republic  
+
** [[has PC member::Maria-Paula Brito]], University Porto, Portugal  
- Elisa Bertino, Purdue University, USA  
+
** [[has PC member::Michelangelo Ceci]], University Bari, Italy  
- Maria-Paula Brito, University Porto, Portugal  
+
** [[has PC member::Tapio Elomaa]], Tampere UniversityofTechnology, Finland  
- Michelangelo Ceci, University Bari, Italy  
+
** [[has PC member::Floriana Esposito]], University Bari, Italy  
- Tapio Elomaa, Tampere UniversityofTechnology, Finland  
+
** [[has PC member::Jean-Gabriel Ganascia]], University Paris 6, France  
- Floriana Esposito, University Bari, Italy  
+
** [[has PC member::Mirsad Hadzikadic]], UNC-Charlotte, USA  
- Jean-Gabriel Ganascia, University Paris 6, France  
+
** [[has PC member::Georges Hebrail]], ENST Paris, France  
- Mirsad Hadzikadic, UNC-Charlotte, USA  
+
** [[has PC member::Shoji Hirano]], Shimane University, Japan  
- Georges Hebrail, ENST Paris, France  
+
** [[has PC member::Jacek Koronacki]], ICS PAS, Poland  
- Shoji Hirano, Shimane University, Japan  
+
** [[has PC member::Xiaohua Tony Hu]], Drexel University,USA  
- Jacek Koronacki, ICS PAS, Poland  
+
** [[has PC member::Pascale Kuntz-Cosperec]], University Nantes, France  
- Xiaohua Tony Hu, Drexel University,USA  
+
** [[has PC member::Stephane Lallich]], University Lyon, France  
- Pascale Kuntz-Cosperec, University Nantes, France  
+
** [[has PC member::Rory Lewis]], University of Colorado at Colorado Springs  
- Stephane Lallich, University Lyon, France  
+
** [[has PC member::Jiming Liu]], Hong Kong Baptist University  
- Rory Lewis, University of Colorado at Colorado Springs  
+
** [[has PC member::Donato Malerba]], University Bari, Italy  
- Jiming Liu, Hong Kong Baptist University  
+
** [[has PC member::Francesco Palumbo]], Università di Macerata, Italy  
- Donato Malerba, University Bari, Italy  
+
** [[has PC member::Jean-Marc Petit]], LIRIS, INSA Lyon, France  
- Francesco Palumbo, Università di Macerata, Italy  
+
** [[has PC member::Jan Rauch]], University of Economics, Prague, Czech Republic  
- Jean-Marc Petit, LIRIS, INSA Lyon, France  
+
** [[has PC member::Gilbert Ritschard]], University Geneva, Switzerland  
- Jan Rauch, University of Economics, Prague, Czech Republic  
+
** [[has PC member::Henryk Rybinski]], Warsaw University of Technology, Poland  
- Gilbert Ritschard, University Geneva, Switzerland  
+
** [[has PC member::Lorenza Saitta]], University Alessandria, Italy  
- Henryk Rybinski, Warsaw University of Technology, Poland  
+
** [[has PC member::Gilbert Saporta]], CNAM Paris, France  
- Lorenza Saitta, University Alessandria, Italy  
+
** [[has PC member::Andrzej Skowron]], UniversityofWarsaw, Poland  
- Gilbert Saporta, CNAM Paris, France  
+
** [[has PC member::Stefan Trausan-Matu]], University Bucharest, Romania  
- Andrzej Skowron, UniversityofWarsaw, Poland  
+
** [[has PC member::Li-Shiang Tsay]], NCA&T State University, USA  
- Stefan Trausan-Matu, University Bucharest, Romania  
+
** [[has PC member::Rosana Verde]], University Naples, Italy  
- Li-Shiang Tsay, NCA&T State University, USA  
+
** [[has PC member::Christel Vrain]], The Orleans University, France  
- Rosana Verde, University Naples, Italy  
+
** [[has PC member::Xindong Wu]], University of Vermont, USA  
- Christel Vrain, The Orleans University, France  
+
** [[has PC member::Yiyu Yao]], University of Regina, Canada  
- Xindong Wu, University of Vermont, USA  
+
** [[has PC member::Ning Zhong]], Maebashi Institute of Technology, Japan
- Yiyu Yao, University of Regina, Canada  
 
- Ning Zhong, Maebashi Institute of Technology, Japan  
 
 
</pre>This CfP was obtained from [http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=3283&amp;copyownerid=2 WikiCFP]
 

Latest revision as of 15:33, 14 December 2008

MCD 2008
4th International Workshop on Mining Complex Data
Subevent of ICDM 2008
Dates: Dec 15, 2008
Homepage: eric.univ-lyon2.fr/~mcd
Location
Location: Pisa, Italy

Important dates
Submissions: Aug 7, 2008
Notification: Sep 12, 2008



4th International Workshop on Mining Complex Data - MCD'08 -
In Conjunction with IEEE Int. Conf. on Data Mining 2008
Pisa, Italy, 15th Dec. 2008

Data mining and knowledge discovery can today be considered stable fields, with numerous efficient methods and studies proposed to extract knowledge from data. Nevertheless, finding the famous golden nugget remains a challenge. Indeed, the context has evolved since the first definition of the KDD process, and knowledge must now be extracted from increasingly complex data. The structure of the data, for instance, does not match the attribute-value format when considering the web, texts or videos.

In the framework of data mining, many software solutions have been developed for the extraction of knowledge from tabular data (typically obtained from relational databases). Methodological extensions have been proposed to deal with data initially obtained from other sources, as in the context of natural language (text mining) and images (image mining). KDD has thus evolved following a unimodal scheme instantiated according to the type of the underlying data (tabular data, text, images, etc.), which, in the end, always leads to working on the classical double-entry tabular format.

However, in a large number of application domains, this unimodal approach appears to be too restrictive. Consider for instance a corpus of medical files. Each file can contain tabular data such as results of biological analyses, textual data coming from clinical reports, and image data such as radiographies, echograms, or electrocardiograms. In a decision-making framework, treating each type of information separately has serious drawbacks. It therefore appears more and more necessary to consider these different data simultaneously, thereby encompassing all their complexity. Many examples of complex data can thus be found in potential knowledge extraction processes. These data can be:

  • Semi-structured or unstructured
  • Sensor data such as scientific or medical data
  • Representing the same information at different periods
  • Grouping different kinds of information (images, text, ontologies, etc.)

Hence, a natural question arises: how can one combine information of different natures and associate it with the same semantic unit, for instance the patient? On a methodological level, one could also wonder how to compare such complex units via similarity measures. The classical approach consists in aggregating partial dissimilarities computed on components of the same type. However, this approach tends to produce superposed layers of information: it considers the whole entity to be the sum of its components. By analogy with the analysis of complex systems, it appears that knowledge discovery in complex data cannot simply consist of the concatenation of the partial information obtained from each part of the object. The aim would rather be to discover more "global" knowledge giving a meaning to the components and associating them with the semantic unit. This fundamental information cannot be extracted by the currently considered approaches and the available tools.
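
As an illustration only, the classical approach mentioned above (aggregating partial dissimilarities computed on components of the same type) might be sketched as follows. The component names (lab_value, report), the per-type measures, the value range of 10.0 and the weighted-mean aggregation are all assumptions made for this sketch; the call for papers does not prescribe any of them.

```python
from typing import Any, Callable, Dict


def numeric_dissimilarity(a: float, b: float, value_range: float) -> float:
    """Absolute difference scaled by an assumed value range, capped at 1."""
    return min(abs(a - b) / value_range, 1.0)


def text_dissimilarity(a: str, b: str) -> float:
    """Jaccard distance between the sets of lowercase words of two texts."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return 1.0 - len(words_a & words_b) / len(words_a | words_b)


def aggregated_dissimilarity(
    x: Dict[str, Any],
    y: Dict[str, Any],
    measures: Dict[str, Callable[[Any, Any], float]],
    weights: Dict[str, float],
) -> float:
    """Weighted mean of the partial dissimilarities, one per component."""
    total_weight = sum(weights[name] for name in measures)
    weighted_sum = sum(
        weights[name] * measure(x[name], y[name])
        for name, measure in measures.items()
    )
    return weighted_sum / total_weight


# Hypothetical components of a medical file: one numeric, one textual.
MEASURES = {
    "lab_value": lambda a, b: numeric_dissimilarity(a, b, value_range=10.0),
    "report": text_dissimilarity,
}
WEIGHTS = {"lab_value": 1.0, "report": 1.0}

patient_a = {"lab_value": 4.0, "report": "mild fever and cough"}
patient_b = {"lab_value": 6.0, "report": "mild fever"}
print(aggregated_dissimilarity(patient_a, patient_b, MEASURES, WEIGHTS))
```

Each component is compared in isolation and the results are simply averaged, which is the "sum of its components" view of the object that the paragraph above criticizes.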

New data mining strategies must take into account the specificities of complex objects (the units with which the complex data are associated). These specificities are summarized below:

  • Different kinds. The data associated with an object are of different types. Besides classical numerical, categorical or symbolic descriptors, text, image or audio/video data are often available.
  • Diversity of the sources. The data come from different sources. As shown in the context of medical files, the collected data can come from surveys filled in by doctors, textual reports, measures acquired from medical equipment, radiographies, echograms, etc.
  • Evolving and distributed. It often happens that the same object is described according to the same characteristics at different times or different places. For instance, a patient may often consult several doctors, each one of them producing specific information. These different data are associated with the same subject.
  • Linked to expert knowledge. Intelligent data mining should also take into account external information, also called expert knowledge, which could be incorporated by means of ontologies. In the framework of oncology, for instance, expert knowledge is organized in the form of decision trees and is made available in the form of "best practice guides" called Standard Option Recommendations (SOR).
  • Dimensionality of the data. The association of different data sources at different moments multiplies the points of view and therefore the number of potential descriptors. The resulting high dimensionality is the cause of both algorithmic and methodological difficulties.

The difficulty of knowledge discovery in complex data lies in all these specificities.

The aim of this workshop is to address issues related to the concept of mining complex data. Since the whole knowledge discovery process is involved, our goal is to attract papers dealing with each step of this process. Indeed, managing complex data within the KDD process implies working on every step, from pre-processing (e.g. structuring and organizing) to the visualization and interpretation (e.g. sorting or filtering) of the results, via the data mining methods themselves (e.g. classification, clustering, frequent pattern extraction, etc.). Papers are invited in all KDD fields that involve complex data, including, but not limited to:

  • Pre-processing, structuring and organizing complex data
  • Handling missing or wrong values
  • Data fusion, result fusion
  • Methods and algorithms for mining complex data
  • Mining heterogeneous data
  • Knowledge integration into the KDD process
  • Post-processing, visualization and interpretation support
  • Applications and experience feedback
  • Information retrieval in complex databases
  • Ontology and metadata

The workshop will consist of a series of communications (oral presentations or posters). Reasonable time will be left for discussion after each presentation. All articles will be reviewed by at least two referees, with the double aim of improving quality and giving advice to the authors. A dedicated session (position papers) will be given to young researchers, grouping work in progress from various teams. This can be a good occasion for a PhD student or a young researcher to present his/her starting project. This session will be particularly significant for work in its early stages and for the establishment of research groups on shared topics. Demonstrations of research results may be associated with the poster presentations.

The submitted manuscript should closely reflect the final paper as it will appear in the Proceedings.

INSTRUCTIONS FOR AUTHORS

Papers should not exceed 8 pages (PDF or MS Word) in the IEEE 2-column format (see the IEEE Computer Society Press Proceedings Author Guidelines).

Submitted papers will be evaluated by at least two reviewers. Any submission that exceeds length limits or deviates from formatting requirements may be rejected without review.

IMPORTANT DATES

  • Abstract and paper submission: August 7, 2008
  • Notifications: September 12, 2008
  • Camera-ready version: September 29, 2008
  • Workshop: December 15, 2008

COMMITTEES