Data Mining Laboratory

Post date: 2019/12/10
The Data Mining Lab focuses on the analysis of very large data sets, especially those that arise in the application areas of text mining and bioinformatics. The emphasis is on finding sound, theoretically motivated algorithms for the central tasks in data mining, such as high-dimensional clustering, classification, and data visualization.
Our lab works on a range of problems in machine learning and data mining, including but not limited to the following areas:
  • Analysis of very large data sets, especially those that arise in the application areas of text mining and bioinformatics
  • Semantic web and knowledge graphs
  • Web mining and text mining
  • Clustering ensembles
  • Association rule mining
  • Graph mining and bioinformatics
With the above goals in mind, the lab has recently been exploring the application of information theory to data mining tasks. Information theory provides a natural way of dealing with non-negative data vectors by treating them as probability vectors. Problems such as clustering can then be posed as optimization problems in information theory, such as maximizing mutual information. As an application to text mining, such an approach has been shown to reveal the semantic similarity of words, thus leading to a substantial reduction in classifier complexity and increased accuracy in document classification when training data is sparse. Further directions currently being explored include: (a) information-theoretic clustering and approximation of higher-order non-negative tensors (which often arise in applications as multidimensional contingency tables), and (b) new algorithms for low-rank non-negative matrix factorization.
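To make the information-theoretic viewpoint concrete, the sketch below (not code from the lab; the word-by-class counts are purely illustrative) normalizes a small contingency table into a joint probability distribution and computes the mutual information I(Word; Class) in bits, the kind of quantity an information-theoretic clustering or classification method would seek to preserve or maximize:

```python
import math

# Hypothetical word-by-class contingency table of counts:
# rows = words, columns = document classes (e.g. sports, finance).
counts = [
    [10, 1],   # "goal"  occurs mostly in class 0 -> informative
    [2, 12],   # "stock" occurs mostly in class 1 -> informative
    [5, 6],    # "report" is nearly uniform      -> little information
]

total = sum(sum(row) for row in counts)
# Treat the normalized table as a joint distribution p(word, class).
p = [[c / total for c in row] for row in counts]
p_word = [sum(row) for row in p]                                  # marginal p(word)
p_cls = [sum(row[j] for row in p) for j in range(len(p[0]))]      # marginal p(class)

# Mutual information I(Word; Class) = sum p(w,c) * log2( p(w,c) / (p(w) p(c)) ).
mi = sum(
    p[i][j] * math.log2(p[i][j] / (p_word[i] * p_cls[j]))
    for i in range(len(p))
    for j in range(len(p[0]))
    if p[i][j] > 0
)
print(f"I(Word; Class) = {mi:.4f} bits")
```

Words whose rows deviate most from the product of the marginals contribute the most to this sum, which is one way of quantifying the "semantic" association between words and document classes mentioned above.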
The Data Mining Lab has disseminated publications, software and results for document clustering, clustering of gene expression data in bioinformatics and multidimensional data visualization.
