ACM CIKM 2000/Tutorial Program

All tutorials will take place on November 6, 2000. Tutorials T1 and T2 will be in the morning; T3 and T4 will be in the afternoon.


T1
Mobile Information Access

by Anupam Joshi
University of Maryland, Baltimore County

8am-12pm

Mobile Information Access and Data Management

This tutorial will introduce attendees to the issues involved in mobile information access. It will start with an overview of the problems presented by mobile environments, from the hardware level to the systems, middleware and applications levels.

We will then focus on the data management problems in mobile environments. These include managing disconnection (mobile users are only intermittently connected), location, multimedia content through transcoding, transactions, and security, among others. We will show why these problems need to be addressed explicitly and are not solved automatically by the networking layer alone. We will provide an overview of the research done in this area in the past few years, including our own, and focus on the details of some representative solutions. We will also point out problems that remain open and challenging, in particular those that deal with serendipitous information retrieval on ad-hoc systems.


T2
Image Retrieval

by M.V. Ramakrishna
Monash University

8am-12pm

Content Based Image Retrieval (CBIR) Systems: Architecture, Similarity Query Processing and Indexing Issues

Content Based Image Retrieval (CBIR) systems have received a lot of attention from the academic and commercial development communities in recent years. The aim of such systems is to enable users to pose queries such as "retrieve images similar to a given image" or "retrieve images of sunset" against a large image database. Because the problems involved are diverse, this has brought together the image processing, information retrieval and database communities. CBIR systems need to extract image features, index them using appropriate structures, and process user queries efficiently to provide the required answers.

The last few years have seen tremendous growth in the number of research papers in this area. There are a few commercial systems, such as QBIC from IBM, and several others are under development. There is a need to develop IR tools that meet the requirements of CBIR systems. In this tutorial, we will discuss the issues of image data modeling, query mechanisms and similarity query processing techniques, and the necessary high-dimensional index structures. We will present the current state of the art in each area and the directions for further research.

Tutorial Outline

  1. Introduction, problems and requirements
  2. Existing CBIR systems, such as QBIC and Virage
  3. Review of Image Features (such as color, texture and shape) used in CBIR systems
  4. Image data modeling, architecture of CBIR systems, four-level data model used in the CHITRA system
  5. Query processing, basic techniques, nature of the problems encountered, advanced optimization, quality versus processing cost considerations, performance evaluation of query processing algorithms
  6. Indexing of features, high-dimensional index structures, R-tree based methods, inherent limitations of the techniques, emerging techniques, similarity measures

T3
Information Retrieval

by David D. Lewis
AT&T Labs - Research

1pm-5pm

Introduction to Machine Learning for Information Retrieval

The topic of the tutorial will be the use of machine learning methods to improve the performance of information retrieval systems, including systems for ranked retrieval, classification, filtering, etc. The emphasis will be on methods for improving the effectiveness of the IR system at making correct judgments of the content and relevance of text, as opposed, say, to improving the efficiency of these systems. We will emphasize supervised learning, but also spend some time on unsupervised methods for representation change.

A major emphasis of the tutorial will be on helping the attendees make links between important but sometimes confusing and ambiguous concepts from information retrieval (e.g. term weighting, query expansion, relevance feedback, classification, "local X" (e.g. local LSI, local feedback), etc.) and important but sometimes confusing and ambiguous concepts from machine learning (e.g. feature selection/extraction, supervised learning, overfitting, generalization, classification vs. regression, "neural X", etc.). Connections will also be drawn between methods developed in information retrieval and those in text mining.

Outline:

  1. Introduction
  2. Text Classification
  3. Learning Boolean Functions
  4. Feature Selection in IR
  5. Learning Probability Distributions
  6. Feature Extraction in IR
  7. Instance-based Classifiers
  8. Unsupervised Learning in IR
  9. Research Directions in ML for IR

About the Instructor: David D. Lewis is a Principal Research Staff Member at AT&T Labs. He has been a researcher in information retrieval since 1985 and has worked on machine learning for IR since 1990. His research interests include active learning, learning with high-dimensional feature sets, estimation of class membership probabilities, evaluation in IR and machine learning, and systems issues in operational text classification.


T4
Software Agents

by Tim Finin and Charles Nicholas
University of Maryland, Baltimore County

1pm-5pm

Software Agents for Information Retrieval

This tutorial will provide an introduction to software agents and their potential applications in IR systems. The tutorial will be divided into three sections of roughly one hour each followed by a short conclusion. The first will present concepts which underlie the software agents paradigm and illustrate them with a range of example applications. The second part will cover agent software architectures, agent communication languages, and cooperation protocols. The third segment will present examples of agent-based IR systems and discuss the techniques used in them.

Who should attend: Introductory

About the Instructors:

Dr. Timothy Finin is a Professor of Computer Science and Electrical Engineering at the University of Maryland, Baltimore County. He has over 25 years of experience in the applications of Artificial Intelligence to problems in database and knowledge base systems, intelligent information systems, expert systems, natural language processing, intelligent interfaces and robotics. He is currently working on the development of technology to support intelligent information agents. Prior to joining UMBC, he was a Technical Director at the Unisys Center for Advanced Information Technology, a member of the faculty of the University of Pennsylvania, and on the research staff of the MIT AI Lab. He holds a PhD in Computer Science from the University of Illinois. Finin is the author of over eighty research publications. He has been chair or program chair of several conferences in the area of intelligent systems and served as technical co-chair of Autonomous Agents-98.

Dr. Charles Nicholas is an Associate Professor of Computer Science at the University of Maryland, Baltimore County. He received a Ph.D. in Computer Science from The Ohio State University in 1988 and has been at UMBC since August 1988. Nicholas served as the General Chair of CIKM'95, CIKM'96, CIKM'97, and CIKM'98. He was Co-Chair of the Principles of Digital Document Processing Workshops PODP'96 and PODDP'98. His areas of interest include information retrieval, electronic document processing, and software engineering. Dr. Nicholas is currently the director of the Center for Architectures for Data-driven Information Processing, a DOD-sponsored research center located at UMBC.