TWiki> Daoli Web>EmailOverloadStateOfArts (09 May 2008, Main.Admininistrator)

State of the Art

Email is one of the most successful computer applications ever developed. According to marketing surveys, over 4 billion corporate email messages were exchanged per day in 2001, a figure projected to reach 35 billion in 2005. IDC estimates that the volume of business email sent worldwide in 2007 will approach 5 billion gigabytes, nearly doubling over two years. Moreover, 97% of workers report using email multiple times per week, for a daily average of 49 minutes, and 71% of people state that it is "essential" for their everyday work.

Email applications were originally designed for asynchronous communication, but email is now used for many other purposes: document delivery and archiving, work task delegation, and task tracking. It is also used for storing personal names and addresses, sending reminders, asking for assistance, scheduling appointments, and handling technical support queries. This use of email for functions it was not designed for is called email overload.

A number of different technical approaches have been proposed to address the overloaded inbox. They focus on message removal, information structuring, message highlighting and workflow.

Message Removal

Message Removal has the simple aim of reducing the overall number of inbox items. Specific approaches include spam removal, personal filtering and assisted filing. Spam removal and personal filtering address new incoming messages while assisted filing aims to reduce inbox clutter by helping users to file processed messages.

Spam detection has been relatively successful because spam messages often have distinct properties such as large distribution lists, predictable headers and somewhat predictable message content.
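The three properties named above can be combined into a simple score. The following is only a minimal sketch: the phrase list, the recipient threshold and the function name `spam_score` are assumptions made for illustration, and the check on the `Precedence: bulk` header stands in for "predictable headers" in general.

```python
# Hypothetical phrase list; a real filter would learn or maintain this.
SPAM_PHRASES = ("act now", "free offer", "winner")


def spam_score(recipients, headers, body, max_recipients=50):
    """Count how many of three spam-like properties a message shows."""
    score = 0
    # Property 1: large distribution list (threshold is an assumption).
    if len(recipients) > max_recipients:
        score += 1
    # Property 2: predictable headers, here a bulk-mail marker.
    if headers.get("Precedence", "").lower() == "bulk":
        score += 1
    # Property 3: somewhat predictable message content.
    text = body.lower()
    if any(phrase in text for phrase in SPAM_PHRASES):
        score += 1
    return score
```

A message could then be treated as spam when its score reaches some chosen threshold; where to set that threshold is a trade-off between missed spam and lost legitimate mail.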

Personal filtering is similar to spam removal, except that users define their own rules for message identification and filing. However, personal filtering has not been successful, for the following reasons:

  • Writing filtering rules is a programming task which most users find both hard and time-consuming.
  • Users are often not confident that the rules they define will operate correctly. In particular, they are concerned that misdefined rules will lead to important messages being filed unseen and hence being overlooked.
  • Rules have to be modified frequently, which makes rule maintenance an important issue.
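The first two problems can be illustrated with a small sketch. The `Message` and `Rule` types, the folder names and the sample messages are all hypothetical; the point is only that a rule is a small program, and that a slightly too broad rule silently files an important message out of sight.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    sender: str
    subject: str
    body: str = ""


@dataclass
class Rule:
    name: str
    matches: Callable[[Message], bool]
    folder: str


def file_message(msg: Message, rules: List[Rule]) -> str:
    """Return the folder of the first matching rule, or 'Inbox' if none match."""
    for rule in rules:
        if rule.matches(msg):
            return rule.folder
    return "Inbox"


rules = [
    Rule("newsletters", lambda m: "newsletter" in m.subject.lower(), "Newsletters"),
    # Misdefined rule: meant for mailing-list traffic, but it matches
    # any subject that happens to contain the word "list".
    Rule("lists", lambda m: "list" in m.subject.lower(), "Lists"),
]

routine = Message("news@example.com", "Weekly newsletter")
urgent = Message("boss@example.com", "Please review the price list today")

print(file_message(routine, rules))  # filed as intended
print(file_message(urgent, rules))   # filed unseen by the overly broad rule
```

Here the urgent message never reaches the inbox, which is exactly the outcome users say they fear when defining rules.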

Since categorization is a cognitively difficult task, filing poses multiple problems for email users. First, users are often unable to remember the definitions and names of existing folders, leading them to create new folders that duplicate existing ones. Second, folders are often too small to be useful. Users bear the initial overhead of creating folders, and they have to recall multiple folder definitions every time they make a filing decision.

Several agent-based systems have been designed to provide assisted filing. They use machine learning techniques to automatically elicit the defining characteristics of existing folders, based on message header properties and content. When tested offline on message corpora, they categorize inbox messages with a reasonable degree of success, reporting over 85% accuracy. However, none of them has been tested in real usage contexts, and it may be that users are unwilling to trust systems to assist in filing their messages.
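As an illustration of how such a system might learn folder characteristics, the sketch below uses a multinomial naive Bayes classifier with add-one smoothing over message words. This is one common text classification technique chosen for the example, not necessarily the method used by the systems surveyed; the class name `FolderClassifier` and the training messages are hypothetical. Header properties such as the sender could be added to the text passed to `train` and `suggest`.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class FolderClassifier:
    """Multinomial naive Bayes with add-one smoothing over message words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # folder -> word -> count
        self.folder_counts = Counter()           # folder -> number of messages
        self.vocabulary = set()

    def train(self, folder, text):
        """Record one already-filed message as an example of its folder."""
        words = tokenize(text)
        self.word_counts[folder].update(words)
        self.folder_counts[folder] += 1
        self.vocabulary.update(words)

    def scores(self, text):
        """Return a log-probability score for each known folder."""
        words = tokenize(text)
        total_messages = sum(self.folder_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for folder, n_messages in self.folder_counts.items():
            counts = self.word_counts[folder]
            total_words = sum(counts.values())
            log_prob = math.log(n_messages / total_messages)
            for word in words:
                log_prob += math.log((counts[word] + 1) / (total_words + vocab_size))
            result[folder] = log_prob
        return result

    def suggest(self, text):
        """Suggest the folder with the highest score for a new message."""
        scores = self.scores(text)
        return max(scores, key=scores.get)


clf = FolderClassifier()
clf.train("Travel", "flight booking confirmation for conference trip")
clf.train("Travel", "hotel reservation and flight itinerary")
clf.train("Budget", "quarterly budget spreadsheet and invoice")
clf.train("Budget", "invoice approval for purchase order")

print(clf.suggest("your flight itinerary has changed"))
print(clf.suggest("invoice for the new purchase"))
```

Returning a suggestion rather than moving the message automatically leaves the final filing decision with the user, which may matter given the trust concern raised above.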

Information structuring

Information structuring predominantly tackles the problem of detecting relations between inbox messages to identify prior messages relevant to the current task. Such structure is intended to impose order on the undifferentiated inbox.

Almost all work on information structuring proposes novel visualizations for message threads. These range from combining the components of the thread linearly to the construction of complex tree structures and subthreads. Most of these approaches are based on information derived from message subject lines. But topic drift means that the subject line is only a weak indicator of relations between elements of a specific task. One way to address topic drift is to develop novel algorithms for thread detection that combine information from the message subject line, body, and header information about sender and addressee.
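A minimal sketch of such a combined measure follows. It scores how related two messages are from three signals: whether their subjects match once reply prefixes are stripped, the word overlap of their bodies, and the overlap of their participants. The weights, the `Message` type and the function names are assumptions made for illustration, not a published algorithm.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Message:
    subject: str
    body: str
    sender: str
    recipients: frozenset = field(default_factory=frozenset)


def normalize_subject(subject):
    """Strip leading reply/forward prefixes such as 'Re:' and 'Fwd:'."""
    stripped = re.sub(r"^\s*((re|fwd?)\s*:\s*)+", "", subject, flags=re.IGNORECASE)
    return stripped.strip().lower()


def words(text):
    """Return the set of lowercase alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relatedness(m1, m2, w_subject=0.4, w_body=0.3, w_people=0.3):
    """Weighted combination of subject, body and participant similarity."""
    same_subject = normalize_subject(m1.subject) == normalize_subject(m2.subject)
    subject_score = 1.0 if same_subject else 0.0
    body_score = jaccard(words(m1.body), words(m2.body))
    people1 = {m1.sender} | set(m1.recipients)
    people2 = {m2.sender} | set(m2.recipients)
    people_score = jaccard(people1, people2)
    return w_subject * subject_score + w_body * body_score + w_people * people_score


request = Message("Budget review", "please send the q3 budget figures before friday",
                  "ann@example.com", frozenset({"bob@example.com"}))
# Topic drift: the reply continues the task under an unrelated subject line.
drifted = Message("Lunch?", "here are the q3 budget figures you asked for",
                  "bob@example.com", frozenset({"ann@example.com"}))
unrelated = Message("Parking", "garage closed monday",
                    "facilities@example.com", frozenset({"all@example.com"}))

print(relatedness(request, drifted))
print(relatedness(request, unrelated))
```

A subject-only comparison would give the drifted reply a score of zero; the body and participant signals are what keep it linked to the original request.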

Another way to infer structure is to apply clustering techniques to the inbox to suggest new folders. Compared with clustering document collections and web pages, clustering email raises specific challenges, because the length and character of messages are very different from the document domains where clustering has generally been applied.

Message Highlighting and Labeling

Messages are highly undifferentiated in the overloaded inbox, so many commercial email interfaces allow users to visually mark messages to indicate that they require special attention. Furthermore, both Outlook and Netscape allow users to label messages as 'todos', with this information appearing in a header field, allowing the user to sort and view 'todos' together. According to some case studies, however, marking outstanding messages as unread was judged to be a much more effective reminder than dedicated 'todo' flags or other types of visual coding.

Workflow Systems

Workflow systems assume that collaborative organizational tasks have a predictable structure associated with different work roles. For example, a purchase order may have to be initiated, approved by a manager and then processed by the purchasing department. However, workflow systems have two major limitations: coverage and lack of integration. The coverage problem arises because most tasks have an evolving structure and require iterative negotiation for their solution, which makes them inappropriate for workflow tools. And even when tasks are amenable to workflow, there are integration issues. Most workflow systems are not well integrated with email clients, so users have to switch to a separate application, which introduces extra cognitive overhead.
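The purchase order example can be written as a small state machine in which each step is tied to a role. This is only a sketch of the assumption workflow systems make; the state names, role names and the `advance` function are hypothetical.

```python
# Each state maps to (role allowed to act, next state).
TRANSITIONS = {
    "initiated": ("manager", "approved"),
    "approved": ("purchasing", "processed"),
}


def advance(state, role):
    """Move an order to its next state if the acting role is the expected one."""
    if state not in TRANSITIONS:
        raise ValueError(f"no further steps after state {state!r}")
    required_role, next_state = TRANSITIONS[state]
    if role != required_role:
        raise PermissionError(f"{role!r} cannot act on a {state!r} order")
    return next_state


state = "initiated"
state = advance(state, "manager")      # manager approves the order
state = advance(state, "purchasing")   # purchasing department processes it
print(state)
```

The fixed transition table is what makes the coverage problem visible: a task that needs renegotiation, or a step that was not foreseen, has no place in it.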

Besides these technical approaches to the overloaded inbox, there are empirical studies of email usage covering filing, task management, triage, contact management, semi-structured messaging and individual differences. The main problem with these studies is the shortage of data. Another research area is novel system design. Some designs have tackled the problem of task management, either by providing direct support for tasks or by using thread visualization to help users detect relations between inbox items. Other research addresses the filing and priority problems, using machine learning techniques to classify inbox messages. There have also been attempts to design email interfaces that reflect different cognitive styles or activity levels. More recently, several researchers have proposed people-centric social interfaces to email. Another potentially important trend is the move towards search-based clients such as Gmail by Google. Again, a major problem with this design research lies in generalizing from its results: in most areas, only one or two systems have been built and tested, and it is still not obvious which approaches are promising and which are not.

Topic revision: r2 - 09 May 2008 - 13:17:54 - Main.Admininistrator
 