Interface agents: A review of the field
Stuart E. Middleton
Intelligence, Agents and Multimedia group (IAM group)
Email: sem99r@ecs.soton.ac.uk
Web: http://www.ecs.soton.ac.uk/~sem99r
Technical Report Number: ECSTR–IAM01-001
ISBN: 0854327320
Abstract
This paper reviews the origins of interface agents, discusses challenges that exist within the interface agent field and presents a survey of current attempts to find solutions to these challenges. A history of agent systems from their birth in the 1960’s to the current day is described, along with the issues they try to address. A taxonomy of interface agent systems is presented, and today’s agent systems categorized accordingly. Lastly, an analysis of the machine learning and user modelling techniques used by today’s agents is presented.
Keywords
Agents, interface agents, survey, review
Table of Contents
Abstract
1 Introduction
2 History of software agents
3 Issues and challenges for interface agents
4 Taxonomy of interface agent systems
5 Review of current interface agent systems and prototypes
  5.1 Review of current agent systems
    5.1.1 Auction/market domain
    5.1.2 Believable/entertainment domain
    5.1.3 Email filtering domain
    5.1.4 Expert assistance domain
    5.1.5 Matchmaking domain
    5.1.6 Meeting schedulers
    5.1.7 News filtering domain
    5.1.8 Recommender systems
    5.1.9 Web domain
    5.1.10 Other domains
  5.2 Classification of agent systems
6 Conclusions
7 Glossary of machine learning terminology
8 References
Table of Figures
Figure 1 Classification of agent systems
1 Introduction
The 1990’s have seen the dawn of a new paradigm in computing - software agents. Many researchers are currently active in this vibrant area, drawing from more traditional research within the artificial intelligence (AI) and human-computer interaction (HCI) communities. Kay [24] and others argue that one aspect of software agent systems, the interface agent, has the potential to revolutionize computing as we know it, allowing us to advance from direct manipulation of systems to indirect interaction with agents. Removing the requirement for people to manage the small details of a task liberates individuals, empowering them to accomplish goals otherwise requiring experts.
This may be the future of computing, but software agents originated in the field of artificial intelligence, back in the 1950’s. The next section describes some of the important landmarks along the way to where we are today.
2 History of software agents
Alan Turing, famous for his work on computability [76], posed the question “Can machines think?” [77]. His test, where a person communicates via a Teletype with either a person or a computer, became known as the Turing test. The Turing test requires a conversational computer to be capable of fooling a human at the other end. It is the Turing test that inspired the birth of the artificial intelligence community.
The discipline of artificial intelligence (AI) was born in the 1950’s. Marvin Minsky, after some work with neural networks (deemed a failure at the time due to the difficulty of learning weights), teamed up with John McCarthy at MIT to work on symbolic search-based systems. At the same time at Carnegie-Mellon, Allen Newell and Herbert Simon were successfully exploring heuristic search to prove logic theorems. Initial successes thus led to heuristic search of symbolic representations becoming the dominant approach to AI.
The 1960’s saw much progress. Now at Stanford, McCarthy [38] had just invented LISP and set about representing the world with symbols, using logic to solve problems [39]. At the same time Newell [50] created the General Problem Solver which, given a suitable representation, could solve any problem. Problems solved were in simple, noise- and error-free symbolic worlds, with the assumption that such solutions would generalize to allow larger, real-world problems to be tackled. Researchers did not worry about keeping computation on a human time-scale, using the increases in hardware performance to constantly increase the possible search space size, thus solving increasingly impressive problems.
During the 1970’s, search became well understood [51]. Symbolic systems still dominated, with continuing hardware improvements allowing steady, successful progress. Robots were created, for example Shakey [52], that lived in special block worlds, and could navigate around and stack blocks sensibly. Such simplified worlds avoided the complexity of real world problems. The assumption, underpinning all the symbolic research, that simple symbolic worlds would generalize to the real world, was about to be found wanting.
In the 1980’s, expert systems were created to try to solve real problems. McCarthy [40] had realized that “common sense” was required in addition to specialized domain knowledge to solve anything but simple microworld problems. A sub-field of AI, knowledge representation, came into being to examine approaches to representing the everyday world. Unfortunately the idea of “common sense” proved impossible to represent, and knowledge-based systems were widely viewed as having failed to solve real-world problems. At the same time, the backpropagation algorithm [67] caused a resurgence of interest in connectionist approaches, previously deemed a failure, and Minsky [42] examined an agent-based approach for intelligence.
The late 1980’s and early 1990’s saw the decline of search-based symbolic approaches. Brooks [10] convincingly challenged the basic assumptions of the symbolic approaches, and instead created embodied, grounded systems for robots using the “world as its own best model”. This bottom-up approach was termed nouvelle AI, and had some initial successes. However, it too failed to scale up to real-world problems of any significant complexity. Connectionist approaches were aided by new parallel hardware in the early 1990’s, but the complexity of a parallel architecture led such systems to fail in the marketplace.
Knowledge engineering, now widely seen as costly and hard to re-use, was superseded by machine learning techniques borrowed from AI. Towards the end of the 1990’s, pattern-learning algorithms [44] could classify suitable domains of knowledge, such as news stories and examination papers, with as much accuracy as manual classification. Hybrids of traditional and nouvelle AI started to appear as new approaches were sought.
The mid 1990’s saw Negroponte’s [49] and Kay’s [24] dream of indirect HCI, coupled with Minsky’s [42] ideas on intelligence, lead to the new field of agent-based computing. Experiments with interface agents that learnt about their user [33], and multi-agent systems where simple agents interacted to achieve their goals [80], dominated the research. Such agent systems were all grounded in the real world, using proven AI techniques to achieve concrete results (applying the maxim “a little AI goes a long way”).
User modelling changed in the 1990’s too, moving from the static hand-crafted representations of the 1980’s to dynamic behaviour-based models [25]. Machine learning techniques proved particularly adept at identifying patterns in user behaviour.
3 Issues and challenges for interface agents
Maes [33] describes interface agents as follows:
“Instead of user-initiated interaction via commands and/or direct manipulation, the user is engaged in a co-operative process in which human and computer agents both initiate communication, monitor events and perform tasks. The metaphor used is that of a personal assistant who is collaborating with the user in the same work environment.”
The motivating concept behind Maes’ interface agents is to allow the user to delegate mundane and tedious tasks to an agent assistant. Her own agents follow this direction, scheduling and rescheduling meetings, filtering emails, filtering news and selecting good books. Her goal is to reduce the workload of users by creating personalized agents to which personal work can be delegated.
There are many interface agent systems and prototypes, inspired by Maes’ early work, situated within a variety of domains. The majority of these systems are reviewed and categorized in the next section.
Common to these systems, however, are three issues that must be addressed before successful user collaboration with an agent can occur:
• Knowing the user
• Interacting with the user
• Competence in helping the user
Knowing the user involves learning user preferences and work habits. If an assistant is to offer help at the right time, and of the right sort, then it must learn how the user prefers to work. An eager assistant, always interrupting with irrelevant information, would just annoy the user and increase the overall workload.
The following challenges exist for systems trying to learn about users:
• Extracting the users’ goals and intentions from observations and feedback
• Getting sufficient context in which to set the users’ goals
• Adapting to the user’s changing objectives
• Reducing the initial training time
At any given time, an interface agent must have an idea of what the user is trying to achieve in order to be able to offer effective assistance. In addition to knowing what the user’s intentions are, there must be sufficient contextual information about the user’s current situation to avoid irrelevant agent help. Machine learning techniques help here, but which should be used and why?
Another problem is that regular users will typically have numerous concurrent tasks to perform. If an agent is to be helpful with more than one task, it must be able to discover when the user has stopped working on one job, and is progressing to another – but what is the best way to detect this?
Users are generally unwilling to invest much time and effort in training software systems. They want results early, before committing too much to a tool. This means that interface agents must limit the initial period during which the agent learns enough about the user to offer useful help. What impact does this have on an agent’s learning ability?
No metaphor for indirect HCI has yet reached maturity, so how best to present agents remains an open question. Lessons have, however, been learned from direct manipulation interfaces: users need to feel in control, expectations should not be unduly inflated and user mistakes should not be penalized [53].
Interacting with the user thus presents the following challenges:
• Deciding how much control to delegate to the agent
• Choosing a metaphor for agent interaction
• Making simple systems that novices can use
It is known from direct manipulation interfaces that users want to feel in control of what their tools are doing. By the nature of an autonomous interface agent, some control has been delegated to it, in order for it to do its task. The question is, how do we build the users’ trust, and once a level of trust is established how much control do we give to the agents? Shneiderman [74] argues for a combination of direct manipulation and indirect HCI, promoting user understanding of agents and the ability for users to control agent behaviour directly. How can we use these guiding principles in our systems?
Interface metaphors, such as the desktop metaphor, guide users in the formation of useful conceptual models of a system. New metaphors will be required for indirect HCI, presenting agents in a way helpful to users new to the system. Ideally, interface agents should be so simple to use that delegating tasks becomes a natural way of working, amenable to the novice user – but what is a natural way of working with agents?
Lastly, there is the issue of competence. Once the agent knows what the user is doing and has a good interaction style, it must still formulate a plan of action that helps, not hinders, the user. The challenges are:
• Knowing when (and if) to interrupt the user
• Performing tasks autonomously in the way preferred by the user
• Finding strategies for partial automation of tasks
There is very little current research into how users can best be helped. Work from other disciplines, such as computer supported co-operative working (CSCW), can help, but real user trials are needed to demonstrate and evaluate the effectiveness and usefulness of the personalized services performed by interface agents [54]. If an agent does not reduce the workload of a real user in a real work setting, it is less than useful.
4 Taxonomy of interface agent systems
Several authors [54] [80] have suggested taxonomies for software agents as a whole, but they tend to address interface agents as a monolithic class, citing a few examples of the various prototypical systems. With the maturing of the agent field, and the growing number of interface agents reported in the literature, a more detailed analysis is warranted. Mladenić [46] goes some way to achieving this requirement, adopting a machine learning view of interface agents.
Interface agents can be classified according to the role they perform, the technology they use or the domain they inhabit. Interface agents are moving from research to commercial exploitation, significantly increasing the roles and domains for agents as entrepreneurs find new ways to exploit new markets. The fundamental technology behind the agents, however, is undergoing less radical change, and thus provides a more stable basis on which to build a useful taxonomy.
On this basis a survey of current interface agent technology has been performed. The next section details the actual agent systems and prototypes reviewed. The result is a non-exclusive taxonomy of the technologies that specific agent systems support.
• Character-based agents
• Social agents
  o Recommender systems
• Agents that learn about the user
  o Monitor user behaviour
  o Receive user feedback
    ▪ Explicit feedback
    ▪ Initial training set
  o Programmed by user
• Agents with user models
  o Behavioural model
  o Knowledge-based model
  o Stereotypes
Character-based agents employ advanced “character” based interfaces, representing real world characters (such as a pet dog or a human assistant [34]). Such agents draw on existing real-world protocols, already known to even novice users, to facilitate more natural interaction. There are also applications in the entertainment domain, creating state-of-the-art virtual worlds populated by believable agents.
Social agents talk to other agents (typically other interface agents of the same type) in order to share information. This technique is often used to bootstrap new, inexperienced interface agents with the experience of older interface agents (attached to other users).
Recommender systems are a specific type of social agent. They are also referred to as collaborative filters [60], finding relevant items based on the recommendations of others. Typically, the user’s own ratings are used to find similar users, with the aim of sharing recommendations on common areas of interest.
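The idea of using a user’s own ratings to find similar users can be illustrated with a minimal sketch. The code below is not taken from any of the systems reviewed here; the function names, the cosine similarity measure and the example ratings are all assumptions chosen for illustration.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(target, ratings, top_n=3):
    """Rank items the target has not rated by similarity-weighted neighbour ratings."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item in ratings[target]:
                continue  # only recommend unseen items
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ann": {"book_a": 5, "book_b": 1},
    "bob": {"book_a": 5, "book_b": 1, "book_c": 5, "book_d": 2},
    "cat": {"book_a": 1, "book_b": 5, "book_c": 1, "book_d": 5},
}
print(recommend("ann", ratings))
```

Here "bob" rates the shared books exactly as "ann" does, so his opinions dominate the predicted scores for the books "ann" has not yet seen. Real collaborative filters typically use more robust similarity measures (for example, mean-centred correlation).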
Agents employing a learning technology are classified according to the type of information required by the learning technique and the way the user model is represented. Algorithms requiring an explicit training set employ supervised learning, while those without a training set use unsupervised learning techniques [44]. There are three general ways to learn about the user: monitor the user, ask for feedback or allow explicit programming by the user.
Monitoring the user’s behaviour produces unlabelled data, suitable for unsupervised learning techniques. This is generally the hardest way to learn, but is also the least intrusive. If the monitored behaviour is assumed to be an example of what the user wants, a positive example can be inferred.
Asking the user for feedback, be it on a case-by-case basis or via an initial training set, produces labelled training data. Supervised learning techniques can thus be employed, which usually outperform unsupervised learning. The disadvantage is that feedback must be provided, requiring an investment of effort (often significant) in the agent by the user.
User programming involves the user changing the agent explicitly. Programming can be performed in a variety of ways, from complex programming languages to the specification of simple cause/effect graphs. Explicit programming requires significant effort by the user.
User modelling [25] comes in two varieties, behavioural and knowledge-based. Knowledge-based user modelling is typically the result of questionnaires and studies of users, hand-crafted into a set of heuristics. Behavioural models are generally the result of monitoring the user during an activity. Stereotypes [64] can be applied to both cases, classifying the users into groups (or stereotypes), with the aim of applying generalizations to people in those groups.
Specific interface agents will often implement several of the above types of technology, and so would appear in multiple classes. A common example is an agent that learns about the user and also supports a user model.
The presented taxonomy ought to be robust to the increase in new systems, since the fundamental technologies of machine learning and user modelling are unlikely to change as quickly.
5 Review of current interface agent systems and prototypes
A comparison of interface agents is difficult since there are no widely used standards for reporting results. Where machine learning techniques are employed, standard tests such as precision and recall provide useful metrics for comparing learning algorithms. However, the best test of an interface agent’s ability to help a user is a user trial. Unfortunately, user trials in the literature do not follow a consistent methodology.
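As a reminder of how the precision and recall metrics mentioned above are computed, the following minimal sketch evaluates a set of items an agent selected against the set the user actually found relevant. The function name and the example document identifiers are hypothetical.

```python
def precision_recall(predicted, relevant):
    """Return (precision, recall) for a set of predicted items against the relevant set."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    # Precision: what fraction of the items the agent selected were relevant?
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: what fraction of the relevant items did the agent select?
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: the agent selects four documents, two of which
# are among the three the user actually considers relevant.
p, r = precision_recall({"d1", "d2", "d3", "d4"}, {"d1", "d2", "d5"})
print(f"precision={p:.2f} recall={r:.2f}")
```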
The analysis in this paper will focus on classifying each agent, identifying the techniques used (such as specific machine learning techniques or user modelling types) and, where applicable, reporting the results published by the original authors. Comparisons of systems can thus be made on a qualitative basis.
5.1 Review of current agent systems
What follows is a review of known interface agent systems. Agents are examined by application domain, so that similar types of interface agents can be compared. The machine learning algorithms specified here are described later in the glossary.
5.1.1 Auction/market domain
Kasbah [36] is a market system in which each user has an agent. The user programs the agent with a buying behaviour profile, and the agent negotiates to buy and sell items for the user.
Results: Users wanted more “human like” negotiation from the agents; otherwise the system was well received.
Sardine [47] is an auction agent that tries to purchase an airline ticket for the user, based on some specified preferences. The user’s agent negotiates with travel agents to secure the best deal.
5.1.2 Believable/entertainment domain
ACT [28] is an addition to the ALIVE system. It is a creature within the ALIVE world, observing the user and learning chains of actions. It tries to help the user by completing new action chains in the pattern of previous ones.
ALIVE [35] is a “magic mirror” system to a 3D world. Interactive agents (such as a dog) exist for users to play with. Gesture recognition and a competing goal architecture are employed.
Cathexis [37] is a believable agent with modelled emotions, as is the Oz project [4].
5.1.3 Email filtering domain
MailCat [72] filters email by providing a choice of folders to the user. TF-IDF vectors are created for existing emails, and cosine similarity is used to match new emails. The user has the final say, choosing one of the suggested folders or moving messages manually.
Results: 0.3 second classification time, 60-80% accuracy giving user one choice, 80-98% accuracy giving user 3 choices of folder.
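The general technique of matching a new message to folders using TF-IDF vectors and cosine similarity can be sketched as follows. This is an illustrative reconstruction of the technique only, not MailCat’s implementation: the function names, the whitespace tokenization, the smoothed IDF formula and the example folders are all assumptions.

```python
import math
from collections import Counter

def build_idf(folder_tokens):
    """Inverse document frequency, treating each folder as one document (+1 smoothing)."""
    n = len(folder_tokens)
    df = Counter(t for tokens in folder_tokens.values() for t in set(tokens))
    return {t: math.log(n / df[t]) + 1.0 for t in df}

def vectorise(tokens, idf):
    """Sparse TF-IDF vector; terms never seen in any folder are ignored."""
    tf = Counter(tokens)
    return {t: tf[t] * idf[t] for t in tf if t in idf}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def suggest_folders(message, folders, k=3):
    """Return the k folders whose existing mail is most similar to the new message."""
    folder_tokens = {name: " ".join(texts).lower().split()
                     for name, texts in folders.items()}
    idf = build_idf(folder_tokens)
    vectors = {name: vectorise(tokens, idf) for name, tokens in folder_tokens.items()}
    query = vectorise(message.lower().split(), idf)
    ranked = sorted(vectors, key=lambda name: cosine(query, vectors[name]), reverse=True)
    return ranked[:k]

# Hypothetical folders and message.
folders = {
    "travel": ["flight booking confirmed", "hotel booking for conference"],
    "budget": ["quarterly budget report", "budget approval needed"],
    "seminar": ["seminar schedule announced", "seminar room changed"],
}
print(suggest_folders("conference hotel and flight details", folders, k=2))
```

Offering several ranked folders rather than a single guess mirrors the trade-off reported above: accuracy rises when the user is given three choices instead of one, at the cost of one extra selection step.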
MAGI [17] filters emails, monitoring user behaviour and receiving relevance feedback. CN2 and IBPL are used to classify emails.
Maxims [33] filters email by learning repetitive actions the user performs. It monitors user actions using memory-based reasoning to discover patterns. Agents can share expertise with other agents, and user programming is allowed.
Re:Agent [9] is an email filter that accepts user-provided keywords for its groupings. TF-IDF vectors are created for each email, along with the TF of the user-provided keywords. This representation is then clustered using nearest neighbour and neural network clustering algorithms (for comparison).
Results: classification accuracy – neural network 94.8 ± 4.2%, nearest neighbour 96.9 ± 2.3%; high accuracy due to simple classification task (into “work” or “other” categories).
5.1.4 Expert assistance domain
Coach [73] is a LISP help system that monitors user mistakes and offers unsolicited advice. A knowledge-based user model is supported, with the concept of user experience stereotypically represented. Heuristics adjust the model based on user mistakes.
Results: Student performance improved, knowledge of functions improved by a factor of 5.
Eager [13] automates observed repetitive HyperCard actions. It monitors the user looking for behaviour patterns, and creates helpful macros from them.
Results: Users felt a loss of control; macros for some irrelevant small patterns were created.
GALOIS [68] monitors the use of an application, and offers expert advice when users are lost or being inefficient. An initial knowledge-based user profile is constructed from personal information, then a behavioural model built by observing user actions. Stereotypes are used to classify users, thus allowing customized help.
GESIA [11] helps expert system developers by suggesting predicted actions. A Bayesian network models user behaviour, allowing predictions with the help of hard-coded domain knowledge.
Open Sesame! [20] observes user actions and offers to automate repetitive tasks. The ART-2 learning algorithm is used.
Results: Only 2/129 suggestions were followed – system deemed to have failed; action patterns do not generalize across situations well.
5.1.5 Matchmaking domain
ExpertFinder [78] monitors users’ Java code and finds people who use the same classes. TF-IDF vectors represent code files, and cosine similarity is used to find similar people.
ReferralWeb [23] builds a social network from publicly available web pages. People’s names are extracted from pages, and co-occurrence of names within pages implies a social connection. Queries such as “list docs close to Mitchell” can thus be issued.
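The co-occurrence idea can be sketched as a graph in which two people are linked whenever their names appear on the same page, with “closeness” measured as the number of links between them. The sketch below is illustrative only and is not ReferralWeb’s implementation; it assumes name extraction has already been done, and the function names and all names other than “Mitchell” are hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations

def build_network(pages):
    """Link every pair of names that co-occur on a page. pages: iterable of name sets."""
    graph = defaultdict(set)
    for names in pages:
        for a, b in combinations(sorted(names), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def distance(graph, start, goal):
    """Breadth-first search: number of co-occurrence links between two people, or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        for neighbour in graph.get(person, ()):
            if neighbour == goal:
                return hops + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None  # no chain of co-occurrences connects the two people

# Hypothetical pages, each reduced to the set of names found on it.
pages = [{"Mitchell", "Alice"}, {"Alice", "Bob"}, {"Bob", "Carol"}]
network = build_network(pages)
print(distance(network, "Mitchell", "Bob"))
```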
Yenta [16] allows user agents to “find” each other, and determine commonality of interests. The SMART algorithm initially classifies user emails, newsgroups and created files in order to build an interest profile. Agents then find each other in the Yenta system, compare profiles for similarity, and suggest other agents to try.
Results: Halves the worst-case search space, robust to removal of agents.
5.1.6 Meeting schedulers
CAP [43] is a calendar manager, monitoring email and scheduling software to detect meeting patterns. Decision trees are learnt with ID3, which uses information gain to select features, and are then converted to production rules.
Results: 31-60% accuracy (average of 47%), not sufficient for automation; rules were human-readable, which improved user understanding.
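The information gain criterion that ID3 uses to choose a feature at each node of a decision tree can be sketched as below. This is a generic illustration of the criterion, not CAP’s code; the feature names, labels and example meetings are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(examples, feature, label="label"):
    """Reduction in label entropy obtained by splitting the examples on one feature."""
    base = entropy([e[label] for e in examples])
    remainder = 0.0
    for value in set(e[feature] for e in examples):
        subset = [e[label] for e in examples if e[feature] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder

def best_feature(examples, features, label="label"):
    """ID3 splits on the feature with the highest information gain."""
    return max(features, key=lambda f: information_gain(examples, f, label))

# Hypothetical past meetings: the label is the room in which each was held.
examples = [
    {"role": "student", "day": "mon", "label": "office"},
    {"role": "student", "day": "tue", "label": "office"},
    {"role": "dean", "day": "mon", "label": "boardroom"},
    {"role": "dean", "day": "tue", "label": "boardroom"},
]
print(best_feature(examples, ["role", "day"]))
```

In this toy data the attendee’s role determines the room completely, while the day carries no information, so the tree would split on the role first. A rule such as “if role is dean then boardroom” can then be read directly off the tree, which is the kind of readability noted in the results above.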
Meeting scheduling agent [32] schedules meetings by learning repetitive actions the user performs. Memory-based reasoning and reinforcement learning are used. Users can give explicit feedback.
Results: Confidence for correct predictions settles at 0.8 to 1.0. Confidence for incorrect predictions settles at 0 to 0.2. Some rogue confidence values remain after settling time.
Haynes’ [19] meeting scheduler assigns an agent to each user, who then programs it with their personal preferences. The agents then negotiate meeting times with each other.
5.1.7 News filtering domain
ANATAGONOMY [69] is based on the Krakatoa Chronicle, providing a personalized newspaper. Implicit feedback from user activity has been added.
Results: 1-10% error after 3 days settling time.
Butterfly [29] finds interesting conversations within Usenet newsgroups. The user initially provides keywords, and term frequency similarity between newsgroups and the user’s profile is computed.
IAN [17] filters Usenet news, taking relevance feedback from the user. C4.5 rule induction with TF keyword selection (low entropy words being removed) is compared to IBPL (as used in MAGI).
Results: accuracy – C4.5 broad topics 70%, narrow topics 25-30%; IBPL broad topics 59-65%, narrow topics 40-45%.
The Krakatoa Chronicle [22] is a personalized newspaper which adapts to its users’ preferences. User reading is monitored and relevance feedback accepted. The SMART algorithm is used, with TF-IDF, to represent articles and compute similarity.
NewsDude [6] reads interesting news articles via a speech interface. The news source is Yahoo! News, with an initial training set of interesting news articles provided by the user. Length of listening time provides implicit user feedback on articles read out. A short-term user model is based on TF-IDF (cosine similarity), and long-term model based on a naïve Bayes classifier (multi-variate Bernoulli formulation).
Results: Accuracy 60-76% (using hybrid of long and shor<