Robot Task Planning and Situation Handling in Open Worlds

Yan Ding1, Xiaohan Zhang1, Saeid Amiri1, Nieqing Cao1, Hao Yang2, Chad Esselink2, Shiqi Zhang1
Abstract: Automated task planning algorithms have been
developed to help robots complete complex tasks that require
multiple actions. Most of those algorithms have been developed
for “closed worlds,” assuming that complete world knowledge is
provided. However, the real world is generally open, and robots
frequently encounter unforeseen situations that can potentially
break the planner’s completeness. This paper introduces a
novel algorithm (COWP) for open-world task planning and
situation handling that dynamically augments the robot’s action
knowledge with task-oriented common sense. In particular,
common sense is extracted from Large Language Models based
on the current task at hand and robot skills. For systematic
evaluations, we collected a dataset that includes 561 execution-
time situations in a dining domain, where each situation
corresponds to a state instance of a robot being potentially
unable to complete a task using a solution that normally works.
Experimental results show that our approach significantly
outperforms competitive baselines from the literature in the
success rate of service tasks. Additionally, we have demonstrated
COWP using a mobile manipulator. The project website is
available at: https://cowplanning.github.io/, where
a more detailed version can also be found. This version has
been accepted for publication in Autonomous Robots.
I. INTRODUCTION
Robots that operate in the real world frequently encounter
complex tasks that require multiple actions. Automated task
planning algorithms have been developed to help robots
sequence actions to accomplish those tasks [1]. The closed world
assumption (CWA), developed by the knowledge representation
community, states that
“statements that are true are also known to be true” [2]. Most
current task planners have been developed for closed worlds,
assuming complete domain knowledge is provided and one
can enumerate all possible world states [3]–[6]. However,
the real world is “open” by nature, and unforeseen situations
are common in practice [7]. As a consequence, current
automated task planners tend to be fragile in open worlds rife
with such situations. Fig. 1 shows an example situation: aiming
to grasp a cup for drinking water, a robot found that the cup
was occupied by forks and knives. Although one can name
many such situations, it is impossible to provide a complete
list of them. To this end, researchers have developed open-
world planning methods for robust task completions in real-
world scenarios [7]–[12].
1Yan Ding, Xiaohan Zhang, Saeid Amiri, Nieqing Cao, and Shiqi Zhang
are with SUNY Binghamton. {yding25, xzhan244, samiri1,
ncao1, zhangs}@binghamton.edu
2Hao Yang and Chad Esselink are with Ford Motor Company.
{hyang1, cesselin}@ford.com
[Figure 1: robot platform with labeled components: wrist camera, gripper, UR5e arm, Segway base, CPU, battery]
Fig. 1. An illustrative example of a situation in the real world, encountered
during the execution of the plan “delivering a cup to a human for drinking
water.” The robot approached a cabinet in a kitchen room on which a cup
was located. The robot then found the cup to be delivered. Before grasping it,
however, the robot detected a situation that the cup was occupied with a fork,
a knife, and a spoon. This situation prevented the robot from performing
the current action (i.e., grasping) and rendered normal solutions for drinking
water invalid. A mobile service robot with a UR5e arm on a Segway RMP-110
base was used for demonstrations in this work.
There are mainly two ways of performing open-world
planning without humans in the loop in the literature. One
relies on dynamically building an external knowledge base to
assist a pre-defined task planner, where this knowledge base
is usually constructed in an automatic way using external
information [7]–[9]. Such external knowledge bases are
considered bounded due to their representation and knowledge
source, which limits the “openness” of their task planners.
Another (more recent) way of building open-world planners
is to leverage Large Language Models (LLMs) [13]. LLMs
have significantly improved the performance of downstream
language tasks in recent years [14]–[17]. Recent research
has demonstrated that those LLMs contain a wealth of
commonsense knowledge [12], [18]–[20]. While extracting
common sense from LLMs for task planning [10]–[12], [21]
is an intuitive idea, there is a fundamental challenge
for robots to “ground” domain-independent commonsense
knowledge [22] to specific domains that feature
many domain-dependent constraints. We propose to acquire
common sense from LLMs, and aim to improve the task
completion and situation handling skills of service robots.
In this paper, we develop a robot task planning framework,
called Common sense-based Open-World Planning (COWP),
that uses an LLM (GPT-3 in our case [14]) for dynamically
augmenting automated task planners with external
task-oriented common sense. COWP is based on classical
planning and leverages LLMs to augment action knowledge
(action preconditions and effects) for task planning and
situation handling. The main contribution of this work is
a novel integration of a pre-trained LLM with a knowledge-based
task planner. Inheriting the desirable features from
both sides, COWP is well grounded in specific domains while
embracing commonsense solutions at large.
arXiv:2210.01287v2 [cs.RO] 29 Sep 2024
For systematic evaluations, we have created a dataset
with 561 execution-time situations collected from a dining
domain [23]–[25] using a crowd-sourcing platform, where
each situation corresponds to an instance of a robot not being
able to perform a plan (that normally works). According to
experimental results, COWP performed significantly better
than three baselines selected from the literature [6], [8], [11]
in success rate. We implemented and demonstrated COWP
using a mobile manipulator.
II. BACKGROUND AND RELATED WORK
In this section, we first briefly discuss classical task
planning methods that are mostly developed under the closed
world assumption. We then summarize three families of
open-world task planning methods for robots, which are
grouped based on how unforeseen situations are addressed.
Classical Task Planning for Closed Worlds: The closed world
assumption (CWA) states that an agent is provided with
complete domain knowledge, and that all statements that are
true are known to be true by the agent [2]. Most automated
task planners have been developed under the CWA [1], [5],
[26]. Although robots face a real world that is open by
nature, their planning systems are frequently constructed
under the CWA [7], [27], [28]. The consequence is that
those robot planning systems are not robust to unforeseen
situations at execution time. In this paper, we aim to develop
a task planner that is aware of and able to handle unforeseen
situations in open-world scenarios.
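To make the assumption concrete, the following minimal Python sketch contrasts how the same query is answered under closed-world and open-world semantics. The fact strings, the `Truth` enum, and both function names are illustrative assumptions and do not come from any of the planners cited above.

```python
from enum import Enum


class Truth(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


def query_closed_world(known_true, fact):
    """Under the CWA, any fact that is not known to be true is taken as false."""
    return Truth.TRUE if fact in known_true else Truth.FALSE


def query_open_world(known_true, known_false, fact):
    """Without the CWA, a fact that was never observed stays unknown."""
    if fact in known_true:
        return Truth.TRUE
    if fact in known_false:
        return Truth.FALSE
    return Truth.UNKNOWN


# Hypothetical facts loosely modeled on the cup example in Fig. 1.
known_facts = {"on(cup, cabinet)", "at(robot, kitchen)"}
fact = "contains(cup, fork)"

print(query_closed_world(known_facts, fact))       # the planner assumes the cup is free
print(query_open_world(known_facts, set(), fact))  # the robot admits it does not know
```

A closed-world planner silently answers "false" for the cutlery query, which is exactly the kind of belief that an execution-time situation can contradict.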
Open-World Task Planning with Human in the Loop:
Task planning systems have been developed to acquire
knowledge via human-robot interaction to handle open-
world situations [29]–[31]. For instance, researchers created
a planning system that uses dialog systems to augment
their knowledge bases [29], whereas Amiri et al. (2019)
further modeled the noise in language understanding [30].
Tucker et al. (2020) enabled a mobile robot to ground new
concepts using visual-linguistic observations, e.g., to ground
the new word “box” given the command “move to the box”
by exploring the environment and hypothesizing potential
new objects from natural language [31]. The major difference
from those open-world planning methods is that COWP does
not require human involvement.
Open-World Task Planning with External Knowledge:
Some existing planning systems address unforeseen situations
by dynamically constructing an external knowledge
base for open-world reasoning. For instance, researchers have
developed object-centric planning algorithms that maintain a
database about objects and introduce new object concepts
and their properties (e.g., location) into their task planners
[8], [9]. In particular, Jiang et al. (2019) developed an
object-centric, open-world planning system that dynamically
introduces new object concepts through augmenting a local
knowledge base with external information [8]. In the work
of Hanheide et al. (2017), additional action effects and
assumptive actions were modeled as external knowledge
to explain the failure of task completion and compute plans
in open worlds [7]. A major difference from their methods is
that COWP employs an LLM that is capable of responding
to any situation, whereas the external knowledge sources of
those methods limit the openness of their systems.
Open-World Task Planning with LLMs: LLMs can encode
a large amount of common sense from text corpora [32] and
have been applied to robot systems to complete high-level
tasks. Example LLMs include BERT [33], GPT-2 [17], GPT-
3 [14], and OPT [15]. For example, Kant et al. (2022)
developed a household robot that uses a fine-tuned LLM
to reason about rearrangements of new objects [10]. Other
teams used LLMs to compute plans for high-level tasks
specified in natural language (e.g., “make breakfast”) by
sequencing actions [11], [12], [34]. Different from the
above-mentioned approaches, our system utilizes rule-based
action knowledge from human experts in addition to commonsense
knowledge extracted from LLMs. As a result, our
planning system can be better grounded to specific domains,
and is able to incorporate common sense to augment robot
capabilities supported by predefined skills.
III. ALGORITHM
In this section, we first provide a problem statement
and then present our open-world planning approach called
Common sense-based Open-World Planning (COWP).
A. Problem Description
A classical task planning problem is defined by a domain
description and a problem description. The domain
description includes a set of actions, and action preconditions
and effects. The problem description includes an initial state
and goal conditions. In this paper, such a classical planning
system is referred to as a closed-world task planner. In our
case, we provide the robot with a predefined closed-world
task planner (implemented using PDDL [35]), and an LLM
(GPT-3 in our case). PDDL, an action-centered language,
is designed to formalize Artificial Intelligence (AI) planning
problems, allowing for a more direct comparison of planning
algorithms and implementations [35]. A situation is defined
as an unforeseen world state that potentially prevents an
agent from completing a task using a solution that normally
works. The goal of an open-world planner is to compute
plans and handle situations towards completing service tasks
or reporting “no solution” as appropriate.
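The problem statement above can be illustrated with a minimal Python sketch of a closed-world planner. This is a stand-in written for illustration, not the PDDL planner used in the paper: the `Action` class, the `plan` function, and all predicate and action names (`cup_empty`, `grasp_cup`, and so on) are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset  # facts that must hold before execution
    add_effects: frozenset    # facts that become true afterwards
    del_effects: frozenset    # facts that become false afterwards


def plan(initial_state, goal, actions):
    """Breadth-first forward search; returns action names, or None for 'no solution'."""
    start = frozenset(initial_state)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:
            return steps
        for action in actions:
            if action.preconditions <= state:
                successor = (state - action.del_effects) | action.add_effects
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, steps + [action.name]))
    return None


# Domain description: actions with preconditions and effects (all hypothetical).
ACTIONS = [
    Action("grasp_cup", frozenset({"hand_empty", "cup_empty"}),
           frozenset({"holding_cup"}), frozenset({"hand_empty"})),
    Action("fill_cup", frozenset({"holding_cup"}),
           frozenset({"cup_has_water"}), frozenset()),
    Action("deliver_cup", frozenset({"holding_cup", "cup_has_water"}),
           frozenset({"human_has_water"}), frozenset({"holding_cup"})),
]

# Problem description: an initial state and goal conditions.
GOAL = frozenset({"human_has_water"})
normal_plan = plan({"hand_empty", "cup_empty"}, GOAL, ACTIONS)

# A situation: the cup is occupied, so "cup_empty" no longer holds.
situation_plan = plan({"hand_empty"}, GOAL, ACTIONS)

print(normal_plan)
print(situation_plan)
```

In the normal case the search returns a three-step plan; once the situation removes `cup_empty` from the initial state, no action is applicable and the planner reports no solution. This is the gap that an open-world planner is meant to close.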
B. Algorithm Description
Fig. 2 illustrates the three major components (yellow
boxes) of our COWP framework. Task Planner computes a plan
under the closed-world assumption
and is provided as prior knowledge in this work. Plan
Monitor evaluates the overall feasibility of the current plan
using common sense. Knowledge Acquirer acquires
common sense to augment the robot’s action effects when
the task planner generates no plan.
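One way the three components might fit together is sketched below in Python. The control flow is an assumption made for illustration and is not a transcription of Algorithm 1. Both LLM queries are replaced by hard-coded lookups, and every function, fact, and action name is hypothetical.

```python
from collections import deque


def toy_planner(facts, goal, actions):
    """Task Planner stand-in: breadth-first search over sets of facts.
    Each action is (name, preconditions, add_effects); returns None if no plan."""
    start = frozenset(facts)
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:
            return steps
        for name, pre, add in actions:
            successor = state | add
            if pre <= state and successor not in seen:
                seen.add(successor)
                frontier.append((successor, steps + [name]))
    return None


def toy_monitor(observed_situation, plan):
    """Plan Monitor stand-in: a lookup table plays the role of the LLM query
    and returns the facts that the observed situation invalidates."""
    invalidated_by = {"cup holds cutlery": {"cup_usable"}}
    return invalidated_by.get(observed_situation, set()) if plan else set()


def toy_acquirer(facts):
    """Knowledge Acquirer stand-in: returns extra action knowledge that an LLM
    might suggest, here that a bowl can also be used to serve water."""
    if "bowl_usable" in facts:
        return [("serve_water_in_bowl", frozenset({"bowl_usable"}),
                 frozenset({"human_has_water"}))]
    return []


def open_world_plan(facts, goal, actions, observed_situation=None):
    facts, actions = set(facts), list(actions)
    plan = toy_planner(facts, goal, actions)
    # Monitor: drop facts that the situation makes false, then replan.
    invalid = toy_monitor(observed_situation, plan)
    if invalid:
        facts -= invalid
        plan = toy_planner(facts, goal, actions)
    # Acquirer: consulted only when the planner generates no plan.
    if plan is None:
        actions += toy_acquirer(facts)
        plan = toy_planner(facts, goal, actions)
    return plan if plan is not None else "no solution"


ACTIONS = [("serve_water_in_cup", frozenset({"cup_usable"}),
            frozenset({"human_has_water"}))]
GOAL = frozenset({"human_has_water"})

print(open_world_plan({"cup_usable", "bowl_usable"}, GOAL, ACTIONS))
print(open_world_plan({"cup_usable", "bowl_usable"}, GOAL, ACTIONS, "cup holds cutlery"))
print(open_world_plan({"cup_usable"}, GOAL, ACTIONS, "cup holds cutlery"))
```

The three calls cover the normal case, a situation that is resolved with acquired knowledge, and a situation for which the sketch reports "no solution". Keeping the planner, monitor, and acquirer as separate functions mirrors the component boundaries described above, so any one of them could be swapped for a real planner or a real LLM query.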
Algorithm 1 describes how the components of COWP
interact with each other. Initially, Task Planner generates a