Privacy preserving planning in multi-agent stochastic environments
| Published in: | Autonomous agents and multi-agent systems Vol. 36; no. 1 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: Springer US, 01.04.2022 (Springer Nature B.V) |
| Subjects: | |
| ISSN: | 1387-2532, 1573-7454 |
| Summary: | Collaborative privacy preserving planning (CPPP) has gained much attention in the past decade. CPPP aims to create solutions for multi-agent planning problems where cooperation is required to achieve an efficient solution, without exposing information that an agent considers private in the process. To date, CPPP has focused on domains with deterministic action effects. However, in real-world problems action effects are often non-deterministic, and actions can have multiple possible effects with varying probabilities. In this paper, we introduce Stochastic CPPP (SCPPP), an extension of CPPP to domains with stochastic action effects. We show how SCPPP can be modeled as a Markov decision process (MDP) and how the value-iteration algorithm can be adapted to solve it. This adaptation requires extending value-iteration to support multiple agents and privacy. We then present two adaptations of the real-time dynamic programming (RTDP) algorithm, a popular algorithm for solving MDPs, designed to solve SCPPP problems. The first adaptation, distributed RTDP (DRTDP), yields behavior identical to applying RTDP in a centralized manner to the joint problem. To preserve privacy, DRTDP uses a message-passing mechanism adopted from the MAFS algorithm. The second adaptation, public synchronization RTDP (PS-RTDP), is an approximation of DRTDP. PS-RTDP differs from DRTDP mainly in its message-passing mechanism, sending significantly fewer messages than DRTDP. We experimented on domains adapted from the deterministic CPPP literature by adding different stochastic effects to different actions. The results show that PS-RTDP can reduce the number of messages compared to DRTDP by orders of magnitude, thus improving run-time, while producing policies with similar expected costs. |
|---|---|
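The summary's starting point is modeling the planning problem as an MDP and solving it with value iteration. As a hedged illustration of that single-agent building block only (the paper's actual contribution is its multi-agent, privacy-preserving extension, which is not shown here), the following sketch runs standard value iteration on a small, entirely hypothetical cost-based MDP with stochastic action effects:

```python
# Minimal sketch: classic value iteration on a tiny, hypothetical MDP
# with stochastic action effects and expected-cost minimization.
# States, actions, transitions, and costs below are invented for
# illustration; they do not come from the paper.

STATES = [0, 1, 2]        # state 2 is an absorbing goal state
ACTIONS = ["a", "b"]

# TRANSITION[s][a] = list of (next_state, probability):
# one action, multiple possible effects with varying probabilities.
TRANSITION = {
    0: {"a": [(1, 0.8), (0, 0.2)], "b": [(2, 0.5), (0, 0.5)]},
    1: {"a": [(2, 0.9), (1, 0.1)], "b": [(0, 1.0)]},
    2: {"a": [(2, 1.0)], "b": [(2, 1.0)]},
}
COST = {0: 1.0, 1: 1.0, 2: 0.0}   # unit step cost outside the goal

def value_iteration(epsilon=1e-6):
    """Repeat Bellman backups until the value function converges."""
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            # Q(s,a) = immediate cost + expected value of successors
            q = [COST[s] + sum(p * V[t] for t, p in TRANSITION[s][a])
                 for a in ACTIONS]
            best = min(q)             # minimize expected cost
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < epsilon:
            return V

V = value_iteration()
# Greedy policy extraction from the converged value function.
policy = {
    s: min(ACTIONS,
           key=lambda a: COST[s] + sum(p * V[t] for t, p in TRANSITION[s][a]))
    for s in STATES
}
```

In this toy instance the backups converge to V(0) = 2 and V(1) = 10/9, with the greedy policy taking "b" in state 0 and "a" in state 1. RTDP, which the paper adapts next, replaces the full state sweep with trial-based backups along sampled trajectories.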
| Bibliography: | ObjectType-Article-1, SourceType-Scholarly Journals-1, ObjectType-Feature-2 |
|---|---|
| ISSN: | 1387-2532, 1573-7454 |
| DOI: | 10.1007/s10458-022-09554-w |