Value Driven Landmarks for Oversubscription Planning
Frontier Search and Plan Reconstruction in Oversubscription Planning
Oversubscription planning (OSP) (Smith 2004) is the problem of choosing an action sequence that reaches a state with high utility, given a budget on total action cost. This formulation makes it possible to handle situations with under-constrained resources, in which not all possible goal propositions can be achieved. In optimal OSP, the task is further constrained to finding a path that reaches a state with maximal utility. Best-First Branch-and-Bound (BFBB) is a heuristic search algorithm widely used for solving OSP problems. BFBB relies on an admissible, budget-aware utility upper-bounding heuristic function $h : S \times \mathbb{R}^{0+} \to \mathbb{R}$ to estimate the true utility $h^*(s, b)$. An incremental BFBB search algorithm with landmark-based approximations (inc-compile-and-BFBB) was proposed for OSP heuristic search (Domshlak and Mirkis 2015) to address tasks with non-negative and 0-binary utility functions. inc-compile-and-BFBB maintains the best solution so far and a set of reference states, extended with all the non-redundant value-carrying states discovered during the search. Each iteration requires restarting the search in order to exploit the new information obtained along the way. Recent work presented a relative estimation of achievements with value driven landmarks (Muller and Karpas 2018a) that addresses arbitrary additive utility functions and incrementally improves the best solution so far, eliminating the need to maintain a set of reference states. This paper proposes a progressive frontier search algorithm that reduces the computational cost of restarting the search once new information is acquired. Our technique allows a new search iteration to continue from any state on the frontier of the previous search iteration, which improves the efficiency of the search. An extended version of this abstract is available online (Muller and Karpas 2018b).
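To make the role of the budget-aware upper bound concrete, the following is a minimal sketch of a generic best-first branch-and-bound loop on a toy, delete-free OSP task. It is not the inc-compile-and-BFBB procedure or the progressive frontier search described above. The task encoding, the names `bfbb` and `make_upper_bound`, and the simple bound (current utility plus the utility of every proposition that still has an affordable achiever) are all assumptions chosen for illustration, and the sketch assumes non-negative additive utilities.

```python
import heapq
from itertools import count

def make_upper_bound(actions, prop_utility):
    """Build a simple admissible bound h(state, remaining_budget).

    Assumption: utilities are non-negative and additive over propositions.
    A missing proposition can only be gained if some action adding it
    costs no more than the remaining budget.
    """
    cheapest = {}
    for _, _, add, cost in actions:
        for p in add:
            cheapest[p] = min(cost, cheapest.get(p, cost))

    def h(state, remaining):
        bound = sum(prop_utility.get(p, 0) for p in state)
        bound += sum(u for p, u in prop_utility.items()
                     if p not in state
                     and cheapest.get(p, float("inf")) <= remaining)
        return bound
    return h

def bfbb(initial, actions, prop_utility, budget):
    """Best-first branch-and-bound over (state, remaining budget) nodes."""
    h = make_upper_bound(actions, prop_utility)
    utility = lambda s: sum(prop_utility.get(p, 0) for p in s)
    best_value, best_plan = utility(initial), []
    tie = count()  # tie-breaker so the heap never compares states
    frontier = [(-h(initial, budget), next(tie), initial, budget, [])]
    best_budget = {initial: budget}  # largest budget seen per state
    while frontier:
        neg_bound, _, state, remaining, plan = heapq.heappop(frontier)
        if -neg_bound <= best_value:
            break  # no open node can beat the incumbent
        if best_budget[state] > remaining:
            continue  # a cheaper path to this state was found later
        for name, pre, add, cost in actions:
            if pre <= state and cost <= remaining:
                succ, succ_budget = state | add, remaining - cost
                if best_budget.get(succ, -1) >= succ_budget:
                    continue  # dominated duplicate
                best_budget[succ] = succ_budget
                succ_plan = plan + [name]
                if utility(succ) > best_value:
                    best_value, best_plan = utility(succ), succ_plan
                bound = h(succ, succ_budget)
                if bound > best_value:  # prune nodes that cannot improve
                    heapq.heappush(
                        frontier,
                        (-bound, next(tie), succ, succ_budget, succ_plan))
    return best_value, best_plan

# Hypothetical toy task: (name, preconditions, add effects, cost).
actions = [
    ("get_a", frozenset(), frozenset({"a"}), 2),
    ("get_b", frozenset({"a"}), frozenset({"b"}), 3),
    ("get_c", frozenset(), frozenset({"c"}), 4),
]
prop_utility = {"a": 3, "b": 5, "c": 4}
best_value, best_plan = bfbb(frozenset(), actions, prop_utility, budget=5)
print(best_value, best_plan)  # 8 ['get_a', 'get_b']
```

With a budget of 5, the sketch cannot afford all three propositions, so it settles for `a` and `b`. The incumbent value is what allows nodes whose bound cannot exceed it to be discarded without expansion.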
Relative Net Utility and the Saint Petersburg Paradox
The famous St Petersburg paradox shows that the theory of expected value does not capture the real-world economics of decision-making problems. Over the years, many economic theories have been developed to resolve the paradox and to explain the subjective utility of expected outcomes and risk aversion. In this paper, we use the concept of net utility to resolve the St Petersburg paradox. The principle of absolute, rather than net, utility fails because it is a first-order approximation of some unknown utility function. Because the net utility concept is able to explain both behavioral economics and the St Petersburg paradox, it is deemed a universal approach to handling utility. Finally, this paper explores how an artificially intelligent (AI) agent makes choices and observes that an AI agent using the nominal utility approach sees an infinite reward, whereas one using the net utility approach sees the limited reward that human beings see.
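The divergence behind the paradox can be checked numerically. The sketch below assumes the standard formulation of the game (a fair coin is tossed until the first head, and a head on toss k pays 2^k) and truncates the game after n tosses. The abstract does not define net utility, so the bounded alternative shown here is Bernoulli's classical logarithmic utility, used only as a stand-in to illustrate a finite valuation; the function names are hypothetical.

```python
import math
from fractions import Fraction

def truncated_expected_payoff(n):
    """Expected nominal payoff when the game is cut off after n tosses.

    Each term is (1/2**k) * 2**k = 1, so the sum equals n and grows
    without bound as n increases.
    """
    return sum(Fraction(1, 2**k) * 2**k for k in range(1, n + 1))

def truncated_expected_log_utility(n):
    """Expected natural-log utility of the payoff, cut off after n tosses.

    Terms are k * ln(2) / 2**k, which sum to 2 * ln(2) in the limit.
    """
    return sum(math.log(2**k) / 2**k for k in range(1, n + 1))

for n in (10, 20, 40):
    print(n, truncated_expected_payoff(n), truncated_expected_log_utility(n))
```

The nominal expectation equals the truncation depth, so an agent maximizing it sees an unbounded reward, while the logarithmic valuation settles near 2 ln 2.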
Automated Tactical Decision Planning Model with Strategic Values Guidance for Local Action-Value-Ambiguity
In many real-world planning problems, an action's impact differs with the place, time, and context in which the action is applied. The same action with the same effects can cause a different change when applied in a different context or state. For actions with an incomplete precondition list, which are applicable in several states and circumstances, ambiguity regarding the impact of the action is challenging even in small domains. Evaluating the effect list alone is not enough to estimate the real impact of an action; a relative estimation is more informative and better suited to this purpose. Recent work on oversubscription planning (OSP) defined the net utility of an action as the net change in the state's value caused by the action. This notion offers a broader perspective on the value impact of actions and supports a more accurate evaluation of an action's achievements, taking inter-state and intra-state dependencies into account. Achieving value-rational decisions in a complex reality often requires strategic, high-level planning with a global perspective and values, while many local tactical decisions require real-time information to estimate the impact of actions. This paper proposes an offline action-value structure analysis that exploits the compactly represented informativeness of the net utility of actions to extend the scope of planning to scenarios with value uncertainty and to provide a real-time, value-rational decision planning tool. The result of the offline pre-processing phase is a compact decision planning model representation for flexible, local reasoning about the net utility of actions with (offline) value ambiguity. The resulting flexibility benefits the online planning phase and the real-time execution of actions with value ambiguity. Our empirical evaluation shows the effectiveness of this approach in domains with value ambiguity in their action-value structure.
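The definition of net utility as the net change in the state's value can be illustrated with a small sketch. The state encoding, the proposition names, the additive value table, and the function names below are assumptions made for illustration, not the model proposed in the paper.

```python
# Hypothetical additive values of propositions.
PROP_VALUE = {"have_water": 5, "have_fuel": 3}

def state_value(state):
    """Value of a state as the sum of the values of its propositions."""
    return sum(PROP_VALUE.get(p, 0) for p in state)

def apply_action(state, action):
    """Apply delete effects first, then add effects."""
    return (state - action["delete"]) | action["add"]

def net_utility(state, action):
    """Net change in state value caused by applying the action."""
    return state_value(apply_action(state, action)) - state_value(state)

# The effect list is identical in every state; the net utility is not.
trade = {"add": frozenset({"have_water"}), "delete": frozenset({"have_fuel"})}

print(net_utility(frozenset({"have_fuel"}), trade))                # 2
print(net_utility(frozenset({"have_fuel", "have_water"}), trade))  # -3
print(net_utility(frozenset(), trade))                             # 5
```

The same action is worth 2, -3, or 5 depending on the state it is applied in, which is the kind of context dependence that an evaluation of the effect list alone would miss.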
Economics of Human-AI Ecosystem: Value Bias and Lost Utility in Multi-Dimensional Gaps
In recent years, artificial intelligence (AI) decision-making and autonomous systems have become an integrated part of the economy, industry, and society. The evolving economy of the human-AI ecosystem raises concerns regarding the risks and values inherited in AI systems. This paper investigates the dynamics of the creation and exchange of values and points out gaps in the perception of the cost-value, knowledge, space, and time dimensions. It shows aspects of value bias in the human perception of achievements and costs that are encoded in AI systems. It also proposes rethinking hard goal definitions and cost-optimal problem-solving principles through the lens of effectiveness and efficiency in the development of trusted machines. The paper suggests a value-driven, cost-aware strategy and principles for problem-solving and for planning effective research progress to address real-world problems that involve diverse forms of achievements, investments, and survival scenarios.