1 Introduction
Logic-based languages are well-suited to model complex domains since they allow the representation of complex relations among the involved objects. Probabilistic logic languages, such as ProbLog (De Raedt et al. Reference De Raedt, Kimmig and Toivonen2007) and Probabilistic Answer Set Programming under the credal semantics (Cozman and Mauá Reference Cozman and Mauá2020), represent uncertain data with probabilistic facts (Sato Reference Sato1995), that is, facts with an associated probability. The former is based on the Prolog language (Lloyd Reference Lloyd1987), while the latter adopts Answer Set Programming (ASP) (Brewka et al. Reference Brewka, Eiter and Truszczyński2011). ASP has proved effective in representing hard combinatorial tasks, thanks to expressive constructs such as aggregates (Alviano and Faber Reference Alviano and Faber2018). Moreover, several extensions have been proposed for managing uncertainty, also through weighted rules (Lee and Wang Reference Lee and Wang2016), further broadening the possible application scenarios.
Usually, decision theory (DT) problems are composed of a set of possible actions, a selection of which defines a strategy, and a set of utility attributes, which indicate the utility (i.e., a reward, possibly negative) of completing a particular task. The goal is to find the strategy that optimizes the overall expected utility. Expressing DT problems with (probabilistic) logic languages enables users to identify the best action in uncertain and complex domains. While DTProbLog (Van den Broeck et al. Reference Van den Broeck, Thon, van Otterlo and De Raedt2010) is a ProbLog extension that solves DT tasks represented with a ProbLog program, no tool, to the best of our knowledge, is available to solve them with a probabilistic answer set language. We believe that a (probabilistic) ASP-based tool, by providing expressive syntactic constructs such as aggregates and choice rules, would be of great support in complex environments. For example, we may be interested in modeling a viral marketing scenario, where the goal is to select a set of people to target with specific ads to maximize sales. In this scenario, uncertainty could come from the shopping behavior of individuals.
In this paper, we close this gap and introduce decision theory problems in Probabilistic Answer Set Programming under the credal semantics (DTPASP). In particular, we extend Probabilistic Answer Set Programming under the credal semantics (PASP) with decision atoms and utility attributes (Van den Broeck et al. Reference Van den Broeck, Thon, van Otterlo and De Raedt2010). Every subset of decision atoms defines a different strategy, that is, a different set of actions that can be performed in the domain of interest. In the viral marketing example, the decisions are whether to target each individual with an ad. However, there is uncertainty about the actual effectiveness of the targeting action. At the same time, targeting a person involves a cost, which can be represented with a utility attribute with a negative reward. The fact that a person buys a product, instead, is associated with a positive reward. Moreover, there can be other factors to consider, such as preferences among the different items that can be bought, which can be conveniently represented using ASP.
In PASP, since each possible world can have multiple models, queries are associated with probability ranges indicated by a lower and an upper probability, instead of point values. Similarly, in DTPASP, we need to consider lower and upper rewards, and look for the strategies that maximize either of the two. So, a solution of a decision theory problem expressed with DTPASP is composed of two strategies.
We have developed two algorithms to solve the task of finding the strategy that maximizes the lower expected reward and the strategy that maximizes the upper expected reward, together with the values of these rewards. The first algorithm is based on answer set enumeration. It can be adopted only on small domains, so we consider it a baseline. We propose a second algorithm, based on three layers of Algebraic Model Counting (AMC) (Kimmig et al. Reference Kimmig, Van den Broeck and De Raedt2017), that adopts knowledge compilation (Darwiche and Marquis Reference Darwiche and Marquis2002) to speed up the inference process. Empirical results show that the latter algorithm is significantly faster than the one based on enumeration and can handle domains of non-trivial size.
The paper is structured as follows: in Section 2 we briefly discuss background knowledge. Section 3 discusses DTProbLog, a framework to solve decision theory problems in Probabilistic Logic Programming. Section 4 illustrates the task of Second Level Algebraic Model Counting. Section 5 introduces DTPASP together with the optimization task to solve. In Section 6 we describe the algorithms to solve the task, which we test against the baseline in Section 7. Section 8 surveys related work and Section 9 concludes the paper.
2 Background
This section introduces the basic concepts of Answer Set Programming and Probabilistic Answer Set Programming.
2.1 Answer set programming
In the following, we will use the verbatim font to denote code that can be executed with a standard ASP solver. Here we consider a subset of ASP (Brewka et al. Reference Brewka, Eiter and Truszczyński2011). An ASP program (or simply an ASP) is a finite set of disjunctive rules. A disjunctive rule (or simply rule) is of the form $h_1 ; \dots ; h_m \,\,{:\!-}\ b_1, \dots, b_n$
where every $h_i$ is an atom and every $b_i$ is a literal. We consider only safe rules, where every variable in the head also appears in a positive literal in the body; this is a standard requirement in ASP. If the head is empty, the rule is called a constraint; if the body is empty and there is only one atom in the head, the rule is called a fact; and if there is only one atom in the head with one or more literals in the body, the rule is called normal. A choice rule is of the form $0\{a\}1 \,\,{:\!-}\ b_1,\dots, b_n$ and indicates that the atom $a$ can be selected or not if the body is true. Usually, we omit 0 and 1 and consider them implicit. ASP allows the use of aggregate atoms (Alviano and Faber Reference Alviano and Faber2018) in the body. We consider aggregates of the form $\# \varphi \{\epsilon _0 ; \dots ; \epsilon _n\} \ \delta g$ where $g$ is called the guard and can be either a constant or a variable, $\delta$ is a comparison arithmetic operator, $\varphi$ is an aggregate function symbol, and $\epsilon _0, \dots, \epsilon _n$ is a set of expressions where each $\epsilon _i$ has the form $t_1, \dots, t_n : F$ and each $t_i$ is a term whose variables appear in the conjunction of literals $F$. An example of an aggregate atom is $\#count\{A : p(A)\} = 2$, which is true if the number of ground substitutions for $A$ that make $p(A)$ true is $2$. Here, $2$ is the guard and $\#count$ is the aggregate function symbol. The primal graph of a ground answer set program $\mathcal {P}$ has one vertex for each atom appearing in $\mathcal {P}$ and an undirected edge between two vertices if the corresponding atoms appear simultaneously in at least one rule of $\mathcal {P}$.
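The primal graph can be built directly from the rules of a ground program; the following minimal Python sketch (the representation of a rule as the set of atoms occurring in it is our own) illustrates the definition:

```python
from itertools import combinations

def primal_graph(rules):
    """Primal graph of a ground program: one vertex per atom, an edge
    between two atoms whenever they occur together in some rule.

    Each rule is given simply as the set of atoms occurring in it
    (head and body together). Returns (vertices, edges)."""
    vertices = set().union(*rules) if rules else set()
    edges = set()
    for atoms in rules:
        # atoms appearing together in one rule are pairwise connected
        for a, b in combinations(sorted(atoms), 2):
            edges.add((a, b))
    return vertices, edges

# ground rules:  qr :- a.   qr ; nqr :- b.
rules = [{"qr", "a"}, {"qr", "nqr", "b"}]
V, E = primal_graph(rules)
```

Note that `a` and `b` never occur in the same rule, so they are not adjacent in the primal graph.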
The semantics of ASP is based on the concept of answer set, also called stable model. With $B_{\mathcal {P}}$ we denote the Herbrand base of an answer set program $\mathcal {P}$, that is, the set of ground atoms that can be constructed with the symbols in $\mathcal {P}$. A variable is called local to an aggregate if it appears only in the considered aggregate; if instead it occurs in at least one literal not involved in aggregations, it is called global. The grounding of a rule with aggregates proceeds in two steps: first, global variables are replaced with ground terms; then, local variables appearing in aggregates are replaced with ground terms. An interpretation $I$ of $\mathcal {P}$ is a subset of $B_{\mathcal {P}}$. An aggregate is true in an interpretation $I$ if the evaluation of the aggregate function under $I$ satisfies the guard. An interpretation satisfies a ground rule if at least one of the $h_i$s is true in it whenever all the $b_i$s are true in it. A model of $\mathcal {P}$ is an interpretation that satisfies all the groundings of all the rules of $\mathcal {P}$. Given a ground program $\mathcal {P}_g$ and an interpretation $I$, by removing from $\mathcal {P}_g$ all the rules where at least one of the $b_i$s is false in $I$ we get the reduct (Faber et al. Reference Faber, Leone and Pfeifer2004) of $\mathcal {P}_g$ with respect to $I$. An answer set (or stable model) of a program $\mathcal {P}$ is an interpretation that is a minimal (under set inclusion) model of the reduct of the grounding $\mathcal {P}_g$ of $\mathcal {P}$. We indicate with $AS(\mathcal {P})$ the set of all the answer sets of a program $\mathcal {P}$. An ASP usually has multiple answer sets. However, we may be interested only in the projective solutions (Gebser et al. Reference Gebser, Kaufmann and Schaub2009) onto a set of ground atoms $B$, which are given by the set $AS_{B}(\mathcal {P}) = \{A \cap B \mid A \in AS(\mathcal {P})\}$.
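These definitions can be made concrete with a naive Python enumerator for small ground programs. The sketch below is our own toy representation (a rule is a pair of a head-atom list and a body-literal list, with a negative literal written `("not", atom)`) and is in no way how actual ASP solvers work:

```python
from itertools import combinations

def true_lit(lit, I):
    # a negative literal is a pair ("not", atom); a positive one is a plain atom
    if isinstance(lit, tuple) and lit[0] == "not":
        return lit[1] not in I
    return lit in I

def satisfies(I, rule):
    head, body = rule
    # the rule is satisfied if some head atom is true or the body is false
    return any(h in I for h in head) or not all(true_lit(b, I) for b in body)

def reduct(rules, I):
    # keep the rules whose body is true in I (Faber et al. 2004)
    return [r for r in rules if all(true_lit(b, I) for b in r[1])]

def answer_sets(atoms, rules):
    interps = [set(c) for k in range(len(atoms) + 1) for c in combinations(atoms, k)]
    stable = []
    for I in interps:
        red = reduct(rules, I)
        is_model = all(satisfies(I, r) for r in red)
        # minimality: no strict subset of I is also a model of the reduct
        minimal = not any(J < I and all(satisfies(J, r) for r in red) for J in interps)
        if is_model and minimal:
            stable.append(I)
    return stable

# ground program:  b.   qr :- a.   qr ; nqr :- b.
rules = [(["b"], []), (["qr"], ["a"]), (["qr", "nqr"], ["b"])]
```

On this small disjunctive program the enumeration returns the two stable models $\{b,qr\}$ and $\{b,nqr\}$.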
To clarify these concepts, consider the following example:
Example 1 The following program $\mathcal {P}$ has, respectively, two choice rules, a normal rule, and a disjunctive rule.
The program has 5 answer sets: $AS(\mathcal {P}) = \{\{\}, \{a, qr\}, \{b, qr\}, \{b, qr, a\}, \{b, nqr\}\}$ . If we project the solutions onto the $qr$ and $a$ atoms we get 3 answer sets: $\{\{\}, \{a, qr\}, \{qr\}\}$ .
2.2 Probabilistic answer set programming (PASP)
PASP extends ASP by representing uncertainty with weights (Lee and Wang Reference Lee and Wang2016) or probabilities (Cozman and Mauá Reference Cozman and Mauá2016) associated with facts. Here we consider PASP under the credal semantics (Cozman and Mauá Reference Cozman and Mauá2016). We will use the acronym PASP to also denote a probabilistic answer set program; the intended meaning will be clear from the context.
PASP allows probabilistic facts (De Raedt et al. Reference De Raedt, Kimmig and Toivonen2007) of the form $\Pi _i::f_i$ where $\Pi _i \in [0,1]$ and $f_i$ is an atom. We only consider ground probabilistic facts. Moreover, we require that probabilistic facts cannot appear in the head of rules, a property called the disjoint condition (Sato Reference Sato1995). Every possible subset of probabilistic facts (there are $2^n$ of them, where $n$ is the number of probabilistic facts) identifies a world $w$, that is, an ASP obtained by adding the atoms of the selected probabilistic facts to the rules of the program. Each world $w$ is assigned a probability computed as

$P(w) = \prod _{f_i \in w} \Pi _i \cdot \prod _{f_i \not \in w} (1 - \Pi _i). \qquad (1)$
With this setting, we have two levels to consider: at the first level, we need to consider the worlds, each with an associated probability. At the second level, for each world we have one or more answer sets. Since a world may have more than one model, in order to assign a probability to queries we need to decide how the probability mass of the world is distributed among its answer sets. We could choose a particular distribution for the answer sets of a world, such as a uniform distribution (Totis et al. Reference Totis, De Raedt and Kimmig2023) or one that maximizes the entropy (Kern-Isberner and Thimm Reference Kern-Isberner and Thimm2010). However, here we follow the more general path of the credal semantics, and we refrain from assuming a certain distribution for the answer sets of a world. This implies that queries are associated with probability ranges instead of point values. Under this semantics, the probability of a query $q$ (i.e., a conjunction of ground literals), $P(q)$, is associated with a probability range $[\underline {P}(q),\overline {P}(q)]$ where the lower and upper bounds are computed as:

$\underline {P}(q) = \sum _{w \, : \, \forall A \in AS(w), \, A \models q} P(w), \qquad \overline {P}(q) = \sum _{w \, : \, \exists A \in AS(w), \, A \models q} P(w). \qquad (2)$
In other words, the lower bound (or lower probability) $\underline {P}(q)$ is given by the sum of the probabilities of the worlds where the query is true in all answer sets and the upper bound (or upper probability) $\overline {P}(q)$ is given by the sum of the probabilities of the worlds where the query is true in at least one answer set. That is, a world contributes to both the lower and upper probability if the query is true in all of its answer sets while it contributes only to the upper probability if the query is true only in some of the answer sets. Note that $\underline {P}(q) = 1 - \overline {P}(not \ q)$ and $\overline {P}(q) = 1 - \underline {P}(not \ q)$ : this is true if every world has at least one answer set; otherwise, inconsistencies must be managed (see the discussion below). To clarify this, consider the following illustrative example:
Example 2 The following program has two probabilistic facts, $a$ and $b$ , with probabilities of 0.3 and 0.4, respectively.
There are $2^2 = 4$ worlds, listed in Table 1. Consider the query $qr$. Call $w_0$ the world where both $a$ and $b$ are false, $w_1$ the world where $a$ is true and $b$ is false, $w_2$ the world where $b$ is true and $a$ is false, and $w_3$ the world where both $a$ and $b$ are true. $P(w_0) = (1 - 0.3) \cdot (1 - 0.4) = 0.42$, $P(w_1) = 0.3 \cdot (1-0.4) = 0.18$, $P(w_2) = (1-0.3) \cdot 0.4 = 0.28$, and $P(w_3) = 0.3 \cdot 0.4 = 0.12$. Note that the sum of the probabilities of the worlds equals 1. For $w_0$, $AS(w_0) = \{\{\}\}$: the query is false (not present) in the answer set $\{\}$, so we have no contribution to either probability. For $w_1$, $AS(w_1) = \{\{a,qr\}\}$: the query is true in the only answer set, so we have a contribution of $P(w_1)$ to both the lower and the upper probability. For $w_2$, $AS(w_2) = \{\{b,qr\}, \{b,nqr\}\}$: the query is true in only one of the two answer sets, so we have a contribution of $P(w_2)$ only to the upper probability. For $w_3$, $AS(w_3) = \{\{a,b,qr\}\}$: the query is true in all the answer sets (there is only one), so we have a contribution of $P(w_3)$ to both the lower and the upper probability. Overall, $P(qr) = [P(w_1) + P(w_3), P(w_1) + P(w_2) + P(w_3)] = [0.3,0.58]$. Note that the set of all the answer sets of the worlds is the same as in Example 1 (5 answer sets in total).
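The numbers of Example 2 can be reproduced with a short Python enumeration; this is a naive sketch in which the worlds and their answer sets are hard-coded from the example, and truth of the query is simply membership of its atom:

```python
probs = {"a": 0.3, "b": 0.4}       # probabilities of the probabilistic facts

def world_prob(world):
    # product of Pi_i for the selected facts and (1 - Pi_i) for the others
    p = 1.0
    for f, pi in probs.items():
        p *= pi if f in world else (1 - pi)
    return p

# answer sets of each world, taken from Example 2
models_of = {
    frozenset(): [set()],                              # w0
    frozenset({"a"}): [{"a", "qr"}],                   # w1
    frozenset({"b"}): [{"b", "qr"}, {"b", "nqr"}],     # w2
    frozenset({"a", "b"}): [{"a", "b", "qr"}],         # w3
}

def bounds(query):
    lower = upper = 0.0
    for world, models in models_of.items():
        p = world_prob(world)
        if all(query in m for m in models):
            lower += p    # query true in every answer set of the world
        if any(query in m for m in models):
            upper += p    # query true in at least one answer set
    return round(lower, 10), round(upper, 10)
```

Evaluating `bounds("qr")` yields the range $[0.3, 0.58]$ reported above.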
The credal semantics requires that every world has at least one answer set, that is, that it is satisfiable. If this does not hold, the probability mass associated with the inconsistent worlds is lost, since it is considered neither in the formula for the lower nor in the formula for the upper probability (equation (2)). Let us denote with $P(\mathit {inc})$ the probability of the inconsistent worlds, computed as

$P(\mathit {inc}) = \sum _{w \, : \, AS(w) = \emptyset } P(w). \qquad (3)$
We consider an approach also adopted in smProbLog (Totis et al. Reference Totis, De Raedt and Kimmig2023): the probability of the inconsistent worlds, $P(\mathit {inc})$ , is treated as a third probability value in addition to the lower and upper probability. In this way, $\underline {P}(q) = 1 - \overline {P}(\mathit {not} \ q) - P(\mathit {inc})$ and $\overline {P}(q) = 1 - \underline {P}(\mathit {not} \ q) - P(\mathit {inc})$ . In this case, if all the worlds are inconsistent, both the lower and upper probability are 0. Let us clarify this with an example:
Example 3 Consider Example 2 with an additional constraint:
Here, the constraint makes the world $w_3$ of Table 1 inconsistent, as the answer set $\{a,b,qr\}$ violates the constraint. Thus $P(\mathit {inc}) = P(w_3) = 0.12$. Consider the query $qr$. We have $\underline {P}(qr) = P(w_1) = 0.18$, $\overline {P}(qr) = P(w_1) + P(w_2) = 0.18 + 0.28 = 0.46$, $\underline {P}(\mathit {not} \ qr) = P(w_0) = 0.42$, and $\overline {P}(\mathit {not} \ qr) = P(w_0) + P(w_2) = 0.7$. Now note that $\underline {P}(qr) = 1 - \overline {P}(\mathit {not} \ qr) - P(\mathit {inc}) = 1 - 0.7 - 0.12 = 0.18$.
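The treatment of inconsistent worlds can also be checked by enumeration. Whatever the exact constraint added in Example 3, its stated effect is that $w_3$ has no answer sets, and we encode exactly that:

```python
probs = {"a": 0.3, "b": 0.4}

def world_prob(world):
    p = 1.0
    for f, pi in probs.items():
        p *= pi if f in world else (1 - pi)
    return p

# answer sets of each world after adding the constraint: w3 becomes inconsistent
models_of = {
    frozenset(): [set()],
    frozenset({"a"}): [{"a", "qr"}],
    frozenset({"b"}): [{"b", "qr"}, {"b", "nqr"}],
    frozenset({"a", "b"}): [],        # no answer sets: inconsistent world
}

p_inc = sum(world_prob(w) for w, ms in models_of.items() if not ms)

def bounds(query):
    lower = upper = 0.0
    for world, models in models_of.items():
        if not models:
            continue    # inconsistent worlds contribute to neither bound
        p = world_prob(world)
        if all(query in m for m in models):
            lower += p
        if any(query in m for m in models):
            upper += p
    return round(lower, 10), round(upper, 10)
```

The sketch also confirms the adjusted duality: $1 - \overline{P}(qr) - P(\mathit{inc}) = 0.42 = \underline{P}(\mathit{not}\ qr)$.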
3 DTProbLog
DTProbLog extends the ProbLog language with a set $D$ of (possibly non-ground) decision atoms represented with the syntax $?::d$ where $d$ is an atom, and a set $U$ of utility attributes of the form $u \to r$, where $r \in \mathbb {R}$ is the reward obtained when the utility atom $u$ is satisfied. In the rest of the paper, we will use the terms utility and reward interchangeably and we will use the notation $utility(u,r)$ in code snippets to denote utility attributes. A set of decision atoms defines a strategy $\sigma$. There are $2^{|D|}$ possible strategies. A DTProbLog program is a tuple $(\mathcal {P},U,D)$ where $\mathcal {P}$ is a ProbLog program, $U$ is a set of utility attributes, and $D$ is a set of decision atoms. Given a strategy $\sigma$, adding all decision atoms from $\sigma$ to $\mathcal {P}$ yields a ProbLog program $\mathcal {P}_\sigma$. The utility of a strategy $\sigma$, $\mathit {Util}(\sigma )$, is given by:

$\mathit {Util}(\sigma ) = \sum _{w_\sigma } P(w_\sigma ) \cdot R(w_\sigma ) \qquad (4)$
where $R(w) = \sum _{(u \to r) \in U \, : \, w \models u} r$.
That is, the utility of a strategy $\sigma$ is the sum, over the worlds $w_\sigma$ of the ProbLog program $\mathcal {P}_\sigma$ identified by $\sigma$, of the probability of each world multiplied by its reward. The reward of a world $w$, $R(w)$, is computed as the sum of the rewards of the utility attributes true in $w$. Equivalently, the task can also be expressed as:

$\mathit {Util}(\sigma ) = \sum _{(u \to r) \in U} P_\sigma (u) \cdot r$
where $P_\sigma (w)$ is computed with equation (1) by considering the ProbLog program $\mathcal {P}_\sigma$. The goal of the decision theory task is to find the strategy that maximizes the utility, that is,

$\sigma ^* = \underset {\sigma }{\mathrm {argmax}} \ \mathit {Util}(\sigma ).$
To clarify, consider Example 4:
Example 4 The following DTProbLog program has two probabilistic facts, two decision atoms, and three utility attributes.
There are $2^2 = 4$ possible strategies that we indicate with $\sigma _{\emptyset }$ , $\sigma _{da}$ , $\sigma _{db}$ , and $\sigma _{dadb}$ , each one defining a different ProbLog program. With $\sigma _{\emptyset } = \{\}$ , both decision atoms are not selected, and we get a utility of 0. With $\sigma _{da} = \{da\}$ , we get $P(q) = 0.1$ , $P(da)\,{=}\,1$ , so $\mathit {Util}(\sigma _{da}) = 0.1 \cdot 4 - 1 \cdot 3 = -2.6$ . With $\sigma _{db} = \{db\}$ , we get $P(q) = 0.7$ , $P(db) = 1$ , so $\mathit {Util}(\sigma _{db}) = 0.7 \cdot 4 - 1 \cdot 2 = 0.8$ . With $\sigma _{dadb} = \{da,db\}$ , we get $P(q) = 0.73$ , $P(da)\,{=}\,1$ , $P(db) = 1$ , so $\mathit {Util}(\sigma _{dadb}) = 0.73 \cdot 4 - 1 \cdot 2 - 1 \cdot 3 = -2.08$ . Overall, the strategy that maximizes the utility is $\sigma _{db} = \{db\}$ .
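The listing of Example 4's program is not shown above, so the following sketch works with a hypothetical program that is consistent with all the reported values: probabilistic facts $0.1::a$ and $0.7::b$, rules $q \,{:\!-}\ da, a$ and $q \,{:\!-}\ db, b$, and utilities 4 for $q$, $-3$ for $da$, and $-2$ for $db$. It evaluates each strategy by enumerating the worlds:

```python
from itertools import product

P_FACTS = {"a": 0.1, "b": 0.7}            # assumed probabilistic facts
UTILITIES = {"q": 4, "da": -3, "db": -2}  # assumed utility attributes

def model(world, strategy):
    """Atoms true in the unique model, under the assumed rules
    q :- da, a.   q :- db, b."""
    atoms = set(world) | set(strategy)
    if {"da", "a"} <= atoms or {"db", "b"} <= atoms:
        atoms.add("q")
    return atoms

def util(strategy):
    total = 0.0
    for sel in product([True, False], repeat=len(P_FACTS)):
        world = {f for f, chosen in zip(P_FACTS, sel) if chosen}
        p = 1.0
        for f, chosen in zip(P_FACTS, sel):
            p *= P_FACTS[f] if chosen else 1 - P_FACTS[f]
        atoms = model(world, strategy)
        total += p * sum(r for u, r in UTILITIES.items() if u in atoms)
    return round(total, 10)

best = max([set(), {"da"}, {"db"}, {"da", "db"}], key=util)
```

The enumeration reproduces the utilities $-2.6$, $0.8$, and $-2.08$ of the three non-empty strategies and selects $\sigma_{db}$ as the best one.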
A DTProbLog program is converted into a compact form based on Algebraic Decision Diagrams (ADDs) (Bahar et al. Reference Bahar, Frohm, Gaona, Hachtel, Macii, Pardo and Somenzi1997) with a process called knowledge compilation (Darwiche and Marquis Reference Darwiche and Marquis2002), extensively adopted in Probabilistic Logic Programming (De Raedt et al. Reference De Raedt, Kimmig and Toivonen2007; Riguzzi Reference Riguzzi2022). ADDs are an extension of Binary Decision Diagrams (BDDs) (Akers Reference Akers1978). BDDs are rooted directed acyclic graphs with only two terminal nodes, 0 and 1. Every internal node (called a decision node) is associated with a Boolean variable and has two children, one associated with the assignment of true to the variable represented by the node and one associated with false. Several additional imposed properties, such as variable ordering, allow one to compactly represent large search spaces, even if finding the optimal ordering of the variables that minimizes the size of the BDD is a complex task (Meinel and Slobodová Reference Meinel and Slobodová1994). In ADDs, leaf nodes may be associated with elements belonging to a set of constants (e.g., natural numbers) instead of only 0 and 1, which has proved effective in multiple scenarios (Bahar et al. Reference Bahar, Frohm, Gaona, Hachtel, Macii, Pardo and Somenzi1997).
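To illustrate how such diagrams support inference: once a function is represented as a decision diagram over probabilistic variables, its expected value can be computed in a single bottom-up pass, weighting each branch by the probability of its variable. The sketch below uses our own toy tuple encoding of a diagram with numeric leaves, not an actual ADD package:

```python
# A node is either a numeric leaf or a tuple (var, high_child, low_child).
def expected_value(node, probs):
    """One bottom-up pass: E[node] = p(var)*E[high] + (1-p(var))*E[low]."""
    if isinstance(node, (int, float)):
        return node
    var, high, low = node
    p = probs[var]
    return p * expected_value(high, probs) + (1 - p) * expected_value(low, probs)

# diagram for a reward of 4 when q holds, where q is true iff a or b holds
add = ("a", 4.0, ("b", 4.0, 0.0))
ev = expected_value(add, {"a": 0.1, "b": 0.7})
```

Here the pass returns $0.73 \cdot 4 = 2.92$, the expected reward of the atom $q$ of Example 4 under the strategy selecting both decisions.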
4 Second level algebraic model counting
Kiesel et al. (Reference Kiesel, Totis and Kimmig2022) introduced Second Level Algebraic Model Counting (2AMC), needed to solve tasks such as MAP inference (Shterionov et al. Reference Shterionov, Renkens, Vlasselaer, Kimmig, Meert and Janssens2015) and decision-theoretic inference (Van den Broeck et al. Reference Van den Broeck, Thon, van Otterlo and De Raedt2010) in Probabilistic Logic Programming, and inference in smProbLog (Totis et al. Reference Totis, De Raedt and Kimmig2023) programs. These problems are characterized by the need for two levels of Algebraic Model Counting (AMC) (Kimmig et al. Reference Kimmig, Van den Broeck and De Raedt2017).
The ingredients of a 2AMC problem are:
• a propositional theory $\Pi$ ;
• a partition $(V_i,V_o)$ of the variables in $\Pi$ ;
• two commutative semirings (Gondran and Minoux Reference Gondran and Minoux2008) $\mathcal {R}_{i} = (R^{i},\oplus ^i,\otimes ^i,n_{\oplus }^i,n_{\otimes }^i)$ and $\mathcal {R}_{o} = (R^{o},\oplus ^o,\otimes ^o,n_{\oplus }^o,n_{\otimes }^o)$ ;
• two weight functions, $w_{i}$ and $w_{o}$ , associating each literal of the program with a weight; and
• a transformation function $f$ mapping the values of $R^i$ to those of $R^o$.
Let us denote with $T$ the tuple $(\Pi, V_i,V_o,\mathcal {R}_i,\mathcal {R}_o,w_i,w_o,f)$. The task requires solving:

$2AMC(T) = \bigoplus ^o_{I_o \in \mu (V_o)} \Big ( \bigotimes ^o_{a \in I_o} w_o(a) \Big ) \otimes ^o f\Big ( \bigoplus ^i_{I_i \in \varphi (\Pi \mid I_o)} \bigotimes ^i_{b \in I_i} w_i(b) \Big )$
where $\mu (V_{o})$ is the set of possible assignments to the variables in $V_{o}$ and $\varphi (\Pi \mid I_{o})$ is the set of assignments to the variables in $V_i$ that, together with $I_{o}$, satisfy $\Pi$. In practice, for every possible assignment to the variables in $V_o$, we need to solve a first AMC task on the variables in $V_i$. Then, the transformation function maps the obtained values into elements of the outer semiring and we need to solve a second AMC task, this time considering $V_o$.
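The definition can be made concrete with a direct enumeration sketch (our own toy encoding, not the knowledge-compilation approach used by aspmc). We represent a semiring by its operations and neutral elements, and try a small instance: the inner probability semiring computes a weighted model count, the outer max-plus semiring maximizes it over the outer variable, with a cost attached to setting that variable to true:

```python
from itertools import product

def amc2(clauses, v_o, v_i, inner, outer, w_i, w_o, f):
    """Brute-force 2AMC: for every assignment I_o to the outer variables,
    solve an inner AMC over the models extending I_o, map the result
    through f, weight it with w_o, and combine with the outer semiring.
    A semiring is given as a (plus, times, zero, one) tuple."""
    i_plus, i_times, i_zero, i_one = inner
    o_plus, o_times, o_zero, o_one = outer

    def sat(assignment, clause):
        # clause: set of (variable, polarity) literals
        return any(assignment[v] == pol for v, pol in clause)

    result = o_zero
    for o_vals in product([True, False], repeat=len(v_o)):
        I_o = dict(zip(v_o, o_vals))
        inner_total = i_zero
        for i_vals in product([True, False], repeat=len(v_i)):
            I = {**I_o, **dict(zip(v_i, i_vals))}
            if all(sat(I, c) for c in clauses):
                term = i_one
                for v in v_i:
                    term = i_times(term, w_i(v, I[v]))
                inner_total = i_plus(inner_total, term)
        outer_term = f(inner_total)
        for v in v_o:
            outer_term = o_times(outer_term, w_o(v, I_o[v]))
        result = o_plus(result, outer_term)
    return result

# tiny instance: theory q <-> (d and a); outer variable d, inner a and q
inner = (lambda x, y: x + y, lambda x, y: x * y, 0.0, 1.0)   # probability semiring
outer = (max, lambda x, y: x + y, float("-inf"), 0.0)        # max-plus semiring
clauses = [{("q", False), ("d", True)},
           {("q", False), ("a", True)},
           {("q", True), ("d", False), ("a", False)}]
weights = {("a", True): 0.4, ("a", False): 0.6, ("q", True): 1.0, ("q", False): 0.0}
w_i = lambda v, val: weights[(v, val)]
w_o = lambda v, val: -0.1 if (v, val) == ("d", True) else 0.0
res = amc2(clauses, ["d"], ["a", "q"], inner, outer, w_i, w_o, lambda x: x)
```

With $d$ true the inner count is $P(q) = 0.4$ and the outer weight subtracts the cost $0.1$; with $d$ false the inner count is 0, so the result is $0.3$.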
Within 2AMC, the DTProbLog task can be solved by considering (Kiesel et al. Reference Kiesel, Totis and Kimmig2022)
• as $V_o$ the decision atoms and as $V_i$ the remaining literals;
• as inner semiring the gradient semiring (Eisner Reference Eisner2002) $\mathcal {R}_i = (\mathbb {R}^2, +, \otimes, (0,0), (1,0))$ where $+$ is component-wise and $(a_0,b_0) \otimes (a_1,b_1) = (a_0 \cdot a_1, a_0 \cdot b_1 + a_1 \cdot b_0)$;
• as inner weight function $w_i$ mapping a literal $v$ to $(p,0)$ if $v = a$ where $a$ is a probabilistic fact $p::a$ , to $(1-p,0)$ if $v = not \ a$ where $a$ is a probabilistic fact $p::a$ , and all the other literals to $(1,r)$ where $r$ is their utility;
• as transformation function $f(p,u) = (u,\{\})$ if $p \neq 0$ and $f(0,u) = (-\infty, D)$ ;
• as outer semiring $\mathcal {R}_o = (\mathbb {R} \times 2^{|D|}, \oplus, \otimes, (-\infty, D), (0,\{\}))$ where $(a_0,b_0) \oplus (a_1,b_1)$ is equal to $(a_0,b_0)$ if $a_0 > a_1$, otherwise $(a_1,b_1)$, and $(a_0,b_0) \otimes (a_1,b_1) = (a_0 + a_1, b_0 \cup b_1)$; and
• as outer weight function $w_o$ mapping $a$ to $(0,\{a\})$ if $a$ is a decision atom, and to $(0,\{\})$ otherwise.
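The inner semiring can be exercised directly. In the sketch below, a weight pair carries a probability and an expected-utility contribution; note that the neutral element of $\otimes$ must be $(1,0)$, since $(1,0) \otimes (a,b) = (a,b)$:

```python
# gradient/expected-utility semiring: pairs (probability, expected utility)
def g_plus(x, y):
    return (x[0] + y[0], x[1] + y[1])           # component-wise sum

def g_times(x, y):
    # probabilities multiply; utilities combine by the product rule
    return (x[0] * y[0], x[0] * y[1] + x[1] * y[0])

ZERO, ONE = (0.0, 0.0), (1.0, 0.0)              # neutral elements of + and x

# expected utility of utility(a, 10) with probabilistic fact 0.4::a:
world_a = g_times((0.4, 0.0), (1.0, 10.0))      # fact weight x utility-atom weight
world_not_a = g_times((0.6, 0.0), ONE)          # a not selected, no utility atom
total = g_plus(world_a, world_not_a)            # (1.0, 4.0): expected utility 0.4*10
```

Summing over the two worlds, the first component accumulates the total probability mass (1.0) and the second the expected utility (4.0).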
Kiesel et al. (Reference Kiesel, Totis and Kimmig2022) also extended the aspmc tool (Eiter et al. Reference Eiter, Hecher and Kiesel2021) to solve 2AMC tasks. aspmc converts a program into a tractable circuit (knowledge compilation) by first grounding it and then generating a propositional formula such that the answer sets of the original program are in one-to-one correspondence with the models of the formula. The pipeline is the following: first, aspmc breaks cycles in the program using Tp-unfolding (Eiter et al. Reference Eiter, Hecher and Kiesel2021), which draws inspiration from Tp-Compilation (Vlasselaer et al. Reference Vlasselaer, Van den Broeck, Kimmig, Meert and De Raedt2016). The obtained acyclic program is then translated to a propositional formula by applying a treewidth-aware version of Clark’s Completion similar to that of Hecher (Reference Hecher2022). Lastly, it solves the algebraic answer set counting task by leveraging knowledge compilers such as c2d (Darwiche Reference Darwiche2004). The result of the compilation is a circuit in negation normal form (NNF). An NNF is a rooted directed acyclic graph where leaf nodes are associated with a literal or a truth value (true or false) and internal nodes are associated with conjunctions or disjunctions. Usually, sd-DNNFs are considered, that is, NNFs with three additional properties. If we denote with $V(n)$ the set of variables that appear in the subgraph rooted at $n$, these are: (i) $V(n_a) \cap V(n_b) = \emptyset$ for each pair of children $n_a$ and $n_b$ of an internal node associated with a conjunction (decomposability property); (ii) $n_a \land n_b$ is inconsistent for each pair of children $n_a$ and $n_b$ of an internal node associated with a disjunction (determinism property); and (iii) $V(n_a) = V(n_b)$ for each pair of children $n_a$ and $n_b$ of an internal node associated with a disjunction (smoothness property).
To handle 2AMC, aspmc has been extended to perform Constrained Knowledge Compilation (Oztok and Darwiche Reference Oztok and Darwiche2015). The idea is to compile the underlying logical part of the program into a tractable circuit representation over which it can evaluate the given 2AMC instance in polynomial time in its size. Naturally, since 2AMC is harder than (weighted) model counting, one cannot use any sd-DNNF here. Instead, aspmc uses X-first circuits which constrain the order in which variables are decided. Namely, the outer variables need to be decided first, before the inner variables can be decided.
Such ordering constraints can severely limit our options during compilation and, thus, often lead to much larger circuits. To alleviate this, Kiesel et al. (Reference Kiesel, Totis and Kimmig2022) introduced X/D-first circuits that allow for less strict variable orders by exploiting the definability property (Definition 1).
Definition 1 (Definability; Lagniez et al. Reference Lagniez, Lonca and Marquis2016). A variable $a$ is defined by a set of variables $X$ with respect to a theory $T$ if for every assignment $x$ of $X$ it holds that $x \cup T \models a$ or $x \cup T \models not \ a$. We denote by $D(T, X)$ the set of variables that are not in $X$ and are defined by $X$ with respect to $T$.
Example 5 In the program
The atom $c$ is defined in terms of $a,b$ and in terms of $na, b$. Furthermore, $a$ is defined by $na, b$ and $na$ is defined by $a, b$. Only $b$ is not defined by any other variables.
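Definition 1 can be checked by enumeration over the models of a theory. The listing of Example 5 is not shown above, so the sketch below uses the models of a hypothetical program consistent with the stated relations: a choice on $b$, $na$ as the complement of $a$, and the rule $c \,{:\!-}\ a, b$:

```python
from itertools import product

def defined(models, x_vars, target):
    """Check definability over an explicit list of models: target is
    defined by x_vars if every assignment to x_vars fixes its value."""
    for x_vals in product([True, False], repeat=len(x_vars)):
        vals = {(target in m) for m in models
                if all((v in m) == val for v, val in zip(x_vars, x_vals))}
        if len(vals) > 1:    # two models agree on x_vars but differ on target
            return False
    return True

# models of the assumed program: {a} with complement na, {b}, c :- a, b.
models = [{"a", "b", "c"}, {"a"}, {"na", "b"}, {"na"}]
```

On these models, `defined` confirms that $c$ is defined by $\{a,b\}$ and by $\{na,b\}$, that $a$ is defined by $\{na,b\}$, and that $b$ is not defined by the other variables.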
Intuitively, a variable $y$ defined by $X$ can be seen to represent the truth value of a propositional formula over the variables in $X$. Therefore, deciding $y$ is no different from making a complex decision over the variables in $X$. It was shown that, under weak conditions on the used semirings and transformation function, this also preserves the possibility of tractable evaluation of 2AMC instances over the resulting circuit (Kiesel et al. Reference Kiesel, Totis and Kimmig2022). During the evaluation of a 2AMC task, the variables of the circuit are split into two sets, call them $X$ and $Y$. An internal node $n$ is termed pure if $V(n) \subseteq X \cup D(n,X)$ or $V(n) \subseteq Y$, and mixed otherwise. An NNF is X/D-first if, for each conjunction node $n$, either all its children are pure, or one child $n_i$ is mixed and all the others $n_j$ are pure with $V(n_j) \subseteq X \cup D(n_i,X)$. This additional property allows handling the variables of $X$ and $Y$ separately, but allows them to be decided somewhat intertwined whenever variables in $Y$ are defined by the variables in $X$. Korhonen and Järvisalo (Reference Korhonen and Järvisalo2021) have shown that it is highly beneficial to determine the order in which variables are decided from a so-called tree decomposition of the logical theory. Here, the decomposition intuitively provides guidance on how the problem can be split into smaller subproblems. The obtained circuit is evaluated bottom-up.
5 Representing decision theory problems with probabilistic answer set programming
Following DTProbLog (Van den Broeck et al. Reference Van den Broeck, Thon, van Otterlo and De Raedt2010), we use $\mathit {utility}(a,r)$ to denote utility attributes, where $a$ is an atom and $r \in \mathbb {R}$ indicates the utility of satisfying it. For example, with $\mathit {utility}(a,-3.3)$ we state that if $a$ is satisfied we get a utility of $-3.3$. A negative utility represents, for example, a cost, while a positive utility represents a gain. We use the functor $\mathit {decision}$ to denote decision atoms. For example, with $\mathit {decision} \ a$ we state that $a$ is a decision atom. A decision atom indicates that we can choose whether or not to perform the specified action. We consider only ground decision atoms.
Definition 2 A decision theory probabilistic answer set program DTPASP is a tuple $(\mathcal {P},D,U)$ where $\mathcal {P}$ is a probabilistic answer set program, $D$ is the set of decision atoms, and $U$ is the set of utility attributes.
Since in PASP queries are associated with a lower and an upper probability, in DTPASP we need to consider lower and upper rewards and look for the strategies that maximize them. A strategy is, as in DTProbLog, a subset of the possible actions. Having fixed a strategy $\sigma$, we obtain a PASP $\mathcal {P}_\sigma$ that generates a set of worlds. Each answer set $A$ of a world $w_{\sigma }$ of $\mathcal {P}_{\sigma }$ is associated with a reward given by the sum of the utilities of the atoms true in it:

$R(A) = \sum _{\mathit {utility}(a,r) \in U \, : \, A \models a} r.$
Let us call $\underline {R}(w_\sigma )$ the minimum of the rewards of the answer sets of $w_\sigma$ and $\overline {R}(w_\sigma )$ the maximum of the rewards of the answer sets of $w_\sigma$. That is:

$\underline {R}(w_\sigma ) = \min _{A \in AS(w_\sigma )} R(A), \qquad \overline {R}(w_\sigma ) = \max _{A \in AS(w_\sigma )} R(A).$
Since we impose no constraints on how the probability mass of a world is distributed among its answer sets, we can obtain the minimum utility from a world by assigning all the mass to the answer set with the minimum reward and the maximum utility from a world by assigning all the mass to the answer set with the maximum reward. The former is the worst-case scenario and the latter is the best-case scenario. The expected minimum $\underline {U}(w_\sigma )$ and maximum $\overline {U}(w_\sigma )$ utility from $w_\sigma$ are thus:

$\underline {U}(w_\sigma ) = P(w_\sigma ) \cdot \underline {R}(w_\sigma ), \qquad \overline {U}(w_\sigma ) = P(w_\sigma ) \cdot \overline {R}(w_\sigma ).$
A strategy $\sigma$ is associated with a lower utility $\underline {\mathit {Util}}(\sigma )$ and an upper utility $\overline {\mathit {Util}}(\sigma )$ given by, respectively:

$\underline {\mathit {Util}}(\sigma ) = \sum _{w_\sigma } \underline {U}(w_\sigma ), \qquad \overline {\mathit {Util}}(\sigma ) = \sum _{w_\sigma } \overline {U}(w_\sigma ). \qquad (12)$
Let us indicate the range of expected utilities with $\mathit {Util}(\sigma ) = [\underline {\mathit {Util}}(\sigma ),\overline {\mathit {Util}}(\sigma )]$. Finally, the goal of decision theory in PASP is to find the two strategies that maximize the lower and the upper bound of the expected utility; let us call them the lower and upper strategies, respectively, that is:

$\sigma ^*_{lower} = \underset {\sigma }{\mathrm {argmax}} \ \underline {\mathit {Util}}(\sigma ), \qquad \sigma ^*_{upper} = \underset {\sigma }{\mathrm {argmax}} \ \overline {\mathit {Util}}(\sigma ).$
Furthermore, if every world in a PASP has exactly one answer set: (i) $\underline {\mathit {Util}}(\sigma )$ and $\overline {\mathit {Util}}(\sigma )$ of equation (12) coincide and (ii) equations (4) and (12) return the same value. In fact, if every world has exactly one answer set, the reward of each world is the same whether we consider the best- or the worst-case scenario, thus the sums of the rewards coincide in the two cases. Similar considerations hold for (ii).
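Given, for a fixed strategy, the worlds with their probabilities and the rewards of their answer sets, the bounds just defined reduce to two weighted sums. A minimal sketch on an illustrative toy instance (the numbers are ours, not those of the examples):

```python
def util_bounds(worlds):
    """worlds: list of (probability, list of answer-set rewards) pairs
    for a fixed strategy; returns the (lower, upper) expected utility."""
    lower = sum(p * min(rewards) for p, rewards in worlds)  # worst case
    upper = sum(p * max(rewards) for p, rewards in worlds)  # best case
    return round(lower, 10), round(upper, 10)

# toy strategy: the second world has two answer sets with different rewards
worlds = [(0.6, [2.0]), (0.4, [-1.0, 3.0])]
```

When every world has a single answer set, `min` and `max` coincide and the two bounds collapse to the same value, as remarked above.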
5.1 Examples
In this section, we will discuss a practical application of DTPASP to the viral marketing scenario. We consider here the problem of computing the upper strategy, but considerations for the lower strategy are analogous. First, let us introduce a running example that will be discussed multiple times across the paper.
Example 6 (Running Example). Consider a variation of Example 2 with two probabilistic facts and two decision atoms.
There are 4 possible strategies that we indicate with $\sigma _{\emptyset }$, $\sigma _{da}$, $\sigma _{db}$, and $\sigma _{dadb}$. With $\sigma _{\emptyset } = \{\}$ we have the PASP
This program has 4 worlds, listed in Table 2, each having no answer sets where the utility atoms are true, so $\mathit {Util}(\sigma _{\emptyset }) = [0,0]$ .
With $\sigma _{da}$ we have the PASP
The worlds together with rewards and answer sets are listed in Table 3. This strategy has a utility of $\mathit {Util}(\sigma _{da}) = [0.36 + 0.24, 0.36 + 0.24] = [0.6,0.6]$ .
With $\sigma _{db}$ we have the PASP
The worlds together with rewards and answer sets are listed in Table 4. This strategy has a utility of $\mathit {Util}(\sigma _{db}) = [-3.36 - 1.44, 0.56 + 0.24] = [-4.8,0.8]$ .
With $\sigma _{dadb}$ we have the PASP
The worlds together with rewards and answer sets are listed in Table 5. This strategy has a utility of $\mathit {Util}(\sigma _{dadb}) = [0.36-3.36+0.24, 0.36+0.56+0.24] = [-2.76,1.16]$ . Overall, the strategy $\sigma _{da}$ yields the highest lower bound (0.6) for the utility while $\sigma _{dadb}$ yields the highest upper bound (1.16) for the utility. That is, in the worst-case we obtain a reward of 0.6 and in the best case a reward of 1.16.
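The bounds above can be reproduced with a short script. The probabilities $P(a) = 0.3$ and $P(b) = 0.4$, as well as the rule structure encoded in `answer_set_rewards`, are assumptions reconstructed from the figures in the tables rather than taken verbatim from the program:

```python
from itertools import product

# Assumed parameters, reconstructed from the figures in Tables 2-5:
# 0.3::a, 0.4::b, utility(qr) = 2, utility(nqr) = -12.
P_A, P_B = 0.3, 0.4
U_QR, U_NQR = 2, -12

def answer_set_rewards(da, db, a, b):
    """Rewards of the answer sets of world (a, b) under strategy (da, db);
    a sketch of the rule structure of the running example."""
    if da and a:
        return [U_QR]            # qr derived deterministically: one answer set
    if db and b:
        return [U_QR, U_NQR]     # even loop through negation: {qr} and {nqr}
    return [0]                   # no utility atom is true

def util(da, db):
    """Expected utility range [lower, upper] of a strategy."""
    lower = upper = 0.0
    for a, b in product([False, True], repeat=2):
        prob = (P_A if a else 1 - P_A) * (P_B if b else 1 - P_B)
        rewards = answer_set_rewards(da, db, a, b)
        lower += prob * min(rewards)  # worst-case answer set of the world
        upper += prob * max(rewards)  # best-case answer set of the world
    return lower, upper
```

With these values, `util(True, False)` evaluates to $[0.6, 0.6]$ and `util(True, True)` to $[-2.76, 1.16]$, matching the bounds above.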
Example 7 Consider a marketing scenario where people go shopping with a given probability (probabilistic facts). We need to decide which people to target with a marketing action (decision atoms). If these people go shopping and are targeted with a personalized advertisement, then they can buy some products. These products have an associated utility, and so does the targeting action (since, e.g., targeting involves a cost). Moreover, suppose we have a constraint imposing a limit on the quantity of a certain product. For example, suppose we want to limit the sales of spaghetti to one unit because the company has limited stock of that product. If we consider 2 people, Anna and Bob, the scenario just described can be represented with the following program.
By applying the same approach as in Example 6, we have 4 possible strategies, $\sigma _{00} = \{\}$ , $\sigma _{01} = \{\mathit {target(anna)}\}$ , $\sigma _{10} = \{\mathit {target(bob)}\}$ , and $\sigma _{11} = \{\mathit {target(anna)}, \mathit {target(bob)}\}$ , with $\mathit {Util}(\sigma _{00}) = [0,0]$ , $\mathit {Util}(\sigma _{01}) = [-1.2,2.8]$ , $\mathit {Util}(\sigma _{10}) = [1.5,1.5]$ , and $\mathit {Util}(\sigma _{11}) = [0.3,4.3]$ . Thus, $\sigma _{10} = \{\mathit {target(bob)}\}$ is the strategy that maximizes the lower bound of the utility, while $\sigma _{11} = \{\mathit {target(anna)}, \mathit {target(bob)}\}$ is the one that maximizes the upper bound. So, in the worst case, by targeting Bob, we get a reward of 1.5; similarly, in the best case, by targeting both Anna and Bob, we get a reward of 4.3.
In Example 6, for every strategy, every world is satisfiable (i.e., it has at least one answer set). However, this may not always be the case. Consider three different scenarios: (i) the constraint involves only decision atoms, so for some strategies every world has no answer sets, while for all the remaining strategies every world has at least one answer set; (ii) the constraint involves both decision atoms and probabilistic facts, so some of the worlds of some strategies have no answer sets; (iii) the constraint involves only probabilistic facts, so for all the strategies the same worlds have no answer sets. These possible sources of inconsistency are clarified by the following three examples.
Example 8 Consider Example 6 with the additional constraint $\,\,{:\!-}\ da, db$ that prevents $da$ and $db$ from being performed simultaneously. Here, the probabilistic answer set programs of all the strategies have a credal semantics, except for the one with both $da$ and $db$ true ( $\sigma _{dadb}$ ), which leads to inconsistent worlds.
Example 9 Consider Example 6 with the additional constraint $\,\,{:\!-}\ db, a$ that prevents $db$ and $a$ from being true simultaneously. Consider $\sigma _{db}$ . Only two worlds are satisfiable, namely $w_0 = \{\}$ and $w_2 = \{b\}$ ; the remaining worlds of the PASP identified by the strategy $\sigma _{db}$ , that is, those where $a$ is true, are inconsistent. Therefore, we have to decide whether to discard this strategy or to keep it and consider the inconsistent worlds in some way. Similar considerations apply to $\sigma _{dadb}$ . In this example, some strategies yield a consistent PASP, others do not.
Example 10 Consider Example 6 with the additional constraint $\,\,{:\!-}\ a, b$ that prevents $a$ and $b$ from being true simultaneously. Differently from Example 9, here every strategy $\sigma \in \{\sigma _{\emptyset }, \sigma _{da}, \sigma _{db}, \sigma _{dadb}\}$ results in a probabilistic answer set program that has an inconsistent world, since world $w_3$ (Table 1), where both $a$ and $b$ are true, is inconsistent. Here, all strategies yield an inconsistent probabilistic answer set program.
The previous three examples illustrate some of the possible scenarios that may arise. In Example 8, we can discard the inconsistent probabilistic answer set program, where none of the worlds is satisfiable, and pick the best strategy among $\sigma _{\emptyset }$ , $\sigma _{da}$ , and $\sigma _{db}$ . For Examples 9 and 10, where only some of the worlds are inconsistent, we can ignore the inconsistent worlds and analyze the remaining ones.
6 Algorithms for computing the best strategy in a DTPASP
To solve the optimization problems represented in equation (13), we discuss two different algorithms. We have three different layers of complexity: (i) the computation of the possible strategies, (ii) the computation of the worlds, and (iii) the computation of the answer sets with the highest and lowest reward (equation (10)). Due to this, the task cannot be solved with standard optimization constructs, such as $\#\mathit {maximize}$ , available in ASP solvers.
We implemented a first exact algorithm that iteratively enumerates all the strategies and, for every strategy, computes the answer sets for every world. Given a decision theoretic probabilistic answer set program with $n$ probabilistic facts and $d$ decision atoms, if every world is satisfiable, we need to generate at least $2^d \cdot 2^n = 2^{d+n}$ answer sets. Clearly, this algorithm is feasible only for trivial domains, so we consider it only as a baseline.
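The structure of this baseline can be sketched as follows. The helper `answer_set_rewards(strategy, world)`, which stands for the call to an ASP solver such as clingo, is a hypothetical placeholder; the point of the sketch is the $2^{d} \cdot 2^{n}$ nested enumeration:

```python
from itertools import product

def best_strategies(decision_atoms, fact_probs, answer_set_rewards):
    """Naive baseline: enumerate all 2^d strategies and, for each,
    all 2^n worlds. `answer_set_rewards` is a hypothetical helper
    returning the list of rewards of the answer sets of the program
    fixed by the given strategy and world."""
    best_low = (float("-inf"), None)   # (lower expected utility, strategy)
    best_up = (float("-inf"), None)    # (upper expected utility, strategy)
    for strategy in product([False, True], repeat=len(decision_atoms)):
        low = up = 0.0
        for world in product([False, True], repeat=len(fact_probs)):
            prob = 1.0
            for p, selected in zip(fact_probs, world):
                prob *= p if selected else 1 - p
            rewards = answer_set_rewards(strategy, world)
            if rewards:                     # skip unsatisfiable worlds
                low += prob * min(rewards)  # worst-case answer set
                up += prob * max(rewards)   # best-case answer set
        if low > best_low[0]:
            best_low = (low, strategy)
        if up > best_up[0]:
            best_up = (up, strategy)
    return best_low, best_up
```

On the running example, this returns $\sigma _{da}$ as the lower strategy and $\sigma _{dadb}$ as the upper one.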
We propose a second and more interesting approach. First, the task of equation (13) cannot be represented as a 2AMC problem since it requires three layers of AMC (3AMC). Thus, by extending equation (8), we define 3AMC as:
That is, we add a third layer of AMC on top of 2AMC, obtaining 3AMC. For the decision theory task in probabilistic answer set programming, in the innermost layer (call it $A_i$ ), both the strategy and the world are fixed, and we compute the reward for each answer set (equation (9)) and find the answer sets that minimize and maximize it. In the middle layer (call it $A_m$ ), the strategy is fixed and we need to find the probabilities of all the worlds, multiply them by the optimal rewards obtained in the previous step, and sum all these products (equation (12)). Lastly, in the outer layer (call it $A_o$ ) we need to consider all the strategies and find the two that maximize the lower and the upper utility (equation (13)), respectively. Note that a single 3AMC evaluation finds both. If we call $H$ , $D$ , and $F$ the sets of all the variables, decision atoms, and probabilistic facts, respectively, for the innermost layer $A_i$ we have:
• as semiring, the minmax-plus semiring $\mathcal {R}_i = (\mathbb {R}^2, minmax, +^2, (\infty, -\infty ), (0,0))$ , where $minmax((a_0,b_0),(a_1,b_1))$ returns the pair ( $a_0$ if $a_0 \lt a_1$ else $a_1$ , $b_0$ if $b_0 \gt b_1$ else $b_1$ ) and $+^2((a_0,b_0),(a_1,b_1)) = (a_0+a_1,b_0+b_1)$
• as variables $H \setminus D \setminus F$
• as weight function
\begin{equation*} w_0(a) = \begin {cases} (r,r) & \text {if } \,(a,r) \in U, \\ (0,0) & \text { otherwise}. \end {cases} \end{equation*}
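Under the assumption of a plain Python encoding of pairs as tuples, the two operations of $\mathcal {R}_i$ can be sketched as follows (an illustration of the definition above, not aspmc's implementation):

```python
# Minmax-plus semiring R_i: values are pairs (lower, upper) of rewards.
E_MINMAX = (float("inf"), float("-inf"))  # neutral element of minmax
E_PLUS2 = (0.0, 0.0)                      # neutral element of +^2

def minmax(x, y):
    """Combine two alternative answer sets: keep the smaller lower
    and the larger upper reward."""
    return (x[0] if x[0] < y[0] else y[0], x[1] if x[1] > y[1] else y[1])

def plus2(x, y):
    """Component-wise sum: accumulate the rewards of the atoms
    that are true in one answer set."""
    return (x[0] + y[0], x[1] + y[1])
```

For instance, combining the rewards of two alternative answer sets, `minmax((2, 2), (-12, -12))` yields `(-12, 2)`, that is, the worst- and best-case rewards of a world.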
As transformation function $f_{im}$ that maps the values of $A_i$ to $A_m$ we have $f_{im} : \mathbb {R}^2 \to \mathbb {R}^3$ , $f_{im}(a,b) = (1,a,b)$ . As middle layer $A_m$ we have:
• as semiring, the two-gradient semiring $\mathcal {R}_m = (\mathbb {R}^3, \oplus ^G, \otimes ^G, (0,0,0), (1,0,0))$ with $(a_0,b_0,c_0) \oplus ^G (a_1, b_1,c_1) = (a_0 + a_1, b_0 + b_1, c_0 + c_1)$ and $(a_0,b_0,c_0) \otimes ^G (a_1, b_1,c_1) = (a_0 \cdot a_1, a_0 \cdot b_1 + a_1 \cdot b_0, a_0 \cdot c_1 + a_1 \cdot c_0)$
• as variables $F$
• as weight function
\begin{equation*} w_1(a) = \begin {cases} (p,0,0) & \text { if } a = f \text { where } f \text { is a probabilistic fact } p::f, \\ (1-p,0,0) & \text { if } a = not \ f \text { where } f \text { is a probabilistic fact } p::f, \\ (1,0,0) & \text { otherwise}. \end {cases} \end{equation*}
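The two operations of this semiring can be sketched in Python as follows, where a tuple stands for a triple (probability, lower utility, upper utility); this illustrates the definition above, not aspmc's implementation:

```python
# Two-gradient semiring R_m over triples (prob, lower, upper).
E_OPLUS = (0.0, 0.0, 0.0)   # neutral element of the sum
E_OTIMES = (1.0, 0.0, 0.0)  # neutral element of the product

def oplus_g(x, y):
    """Component-wise sum, accumulating the contributions of the worlds."""
    return (x[0] + y[0], x[1] + y[1], x[2] + y[2])

def otimes_g(x, y):
    """Probabilities multiply; the utility components follow the
    product rule a0*b1 + a1*b0, so they accumulate
    probability-weighted rewards."""
    return (x[0] * y[0],
            x[0] * y[1] + y[0] * x[1],
            x[0] * y[2] + y[0] * x[2])
```

For instance, multiplying the weight $(0.5, 0, 0)$ of a probabilistic fact by a transformed inner value $f_{im}(2, -12) = (1, 2, -12)$ gives $(0.5, 1, -6)$ , that is, the probability-weighted reward bounds.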
Here, the first component stores the probability computed so far while the remaining two store the lower and upper bounds for the utility. As transformation function $f_{mo}$ that maps the values of $A_m$ to $A_o$ , we have $f_{mo}(a,b,c) = (b,c,\{\},\{\})$ , where $b$ and $c$ are the second and third components of the triple obtained from the previous layer. Lastly, as outer layer $A_o$ we have:
• as semiring $\mathcal {R}_o = (\mathbb {R}^2 \times {2^{|D|}}^2, \mathit {max}^4, \mathit {sum}^4, (-\infty, -\infty, D,D), (0,0,\{\},\{\}))$
• as variables $D$
• as weight function
\begin{equation*} w_2(a) = \begin {cases} (0,0,\{a\},\{a\}) & \text { if } a \text { is a decision atom,}\\ (0,0,\{\},\{\}) & \text { otherwise}. \end {cases} \end{equation*}
where $\mathit {max}^4((v_0,v_1,S_0,S_1),(v_a,v_b,S_a,S_b)) = (v_x,v_y,S_x,S_y)$ with $v_x = v_0$ if $v_0 \gt v_a$ else $v_a$ , $v_y = v_1$ if $v_1 \gt v_b$ else $v_b$ , $S_x = S_0$ if $v_0 \gt v_a$ else $S_a$ , $S_y = S_1$ if $v_1 \gt v_b$ else $S_b$ and $\mathit {sum}^4((v_0,v_1,S_0,S_1),(v_a,v_b,S_a,S_b)) = (v_0 + v_a, v_1 + v_b, S_0 \cup S_a, S_1 \cup S_b)$ . The first two components store, respectively, the value of the lower and upper strategies and the last two the decision atoms yielding these values. Note that we assume that decision atoms do not have an associated utility. This is not a restriction since it is always possible to mimic it by adding a rule $rda \,\,{:\!-}\ da$ and a utility attribute on $rda$ for a decision atom $da$ .
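The outer operations can be sketched analogously, encoding the sets of decision atoms as Python sets (again, a sketch of the definition above, not the actual implementation):

```python
def max4(x, y):
    """Keep, independently for the lower and the upper bound,
    the larger value together with the strategy that produced it."""
    v0, v1, s0, s1 = x
    va, vb, sa, sb = y
    return (v0 if v0 > va else va,
            v1 if v1 > vb else vb,
            s0 if v0 > va else sa,
            s1 if v1 > vb else sb)

def sum4(x, y):
    """Add the values and union the decision atoms chosen so far."""
    return (x[0] + y[0], x[1] + y[1], x[2] | y[2], x[3] | y[3])
```

Combining the values of $\sigma _{da}$ and $\sigma _{dadb}$ from Example 6, `max4((0.6, 0.6, {'da'}, {'da'}), (-2.76, 1.16, {'da', 'db'}, {'da', 'db'}))` yields `(0.6, 1.16, {'da'}, {'da', 'db'})`: the lower strategy is $\{da\}$ and the upper one is $\{da, db\}$ .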
6.1 Implementation in aspmc
Before discussing the algorithm, let us introduce some concepts.
Definition 3 A tree decomposition (Bodlaender Reference Bodlaender1988) of a graph $G$ is a pair $(T, \chi )$ , where $T$ is a tree and $\chi$ is a labeling of $V(T)$ (the set of nodes of $T$ ) by subsets of $V(G)$ (the set of nodes of $G$ ) s.t. 1) for all nodes $v \in V(G)$ there is $t \in V(T)$ s.t. $v \in \chi (t)$ ; 2) for every edge $\{v_1, v_2\} \in E(G)$ there exists $t \in V(T)$ s.t. $v_1, v_2 \in \chi (t)$ ; and 3) for all nodes $v \in V(G)$ the set of nodes $\{t \in V(T) \mid v \in \chi (t)\}$ forms a (connected) subtree of $T$ . The width of $(T, \chi )$ is $\max _{t \in V(T)} |\chi (t)| - 1$ . The treewidth of a graph is the minimal width of any of its tree decompositions.
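The three conditions of Definition 3 can be checked mechanically. The following sketch assumes a hypothetical encoding of $G$ and $T$ as edge lists and of $\chi$ as a dictionary of bags:

```python
def is_tree_decomposition(graph_edges, tree_edges, chi):
    """Check the three conditions of a tree decomposition (T, chi)
    of the graph given by `graph_edges`."""
    vertices = {v for edge in graph_edges for v in edge}
    bags = list(chi.values())
    # 1) every vertex of G appears in some bag
    for v in vertices:
        if not any(v in bag for bag in bags):
            return False
    # 2) every edge of G is covered by some bag
    for u, v in graph_edges:
        if not any(u in bag and v in bag for bag in bags):
            return False
    # 3) for every vertex, the bags containing it form a connected subtree of T
    for v in vertices:
        nodes = {t for t, bag in chi.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:                       # traverse T restricted to `nodes`
            t = stack.pop()
            for a, b in tree_edges:
                for x, y in ((a, b), (b, a)):
                    if x == t and y in nodes and y not in seen:
                        seen.add(y)
                        stack.append(y)
        if seen != nodes:
            return False
    return True

def width(chi):
    """Width of a decomposition: size of the largest bag minus one."""
    return max(len(bag) for bag in chi.values()) - 1
```

For example, the path graph $a - b - c$ admits a decomposition with bags $\{a, b\}$ and $\{b, c\}$ of width 1, witnessing that the path is a tree.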
Intuitively, treewidth is a measure of the distance of a graph from being a tree. Accordingly, a graph is a tree if and only if it has treewidth $1$ . The idea behind treewidth is that problems that are simple when their underlying structure is a tree may also be simple when that structure is not far from a tree, that is, when it has low treewidth. Practically, in such cases we can use a tree decomposition witnessing the low treewidth to split the problem into smaller subproblems.
We assume that programs have already been translated into equivalent 3AMC instances, where the propositional theory is a propositional formula in conjunctive normal form (CNF). A CNF is a set of clauses $C_i$ , interpreted as their conjunction, where a clause is a set of literals, that is, possibly negated propositional variables, interpreted disjunctively.
Example 11 Our running example for a CNF is the formula
Here, $v_1, v_2, x_1, \dots$ are propositional variables, and thus, $v_1, \neg v_1$ are both literals. Furthermore, $v_1 \vee x_1$ and $\neg v_1 \vee \neg x_1$ denote clauses.
For CNFs, the relevant underlying structure is often chosen as their primal graph.
Definition 4 Given a CNF $\mathcal {C}$ , the primal graph (Oztok and Darwiche Reference Oztok and Darwiche2014) of $\mathcal {C}$ is defined as the graph $G = (V, E)$ such that $V$ is the set of propositional variables occurring in $\mathcal {C}$ and $\{v,x\} \in E$ if $v$ and $x$ co-occur in a clause of $\mathcal {C}$ .
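Assuming a DIMACS-style encoding of a CNF as a list of clauses, each a list of non-zero integers (negative meaning negated), the primal graph can be built as follows (a sketch, not the aspmc implementation):

```python
from itertools import combinations

def primal_graph(cnf):
    """Return the edge set of the primal graph of `cnf`: two variables
    are adjacent iff they co-occur in some clause."""
    edges = set()
    for clause in cnf:
        variables = {abs(lit) for lit in clause}
        for u, v in combinations(sorted(variables), 2):
            edges.add((u, v))
    return edges
```

For instance, the clauses $v_1 \vee x_1$ and $\neg v_1 \vee \neg x_1$ both contribute the single edge $\{v_1, x_1\}$ , while a ternary clause contributes a triangle.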
Example 12 (cont.). The primal graph of $\mathcal {C}_{run}$ is given in Figure 1. Two of its tree decompositions are given in Figure 2. Its treewidth is 2 (Figure 2 left). A smaller width is not possible here, since the graph contains a clique over three vertices ( $x_1, x_2, x_3$ ). The decomposition shown in Figure 2 right is not optimal as it has width 4.
Typically in constrained compilation, for a 3AMC instance with propositional theory $T$ and, outer, middle, and inner variables $\mathbf {X}_{O},\mathbf {X}_{M},\mathbf {X}_{I}$ , respectively, one would first decide all variables in $\mathbf {X}_{O}$ , then all variables in $\mathbf {X}_{M}$ , and finally all variables in $\mathbf {X}_{I}$ . In general, this is necessary to preserve correctness for 3AMC evaluation. However, the idea of Kiesel et al. (Reference Kiesel, Totis and Kimmig2022) allows us to perform constrained compilation along a tree decomposition, where these ordering constraints are relaxed using defined variables.
Given a CNF $\mathcal {C}$ and a partition $\mathbf {X}_{O},\mathbf {X}_{M},\mathbf {X}_{I}$ of its variables $\mathbf {X}$ , a tree decomposition $(T, \chi )$ of the primal graph of $\mathcal {C}$ is a $\mathbf {X}_{O} \gt \mathbf {X}_{M} \gt \mathbf {X}_{I}/D$ tree decomposition, if
1. there exists $t_{O} \in V(T)$ such that
(a) $\chi (t_{O}) \subseteq \mathbf {X}_{O} \cup D(C, \mathbf {X}_{O})$ , and
(b) every path from $\mathbf {X}_{O}$ to $\mathbf {X}\setminus (\mathbf {X}_{O} \cup D(C, \mathbf {X}_{O}))$ in the primal graph of $\mathcal {C}$ uses a vertex from $\chi (t_{O})$ ,
2. and there exists $t_{M} \in V(T)$ such that
(a) $\chi (t_{M}) \subseteq \mathbf {X}_{M} \cup D(C, \mathbf {X}_{O} \cup \mathbf {X}_{M})$ , and
(b) every path from $\mathbf {X}_{M}$ to $\mathbf {X}\setminus (\mathbf {X}_{O} \cup \mathbf {X}_{M}\cup D(C, \mathbf {X}_{O} \cup \mathbf {X}_{M}))$ in the primal graph of $\mathcal {C}$ uses a vertex from $\chi (t_{M})$ .
This property guarantees the following: during constrained compilation, we can decide all variables in $\chi (t_{O})$ (as in 1. (a)) and be sure that the remaining CNF decomposes into strongly connected components that either contain only variables in $\mathbf {X}_O \cup D(C, \mathbf {X}_{O})$ or no variables in $\mathbf {X}_O \cup D(C, \mathbf {X}_{O})$ . Intuitively, we separate the outer variables from the remaining ones (modulo definition). An analogous statement can be made for $\chi (t_{M})$ (as in 2. (a)).
The same arguments as in Kiesel et al. (Reference Kiesel, Totis and Kimmig2022) allow us to conclude that we can use $\mathbf {X}_{O} \gt \mathbf {X}_{M} \gt \mathbf {X}_{I}/D$ tree decompositions to compile 3AMC instances and obtain a correct result.
As with 2AMC, definability helps in solving the task.
Example 13 (cont.). Consider a 3AMC instance with propositional theory $\mathcal {C}_{run}$ , and outer, middle, and inner variables
If we disregard definability, an optimal tree decomposition that implements the order constraints is the right decomposition in Figure 2. We have as $t_{O}$ the first bag from the top and as $t_{M}$ the third bag from the top. This is optimal since, without definability, the minimal subset of $\mathbf {X}_{O}$ that separates $\mathbf {X}_{O}$ from $\mathbf {X}_{M} \cup \mathbf {X}_{I}$ is $\mathbf {X}_{O}$ itself. Additionally, since we need to satisfy all properties of a tree decomposition, we need a bag that contains all variables in $\mathbf {X}_{O}$ and $x_1$ .
On the other hand, if we exploit definability, the left decomposition in Figure 2 is allowed. Due to the clauses $v_1 \vee x_1$ and $\neg v_1 \vee \neg x_1$ , $x_1$ is defined in terms of $v_1$ and therefore $x_1 \in D(\mathcal {C}_{run}, \mathbf {X}_{O})$ . Then, we can choose the second bag from the top as $t_{O}$ and the third bag from the top as $t_{M}$ . Clearly, $\{x_1\}$ already separates $\mathbf {X}_{O}$ from the remaining non-defined variables.
To generate $\mathbf {X}_{O} \gt \mathbf {X}_{M} \gt \mathbf {X}_{I}/D$ tree decompositions, we apply Algorithm 1 to the CNF generated by aspmc. We first choose the variables to decide first, in order to separate the outer variables $\mathbf {X}_O$ from the remaining ones (lines 1–3), and generate a decomposition $TD_O$ for $G_O$ , the part of the primal graph $G$ that we “cut off” using the separator $S_O$ (lines 4–7). Here, SCC $(G)$ denotes the set of strongly connected components of $G$ , and CLIQUE $(V)$ the complete graph over the vertices $V$ . Furthermore, MINIMUMSEPARATOR $(G,V,W)$ is a polynomial subroutine, based on a standard min-cut/max-flow algorithm, that computes a minimum set of vertices from $V \cup W$ separating all vertices in $W$ from any other vertices in $G$ , and TREEDECOMPOSITION $(G)$ computes a decomposition of $G$ .Footnote 1 We repeat this idea for the middle variables (lines 8–13), obtaining a decomposition $TD_M$ . Note that in this step we consider variables that are defined in terms of both the outer and the middle variables, allowing additional freedom. Next, we generate a decomposition $TD_I$ for the remaining inner variables, and finally combine all three decompositions into one final decomposition of $\mathcal {C}$ . For this last step it is important that we add the cliques to $G_O, G_M$ , and $G_I$ , since these intuitively mark the points at which we connect the partial decompositions.
Example 14 (cont.). Given the CNF $\mathcal {C}_{run}$ and the partition
Algorithm 1 would choose $S_O = \{x_1\}$ and produce $TD_{O}$ as the first two rows of the left tree decomposition in Figure 2. $S_M$ would be chosen as $\{x_2, x_3\}$ , and the decomposition contains the third bag from the top of the left tree decomposition in Figure 2 as a single bag. Finally, $TD_I$ could consist of the bottom four rows. Combining them would result in the left decomposition in Figure 2.
The last step in our implementation is to evaluate our 3AMC instances over the constrained circuits produced by c2d (Darwiche Reference Darwiche2004). For this, we use the standard approach of Kiesel et al. (Reference Kiesel, Totis and Kimmig2022).
7 Experiments
We implemented the two aforementioned algorithms in Python. We integrated the enumeration-based algorithm into the open-source PASTA solver (Azzolini et al. Reference Azzolini, Bellodi, Riguzzi, Gottlob, Inclezan and Maratea2022), which leverages clingo (Gebser et al. Reference Gebser, Kaminski, Kaufmann and Schaub2019) to compute the answer sets.Footnote 2 The algorithm based on 3AMC is built on top of aspmc (Eiter et al. Reference Eiter, Hecher and Kiesel2021) and we call it aspmc3.Footnote 3 In the discussion of the results, we denote them as PASTA and aspmc3, respectively. We ran the experiments on a computer with an Intel® Xeon® E5-2630v3 running at 2.40 GHz with 8 GB of RAM and a time limit of 8 h. Execution times are computed with the bash command time; we report the real field. We generated six synthetic datasets for the experiments. In the following examples, we report the aspmc3 version of the code. The programs are the same for PASTA except for the negation symbol: $not$ for PASTA and $\backslash +$ for aspmc3. All the probabilities of the probabilistic facts are randomly set; in the following snippets, we use the values 0.1, 0.2, 0.3, and 0.4 for conciseness. Moreover, every PASP obtained from every strategy has at least one answer set per world.
As a first test (t1), we fix the number $n$ of probabilistic facts to 2, 5, 10, and 15 and increase the number $d$ of decision atoms from 0 until we get a memory error or reach the timeout. We associate a utility of 2 to $qr$ and −12 to $nqr$ , as in Example 6. We use $da/1$ for decision atoms and $a/1$ for probabilistic facts. We identify the different individuals of the programs with increasing integers, starting from 0, and, for each of these, we add a rule $qr \,\,{:\!-}\ a(j), da(i)$ if $i$ is even and two rules $qr \,\,{:\!-}\ da(i), a(j), \backslash + nqr$ and $nqr \,\,{:\!-}\ da(i), a(j), \backslash + qr$ if $i$ is odd, where $j = i \% n$ . For example, with $n = 2$ and $d = 4$ , we have:
In a second test (t2) we consider a dual scenario w.r.t. t1: here, we fix the number of decision atoms to 2, 5, 10, and 15, and increase the number of probabilistic facts from 0 until we get a memory error or reach the timeout. The generation of the rules follows the same pattern of t1, but with probabilistic facts swapped with decision atoms. Similarly, we still consider two utility attributes. For example, with $n = 4$ and $d = 2$ , we have:
In a third test (t3), for every index $i$ , we insert both a probabilistic fact and a decision atom. We add a rule $qr \,\,{:\!-}\ a(i), da(i)$ if $i$ is even and two rules $qr \,\,{:\!-}\ a(i), da(i), \backslash + nqr$ and $nqr \,\,{:\!-}\ a(i), da(i), \backslash + qr$ if $i$ is odd. Moreover, we add a rule $rda(i) \,\,{:\!-}\ da(i)$ for every $i$ and associate a random utility between −10 and 10 to each $rda(i)$ , in addition to $qr/0$ and $nqr/0$ . For example, with $n = d = 4$ we have:
In a fourth test (t4), we adopt the same setting of t3 but the utilities are given only to $qr/0$ and $nqr/0$ .
In another test (t5), we consider three rules: $qr \,\,{:\!-}\ \bigwedge _{i \ even} a(i), da(i)$ , $qr \,\,{:\!-}\ \bigwedge _{i \ odd} a(i), da(i), \backslash + nqr$ , and $nqr \,\,{:\!-}\ \bigwedge _{i \ odd} a(i), da(i), \backslash + qr$ , and a rule $rda(i) \,\,{:\!-}\ da(i)$ for every $i$ . Utilities are associated with $rda/1$ atoms (a random integer between −10 and 10), $qr/0$ (utility 2), and $nqr/0$ (utility −12). Here as well we test instances with increasing maximum index until we get a memory error or reach the timeout. For example, with 4 decision atoms and 4 probabilistic facts we have the following program:
In a last test (t6), we consider instances of Example7 with an increasing number of people and products involved. We associate a probabilistic fact $shops(i)$ and decision $target(i)$ for every individual $i$ , where $i$ ranges between 1 and $n$ , the size of the instance. Then, for each $i$ , we add a rule with two atoms in the head that define two different items that a person $i$ can buy, as in Example7. Finally, there is an additional rule $rbi(j) \,\,{:\!-}\ item(j)$ for every item $j$ and a utility attribute for every decision atom and $rbi/1$ . For example, the instance of size 10 has 10 probabilistic facts, 10 decision atoms, 10 rules $buy(item(a),i) ; buy(item(b),i) \,\,{:\!-}\ target(i), shops(i)$ where $i$ is the person and $item(a)$ and $item(b)$ represent two different products, randomly sampled from a list of 100 different products, 10 utility attributes of the form $utility(target(i),vt)$ , one for every person $i$ where $vt$ is a random integer between −5 and 5, a rule $rb(j) \,\,{:\!-}\ buy(item(j),i)$ for every selected product ( $2 \cdot 10$ in total) $item(j)$ and person $i$ , and one utility attribute $utility(rb(j),vu)$ for every such rule where $vu$ is a random integer between −10 and 10.
Figures 3, 4, and 5 show the execution times of the six tests: PASTA cannot scale beyond programs with more than 20 decision atoms and probabilistic facts combined. aspmc3 can manage larger instances in terms of decision atoms and probabilistic facts in a fraction of the time required by PASTA. To better assess the performance of aspmc3, we also grouped and reported the results in Figure 6. From Figure 6a we can see that the instances of $t1$ take more time than those of $t2$ with the same index, so the number of decision atoms has a greater impact on the execution time than the number of probabilistic facts (their total number being the same). For $t3$ and $t4$ , the execution times are similar, reaching the memory limit at instance 19. Nevertheless, for size 18, aspmc3 took approximately 200 s (Figure 6b) to complete: this shows that the program is able to manage up to 18 decision atoms and probabilistic facts relatively quickly. For $t5$ , the execution time is almost constant: this is due to the knowledge compilation step, since the program is composed of only three rules whose bodies have increasing length. For $t6$ , aspmc3 reaches the memory limit at instance 16. For the instance of size 15, the execution time is approximately 360 s. Table 6 lists all the tests with the number of decision atoms, probabilistic facts, and utility attributes for the corresponding largest solvable instance. Overall, the adoption of 3AMC yields a substantial improvement over naive answer set enumeration, even if it requires a non-negligible amount of memory.
8 Related work
This work is inspired by DTProbLog (Van den Broeck et al. Reference Van den Broeck, Thon, van Otterlo and De Raedt2010). If we only consider normal rules, the decision theory task can be expressed with both DTProbLog and our framework, but our framework is more general, since it admits a large subset of the whole ASP syntax.
The possibility of expressing decision theory problems with ASP has gathered substantial research interest over the past years. Brewka (Reference Brewka2002) extended ASP by introducing Logic Programming with Ordered Disjunction (LPODs), based on a new connective called ordered disjunction that specifies an order of preference among the possible answer sets. This was the starting point for several works: Brewka (Reference Brewka2003) proposes a framework for quantitative decision making, while Confalonieri and Prade (Reference Confalonieri and Prade2011) adopt a possibilistic extension of LPODs. Graboś (2004) also adopts LPODs and casts the decision theory problem as a constraint satisfaction problem with the goal of identifying the preferred stable models. Differently from these works, we define uncertainty using probabilistic facts and possible actions using decision atoms, and we associate weights (utilities) with (some of the) possible atoms. Moreover, we do not define preferences over the answer sets, but rather over the possible combinations of decision atoms (strategies) and combinations of probabilistic facts (worlds). Preferences among the possible answer sets can be encoded through weak constraints (Buccafurri et al. Reference Buccafurri, Leone and Rullo2000; Calimeri et al. Reference Calimeri, Faber, Gebser, Krennwallner, Leone, Ricca and Schaub2020), which are a standard feature of ASP systems. $\mathrm {LP}^{\mathrm {{MLN}}}$ (Lee and Wang Reference Lee and Wang2016) and P-log (Baral et al. Reference Baral, Gelfond and Rushton2009) are two languages to represent uncertainty with ASP, whose relationships, also with weak constraints, have been extensively explored (Balai and Gelfond Reference Balai, Gelfond and Kambhampati2016; Lee and Yang Reference Lee and Yang2017). The former associates weights with atoms and rules while the latter adopts so-called random selection rules.
$\mathrm {LP}^{\mathrm {{MLN}}}$ supports the computation of the preferred stable models, according to a weight obtained by considering the weighted atoms present in an answer set. However, it does not consider the decision theory task. Similar considerations hold for P-log.
We discuss normalization to handle programs where some worlds have no stable models, an approach that has already proved effective in related scenarios (Fierens et al. Reference Fierens, Van den Broeck, Bruynooghe and De Raedt2012). There are other solutions that can be adopted in the case of inconsistent programs, such as trying to “repair” them (Lembo et al. Reference Lembo, Lenzerini, Rosati, Ruzzi and Savo2010; Eiter et al. Reference Eiter, Lukasiewicz and Predoiu2016). However, these are often tailored to the considered program. Rocha and Cozman (Reference Rocha and Cozman2022) propose the adoption of least undefined stable models, that is, partial models where the number of undefined atoms is minimal. With this, it is still possible to assign a semantics to programs where some worlds are inconsistent, and the probability of a query is still defined in terms of lower and upper probability bounds. However, they do not develop an inference algorithm. Another possibility is the one proposed by Totis et al. (Reference Totis, De Raedt and Kimmig2023): the authors present smProbLog, an extension of the ProbLog2 framework (Dries et al. Reference Dries, Kimmig, Meert, Renkens, Van den Broeck, Vlasselaer and De Raedt2015) that allows worlds to have more than one model or even no models. Their approach consists in uniformly distributing the probability of a world among its stable models and in considering three truth values for an atom (true, false, and inconsistent), with three associated probabilities. In this way, every atom has a sharp probability value. They also propose a practical inference algorithm. In general, these two approaches do not support the whole ASP syntax.
We propose how to find the strategy that maximizes the lower expected reward and the one that maximizes the upper expected reward. More generally, Augustin et al. (Reference Augustin, Coolen, De Cooman and Troffaes2014) distinguish between non-sequential and sequential decision problems. In non-sequential problems, the subject must choose only one of a number of possible actions, each of which leads to an uncertain reward. If a subject can express her beliefs through a probability measure, then a common solution is for her to choose the act that maximizes her expected utility. However, there are occasions when information cannot be represented through a linear prevision, but instead by a more general uncertainty model. In such circumstances, we would like to identify a best act in any choice. For example, suppose that one must choose between two gambles. To identify the preferred gamble from a particular set, we can specify gambles that we do not want to choose. The optimal set of gambles is what remains after all unacceptable gambles are eliminated. In sequential decision problems, the subject may have to make more than one decision, at different times: this problem is displayed on a decision tree, having branches representing acts and events. At the end of each path there is a number representing the utility reward of that particular combination of acts and events.
Markov Decision Processes (MDPs) represent a class of sequential decision-making problems in a stochastic environment. A planning agent has to deliberate over its model of the world to choose an optimal action in each decision stage, in order to maximize its accumulated reward (or minimize the accumulated cost) given the immediate and long-term uncertain effects of the available actions. There are situations in which it is not easy (or even impossible) to define a precise probability measure for a given transition. In this case, it is necessary to consider a more general version of an MDP known as Markov Decision Process with Imprecise Probabilities (Bueno et al. Reference Bueno, Mauá, Barros and Cozman2017): in this model, the probability parameters are imprecise, so the transition model cannot be specified by a single conditional distribution but must be defined by sets of probabilities for each state transition, which are referred to as transition credal sets. Bueno et al. (Reference Bueno, Mauá, Barros and Cozman2017) propose a novel language based on Probabilistic Logic Programming, enhanced with decision theoretic constructs such as actions, state fluents, and utilities. They consider interval-valued probabilities attached to independent facts, that is, a fact is associated with a probability interval, $[\alpha, \beta ] :: p$ , where $p$ is an atom and the parameters $\alpha$ and $\beta$ are probability bounds such that $0 \leq \alpha \leq \beta \leq 1$ . In the case of $\alpha =\beta$ , one has a standard probabilistic fact. The semantics of a probabilistic logic program with interval-valued facts is the credal set that consists of all probability distributions that satisfy the constraints (that is, whose marginal probabilities for the facts lie within the given intervals). In our work, we share the use of the credal semantics; however, probabilistic facts are annotated with a single probability value.
9 Conclusions and future works
In this paper, we discussed how to encode and solve a decision theory task with Probabilistic Answer Set Programming under the credal semantics. We proposed the class of decision theoretic probabilistic answer set programs, that is probabilistic answer set programs extended with decision atoms, representing the possible actions that can be taken, and utility attributes, representing the rewards that can be obtained. The goal is to find the two sets of decision atoms that yield the highest lower bound and the highest upper bound for the overall utility, respectively. We developed an algorithm based on three layers of Algebraic Model Counting and knowledge compilation and compared it against a naive algorithm based on answer set enumeration. Empirical results show that our approach is able to manage instances of non-trivial sizes in a reasonable amount of time. A possible future work consists of proposing a formalization of the decision theory task also for other semantics for probabilistic ASP. Moreover, we could consider decision problems also in the case that the probabilistic facts are annotated with probability intervals, exploiting the results for inference in Azzolini and Riguzzi (Reference Azzolini and Riguzzi2024).
Acknowledgements
This work has been partially supported by Spoke 1 “FutureHPC & BigData” of the Italian Research Center on High-Performance Computing, Big Data and Quantum Computing funded by MUR Missione 4 – Next Generation EU (NGEU) and by Partenariato Esteso PE00000013 – “FAIR – Future Artificial Intelligence Research” – Spoke 8 “Pervasive AI” – CUP J33C22002830006, funded by MUR through PNRR – M4C2 – Investimento 1.3 (Decreto Direttoriale MUR n. 341 of 15th March 2022) under the Next Generation EU (NGEU). Damiano Azzolini and Fabrizio Riguzzi are members of the Gruppo Nazionale Calcolo Scientifico – Istituto Nazionale di Alta Matematica (GNCS-INdAM).
Competing interests
The authors declare none.