I've been puttering for a while with a VS Code extension for yscript, and I wanted to talk a bit about my motivations, yscript, and the technicalities of the project. This post is a companion to my short talk at ProLaLa 2023.
It looks like this:
yscript is developed by Andrew Mowbray at AustLII as part of DataLex, a broader toolkit for rule-based legal AI. Beyond yscript, DataLex also includes tools for running dialogs through a friendly chat interface, automatically processing legislation into “first-cut” yscript code, visually cross-referencing facts, and more cool stuff.DataLex has good ideas, and a lot of what I'm trying to do is to just tighten the feedback loop on things DataLex does already.
For this project, we deal with a relatively small subset of yscript:
- A yscript program is a list of rules. Rules have names, which are fragments of natural language called “descriptors”.
- A rules contains a list of statements, which are either
ONLY IF. For our purposes these correspond roughly to unidirectional and bidirectional implication, respectively.
- Statements are composed of boolean expressions over facts, which are logical propositions. Facts, like rules, are named by descriptors.
Recall that execution of a yscript program results in a dialog. The dialog begins with a rule declared as a
GOAL RULE take an umbrella PROVIDES... or the first rule in the program if no
GOAL exists. Each statement in the rule is executed in order. If a fact is encountered that has no recorded value, the user will be asked to provide one. After the user answers a question, if the new fact provides enough information to infer additional facts, those inferences are made. Evaluation then restarts from the beginning with the new facts,Coding in yscript, 73. which means we don't need to worry about retractions. yscript also doesn't allow recursion,Coding in yscript, 77. so negation semantics are simple.
Explanation and understanding
Explainability is a bullet point you'll find on the website of any self-respecting system for computational legal reasoning. It's easy to agree with the idea that we ought to be able to explain the findings of this kind of system. But I do get the feeling that we're talking past each other a bit, that we don't quite agree on what “explanation” means. The most common thing seems to be evaluation traces, which are helpful but limited, as I'll mention later. I think of my own project here as bearing on the idea of explainability, so I want to be clear on what I mean by it, and what I think explanations ought to do in the legal domain.
Here's one (paraphrased) definition of explanation:Daniel A. Wilkenfeld, “Functional Explaining: A New Approach to the Philosophy of Explanation,” Synthese 191, no. 14 (September 1, 2014): 3367–91, https://doi.org/10.1007/s11229-014-0452-z.
- To explain X is to produce a representation of X that, when internalized by the audience, produces improved understanding of X.
- Whether a representation suffices for understanding is determined by what the understander can do with that representation in contextually relevant counterfactuals.
So understanding is constitutive of explanation: to count as an explanation, it has to improve understanding. We can consider understanding to be constituted by internalized representations which are useful. Useful representations help you to accomplish something, e.g. make predictions, in meaningfully different circumstances (that is, in “contextually relevant counterfactuals”).
This is, to my mind, a helpful definition of explanation.A useful representation of explanation, even. But the concept needs some extra colour in the domain of legal reasoning, where we aren't (or shouldn't be) concerned so much with correctness as judgment. Some insight emerges from the LKIF and Carneades projects on this point:Thomas F. Gordon, “Constructing Legal Arguments with Rules in the Legal Knowledge Interchange Format (LKIF),” in Computable Models of the Law: Languages, Dialogues, Games, Ontologies (Berlin, Heidelberg: Springer-Verlag, 2008), 162–84, https://doi.org/10.1007/978-3-540-85569-9_11.
- “[…] legal reasoning cannot be viewed, in general, as the application of some deductive logic, such as first-order predicate logic, to some theory of the facts and relevant legal domain.”
- “Legal rules are neither material implications nor procedures for updating variables in working memory, but rather schemes for constructing legal arguments.”
For law, the issue isn't merely explanation but justification. It is precisely the lack of exact, inarguably correct logic that makes law work: argument, judgment, and contestation are foundational to rule of law.A critique of computational legalism is in Laurence Diver, “Digisprudence: The Design of Legitimate Code,” Law, Innovation and Technology 13, no. 2 (July 3, 2021): 325–54, https://doi.org/10.1080/17579961.2021.1977217. The thing we want to explain isn't so much the logical validity of a particular legal finding (which usually isn't even provable per se), it's the structure of the space of potential arguments.
This is why I think evaluation traces don't go far enough as explanations. They do supply correct and relevant information, but not all of it: they don't tell you anything about counterfactuals or about other potential lines of argument. It's somewhat helpful, when you're trying to demonstrate q, to be told that q must be true because p is true and p implies q. It is much more helpful to be told that:
- p also implies r (which you didn't know you should ask about).
- Even if p were false, s (which you didn't know existed) also implies q, so you could try that instead.
- There are ways to show t (which is the thing you actually need to demonstrate, although you didn't know that), without needing to say anything about q.
This is the holistic kind of explanation, raising relevant counterfactuals whether you thought to ask about them or not, that we expect from lawyers. It's the kind of explanation we really ought to expect from our computational systems as well.
Declarative code lends itself nicely to visualization. For logic programs in particular, the classic boxes + wires approach is convenient:
We get the wires for free! It's easy to analyze a logic program and figure out the dependencies between rules - that's the whole idea.This is also easy for predicate logic, and I've done it for Epilog, but I want to stick with propositions for now. Plot your rules along such a dependency graph and hey presto.
As for the rules themselves, we use a Tree-sitter parser to build a concrete syntax tree from the source text, then derive HTML views of each rule in that tree. Tree-sitter is very fast, so we can run it on every keystroke and maintain the correspondence between the graph and the source text. Because Tree-sitter produces a concrete syntax tree, we know exactly what range in the source text each syntax node originates from, so we have enough information to navigate from a rule or fact in the graph back to its source. This is useful on a larger program:
Rules are only half the story. Alongside each referenced fact in the rules, we show the current value for that fact. Fact values might be either explicitly asserted by the user or inferred from assertions. Explicit assertions are rendered in bold. We can also select a goal rule and start looking for fact values. This will eventually lead us to derive a value for the destination fact(s) of our goal rule in the same manner as a usual yscript dialog:
But we can also assert a value for any fact we want, not just the one yscript is asking about. We can even assert a value for a derived fact, and skip its dependencies. If, in this way, we give yscript enough information to answer its own question, it can move on to the next:
This is powered by Z3, which can solve quantifier-free finite domain logic, which has declarative semantics pretty compatible with yscript's procedural semantics. It's also very convenient to set up sessions with Z3 every time a fact changes, so we don't need to rewind yscript evaluation or maintain a bunch of incremental state.Although Z3 does have incremental modes, if we wanted to use them. Z3 can also prove unsatisfiability, which makes it easy to prevent contradictory assertions: when we assert a fact, we first check that after doing so, our constraints will remain somehow satisfiable. This is what lets us jump around the graph and enter facts wherever we want.
This coordination of procedural and declarative representations of rules is powerful. We get the benefit of being guided through the rules, but we don't need to give up control. If we see something we're not being asked about but which we find interesting, we can always take the wheel, and give it back when we're done. If we're being asked about things we aren't interested in, we can assert something lower down the tree and move on. We get to use the computer, rather than the other way around. There are lots of promising directions to take this kind of tooling:
- More sophisticated logics could be supported. I started with propositional logic because it's easiest, but I have the bones of similar tooling for predicate logicVia Epilog. as well.
- The graph could be part of an ordinary web page. VS Code provides useful features for developers, but for people who don't want to edit the rules, there's no need for it.Even for developers, VS Code is probably not open enough to be relied upon as a member of the rules as code ecosystem.
- Dynamic forms could be created all at once rather than, as with the dialog-based approach, one question at a time. We have, at any given time, enough information to compute which facts could change the outcome, and no inherent need to impose an order on answering them.
- Quality of life improvements: mousing over a fact could highlight all of its dependents and dependencies, subtle animations could make moving parts easier to track, the list of known fact values could be displayed all in one place, etc.
In general, ease of static analysis is one of the major advantages of rules as a control structure, and I think we could lean on it a lot harder in our interfaces. I've found this to be a really nice way of interacting with rules, and I hope to see (and do!) more projects along these lines.