[MUSIC]
to introduce Madhusudan Parthasarathy who's visiting
us from Illinois, University of Illinois Urbana-Champaign.
He's been doing research in logics and decision procedures
for program verification, building verified systems.
Of late he's been doing synthesis, and
machine learning for invariants.
So today he's gonna talk about natural proofs and
quantifier instantiation.
>> Okay, so I'm gonna talk about this work
whose results are fairly new.
So the slides are raw, and the results are also a bit raw,
in the sense that we probably haven't understood the consequences
of the results completely to present them in a nice way.
Okay, so with that caveat,
this is joint work with Christof Löding at RWTH Aachen and Lucas Peña,
who's a student at the University of Illinois.
So I'm gonna be sticking to this verification context,
this deductive verification.
The problem is you have a program,
let's say with assertions, that you wanna prove correct.
And the method we're going to follow is to annotate it with strong
enough inductive pre- and postconditions and loop invariants.
And then we generate verification
conditions, which involve only loop-free code, and
the underlying logic here can vary.
And then we want to validate the verification
conditions using a logic solver.
So what logic do we use for these specifications?
Actually there's quite a lot of choices.
The first kind of logics that come to mind are first order
logics.
But as we will see, this is not quite enough.
But traditionally if you really want to automate everything,
then you would like the verification conditions to be
essentially in a decidable logic.
And so quantifier-free logics with some background theories
are the logics that are desirable,
because those are fully automatable.
And especially if they're quantifier free,
then, provided they meet some conditions,
the theories are combinable.
And that gives you a wide variety of theories that you
could use and build decision procedures that are kind
of independent of each other.
Now, the attraction of decidable logics is that verification
after annotation is fully automated.
And that is actually quite useful, because when you
are actually verifying, you know you will get back an answer.
Well, time is one thing,
but you'll get back either a proof or an error;
it won't just keep on searching.
If it keeps on searching, then I don't know whether it's stuck or
maybe I wrote something wrong.
So there's a little bit of uncertainty when you use
a verification tool when you don't have decidable logics.
But I think decidability can also be a trap.
We need powerful enough logics to do program verification.
And I think, at least the natural logics
that we can use to express properties that we
wanna prove will turn out to be undecidable.
So for instance we do need quantifiers.
We need logics perhaps over certain
theories that are undecidable.
And typically, currently, I would say the reasons why we need
more expressive power than decidable logics give us are to state
properties about unbounded parts of your program configuration,
like arrays and data structures, and
the dynamically changing heap.
And even if you kind of step away from this by
assuming certain libraries which manipulate state
and keep these structures for you in some abstract way,
you still need to argue about sequences, sets, functions,
pure recursive functions over your data structures, and so on.
And these are typically not going to
be in decidable theories.
So there is this obsession with decidability, and
I've been part of that.
We've been trying to work with decidable logics as much as
possible.
And decidability, as I said, when achievable is very good.
It gives you predictability.
It also gives you engineering of efficient tools because you
can have competitions.
And I
think it's a little harder to have competitions if on some
instances you are going to fail to prove anything at all.
I think it kind of directs the engineering task better when you
know it's a decidable problem.
>> So decidability doesn't mean efficiency, right?
>> No it does not, it's more like a heuristic for
building better tools.
It kind of guides you.
And we have found that to be the case that
over time we've got better solvers with decidable logics.
We don't know how these mechanics work, but
it seems to be true.
>> Can we say that the provers operate in a more predictable
way on deciding these fragments?
>> Yeah, I think predictability.
>> Can we quantify that?
>> I don't know.
I don't know if you can quantify that.
>> Temperatures are predictable.
>> [LAUGH] Well it depends on what you call predictable.
I think you have a lower bar for
predictability than what we meant.
[LAUGH] >> Let's just talk about
variance.
>> [LAUGH] >> So
is the variance greater on a collection of instances,
if you keep doing runs with different random seeds,
on undecidable [INAUDIBLE]?
>> So this agenda of building decidable theories, the ultimate
question is, can we, over time, really obtain a powerful enough
decidable logic using which we can prove all that we need to?
And I think that question is not clear.
Everyone makes an epsilon step,
does one more decidable logic which is slightly more powerful.
It's not clear where this whole thing is headed.
So in my view, the answer seems to be no.
I don't think we are heading towards
the wonderful decidable logics that will solve all
the verification problems [INAUDIBLE].
>> I think on your last slide, it's not just the slight [INAUDIBLE], but
they're often incomparable.
>> They're incomparable as well.
Well, typically you kind of branch out,
maybe into incomparable logics, and then they'll kind of merge, and
sometimes you do get reasonably robust things.
>> I'm not even sure what this means.
You're saying you want a decidable logic for
your verification conditions, but
the verification conditions depend on the proof you do.
>> Yes.
>> And so you're really saying you want to be able to get your
proof by some manual steps down to the decidable fragment as
easily as possible.
In which case you can never really say you solved
the problem.
You just say it gets easier and easier.
>> Right, but yeah, so I'm being vague here so
I hope that's okay.
I'm not making any concrete statements,
I'm just trying to motivate natural proofs,
which is a different way of looking at things.
So the statements I make are gonna be
a little vague at this point.
Okay, so we worked on decidable logics for
heaps for a long time, and we had this logical strand, and
there have been lots of logics for heaps,
including logics based on separation logic.
Several decidable fragments have emerged.
But I don't see this converging towards a powerful enough logic
which is also decidable.
So, two things which I notice are that, as you
get more and more expressive in your logic, the constraints that
you need to put on your syntax become more and more awkward.
You're not just given a set of operators and
told to combine them however you want.
You have some
bizarre side conditions that need to be satisfied.
The second thing, which I find more troubling, is that
the decision procedures get more and more complex, right?
By that I mean that
even if you had an instance which you could solve earlier, right,
the new decision procedure, which handles a larger
logic, right,
takes much longer on that same problem,
that same instance, right?
The more general way of solving things
makes the procedures tend to work slower, okay?
They have to solve a larger class of problems more uniformly,
and so, at least in our experience [INAUDIBLE]
you start using all kinds of background machinery,
like monadic second-order logic and stuff like that,
stitching these things up to get more expressiveness, and
then they work really badly on even simple instances, right?
And I think one should remember that decidability is a heuristic
here, automation is the goal,
decidability is not the goal, right?
So here's the other way of looking at it,
which I've been working on for several years.
The idea in decidable logics is that you want to curtail the
expressive power of your logic so that it becomes automatable;
we automate the reasoning.
Now natural proofs, the idea is different.
It's that you allow a logic to be quite powerful, right?
You might restrict the style, but
you don't typically restrict the expressive power, right?
But in order to automate what you do,
is you look for a sub class of proofs, right?
And these proofs are going to be simpler to search for, right?
And if you find a simple proof for your theorem,
you're gonna say, yes, I found it.
And if you don't find a proof, then it's not gonna work,
so you're gonna give up, right?
So these are more like sound but incomplete ways, right,
of solving problems, right?
But we like to build robust classes nevertheless and
possibly even a hierarchy of classes.
I look for the simplest proofs, and if I don't find one,
I enlarge the class of proofs I search for.
So two things here.
Logic doesn't get awkward, right,
you write what you want, given the style.
And the proofs are not going to get complex because we're kind
of searching for simple proofs only.
>> So when you say you have a growing class,
does that mean the user's involved with the proof?
>> No. >> So you're not gonna interact
>> No, so,
typically you search for a simple proof, you don't find it.
Like, quantifier instantiation is one, right?
You could instantiate more, right?
You could unfold more, right?
>> That's the natural proof of- >> So when you say a simple
proof, does it mean the simpler the proof is, the easier it is
for the user to sort of have confidence in the proof, or-
>> No, it's just that decision
wise, it's simple to search for a proof of that type, right?
The complexity of searching for
a proof of that kind is- >> But
that would depend on the search right?
>> True. Right.
But
it's gonna be a subclass of proofs.
I think that's the main thing, right?
Okay so we've been working on natural proofs for some time and
this is not practice as in industry practice, right?
It's basically trying to make this work on a set of
benchmarks.
So what we have settled on is really first order logic
with recursive definitions and background theories.
And we've found that this is good enough to express
properties about heaps.
So we have a dialect of separation logic called Dryad
that compiles into this logic.
Right, I don't know how to do automation on separation logic
or other logics directly.
I know only classical logic, so
I'm going to stick to classical logic.
Right, so but we do have more than first order logic.
We do have first order logic with recursive definitions,
right?
And the heuristics that we use to prove
a theorem are what we found in practice people do.
If you give it manually to a person to prove it,
what tactics would they use?
And typically what they use in the world of recursive
definition is, unfold these recursive definitions.
Right, typically a few times, right?
And then they do some kind of unification, right?
At the level of term rewriting,
you can think of this as unification.
But it's basically,
you can kind of mimic this as interpreting the recursive
definitions that you have as uninterpreted functions, right?
That kind of mimics the unification things that they do.
So we looked at proofs of how people normally prove things and
the simple proofs are really based on this.
This was enough to prove most of the things.
And so the resulting formula is, so
what we do is basically unfold the recursive definitions.
If I have a recursive definition for list,
I will just unfold it a few times, right.
And then I would basically interpret the recursive
definitions as uninterpreted functions.
And I'm in a refutation based context.
So I'm looking at a formula and trying to prove it's valid
by showing its negation is unsatisfiable.
So if doing this already gives me an unsatisfiable formula,
I'm done because I've shown that the theorem is valid.
So I don't need to actually unfold more.
I do use an abstraction,
but that abstraction is fine and kind of mimics what people do.
Moreover, the resulting formulas are actually in decidable
theories.
Right, so you can unfold and
then send it off to an SMT solver.
By the way, this is not at all something that only we do;
everyone essentially does this.
Natural proofs is more like a name for
talking about this in some way.
Dafny does it.
If you have recursive definitions in Dafny which
are pure functions, Dafny unfolds these functions and
gives them to an SMT solver.
Everyone does it.
>> The simplicity is measured by [INAUDIBLE],
is that the definition?
>> I think the simplicity has got to do with,
yeah, you could think about it that way.
But it's basically the fact that you are abstracting from what
you wanna prove, and thereby you are looking for
a simpler proof that that thing holds.
So if you don't succeed, it doesn't mean that the theorem doesn't
hold; you don't get concrete counterexamples or
anything like that, right?
Maybe the theorem is false, or
maybe you need to do more, right?
>> So one way maybe to think about this is, you're looking
for something, for an object, in the heap.
How much of the rest of the heap do you really care about, right?
>> Right, so the dialect we have actually
forces you to write recursive definitions so
that you don't rely on something very far away.
Right.
So typically, because of data structures, you can have
the quantification restricted in a guarded fashion.
So it refers to something you already have in your hand,
right?
And so Dryad actually does all that.
It makes sure that you don't do arbitrary quantification, so
that the definitions are more amenable to recursive proofs,
inductive proofs.
So here's an example, the definition of a list,
right.
Either x is nil and the heaplet of this list is empty,
I don't really need this,
or else x is not nil,
and the next of x is a list, and x does not belong to
the heaplet of that list, the list at next of x.
So this is the way you would do it in separation logic, right?
I'm just writing it using first-order logic.
And hlist is again defined using recursion: it's
either the empty set, if x is nil, or it's {x} union the rest.
Now, this turns out to properly define list, right?
hlist, however, makes sense only if x points to a list.
Otherwise you shouldn't use it, it may be spurious, right.
But the intended
semantics is the least fixed point semantics, right.
So we need first-order logic with least fixed points to
express these kinds of properties, right.
And we are using background theories here, sets, right.
I'm using union and so on, right?
But once I unfold things, then this will actually be in
a decidable theory, because this kind of theory of sets is
decidable.
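[For concreteness, a reconstruction of the two definitions as just described, with the heaplet written as a set-valued function; this is an editorial rendering of the slide, not a verbatim copy:]

\[
\begin{aligned}
\mathit{list}(x) \;&:=_{\mathrm{lfp}}\; \big(x = \mathit{nil} \land \mathit{hlist}(x) = \emptyset\big) \;\lor\; \big(x \ne \mathit{nil} \land \mathit{list}(\mathit{next}(x)) \land x \notin \mathit{hlist}(\mathit{next}(x))\big)\\
\mathit{hlist}(x) \;&:=_{\mathrm{lfp}}\; \mathrm{ite}\big(x = \mathit{nil},\; \emptyset,\; \{x\} \cup \mathit{hlist}(\mathit{next}(x))\big)
\end{aligned}
\]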
So the question we wanted to ask was why do these natural
proofs work so well, right?
When they do, right, and why do they fail so
miserably on simple properties?
So they don't actually work well all the time, okay.
There are very simple properties on which they don't work well.
And is there any completeness theorem we can prove about
the simple heuristic that we have, right?
That is, if we unfolded the definitions larger and larger,
can we prove every theorem, for example?
Okay, so what we have now is a set of theorems where
we can answer such questions, all right?
So this has got to do with quantifier instantiation, right?
Now, it turns out that unfolding recursive definitions is really
very similar to quantifier instantiation,
in particular formula-driven quantifier instantiation.
For instance, if I have this definition of list, right,
then unfolding it one more time, at next of x,
means writing out the definition for next of x, which is the same
as replacing x with next of x in this definition, right?
So unfolding is really like taking this
universally quantified variable x and instantiating it
with another term, which can be next of x.
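[As an illustration, dropping the heaplet part for brevity: treat the definition as a universally quantified iff,]

\[
\forall x.\; \mathit{list}(x) \leftrightarrow \big(x = \mathit{nil} \lor (x \ne \mathit{nil} \land \mathit{list}(\mathit{next}(x)))\big),
\]

[and instantiating \(x \mapsto \mathit{next}(x)\) yields]

\[
\mathit{list}(\mathit{next}(x)) \leftrightarrow \big(\mathit{next}(x) = \mathit{nil} \lor (\mathit{next}(x) \ne \mathit{nil} \land \mathit{list}(\mathit{next}(\mathit{next}(x))))\big),
\]

[which is exactly the formula obtained by unfolding the definition once more at next of x.]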
Now, formula-driven quantifier instantiation is
a common strategy for handling universal quantification.
It's incomplete, but
e-matching has existed for a long time.
And roughly, I'm going to follow this instantiation strategy:
I take any terms that I get in my formula, and
I instantiate the universal variables using
them, okay?
And that's similar to unfolding as well.
So here are the main contributions of this
work, right?
We explore various logics, like pure first-order logic,
first-order logic with background theories, and
first-order logic with recursive definitions.
And we show the completeness of formula-driven term
instantiation for fragments of these logics.
In particular, for
first-order logic with background theories, right,
we have a fragment called the safe fragment
that admits completeness, right?
And we can show that
the verification that we are doing kind of falls in that
class, modulo recursive definitions, right?
Once you get rid of recursive definitions, and
I'll tell you how, right, it kind of falls in this class,
and this roughly explains why natural proofs are actually so
powerful, because they're actually complete in that sense.
The second thing is that we will not
have completeness results for
first-order logic with recursive definitions.
It's clear that you cannot build a complete procedure for
first-order logic with recursive definitions, right. And so
we mitigate this using certain
techniques, which are related to invariant
synthesis as well, right? And so
it seems to be a new kind of approach.
But a lot of people have been doing this in
the theorem-proving community already.
I think what we are doing differently is doing it in the first-order
context, right,
so that we reduce this to first-order reasoning for
which we can prove completeness, okay.
So first a reminder of classical results.
And I'm sorry if you know this very well.
It's just that I've been with audiences where they may not
remember it.
So first of all, first-order logic admits sound and
complete reasoning.
This is Gödel's completeness theorem, right?
First-order logic is undecidable, right?
And validity of first-order logic is RE
because it has a sound and complete proof system.
You know that you can enumerate proofs and
prove any theorem in first-order logic, right?
So validity is in RE and satisfiability is not in RE
because it's not decidable, all right?
And it also turns out that recursive definitions and least
fixed point cannot be expressed in first-order logic, right?
And there is no complete procedure for
first-order logic with least fixed points.
You can encode non-halting of Turing machines or something like that
and then show that you cannot have an RE procedure for it.
And so for first-order logic with least fixed points in full
generality, there cannot be a complete procedure for
it, right?
You can't hope to prove every theorem. But
for first-order logic, right, building complete proof
procedures is theoretically possible, right?
You can just enumerate proofs.
And if you have your axioms, and the axioms are enumerable, right,
which they typically are, even for
the background theories, right,
you can just keep on enumerating proofs and
you will find one at some point, but it's not useful in practice.
You're not gonna do this, so we are not interested in
just any enumeration which happens to be complete.
What we want to know is whether quantifier
instantiation, which
seems to be particularly useful in practice,
is actually complete.
Now, though I just said it should be possible,
there's actually a fundamental problem with doing this for
verification logics, right?
Because consider our logic for verification conditions,
which is powerful enough that it's undecidable, okay.
Now, unlike general first-order models, the models that we deal with in
verification are finite and enumerable, right.
The program configuration is not an infinite object.
It is a finite object, right.
It can have integers, it can have unbounded integers, but
still, it's finite, right.
And finite structures you can enumerate.
So I can enumerate all models, which means satisfiability
for the logic is in RE, right.
So validity cannot be in RE, because otherwise the logic
would be decidable, okay?
So the fact that these models are finite means validity
becomes harder.
Right?
And I kind of lose the lovely thing I had in first-order
logic, that
you had a complete procedure for validity, for first-order logic.
I cannot actually take it down to program verification. So-
>> Bug finding is very easy.
>> What was that?
>> Bug finding is very easy.
>> Bug finding is easy, yes.
>> Did you? >> So yes [LAUGH].
So this is why you can test, but
you can't verify, right?
>> So what about sequences?
Are those [INAUDIBLE] finite models or?
>> So it all depends on your axiomatization.
So I'm looking at only first-order logic.
So if you want it for sequences,
you have to axiomatize them in first-order logic.
It could be an infinite axiomatization, but
it should be a recursively enumerable one, right?
So Presburger arithmetic, for example,
is an infinite axiomatization of- >> So the model of a sequence,
is that a finite model or is that a?
is that a finite model or is that a?
>> Yeah, I mean, in program verification,
it's finite, right?
I'm just saying that infinite things are actually easier
than finite things for proving validity, right?
But we don't have infinite things here.
So there is inherently a problem in building complete engines for
verification.
It will come back to this.
>> [INAUDIBLE]?
>> Sorry? >> What
do you consider, are fractals finite?
Say, fractals with finite area or infinite?
>> We don't do that in verification,
but I don't know about fractals and that theory.
Here's a simpler setting with first-order logic.
No recursive definitions and no background theories.
This is pure first-order logic.
Is quantifier instantiation complete?
And it turns out that it is.
So first consider pure first-order logic.
And we're gonna do refutation based, so negate it and
show unsatisfiability.
First, we skolemize and we remove all existential
quantification by replacing the functions, right?
And then we do systematic formula driven term
instantiation, right?
So create an intuitively growing conjunct of quantifier-free
formulas where and every time you replace any
universal quantified variable with any of
the terms that are current in your formula so far, okay?
So more precisely, let's say we have
some kind of finite axiomatization, A1 through An,
and let's say it's universally quantified.
And I wanna show that phi is implied by it, right?
So I negate it, and now I have this formula to show
unsatisfiable, right?
Then what I do is I start with the negated goal,
the thing which I wanna prove, under the axiomatization, right?
And then every time I take the universally quantified formulas
and I instantiate them using the terms in what I had before, right,
in Xn, right?
So every time I get quantifier-free formulas, right,
and X* is the union:
if I go on forever, X* is the union of all of that, okay?
And indeed, it's true that this is a complete procedure: I
generate the Xn iteratively and I check, right,
whether each is satisfiable.
If the whole thing is unsatisfiable, there will be some point at which I
halt, having found some Xn to be unsatisfiable.
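[A minimal illustrative sketch of this X0, X1, ... loop, written against the z3 Python API. The sorts, functions, and helper names here (Loc, next, list, foreground_terms, prove_by_instantiation) are editorial assumptions for illustration, not the actual tool; the recursive list definition is supplied as a plain iff axiom, i.e., already in the fixed-point-abstracted form discussed later in the talk.]

```python
from z3 import (DeclareSort, BoolSort, Function, Const, Solver,
                And, Or, Not, unsat, is_app)

# Uninterpreted (foreground) sort of heap locations, an uninterpreted
# pointer function, and an uninterpreted "list" predicate.
Loc = DeclareSort('Loc')
nxt = Function('next', Loc, Loc)
lst = Function('list', Loc, BoolSort())
nil = Const('nil', Loc)

def list_def(t):
    # One instantiation of:
    #   forall x. list(x) <-> (x = nil \/ (x != nil /\ list(next(x))))
    return lst(t) == Or(t == nil, And(t != nil, lst(nxt(t))))

def foreground_terms(formulas):
    # Collect every subterm of sort Loc occurring in the formulas so far.
    seen, stack = {}, list(formulas)
    while stack:
        e = stack.pop()
        if is_app(e):
            if e.sort() == Loc:
                seen[e.sexpr()] = e
            stack.extend(e.children())
    return list(seen.values())

def prove_by_instantiation(negated_goal, rounds=5):
    # X_0 is the negated goal; each round adds one more level of
    # formula-driven instantiation and rechecks unsatisfiability.
    formulas = list(negated_goal)
    for _ in range(rounds):
        s = Solver()
        s.add(formulas)
        if s.check() == unsat:
            return True        # some X_n is unsat: the theorem is proved
        formulas += [list_def(t) for t in foreground_terms(formulas)]
    return False               # give up: sound but incomplete stopping point

# Example: prove (x != nil /\ next(x) = nil) -> list(x), by refutation.
x = Const('x', Loc)
print(prove_by_instantiation([x != nil, nxt(x) == nil, Not(lst(x))]))  # True
```

[In this toy run, the instantiations at x and next(x) force list(next(x)) and hence list(x), contradicting the negated goal after one round of instantiation.]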
This is a combination of Herbrand's theorem and compactness.
Herbrand's theorem really says that you can always build
a universe with only your terms:
once you Skolemize, you can build a model out of your terms alone, right?
What we had to do was slightly modify it,
because here the instantiation is really driven by the formulas
that you have,
so you have to kind of prove it again.
And then compactness: with Herbrand's theorem,
you really get that X*, the set of all instantiations, will be unsat.
And then by the compactness theorem for first-order logic, right,
we actually get that at some point, for some Xn, it must already hold, okay?
So quantifier instantiation works really well for
pure first-order logic.
Okay, now,
let's move to many-sorted first-order logic, right?
This is a setting which we had to define
because it kind of best fits what we need, yeah.
>> Can you go back?
So there's something on local theories, right?
Local theories [INAUDIBLE], is this related to that or?
>> I think so, but I'm not sure.
Maybe we can take it offline?
>> Mm-hm.
>> Yeah, so here I have a foreground universe, right?
So I have many sorts, right?
I have a foreground universe which is uninterpreted, and
I have background theories that are interpreted.
So this could be arithmetic, sets, whatever you will, right?
So I have a set of background theories.
But it's very important that
these sorts are completely disjoint, right?
And what I have is a universally quantified formula,
right, over this vocabulary, right?
And I have uninterpreted functions that kind of map
across these sorts, okay?
And the rough idea is that I'm gonna model my heap here, right,
the locations are modeled here.
And maybe a function from here to integers will tell me what
data is stored there, right?
A function from a pair of locations could tell me
the heaplet that is characterized by the list
segment or some data structure and so on.
So roughly,
in this logic, the program configuration that I'm looking at
lives in the uninterpreted world,
but the contents of the program configuration in terms of
primitive data types are gonna be kind of captured using
the background theories.
Now, so this is the world we want to belong to, right?
And then I wanna take a formula which is only
universally quantified,
with the outermost quantification only over the foreground universe.
Because I will want to talk about my program configuration
using universal quantification.
And internally, I could have more quantification within
the respective theories, right?
Let me just allow that, but
the outer quantification is on the foreground.
And now, I could ask the same question, right?
I could fix some theories and I could say,
I could do quantifier instantiation, right?
And when I do quantifier instantiation,
these foreground variables will be replaced,
but I'm not gonna touch the background quantifiers, right?
So I'm just gonna do quantifier instantiation in my foreground
universe, all right?
And I ask whether that term instantiation
is actually complete, right?
Now, the background theories, if I had the axioms for
them, I could do instantiation using those axioms as well, and
then everything would be complete.
But I don't want to do this.
I really want to use black-box solvers here.
So I don't wanna start instantiating there, right?
Arithmetic should be solved by an arithmetic solver.
And it turns out that this is not true, right?
There are simple examples that demonstrate this.
Quantifier instantiation, just done blindly
on the foreground, is not going to be complete.
And the reason is that the background theory can actually
restrict the foreground universe in ways
that will preclude you from finding the proof if you just
look at the terms that you get by instantiating the foreground.
So the main technical result we have is that quantifier
instantiation is complete in this setting for
what are called safe formulas, the safe fragment.
And it's very easy to understand what the
safe fragment is.
So a term is called foreground-guarded,
okay, note the foreground universe,
if it contains
variables only of the foreground sort.
And a formula is safe if every subterm of the foreground sort is
guarded, right?
Which basically means that you cannot derive a term in
the foreground universe,
right,
using functions applied to universally quantified variables of
the background sorts.
You can't say there's a list associated with, or
a location associated with, a quantified integer.
You can go the other way around, right, but
you can't come back to the foreground.
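[As an editorial illustration, not an example from the slides: with foreground sort Loc and background sort Int,]

\[
\mathit{next} : \mathrm{Loc} \to \mathrm{Loc}, \quad \mathit{key} : \mathrm{Loc} \to \mathrm{Int} \quad \text{keep formulas safe, whereas}
\]
\[
\mathit{loc} : \mathrm{Int} \to \mathrm{Loc} \text{ together with } \forall i\,{:}\,\mathrm{Int}.\; \ldots \mathit{loc}(i) \ldots \quad \text{breaks safety,}
\]

[since loc(i) is a foreground term containing a universally quantified background variable.]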
>> How does it relate to
the essentially uninterpreted fragment?
>> The essentially uninterpreted fragment doesn't have
background theories, typically.
But it is true we can prove the- >> It does allow
background theories. Yeah.
>> Yeah, so
if you have two sorts it'll allow it, right.
So we can prove the essentially uninterpreted fragment result as a corollary as well.
That class as well.
>> So this generalizes that result?
>> Yes, yes.
>> So, there are two simple restrictions that you can follow
to meet the fragment's conditions.
One is not to have functions that go from
the background sorts to the foreground sort.
If you don't have any such functions,
there's no way you can construct anything wrong;
if you have only forward functions, then it's fine.
The other way is to have no explicit
universal quantification on the background.
If you say I won't quantify over integers explicitly, then also
you're fine, because you're then not gonna generate such terms.
And in practice, what we found is that the VCs that we
generate on heaps actually meet both conditions.
You need to meet only one, but
they actually meet both the conditions.
You don't need to quantify over integers, right?
You quantify over the elements of a list, and
then you can look at the keys associated with them, but you're
still using a forward function, and you're not quantifying over integers.
>> So what if you want to say that two lists have the same length?
>> Same what?
>> Two lists, if you want to say two lists have the same length?
>> Yeah, so
you'll have a length function which is recursively defined.
Now, we can't define length in first-order logic yet.
Then you would say that the lengths
of these two lists are the same. That's fine;
length is a function which goes from nodes to integers.
>> [INAUDIBLE] that's why.
[INAUDIBLE] it's not uncommon
to quantify over the elements at particular positions, right?
>> Right.
So arrays are done a little bit differently:
you will have program array names, let's say a, or whatever.
Then you'll have a function that maps a and
an index to some value.
And that's okay. So you can have, your
domain can be a cross product of any of the universes, right?
That's fine.
But your range always has to be a background one.
So arrays can be handled this way as well.
>> I typically want to go from array and index-
>> Yes.
>> To say- >> A value.
>> Location or value or something, right?
>> So that's right.
That's fine, because
the location is not necessarily background, right?
>> If you have a pointer, as in a true pointer, as in like
an integer address, that's harder.
I'm talking here about pointers,
locations, being abstract locations.
>> So if you have A sub B sub i, like,
we're using an array to index another array.
>> Yeah, you need.
You can do that too, but.
Yeah, so- >> You can?
>> Yeah, but you don't have to, you don't write it the same way.
But you can do it.
>> So there you're using the array
content as a pointer into another array?
>> You don't write it that way, you can model it differently.
So that, so you still >> Yeah.
Okay.
Okay.
So rather than go into the proof of the main theorem,
I'm gonna look at some ramifications.
One is that, if you do not have
any quantification over the background theories, then term
instantiation gives you only quantifier-free formulas, right.
And if the background theories are nice and
have the [INAUDIBLE], are combinable, then you can actually have
a decision procedure for those.
And then this, not a decision procedure, but a
complete procedure, right?
The complete procedure will just do this:
repeat till it's unsatisfiable,
do one more level of term instantiation, and check whether
that set of instantiations is satisfiable or not, right?
And this actually gives you a complete procedure for
the safe fragment, okay?
Now, the first thing we thought when we had the result is
that this must be wrong.
All right, because surely, if I have a background theory
of natural numbers,
I can define a finite list, and
therefore capture a least fixed point, easily,
using arithmetic as a background theory.
So I can say nil is a list, and its length is zero.
And then a non-nil element x is a list
if the next of x is a list,
and the length of x is one more than the length of next of x.
So what you're doing is you're
using recursion, but now you're piggybacking your recursion on
the background theory, which supports recursion, right?
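[Written out, an editorial reconstruction of the purported encoding just described:]

\[
\forall x.\; \mathit{list}(x) \leftrightarrow \big(x = \mathit{nil} \land \mathit{len}(x) = 0\big) \lor \big(x \ne \mathit{nil} \land \mathit{list}(\mathit{next}(x)) \land \mathit{len}(x) = \mathit{len}(\mathit{next}(x)) + 1\big)
\]

[with the intent that the strictly decreasing length forces every list to reach nil in finitely many steps.]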
But then something has to be wrong, right?
Because a logic that can express finite lists this way is incomplete, right?
You cannot build a complete procedure for it, right?
So it would contradict our completeness result if you could
actually do this, okay?
So what is going on? It turns out that
the whole thing has to do with non-standard models.
All right, it turns out that Presburger arithmetic, though
it has a complete axiomatization, admits non-standard models.
And you're reasoning only with respect to a background theory and
not a background model, right?
You don't have the true model.
You only have the theory.
So I could have this kind of a list, right,
where I can claim x reaches nil, but
it doesn't reach it in any finite number of steps, right, and to justify it
I could have the length function map x to a non-standard element.
So non-standard arithmetic is where you have your normal
natural numbers,
and, let's say, you can have a copy of Z after that, okay?
And it turns out that using first-order logic you cannot
distinguish between that and just the natural numbers, right?
But using that non-standard model I can actually create a list
which doesn't have finite length, and so
you're not really encoding machines to show incompleteness.
So it's actually, in fact, complete,
the result,
and it doesn't contradict our theorem.
So now we also get a kind of understanding of natural
proofs, right?
Formulas are given with least fixpoints, right.
Let's say this is the definition, right.
Now, what you notice is that if I gave you something with a least
fixpoint definition, and you just did unfolding on it,
then really you do not distinguish it from a plain fixpoint.
You're not capturing the least fixpoint [INAUDIBLE] at all,
right, when you unfold.
If all that you're going to do with your recursive definition
is to unfold it, then what you're really treating it as is an
if-and-only-if operator, right?
The fact that it's a least fixpoint is kinda lost, because you never
said anywhere that you're gonna use it that way.
Okay, so
you can look at natural proofs as actually doing two things.
Phase one is to replace least fixed point definitions with
if-and-only-if definitions, right?
Now, this is a semantic loss; you're no longer capturing the least fixed point,
right?
And phase two is doing quantifier instantiation,
which is essentially doing recursive definition unfolding.
And what we now know is,
I'm gonna argue now, that phase two is actually complete, okay?
And remember, for completeness we need safety.
So I'm gonna argue
that what we're actually generating is safe.
So it turns out that when you model heaps, right,
you have locations within the foreground universe.
You have pointers modeled as uninterpreted functions on
the foreground universe.
You have data stored in data structures;
those go from foreground to background, like a key stored
in a data structure, right?
Recursive functions
also go that way: the set of keys in
a binary search tree would be from foreground to background.
Heaplets are foreground to background.
And recursive definitions like BST are predicates
which are defined using these other functions, but are still
predicates on the foreground, which is uninterpreted.
>> My question still applies, sorry.
If I have an array with indices, and
indices are integers of the background, right?
So then I have a function from the background theory to
the contents of the array which is foreground.
>> So it's array cross integers to integers?
>> No, array cross integers to something in the foreground, like
values.
>> To values?
So we are not doing that;
are you treating locations as locations you can do
pointer arithmetic on, or as abstract locations?
>> I mean, let's say, I have an array of tuples.
>> Right.
>> Okay, so I have a function from array cross integer, to
pairs, to [CROSSTALK] something in the foreground, right?
So, that doesn't seem to respect this,
>> Okay, so
maybe I've not thought about that.
Maybe it's true, it's possible.
So in the natural proofs work we don't do that.
We don't have arrays.
We deal with purely recursive data types, right?
And that's the context in which I'm saying that this is true.
Arrays we do consider in the paper, but that's only for
the decidable fragments.
So we also get decidability results along the way.
So there might be things which we cannot model yet, right?
>> Yeah. >> Okay.
>> But the point is, I mean,
you're not doing arithmetic on it, you're just doing this.
So these are actually safe, right?
Because you have one-way functions.
And you don't need quantification on integers,
typically, right?
So it's actually safe by the other condition as well.
And so
the term instantiation that we do for
natural proofs is actually complete, right?
Because it belongs to the safe fragment,
provided you have replaced least fixed point with fixed point
definitions.
So this phase is actually complete, right?
And the semantic gap is introduced in Phase 1.
So why can't we prove things?
In natural proofs,
there are lots of simple things we cannot prove,
and the user has to give axioms, lemmas, for them.
In our work, it usually turns out that if you define two data
structures, the relation between them is typically not
provable, and you actually give that as a lemma.
For example: if you have a list segment from x to y, and a disjoint list at y,
then you get a list at x.
If you unfold,
you'll never be able to prove this theorem, right?
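[Stated as a formula, the kind of lemma being described is roughly the following; this is an editorial rendering, with the heaplet-disjointness conjunct as the reading of "disjoint list y" in the Dryad-style encoding above:]

\[
\forall x, y.\; \mathit{lseg}(x, y) \land \mathit{list}(y) \land \mathit{hlseg}(x, y) \cap \mathit{hlist}(y) = \emptyset \;\rightarrow\; \mathit{list}(x)
\]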
And it turns out that the reason is Phase 1,
that semantic distance is the culprit, and again non-standard
models are the reason why you get into this problem, right?
The reason why you can't prove it is
because it's not actually true in the if-and-only-if
semantics of list segments and lists, right?
There's a semantic gap between least fixed point and
fixed point semantics, right?
So here, even if x
never reaches y, right,
along the next relation, never reaches y, I can declare x-to-y to
be a list segment as long as I declare all of these other nodes to be list
segments to y as well, right; I can still do that, and there's a model for it.
So we couldn't prove it,
because it wasn't valid in the fixed-point, first-order world.
So non-standard models crop up all the time in this work.
We found problems in the literature and
PhD theses and so on.
There were theorems which are now actually open again
because of this; I won't comment further on that.
>> So all this stuff around shape analysis and
logics with the reachability predicate is
assuming a least fixed point semantics.
>> Yes. >> Right, and so
you are abstracting somehow over that.
>> Yes.
>> So what is your abstraction?
>> It's a fixed point and not a least fixed point.
So when you're doing unfolding [INAUDIBLE] if and only if.
>> All these logics have certain restricted forms of heap for
which you can have completeness, right?
So I'm trying to see, he has abstractions for
that, or he has classes of logics.
So I just wonder what the relation is here
to all the stuff they have done,
where they basically don't have any background theories,
but they have just very hard problems because of reachability
in the heap.
>> Right. >> Do those fall into your other
class of being increasingly idiosyncratic logic fragments?
>> I see, okay.
>> So, I mean, I think there ought to be a name for
this phenomenon, that the harder you push on the decidability
boundary, the more idiosyncratic your logic and models become.
>> Okay, so I think I'm running out of time, is that correct?
>> Ten minutes.
>> Okay, so I wanna tell you a little bit about least fixed points,
because what we have essentially done is removed least
fixed points and said everything works well.
But what about least fixed points?
And the idea is, how are we gonna prove these things after
all, if we are not going to do more than this unfold and match, right?
So this brings us to this other part. So
remember that first-order logic with least fixed points cannot
have a complete procedure, right?
So anything that you introduce is going to look like
a heuristic, right?
And what we do is we introduce induction principles, right,
that bridge this gap between least fixed point and
fixed point, okay?
And we do this kind of uniformly, systematically,
guided by the formula that we wanna prove, okay?
And obviously this has got to do with invariant generation as
well, because a program, the entire program, you can look at it
as first-order logic with least fixed points, right?
And so you're proving now that the whole program is correct without
loop invariants, right?
So you're actually generating loop invariants by applying these
induction principles.
The crucial thing is that we have to stay
within the first-order logic framework, right?
Induction principles are typically second order, right?
But you wanna kind of stay in the first-order logic framework.
And in some way we're instantiating the formulas that
we want to declare as inductive, systematically.
So we have several in the paper, right?
I'll just give you one, right?
So I'll just give you one, right?
One is this, for every x, if I have to show that r(x)
implies psi(x) where R has some definition phi R.
It is the recursive definition which is using least fixed
point, okay?
And then what I do is introduce this induction principle which
basically says that the partial fixed point of this operator,
right?
If psi, right, what I want to prove is
a pre-fixed point of the definition of R,
then r implies phi.
Okay?
This is essentially saying that whatever I want to prove,
I'm going to prove that inductively based on
the definition of art, right?
I'm gonna show and formally this becomes a prefix point, right?
So if I can show the size of pre-fixed point, of row,
row is the definition of R, right?
Then R implies psy, I mean that is true, right.
And this brings in the least fixed point semantics
into picture, right?
But the induction principle is completely first order.
There's no mention of these fixed points.
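[As a formula, an editorial rendering of the pre-fixed-point reading above: to prove \(\forall x.\, R(x) \rightarrow \psi(x)\), where \(R\) is the least fixed point of the definition \(\rho_R\), add the first-order axiom]

\[
\big(\forall x.\; \rho_R[R \mapsto \psi](x) \rightarrow \psi(x)\big) \;\rightarrow\; \big(\forall x.\; R(x) \rightarrow \psi(x)\big),
\]

[where \(\rho_R[R \mapsto \psi]\) is the body of the definition with \(R\) replaced by \(\psi\). This is sound for the least fixed point, yet stated purely in first-order logic.]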
There are slightly more general versions in the paper, but
the most important thing
is to see that it is guided by the formula you wanna prove.
You write it in this form, that this recursively defined object
implies something else, and then you generate this automatically.
And then introducing this induction principle in turn can
lead to other induction principles being added.
So you will add one induction principle like this one, and
then this might have something else, and that will introduce,
if you regroup it, right, you can see it is of this form, and
you can instantiate one more of another kind, right?
You can kind of keep going, and actually
some examples require that you do that.
And then what we showed is that all the things that we could not
prove with natural proofs, right,
all the things where the user needed lemmas, right,
the user needed to give lemmas to help it prove,
were all proved using a couple of these induction
principles, okay.
There were not too many axioms, right?
There were about 150 programs, but these lemmas have got to do only
with the data structures and not with the programs, right?
And there are simple things, like: if something is a binary search tree,
then it's a binary tree, right?
You can't prove that using unfolding, right?
>> So, Dr. John has a couple of papers on this.
>> Yes, so that is correct.
So that's in the CLP framework and, right.
So it is a little- >> That's separation
logic, correct?
>> Right, so the key thing here is that you're doing it in
the first-order logic framework, right,
so that once you do this, you know you can build a complete
procedure after that, right.
We also tried it on some programs
which have nothing to do with heaps.
>> [INAUDIBLE] >> Yes, exactly.
I mean, the same thing we did, it addresses the same.
So we proved some things like McCarthy 91, for example;
it proved it in two iterations.
It was able to find the right invariant in the end, right, and
actually prove it correct.
So this is what emerges, right?
If you wanna do natural proofs,
then one thing is to take first-order logic with
least fixed points, which I think is a good logic to work with.
Right, replace least fixed point definitions with
fixed point definitions, right,
but use induction principles to bridge this gap; otherwise
you're not gonna get the power you need.
And the second phase is: unfold and do quantifier instantiation.
And this part is in fact complete, right?
You must ensure safety along the way.
So I don't know whether arrays of pointers will work, right?
Only if you can ensure safety is the completeness
result true, okay?
And if you further ensure that the resulting formulas, when you
instantiate terms, are quantifier free,
then you have the ability to use SMT solvers with various
theories, and because the formulas are quantifier free,
the theories will combine, so
you'll have something which should work fast, right?
Well, it works fast in the settings we have tried, but
in general it should work fast.
Okay, so that's it. So, conclusions: a new theorem that
proves completeness of term instantiation for multisorted
uninterpreted combinations of theories, for the safe fragment.
And then the systematic introduction of first-order
induction principles that seem to work,
in some settings at least.
And I think with these two, we have an understanding and
a robust reformulation of this completely ad hoc strategy
that we were doing,
which is: take everything,
express it in first-order logic with least fixed points,
and just unfold; it seems to work, right?
And "it seems to work" was gotten from looking at
the proofs people were writing, and we thought,
these are such simple proofs, why don't we use them, right?
So, but now, I think we have a more robust understanding of
what exactly is happening.
Okay, and there is a paper by Ge and de Moura, and this is actually
very similar; we found it after we did this paper, but
it's very similar and very different at the same time.
So the basic difference is that it's not a multisorted universe,
right, and then the logics for
which you can prove the theorem become very restrictive, right.
You can't even take two universally quantified variables
and check equality between them,
right, because equality is an interpreted symbol, right, and
they won't allow that.
Right, in fact, the proof is all different.
Even the first theorem that I mentioned, for
first-order logic being complete
without any background theories,
even that doesn't go through in that setting, so.
And of course there's a huge amount of work on
finding induction principles in the theorem proving community;
ACL2 did a lot of work, right?
We're probably mimicking such strategies in this first-order
logic world.
Right, but what we do have is this completeness result, and
the reduction to SMT, using SMT solvers.
So, future work. I would like to build
SMT-like solvers, efficient solvers, for
first-order logic with least fixed points, right.
I think there'll be a lot of applications, because first-order
logic with least fixed points is incredibly powerful, right?
The second thing is,
we don't know how these induction principles
will work for programs that do not involve heaps.
We don't know.
It's probably easy to try it and see how it works.
But we were a little surprised that McCarthy 91 did not take 100
iterations, right?
It just took two iterations and out popped the invariant, and
that was a little surprising.
So we should investigate that, and of course,
how does it compare, right?
Because all of these are gonna be incomplete techniques, right;
how they compare with other techniques,
such as interpolation, would be good to know.
The third thing is, I think first-order logic with least
fixed points and background theories may be the right logic
for Hoare-style reasoning, for deductive verification.
All right, we certainly need this much power, right?
We need data structures, and also,
in general, we need this kind of power to express
the weakest precondition.
There's a paper by Blass and Gurevich, actually,
from a long time back, that argues that we actually should do this,
because you don't need arithmetic.
You don't have to show something complete by mapping
everything to arithmetic with addition and multiplication, for
which you don't have any [INAUDIBLE]; that's how most
completeness proofs go.
Right, but actually with first-order logic and
least fixed points you can get the power to express
the weakest pre of any program, including loops, right?
And that maybe is useful, right?
And we do have now some ways of handling first-order logic with
least fixed points automatically.
And we probably don't need much more.
I'm not sure about that.
I'm a little hesitant to go beyond first-order logic, because I just
don't know how to deal with automation beyond first-order
logic.
I don't have compactness, I don't have completeness,
I don't have term instantiation; I have nothing to work with. So
it's going to be very hard to go beyond first-order logic,
at least for me.
So maybe this is a good logic to work with.
Thank you.
Thank you.
>> [APPLAUSE] >> So
you started out talking about the importance of decidability.
And then mostly what you said was actually about completeness.
So what about decidability?
In other words,
if I have completeness, I don't necessarily get countermodels.
And if I'm actually doing a proof interactively,
countermodels are very important.
We use them for refining inductive hypotheses, and we use
them when things are going wrong and we have to understand why.
>> I think countermodels you could do by symbolic test input
generation, right?
You could bound your heap, you could bound things, and
then get countermodels.
>> What if there is no finite countermodel?
[CROSSTALK] >> It's just that I don't know
whether.
It would be finite;
your program configurations are intended to be finite,
so there should be a finite countermodel.
>> Well, yeah, maybe, but once I sort of abstracted it
into the logic- >> No, no without abstraction,
in the same way that you just prove a VC,
you could try it for- >> Standard model, so
as you said.
>> [LAUGH] I hope you don't want nonstandard models as
a counterexample, right?
So I'm just saying that- >> What I'm saying-
>> Just normal symbolic
execution should do countermodels.
[CROSSTALK].
>> These are countermodels to the invariance.
>> Yeah, but right, I don't want a countermodel to my program.
I want the countermodel to my VC.
Do you know what I'm saying?
It's not the same thing.
>> Yeah, countermodel to your VC.
You could do- >> Because I've got
an extraction.
When I've got to my point of having my VC attached,
obstruction, for example, I got rid of these fixed points,
semantics or whatever.
So if I send my VC into my theorem prover,
it should either eventually come back with a proof,
or it should eventually come back with a countermodel.
In which case you have decidability.
>> I just think that these two procedures might be totally
different.
If you wanna prove something versus find counterexamples,
these may be different things.
>> Right, but they're not- >> Because we cannot possibly
target finding complex counter-examples, and
finding simple proofs at the same time, right? So-
>> I think you're missing my
point about the usefulness of countermodels.
>> No, I'm not saying that, I'm just saying that you can do
countermodels by some other technique.
>> Well, no no, if you could do that,
then your logic would be decidable.
Because if I can get countermodels, if countermodels
are RE and if proofs are RE- >> Proofs are not RE;
they're gonna have least fixed points, they're not gonna be RE.
>> Right, but
let's say I've [INAUDIBLE] least fixed point semantics.
>> Then it is complete, it is decidable, yes, I agree.
In fact,
I'd argue that the program configurations are enumerable,
so satisfiability is in RE.
So if you've done away with this-
>> We have to talk about
this offline.
Cuz we [CROSSTALK] misunderstanding.
>> [LAUGH] >> Okay, [LAUGH]
>> Let's thank.
>> [APPLAUSE]