"TestOnly Development" with the Z3 Theorem Prover
A Simple Problem
Here's the problem I'll use as an example, which is problem #4 from the Project Euler:
A palindromic number reads the same both ways. The largest palindrome made from the product of two 2digit numbers is 9009 = 91 × 99.
Find the largest palindrome made from the product of two 3digit numbers.
The usual approach is to use brute force. Here's a C# example, which I suspect is similar to what many people do:
(from factor1 in Enumerable.Range(100, 899)
, Factor2 = factor2, Product = product}).First()
from factor2 in Enumerable.Range(100, 899)
let product = factor1 * factor2
where IsPalindrome(product) // defined elsewhere
orderby product descending
select new { Factor1 = factor1
This is not a terrible solution. It runs pretty quickly and returns the correct solution, and we can see opportunities for making it more efficient. I suspect most people would declare the problem finished and move on.
However, the LINQ syntax may obscure the fact that this is still a brute force solution. Any time we have to think about how to instruct a computer to find the answer to the problem instead of the characteristics of the problem itself, we add cognitive overhead and increase the chances of making a mistake.
Also, what is that
IsPalindrome(product)
hiding? Most people implement this by converting the number to a string, and comparing it with the reversed value. But it turns out that the mathematical properties of a palindromic number are critical to finding an efficient solution.Indeed, you can solve this problem fairly quickly with pencil and paper so long as you don't do
IsPalindrome
this way! (It would probably double the length of this post to explain how, so I'll skip that. If there's demand in comments I can explain, otherwise just read on.)Solving with Pure Specification
For my purely declarative solution, I'm going to use the language SMTLIB. As a pure specification language, it doesn't allow me to define an implementation at all! Instead, I'll use it to express the constraints of the problem, and then use MSR's Z3 solver to find a solution. SMTLIB uses Lisplike Sexpressions, so, yes Virginia, there will be parentheses, but don't let that scare you off. It's always worthwhile to learn languages which make you think about problems differently, and I think you'll find SMTLIB really delivers!
Since this language will seem unusual to many readers, let's walk through the problem step by step.
First, we need to declare some of the variables used in the problem. I use "variable" here in the mathematical, rather than software, sense: A placeholder for an unknown, but not something to which I'll be assigning varying values. So here are three variables roughly equivalent to the corresponding C# vars above:
(declareconst product Int)
(declareconst factor1 Int)
(declareconst factor2 Int)
In an Sexpression, the first item inside the parentheses is the function, and the remaining items are arguments. So
declareconst
is the function here and the remaining items are the variable name and its "sort" (type).Next, the problem says that
product
must be the, ahem, product of the two factors:
(assert (= (* factor1 factor2) product))
"
assert
" sounds like a unit test, doesn't it? Indeed, many people coming to a theorem prover from a software development background will find that programming them is much more similar to writing tests than writing code. The line above just says that factor1 * factor2 = product
. But it's an assertion, not an assignment; we haven't specified values for any of these variables.The problem statement says that both factors are three digit numbers:
(assert (and (>= factor1 100) (<= factor1 999)))
(assert (and (>= factor2 100) (<= factor2 999)))
Mathematically, what does it mean for a number to be a palindrome? In this case, the largest product of 3 digit numbers is going to be a six digit number of the form abccba, so product = 100000a + 10000b + 1000c + 100c + 10b + a. As I noted above, expressing the relationship this way is key to finding a nonbruteforce solution using algebra. But you don't need to know that in order to use Z3, because Z3 knows algebra! All you need to know is that you should express relationships mathematically instead of using string manipulation.
(declareconst a Int)
(declareconst b Int)
(declareconst c Int)
(assert (= product (+ (* 100000 a) (* 10000 b) (* 1000 c) (* 100 c) (* 10 b) a)))
I implied above that a, b, and c are singledigit numbers, so we need to be specific about that. Also, a can't be 0 or we won't have a 6 digit number.
(assert (and (>= a 1) (<= a 9)))
(assert (and (>= b 0) (<= b 9)))
(assert (and (>= c 0) (<= c 9)))
These 4 assertions about a, b, and c are enough to determine that product is a palindrome. We're not quite done yet, but let's see how we're doing so far.
(checksat)
asks Z3 if there is a solution to the problem we've posed, and (getmodel)
displays that solution. Here's the entire script so far:
(declareconst product Int)
(declareconst factor1 Int)
(declareconst factor2 Int)
(assert (and (>= factor1 100) (< factor1 1000)))
(assert (and (>= factor2 100) (< factor2 1000)))
(assert (= (* factor1 factor2) product))
(declareconst a Int)
(declareconst b Int)
(declareconst c Int)
(assert (and (>= a 1) (<= a 9)))
(assert (and (>= b 0) (<= b 9)))
(assert (and (>= c 0) (<= c 9)))
(assert (= product (+ (* 100000 a) (* 10000 b) (* 1000 c) (* 100 c) (* 10 b) a)))
(checksat)
(getmodel)
When you run this through Z3 (try it yourself!), the solver responds:
sat
(model
(definefun c () Int
1)
(definefun b () Int
0)
(definefun a () Int
1)
(definefun product () Int
101101)
(definefun factor2 () Int
143)
(definefun factor1 () Int
707)
)
That's pretty good!
sat
, here, means that Z3 found a solution (it would have displayed unsat
if it hadn't). Eliding some of the syntax, the solution it found was 143 * 707 = 101101. Not bad for zero implementation code, but also not the answer to the Project Euler problem, which asks for the largest product.Optimization
"Optimization," in Z3 parlance, refers to finding the "best" proof for the theorem, not doing it as quickly as possible. But how do we tell Z3 to find the largest product?
(Update: I had a mistake in the original version of this post, and so I've significantly changed this section.)
Z3 has a function called
maximize
, but it's a bit limited. If I try adding (maximize product)
, Z3 complains:Z3(15, 10): ERROR: Objective function '(* factor1 factor2)' is not supported
After some fiddling, however, it seems
(maximize (+ factor1 factor2))
works, sort of. Adding this to the script above causes Z3 to return:
(+ factor1 factor2) > [1282:oo]
unknown ...
Which is to say, Z3 could not find the maximal value. ("oo" just means ∞, and
unknown
means it could neither prove nor disprove the theorem.) Guessing that a might be bigger than 1, I can change its range to 8..9 and Z3 arrives at a single solution:
(+ factor1 factor2) > 1906
sat
(model
(definefun b () Int
0)
(definefun c () Int
6)
(definefun factor1 () Int
913)
(definefun factor2 () Int
993)
(definefun a () Int
9)
(definefun product () Int
906609)
)
The final script is:
(declareconst product Int)
(declareconst factor1 Int)
(declareconst factor2 Int)
(assert (and (>= factor1 100) (< factor1 1000)))
(assert (and (>= factor2 100) (< factor2 1000)))
(assert (= (* factor1 factor2) product))
(declareconst a Int)
(declareconst b Int)
(declareconst c Int)
(assert (and (>= a 8 ) (<= a 9)))
(assert (and (>= b 0) (<= b 9)))
(assert (and (>= c 0) (<= c 9)))
(assert (= product (+ (* 100000 a) (* 10000 b) (* 1000 c) (* 100 c) (* 10 b) a)))
(maximize (+ factor1 factor2))
(checksat)
(getmodel)
This bothers me just a little, since I had to lie ever so slightly about my objective, even though I did end up with the right answer.
That's just a limitation of Z3, and it may be fixed some day; Z3 is under active development, and the optimization features are not even in the unstable or master branches. But think about what has been achieved here: We've solved a problem with nothing but statements about the properties of the correct answer, and without any "implementation" code whatsoever. Also, using this technique forced me to think deeply about what the problem actually meant.
Can This Be Real?
Comments

Mason, it's not an either/or choice. Most real world use of Z3 and related systems right now uses them hand and hand with a more conventional language. For example, Microsoft uses Z3 and the VCC compiler to look for faults in the HyperV memory manager. I will give several additional examples of realworld use in my next post on this topic.
The sales pitch of largescale, purely declarative systems is, at best, premature, but no single tool can build such systems by itself today anyway. Z3 and other proving systems are a tool in your toolbox. Don't believe anyone who claims that a single tool is right for everything, even, or especially, if it's C++! 
Possibly stupid question:
You say that (maximize (* factor1 factor2)) does not work and you seem to imply it's a temporary limitation of Z3.
Maybe Z3 is using a simplex (or something similar to a simplex) to maximize the function?
If this is true, the objective function as a sum (instead of a product) is sort of mandatory, isn't it? 
Hmmm, calling declarative programs tests, and therefore saying they are better than programming perse is a bit fatuous. Also, code leans on tests and tests lean on code. having one without the other is bad. There is less 'double checking' opportunities. Whats to stop you having faulty assertions/tests? Just as code without tests can have bugs, so 'tests' without code have nothing to verify them.

Michael Leuschel Wednesday, 9 July 2014
This is an interesting post, thanks.
Funnily enough we are also currently looking at encoding some of the Euler problems in the B formal language and solving them using our ProB constraint solver.
A B solution is as follows (you can directly paste this into our online Logic Calculator at: http://www.stups.uniduesseldorf.de/ProB/index.php5/ProB_Logic_Calculator):
num = %s.(s:seq(0..9)SIGMA(i).(i:dom(s)s(i)*10**(i1))) /* convert a sequence of digits into a number */
&
all_palindromes = {x,y,s,p {x,y} <: 100..999 & x<y &
size(s)=6 & p=x*y & num(s) = p & num(rev(s))=p} &
best_palindrome = max(ran(all_palindromes)) &
best_factors = dom(all_palindromes~[{best_palindrome}])
ProB does not (yet) have optimisation via an objective function; so we find all solutions here. Runtime is 0.23 seconds on my MacBook Air. I would be interested in knowing the runtime of the Z3 solution. It would be nice if one could avoid having to hardcode the 6 digits (both in the Z3 as well as in the B solution).
A version of the B model using unicode operators is as follows (in the hope this is more readable):
num = λs.(s ∈ seq(0 ‥ 9)∑(i).(i ∈ dom(s)s(i) * 10 ↑ (i  1))) ∧ all_palindromes = {x,y,s,px ∈ 100 ‥ 999 ∧ y ∈ 100 ‥ 999 ∧ x < y ∧ size(s) = 6 ∧ p = x * y ∧ num(s) = p ∧ num(rev(s)) = p} ∧
best_palindrome = max(ran(all_palindromes)) ∧
best_factors = dom((all_palindromes~)[{best_palindrome}]) 
naugtur Wednesday, 9 July 2014
What follows is an opinion.
This is a step up from Prolog. Prolog has proven itself to be a great tool for reasoning about stuff that's not easily reasoned about in a Clike programming language. But reasoning in it is really hard anyway. And the flexibility and performance and productionuse capabilities are, well, not there. I've been on a project that started as a Prolog program and was quickly ported to C++ because of all reasons I already mentioned.
This idea is headed towards the same fate.
Prolog is an awesome exercise for the brain. This will be as well. Therefore thet're academic (I might event say toys, but I respect people producing those tools very much and they are truly awesome)
Both this and Prolog is about abstracting algorithms away. We're not ready to move away from algorithms. In a future near the Singularity we'll be able to do that and program machines by defining the result. But it's not something that waits around the corner. 
Enrique Thursday, 10 July 2014
Everytime someone talks about automatic code generation without programming they come with just another programming language. SMTLIB, UML or SQL are just programming languages. The fact that they are declarative or graphic is only a matter of representation. It does not make programming easier. Since James Martin "Application programming without programmers" (1982) and the CASE tools myth, people keeps believing in automatic program generation, but the fact is that there are no silver bullets, as Frederick Brooks said.

Paolo: No, not a stupid question at all. The default optimizer is indeed primal Simplex, so you may be right as far as that goes. There are other solvers available, but I don't understand all of the options. There is some discussion on the optimization tutorial.

Peter: I don't think I claimed that pure specification is "better than programming." Indeed, I think this is programming. I quite agree that having a "double entry" system is worthwhile! The interesting part here isn't that there is only specification, but rather than the specification can stand on its own. We want to prove that the specification itself is complete and correct and selfconsistent. Compare with the "specification" in the Project Euler problem which is "just words."

Michael: Nice! Don't think you need x > y; your solver seems to work fine without that term. Noticed you support TLA+, which is something I'd like to discuss in the future. If you want to try Z3 on your own hardware to compare, you can build it from http://z3.codeplex.com/" rel="nofollow">source code.

Laurentiu Nicola Tuesday, 15 July 2014
In theory you could express maximize as the absence of a larger solution, but sadly Z3 doesn't seem to handle it: http://rise4fun.com/Z3/pq6A .

Hodor Monday, 21 July 2014
"Indeed, you can solve this problem fairly quickly with pencil and paper so long as you don’t do IsPalindrome this way! (It would probably double the length of this post to explain how, so I’ll skip that. If there’s demand in comments I can explain, otherwise just read on.)"
I'd like to know how. 
Hodor, short version is you're looking for two factors of three digits, call them def and ghi, which equal abccda. But you know that a is almost certainly 9, and hence d and g are almost certainly 9 as well. Meaning f and i are either both 3 or are 9 and 1. To get the rest, you have to write out the multiplication: abccba = fi + 10ei + 100 di + 10fh + ... and solve. But because we've already guessed so many of the individual digits, the solution is much easier than it seems at first glance.

Please login first in order for you to submit comments
Color me skeptical. The concept of "purely declarative programming" has been around for a looooong time, and it fails to one degree or another every time it's introduced, because the fundamental fact is that programming is complex and complexity doesn't just go away when you sweep it under a layer of abstraction. Possibly the best success story for declarative programming is SQL, and anyone with experience in databases knows all about the warty side of [insert RDBMS of choice here.]