Inductive logic programming: from machine learning to software engineering

hwechtla-tl: Inductive logic programming: from machine learning to software engineering

Mikä on WikiWiki?
nettipäiväkirja
koko wiki (etsi)
viime muutokset

This interesting book by F. Bergadano and D. Gunetti provides a mixture of theoretical results and practical advice / opinions in a way that makes me wonder about their intended audience. Basically, what they seem to have in mind is a major shift in software development methodology: the tools and procedures they develop are aimed at practical program induction. What this means in practice is that programming languages would not be formalisms to specify exact input-output behaviours, but rather sets of constraints (examples of input-output pairs, constraints on program structure etc.) from which the actual program is induced.

One of their main assumptions is that it is easier to specify a set of programs, one of which is the intended one, than to specify the "correct" program only. The situation they have in mind is that of a programmer who already knows what she needs to construct; just the exact form of the program definition is unknown. The formalism Bergadano and Gunetti present requires the programmer to specify a set of examples and the problem space (e.g. the clauses that may occur in the induced program); from these the "compiler" produces a program specification satisfying the constraints.

My main reason for being sceptical with this vision is that from what I know about test-driven programming (http://c2.com/cgi/wiki?CodeUnitTestFirst), it is not easier to write a vague approximation of a program than to write the correct one. The most tedious bugs (i.e. ones that cannot be caught by automatical or semi-automatical tools) are usually uncovered cases of one kind or another; specifying a program by examples makes it easier to make such mistakes, since it is easier to forget an example than to forget a branch of a conditional that is supposed to be exhaustive. Moreover, guessing the parts of the program that are appropriate for a given case is a sophisticated task and picking the correct ones from those is hardly a harder task. Lastly, specifying a set of programs requires more typing (on the keyboard) than specifying one program only, irrespectible of the specification language, and in my experience, program development time is directly proportional to the number of keystrokes you have to type. (http://c2.com/cgi/wiki?NumberOfKeystrokes)

What makes this approach relevant, however, is the possible synergy that can be achieved from the new program specification formalisms. I think that test-driven development is an excellent approach for many kinds of programs (if not all); if the examples were sufficient to induce the program itself, the programming effort in test-driven development would be greatly reduced. On the other hand, programs in practice tend to resemble each other, and it would probably be possible to use the same problem space for many program induction tasks. Under these circumstances, it could be possible to actually get some nontrivial results for "free" (or at the cost of understanding the induction tools).

(((A program, its examples, and the procedures that are used to produce one from the other are naturally intimately connected. A procedure that produces a program from examples is an inductive inference machine IIM(e) = P; a procedure that produces a set of examples from a program is a test set generation procedure TSG(P) = e. ILP usually studies the construction of a known program. Basically, it could study the selection of an optimal example set for a given inductive inference machine, but usually studies the selection of the optimal IIM for a given set of examples (which is usually quite arbitrarily chosen). For induction, we could also study the optimal program to be produced for a given set of examples -- i.e. what the "correct" result of an IIM actually is.)))

kategoria: ohjelmointi

kommentoi (viimeksi muutettu 05.07.2007 20:40)