BakefileNG/Flaws

Flaws in Bakefile 0.2.x design

This page briefly lists (some of) the fundamental problems with Bakefile's design that cannot be easily fixed in its current form and require a more or less major redesign to deal with.

Too primitive language

A large part of Bakefile is implemented in Bakefile itself -- initializing targets and their properties is done in rules implemented as Bakefile code. This was meant to ensure great extensibility, but in practice, it has two effects:

  1. Bakefile is slow, because everything requires interpreting bkl code, either from the user's bakefiles or from the internal rules.
  2. It's actually hard to change the behavior, let alone extend it with independent modules, because the bkl language is very limited and many things are hard to express in it. It would probably be better to implement more things in Python and have an easy-to-use (Trac-like) API for extensions.

Imperative language

Bakefile's language is imperative, not declarative. (Almost) all statements are processed in order of appearance and immediately acted upon. For example, an <include>foo</include> tag results in -Ifoo being immediately added to cppflags. This was (incorrectly, as it turns out) assumed to be an advantage, because it makes both the implementation and debugging easier to reason about. That was before workarounds for common problems became necessary:

Tags reordering

Because tags are applied immediately, lots of tags implement some logic themselves. If a tag is part of a group of related tags (e.g. the precompiled headers tags), it's very hard to implement things correctly if the tags can be specified in any order. So tag reordering by means of <tag-info position> was introduced to simplify the implementation.

Exclusive tags

Some tags can only be specified once (<dllname>, ...; let's call these exclusive tags), others multiple times (<include>, <sys-lib>, ...). Templates clearly need to set some "exclusive" things like dllname, so these tags are marked with <tag-info exclusive="1"> and the parser ignores all but the latest definition. This is not a flaw per se, but it's a gross hack that was made necessary by the language.

eval=0

Perhaps the worst offender is the <set var="" eval="0"> hack. Because of the in-order immediate processing, it's practically impossible to set things up with reasonable defaults in advance (e.g. in the <dll> target's initialization template). For example, we'd want to set the compilation command to the default used in most cases and only substitute the right cflags and so on when the user specifies them. But simply doing

<set var="_cflags"/> <!-- user will set this later -->
<set var="_cmd">$(_cflags)</set>

doesn't work, of course, because _cmd is immediately assigned the current value of _cflags, which is "". So Bakefile has to be told to defer computing the value of _cmd until much later:

<set var="_cflags"/> <!-- user will set this later -->
<set var="_cmd" eval="0">$(_cflags)</set>

Options' <default-value> has delayed evaluation too; in effect, it has an implicit eval=0 on it.
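
The difference between the two snippets above can be sketched in a few lines of Python. This is only an illustration of eager versus deferred substitution, not Bakefile's actual code: expand() and set_var() are hypothetical helpers, and the eval_now flag stands in for the eval attribute.

```python
import re

_REF = re.compile(r"\$\((\w+)\)")


def expand(text, variables):
    """Replace every $(name) in text with the current value of that variable."""
    return _REF.sub(lambda m: expand(variables.get(m.group(1), ""), variables), text)


def set_var(variables, name, value, eval_now=True):
    """Hypothetical stand-in for <set>: expand now, or store the raw text."""
    variables[name] = expand(value, variables) if eval_now else value


# Eager (default) behaviour: _cmd captures the still-empty _cflags.
eager = {}
set_var(eager, "_cflags", "")
set_var(eager, "_cmd", "$(_cflags)")
set_var(eager, "_cflags", "-O2")  # the user sets this later
eager_result = expand(eager["_cmd"], eager)

# Deferred (eval="0") behaviour: _cmd keeps the reference until it is used.
deferred = {}
set_var(deferred, "_cflags", "")
set_var(deferred, "_cmd", "$(_cflags)", eval_now=False)
set_var(deferred, "_cflags", "-O2")
deferred_result = expand(deferred["_cmd"], deferred)

print(repr(eager_result))     # ''
print(repr(deferred_result))  # '-O2'
```

In a declarative language, the deferred behaviour would presumably be the only one, and no per-variable switch would be needed.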

Everything is a string

Bakefile uses a make-style typeless variable system. All variables and expressions (including conditions) are simply strings. In practice, two types are badly needed: lists and paths. Additionally, enums would be useful for input checking.

Lists

Many tags work with lists or should work with them (e.g. <sources>, <sys-lib>), but because lists are not directly supported, they are implemented by appending a space and the new value to a variable. This leads to problems:

  1. Sometimes, we end up with many spaces in a row (this is, of course, only a cosmetic issue).
  2. Even worse, if newlines are used as separators, they have to be filtered out, because such values cannot be used directly on the command line.
  3. Support for accepting a whitespace-separated list in a tag has to be written manually; as a result, tags are inconsistent in this respect (#84, #119, ...).
  4. Paths with spaces continue to be a problem all over the place, because they have to be handled individually (#62). Most of the time, they simply don't work if used in .bkl input at all, because whitespace is treated as the list item separator.

All of these problems would go away if there were built-in support for a list type.
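
As a rough illustration of the last problem, here is a small Python sketch comparing a space-joined string with a real list. The file names and the emit() helper are made up for the example; they are not part of Bakefile.

```python
# String-based "list": append a space and the new value, split on whitespace later.
sources_str = ""
for item in ["main.c", "My Files/util.c"]:
    sources_str += " " + item
string_items = sources_str.split()

# Typed list: items stay intact; separators only appear when output is emitted.
sources_list = []
for item in ["main.c", "My Files/util.c"]:
    sources_list.append(item)


def emit(items):
    """Join items for a command line, quoting those that contain a space."""
    return " ".join('"%s"' % i if " " in i else i for i in items)


print(string_items)        # ['main.c', 'My', 'Files/util.c']
print(emit(sources_list))  # main.c "My Files/util.c"
```

With a list type, quoting becomes a concern of the output stage only, instead of something every tag has to handle on its own.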

Filepaths

File and directory names are strings too. Because Unix paths are used internally and the output format may use different separators (typically Windows ones), translation has to be done using $(nativePaths(value)). This is usually done as soon as possible, so there's an inconsistent mix of native and Unix paths (#93). Having a path type would help a lot: a path expression could be converted to native syntax only when emitting the output.
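
A minimal sketch of such a path type, using Python's pathlib, might look as follows. BklPath and its methods are hypothetical names chosen for this example, and only relative paths are considered.

```python
from pathlib import PurePosixPath, PureWindowsPath


class BklPath:
    """Hypothetical path value: Unix form internally, native form only on output."""

    def __init__(self, unix_path):
        self._path = PurePosixPath(unix_path)

    def join(self, *parts):
        """Return a new path with the given components appended."""
        return BklPath(self._path.joinpath(*parts))

    def emit(self, windows=False):
        """Render the path using the separators of the output format."""
        if windows:
            return str(PureWindowsPath(*self._path.parts))
        return str(self._path)


obj = BklPath("build").join("obj", "main.o")
print(obj.emit())              # build/obj/main.o
print(obj.emit(windows=True))  # build\obj\main.o
```

Because conversion happens only in emit(), intermediate values never mix native and Unix separators.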

Conditions

Condition expressions are stored as strings and parsed when evaluated. The parser is not a full Python syntax parser, so only limited forms are recognized (see makeCondition()). This makes it hard to detect some possible optimizations and limits the conditions that can be used, even if the output formats you use would support a wider range.
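
One possible alternative is to parse a condition once into a structured value and work with that afterwards. The sketch below uses Python's ast module and accepts only a conjunction of VAR=='value' tests; that restricted form is an assumption made for this example and is not necessarily what makeCondition() recognizes. The function names are hypothetical.

```python
import ast


def parse_condition(text):
    """Parse "VAR=='value' and ..." into a list of (name, value) pairs.

    Raises ValueError for anything that is not a conjunction of equality tests.
    """
    node = ast.parse(text, mode="eval").body
    if isinstance(node, ast.BoolOp) and isinstance(node.op, ast.And):
        terms = node.values
    else:
        terms = [node]
    pairs = []
    for term in terms:
        if not (isinstance(term, ast.Compare)
                and len(term.ops) == 1
                and isinstance(term.ops[0], ast.Eq)
                and isinstance(term.left, ast.Name)
                and isinstance(term.comparators[0], ast.Constant)):
            raise ValueError("unsupported condition: %r" % text)
        pairs.append((term.left.id, term.comparators[0].value))
    return pairs


def evaluate(pairs, options):
    """Check a parsed condition against concrete option values."""
    return all(options.get(name) == value for name, value in pairs)


cond = parse_condition("DEBUG=='1' and UNICODE=='0'")
print(cond)  # [('DEBUG', '1'), ('UNICODE', '0')]
print(evaluate(cond, {"DEBUG": "1", "UNICODE": "0"}))  # True
```

Once conditions are structured values like this, comparing or simplifying them no longer requires re-parsing strings.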

Individual makefiles are generated independently

Bakefile maps one input .bkl to one output makefile -- each invocation of Bakefile produces only one makefile. In particular, if you use <subproject> (e.g. for tests' or examples' makefiles), the subprojects are generated without any knowledge of the "parent" bakefile. This makes it impossible to automatically create dependencies using <library>, which is a very common need.

No object model

Bakefile doesn't provide an object model of the parsed bakefile that could be manipulated by bakefiles. This makes modifying the behavior of presets practically impossible, unless their author accounted for it. Because Bakefile is implemented in itself, it's hard to implement extensions like "make dist" support or xgettext invocations (there's no easy way to get all source files used).
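
To make the "all source files" example concrete, here is a rough sketch of what a small object model could offer to an extension. The Target and Project classes and their fields are invented for this illustration and do not correspond to anything in Bakefile 0.2.x.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    name: str
    kind: str  # e.g. "exe" or "dll"
    sources: list = field(default_factory=list)


@dataclass
class Project:
    targets: list = field(default_factory=list)

    def all_sources(self):
        """Return every source file used by any target, sorted, without duplicates."""
        return sorted({src for t in self.targets for src in t.sources})


project = Project([
    Target("app", "exe", ["main.c", "util.c"]),
    Target("plugin", "dll", ["plugin.c", "util.c"]),
])
print(project.all_sources())  # ['main.c', 'plugin.c', 'util.c']
```

An extension such as "make dist" could then simply iterate over the model instead of re-deriving the file list from string variables.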

XML

bkl files are written in XML. This is overly verbose (closing tags in particular are annoying) and makes it hard to nicely express some useful concepts, like if...elif...else. Additionally, XML syntax alone is insufficient, so we build extra syntax on top of it anyway -- "$(...)" for substitution in particular. Adding a type system will require treating quotes specially too. It might be better to use a more compact custom syntax.

The input file format is not separated from the rest of the processing; all of the code is written to work directly with the XML tree (or something similar enough: a tree with named nodes, attributes on them and text content).