author Klaus Aehlig <klaus.aehlig@huawei.com> 2022-04-19 14:36:59 +0200
committer Klaus Aehlig <klaus.aehlig@huawei.com> 2022-04-21 13:47:52 +0200
commit 9077143d0585ad55f336751a5a5b92d1082438e6
tree ba39c19eaddf9a2c03ff99ae7ea8e1c3707ab26e
parent 80caf4d4566371cf02923abdc26ae2e8258ed376
Document the expression language used in our build tool
-rw-r--r-- doc/concepts/expressions.org 314
1 files changed, 314 insertions, 0 deletions
diff --git a/doc/concepts/expressions.org b/doc/concepts/expressions.org
new file mode 100644
index 00000000..cfe058ba
--- /dev/null
+++ b/doc/concepts/expressions.org
@@ -0,0 +1,314 @@
+* Expression language
+
+At various places, in particular in order to define a rule, we need
+a restricted form of functional computation. This is achieved by
+our expression language.
+
+** Syntax
+
+All expressions are given by JSON values. One can think of expressions
+as abstract syntax trees serialized to JSON; nevertheless, the precise
+semantics is given by the evaluation mechanism described later.
+
+** Semantic Values
+
+Expressions evaluate to semantic values. Semantic values are JSON
+values extended by additional atomic values for build-internal
+values like artifacts, names, etc.
+
+*** Truth
+
+Every value can be treated as a boolean condition. We follow a
+convention similar to ~LISP~ considering everything true that is
+not empty. More precisely, the values
+- ~null~,
+- ~false~,
+- ~0~,
+- ~""~,
+- the empty map, and
+- the empty list
+are considered logically false. All other values are logically true.
+
+** Evaluation
+
+The evaluation follows a strict, functional, call-by-value evaluation
+mechanism; the precise evaluation is as follows.
+
+- Atomic values (~null~, booleans, strings, numbers) evaluate to
+ themselves.
+- For lists, the entries are evaluated in the order they occur in the
+ list; the result of the evaluation is the list of the results.
+- For JSON objects (which can be understood as maps, or dicts), the
+ key ~"type"~ has to be present and has to be a literal string.
+ That string determines the syntactical construct (sloppily also
+ referred to as "function") the object represents, and the remaining
+ evaluation depends on the syntactical construct. The syntactical
+ construct has to be either one of the built-in ones or a special
+ function available in the given context (e.g., ~"ACTION"~ within
+ the expression defining a rule).
+
+All evaluation happens in an "environment" which is a map from
+strings to semantic values.
+
+*** Built-in syntactical constructs
+
+**** Special forms
+
+***** Variables: ~"var"~
+
+The key ~"name"~ has to be present and (i.e., the expression in the
+object at that key) has to be a literal string, taken as variable
+name. If the variable name is in the domain of the environment and
+the value of the environment at the variable name is non-~null~,
+then the result of the evaluation is the value of the variable in
+the environment.
+
+Otherwise, the key ~"default"~ is taken (if present, otherwise the
+value ~null~ is taken as default for ~"default"~) and evaluated.
+The value obtained this way is the result of the evaluation.
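+
+As an illustrative sketch (the variable name ~"CC"~ is hypothetical
+and only chosen for this example), the following expression evaluates
+to the value of ~"CC"~ in the environment if that value is
+non-~null~, and to ~"gcc"~ otherwise.
+
+#+BEGIN_SRC js
+{"type": "var", "name": "CC", "default": "gcc"}
+#+END_SRC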
+
+***** Sequential binding: ~"let*"~
+
+The key ~"bindings"~ (default ~[]~) has to be (syntactically) a
+list of pairs (i.e., lists of length two) with the first component
+a literal string.
+
+For each pair in ~"bindings"~ the second component is evaluated, in
+the order the pairs occur. After each evaluation, a new environment
+is taken for the subsequent evaluations; the new environment is
+like the old one but amended at the position given by the first
+component of the pair to now map to the value just obtained.
+
+Finally, the ~"body"~ is evaluated in the final environment (after
+evaluating all binding entries) and the result of evaluating the
+~"body"~ is the value for the whole ~"let*"~ expression.
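+
+A small sketch of sequential binding, with variable names made up
+for illustration: the second binding may refer to the first one,
+as it is evaluated in the already amended environment. Following
+the rules above, the expression should evaluate to ~"foobar"~.
+
+#+BEGIN_SRC js
+{ "type": "let*"
+, "bindings":
+  [ ["x", "foo"]
+  , ["y", {"type": "join", "$1": [{"type": "var", "name": "x"}, "bar"]}]
+  ]
+, "body": {"type": "var", "name": "y"}
+}
+#+END_SRC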
+
+***** Conditionals
+
+****** Binary conditional: ~"if"~
+
+First the key ~"cond"~ is evaluated. If it evaluates to a value that
+is logically true, then the key ~"then"~ is evaluated and its value
+is the result of the evaluation. Otherwise, the key ~"else"~ (if
+present, otherwise ~[]~ is taken as default) is evaluated and the
+obtained value is the result of the evaluation.
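+
+For illustration, assuming a hypothetical variable ~"DEBUG"~, the
+following expression evaluates to ~["-g"]~ if ~"DEBUG"~ is bound
+to a logically true value, and to ~["-O2"]~ otherwise (including
+the case where ~"DEBUG"~ is not set at all, as the variable then
+evaluates to ~null~).
+
+#+BEGIN_SRC js
+{ "type": "if"
+, "cond": {"type": "var", "name": "DEBUG"}
+, "then": ["-g"]
+, "else": ["-O2"]
+}
+#+END_SRC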
+
+****** Sequential conditional: ~"cond"~
+
+The key ~"cond"~ has to be a list of pairs. In the order of the
+list, the first components of the pairs are evaluated, until one
+evaluates to a value that is logically true. For that pair, the
+second component is evaluated and the result of this evaluation is
+the result of the ~"cond"~ expression.
+
+If all first components evaluate to a value that is logically false,
+the result of the expression is the result of evaluating the key
+~"default"~ (defaulting to ~[]~).
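+
+As a sketch with hypothetical variables ~"DEBUG"~ and ~"OPTIMIZE"~,
+the following expression tests the conditions in order. It yields
+~["-g"]~ if ~"DEBUG"~ is logically true, otherwise ~["-O2"]~ if
+~"OPTIMIZE"~ is logically true, and the empty list if neither is.
+
+#+BEGIN_SRC js
+{ "type": "cond"
+, "cond":
+  [ [{"type": "var", "name": "DEBUG"}, ["-g"]]
+  , [{"type": "var", "name": "OPTIMIZE"}, ["-O2"]]
+  ]
+, "default": []
+}
+#+END_SRC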
+
+****** String case distinction: ~"case"~
+
+If the key ~"case"~ is present, it has to be a map (an "object", in
+JSON's terminology). In this case, the key ~"expr"~ is evaluated; it
+has to evaluate to a string. If the value is a key in the ~"case"~
+map, the expression at this key is evaluated and the result of that
+evaluation is the value for the ~"case"~ expression.
+
+Otherwise (i.e., if ~"case"~ is absent or ~"expr"~ evaluates to a
+string that is not a key in ~"case"~), the key ~"default"~ (with
+default ~[]~) is evaluated and this gives the result of the ~"case"~
+expression.
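+
+The following sketch dispatches on a hypothetical variable ~"OS"~;
+the variable name, the keys, and the resulting values are assumptions
+made for this example only. With ~"OS"~ unset or set to ~"linux"~ the
+result is ~["-pthread"]~; for any string that is not a key of the
+~"case"~ map, the ~"default"~ applies and the result is ~[]~.
+
+#+BEGIN_SRC js
+{ "type": "case"
+, "expr": {"type": "var", "name": "OS", "default": "linux"}
+, "case": {"linux": ["-pthread"], "windows": []}
+, "default": []
+}
+#+END_SRC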
+
+****** Sequential case distinction on arbitrary values: ~"case*"~
+
+If the key ~"case"~ is present, it has to be a list of pairs. In this
+case, the key ~"expr"~ is evaluated. The result of that evaluation
+is sequentially compared to the evaluation of the first components
+of the ~"case"~ list until an equal value is found. In this case,
+the evaluation of the second component of the pair is the value of
+the ~"case*"~ expression.
+
+If the ~"case"~ key is absent, or no equality is found, the result of
+the ~"case*"~ expression is the result of evaluating the ~"default"~
+key (with default ~[]~).
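+
+Since ~"case*"~ compares arbitrary values, it can also distinguish
+non-string values. In the following sketch (the variable name
+~"FLAG"~ is hypothetical), the result should be ~"on"~ if ~"FLAG"~
+is ~true~, ~"off"~ if it is ~false~, and ~"unset"~ for every other
+value, including an unset variable.
+
+#+BEGIN_SRC js
+{ "type": "case*"
+, "expr": {"type": "var", "name": "FLAG"}
+, "case": [[true, "on"], [false, "off"]]
+, "default": "unset"
+}
+#+END_SRC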
+
+***** Conjunction and disjunction: ~"and"~ and ~"or"~
+
+For conjunction, if the key ~"$1"~ (with default ~[]~) is syntactically
+a list, its entries are sequentially evaluated until a logically
+false value is found; in that case, the result is ~false~, otherwise
+~true~. If the key ~"$1"~ has a different shape, it is evaluated and
+has to evaluate to a list. The result is the conjunction of the
+logical values of the entries. In particular, ~{"type": "and"}~
+evaluates to ~true~.
+
+For disjunction, the evaluation mechanism is the same, but the truth
+values and connective are taken dually. So, ~"and"~ and ~"or"~ are
+logical conjunction and disjunction, respectively, using short-cut
+evaluation if syntactically admissible (i.e., if the argument is
+syntactically a list).
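+
+As an example with hypothetical variables ~"A"~ and ~"B"~, the
+following expression evaluates to ~true~ if both variables are bound
+to logically true values, and to ~false~ otherwise. As the argument
+is syntactically a list, ~"B"~ is not even looked up if ~"A"~ is
+logically false. Note that the result is a boolean, not the value
+of one of the entries.
+
+#+BEGIN_SRC js
+{ "type": "and"
+, "$1": [{"type": "var", "name": "A"}, {"type": "var", "name": "B"}]
+}
+#+END_SRC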
+
+***** Mapping
+
+****** Mapping over lists: ~"foreach"~
+
+First the key ~"range"~ is evaluated and has to evaluate to a list.
+For each entry of this list, the expression ~"body"~ is evaluated
+in an environment that is obtained from the original one by setting
+the value for the variable specified at the key ~"var"~ (which has
+to be a literal string, default ~"_"~) to that value. The result
+is the list of those evaluation results.
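+
+A minimal sketch, with the variable name ~"x"~ chosen freely:
+following the description above, the expression below should
+evaluate to ~["a.o", "b.o"]~.
+
+#+BEGIN_SRC js
+{ "type": "foreach"
+, "var": "x"
+, "range": ["a", "b"]
+, "body": {"type": "join", "$1": [{"type": "var", "name": "x"}, ".o"]}
+}
+#+END_SRC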
+
+****** Mapping over maps: ~"foreach_map"~
+
+Here, ~"range"~ has to evaluate to a map. For each entry (in
+lexicographic order (according to native byte order) by keys), the
+expression ~"body"~ is evaluated in an environment obtained from
+the original one by setting the variables specified at ~"var_key"~
+and ~"var_val"~ (literal strings, default values ~"_"~ and
+~"$_"~, respectively). The result of the evaluation is the list of
+those values.
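+
+For illustration, assume the (hypothetical) variable ~"m"~ is bound
+in the environment to the map sending ~"a"~ to ~"1"~ and ~"b"~ to
+~"2"~. Then the following expression should evaluate to
+~["a=1", "b=2"]~, as the entries are visited in the order of their
+keys.
+
+#+BEGIN_SRC js
+{ "type": "foreach_map"
+, "var_key": "k"
+, "var_val": "v"
+, "range": {"type": "var", "name": "m"}
+, "body":
+  { "type": "join"
+  , "$1": [{"type": "var", "name": "k"}, "=", {"type": "var", "name": "v"}]
+  }
+}
+#+END_SRC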
+
+***** Folding: ~"foldl"~
+
+The key ~"range"~ is evaluated and has to evaluate to a list.
+Starting from the result of evaluating ~"start"~ (default ~[]~) a
+new value is obtained for each entry of the range list by evaluating
+~"body"~ in an environment obtained from the original by binding
+the variable specified by ~"var"~ (literal string, default ~"_"~) to
+the list entry and the variable specified by ~"accum_var"~ (literal
+string, default value ~"$1"~) to the old value. The result is the
+last value obtained.
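+
+As a sketch (the variable names ~"x"~ and ~"acc"~ are arbitrary),
+the following fold concatenates the entries of the range from left
+to right. The accumulated value goes from ~""~ over ~"a"~ and ~"ab"~
+to ~"abc"~, which is the result of the expression.
+
+#+BEGIN_SRC js
+{ "type": "foldl"
+, "var": "x"
+, "accum_var": "acc"
+, "start": ""
+, "range": ["a", "b", "c"]
+, "body":
+  { "type": "join"
+  , "$1": [{"type": "var", "name": "acc"}, {"type": "var", "name": "x"}]
+  }
+}
+#+END_SRC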
+
+**** Regular functions
+
+First ~"$1"~ is evaluated; for binary functions ~"$2"~ is evaluated
+next. For functions that accept keyword arguments, those are
+evaluated as well. Finally the function is applied to this (or
+those) argument(s) to obtain the final result.
+
+***** Unary functions
+
+- ~"nub_right"~ The argument has to be a list. The result is the
+ input list, except that for all duplicate values, all but the
+ rightmost occurrence are removed.
+
+- ~"basename"~ The argument has to be a string. This string is
+ interpreted as a path, and the file name thereof is returned.
+
+- ~"keys"~ The argument has to be a map. The result is the list of
+ keys of this map, in lexicographical order (according to native
+ byte order).
+
+- ~"values"~ The argument has to be a map. The result is the list of
+ values of that map, ordered by the corresponding keys (lexicographically
+ according to native byte order).
+
+- ~"range"~ The argument is interpreted as a non-negative integer as
+ follows. Non-negative numbers are rounded to the nearest integer;
+ strings have to be the decimal representation of an integer;
+ everything else is considered zero. The result is a list of the
+ given length, consisting of the decimal representations of the
+ first non-negative integers. For example, ~{"type": "range",
+ "$1": "3"}~ evaluates to ~["0", "1", "2"]~.
+
+- ~"++"~ The argument has to be a list of lists. The result is the
+ concatenation of those lists.
+
+- ~"map_union"~ The argument has to be a list of maps. The result
+ is a map containing as keys the union of the keys of the maps in
+ that list. For each key, the value is the value of that key in
+ the last map in the list that contains that key.
+
+- ~"join_cmd"~ The argument has to be a list of strings. A single
+ string is returned that quotes the original vector in a way
+ understandable by a POSIX shell. As the command for an action is
+ directly given by an argument vector, ~"join_cmd"~ is typically
+ only used for generated scripts.
+
+- ~"json_encode"~ The result is a single string that is the canonical
+ JSON encoding of the argument (with minimal white space); all atomic
+ values that are not part of JSON (i.e., the added atomic values
+ to represent build-internal values) are serialized as ~null~.
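+
+To illustrate some of the unary functions described above, the
+following three expressions should, in order, evaluate to
+~["b", "a"]~, to ~"bar.c"~, and to ~["a", "b", "c"]~. The argument
+values are made up for this example.
+
+#+BEGIN_SRC js
+{"type": "nub_right", "$1": ["a", "b", "a"]}
+
+{"type": "basename", "$1": "foo/bar.c"}
+
+{"type": "++", "$1": [["a"], ["b", "c"]]}
+#+END_SRC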
+
+***** Unary functions with keyword arguments
+
+- ~"change_ending"~ The argument has to be a string, interpreted as
+ path. The ending is replaced by the value of the keyword argument
+ ~"ending"~ (a string, default ~""~). For example, ~{"type":
+ "change_ending", "$1": "foo/bar.c", "ending": ".o"}~ evaluates
+ to ~"foo/bar.o"~.
+
+- ~"join"~ The argument has to be a list of strings. The return
+ value is the concatenation of those strings, separated by the
+ specified ~"separator"~ (a string, default ~""~).
+
+- ~"escape_chars"~ Prefix, in the argument, every character
+ occurring in ~"chars"~ (a string, default ~""~) by ~"escape_prefix"~ (a
+ string, default ~"\\"~).
+
+- ~"to_subdir"~ The argument has to be a map (not necessarily of
+ artifacts). The keys as well as the ~"subdir"~ (string, default
+ ~"."~) argument are interpreted as paths and keys are replaced
+ by the path concatenation of those two paths. If the optional
+ argument ~"flat"~ (default ~false~) evaluates to a true value,
+ the keys are instead replaced by the path concatenation of the
+ ~"subdir"~ argument and the base name of the old key. It is an
+ error if conflicts occur in this way.
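+
+As a sketch of how keyword arguments are passed, the following two
+expressions should evaluate to ~"a, b"~ and ~"a%.b"~, respectively.
+The argument values, including the escape prefix ~"%"~, are chosen
+for illustration only.
+
+#+BEGIN_SRC js
+{"type": "join", "$1": ["a", "b"], "separator": ", "}
+
+{"type": "escape_chars", "$1": "a.b", "chars": ".", "escape_prefix": "%"}
+#+END_SRC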
+
+***** Binary functions
+
+- ~"=="~ The result is ~true~ if the arguments are equal, ~false~
+ otherwise.
+
+- ~"concat_target_name"~ This function is only present to simplify
+ transitions from some other build systems and normally not used
+ outside code generated by transition tools. The second argument
+ has to be a string or a list of strings (in the latter case,
+ it is treated as a single string by concatenating the entries). If the
+ first argument is a string, the result is the concatenation of
+ those two strings. If the first argument is a list of strings,
+ the result is that list with the second argument concatenated to
+ the last entry of that list (if any).
+
+***** Other functions
+
+- ~"empty_map"~ This function takes no arguments and always returns
+ an empty map.
+
+- ~"singleton_map"~ This function takes two keyword arguments,
+ ~"key"~ and ~"value"~ and returns a map with one entry, mapping
+ the given key to the given value.
+
+- ~"lookup"~ This function takes two keyword arguments, ~"key"~
+ and ~"map"~. The ~"key"~ argument has to evaluate to a string
+ and the ~"map"~ argument has to evaluate to a map. If that map
+ contains the given key and the corresponding value is non-~null~,
+ the value is returned. Otherwise the ~"default"~ argument (with
+ default ~null~) is evaluated and returned.
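+
+A JSON object in an expression always denotes a syntactical
+construct, so maps have to be constructed by functions. As a sketch
+with made-up keys and values, the following expression builds a
+one-entry map and looks up the key ~"a"~ in it, which should give
+~"1"~. Looking up ~"b"~ instead would yield the default ~"none"~.
+
+#+BEGIN_SRC js
+{ "type": "lookup"
+, "key": "a"
+, "map": {"type": "singleton_map", "key": "a", "value": "1"}
+, "default": "none"
+}
+#+END_SRC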
+
+**** Constructs related to reporting of user errors
+
+Normally, if an error occurs during the evaluation the error is
+reported together with a stack trace. This, however, might not
+be the most informative way to present a problem to the user,
+especially if the underlying problem is a proper user error, e.g.,
+in rule usage (leaving out mandatory arguments, violating semantical
+prerequisites, etc). To allow proper error reporting, the following
+functions are available. All of them have an optional argument
+~"msg"~ that is evaluated (only) in case of error; the result of
+that evaluation is included in the error message presented to the user.
+
+- ~"fail"~ Evaluation of this function unconditionally fails.
+
+- ~"context"~ This function is only there to provide additional
+ information in case of error. Otherwise it is the identity
+ function (a unary function, i.e., the result of the evaluation
+ is the result of evaluating the argument ~"$1"~).
+
+- ~"assert_non_empty"~ Evaluate the argument (given by the parameter
+ ~"$1"~). If it evaluates to a non-empty string, map, or list,
+ return the result of the evaluation. Otherwise fail.
+
+- ~"disjoint_map_union"~ Like ~"map_union"~, but it is an error
+ if two (or more) maps contain the same key but map it to
+ different values.
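+
+As an illustrative sketch (the variable name ~"srcs"~ and the
+message text are assumptions made for this example), the following
+expression returns the value of ~"srcs"~ if it is a non-empty
+string, map, or list, and otherwise fails with an error message
+that includes the given ~"msg"~.
+
+#+BEGIN_SRC js
+{ "type": "assert_non_empty"
+, "msg": "The variable srcs must not be empty"
+, "$1": {"type": "var", "name": "srcs"}
+}
+#+END_SRC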