Why CUE for Configuration
We selected CUE as the configuration language in Holos for a number of reasons described in this post. The process was a combination of process by elimination and the unique way CUE unifies configuration.
We evaluated a number of domain specific and general purpose languages before deciding on CUE. The CUE website, GitHub issues, and Marcel's videos do a great job of explaining most of these reasons, so I'll summarize and cite them here.
DSL or GPL
The first decision was if we should use a turing complete general purpose language, or a domain specific language (DSL). We decided to use a DSL because we knew from hard won experience configuration with general purpose languages invites too many problems over time.
- Configuration easily becomes non-deterministic, especially when remote procedure calls are involved.
- Many general purpose languages support type checking, but few support constraints and validation of data. We must write our own validation logic which often means validation happens haphazardly, if at all.
- Data is usually mutable, making it difficult to know where an output value came from.
- Configuration code is read much more frequently, and at more critical times like an outage, than it's written. I felt this pain and I don't want anyone using Holos to feel that way.
For these reasons we sought a domain specific language that focused on simplicity, readability, and data validation. This quote from Marcel got my attention focused on CUE.
I would argue that for configuration languages maintainability and readability are more important even than for programming languages, because they are ofter viewed by a larger group, often need to be changed in emergency conditions, and also as they are supposed to convey a certain contract. Most configuration languages, like GCL (my own doing), are more like scripting languages, making it easier to crank out definitions of large swats of data compactly, but being harder to comprehend and modify later.
Source: Comparisons between CUE, Jsonnet, Shall, OPA, etc.
Other DSLs
Template Engines
Template engines are not exactly a domain specific language, but they're similar. We already used Go templates in Helm to produce YAML, and previously used Jinja2 and ERB templates extensively for configuration tasks.
The fundamental problem with text template engines is that they manipulate text, not data. As a result, output is often rendered without error or indication the configuration is invalid until it is applied to the live system. Errors need to be handled faster and earlier, ideally immediately as we're writing in our editor.
For these reasons we can set aside all tools based on text templating.
Jsonnet
Marcel and the CUE website explain this much better than I can. We used Jsonnet to configure the kubernetes prometheus stack and experienced Jsonnet's lack of validation features first hand.
Like Jsonnet, CUE is a superset of JSON. They also are both influenced by GCL. CUE, in turn is influenced by Jsonnet. This may give the semblance that the languages are very similar. At the core, though, they are very different.
CUE’s focus is data validation whereas Jsonnet focuses on data templating (boilerplate removal). Jsonnet was not designed with validation in mind.
Jsonnet and GCL can be quite powerful at reducing boilerplate. The goal of CUE is not to be better at boilerplate removal than Jsonnet or GCL. CUE was designed to be an answer to two major shortcomings of these approaches: complexity and lack of typing. Jsonnet reduces some of the complexities of GCL, but largely falls into the same category. For CUE, the tradeoff was to add typing and reduce complexity (for humans and machines), at the expense of giving up flexibility.
Source: CUE Configuration Use Case - Jsonnet / GCL
Marcel answered this question in more depth earlier:
Jsonnet is based on BCL, an internal language at Google. It fixes a few things relative to BCL, but is mostly the same. This means it copies the biggest mistakes of BCL. Even though BCL is still widely used at Google, its issues are clear. It was just that the alternatives weren't that much better.
There are a myriad of issues with BCL (and Jsonnet and pretty much all of its descendants), but I will mention a couple:
- Most notably, the basic operation of composition of BCL/Jsonnet, inheritance, is not commutative and idempotent in the general case. In other words, order matters. This makes it, for humans, hard to track where values are coming from. But also, it makes it very complicated, if not impossible, to do any kind of automation. The complexity of inheritance is compounded by the fact that values can enter an object from one of several directions (super, overlay, etc.), and the order in which this happens matters. The basic operation of CUE is commutative, associative and idempotent. This order independence helps both humans and machines. The resulting model is much less complex.
- Typing: most of the BCL offshoots do not allow for schema definitions. This makes it hard to detect any kind of typos or user errors. For a large code bases, no one will question a requirement to have a compiled/typed language. Why should we not require the same kind of rigor for data? Some offshoots of BCL internal to Google and also external have tried to address this a bit, but none quite satisfactory. In CUE types and values are the same thing. This makes things both easier than schema-based languages (less concepts to learn), but also more powerful. It allows for intuitive but also precise typing.
There are many other issues, like handling cycles, unprincipled workarounds for hermeticity, poor tooling and so forth that make BCL and offsprings often awkward.
So why CUE? Configuration is still largely an unsolved problem. We have tried using code to generate configs, or hybrid languages, but that often results in a mess. Using generators on databases doesn't allow keeping it sync with revision control. Simpler approaches like HCL and Kustomize recognize the complexity issue by removing a lot of it, but then sometimes become too weak, and actually also reintroduce some of this complexity with overlays (a poor man's inheritance, if you will, but with some of the same negative consequences). Other forms of removing complexity, for instance by just introducing simpler forms/ abstraction layers of configuration, may work within certain context but are domain-specific and relatively hard to maintain.
So inheritance-based languages, for all its flaws, were the best we had. The idea behind CUE is to recognize that a declarative language is the best approach for many (not all) configuration problems, but to tackle the fundamental issues of these languages.
The idea for CUE is actually not new. It was invented about 30 years ago and has been in use and further developed since that time in the field of computational linguistics, where the concept is used to encode entire lexicons as well as very detailed grammars of human languages. If you think about it, these are huge configurations that are often maintained by both computer scientists and linguists. You can see this as a proof of concept that large-scale, declarative configuration for a highly complex domain can work.
CUE is a bit different from the languages used in linguistics and more tailored to the general configuration issue as we've seen it at Google. But under the hood it adheres strictly to the concepts and principles of these approaches and we have been careful not to make the same mistakes made in BCL (which then were copied in all its offshoots). It also means that CUE can benefit from 30 years of research on this topic. For instance, under the hood, CUE uses a first-order unification algorithm, allowing us to build template extractors based on anti-unification (see issue #7 and #15), something that is not very meaningful or even possible with languages like BCL and Jsonnet.
Source: how CUE differs from jsonnet
Dhall
Dhall addresses some of the issues of GCL and Jsonnet (like lack of typing), but lacks the detailed typing of CUE. But it still misses the most important property of CUE: its model of composability. Some of the benefits are explained in the above link. Conceptually, CUE is an aspect-oriented and constraint-based language. It allows you to specify fine-grained constraints on what are valid values. These constraints then double as templates, allowing to remove boilerplate often with the same efficacy as inheritance, even if it works very differently.
Source Comparisons between CUE, Jsonnet, Dhall, OPA, etc.
Rego (OPA)
CUE also can be used for policy specification, like Rego (OPA).CUE unifies values, types, and constraints in a single continuum. As it is a constraint-based language first and foremost, it is well suited for defining policy. It is less developed in that area than Rego, but it I expect it will ultimately be better suited for policy. Note that Rego is based on Datalog, which is more of a query language at hart, giving it quite a different feel for defining policy than CUE. Both are logic programming languages, though, and share many of the same properties.
Source Comparisons between CUE, Jsonnet, Dhall, OPA, etc.
PKL
I didn't look deeply into Pkl primarily because CUE, like Holos, is written in Go. It was straight forward to integrate CUE into Holos.
HCL
I have extensive experience with HCL and found it challenging to work with at medium to large scales.
See also: CUE Configuration Use Case - HCL
Editor Integration
CUE has good support today for Visual Studio Code, and better support coming, see the CUE LSP Roadmap
Additional Resources
The video Large-Scale Engineering of Configuration with Unification (Marcel van Lohuizen) motivated me to go deeper and invest significant time into CUE.