Documentation
The build is implemented using bash scripts, which in turn use various tools to produce the site.
General design principles
- As much sanity checking and validation as possible is performed, and there are few warnings: either the build is perfect or it fails. This is annoying at times, when petty details bring the whole thing down, but it has the big advantage that minor issues can never accumulate. The idea is to start with a clean site, make changes in small steps, and build often, so that the site is never far from a working build.
- Unit test everything (i.e. very local tests with minimal dependencies on the environment).
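The "perfect or it fails" principle can be sketched in bash roughly as follows. This is a minimal illustration, not the build's actual code: the helper names die and require_file are hypothetical.

```bash
#!/usr/bin/env bash
# Strict mode: abort on any failing command, any unset variable,
# and any failing stage of a pipeline.
set -euo pipefail

# die: hypothetical helper; print a message to stderr and abort.
die() {
    printf 'ERROR: %s\n' "$*" >&2
    exit 1
}

# require_file: hypothetical check that fails hard instead of warning.
require_file() {
    [ -f "$1" ] || die "missing file: $1"
    [ -s "$1" ] || die "empty file: $1"
}
```

With checks written this way, a script that reaches its last line has passed every check on the way.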
Architecture
The build consists of:
- A makefile which manages the phases of the build.
- A script for each phase.
- Utility scripts that each accomplish one concrete thing.
- Unit and integration tests of all of the above.
- The build.sh wrapper script, which sets up site-specific environment variables and invokes the build.
Makefile
The makefile is just a bunch of phony targets: GNU make simply manages the dependencies between the phase scripts. It is integration tested by invoking all targets on dummy sites.
Phase scripts
The build is split into phases, each handled entirely by a "phase script" (check-src, format-src, transform, minimise-dest, check-dest, etc.) that invokes utility scripts to do the dirty work.
The phase scripts are "aware" of being part of the build, i.e. they explicitly take files from src, modify them in tmp, and place the final result in dest.
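The src, tmp, dest flow might look roughly like the sketch below. It assumes the MOLK_SRC, MOLK_TMP and MOLK_DEST environment variables are set; the function name transform_phase and the way the utility is passed in are hypothetical, chosen only for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# transform_phase: hypothetical phase function.
#   $1 = utility command that modifies a single file in place
transform_phase() {
    local util=$1 page name
    # Fail immediately if the build environment is incomplete.
    local src=${MOLK_SRC:?MOLK_SRC is not set}
    local tmp=${MOLK_TMP:?MOLK_TMP is not set}
    local dest=${MOLK_DEST:?MOLK_DEST is not set}

    mkdir -p "$tmp" "$dest"
    for page in "$src"/*.html; do
        [ -e "$page" ] || continue      # glob matched nothing
        name=$(basename "$page")
        cp "$page" "$tmp/$name"         # never modify the source itself
        "$util" "$tmp/$name"            # the utility edits the copy in place
        cp "$tmp/$name" "$dest/$name"   # place the final result in dest
    done
}
```

The utility only ever sees a path in tmp, so all knowledge of the directory layout stays in the phase function.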
Phase scripts easily become enormous spaghetti-beasts, which is why as many discrete operations as possible have been extracted into "utility" scripts, which are more modular and hence testable.
The phase scripts are tested by invoking them via the makefile on dummy site sources. Very little detailed validation is performed - if the build terminates without an error, it is assumed to have worked. Because of the general principle that util scripts fail on the slightest irregularity, the build terminating OK means a lot.
Build Life Cycle Phases
The build life cycle consists of the following phases:
- setup: Check the environment configuration, create directories, etc.
- format-src: Format the source files for readability.
- check-src: Perform extensive checking and validation of the source files (HTML, feeds, JS, CSS, etc.).
- copy-resources: Copy resources (non-HTML files in the source content directory) to the destination directory.
- style: Join and copy stylesheets, embedding images using the data URI scheme.
- transform-pages-setup: Set up the temporary page transformation directory, and prepare the HTML fragments for inclusion.
- initialise-feeds: Create the Atom feed files in the destination directory, and the HTML feed files in the source directory (ready for transformation), populating both with entries.
- transform-pages: Transform the source content pages into deployment-ready pages in the destination directory.
- finalise-feeds: Finalise the feeds in the destination directory. Delete the generated HTML feeds from the source directory.
- minify-dest: Minify the destination files to optimise bandwidth consumption.
- check-dest: Validate and check the destination files (HTML, feeds, JS, CSS, etc.).
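The real build leaves sequencing to GNU make, but the life cycle can be pictured as a plain loop. The sketch below assumes that the phases run in the order listed above and that the first failing phase stops the build; run_phases and its runner argument are hypothetical names.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Phase names, in the order they are listed above.
PHASES=(setup format-src check-src copy-resources style transform-pages-setup
        initialise-feeds transform-pages finalise-feeds minify-dest check-dest)

# run_phases: call the runner once per phase, stopping at the first failure.
#   $1 = command that takes a phase name as its only argument
run_phases() {
    local runner=$1 phase
    for phase in "${PHASES[@]}"; do
        printf '== %s\n' "$phase"       # progress reporting on stdout
        if ! "$runner" "$phase"; then
            printf 'phase failed: %s\n' "$phase" >&2
            return 1
        fi
    done
}
```

A runner could be as simple as a function that executes the phase script of the same name.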
Utility scripts
These are relatively tiny scripts, each performing a single, clearly delimited, well-defined task: check-html, format-js, etc. This makes them ideal for unit testing.
These scripts are unaware of the build: they simply get passed a file to work on, without knowing anything about src, tmp, dest, etc. If the argument file is to be modified, this is usually done in place; the invoking phase script is responsible for copying source files before having them modified by utility scripts.
The scripts are unit tested for correct behaviour, and for being as difficult as possible about the input they receive.
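As an illustration of the shape such a utility might take, here is a sketch of a hypothetical trim_trailing_space utility (not one of the build's real scripts). It accepts exactly one file, refuses anything irregular, and modifies the file in place.

```bash
#!/usr/bin/env bash

# trim_trailing_space: hypothetical utility; strips trailing whitespace
# from every line of exactly one file, in place.
trim_trailing_space() {
    if [ "$#" -ne 1 ]; then
        echo "usage: trim_trailing_space FILE" >&2
        return 2
    fi
    local file=$1 tmp
    if [ ! -f "$file" ] || [ ! -w "$file" ]; then
        echo "not a writable regular file: $file" >&2
        return 1
    fi
    tmp=$(mktemp) || return 1
    # Write to a temporary file first, so a failure never truncates the input.
    if sed 's/[[:space:]]*$//' "$file" > "$tmp"; then
        cat "$tmp" > "$file"            # keeps the original file's permissions
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

A unit test for it needs nothing but a scratch file: write a line with trailing blanks, run the function, compare the result, then confirm that a missing file or a surplus argument is rejected.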
Input
- src: The base directory of all source files.
- src/content: HTML sources, images, JavaScript and page-specific CSS (all global style info is in src/style).
- src/fragments: Bits that are used in every page are called "fragments". They can be modified globally here, and the build includes them in all pages as appropriate (e.g. the <header> element, common to all pages, can be customised globally in src/fragments/head.html).
- src/style: To facilitate keeping an overview of the CSS layout of the site, each section can be put in a separate file in a subdirectory of src/style/, e.g. src/style/mystyle/crumbs.css. The build will concatenate (and minify) the files of each subdirectory and put the result in subdir-name.css (here, mystyle.css). This yields optimal output while allowing the style of each page element to be kept in its own file.
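The stylesheet joining described for src/style could be sketched as below. This is an assumption-laden illustration: join_styles is a hypothetical name, files are concatenated in glob (sorted) order, an empty subdirectory is treated as an error, and minification and image embedding are left out.

```bash
#!/usr/bin/env bash
set -euo pipefail

# join_styles: hypothetical helper.
#   $1 = style source directory (e.g. src/style)
#   $2 = output directory
# Writes one <subdir-name>.css per subdirectory of $1.
join_styles() {
    local style_dir=$1 out_dir=$2 dir name
    local -a files
    mkdir -p "$out_dir"
    for dir in "$style_dir"/*/; do
        [ -d "$dir" ] || continue       # glob matched nothing
        name=$(basename "$dir")
        files=("$dir"*.css)
        if [ ! -e "${files[0]}" ]; then
            echo "no stylesheets in $dir" >&2
            return 1
        fi
        cat "${files[@]}" > "$out_dir/$name.css"
    done
}
```

For a source tree containing src/style/mystyle/crumbs.css, calling join_styles src/style out would produce out/mystyle.css.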
Global variables
The build script and the makefile set a couple of environment variables that are needed in many places and would be a nuisance to pass as arguments to scripts, e.g. MOLK_UTILS, which points to the directory where the utility scripts are located:
- MOLK_SRC: The base directory of the source site files. There are a couple of derived variables which give the location of specific sources, e.g. MOLK_CONTENT for the .html (and other) source files.
- MOLK_TMP: The build generates intermediary files in this directory. These files can be discarded when the build is done (e.g. using the "clean" target). They can be useful for analysing failed builds.
- MOLK_DEST: The location of the deployable site.
- MOLK_BUILD: The base directory of the molk.ch build files.
- MOLK_LOG: The build log file. This is where the build writes warnings. stdout is used primarily for progress reporting, which does not combine well with good warning messages.
- MOLK_UTILS: The location of the utility scripts. This is usually build-location/util, but to avoid hardcoding that path all over the place, all utilities are referenced via the variable.
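A wrapper in the spirit of build.sh might export these variables along the following lines. Only the variable names and the build-location/util convention come from the description above; the function name export_build_env, the tmp and dest locations, and the log file name are assumptions made for the sketch.

```bash
#!/usr/bin/env bash
set -euo pipefail

# export_build_env: hypothetical helper for a build.sh-style wrapper.
#   $1 = site directory, $2 = directory holding the build files
export_build_env() {
    local site=$1 build=$2
    export MOLK_BUILD="$build"
    export MOLK_UTILS="$MOLK_BUILD/util"      # "build-location/util"
    export MOLK_SRC="$site/src"               # assumed layout
    export MOLK_CONTENT="$MOLK_SRC/content"   # derived from MOLK_SRC
    export MOLK_TMP="$site/tmp"               # assumed layout
    export MOLK_DEST="$site/dest"             # assumed layout
    export MOLK_LOG="$MOLK_TMP/build.log"     # assumed file name
}
```

Because the variables are exported, every phase and utility script started afterwards can refer to, say, "$MOLK_UTILS/check-html" without knowing where the build files live.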
Online
The build must be online for some checks to be performed, e.g. the online W3C feed validator is used to validate the Atom feeds.
If the build runs without access to the internet, the online checks will be silently skipped.
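One way to get this skip-when-offline behaviour is to guard each online check with a connectivity probe, as in the sketch below. Both function names are hypothetical, and the probe assumes that curl is installed and that reaching validator.w3.org is a fair proxy for being online.

```bash
#!/usr/bin/env bash

# run_online_check: run a check only if the connectivity probe succeeds.
#   $1 = probe command; remaining arguments = the check and its arguments
run_online_check() {
    local probe=$1
    shift
    if ! "$probe"; then
        return 0        # offline: skip silently
    fi
    "$@"                # online: the check's exit status decides
}

# probe_w3c: example probe (assumption, see above).
probe_w3c() {
    curl --silent --head --fail --max-time 5 https://validator.w3.org/ \
        >/dev/null 2>&1
}
```

Passing the probe as an argument keeps the skip logic testable without a network connection: true and false can stand in for it.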