So much of UNIX's tooling are domain-specific languages to parse, and to make those easier to write in C they (eventually) created YACC which GNU rewrote as Bison. Arguably all commandline all commandline programs are domain-specific languages! It is even used by Gettext (where in any other language I would've written a Shunting Yard or Pratt parser...), and most languages have equivalents!
Today I'll discuss Bison's frontend, & should only take a couple days to finish this.
The first thing Bison does is retrieve $BISON_PROGRAM_NAME if present into argv & a global, followed by initializing internationalization, quoting as per $LC_CTYPE, on-exit cleanup, glyphicons table (preferably Unicode), deduplicating hashmap for strings, "muscles" table + obstack allocation, line truncation data & tables for how seriously to treat different kinds of errors, & obstack/flag for lexing.
Then parses extensive commandline flags preprocessing to extract colouroptions first.
Bison's commandline parsing integrates the internal "complaints" infrastructure with special line-numbering handling, populating the tables I described earlier amongst other globals. Afterwords it validates there's a single extra commandline arg which it deduplicates & stores in a global & the "file_name" interpretor-var. As always --help & --version handled specially.
With extensive & optional time-profiling & some profiling bitflags it parses the input file exitting upon fatal error, ...
2. Output a file reporting rules marked as useless, states marked as conflicting, all useful grammar rules, terminals & all rules referencing them, same for non-terminals, & the state machine
3. Iterates over each state generating a nontrivial label & outputting a graphviz node followed by corresponding edges with reduction in red, all surrounded by a graphviz header & footer. Seperate internal module abstracts graphviz
4. Outputs the grammar & statemachine as XML without an XML serializer
5. For HTML it runs that XML file through bundled XSLT file via a configuable XSLT processor!
6. Unconditionally (as long as there weren't any errors) frees lookahead tokens.
7. Conditionally again outputs the statemachine to C in multiple passes I'll describe later.
From there it frees all the various data it has allocated whilst applying minor fixes keeping a backup file.
6/6 Fin for today! Will continue tomorrow with parsing & optimizing!
For people who care about, support, or build Free, Libre, and Open Source Software (FLOSS).