From bbcef5ae200e735e6faeb43d7ef48a186b1b35b2 Mon Sep 17 00:00:00 2001
From: Rick van Rein
Date: Mon, 22 Feb 2016 10:02:50 +0000
Subject: [PATCH] Extensive syntax descriptions for der_walk() and der_unpack() / der_pack()

---
 PACK-SYNTAX.MD | 243 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 README.MD      |   2 +-
 WALK-SYNTAX.MD | 116 +++++++++++++++++++++++++++
 tool/Makefile  |  10 ++-
 4 files changed, 368 insertions(+), 3 deletions(-)
 create mode 100644 PACK-SYNTAX.MD
 create mode 100644 WALK-SYNTAX.MD

diff --git a/PACK-SYNTAX.MD b/PACK-SYNTAX.MD
new file mode 100644
index 0000000..8bbbd75
--- /dev/null
+++ b/PACK-SYNTAX.MD
@@ -0,0 +1,243 @@
+# Syntax for Packing Paths
+
+This specification describes the path format used by `der_unpack()` and
+`der_pack()` to pass through a DER binary and map it to (or from) an array
+of `dercursor` values.
+
+A simpler variation of a similar idea is the
+[WALK SYNTAX](WALK-SYNTAX.MD),
+which limits itself to finding a single element, but also based on a
+(different) walking path expression.
+
+## Declaring and Using a Walking Path
+
+The walk path is described as a sequence of values that ends in `DER_PACK_END`:
+
+    #include
+
+    derwalk path_demo [] = {
+        ...,
+        DER_PACK_END
+    };
+
+which is then used to unpack the DER binary data under a `dercursor crs` with
+
+    int prsok;
+    prsok = der_unpack (&crs, path_demo, outputs, 1);
+
+The `outputs` argument is an array of `dercursor` that will be filled with the
+information found while following `path_demo`. Explicit storage instructions put
+that information there, so the required size of the array is equal to the fixed
+number of such instructions in `path_demo`.
+
+## Diving in Head-First
+
+The path described by `path_demo` should be a depth-first traversal of a
+*static* structure. That means that `SEQUENCE OF` and `SET OF`, the two ways
+of expressing dynamically sized structures in ASN.1, require special handling.
+
+To dive into a nested structure, use the `DER_PACK_ENTER` flag; to leave it, use
+the `DER_PACK_LEAVE` flag. To store intermediate values, use `DER_PACK_STORE`
+as a flag, as in
+
+    derwalk path_demo [] = {
+        DER_PACK_ENTER | ...,
+        DER_PACK_STORE | ...,
+        ...,
+        DER_PACK_LEAVE,
+        DER_PACK_END
+    };
+
+Note that `DER_PACK_LEAVE` is an instruction on its own. The other two
+forms are extended with a tag that is verified to match, for instance
+
+    derwalk path_demo [] = {
+        DER_PACK_ENTER | DER_TAG_SEQUENCE,
+        DER_PACK_ENTER | DER_TAG_CONTEXT (0),
+        DER_PACK_STORE | DER_TAG_INTEGER,
+        DER_PACK_LEAVE,
+        DER_PACK_STORE | DER_TAG_OCTETSTRING,
+        DER_PACK_LEAVE,
+        DER_PACK_END
+    };
+
+An example ASN.1 structure that could be traversed by this would be
+
+    demoStruct ::= SEQUENCE {
+        demoCounter [0] INTEGER,
+        demoName OCTET STRING
+    }
+
+After invoking `der_unpack()` on this path, there are two values in the `outputs`
+array of `dercursor`, namely for the two `DER_PACK_STORE` instructions. From the
+following input data
+
+    30 10                               -- tag SEQUENCE, length 16 (hex 10)
+       a0 03                            -- tag a0 for [0], length 3
+          02 01 07                      -- tag INTEGER, length 1, value 7
+       04 09 51 75 69 63 6b 20 44 45 52 -- tag 4, length 9, "Quick DER"
+
+this would make `outputs[0]` point to the `INTEGER` value 7, with 1 byte length,
+and `outputs[1]` would point to the `OCTET STRING` contents "Quick DER", with a
+length of 9 bytes.
+
+Note how nothing remains of the DER tags or lengths. This is what you should
+expect from a Quick and Easy DER parser.
+
+## Dealing with Variable Structures
+
+It is possible to not store everything we encounter. In the previous situation
+we might have used
+
+    derwalk path_demo [] = {
+        DER_PACK_ENTER | DER_TAG_SEQUENCE,
+        DER_PACK_STORE | DER_TAG_CONTEXT (0),
+        DER_PACK_STORE | DER_TAG_OCTETSTRING,
+        DER_PACK_LEAVE,
+        DER_PACK_END
+    };
+
+to find `outputs[0]` set to the DER sequence for `[0] INTEGER`, so in hex
+the bytes `a0 03 02 01 07` instead of just `07`.
+This can be useful at times.
+
+Imagine the ASN.1 structure
+
+    aFewPrimes ::= SET OF INTEGER
+
+which is a `SET OF` and can thus contain as many `INTEGER` values as desired.
+An example hexdump of a DER value listing the first 5 primes would be
+
+    31 0f          -- SET, containing 15 bytes
+       02 01 02    -- INTEGER 2
+       02 01 03    -- INTEGER 3
+       02 01 05    -- INTEGER 5
+       02 01 07    -- INTEGER 7
+       02 01 0b    -- INTEGER 11
+
+If we had to store each `INTEGER` in a separate `outputs[]` entry, we would need
+a variable-sized output array. What `der_unpack()` does in these cases is the
+same as demonstrated for `[0] INTEGER` above; it stores the entire structure
+of the `SET OF` and leaves it for further processing.
+
+The path expression to store this set would be
+
+    derwalk path_primes [] = {
+        DER_PACK_STORE | DER_TAG_SET_OF,
+        DER_PACK_END
+    };
+
+The result would be stored in `outputs[0]` as the sequence `02 01 02` and so on,
+of length 15. It is now possible to do a few things:
+
+ * use `der_iterate_first()` and `der_iterate_next()` to find the individual
+   values in the set;
+ * manually skip through the list with `der_skip()` until you hit the end of
+   the set;
+ * count the entries with `der_countelements()` and then allocate an array
+   `dercursor primal[5]`, on the heap or on the stack, and pass it into
+   `der_unpack ()` with the last parameter set to the count of 5.
+
+
+## Optionals, Choices and the ANYs
+
+Not everything declared in ASN.1 is included in the binary DER format.
+Some parts are `OPTIONAL` (and may have a `DEFAULT`, which Quick DER does not
+capture) and others are a `CHOICE` from variants.
+
+To encode an option, prefix `DER_PACK_OPTIONAL,` to the optional part. If the
+optional part is flagged with `DER_PACK_ENTER`, then the optionality will
+continue to the corresponding `DER_PACK_LEAVE`. Note that ASN.1 ensures that
+the DER format can always be parsed based on a single-tag lookahead, which we
+exploit in this case.
+
+A somewhat similar structure is the `CHOICE` which permits choosing a syntax
+variety from among alternatives. Again, ASN.1 ensures that parsing can be done
+based on the first tag. We use this once more, for paths between
+`DER_PACK_CHOICE_BEGIN` and `DER_PACK_CHOICE_END`. Note that the programmer
+of the walking path is responsible for proper nesting, also with respect to
+`ENTER`/`LEAVE` structure.
+
+Finally, the forms `ANY` and `ANY DEFINED BY` are used in ASN.1 to describe
+wildcard typing. These have no representation in DER either, but at the point
+where `der_unpack()` comes across the `DER_PACK_STORE | DER_PACK_ANY` instruction,
+it will match anything, and store the result in the output array of
+`dercursor`. The stored result is special however, in that it includes the
+entire DER structure including tag and length bytes. This is because you
+will have to do further processing.
+
+
+## Overlay structures
+
+The idea of static structure is a great benefit to us as programmers, because
+we can create overlay structures that consist solely of `dercursor` and other
+overlay structures. These give us a way to navigate through the data using
+ASN.1 labels.
+
+The first ASN.1 structure
+
+    demoStruct ::= SEQUENCE {
+        demoCounter [0] INTEGER,
+        demoName OCTET STRING
+    }
+
+could be overlaid with the C structure
+
+    typedef struct {
+        dercursor demoCounter;
+        dercursor demoName;
+    } ovly_demoStruct;
+
+and the program could declare
+
+    ovly_demoStruct output;
+
+and pass that to `der_unpack()` as `(dercursor *) &output` for type correctness.
+
+The data fields could then be addressed with something like
+
+    printf ("Found \"%.*s\"\n", (int) output.demoName.derlen, output.demoName.derptr);
+
+
+## Repacking
+
+There is a function `der_pack()` that does the exact opposite of `der_unpack()`,
+using the same walking paths.
+
+*Something to ignore until you run into trouble:*
+You may need to `der_prepack()` first if you have nested elements that are not
+a `SET (OF)` or `SEQUENCE (OF)` or other form that is always Constructed.
+Without `der_prepack()` your DER representation may end up being Primitive.
+
+
+## ASN.1 Compiler and RFC Library
+
+The intention is to generate this syntax automatically from ASN.1 files, including
+the overlay structures. The result would be a header file that provides macros
+that can fill paths, and structures that capture the structure of ASN.1 and
+specifically the labels used.
+
+Once we have a compiler, it is our intent to collect the RFCs and perhaps
+other specifications that use ASN.1 syntax, and to derive their header files
+for distribution in the developer version of Quick and Easy DER.
+
+These things combined should enable you to specify things like
+
+    #include
+    #include
+
+    typedef asn1_rfc5280_Certificate Certificate;
+    derwalk path_cert [] = { DER_RFC5280_CERTIFICATE, DER_PACK_END };
+
+    void print (dercursor *input) {
+        Certificate crt;
+        if (der_unpack (input, path_cert, (dercursor *) &crt, 1) == 0) {
+            ...crt.tbsCertificate.issuer...
+        }
+    }
+
+In short, you will be up and running with DER-encoded PKIX Certificates.
+
+And it will be
+
+Quick... and Easy!
diff --git a/README.MD b/README.MD
index a3d7f71..b9a8ef8 100644
--- a/README.MD
+++ b/README.MD
@@ -43,7 +43,7 @@
 in the respective `dercursor` variables to be NULL values; specifically,
 the function `der_isnull()` returns a true value for these elements.
 
-## Extra Code Facilities
+## Extra Coding Facilities
 
 There are routines `der_iterate_first()` and `der_iterate_next()` routines
 to manually iterate over a DER structure's components. This can be used to
This can be used to diff --git a/WALK-SYNTAX.MD b/WALK-SYNTAX.MD new file mode 100644 index 0000000..d4d51ff --- /dev/null +++ b/WALK-SYNTAX.MD @@ -0,0 +1,116 @@ +# Syntax for Walking Paths + +This specification describes how to make `der_walk()` traverse the path in DER +binaries that you intend it to take. + +## Declaring and Using a Walking Path + +The walk path is described as a sequence of values that ends in `DER_WALK_END`: + + #include + + derwalk path_demo [] = { + ..., + DER_WALK_END + } + +which is then used to move a `dercursor crs` with + + int prsok; + prsok = der_walk (&crs, path_demo); + +The output is -1 for hard errors or 0 for success. If it fails to parse the path +at some point, the return value is a positive integer, indicating how much of +`path_demo` was left unprocessed before `DER_WALK_END`. + +The `crs` value is updated by this call to point to the end of the `path_demo` +walk. + +## Entering and Skipping + +There are two basic actions that `der_walk()` takes at each position along the +path; it may either enter or skip a DER element. This is defined in the path +with `DER_WALK_ENTER` and `DER_WALK_SKIP`, respectively, as in + + derwalk path_demo [] = { + DER_WALK_ENTER | ..., + DER_WALK_SKIP | ..., + ..., + DER_WALK_END + } + +## Matching tags + +The tag found in the DER code must be matched, or otherwise a validation error +is raised (and a positive integer returned from `der_walk()` to indicate where +the problem was encountered). 
+
+Tags are quite simply matched by mentioning them after the enter-or-skip choice,
+as in
+
+    derwalk path_demo [] = {
+        DER_WALK_ENTER | DER_TAG_SEQUENCE,
+        DER_WALK_SKIP | DER_TAG_CONTEXT (0),
+        DER_WALK_ENTER | DER_TAG_OCTETSTRING,
+        DER_WALK_END
+    };
+
+The three steps shown could be used to get to the `OCTET STRING` in
+
+    demoStruct ::= SEQUENCE {
+        demoCounter [0] INTEGER,
+        demoName OCTET STRING
+    }
+
+This relates to the DER sequence for this structure; let's say the `INTEGER`
+value is `7` and the `OCTET STRING` is `Quick DER`, then the encoding would be
+
+    30 10                               -- tag SEQUENCE, length 16 (hex 10)
+       a0 03                            -- tag a0 for [0], length 3
+          02 01 07                      -- tag INTEGER, length 1, value 7
+       04 09 51 75 69 63 6b 20 44 45 52 -- tag 4, length 9, "Quick DER"
+
+From the start of this structure, we need to:
+
+ * Enter the `SEQUENCE`
+ * Skip the `[0]` rather than entering it
+ * Enter the `OCTET STRING`
+ * Stop processing
+
+This is precisely what the path walk describes. Although *some* understanding
+of the mapping to DER is helpful, you can generally derive the path to walk
+directly from the ASN.1 structure.
+
+When done, `der_walk()` has left `crs` pointing to the string "Quick DER" with a
+length of 9, and you can continue to process it:
+
+    printf ("Found \"%.*s\"\n", (int) crs.derlen, crs.derptr);
+
+## There is more
+
+Also have a look at the individual steps that can be taken with
+`der_enter()` and `der_skip()`. And take a look at
+`der_iterate_first()` and `der_iterate_next()` if you need iterators.
+
+Where `der_walk()` is ideally suited to retrieve a single piece of information
+from the repository, the `der_unpack()` routine can unpack a complete DER
+structure (only deferring dynamically-sized parts to later calls). The latter
+also has a reverse routine `der_pack()`. You will want to read the
+[PACK SYNTAX](PACK-SYNTAX.MD) for the walking paths used with those routines.
+
+
+## Optionals, Choices and the ANYs
+
+There is a possibility in ASN.1 to specify an element as `OPTIONAL`, perhaps
+even having a `DEFAULT` value (which is ignored by Quick DER). To mark an
+entry as optional, precede it with `DER_WALK_OPTIONAL`.
+
+Choices are barely interesting during a walk; in fact, the only purpose they
+serve is as something to skip over (since we obviously have no idea how to get
+into a structure if we don't know yet what that structure is like). So,
+specify `DER_WALK_SKIP | DER_WALK_CHOICE` to skip an arbitrary element;
+there will be no validation of that particular tag.
+
+The forms `ANY` and `ANY DEFINED BY` receive the same treatment as `CHOICE`,
+but can be declared with a separate symbol `DER_WALK_ANY`.
+
diff --git a/tool/Makefile b/tool/Makefile
index 6fdfff9..3c56b77 100644
--- a/tool/Makefile
+++ b/tool/Makefile
@@ -5,8 +5,14 @@ all:
 
 clean:
 
 install:
-	install -m 0755 hexio/derdump.py "$(PREFIX)/bin/derdump"
+	#
+	# The hexio submodule did not arrive well in GIT
+	#
+	# If you need a good DER dumping utility, check it out yourself:
+	# https://github.com/vanrein/hexio
+	#
+	@#NOTYET# install -m 0755 hexio/derdump.py "$(PREFIX)/bin/derdump"
 
 uninstall:
-	rm -f "$(PREFIX)/bin/derdump"
+	@#NOTYET# rm -f "$(PREFIX)/bin/derdump"
-- 
1.7.10.4