
-------------------------------------------------------------------------------
What is a parser and what does it do exactly?

A parser is a function whose primary purpose is creating and initializing
a node that is described by the field being parsed. This node may be
provided as the 'node' argument, but otherwise the parser is expected to
create it. A parser is essentially an __init__ method for a field.

A parser does a series of initialization steps with a few branching paths
depending on the provided arguments and what the parser is initializing.
While parsers can be designed to operate very differently from one another
(look at the Switch parser), most follow a certain series of steps.

Below is a list of those steps and their branching paths followed by a few
detailed descriptions of each step. Remember, these are simply general
guidelines for how most parsers will operate, and should only be used as a
template for what a parser should do and in what order it should be done.

  1: Block node creation
    If field.is_block is True and node is not provided:
      Create the node.
      Parent the node to the provided parent.
    Else:
      Do nothing

  2: Steptree parsing prep work
    If field.is_block is False:
      Do nothing
    Else:
      If field.is_container is True  or  "steptree_parents" is not in kwargs:
        Set the "is_steptree_root" variable to True.
        Set the "parents" variable to a new list.
      Else:
        Set the "is_steptree_root" variable to False.
        Set the "parents" variable to kwargs['steptree_parents']

      Set kwargs['steptree_parents'] to parents

      If STEPTREE is in the descriptor:
        Append the node to parents

  3: Offset adjustment
    If there is a valid parent and POINTER is in the descriptor:
      Set the "offset" variable to parent.get_meta('POINTER', attr_index, 
                                                   **kwargs)
    Elif ALIGN is in the descriptor:
      Align the "offset" variable using the given alignment.

  4: Parsing the node
    This step is pretty open ended since anything specific to
    this parser should probably be done here.

    If field.is_data:
      Set the "size" variable to   parent.get_size(attr_index, **kwargs)
      Seek the buffer to offset + root_offset
      Read "size" number of bytes from the buffer
      Decode the bytes that were read and set parent[attr_index] to it.
    Else:
      Call the parser of each child node in the node

  5: Parsing steptrees
    If field.is_block is True  and  is_steptree_root is True:
      Loop over all nodes in parents and call their steptrees parser.

  6: Return the most recent offset.


###  Block node creation  ###
#############################
If the node a parser is meant to build will be a Block but it isnt provided,
node must be set to an instance of the correct Block class, usually like so:
    node = (desc.get(BLOCK_CLS, self.py_type)\
        (desc, parent=parent, init_attrs=rawdata is None))

where self is the FieldType instance and desc, parent, and rawdata are
arguments that were provided to the parser.

After creation, the parser must also parent the node to the parent *


###  Steptree parsing prep work  ###
###################################
If the node being read is a Block, there is the possibility that the
node will have a steptree, or that its inner nodes could have steptrees.
If the node is a container, or a "steptree_parents" item isn't in kwargs,
the parser will consider itself a steptree_root and a new list will be
placed in kwargs['steptree_parents']. If a STEPTREE entry exists in the
descriptor(meaning the node being initialized will have a STEPTREE) the
node will be appended to kwargs['steptree_parents'].


###  Offset adjustment  ###
###########################
Before reading from the buffer, parsers may check for certain descriptor
entries that adjust the read offset. For example, if a pointer exists in the
descriptor, the offset may need to be set using the following code:
    offset = node.get_meta('POINTER', **kwargs)
or
    offset = parent.get_meta('POINTER', attr_index, **kwargs)

If a pointer doesnt exist, an alignment entry may need to be checked next.
If ALIGN exists, the offset must be adjusted using the following code:
    align = desc.get('ALIGN', 1)
    offset += (align - (offset % align)) % align

The reason for saying 'may' is because alignment and pointers are
never used in certain circumstances. For example, the elements in a
Struct have predefined offsets with the alignment already calculated
into them, so neither alignment nor pointers matter inside a struct. 
Pointers have no meaning for a Switch or BitStruct and alignment
has no meaning for a Switch, BitStruct, bytearray, or bytes.

Pointers should also never be used if there isn't a valid parent.
This is partially because pointers are usually previously parsed data
higher up in the node tree, and if there is no parent, there is no tree.
But also if a node that uses a pointer is serialized, its pointer must be
ignored so as to prevent the buffer from becoming excessively large due
to the pointer changing the initial write position to a non-zero offset.


###  Parsing the node  ###
##########################
This is the step that is the most varied and open ended.
If the node will be a Block, it will already exist by this point and this
section of the parser will be committed to calling the parsers of any nodes
within it and doing anything else specific to this field. If the node wont
be a Block instance, it should instead be created in either of two ways:
    1: converting bytes from the provided buffer(rawdata) into a python object
    2: copying a default value (specified in the descriptor or field)

When the node is created from bytes, the parser is responsible for
seeking to the proper offset in the rawdata, reading the appropriate
number of bytes from it, and the decoding them into a python object.
Decoding will often be relegated to the fields decoder function, with the
parser instead simply calling the decoder while providing the bytes read.

Determining the number of bytes to read is done by calling the parents
get_size method while supplying the attr_index and any other needed data.
This is what it usually looks like:
    size = parent.get_size(attr_index, root_offset=root_offset,
                           offset=offset, rawdata=rawdata, **kwargs)

If the node is being made from a default value, the descriptor will be checked
for a DEFAULT entry. If it exists, the node should be set to a copy of it.
If it doesnt, field.default should be called to get a default value like so:
    node = self.default()

After creation, the parser must also parent the node to the parent *


###  Parsing the steptrees  ###
##############################
Look at the "steptree" section in terminology.txt to better understand this.

If the node being parsed is a Block, it can contain a STEPTREE as well
as other nodes which can also contain a STEPTREE. After the node has
had all its subnodes read, the parent of all steptrees within it will
have been collected in the kwargs["steptree_parents"] list.

The parser should now loop over all entries in the parents list and
call the parser of each nodes STEPTREE. The list of parents must be
removed from kwargs before the kwargs are passed to the child parsers.


###  Returning the offset  ###
##############################
After the parser has finished calling the parser of each of its fields,
it must return the last offset that the last parser it called returned.
This is so the function that called this parser can know where the reading
stopped, usually for the purpose of calling another parser to start there.



* For assigning an attribute to a parent, parsers must use the index notation:
    parent[attr_index] = node

where "attr_index" is another provided argument. 

Using __setattr__ is not allowed since attr_index can be an integer index
into an array. All Block classes are designed to call their __setattr__ if
the given index is not an int or slice. As a result, index notation is much
more flexible and doesnt require somewhat obfuscating usage of magic methods.


-------------------------------------------------------------------------------
Where and why are they used?

Parsers are only called from two locations in the library; from within the
parse method of Block classes, and from within other parsers. When a block
is parsed it needs to call its fields parser, which handles the majority of
the parsing. When a field contains several fields within it and it's being
parsed, the fields within it need to be parsed as well. Thus the parser of
the outer field will need to call the parser of the fields within it.

Parsers exist as a simple way to tie a specific type of parsing method to
a specific type of field. While specific details of a field may vary
(such as its name, offset, size, etc), general aspects of parsing it dont.
The way in which bytes are pulled from a buffer, decoded, placed into their
parent node, and subfields parsers are called are all the same across fields
of the same type.

An example of this is how parsing a Struct always requires creating a Block
to represent the struct, giving it the descriptor of the field being parsed,
and calling the parsers of each field within the struct. Even though the
size, number of fields, offset of each field, and many other details vary
from struct to struct, the general method pf parsing it remains the same.


-------------------------------------------------------------------------------
What positional and keyword arguments should a parser
function expect and what are their purposes?

required positional:
    self --------- The FieldType instance whose parser function is being run.
                   The fields actual parser method calls its parser function
                   while providing self and passing on all args and kwargs.
                   Essentially the parser function acts like a class method
                   of self.

    desc --------- The descriptor that describes the field being parsed.

positional/keyword:
    node --------- The node that this parser would build.
                   If provided, instead of building a new node, the
                   parser will simply re-parse this provided one.

    parent ------- The object that this parser must parent the node that it
                   builds to. For parent to be considered valid, attr_index
                   must also be valid. Typically the node is parented to
                   parent using the index notation:
                       parent[attr_index] = node

    attr_index --- The index that the node will be parented to parent.
                   This is typically done using the index notation:
                       parent[attr_index].
                   If attr_index is None it is considered to be invalid and
                   parent is also invalid. This should only ever occur when
                   the topmost parser is called, as there is no parent.
                   In this case, node MUST be provided and valid.

    rawdata ------ The buffer being read from and parsed into python
                   objects such as Blocks, ints, floats, strings, etc.
                   This must either be None(meaning there is no data to parse),
                   or it must have read, seek, peek, and write methods.

    root_offset -- The root offset that all reading is done from.
                   Pointers and other offsets are relative to this value.

    offset ------- The initial offset that reading is done from.

Parser functions are allowed to be given arbitrary keyword arguments
in order to be as versatile as they need to be. Because of this they must
make use of **kwargs since unused keyword arguments may be provided.
kwargs must be passed to any meta_getter_setters and parsers being called.