import2ledger code structure
============================

Concepts
--------

The main workflow of the program passes through three kinds of objects, each with its own responsibility: entry data, importers, and hooks.

Entry data
~~~~~~~~~~

Data for an output entry is kept and passed around in a dict with the following contents:

``date``
  A ``datetime.date`` object.  If this is omitted, the ``default_date`` hook will fill in the default date from the user's configuration.

``payee``
  A string

``amount``
  A string or other object that can be safely converted to a ``decimal.Decimal``

``currency``
  A string with a three-letter code, uppercase, identifying the transaction currency

It can optionally include additional keys for use as template variables.
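
As an illustration, an entry data dict with the four documented keys plus one extra template variable might look like the sketch below.  All the values, and the ``memo`` key, are invented for this example; only the four key names come from the list above.

```python
import datetime
import decimal

# Hypothetical entry data; only the first four key names are documented.
entry_data = {
    'date': datetime.date(2017, 10, 1),
    'payee': 'Example Donor',
    'amount': '25.00',          # any value decimal.Decimal() accepts
    'currency': 'USD',          # three-letter uppercase code
    'memo': 'Monthly pledge',   # extra key, usable as a template variable
}

# The amount only needs to convert cleanly to a Decimal.
amount = decimal.Decimal(entry_data['amount'])
```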

Importers
~~~~~~~~~

At a high level, importers read a source file, and generate data for output entries.

Class method ``can_handle(source_file)``
  Returns true if the importer can generate entries from the given source file object, false otherwise.

``__init__(source_file)``
  Initializes an importer to generate entries from the given source file object.

``__iter__()``
  Returns an iterator of entry data dicts.
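
The interface above can be sketched with a small importer.  This is only an illustration: the class name, the CSV column names, and the hard-coded currency are assumptions made for the example, not part of import2ledger.

```python
import csv
import datetime
import io


class ExampleCSVImporter:
    """Hypothetical importer for a CSV with Date, Name, Amount columns."""
    NEEDED_FIELDS = frozenset(['Date', 'Name', 'Amount'])

    @classmethod
    def can_handle(cls, source_file):
        # Read only the header row to decide whether this file is ours.
        try:
            header = next(csv.reader(source_file))
        except StopIteration:
            return False
        return cls.NEEDED_FIELDS.issubset(header)

    def __init__(self, source_file):
        self.in_csv = csv.DictReader(source_file)

    def __iter__(self):
        # Yield one entry data dict per CSV row.
        for row in self.in_csv:
            yield {
                'date': datetime.datetime.strptime(row['Date'], '%Y-%m-%d').date(),
                'payee': row['Name'],
                'amount': row['Amount'],
                'currency': 'USD',  # assumed; the sample file has no currency column
            }


source = io.StringIO('Date,Name,Amount\n2017-10-01,Example Donor,25.00\n')
if ExampleCSVImporter.can_handle(source):
    source.seek(0)  # can_handle consumed the header, so rewind first
    entries = list(ExampleCSVImporter(source))
```

Because ``can_handle`` reads from the file object, the caller rewinds it before constructing the importer.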

Hooks
~~~~~

Hooks make arbitrary transformations to entry data dicts.  Every entry data dict generated by an importer is run through every hook before being output.

``__init__(config)``
  Initializes the hook with the user's configuration.

``run(entry_data)``
  This method can make arbitrary transformations to the entry data, or filter it so it isn't output.

  If this method returns ``None``, processing the entry data continues normally.  Most hooks should do this, and just transform entry data in place.

  If this method returns ``False``, processing the entry data stops immediately.  The entry will not appear in the program output.

  If this method returns any other value, the program replaces the entry data with the return value, and continues processing.

Class attribute ``KIND``
  This should be one of the values of the ``hooks.HOOK_KINDS`` enum.  This information determines what order hooks run in.
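
A hook following this interface might look like the sketch below.  The enum here is a stand-in with invented member names, since the real members of ``hooks.HOOK_KINDS`` are not listed in this document; the hook class itself is also hypothetical.

```python
import decimal
import enum


class HOOK_KINDS(enum.Enum):
    # Stand-in for hooks.HOOK_KINDS; these member names are invented.
    MODIFIER = 1
    OUTPUT = 2


class SkipZeroAmountHook:
    """Hypothetical hook: tidy the payee, and drop zero-amount entries."""
    KIND = HOOK_KINDS.MODIFIER

    def __init__(self, config):
        self.config = config

    def run(self, entry_data):
        if decimal.Decimal(entry_data['amount']) == 0:
            return False  # filter: stop processing this entry
        # Transform in place; falling through returns None, so
        # processing continues with the same dict.
        entry_data['payee'] = entry_data['payee'].strip()


hook = SkipZeroAmountHook(config=None)
kept = {'payee': ' Example Donor ', 'amount': '25.00'}
kept_result = hook.run(kept)
dropped_result = hook.run({'payee': 'Nobody', 'amount': '0.00'})
```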

Loading importers and hooks
---------------------------

Importers and hooks are both found and loaded dynamically when the program starts.  This makes it easy to extend the program: you just need to write a class following the established pattern; no registration is needed.

import2ledger finds importers by looking at all ``.py`` files in the ``importers/`` directory, skipping files whose names start with ``.`` (hidden) or ``_`` (private).  It tries to import each remaining file as a module.  If the import succeeds, it looks for names in the module matching ``*Importer``, and adds those objects to the list of importers.

Hooks follow the same pattern, searching the ``hooks/`` directory and looking for things named ``*Hook``.

Technically this is done by ``importers.load_all()`` and ``hooks.load_all()`` functions, but most of the code to do this is in the ``dynload`` module.
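
The discovery rules described above can be approximated with the standard library.  The sketch below is not the ``dynload`` module's code: the ``load_all(dir_path, suffix)`` signature and the temporary demo files are assumptions, meant only to show the skip-and-match logic.

```python
import importlib.util
import pathlib
import tempfile


def load_all(dir_path, suffix):
    """Return objects named *suffix from importable .py files in dir_path."""
    found = []
    for path in sorted(pathlib.Path(dir_path).glob('*.py')):
        if path.name.startswith(('.', '_')):
            continue  # hidden or private file
        spec = importlib.util.spec_from_file_location(path.stem, str(path))
        module = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(module)
        except Exception:
            continue  # the file could not be imported; skip it
        found.extend(
            value for name, value in vars(module).items()
            if name.endswith(suffix)
        )
    return found


# Demo with throwaway files standing in for an importers/ directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    tmp_path = pathlib.Path(tmp_dir)
    (tmp_path / 'bank.py').write_text('class BankImporter: pass\n')
    (tmp_path / '_common.py').write_text('class BaseImporter: pass\n')
    (tmp_path / 'broken.py').write_text('import no_such_module_for_demo\n')
    loaded_names = [cls.__name__ for cls in load_all(tmp_path, 'Importer')]
```

Here only ``bank.py`` contributes a class: ``_common.py`` is skipped as private, and ``broken.py`` fails to import.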

Main loop
---------

At a high level, import2ledger handles each input file this way::

  usable_importers = importers where can_handle(input_file) returns true
  for importer_class in usable_importers:
    input_file.seek(0)
    for entry_data in importer_class(input_file):
      for hook in all_hooks:
        hook_return = hook.run(entry_data)
        if hook_return is False:
          break
        elif hook_return is not None:
          entry_data = hook_return

Note in particular that multiple importers can handle the same input file.  This helps support inputs like Patreon's earnings CSV, where completely different transactions are generated from the same source.
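
To make the loop concrete, here is a runnable sketch with toy importers and hooks.  Every class, the ``process`` function, and the input format are invented for illustration, and the rewind before each ``can_handle`` call is an assumption needed because these toy importers read from the file.

```python
import io


class PledgeImporter:
    """Toy importer: one entry per line after a 'pledges' header."""
    @classmethod
    def can_handle(cls, source_file):
        return source_file.readline().startswith('pledges')

    def __init__(self, source_file):
        source_file.readline()  # skip the header line
        self.lines = source_file

    def __iter__(self):
        for line in self.lines:
            payee, amount = line.split()
            yield {'payee': payee, 'amount': amount, 'currency': 'USD'}


class FeeImporter(PledgeImporter):
    """Toy importer: a second kind of entry from the same file."""
    def __iter__(self):
        for entry_data in super().__iter__():
            entry_data['payee'] = 'Fee on ' + entry_data['payee']
            yield entry_data


class DropBobHook:
    def run(self, entry_data):
        if entry_data['payee'] == 'Bob':
            return False  # filter this entry out


class CollectHook:
    def __init__(self):
        self.seen = []

    def run(self, entry_data):
        self.seen.append(entry_data['payee'])


def process(input_file, all_importers, all_hooks):
    usable_importers = []
    for importer_class in all_importers:
        input_file.seek(0)  # assumption: rewind before each check
        if importer_class.can_handle(input_file):
            usable_importers.append(importer_class)
    for importer_class in usable_importers:
        input_file.seek(0)
        for entry_data in importer_class(input_file):
            for hook in all_hooks:
                hook_return = hook.run(entry_data)
                if hook_return is False:
                    break  # later hooks never see this entry
                elif hook_return is not None:
                    entry_data = hook_return


collector = CollectHook()
process(
    io.StringIO('pledges\nAlice 5\nBob 7\n'),
    [PledgeImporter, FeeImporter],
    [DropBobHook(), collector],
)
```

Both importers accept the same file, so each line produces two entries; the one whose payee is exactly ``Bob`` is filtered before it reaches the collecting hook.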

Running tests
-------------

Run ``./setup.py test`` from the source directory.