Changeset - cc8c03ab626a
[Not reviewed]
0 5 0
Brett Smith - 7 years ago 2017-12-19 14:13:44
brettcsmith@brettcsmith.org
main: Provide template variables about the file being imported.
5 files changed with 55 insertions and 18 deletions:
0 comments (0 inline, 0 general)
README.rst
Show inline comments
 
import2ledger
 
=============
 

	
 
Introduction
 
------------
 

	
 
import2ledger provides a Python library and CLI tool to read financial data from various popular sources and generate entries for text-based accounting books from them.
 

	
 
Installation
 
------------
 

	
 
This is a pretty normal Python module, so if you have a favorite way to install those, it will probably work.  If you just want to install it to run the script locally, try running from the source directory::
 

	
 
  $ python3 -m pip install --user -U --upgrade-strategy only-if-needed .
 

	
 
Configuring import2ledger
 
-------------------------
 

	
 
Since different organizations follow different accounting rules, you need to define an entry template for each kind of data that you want to be able to import.  You do that in a configuration file.  By default, import2ledger reads a configuration file at ``~/.config/import2ledger.ini``.  You can specify a different path with the ``-C`` option, and you can specify that multiple times to read multiple configuration files in order.
 

	
 
Writing templates
 
~~~~~~~~~~~~~~~~~
 

	
 
A template looks like a Ledger entry with a couple of differences:
 

	
 
* You don't write the first payee line.
 
* Instead of writing specific currency amounts, you write simple expressions to calculate amounts.
 

	
 
Here's a simple template for Patreon patron payments::
 

	
 
  [DEFAULT]
 
  template patreon income =
 
    ;Tag: Value
 
    Income:Patreon  -{amount}
 
    Accrued:Accounts Receivable:Patreon  {amount}
 

	
 
Let's walk through this line by line.
 

	
 
Every setting in your configuration file has to be in a section.  ``[DEFAULT]`` is the default section, and import2ledger reads configuration settings from here if you don't specify another one.  This documentation explains how to use sections later.
 

	
 
``template patreon income =`` specifies which entry template this is.  Every template is found from a setting with a name in the pattern ``template <SOURCE> <TYPE>``.  The remaining lines are indented further than this name; this defines a multiline value.  Don't worry about the exact indentation of your template; import2ledger will indent its output nicely.
 

	
 
The first line of the template is a Ledger tag.  The program will leave all kinds of tags and Ledger comments alone, except to indent them nicely.
 

	
 
The next two lines split the money across accounts.  They follow almost the same format as they do in Ledger: there's an account named, followed by a tab or two or more spaces, and then an expression.  Each time import2ledger generates an entry, it will evaluate this expression using the data it imported to calculate the actual currency amount to write.  Your expression can use numbers, basic arithmetic operators (including parentheses for grouping), and imported data referred to as ``{variable_name}``.
 

	
 
import2ledger uses decimal math to calculate each amount, and rounds to the number of digits appropriate for that currency.  If the amount of currency being imported doesn't split evenly, spare change will be allocated to the last split to keep the entry balanced on both sides.
 

	
 
Refer to the `Python documentation for INI file structure <https://docs.python.org/3/library/configparser.html#supported-ini-file-structure>`_ for full details of the syntax.  Note that import2ledger doesn't use ``;`` as a comment prefix, since that's the primary comment prefix in Ledger.
 

	
 
Template variables
 
~~~~~~~~~~~~~~~~~~
 

	
 
import2ledger templates have access to a few variables for each transaction that can be included anywhere in the entry, not just amount expressions.  In other parts of the entry, they're treated as strings, and you can control their formatting using `Python's format string syntax <https://docs.python.org/3/library/string.html#formatstrings>`_.  For example, this template customizes the payee line and sets different payees for each side of the transaction::
 

	
 
  template patreon income =
 
    {date}  Patreon payment from {payee}
 
    Income:Patreon  -{amount}
 
    ;Payee: {payee}
 
    Accrued:Accounts Receivable:Patreon  {amount}
 
    ;Payee: Patreon
 

	
 
Templates automatically detect whether or not you have a custom payee line by checking if the first line begins with a date variable.  If it does, it's assumed to be your payee line.  Otherwise, the template uses a default payee line of ``{date} {payee}``.
 

	
 
Every template can use the following variables:
 

	
 
================== ==========================================================
 
Name               Contents
 
================== ==========================================================
 
amount             The total amount of the transaction, as a simple decimal
 
                   number (not currency-formatted)
 
------------------ ----------------------------------------------------------
 
currency           The three-letter code for the transaction currency
 
------------------ ----------------------------------------------------------
 
date               The date of the transaction, in your configured output
 
                   format
 
------------------ ----------------------------------------------------------
 
payee              The name of the transaction payee
 
------------------ ----------------------------------------------------------
 
source_abspath     The absolute path of the file being imported
 
------------------ ----------------------------------------------------------
 
source_name        The filename of the file being imported
 
------------------ ----------------------------------------------------------
 
source_path        The path of the file being imported, as specified on the
 
                   command line
 
================== ==========================================================
 

	
 
Specific importers and hooks may provide additional variables.
 

	
 
Supported templates
 
~~~~~~~~~~~~~~~~~~~
 

	
 
You can define the following templates.
 

	
 
Patreon
 
^^^^^^^
 

	
 
``template patreon income``
 
  Imports one transaction per patron per month.  Generated from Patreon's monthly patron report CSVs.
 

	
 
``template patreon cardfees``
 
  Imports one expense transaction per month for that month's credit card fees.  Generated from Patreon's earnings report CSV.
 

	
 
``template patreon svcfees``
 
  Imports one expense transaction per month for that month's Patreon service fees.  Generated from Patreon's earnings report CSV.
 

	
 
``template patreon vat``
 
  Imports one transaction per country per month each time Patreon withheld VAT.  Generated from Patreon's VAT report CSV.
 

	
 
  This template can use these variables:
 

	
 
  ============== ============================================================
 
  Name           Contents
 
  ============== ============================================================
 
  country_name   The full name of the country VAT was withheld for
 
  -------------- ------------------------------------------------------------
 
  country_code   The two-letter ISO country code of the country VAT was
 
                 withheld for
 
  ============== ============================================================
 

	
 
Stripe
 
^^^^^^
 

	
 
``template stripe payments``
 
  Imports one transaction per payment.  Generated from Stripe's payments CSV export.
 

	
 
  This template can use these variables:
 

	
 
  ============ ============================================================
 
  Name         Contents
 
  ============ ============================================================
 
  description  The description given to the payment when it was created
 
  ------------ ------------------------------------------------------------
 
  fee          The amount of fees charged by Stripe for this payment, as a
 
               plain decimal number
 
  ------------ ------------------------------------------------------------
 
  payment_id   The id assigned to this payment by Stripe
 
  ------------ ------------------------------------------------------------
 
  tax          The amount of tax withheld by Stripe for this payment, as a
 
               plain decimal number
 
  ============ ============================================================
 

	
 
Other output options
 
~~~~~~~~~~~~~~~~~~~~
 

	
 
Various options control how import2ledger formats dates and currency amounts.  Run ``import2ledger --help`` for a full list; they're listed under the "default overrides" section.  You can specify these in your configuration file, too.  Just change any dashes (``-``) in the option name to underscores (``_``), and separate the switch name from your value with ``=``.  For example, add this to your configuration file to use ISO 8601-formatted dates in your Ledger entries::
 

	
 
  date_format = %%Y-%%m-%%d
 

	
 
(``%`` is a special character in the configuration file syntax.  ``%%`` represents a literal percent sign.)
 

	
 
Configuration sections
 
~~~~~~~~~~~~~~~~~~~~~~
 

	
 
If you keep different sets of books, it might be helpful to have different configuration settings for each.  For example, you might adjust the templates for each set of books to use different accounts or tags.  Or you might just need to change the formatting of dates or currency amounts.
 

	
 
import2ledger makes this easy by letting you define a new section of options, and then using them all when you import transactions.  For example, say you keep two sets of books, one for US accounts and another for European accounts.  You could write a configuration file like::
 

	
 
  [us]
 
  date_format = %%m/%%d/%%Y
 
  signed_currencies = USD
 

	
 
  [eu]
 
  date_format = %%d.%%m.%%Y
 
  signed_currencies = EUR
 

	
 
When you run import2ledger, you can tell it to load options from one of these sections with the option ``--use-config NAME``, or ``-c NAME`` for short.  In this example, ``import2ledger -c eu`` would load the settings for your European books.  import2ledger looks for configuration settings in the following places, and uses the first setting it finds:
 

	
 
1. The configuration section you specify with ``-c``
 
2. Command-line options
 
3. The ``[DEFAULT]`` configuration section
 
4. import2ledger defaults
 

	
 
Running import2ledger
 
---------------------
 

	
 
Once you've written a configuration file with your templates and any other configuration you need, you can run it with data to import::
 

	
 
  $ import2ledger [options …] file1.csv [file2.csv …]
 

	
 
import2ledger will read all the data sources and generate bookkeeping entries from them.
 

	
 
import2ledger needs to be able to seek within the source files.  Because of that, your input files need to be normal files, or symlinks to them.  Special files like devices and FIFOs aren't supported by import2ledger.
 

	
 
Exit status
 
~~~~~~~~~~~
 

	
 
import2ledger reports the following exit codes:
 

	
 
========= =================================================================
 
Exit code Meaning
 
========= =================================================================
 
0         Imported data from all source files
 
--------- -----------------------------------------------------------------
 
1         Internal error (probably a bug)
 
--------- -----------------------------------------------------------------
 
2         Error in the user's settings, such as a formatting error in a
 
          template or date, or a reference to a nonexistent file or
 
          configuration file section
 
--------- -----------------------------------------------------------------
 
11-99     Failed to import data from at least one source file.  The exit
 
          code is 10 + the number of files that couldn't be imported, up
 
          to the maximum exit code of 99.
 
========= =================================================================
import2ledger/__main__.py
Show inline comments
 
import collections
 
import contextlib
 
import logging
 
import sys
 

	
 
from . import config, errors, hooks, importers
 

	
 
logger = logging.getLogger('import2ledger')
 

	
 
class FileImporter:
 
    def __init__(self, config, stdout):
 
        self.config = config
 
        self.importers = list(importers.load_all())
 
        self.hooks = [hook(config) for hook in hooks.load_all()]
 
        self.stdout = stdout
 

	
 
    def import_file(self, in_file):
 
    def import_file(self, in_file, in_path=None):
 
        if in_path is None:
 
            in_path = pathlib.Path(in_file.name)
 
        importers = []
 
        for importer in self.importers:
 
            in_file.seek(0)
 
            if importer.can_import(in_file):
 
                try:
 
                    template = self.config.get_template(importer.TEMPLATE_KEY)
 
                except errors.UserInputConfigurationError as error:
 
                    if error.strerror.startswith('template not defined '):
 
                        have_template = False
 
                    else:
 
                        raise
 
                else:
 
                    have_template = not template.is_empty()
 
                if have_template:
 
                    importers.append((importer, template))
 
        if not importers:
 
            raise errors.UserInputFileError("no importers available", in_file.name)
 
        source_vars = {
 
            'source_abspath': in_path.absolute().as_posix(),
 
            'source_name': in_path.name,
 
            'source_path': in_path.as_posix(),
 
        }
 
        with contextlib.ExitStack() as exit_stack:
 
            output_path = self.config.get_output_path()
 
            if output_path is None:
 
                out_file = self.stdout
 
            else:
 
                out_file = exit_stack.enter_context(output_path.open('a'))
 
            for importer, template in importers:
 
                default_date = self.config.get_default_date()
 
                in_file.seek(0)
 
                for entry_data in importer(in_file):
 
                    entry_data['_hook_cancel'] = False
 
                    for hook in self.hooks:
 
                        hook.run(entry_data)
 
                        if entry_data['_hook_cancel']:
 
                            break
 
                    else:
 
                        del entry_data['_hook_cancel']
 
                        print(template.render(**entry_data), file=out_file, end='')
 
                        render_vars = collections.ChainMap(entry_data, source_vars)
 
                        print(template.render(render_vars), file=out_file, end='')
 

	
 
    def import_path(self, in_path):
 
        if in_path is None:
 
            raise errors.UserInputFileError("only seekable files are supported", '<stdin>')
 
        with in_path.open(errors='replace') as in_file:
 
            if not in_file.seekable():
 
                raise errors.UserInputFileError("only seekable files are supported", in_path)
 
            return self.import_file(in_file)
 
            return self.import_file(in_file, in_path)
 

	
 
    def import_paths(self, path_seq):
 
        for in_path in path_seq:
 
            try:
 
                retval = self.import_path(in_path)
 
            except (OSError, errors.UserInputError) as error:
 
                yield in_path, error
 
            else:
 
                yield in_path, retval
 

	
 

	
 
def setup_logger(logger, main_config, stream):
 
    formatter = logging.Formatter('%(name)s: %(levelname)s: %(message)s')
 
    handler = logging.StreamHandler(stream)
 
    handler.setFormatter(formatter)
 
    logger.addHandler(handler)
 

	
 
def main(arglist=None, stdout=sys.stdout, stderr=sys.stderr):
 
    try:
 
        my_config = config.Configuration(arglist)
 
    except errors.UserInputError as error:
 
        my_config.error("{}: {!r}".format(error.strerror, error.user_input))
 
        return 3
 
    setup_logger(logger, my_config, stderr)
 
    importer = FileImporter(my_config, stdout)
 
    failures = 0
 
    for input_path, error in importer.import_paths(my_config.args.input_paths):
 
        if error is None:
 
            logger.info("%s: imported", input_path)
 
        else:
 
            logger.warning("%s: failed to import: %s", input_path or error.path, error)
 
            failures += 1
 
    if failures == 0:
 
        return 0
 
    else:
 
        return min(10 + failures, 99)
 

	
 
if __name__ == '__main__':
 
    exit(main())
tests/data/test_main.ini
Show inline comments
 
[DEFAULT]
 
default_date = 2016/04/04
 
loglevel = critical
 
signed_currencies = USD
 

	
 
[One]
 
template patreon cardfees =
 
 Accrued:Accounts Receivable  -{amount}
 
 Expenses:Fees:Credit Card  {amount}
 
template patreon svcfees =
 
 ;SourcePath: {source_abspath}
 
 ;SourceName: {source_name}
 
 Accrued:Accounts Receivable  -{amount}
 
 Expenses:Fundraising  {amount}
tests/data/test_main_fees_import.ledger
Show inline comments
 
2017/09/01 Patreon
 
  Accrued:Accounts Receivable  $-52.47
 
  Expenses:Fees:Credit Card  $52.47
 

	
 
2017/09/01 Patreon
 
  Accrued:Accounts Receivable  $-61.73
 
  Expenses:Fundraising  $61.73
 

	
 
2017/10/01 Patreon
 
  Accrued:Accounts Receivable  $-99.47
 
  Expenses:Fees:Credit Card  $99.47
 

	
 
2017/09/01 Patreon
 
  ;SourcePath: {source_abspath}
 
  ;SourceName: {source_name}
 
  Accrued:Accounts Receivable  $-61.73
 
  Expenses:Fundraising  $61.73
 

	
 
2017/10/01 Patreon
 
  ;SourcePath: {source_abspath}
 
  ;SourceName: {source_name}
 
  Accrued:Accounts Receivable  $-117.03
 
  Expenses:Fundraising  $117.03
tests/test_main.py
Show inline comments
 
import io
 
import pathlib
 

	
 
import pytest
 
from . import DATA_DIR, normalize_whitespace
 

	
 
from import2ledger import __main__ as i2lmain
 

	
 
ARGLIST = [
 
    '-C', (DATA_DIR / 'test_main.ini').as_posix(),
 
]
 

	
 
def run_main(arglist):
 
    stdout = io.StringIO()
 
    stderr = io.StringIO()
 
    exitcode = i2lmain.main(arglist, stdout, stderr)
 
    stdout.seek(0)
 
    stderr.seek(0)
 
    return exitcode, stdout, stderr
 

	
 
def iter_entries(in_file):
 
    lines = []
 
    for line in in_file:
 
        if line == '\n':
 
            if lines:
 
                yield ''.join(lines)
 
            lines = []
 
        else:
 
            lines.append(line)
 
    if lines:
 
        yield ''.join(lines)
 

	
 
def entries2set(in_file):
 
    return set(normalize_whitespace(e) for e in iter_entries(in_file))
 
def format_entry(entry_s, format_vars):
 
    return normalize_whitespace(entry_s).format_map(format_vars)
 

	
 
def expected_entries(path):
 
def format_entries(source, format_vars=None):
 
    if format_vars is None:
 
        format_vars = {}
 
    return (format_entry(e, format_vars) for e in iter_entries(source))
 

	
 
def expected_entries(path, format_vars=None):
 
    path = pathlib.Path(path)
 
    if not path.is_absolute():
 
        path = DATA_DIR / path
 
    with path.open() as in_file:
 
        return entries2set(in_file)
 
        return list(format_entries(in_file, format_vars))
 

	
 
def path_vars(path):
 
    return {
 
        'source_abspath': str(path),
 
        'source_name': path.name,
 
        'source_path': str(path),
 
    }
 

	
 
def test_fees_import():
 
    source_path = pathlib.Path(DATA_DIR, 'PatreonEarnings.csv')
 
    arglist = ARGLIST + [
 
        '-c', 'One',
 
        pathlib.Path(DATA_DIR, 'PatreonEarnings.csv').as_posix(),
 
        source_path.as_posix(),
 
    ]
 
    exitcode, stdout, _ = run_main(arglist)
 
    assert exitcode == 0
 
    actual = entries2set(stdout)
 
    assert actual == expected_entries('test_main_fees_import.ledger')
 
    actual = list(format_entries(stdout))
 
    expected = expected_entries('test_main_fees_import.ledger', path_vars(source_path))
 
    assert actual == expected
 

	
 
def test_date_range_import():
 
    source_path = pathlib.Path(DATA_DIR, 'PatreonEarnings.csv')
 
    arglist = ARGLIST + [
 
        '-c', 'One',
 
        '--date-range', '2017/10/01-',
 
        pathlib.Path(DATA_DIR, 'PatreonEarnings.csv').as_posix(),
 
        source_path.as_posix(),
 
    ]
 
    exitcode, stdout, _ = run_main(arglist)
 
    assert exitcode == 0
 
    actual = entries2set(stdout)
 
    expected = {entry for entry in expected_entries('test_main_fees_import.ledger')
 
                if entry.startswith('2017/10/')}
 
    actual = list(format_entries(stdout))
 
    valid = expected_entries('test_main_fees_import.ledger', path_vars(source_path))
 
    expected = [entry for entry in valid if entry.startswith('2017/10/')]
 
    assert actual == expected
0 comments (0 inline, 0 general)