bandersnatch is a PyPI mirror client that implements `PEP 381
<http://www.python.org/dev/peps/pep-0381/>`_.


.. contents::

Build status
============

bandersnatch
    .. image:: https://builds.gocept.com/job/bandersnatch/badge/icon
       :target: https://builds.gocept.com/job/bandersnatch/

Packaging and PIP install
    .. image:: https://builds.gocept.com/job/bandersnatch-packaging-pip/badge/icon
       :target: https://builds.gocept.com/job/bandersnatch-packaging-pip/


Installation
============

The following instructions will place the bandersnatch executable in a
virtualenv under ``bandersnatch/bin/bandersnatch``.

pip
---

This installs the latest stable, released version.

::

  $ virtualenv-2.7 bandersnatch
  $ cd bandersnatch
  $ bin/pip install -r https://bitbucket.org/ctheune/bandersnatch/raw/stable/requirements.txt


zc.buildout
-----------

This installs the current development version. Use ``hg up <version>`` and run
buildout again to choose a specific release.

::

  $ hg clone https://bitbucket.org/ctheune/bandersnatch
  $ cd bandersnatch
  $ virtualenv-2.7 .
  $ bin/python bootstrap.py
  $ bin/buildout


Configuration
=============

* Run ``bandersnatch mirror`` - it will create an empty configuration file
  for you in ``/etc/bandersnatch.conf``.
* Review ``/etc/bandersnatch.conf`` and adapt to your needs.
* Run ``bandersnatch mirror`` again. It will populate your mirror with the
  current state of all PyPI packages, roughly 50 GiB at the time of writing.
* Run ``bandersnatch mirror`` regularly to update your mirror with any
  intermediate changes.
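
Because the initial sync downloads roughly 50 GiB, it can help to check the
free space on the target filesystem before the second run. The following is a
minimal sketch, not part of bandersnatch: the function name
``has_free_space`` and the path ``/srv/pypi`` are assumptions made for
illustration, so substitute the mirror directory from your own configuration::

    # Hypothetical pre-flight helper; not part of bandersnatch.
    # Succeeds if the filesystem holding directory $1 has at least $2 KiB free.
    has_free_space() {
        dir=$1
        required_kib=$2
        # df -P prints one POSIX-format line per filesystem; with -k the
        # fourth column is the available space in 1024-byte blocks.
        available_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
        [ -n "$available_kib" ] && [ "$available_kib" -ge "$required_kib" ]
    }

    # Example (assumed path): require 50 GiB before starting the initial sync.
    # has_free_space /srv/pypi $((50 * 1024 * 1024)) && bandersnatch mirror

The mirror keeps growing as PyPI grows, so treat the 50 GiB figure as a lower
bound and leave headroom.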

Webserver
---------

Configure your webserver to serve the ``web/`` sub-directory of the mirror. For nginx it should look something like this::

    server {
        listen 127.0.0.1:80;
        server_name <mymirrorname>;
        root <path-to-mirror>/web;
        autoindex on;
        charset utf-8;
    }

* It is a good idea to have your webserver publish the HTML index files with
  UTF-8 as the charset. The index pages will work without it, but non-ASCII
  characters will look garbled when humans view the pages.

* Make sure that the webserver uses UTF-8 to look up Unicode path names. nginx
  gets this right by default - not sure about others.


Cron jobs
---------

You need to set up one cron job to run the mirror itself. If you run a public
mirror, then you need a second job that will create access statistics for
aggregation on the master PyPI.

Here's a sample that you could place in ``/etc/cron.d/bandersnatch``::

    LC_ALL=en_US.utf8
    */2 * * * * root bandersnatch mirror 2>&1 | logger -t bandersnatch[mirror]
    12 * * * * root bandersnatch update-stats 2>&1 | logger -t bandersnatch[update-stats]

This assumes that you have a ``logger`` utility installed that will convert the
output of the commands to syslog entries.


Maintenance
===========

bandersnatch does not keep much local state in addition to the mirrored data.
In general you can just keep rerunning ``bandersnatch mirror`` to make it fix
errors.

If you delete the state files, the next run will check everything against the
master PyPI:

* Delete the ``./state`` and ``./todo`` files if they exist in your mirror
  directory.
* Run ``bandersnatch mirror`` to get a full sync.

Be aware that a full sync will likely take hours, depending on PyPI's
performance and on your network latency and bandwidth.
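
Forcing a full sync can be sketched as a small shell helper. This is an
illustration, not something bandersnatch ships: the function name
``reset_mirror_state`` and the path ``/srv/pypi`` are assumptions, and the
helper touches only the two state files named above, leaving ``web/`` alone::

    # Hypothetical helper; not part of bandersnatch.
    # Removes the state files from mirror directory $1 and keeps web/ intact.
    reset_mirror_state() {
        mirror_dir=$1
        # Refuse to continue on an empty or wrong path, so that rm never
        # runs against an unintended location.
        if [ ! -d "$mirror_dir" ]; then
            echo "not a directory: $mirror_dir" >&2
            return 1
        fi
        # -f keeps rm quiet when a file is already absent.
        rm -f "$mirror_dir/state" "$mirror_dir/todo"
    }

    # Example (assumed path): reset the state, then run a full sync.
    # reset_mirror_state /srv/pypi && bandersnatch mirror

If a cron job runs the mirror, consider disabling it while you do this so
that a scheduled run does not start in the middle of the reset.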


Migrating from pep381client
===========================

* remove old status files, but keep actual data (everything under ``web/``)
* create config file, port command parameters from old cronjobs
* update cron jobs


Contact
=======

If you have questions or comments, please submit a bug report to
http://bitbucket.org/ctheune/bandersnatch/issues/new.

Also, I'm reading the `distutils sig
<http://mail.python.org/mailman/listinfo/distutils-sig>`_ mailing list.

Support this project
====================

If you'd like to support my work on PyPI mirrors, please consider a `gittip
<https://www.gittip.com/theuni/>`_. I'm planning to run a couple more
international mirrors if I get enough support.


Kudos
=====

This client is based on the original pep381client by Martin v. Loewis.

Richard Jones was very patient answering questions at PyCon 2013 and made the
protocol more reliable by implementing some PyPI enhancements.
