===========
Job Workers
===========

The actual processing of the jobs in a queue is handled by a separate
component, known as a job worker. This component usually runs in its own
thread and provides its own main loop.
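As a rough illustration (hypothetical names, not the m01.remote
implementation), such a worker boils down to a loop on its own thread that
pulls and runs one job at a time:

```python
import queue
import threading

class TinyWorker(object):
    """Minimal sketch of a job worker with its own thread and main loop."""

    def __init__(self, jobs, waitTime=0.01):
        self.jobs = jobs              # a queue.Queue of callables
        self.waitTime = waitTime      # poll interval when the queue is empty
        self.running = False
        self.thread = None

    def __call__(self):
        # the main loop: pull and execute one job at a time until stopped
        while self.running:
            try:
                job = self.jobs.get(timeout=self.waitTime)
            except queue.Empty:
                continue
            job()

    def start(self):
        self.running = True
        self.thread = threading.Thread(target=self)
        self.thread.start()

    def stop(self):
        self.running = False
        self.thread.join()
```

The start/stop pair mirrors the ``startProcessor``/``stopProcessor`` calls
used throughout this document.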

  >>> import time
  >>> import transaction

The ``worker`` module provides a job worker which executes one job at a
time. Another worker schedules new job items based on the scheduler item
settings. Let's create the necessary components to test the job worker:

1. Create the remote processor:

  >>> from m01.remote import testing
  >>> rp = root
  >>> rp.isProcessing
  False

  >>> rp.isScheduling
  False

2. Register a job that simply sleeps and writes a message:

  >>> data = {'retryDelay': 1}
  >>> sleepJob = testing.SleepJob(data)
  >>> rp.addJobFactory(u'sleep', sleepJob)


SimpleJobWorker
---------------

This worker executes one job at a time. It was designed for jobs that would
take a long time and use up most of the processing power of a computer.

Let's first register a few jobs:

  >>> jobid1 = rp.addJob(u'sleep', (0.04, 1))
  >>> time.sleep(0.2)
  >>> jobid2 = rp.addJob(u'sleep', (0.1,  2))
  >>> time.sleep(0.2)
  >>> jobid3 = rp.addJob(u'sleep', (0,    3))
  >>> time.sleep(0.2)
  >>> jobid4 = rp.addJob(u'sleep', (0.08, 4))
  >>> time.sleep(0.2)
  >>> transaction.commit()

Now let's check whether we can access the jobs:

  >>> job = rp._jobs.get(jobid1)
  >>> job
  <SleepJob u'...' ...>

And let's verify that the jobs are ready for processing:

  >>> rp.getJobStatus(jobid1)
  u'queued'

  >>> rp.getJobStatus(jobid2)
  u'queued'

  >>> rp.getJobStatus(jobid3)
  u'queued'

  >>> rp.getJobStatus(jobid4)
  u'queued'

Let's start by executing a job directly. The first argument to the simple
worker constructor is the remote processor instance. All other arguments are
optional and can be defined as worker arguments in the RemoteProcessor class;
see ``jobWorkerArguments`` and ``schedulerWorkerArguments``:

  >>> from m01.remote.worker import SimpleJobWorker
  >>> worker = SimpleJobWorker(rp, waitTime=0.0)

Let's now process the first job. We clear the log and we also have to end any
existing interactions in order to process the job in this thread:

  >>> log_info.clear()

  >>> from zope.security import management
  >>> management.endInteraction()

  >>> worker.doProcessNextJob()
  True

  >>> print log_info
  m01.remote INFO
    Job: 1

Let's now use the worker from within the remote processor. Since the worker
constructors also accept additional arguments, they are specified as well:

  >>> rp.jobWorkerFactory = SimpleJobWorker
  >>> rp.jobWorkerFactory
  <class 'm01.remote.worker.SimpleJobWorker'>

  >>> rp.jobWorkerArguments
  {'waitTime': 0.0}

The wait time has been set to zero for testing purposes only; it defaults to
1 second. Let's now start processing jobs, wait a little while for all the
jobs to complete and then stop processing again:

  >>> rp.startProcessor()
  >>> transaction.commit()

  >>> time.sleep(0.5)

  >>> rp.stopProcessor()
  >>> transaction.commit()

  >>> time.sleep(0.5)

The log shows that all jobs have been processed and, more importantly, that
they were completed in the order they were defined. Note that the first job
was processed before we started the remote processor. This means a remote
processor can process jobs even when it is not started; starting it only
means that jobs get processed automatically instead of manually.

  >>> print log_info
  m01.remote INFO
    Job: 1
  m01.remote INFO
    Processor 'root-worker' started
  m01.remote INFO
    Job: 2
  m01.remote INFO
    Job: 3
  m01.remote INFO
    Job: 4
  m01.remote INFO
    Processor 'root-worker' stopped

  >>> log_info.clear()


Transactions in jobs
--------------------

With the SimpleJobWorker, jobs *should* not change the transaction status:
both the administration of the jobs by the RemoteProcessor and the job itself
run in the same transaction, so aborting it from inside the job could mess up
the administrative part.
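The failure mode can be sketched in plain Python (hypothetical names, nothing
ZODB-specific): when claiming the job and running it share one transaction,
aborting from inside the job also rolls back the claim, so the job stays
queued and would be picked up again forever:

```python
class FakeTransaction(object):
    """Toy transaction: collects writes and can throw them all away."""

    def __init__(self):
        self.writes = []

    def write(self, record):
        self.writes.append(record)

    def abort(self):
        self.writes = []   # roll back everything, including the claim


def process_next(jobs, txn):
    """Pull the next job *inside* the transaction, then run it."""
    job = jobs[0]
    txn.write(('processing', job['name']))   # the claim is part of the txn
    try:
        job['run']()
    except Exception:
        txn.abort()      # rolls back the claim too -- the job stays queued
        return False
    jobs.pop(0)          # only a successful job leaves the queue
    return True
```

A job that aborts (modelled here as raising) never leaves the queue, which is
exactly the infinite loop this regression test guards against.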

This is a regression test that aborting the transaction inside the job does not
lead to an infinite loop (because SimpleJobWorker pulls the job inside the
transaction, so if it is aborted, the job remains on the queue):

  >>> testing.testCounter
  0

  >>> counter = 0
  >>> data = {'counter': counter}
  >>> abortJob = testing.TransactionAbortJob(data)
  >>> rp.addJobFactory(u'abortJob', abortJob)
  >>> jobid = rp.addJob(u'abortJob', (1))
  >>> time.sleep(0.5)
  >>> jobid = rp.addJob(u'abortJob', (2))
  >>> transaction.commit()

  >>> rp.startProcessor()
  >>> transaction.commit()
  >>> time.sleep(0.5)

  >>> rp.stopProcessor()
  >>> transaction.commit()
  >>> time.sleep(0.5)

  >>> transaction.abort() # prevent spurious conflict errors
  >>> testing.testCounter
  2

  >>> print log_info
  m01.remote INFO
    Processor 'root-worker' started
  m01.remote INFO
    Job: 1
  m01.remote INFO
    Job: 2
  m01.remote INFO
    Processor 'root-worker' stopped

Reset the test counter:

  >>> testing.testCounter = 0


MultiJobWorker
--------------

The multi-threaded job worker executes several jobs at once. It was designed
for jobs that would take a long time but use very little processing power.
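Conceptually this is a thread pool; a stdlib sketch (illustrative only, not
the m01.remote internals) shows the behavior demonstrated below: jobs are
pulled in queue order but finish by duration, so completion order differs
from submission order:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def sleep_job(seconds, name, done):
    """Stand-in for a long-running, low-CPU job."""
    time.sleep(seconds)
    done.append(name)          # list.append is atomic under the GIL

done = []
with ThreadPoolExecutor(max_workers=2) as pool:   # think: maxThreads=2
    for seconds, name in [(0.6, 1), (0.1, 2), (0.2, 3)]:
        pool.submit(sleep_job, seconds, name, done)

# jobs 1 and 2 start together; 2 finishes first, freeing a thread for 3;
# job 1, the longest, finishes last
```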

Let's add a few new jobs to execute:

  >>> jobid1 = rp.addJob(u'sleep', (0.04, 1))
  >>> time.sleep(0.2)
  >>> jobid2 = rp.addJob(u'sleep', (1.0,  2))
  >>> time.sleep(0.2)
  >>> jobid3 = rp.addJob(u'sleep', (0,    3))
  >>> time.sleep(0.2)
  >>> jobid4 = rp.addJob(u'sleep', (0.2,  4))
  >>> time.sleep(0.2)
  >>> transaction.commit()

Before testing the worker in the remote processor, let's have a look at each
method by itself. So we instantiate the worker:

  >>> from m01.remote.worker import MultiJobWorker
  >>> worker = MultiJobWorker(rp, waitTime=0, maxThreads=2)

The maximum number of threads can be set as well:

  >>> worker.maxThreads
  2

All worker threads can be reviewed at any time:

  >>> worker.threads
  []

  >>> from zope.security import management
  >>> management.endInteraction()

Let's pull a new job:

  >>> job = worker.doPullNextJob()
  >>> job
  <SleepJob u'...' ...>

We need to pull a job before executing it, so that the database marks the job
as processing and no other thread picks up the same job. As you can see, the
job gets marked with the processing status:

  >>> job.status
  u'processing'
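The claim has to be atomic so two threads can never pull the same job. A
minimal sketch of that pattern (a lock guarding the status flip; hypothetical
names, whereas the real worker relies on the database for this):

```python
import threading

class TinyJobQueue(object):
    """Sketch of an atomic 'pull next queued job' operation."""

    def __init__(self, jobs):
        self._jobs = jobs              # list of dicts with a 'status' key
        self._lock = threading.Lock()

    def pull_next_job(self):
        # flip the status before releasing the lock, so no other
        # thread can claim the same job
        with self._lock:
            for job in self._jobs:
                if job['status'] == 'queued':
                    job['status'] = 'processing'
                    return job
        return None
```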

Once we have pulled a particular job, we can process it:

  >>> log_info.clear()
  >>> print log_info

  >>> worker.doProcessJob(job.__name__)

  >>> print log_info
  m01.remote INFO
    Job: 1

Let's now have a look at using the worker from within the remote processor.
This primarily means setting the worker factory:

  >>> management.newInteraction()

  >>> rp.jobWorkerFactory = MultiJobWorker
  >>> rp.jobWorkerArguments = {'waitTime': 1.0, 'maxThreads': 2}
  >>> transaction.commit()

  >>> log_info.clear()

Let's now process the remaining jobs:

  >>> rp.startProcessor()
  >>> transaction.commit()
  >>> time.sleep(1.5)

  >>> rp.stopProcessor()
  >>> transaction.commit()
  >>> time.sleep(0.5)

As you can see, this time the jobs are no longer completed in order, because
each of them takes a different amount of time to execute:

  >>> print log_info
  m01.remote INFO
    Processor 'root-worker' started
  m01.remote INFO
    MultiJobWorker: processing job ...
  m01.remote INFO
    MultiJobWorker: processing job ...
  m01.remote INFO
    Job: 3
  m01.remote INFO
    MultiJobWorker: processing job ...
  m01.remote INFO
    Job: 4
  m01.remote INFO
    Job: 2
  m01.remote INFO
    Processor 'root-worker' stopped

Let's now set the thread limit to four and construct a new set of jobs to
demonstrate that all four jobs run at the same time:

  >>> rp.jobWorkerArguments = {'waitTime': 0.0, 'maxThreads': 4}

  >>> jobid1 = rp.addJob(u'sleep', (0.3, 1))
  >>> time.sleep(0.2)
  >>> jobid2 = rp.addJob(u'sleep', (0.4, 2))
  >>> time.sleep(0.2)
  >>> jobid3 = rp.addJob(u'sleep', (0.1, 3))
  >>> time.sleep(0.2)
  >>> jobid4 = rp.addJob(u'sleep', (0.5, 4))
  >>> time.sleep(0.2)
  >>> transaction.commit()

If all jobs are processed at once, job 3 should be done first. Note also that
job 4 gets picked up right away, even before the worker logs that it is
processing:

  >>> log_info.clear()

  >>> rp.startProcessor()
  >>> transaction.commit()

  >>> time.sleep(1.0)

  >>> rp.stopProcessor()
  >>> transaction.commit()
  >>> time.sleep(0.5)

  >>> print log_info
  m01.remote INFO
    Processor 'root-worker' started
  m01.remote INFO
    MultiJobWorker: processing job ...
  m01.remote INFO
    MultiJobWorker: processing job ...
  m01.remote INFO
    MultiJobWorker: processing job ...
  m01.remote INFO
    MultiJobWorker: processing job ...
  m01.remote INFO
    Job: 3
  m01.remote INFO
    Job: 1
  m01.remote INFO
    Job: 2
  m01.remote INFO
    Job: 4
  m01.remote INFO
    Processor 'root-worker' stopped

Let's now set the thread limit to two and construct a new set of jobs to
demonstrate that no more than two threads run at the same time:

  >>> rp.jobWorkerArguments = {'waitTime': 0.0, 'maxThreads': 2}
  >>> transaction.commit()

  >>> jobid1 = rp.addJob(u'sleep', (0.3, 1))
  >>> time.sleep(0.2)
  >>> jobid2 = rp.addJob(u'sleep', (0.4, 2))
  >>> time.sleep(0.2)
  >>> jobid3 = rp.addJob(u'sleep', (0.2, 3))
  >>> time.sleep(0.2)
  >>> jobid4 = rp.addJob(u'sleep', (0.5, 4))
  >>> time.sleep(0.2)
  >>> transaction.commit()

If all jobs were processed at once, job 3 would be done first; but since it
has to wait for an available thread, it comes in third. We can now run the
jobs and check the result:

  >>> log_info.clear()

  >>> rp.startProcessor()
  >>> transaction.commit()

  >>> time.sleep(1.5)

  >>> rp.stopProcessor()
  >>> transaction.commit()
  >>> time.sleep(0.5)

  >>> print log_info
  m01.remote INFO
    Processor 'root-worker' started
  m01.remote INFO
    MultiJobWorker: processing job ...
  m01.remote INFO
    MultiJobWorker: processing job ...
  m01.remote INFO
    Job: 1
  m01.remote INFO
    MultiJobWorker: processing job ...
  m01.remote INFO
    Job: 2
  m01.remote INFO
    MultiJobWorker: processing job ...
  m01.remote INFO
    Job: 3
  m01.remote INFO
    Job: 4
  m01.remote INFO
    Processor 'root-worker' stopped