Monitoring celery – your distributed task queue

Hey. I recently noticed I needed some kind of celery monitoring. Here’s a summary of what I learned while searching for a solution.

TL;DR: Sign up at healthchecks.io, add checks and integrations, then add a single celery task that pings the check every minute. Configure the Period and have it send an email or a Slack message when the check is not invoked. Also, this post is not sponsored in any way.

Here’s the situation. I’m running a web app that offloads work to celery tasks. After receiving content via an API call, the web app uses the task engine to run code that can take a lot of time, like language detection, label scanning, calculating similarity, indexing into a search engine, adding synonyms, adding emoji names, etc. Then, after some time, I noticed that tasks were no longer being picked up even though the worker processes were still running. Simple process monitoring would not have caught this, because the processes themselves never stopped.

Monitoring your website is easier: just invoke an endpoint and test the results. Tools like visualping.io are really good at that. Celery tasks are different as they’re running in the background and typically not directly available via the internet.

Here’s what I do, in a simple list:

    • Sign up at healthchecks.io and add a check via Add Check. Make sure you configure the expected period and grace period, and add the integrations you need.
    • Copy the newly created URL and store it somewhere in the project’s settings.py.
    • Add a task that invokes the endpoint for your service. An example implementation is below.
    • Use a celerybeat schedule that runs the task at a fixed interval, for example every minute.
    • Wait.
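
For the second step, here is a minimal sketch of what the settings.py entry could look like. It assumes the URL is supplied through an environment variable; the variable name HEALTHCHECKS_URI and the helper load_healthchecks_uri are my own choices for illustration, not something healthchecks.io or Django prescribes. How the value ends up on the app config that the task reads is up to your project.

```python
import os


def load_healthchecks_uri(environ=os.environ):
    """Return the ping URL, or an empty string when none is configured.

    HEALTHCHECKS_URI is a hypothetical environment variable name.
    """
    return environ.get('HEALTHCHECKS_URI', '').strip()


# In settings.py: an empty value means "no healthcheck configured".
HEALTHCHECKS_URI = load_healthchecks_uri()
```

Falling back to an empty string keeps development machines quiet: a task that checks for a falsy URL can log a warning and return without pinging anything.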

Now here’s some example code which you should modify to your needs. It has been in use for some time now, but I wouldn’t say it’s battle tested.

from celery.utils.log import get_task_logger
from django.apps import apps
from requests import get, Timeout, status_codes

from brownpapersession.celery import app as celery_app, CeleryMailingTask


log = get_task_logger(__name__)


@celery_app.task(name='integration.send_healthcheck', ignore_result=True, base=CeleryMailingTask)
def send_healthcheck(**kwargs):
  # The ping URL lives on the app config; log and skip when it is not set.
  config = apps.get_app_config('integration')
  if not config.HEALTHCHECKS_URI:
    log.warning('service=healthchecks, reason=configuration')
    return
  try:
    # A plain GET on the check URL counts as a ping.
    response = get(
      config.HEALTHCHECKS_URI,
      timeout=3.3,
      headers={
        'User-Agent': 'brwnppr.com/0.6',
      },
    )
  except Timeout as e:
    # Timeout covers both connect and read timeouts.
    log.warning('service=healthchecks, uri={uri}, detail={detail}, reason=timeout'.format(
      uri=config.HEALTHCHECKS_URI,
      detail=e,
    ))
    # Re-raise so the task is recorded as failed.
    raise AssertionError(e)
  if status_codes.codes.OK == response.status_code:
    log.debug('service=healthchecks, uri={uri}, http.status={status}'.format(
      uri=config.HEALTHCHECKS_URI,
      status=response.status_code,
    ))
  else:
    # Any non-200 answer is worth a warning, but does not fail the task.
    log.warning('service=healthchecks, uri={uri}, http.status={status}'.format(
      uri=config.HEALTHCHECKS_URI,
      status=response.status_code,
    ))
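
The task only helps if celerybeat actually schedules it. Below is a rough sketch of a schedule entry for settings.py. It assumes the celery app loads its configuration from Django settings with the CELERY namespace, which may not match your setup; the entry name and the 30 second expiry are my own choices.

```python
from datetime import timedelta

# Assumption: the celery app reads Django settings with namespace='CELERY',
# so this setting maps to celery's beat_schedule option.
CELERY_BEAT_SCHEDULE = {
    # The entry name is arbitrary; it only has to be unique.
    'send-healthcheck-every-minute': {
        # Must match the name= passed to the task decorator.
        'task': 'integration.send_healthcheck',
        'schedule': timedelta(minutes=1),
        # Drop pings that sat in the queue for more than 30 seconds, so a
        # stalled worker does not send a burst of stale pings on recovery.
        'options': {'expires': 30},
    },
}
```

With a one minute schedule, a Period of a few minutes plus a short grace period on the healthchecks.io side leaves room for an occasional slow or skipped run before an alert goes out.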

Now what’s the beauty in this? Well, this actually notifies you when something is wrong and it will not bother you when everything is running properly. Integrating is lightweight and easy. On top of that the healthchecks website shows you when the last trigger was received. Take these integrations one step further and add a webhook which resolves the issue or displays a message on your website.
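
To give an idea of the "displays a message on your website" part, here is a small sketch of the logic behind such a webhook. It assumes you configure the webhook integration yourself to send a JSON body with a name and a status field, where status is either "down" or "up"; that body shape, the function name and the in-memory site_status dict are all hypothetical and only there for illustration.

```python
import json

# Hypothetical in-memory flag; a real site would persist this somewhere.
site_status = {'celery_healthy': True, 'banner': ''}


def handle_healthcheck_webhook(raw_body, state=site_status):
    """Update the site banner from a webhook body.

    Assumed body shape: {"name": "<check name>", "status": "down" or "up"}.
    """
    payload = json.loads(raw_body)
    is_down = payload.get('status') == 'down'
    state['celery_healthy'] = not is_down
    if is_down:
        state['banner'] = 'Background processing is delayed ({name}).'.format(
            name=payload.get('name', 'unknown check'),
        )
    else:
        # Clear the banner once the check reports up again.
        state['banner'] = ''
    return state
```

A view in the web app would call this function with the request body and render the banner whenever it is not empty.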

Wondering what CeleryMailingTask is? It will be the subject of a future post.

Really happy to take suggestions or improvements via the comments.

Fixing NonResponsiveChannel on Broadsoft XSI

It has been a while since I last wrote on my experiences implementing code that consumes Broadsoft events. I’ve been reworking the code, upgrading it to version 19 of the spec, and I noticed an improvement since version 18. That’s what this post is about.

Creating a Windows 7 demonstration environment using an Intel Core i5 NUC

A different topic for this blog: infrastructure. I’ve written a couple of blog posts about consuming Broadsoft XSI events on C# and I’ve had a couple of developers asking me to demonstrate what I’ve built. I actually showed the software on my development desktop but that didn’t feel right, so I looked for options and found plenty. I decided to go for a self-hosted environment using an Intel Core i5 NUC running Debian 8.

Using a plain javascript array with AngularStrap’s Selects

The web application I’m currently developing is using https://angularjs.org/ and http://getbootstrap.com/ because, well, I’m an angularjs fan and I suck at layout. Guess the latter is pretty obvious from the theme the site is running right now.

weasyprint and glyphicons font on Windows 7: couldn’t load font “Glyphicons Halflings Not-Rotated 21px”

I recently developed tooling using python 2.7. It runs on Windows 7 (using portable python) and Linux, will do something and produce both HTML and PDF output. I used Twitter Bootstrap to format the HTML page and I created a PDF using that HTML page. Should be easy, as weasyprint is pretty mature and actually works on Windows, but pango was emitting “couldn’t load font” messages. Simply installing the font was impossible, as it required administrator access to the operating system.

First steps on building PDF files from a django app

I’ve been building a web application that operates on information extracted from a storage-engine used by GnuCash. GnuCash 2.6 comes with python-bindings, allowing a developer to rapidly build applications working on data maintained by the application.

Building a Broadsoft XSI event consumer on .NET

I wrote about the specifics I encountered with the Broadsoft XSI API before, in this post. I didn’t mention the software I used to support the development of the application. This post will be about what I did to consume data produced by the XSI environment.

Broadsoft XSI event consumer on .NET

I’ve been developing a Windows application that consumes Broadsoft XSI events. Broadsoft has a “comprehensive range of VoIP Applications in a Single Platform” and XSI is a way to consume data produced by their platform, allowing integration of a modern switchboard service with other types of applications.