Creating a truly distributed microservice architecture with Celery and Docker Swarm — Part 1
This will be a series of 3 blog posts where we’ll use Celery to create workers as independent microservices and also discuss some of the implementation difficulties and their solutions. We’ll deploy this stack using Docker Swarm.
- We’ll be using a microservice architecture. If you are new to the concept, please read about it before you proceed.
- If you are new to Celery, you can read about it here.
- The examples here are in Python, but the concepts apply to any language of your choice.
- This blog also assumes that you are familiar with Docker and Docker Swarm basics.
Let’s dive right in!
You can find the code for all the examples on this GitHub repo.
Basic project structure:
The basic project structure was borrowed from this post. In this series of blog posts, we’ll focus more on the problems we face in the implementation and deployment.
In this post, we’ll create a simple calculator using Celery, where every operation will be an independent microservice.
Our project folder structure would look like this:
First, we create a package called celery_tasks which will store the task definitions, celery and other configs. We’ll import this in all other microservices.
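To give a rough idea of the kind of configuration celery_tasks might hold, here is a minimal sketch of a Celery settings module that sends each calculator operation to its own queue, so that each worker microservice can consume only its own queue. The file name, task names, and queue names are assumptions made for illustration and are not taken from the repo; `task_routes` is Celery's standard routing setting.

```python
# celery_tasks/celeryconfig.py  (hypothetical file name)
# Celery settings modules are plain Python, so this runs without Celery installed.

# Hypothetical operation names for the calculator.
OPERATIONS = ("add", "subtract", "multiply", "divide")


def queue_for(operation: str) -> str:
    """Return the (assumed) queue name for one calculator operation."""
    return f"{operation}_queue"


# Celery's `task_routes` setting maps a task name to routing options.
# The dotted task names below are illustrative assumptions.
task_routes = {
    f"celery_tasks.tasks.{op}": {"queue": queue_for(op)} for op in OPERATIONS
}
```

With routes like these, the worker for one operation could be started with Celery's `-Q` option (for example `-Q add_queue`) so that it only picks up messages meant for it.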
If you are using PyCharm and the imports are happening at the project root level:
but if you want to do it at a subfolder level, you can add or remove content roots.
The tasks (classes) we declared in celery_tasks will be implemented in the respective microservice.
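The declare-in-one-place, implement-in-another split can be sketched as follows. To keep the sketch runnable without Celery installed, `BaseTask` is a plain stand-in for `celery.Task`; the class names, task name, and file paths in the comments are hypothetical and not taken from the repo.

```python
class BaseTask:
    """Stand-in for celery.Task (an assumption, so the sketch has no dependencies)."""

    name = None

    def run(self, *args, **kwargs):
        raise NotImplementedError(f"{type(self).__name__} has no implementation")


# --- shared package, e.g. celery_tasks/tasks.py (hypothetical path) ---
class AddTask(BaseTask):
    """Declaration only: fixes the task name that producers and workers agree on."""

    name = "celery_tasks.tasks.add"


# --- inside the add microservice, e.g. add/worker.py (hypothetical path) ---
class AddTaskImpl(AddTask):
    """Implementation that lives only in the microservice that runs it."""

    def run(self, a, b):
        return a + b
```

The producer only needs the declaration (the task name) to send a message, while the logic ships only in the image of the worker that executes it. In a real Celery app, the implementing class would subclass `celery.Task` and be registered with the app, for example through `app.register_task`.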
The Docker image will be built at the parent folder level (the ‘calculator’ folder in our case) so that we can copy the contents of the celery_tasks folder. Hence the Dockerfile specifies file paths relative to the calculator folder. We’ll set the build context in the docker-compose file later.
We’ll create a producer microservice within the same stack. This will expose a Flask API for us to create celery tasks.
For production scenarios, you can use the following approaches:
- Open the RabbitMQ port for external use, so that a producer in another stack can write messages to the RabbitMQ instance inside the consumer stack.
- You can have an API microservice inside the consumer stack which gets requests from the producer and puts messages into RabbitMQ (you might need to worry about the bottleneck created by this API microservice).
- You can have an externally hosted RabbitMQ (a fully managed service or part of a different stack) and configure it in both the consumer and producer stacks.
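Whichever approach you choose, the producer and the consumers must end up pointing at the same broker. One way to keep that configurable is to build the broker URL from environment variables. The sketch below assumes hypothetical variable names (`RABBITMQ_HOST`, `RABBITMQ_PORT`, `RABBITMQ_USER`, `RABBITMQ_PASSWORD`); use whatever names your own compose file defines.

```python
import os
from urllib.parse import quote


def broker_url_from_env(env=os.environ) -> str:
    """Build an AMQP broker URL from environment variables.

    The variable names are assumptions for this sketch, not taken from the repo.
    """
    host = env.get("RABBITMQ_HOST", "localhost")
    port = env.get("RABBITMQ_PORT", "5672")
    # Percent-encode credentials so characters like '@' or '/' do not break the URL.
    user = quote(env.get("RABBITMQ_USER", "guest"), safe="")
    password = quote(env.get("RABBITMQ_PASSWORD", "guest"), safe="")
    # The trailing '//' selects RabbitMQ's default '/' virtual host.
    return f"amqp://{user}:{password}@{host}:{port}//"
```

Both stacks can then set the same host and credentials in their environment, and the resulting string can be used as Celery's `broker_url` setting.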
Flower is a dashboard for Celery, which we’ll use to monitor our workers and tasks. The latest version of Flower is not always compatible with the latest version of Celery, so it is safer to pin the image to a specific version. (For production scenarios, you should also pin the version of every requirement in the requirements.txt file.)
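If you want to catch unpinned requirements before building an image, a small check like the one below can help. It is a simplified sketch: it only treats `name==version` lines as pinned and does not handle every form that pip accepts (URLs, environment markers, and so on). The function name is hypothetical.

```python
import re

# A requirement counts as pinned here only if it looks like "name==version"
# (extras such as "name[extra]==version" are allowed).
PINNED = re.compile(r"^[A-Za-z0-9._\[\],-]+==[^=\s]+")


def unpinned_requirements(text: str) -> list:
    """Return the requirement lines in `text` that are not pinned with '=='."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

You could run it on the contents of each microservice's requirements.txt and fail the build when the returned list is not empty.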
So our docker-compose.yml file will look like this:
Now, we’ll build the local images and pull the required images from Docker Hub. Make sure your Docker daemon is running before you run any of the following commands, and run them from the folder that contains the docker-compose.yml file.
➜ calculator git:(dev) ✗ docker-compose build
➜ calculator git:(dev) ✗ docker pull rabbitmq:management
➜ calculator git:(dev) ✗ docker pull mher/flower:0.9.5
Now we’ll deploy the Swarm stack.
➜ calculator git:(dev) ✗ docker stack deploy --compose-file=docker-compose.yml calculator
The output should look like this:
Now if you visit http://localhost:5555 in your browser, you should see something like this:
and if you visit http://localhost:15672, you’ll be asked to enter the username and password which we set in our docker-compose file as environment variables. After logging in, you should see something like this:
Now, we’ll create 10k tasks using the API we created: http://0.0.0.0:5000/create_tasks/10000
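If you would rather trigger this from a script than from the browser, a minimal sketch using only the standard library could look like this. It assumes the endpoint accepts a plain GET request, which the URL above suggests but which you should confirm against your own producer code; the helper names are hypothetical.

```python
from urllib.request import urlopen

BASE_URL = "http://0.0.0.0:5000"  # producer API address used in this post


def create_tasks_url(count: int, base_url: str = BASE_URL) -> str:
    """Build the URL that asks the producer to create `count` tasks."""
    if count < 1:
        raise ValueError("count must be a positive integer")
    return f"{base_url}/create_tasks/{count}"


def create_tasks(count: int, base_url: str = BASE_URL, timeout: float = 30.0):
    """Call the producer API (assumed to accept GET) and return (status, body)."""
    with urlopen(create_tasks_url(count, base_url), timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

Calling `create_tasks(10000)` while the stack is running is then equivalent to opening the URL above.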
If we look at the Flower dashboard, it should start showing the live status of the tasks being processed.
If you go to the RabbitMQ dashboard, you should see stats appearing there as well.
If you play around with the Flower dashboard, you can see details of all the tasks along with their results. If some tasks failed, you can also see their error logs there.
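Flower also exposes a REST API, so the same information can be pulled from a script. The sketch below counts tasks per state from the `/api/tasks` endpoint. It assumes Flower is reachable at http://localhost:5555 without authentication and that the response is a JSON object keyed by task id with a `state` field per task; check this against the Flower version you pinned. The helper names are hypothetical.

```python
import json
from collections import Counter
from urllib.request import urlopen

FLOWER_URL = "http://localhost:5555"  # Flower address used in this post


def count_states(tasks: dict) -> Counter:
    """Count tasks per state in a mapping of task id -> task info."""
    return Counter(info.get("state", "UNKNOWN") for info in tasks.values())


def fetch_tasks(base_url: str = FLOWER_URL, limit: int = 1000, timeout: float = 10.0) -> dict:
    """Fetch task info from Flower's REST API (assumes no auth is configured)."""
    with urlopen(f"{base_url}/api/tasks?limit={limit}", timeout=timeout) as response:
        return json.load(response)
```

With the stack running, `count_states(fetch_tasks())` gives a quick summary of how many tasks succeeded or failed.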
Note: Some of the tabs in Flower may not work, so you should just use it as a dev/internal dashboard.
This completes our basic Calculator stack. In the next post, we’ll modify the same stack to add advanced features. Make sure to remove the stack after you are done experimenting with it.
➜ calculator git:(dev) ✗ docker stack rm calculator
In the next post, we’ll cover more advanced topics: Celery task retries, task chains, configuring a result backend, RabbitMQ with a bind mount, how Celery works in solo, multiprocessing, and multithreading modes, problems with MongoDB in multiprocessing mode, and long-running tasks and their associated problems with RabbitMQ.