About¶

B3LB is based on the Django Python Web framework and is designed to work in large scale-out deployments with 100+ BigBlueButton nodes and high attendee join rates.

Architecture¶

To scale for a huge number of attendees it is possible to:

scale-out the B3LB API frontends
scale-out the B3LB polling workers
scale-out your BBB nodes

Features¶

multiple b3lb frontend instances
backend BBB node polling using Celery
extensive caching based on Redis
robust against high BBB node response times (e.g. due to ongoing DDoS attacks)

BBB Clustering¶

supports a high number of BBB nodes
different load balancing factors per cluster
load calculation by attendees, meetings and CPU load metrics
maintenance mode allows BBB nodes to be disabled gracefully

BBB Frontend API¶

deployed on ASGI with uvicorn
HTTP call-outs are implemented asynchronously using aiohttp
supports API key rollover using a second secret
prebuilt responses for expensive API calls (getMeetings)
limiting attendees or meetings per tenant
does not implement but blocks recording API calls
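The key rollover item can be illustrated with a minimal sketch. It assumes the standard BigBlueButton checksum scheme (SHA-1 over the call name, the query string without the checksum parameter, and the shared secret) and is not taken from the B3LB code base; the function names and the example secrets are hypothetical.

```python
import hashlib
import hmac


def bbb_checksum(call_name: str, query_string: str, secret: str) -> str:
    """SHA-1 checksum over call name + query string + secret (assumed scheme)."""
    data = f"{call_name}{query_string}{secret}".encode()
    return hashlib.sha1(data).hexdigest()


def checksum_is_valid(call_name, query_string, checksum, secrets):
    """Accept the request if the checksum matches any of the tenant's secrets."""
    return any(
        hmac.compare_digest(
            bbb_checksum(call_name, query_string, secret).encode(),
            checksum.encode(),
        )
        for secret in secrets
    )


# Hypothetical tenant with an old and a new secret during a rollover.
tenant_secrets = ("old-secret", "new-secret")
request_checksum = bbb_checksum("getMeetings", "", "new-secret")
print(checksum_is_valid("getMeetings", "", request_checksum, tenant_secrets))
```

While both secrets are configured, clients can be migrated one at a time; removing the old secret afterwards completes the rollover.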

Multitenancy¶

per-tenant API hostnames
start presentation injection
branding logo injection
multiple API secrets per tenant

Monitoring¶

simple health-check URL
simple JSON statistics URL
Prometheus metrics URL

Load Calculation¶

To select a BBB node for new meetings, B3LB calculates a load value for each BBB node. The BBB node with the lowest load value is chosen. The load is based on three metrics:

number of attendees
number of meetings
CPU utilization (base 10,000)

Each of the metrics is important for deciding where to spawn new meetings. The CPU utilization depends on the current load caused by running meetings and also reflects external effects on the BBB nodes. The number of meetings is important since it is an indicator that more attendees may join and cause even more load in the future.

\[
\begin{array}{clc}
\mathbf{\text{Metric}} & \mathbf{\text{Description}} & \mathbf{\text{Origin}} \\
cpu_{15s} & \text{cpu utilization in the last 15s} & \text{node} \\
cpu_{1m} & \text{cpu utilization in the last minute} & \text{node} \\
n_{atn} & \text{number of active attendees} & \text{node} \\
n_{mtg} & \text{number of active meetings} & \text{node} \\
\end{array}
\]

\[
\begin{array}{cclc}
\mathbf{\text{Tunable}} & \mathbf{\text{Default}} & \mathbf{\text{Description}} & \mathbf{\text{Origin}} \\
cpu_{max} & 5000 & \text{target max cpu utilization} & \text{cluster} \\
cpu_{order} & 6 & \text{order of the polynomial} & \text{cluster} \\
f_{atn} & 1 & \text{load factor for a single attendee} & \text{cluster} \\
f_{mtg} & 30 & \text{load factor for a single meeting} & \text{cluster} \\
\end{array}
\]

\[
load_{node} = f_{atn} \cdot n_{atn} + f_{mtg} \cdot n_{mtg} + \frac{cpu_{max}}{cpu_{order}} \cdot \sum_{n=1}^{cpu_{order}} {\left[ \frac{\max\left(cpu_{1m}, cpu_{15s}\right)}{10000} \right]}^{n}
\]
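The formula can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the B3LB implementation: the `NodeMetrics` container, the `node_load` function and the sample nodes are hypothetical, while the default tunables are the ones listed above.

```python
from dataclasses import dataclass


@dataclass
class NodeMetrics:
    """Polled values of one BBB node (hypothetical container)."""
    name: str
    attendees: int
    meetings: int
    cpu_15s: int  # CPU utilization in the last 15 s, base 10000
    cpu_1m: int   # CPU utilization in the last minute, base 10000


def node_load(node, cpu_max=5000, cpu_order=6, f_atn=1, f_mtg=30):
    """Load value of a node: attendee term + meeting term + polynomial CPU term."""
    cpu = max(node.cpu_1m, node.cpu_15s) / 10000
    cpu_term = cpu_max / cpu_order * sum(cpu ** n for n in range(1, cpu_order + 1))
    return f_atn * node.attendees + f_mtg * node.meetings + cpu_term


# Made-up sample data: a busy node and a mostly idle one.
nodes = [
    NodeMetrics("bbb1", attendees=120, meetings=4, cpu_15s=6000, cpu_1m=5500),
    NodeMetrics("bbb2", attendees=10, meetings=1, cpu_15s=4000, cpu_1m=5000),
]

# New meetings go to the node with the lowest load value.
best = min(nodes, key=node_load)
print(best.name, round(node_load(best), 2))
```

With the defaults, a node at full CPU utilization contributes exactly `cpu_max` through the CPU term, because each of the `cpu_order` summands equals 1.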

The CPU utilization enters the load as a polynomial, so its contribution rises slowly while the CPU utilization is low and increasingly steeply as the utilization grows. The following plot shows the load value of a BBB node depending on its CPU utilization (base 10,000) for different attendee and meeting counts.

[Plot: load value of a BBB node versus CPU utilization for different attendee and meeting counts]

Tuning the polynomial order makes the load balancing more or less sensitive to CPU load:

[Plot: load value of a BBB node versus CPU utilization for different polynomial orders]
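To make the effect of the polynomial order concrete, the sketch below evaluates only the CPU term of the load formula for a few orders. The helper name `cpu_term` and the chosen utilization values are assumptions for illustration; the defaults mirror the tunables of the load formula.

```python
def cpu_term(cpu, cpu_max=5000, cpu_order=6):
    """CPU contribution to the node load; cpu is given in base 10000."""
    x = cpu / 10000
    return cpu_max / cpu_order * sum(x ** n for n in range(1, cpu_order + 1))


# Compare a lightly loaded node (25 %) with a saturated one (100 %).
for order in (1, 2, 6):
    low = cpu_term(2500, cpu_order=order)
    high = cpu_term(10000, cpu_order=order)
    print(f"order={order}: 25% CPU -> {low:.1f}, 100% CPU -> {high:.1f}")
```

Every order yields `cpu_max` at full utilization, while a higher order suppresses the CPU contribution at low utilization. Attendee and meeting counts therefore dominate the node selection until a node becomes busy.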

Container Images¶

B3LB provides the Docker images described below on Quay.io and GitHub Packages. The images can be built from source using the provided Dockerfiles.

Hint

It is intentional that there are no b3lb:latest or b3lb-static:latest image tags available. You should always pick an explicit version for your deployment.

Warning

Since Docker has stopped supporting OSS, images for b3lb ≥ 2.2.1 are no longer provided on Docker Hub!

b3lb¶

This image contains the Django files of b3lb to run the ASGI application, Celery tasks and management CLI commands.

Quay.io

docker pull quay.io/ibh/b3lb:2.2.2

GitHub Packages

docker pull docker.pkg.github.com/de-ibh/b3lb/b3lb:2.2.2

b3lb-static¶

Uses the Caddy webserver to provide static assets for the Django admin UI and can be used to publish per-tenant assets.

Quay.io

docker pull quay.io/ibh/b3lb-static:2.2.2

GitHub Packages

docker pull docker.pkg.github.com/de-ibh/b3lb/b3lb-static:2.2.2

b3lb-pypy¶

This image contains the Django files of b3lb and uses PyPy (https://www.pypy.org/) instead of CPython. This boosts the performance of the Celery workers if they need to process a huge number of nodes or attendees.

Quay.io

docker pull quay.io/ibh/b3lb-pypy:2.2.2

GitHub Packages

docker pull docker.pkg.github.com/de-ibh/b3lb/b3lb-pypy:2.2.2

Warning

It is recommended to use b3lb-pypy for the Celery workers only. It is not well tested for any other task and is known to waste memory. You should run it only with cgroup-based memory limits engaged to prevent excessive memory swapping or OOM killing.

b3lb-dev¶

This is the development build of b3lb using Django's single-threaded built-in web server. You should never use this in production.

Quay.io

docker pull ibhde/b3lb-dev:latest

GitHub Packages

docker pull docker.pkg.github.com/de-ibh/b3lb/b3lb-dev:latest