Building a Queue in Mendix
A few months ago I posted a new module in the Mendix App Store, called the Queue module: https://appstore.home.mendix.com/link/app/106628/WebFlight/Queue. In this blog I will take you through the development process and elaborate on the design trade-offs that resulted in a Queue that is 7.4 times faster than the Process Queue (see https://testqueueperforman.mxapps.io).
So no high level talk, let’s take a deep dive!
If you’re not a fan of technical stuff, please stop reading right around this point.
Why would you use a Queue?
A queue basically refers to a list of jobs waiting to be executed. Normally, if we execute an action in Mendix by calling a microflow, the action is executed immediately and we wait for a response (not to be confused with asynchronous front-end action execution).
Using a queue, we separate the creation of a job from its actual execution, similar to the execute async functionality in the Mendix Core Java API, which uses an internal queue system in the Mendix runtime (more on this internal queue later in this article).
This brings us a number of nice advantages:
– The system does not need to allocate resources instantly.
– The queue allows for individual control of jobs instead of a single long-running process that runs them all.
– We can run jobs in parallel, which is called concurrent execution. This can lead to performance improvements when executing a large number of jobs.
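To make the separation of job creation and job execution concrete, here is a minimal sketch (not the module's actual code) in which jobs are first enqueued and only later drained and executed by a separate worker thread:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class JobQueueSketch {

    // Poll with a short timeout so the worker stops once the queue is empty.
    static Runnable pollJob(BlockingQueue<Runnable> queue) {
        try {
            return queue.poll(100, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            return null;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Creating a job only enqueues it; nothing is executed yet.
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        queue.offer(() -> System.out.println("Job 1 executed"));
        queue.offer(() -> System.out.println("Job 2 executed"));

        // A separate worker thread drains the queue and executes the jobs.
        Thread worker = new Thread(() -> {
            Runnable job;
            while ((job = pollJob(queue)) != null) {
                job.run();
            }
        });
        worker.start();
        worker.join();
    }
}
```

The point of the sketch is the decoupling: the producer returns immediately after `offer`, while execution happens later on a different thread.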
But we have the Process Queue already?
Most of you will know the existing Process Queue (credits to Jaap Pulleman! https://appstore.home.mendix.com/link/app/393/Mendix/Process-Queue) that we have often used in projects. It does a good job and has lots of incremental development incorporated. Nevertheless, after a while I noticed some things that triggered me:
– Queue controls (stop or initialize queue) are not working properly and cause the queue to get stuck sometimes.
– Removing large amounts of jobs can take a while due to removing associated entities.
– Sometimes the queue gets stuck for no particular reason.
– Running new microflows in the queue is quite tedious. I have to create a new Process on startup that is persisted in the database and connects to a shared pool.
– There is no retry mechanism in case errors occur, although retries are common behaviour in many existing products (e.g. the AWS SDK: https://docs.aws.amazon.com/general/latest/gr/api-retries.html)
– Errors always force a rollback; there is no way to throw an error without rolling back.
I looked into the options of modifying the Process Queue, but that would mean a lot of work with no guarantee of backwards compatibility. After agreeing on this with Jaap, I decided to build a Queue myself, which also enabled me to learn how this works in Java and what challenges you face during the process.
Before I started developing, I wrote down a number of requirements the new Queue module should meet in addition to the points that were lacking in the current Process Queue:
– Simple to install and use
– Lightweight and fast, no additional functionalities that are not required for its primary functioning
– Unit tested logic
– Limit database interaction as much as possible
– Simple front-end administration snippets using default Atlas themes
The process of designing a Queue…
First some terminology…
Mendix’ runtime runs on the Java Virtual Machine, so when implementing a custom module we cannot continue without covering some technical terminology. The most important term in this context is the “thread”. Citing Wikipedia: “a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system.” We need different threads to run different jobs in parallel.
Each job or line of code that is executed on a computer runs within a thread, a piece of context that has some memory allocated and a dedicated program counter. Threads in Java correspond to native threads that are managed by an operating system like Microsoft Windows or Linux. If we start a simple Mendix application in idle state, the Java Virtual Machine has around 40 threads with different purposes. See the following Mendix documentation for additional details on memory usage: https://docs.mendix.com/refguide/java-memory-usage-with-mendix.
Managing threads is quite complex and creating new ones requires system resources. Therefore, Java offers thread pools designed to manage a thread lifecycle and share threads between different parts of the application (similar to connection pooling).
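To illustrate thread reuse, here is a small sketch using Java's standard `ExecutorService`: two threads are created once and then shared by ten tasks, instead of spawning a new thread per task.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadPoolSketch {
    public static void main(String[] args) throws InterruptedException {
        // A pool of 2 threads; both are reused for all ten tasks.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < 10; i++) {
            pool.submit(completed::incrementAndGet);
        }

        pool.shutdown();                              // stop accepting new tasks
        pool.awaitTermination(5, TimeUnit.SECONDS);   // wait for running tasks
        System.out.println("Completed: " + completed.get()); // Completed: 10
    }
}
```

This is exactly the pattern the Queue module builds on: pay the thread-creation cost once, then reuse.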
Persistent or not?
The point where I started developing was the domain model of the Queue module. Is the Job the most important entity here? Do we need Job entities for a working queue? In fact, the answer is no. A Java Queue (https://docs.oracle.com/javase/tutorial/collections/implementations/queue.html) lives in memory and can contain jobs without exposing them to the Mendix model or persisting the jobs in a database. The internal action queue in Mendix is non-persistent as far as I know. That means jobs in the queue are not persisted. When the application is restarted, the queue is empty and there is no way to reconstruct the state of the queue before it was restarted.
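A purely in-memory queue can be sketched in a few lines with a standard Java `Queue` implementation; everything lives on the heap, so a restart wipes it:

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class InMemoryQueueSketch {
    public static void main(String[] args) {
        // The queue lives purely on the heap: nothing is written to disk.
        Queue<String> jobs = new ArrayDeque<>();
        jobs.offer("SendEmail");      // hypothetical job names
        jobs.offer("GenerateReport");

        // FIFO order: jobs come out in the order they were added.
        System.out.println(jobs.poll()); // SendEmail
        System.out.println(jobs.poll()); // GenerateReport

        // After a (simulated) restart, the heap is gone and so is the queue.
        System.out.println(jobs.isEmpty()); // true
    }
}
```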
We can look into the Java code of the Mendix runtime and see how Mendix handles this. The class com.mendix.basis.actionmanagement.ActionManagerBase contains the heart of Action Management. The following lines of code give the answer:
```java
private final ScheduledThreadPoolExecutor scheduledPool = new ScheduledThreadPoolExecutor(10);

protected final ThreadPoolExecutor pool;

// ...

this.pool = new ThreadPoolExecutor(
        10,
        mxRuntimeConfiguration.asyncActionMaxPoolSize(),
        60L,
        TimeUnit.SECONDS,
        new ArrayBlockingQueue<Runnable>(mxRuntimeConfiguration.asyncActionMaxQueueSize()));
```
This means that by default Mendix has two thread pools, each with 10 core threads, that execute jobs or actions: one for asynchronous actions and one for scheduled actions.
In contrast to the internal Mendix queues, we would like to visualize our Jobs in the front-end and persist the status of the Queue. When the application restarts, we can continue execution and store the execution logs in a user-friendly way. Due to this requirement we need to create a persistent entity: the Job!
All other information regarding the queue (Queue name, number of threads, status etc.) can be exposed to the Mendix model using non-persistent entities. I can’t come up with any reason to store the queue information in the database, so we simply make that entity non-persistent.
Which Queue implementation satisfies our needs?
Java provides a default thread pool implementation, the ThreadPoolExecutor, which consists of a queue with jobs and a thread pool to execute those jobs. Like Mendix, I chose the ScheduledThreadPoolExecutor (https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ScheduledThreadPoolExecutor.html), because it allows us to schedule jobs with a predefined delay, which can be used to rerun jobs after execution fails.
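A retry mechanism built on top of scheduled delays could look like the following sketch (this is an illustration of the idea, not the module's actual code; `runWithRetry` is a hypothetical helper). A failed job is rescheduled with a doubled delay until it succeeds or runs out of retries:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class RetrySketch {
    static final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(2);

    // Reschedule the job with a growing delay until it succeeds or retries run out.
    static void runWithRetry(Runnable job, int retriesLeft, long delayMillis) {
        executor.schedule(() -> {
            try {
                job.run();
            } catch (RuntimeException e) {
                if (retriesLeft > 0) {
                    // Exponential backoff: double the delay after every failure.
                    runWithRetry(job, retriesLeft - 1, delayMillis * 2);
                }
            }
        }, delayMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger attempts = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(1);

        // A job that fails twice and succeeds on the third attempt.
        runWithRetry(() -> {
            if (attempts.incrementAndGet() < 3) {
                throw new RuntimeException("transient failure");
            }
            done.countDown();
        }, 5, 10);

        done.await();
        System.out.println("Succeeded after " + attempts.get() + " attempts");
        executor.shutdown();
    }
}
```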
It should be possible for a user to create multiple thread pools. For that reason, we keep track of the ScheduledThreadPoolExecutors by storing them in a ConcurrentHashMap. Now we can look up the different Queues by name.
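A registry of named queues can be sketched as follows (queue names and the `getOrCreate` helper are illustrative, not the module's actual API). `computeIfAbsent` guarantees that concurrent callers asking for the same name get the same pool:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledThreadPoolExecutor;

public class QueueRepositorySketch {
    // One registry maps queue names to their thread pools.
    private static final ConcurrentHashMap<String, ScheduledThreadPoolExecutor> queues =
            new ConcurrentHashMap<>();

    // Create the pool only if no queue with this name exists yet.
    static ScheduledThreadPoolExecutor getOrCreate(String name, int threads) {
        return queues.computeIfAbsent(name, n -> new ScheduledThreadPoolExecutor(threads));
    }

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor orders = getOrCreate("Orders", 2);
        ScheduledThreadPoolExecutor emails = getOrCreate("Emails", 4);

        // Looking up an existing name returns the same pool instance.
        System.out.println(getOrCreate("Orders", 2) == orders); // true
        System.out.println(orders == emails);                   // false

        orders.shutdown();
        emails.shutdown();
    }
}
```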
A difficulty faced when implementing the Queue is that we would like to expose a Java feature (thread pools) to the Mendix model. To make it work properly, Jobs and Queues in the Mendix model should be in sync with the queue and thread pool in memory. Therefore, we should be careful with the controls that are exposed to users.
The idea behind a Java thread pool is that it remains active during the lifetime of an application to allow re-usage of threads by different parts of the application. If you shutdown a thread pool, all threads will be terminated and you have to initialize a new thread pool and add the jobs again to the queue.
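The consequence of a shutdown is easy to demonstrate: once `shutdown()` has been called, a standard Java pool rejects every new task (`trySubmit` below is a hypothetical helper for the demonstration):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

public class ShutdownSketch {
    // Returns true if the pool still accepts the task, false once it is shut down.
    static boolean trySubmit(ExecutorService pool, Runnable task) {
        try {
            pool.submit(task);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        System.out.println(trySubmit(pool, () -> {})); // true

        pool.shutdown(); // terminates the pool: no new jobs are accepted
        System.out.println(trySubmit(pool, () -> {})); // false
    }
}
```

This is why the module avoids exposing shutdown-style controls casually: a shut-down pool cannot simply be resumed, it has to be replaced.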
To initialize a thread pool in Mendix in a proper way, we need to use the after startup microflow. A startup microflow is executed on all nodes (https://docs.mendix.com/refguide/clustered-mendix-runtime) in case of a multi-node setup. For that reason we cannot offer an Initialize Queue button in the front-end: a button click would initialize the thread pool on a single node only, and jobs could then not be added to the Queue on the other nodes.
Cancellation and deletion of jobs also needs attention, because the jobs need to stay in sync with the in-memory queue. It is not possible to cancel or delete jobs in every stage of execution, so only the relevant buttons are shown in the UI depending on the job status. A Before Delete event ensures that queued jobs are removed from the in-memory queue when the job object is removed from the database.
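Keeping the in-memory queue in sync on deletion boils down to holding on to the `ScheduledFuture` that `schedule` returns and cancelling it when the Job object is deleted. A sketch of that mechanism (illustrative, not the module's actual code):

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CancelSketch {
    public static void main(String[] args) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        // Remove cancelled tasks from the internal queue immediately.
        executor.setRemoveOnCancelPolicy(true);

        // Keep the ScheduledFuture so the job can be cancelled later,
        // e.g. from a Before Delete event on the Job entity.
        ScheduledFuture<?> future =
                executor.schedule(() -> System.out.println("job ran"), 1, TimeUnit.HOURS);

        System.out.println(executor.getQueue().size()); // 1
        boolean cancelled = future.cancel(false);       // false = don't interrupt if running
        System.out.println(cancelled);                  // true
        System.out.println(executor.getQueue().size()); // 0

        executor.shutdown();
    }
}
```

Note `cancel(false)`: a job that is already running is left alone, which matches the observation that jobs cannot be cancelled in every stage of execution.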
Unit tests, unit tests, unit tests… FIRST please
As promised, business logic is unit-tested with 100% coverage. In addition, tests should be written according to the F.I.R.S.T. principles: fast, isolated, repeatable, self-validating and timely.
To get code ready for unit testing, a number of changes were applied.
The Mendix Java API provides static methods on the Core class. When running unit tests, you don’t want the actual Core methods (changes in attribute values or commits) to be called. The easiest way to fix this is to wrap the Core methods in another class that is injected via a constructor or method. Other libraries, such as PowerMockito, can mock static methods directly, but bring additional dependencies on bytecode manipulators.
Another aspect we have to control is the instantiation of objects. We would like to mock objects using the Mockito library, which is only possible if they are injected into the method under test. Object instantiation is therefore moved to so-called factory and repository classes.
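The wrapping idea can be sketched as follows. `CoreProxy` and `JobRunner` are hypothetical names for illustration; the production implementation would delegate to the real static Core methods, while a test injects a fake (or a Mockito mock) through the constructor:

```java
// Hypothetical wrapper around static Core calls so they can be replaced in tests.
interface CoreProxy {
    void commit(Object context, Object object);
}

// A production implementation would delegate to the real static Core methods, e.g.:
// public void commit(Object context, Object object) { Core.commit(context, object); }

// The class under test receives the proxy through its constructor.
class JobRunner {
    private final CoreProxy core;

    JobRunner(CoreProxy core) {
        this.core = core;
    }

    void finishJob(Object context, Object job) {
        // ...business logic...
        core.commit(context, job); // persisted via the injected proxy
    }
}

public class TestableCoreSketch {
    public static void main(String[] args) {
        // In a unit test, a fake implementation counts the commits instead
        // of touching the database.
        final int[] commits = {0};
        CoreProxy fake = (context, object) -> commits[0]++;

        new JobRunner(fake).finishJob(new Object(), new Object());
        System.out.println("Commits: " + commits[0]); // Commits: 1
    }
}
```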
While the 63 unit tests help to check existing functionality when releasing new versions, they also help to create clean code. Mockito, for instance, verifies that a commit is only performed once if desired. Tests fail if they are performed more often.
It took a few months of development during the late hours of the evenings, plus the 4 hours every week that I ended up spending on my personal development plan. The result is a fast, lightweight and reliable queue that can be used in projects by the community. We have replaced the Process Queue in a number of applications where we experienced issues and, so far, stability has been outstanding. No issues reported whatsoever.
And I almost forgot to mention a little boon for any developer: if you make a typo in the microflow name in a Job object, the logger will show the microflow name that is most similar to the one you typed!