Alexey Anufriev Posts

Advanced instrumentation with ByteBuddy Agent

Reading Time: 7 minutes

The Aspect-Oriented Programming (AOP) paradigm makes it possible to achieve a great level of Separation of Concerns (SoC) in an application’s codebase. All the components that do not relate to the business logic directly, like security, metrics, logging, etc., can be neatly isolated into auxiliary modules that are wired up with the business logic indirectly, usually during compilation or at runtime.

In the Java world, there are not so many frameworks for AOP. The notable ones are AspectJ and Spring AOP, and both balance complexity against functionality. The process of attaching the extra modules to the business logic is called weaving. In short, AspectJ supports compile-time weaving, which modifies the output class files (a variation, post-compile weaving, is used for 3rd-party classes), and load-time weaving, which intercepts the classloader and modifies classes while they are being defined to the JVM. Spring AOP, by contrast, has a much simpler model: it allows runtime weaving only, and it works just for Spring Beans defined in the Application Context. The weaving is done via proxies generated by Spring.

In addition to these two frameworks, there is one more great library that serves pretty similar objectives: Byte Buddy, developed by Rafael Winterhalter. The library is well known for its highly simplified and convenient facilities for runtime code generation and modification. Byte Buddy makes it easy to create new types, redefine existing types, intercept methods, and much more. On top of that, it can also change the behavior of existing classes. This is done via the Byte Buddy Agent, which operates on the Java Instrumentation API to apply the necessary changes to type definitions. This may sound similar to AspectJ load-time weaving, which in fact it is, but in practice there is a big difference between the two. The main benefit of using Byte Buddy is simplicity: the API is very intuitive, no extra tooling is required, and overall integration does not force any changes in the build process, run configuration, and so on.

Now it is time to leave the theory and apply Byte Buddy in practice. A simple yet important example could be the detection of certain method invocations. In fact, this idea has been inspired by the BlockHound project (a detector of blocking calls from non-blocking reactive threads), developed by Sergei Egorov.
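
To give a taste of the API before diving in, here is a minimal sketch of a self-attaching agent that reports matched method invocations. The package matcher and class names are illustrative assumptions, not the setup built later in the post, and the exact AgentBuilder calls may vary slightly between Byte Buddy versions:

    import net.bytebuddy.agent.ByteBuddyAgent;
    import net.bytebuddy.agent.builder.AgentBuilder;
    import net.bytebuddy.asm.Advice;
    import net.bytebuddy.matcher.ElementMatchers;

    public class InvocationDetector {

        // advice that is woven into every matched method
        public static class LogInvocation {
            @Advice.OnMethodEnter
            public static void enter(@Advice.Origin String method) {
                System.out.println("detected invocation: " + method);
            }
        }

        public static void main(String[] args) {
            // self-attach the instrumentation agent to the running JVM
            ByteBuddyAgent.install();

            new AgentBuilder.Default()
                    // hypothetical matcher: watch all classes of our own package
                    .type(ElementMatchers.nameStartsWith("com.example."))
                    .transform(new AgentBuilder.Transformer.ForAdvice()
                            .advice(ElementMatchers.isMethod(), LogInvocation.class.getName()))
                    .installOnByteBuddyAgent();

            // from this point on, every matched invocation is reported
        }
    }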

Short Tip on Kafka: Quick and Easy Local Development Setup

Reading Time: 3 minutes

When it comes to setting up a local Apache Kafka instance for some experiments or tests, it may turn into an exercise that is not easy to complete.

The problems that arise are usually quite different: connectivity between Kafka and Zookeeper (prior to KIP-500), exposing a listener for incoming connections, or plugging in a friendly UI for handy observability/management of the Kafka instance.

To simplify these preparation steps, Docker Compose can be used. It allows starting all the components, wiring them up in a single network, and exposing the necessary ports to the host machine. But it also requires a bit of knowledge about Kafka internals and configuration properties, so making it from scratch may require some effort.

And here is a small guide on how this can be done faster.
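
As a rough sketch of the moving parts (not the exact file from the guide; image tags, ports, and listener names are assumptions), a single-broker setup could look like this:

    version: "3"
    services:
      zookeeper:
        image: confluentinc/cp-zookeeper:7.4.0
        environment:
          ZOOKEEPER_CLIENT_PORT: 2181

      kafka:
        image: confluentinc/cp-kafka:7.4.0
        depends_on:
          - zookeeper
        ports:
          - "9092:9092"                # listener exposed to the host machine
        environment:
          KAFKA_BROKER_ID: 1
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
          # one listener for the Compose network, another for the host
          KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
          KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1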

Short Tip on Java: throw multiple Exceptions at once

Reading Time: 5 minutes

Exceptions in Java are intended to interrupt the execution flow in unexpected situations. But there might be a need to continue the execution without interruption while still being able to collect all the errors that occurred along the way.

An abstract example of such a flow can be a Backup Manager. To increase reliability, it might need to save backups to multiple targets. Those can be a local file system, an external file system, and some cloud provider.

The whole flow will consist of three steps, one for each type of backup target. But the execution of these steps is unpredictable: an exception may happen at any moment, and the rest of the flow will be interrupted. This is bad, as one broken target may destroy the whole reliability concept.

Of course, Java has mechanisms to prevent interruption. Using a try-catch block for each of the steps will guarantee that all the steps are executed. The only problem is that at the end of the execution there will be too little information about what went wrong.

Exceptions may be collected into a list and processed afterward. But in some cases there is no need for such a sophisticated solution: the only needed outcome may be a single exception that explains the whole execution and what went wrong during it.
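
One standard way to end up with such a single exception (whether or not it is the exact mechanism used in this post) is Throwable.addSuppressed: collect the failures, then attach them all to one summary exception. A minimal sketch with hypothetical backup targets:

    import java.util.ArrayList;
    import java.util.List;

    public class BackupManager {

        // hypothetical backup steps; each one may throw
        private final List<Runnable> targets;

        public BackupManager(List<Runnable> targets) {
            this.targets = targets;
        }

        public void backupAll() {
            List<RuntimeException> errors = new ArrayList<>();
            for (Runnable target : targets) {
                try {
                    target.run();
                } catch (RuntimeException e) {
                    errors.add(e); // remember the failure, keep going
                }
            }
            if (!errors.isEmpty()) {
                RuntimeException summary =
                        new RuntimeException(errors.size() + " backup target(s) failed");
                errors.forEach(summary::addSuppressed);
                throw summary; // its stack trace lists every suppressed cause
            }
        }
    }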

Short Tip on GitHub Actions: workflow_dispatch event input as execution condition

Reading Time: < 1 minute

GitHub Actions supplies a special event for workflows that can be triggered manually from the web interface: workflow_dispatch. By default, it has only one input parameter: the Git branch that serves as the context for the workflow execution. But the set of input parameters can be extended with custom ones.
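
As a sketch of the idea from the title (the input name and the gated step are assumptions, not the post's example), such a custom input can gate a step via an if condition:

    on:
      workflow_dispatch:
        inputs:
          dry-run:
            description: "Skip the actual deployment"
            required: false
            default: "false"

    jobs:
      deploy:
        runs-on: ubuntu-latest
        steps:
          - name: Deploy
            if: github.event.inputs.dry-run != 'true'   # input as execution condition
            run: ./deploy.sh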

How to understand Class.isAssignableFrom() in Java

Reading Time: 4 minutes

Java has quite a few ways to compare types of objects. The most well-known are:

  • operator instanceof;
  • method Class.isInstance(Object obj);
  • method Class.isAssignableFrom(Class<?> cls).

The logic of the first two can be quite easily inferred from their names. But the last one needs a bit of thinking to interpret it completely. So how to understand it once and forever?
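
To set the stage (the post's own mnemonic may differ), a few facts that are easy to verify: X.class.isAssignableFrom(Y.class) asks whether a variable of type X could hold an instance of Y, so the call reads right to left:

    public class AssignableDemo {
        public static void main(String[] args) {
            // "would Number n = someInteger; compile?" -> yes
            System.out.println(Number.class.isAssignableFrom(Integer.class));   // true
            // "would Integer i = someNumber; compile?" -> no
            System.out.println(Integer.class.isAssignableFrom(Number.class));   // false

            // all three checks side by side
            Object value = "text";
            System.out.println(value instanceof CharSequence);                          // true
            System.out.println(CharSequence.class.isInstance(value));                   // true
            System.out.println(CharSequence.class.isAssignableFrom(value.getClass()));  // true
        }
    }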

Safe way to extend /boot partition on Linux

Reading Time: 4 minutes

The good thing about most varieties of Linux is that an upgrade to the next version is a relatively easy task. Of course, there can be problems with software/driver compatibility, but in general these operating systems are designed to support this kind of upgrade. Users of a <whatever>buntu that is at least a couple of years old can migrate to the most recent one. But the newer the version of the OS, the more resources it requires.

This particular post is about disk space, and in particular the /boot partition. When installing something like Ubuntu 14.04 years ago, the installer may have suggested a default partition layout with a /boot partition even below a hundred megabytes in size. And at that time it was just enough. Then came new versions of the OS and software, new kernels, and so on. Upgrades turn out to be not that easy anymore: the /boot partition gets polluted with old kernels and other leftovers and needs to be cleaned again and again to complete an installation or upgrade.

How to stop this repetitive pain? In the case of LVM this could be an easy task, but what if it is just a normal filesystem?
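
For the easy LVM case mentioned above (the volume names here are made up; the non-LVM case is what the rest of the post tackles), growing /boot would be a one-liner, provided the volume group has free extents:

    # grow the logical volume holding /boot by 512 MB and resize its filesystem in one go
    sudo lvextend --resizefs --size +512M /dev/vg0/boot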

Java, let me follow code style

Reading Time: 5 minutes

Developers spend much more time reading code than writing it. For that reason, readability and “clean code” practices are very important in general. A lot of different rules exist to help developers structure their code so that it takes less effort to understand and maintain. On top of that, there are tools created to enforce those rules.

Now to the code. Looking at it from a high-level perspective, it is easy to distinguish the bigger blocks, like classes (not necessarily in the OOP sense; by “class” any top-level grouping may be meant) and their members. And the order of these members within a class is very important.

There are a couple of popular recommendations for that.

The first one comes from the code style guidelines published by big tech players like Google or Oracle, and says that it is important to order class members by their level of exposure. This order fits well for code written in the form of a library:

  • public members – the class API, the most important part; it must catch the consumer’s eye as soon as possible, so it stands in the first position;
  • protected members – the internal API, still important as it can be overridden, thus it takes the second position;
  • private members – pure internals, which can be placed in the very last position.

An alternative recommendation, suggested by Robert C. Martin, can be formulated much more simply: the members must be ordered like chapters in a fiction book, following the execution-flow “story” and explaining the code without a need to jump up and down the text. This style fits well for typical services that define pure business logic, a set of operations required to execute some user scenario.

For the second approach there is not much that can be automatically enforced, whereas for the first one it is the opposite. But even though the idea is simple, sometimes it is just impossible to overcome the language limitations and entirely follow the “public” -> “protected” -> “private” order.
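
One example of such a limitation (a guess at the kind of case meant here; the names are made up): a public constant cannot be initialized from a private one declared below it, so the exposure order has to be broken:

    public class Limits {

        // public API first, as the style demands...
        public static final int MAX_RETRIES = DEFAULT_RETRIES * 2; // compile error: illegal forward reference

        // ...because the private constant it depends on is only declared later
        private static final int DEFAULT_RETRIES = 5;
    }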

Split Git Repository

Reading Time: 10 minutes

“organizations which design systems … are constrained to produce designs
which are copies of the communication structures of these organizations.”
M. Conway

A mono-repository in software development is a very popular way of organizing source code and the collaboration around it. It has some pros, like easy refactoring or dependency management, but it also has some cons, like a very high level of coupling between components (of course, these statements are debatable, but that is not the point of the current post). Some IT giants, like Google, Twitter or Facebook, are still using a mono-repository, but it costs them quite a lot: just look at the new build systems like Bazel or Buck, which were invented to minimize the effort required to manage a huge pile of code.

At the same time, there is an alternative approach: building big products while having the projects distributed across multiple repositories. One of the benefits here is loose coupling, which leads to very easy scaling.

In practice, not many projects start out already split into modules and stored separately. In most cases there is a single repository that grows until some point in time when the decision to split is made, and by that moment a lot of work has already been done. If the previous history is not relevant and can be neglected, the split is quite a simple task: move the modules to a new location and tune CI accordingly. But if there is a need to preserve the change history and keep it relevant to the content of each new module, it becomes a non-trivial task, yet (spoiler!) one that can still be performed quite fast.

Here is an abstract project with two logical modules: user-related and guest-related. Both are represented by four directories inside the repository. The ideal plan would be to separate these modules into two new repositories, each keeping only the history relevant to its own content.

So, how to do this?
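
As a rough sketch of one possible mechanism (the post may use a different one; the directory and repository names are made up), git-filter-repo can carve a module out together with its history:

    # extract the user-related module into its own repository, history included
    git clone original-repo user-repo
    cd user-repo
    git filter-repo --path user-service --path user-api   # keep only these directories and their history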

Debug with Git

Reading Time: 11 minutes

Testing shows the presence, not the absence of bugs.
Dijkstra

Software regression is a very nasty situation in the development process. It usually means that the last delivery contains something breaking. To overcome the situation, the whole release must be analyzed. A developer has to write tests, roll back the changes, run the tests, and … the bug is still there; one more step back in the VCS history, and the error is still reproducible. And now this bug has just got an additional label: “legacy”.

Actually, it turns out that this functionality has not been used for a while, so the bug could have been introduced not with the last commit or two but quite some time ago. If the codebase is big enough, it may take a significant amount of time to find the exact change that introduced the bug.

In practice, there is a way to automate this search. Below is an example of this operation within a Git repository.
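
The tool in question is presumably git bisect, which binary-searches the history between a known-good and a known-bad commit; a sketch, with the refs and the test script as placeholders:

    git bisect start
    git bisect bad HEAD          # the current state is broken
    git bisect good v1.0         # the last release known to work
    git bisect run ./test.sh     # let git execute the check on each candidate commit
    git bisect reset             # return to the original HEAD once the culprit is found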

HTTP Verbs in REST. POST vs PUT.

Reading Time: 7 minutes

REST is a very flexible and, at the same time, powerful methodology used for software architecture design. But this flexibility leaves more space for design mistakes. Unlike SOAP, which can transfer data via multiple protocols, REST interaction is done entirely via the HTTP protocol. It uses HTTP verbs (or methods) to express the intention of operations.

The most commonly used verbs in the REST world are:

  • GET
  • POST
  • PUT
  • DELETE

In many cases, developers have to map those verbs onto CRUD operations. Honestly, this mapping is quite a difficult task in itself and, moreover, has no single definition from the REST side. The first and the last verbs are pretty obvious and self-explanatory: data retrieval and data removal. But one very popular problem is related to the remaining two, POST and PUT. Very often there is a decision problem: which operation to use for data insertion, and which for data update.
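
A common rule of thumb (the post's final recommendation may be more nuanced) hinges on idempotency and on who picks the resource identifier:

    POST /users        creates a new resource, the server assigns the id; repeating the request creates duplicates
    PUT  /users/42     creates or fully replaces the resource at a client-known id; repeating it yields the same state (idempotent)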
