Introducing Lisk Codec
Serializing is the process of encoding information into bytes. It may not sound like much fun, but it is instrumental in blockchain. The topic of this blog post is a bit more technical than others; however, we will try to explain its main components in a clear and easy-to-understand way.
This blog post covers the reason why serialization is needed in the first place, and how it has been implemented up to now in the Lisk project. This is followed by an introduction to Lisk codec and the improvements it will bring to the SDK developer experience. We will conclude by showcasing some of the benefits, coupled with how it will enhance the Lisk network.
- Binary message: A sequence of bytes. A serialization method outputs a binary message.
- Encoding, serializing: Converting a message into a binary message.
- Decoding, deserializing: Converting a binary message into a message.
In the diagram below, the left-hand side shows a message corresponding to a balance transfer transaction, and the right-hand side shows the corresponding binary message. To improve readability, the public keys and signature have been shortened and punctuation has been added to the binary message.
Serialization Goals and Uses
The Lisk blockchain uses several objects, such as transactions, blocks, and accounts. In most non-blockchain projects, the manner in which objects are converted to bytes by your device is not actually critical. The messages can be displayed, stored, copied, and modified without causing any unwanted issues. However, in blockchain projects, it is essential that the same object is converted into the same byte sequence every time and by all users. Otherwise, properties such as signature or ID (obtained by hashing the binary message), would be invalid or change. If you sign a transaction on your device and this same transaction is then serialized differently by someone else, they would reject your signature and consider your transaction invalid.
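To make the problem concrete, here is a minimal sketch in Python. It is not Lisk code: the message fields are hypothetical, and JSON with SHA-256 merely stands in for "some serializer" and "some ID function". It shows two devices holding the same object but deriving different IDs because their serializers emit the keys in a different order, and how agreeing on a canonical byte sequence removes the mismatch.

```python
import hashlib
import json

# Hypothetical transfer message; the field names are illustrative only.
tx_on_device_a = {"amount": 100, "recipient": "abc", "nonce": 5}
# Same content, but the keys happen to be stored in a different order.
tx_on_device_b = {"nonce": 5, "recipient": "abc", "amount": 100}


def naive_id(message: dict) -> str:
    """Hash whatever bytes the serializer happens to produce."""
    return hashlib.sha256(json.dumps(message).encode("utf-8")).hexdigest()


def canonical_id(message: dict) -> str:
    """Hash a canonical form: sorted keys and no optional whitespace."""
    canonical = json.dumps(message, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


same_object = tx_on_device_a == tx_on_device_b
naive_match = naive_id(tx_on_device_a) == naive_id(tx_on_device_b)
canonical_match = canonical_id(tx_on_device_a) == canonical_id(tx_on_device_b)

print(same_object, naive_match, canonical_match)
```

The two dictionaries compare equal as objects, yet only the canonical form yields matching IDs. A signature computed over the naive bytes on one device would not verify on the other.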
Besides this critical property, an efficient serialization method can benefit other parts of the ecosystem. One of the problems that blockchains are facing is the growing storage and network requirements. In this regard, transmitting, receiving, and storing fewer bytes makes the whole application lighter and allows for faster synchronization. This in turn highlights the advantages of using a serialization method which generates small binaries.
Finally, to improve the decentralization of Lisk, it is important that the Lisk protocol can be implemented in various programming languages, and that different teams can work on creating tools for the ecosystem. Hence, the serialization method used in the Lisk protocol must be well defined, and implementation agnostic.
Shortcomings of the Current Solution
Up to version 4.0.0, the Lisk tool kit did not include a serialization method which satisfied all the above requirements. Binary messages were only used for signing, and not for communication and storage. As a result, the serialization method was not optimized for speed and size.
Another pain point in the current architecture is the serialization of custom transactions. To serialize mainnet transactions, Lisk implemented a getBytes function. For custom transactions, the application developers have to specify a custom transaction asset, for which they have the two serialization alternatives shown below:
- Serialize the asset as a JSON string.
- Write a custom assetToBytes function.
The first option is fast to implement; however, it is not efficient in terms of size, and different node versions could have slightly different implementations. The second option is time-consuming and could lead to errors. In addition, there is no guarantee that the custom assetToBytes function is efficient with regard to size and encoding speed.
In a unified ecosystem, different projects should use similar encodings for transmitting objects and for serializing them. If this burden is left on the shoulders of the developer, it is likely that the ecosystem will be filled with all kinds of unorthodox binaries. In the worst-case scenario, this could lead to unsafe behavior and maybe even loss of funds.
Another advantage of Lisk codec is its ability to support the Lisk transition to a key-value store. The binary sequences can directly be stored and only need to be decoded when used by the application. Storing and transmitting binary messages directly enables the network to run smoother and avoid delays due to unnecessary encoding and decoding. Finally, nodes can answer API requests by sending the byte value corresponding to the requested key directly.
Lisk codec is based on Protocol Buffers (protobuf), and has been tailored to fit the needs of a blockchain project. Protobuf is a well-known serialization mechanism, meaning that the Lisk codec behavior for encoding and decoding can be reproduced by most protobuf implementations. However, protobuf serialization is not always deterministic, and as such it is not directly suited for our use case.
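To give a feel for what a protobuf-style binary message looks like, here is a minimal Python sketch of the standard protobuf wire format restricted to two kinds of values: unsigned integers (varints) and byte strings (length-delimited). It is an illustration, not the Lisk codec implementation. The function names are our own, and always writing fields in ascending field-number order is shown here as one simple way to make the output deterministic. The exact rules Lisk codec applies are those of LIP 0027.

```python
WIRE_VARINT = 0            # protobuf wire type for variable-length integers
WIRE_LENGTH_DELIMITED = 2  # protobuf wire type for bytes and strings


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low-order group first."""
    if value < 0:
        raise ValueError("this sketch only handles unsigned integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_key(field_number: int, wire_type: int) -> bytes:
    """Each field is prefixed by (field_number << 3) | wire_type as a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_message(fields) -> bytes:
    """Encode (field_number, value) pairs, always in ascending field order."""
    out = bytearray()
    for field_number, value in sorted(fields, key=lambda pair: pair[0]):
        if isinstance(value, int):
            out += encode_key(field_number, WIRE_VARINT)
            out += encode_varint(value)
        else:
            raw = value.encode("utf-8") if isinstance(value, str) else value
            out += encode_key(field_number, WIRE_LENGTH_DELIMITED)
            out += encode_varint(len(raw)) + raw
    return bytes(out)


# The caller's field order does not matter: both calls give the same bytes.
first = encode_message([(2, b"\x01\x02"), (1, 150)])
second = encode_message([(1, 150), (2, b"\x01\x02")])
print(first.hex())  # 08960112020102
```

Reading the output byte by byte: `08` is the key of field 1 (varint), `96 01` is the value 150, `12` is the key of field 2 (length-delimited), `02` is the length, and `01 02` is the payload.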
The example below illustrates serializing an empty block using Lisk codec:
The use of Lisk codec will improve the experience of developing custom applications.
All Lisk transactions now have a set of common baseTransaction properties (such as fee, nonce and sender public key), and a set of transaction-specific properties defined in the asset.
Custom transactions are similarly defined by a JSON schema specifying their asset. The serialization and validation with respect to this schema happen in the background and do not require any additional developer input. This saves development time and minimizes the risk of introducing errors. The SDK will also include tools to validate the syntax of your custom asset schemas.
The example below shows the asset schema of the balance transfer transaction:
Alternatively, a custom transaction to announce the return of a bike and its new position is displayed below:
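As a rough illustration of what such a schema and a syntax check could look like, here is a Python sketch. Everything in it is an assumption made for the example: the property names (`bikeId`, `latitude`, `longitude`), the `dataType` and `fieldNumber` keywords, the list of accepted data types, and the `schema_errors` helper. It is not the SDK tooling, and LIP 0027 remains the authoritative reference for the schema format.

```python
# Hypothetical asset schema for a "return bike" transaction.
RETURN_BIKE_ASSET_SCHEMA = {
    "type": "object",
    "properties": {
        "bikeId": {"dataType": "string", "fieldNumber": 1},
        "latitude": {"dataType": "string", "fieldNumber": 2},
        "longitude": {"dataType": "string", "fieldNumber": 3},
    },
    "required": ["bikeId", "latitude", "longitude"],
}

# Assumed set of scalar data types; check LIP 0027 for the real list.
ASSUMED_DATA_TYPES = {
    "uint32", "uint64", "sint32", "sint64", "boolean", "string", "bytes",
}


def schema_errors(schema: dict) -> list:
    """Return a list of human-readable problems found in an asset schema."""
    errors = []
    properties = schema.get("properties", {})
    seen_field_numbers = {}
    for name, spec in properties.items():
        if spec.get("dataType") not in ASSUMED_DATA_TYPES:
            errors.append(f"{name}: unknown dataType {spec.get('dataType')!r}")
        number = spec.get("fieldNumber")
        if not isinstance(number, int) or number < 1:
            errors.append(f"{name}: fieldNumber must be a positive integer")
        elif number in seen_field_numbers:
            errors.append(
                f"{name}: fieldNumber {number} already used by "
                f"{seen_field_numbers[number]}"
            )
        else:
            seen_field_numbers[number] = name
    for name in schema.get("required", []):
        if name not in properties:
            errors.append(f"required property {name!r} is not defined")
    return errors


# A deliberately broken schema: two properties share field number 1.
BROKEN_SCHEMA = {
    "type": "object",
    "properties": {
        "bikeId": {"dataType": "string", "fieldNumber": 1},
        "latitude": {"dataType": "string", "fieldNumber": 1},
    },
    "required": ["bikeId", "longitude"],
}

print(schema_errors(RETURN_BIKE_ASSET_SCHEMA))
print(schema_errors(BROKEN_SCHEMA))
```

Unique field numbers matter because the field number is what identifies a property in the binary message. Two properties sharing one would be indistinguishable after encoding.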
Performance of Lisk Codec
A benchmark of the performance of Lisk codec is available in the Lisk codec repository. Encoding and decoding of regular transactions can be done at a rate of 200,000 transactions per second. Full blocks can be encoded and decoded at a rate of 50,000 blocks per second.
As mentioned above, using Lisk codec is also beneficial in terms of size for communicating between nodes and storing the blockchain. Currently, objects are saved and exchanged in JSON format. A regular balance transfer (no data field, no multisignature) is roughly 500 bytes. If encoded with Lisk codec the same object would be 150 bytes. This is a 70% reduction! Other objects would see a similar reduction in size, which is great news for the Lisk network and will help it scale in the future.
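The reduction follows directly from the two approximate sizes quoted above. The short Python calculation below reproduces it and, as an illustrative extrapolation of our own, shows what the difference adds up to over one million such transfers.

```python
json_size_bytes = 500   # approximate size of a regular balance transfer in JSON
codec_size_bytes = 150  # approximate size of the same transfer with Lisk codec

saved_per_transaction = json_size_bytes - codec_size_bytes
reduction = saved_per_transaction / json_size_bytes

# Illustrative extrapolation: bytes saved over one million transfers.
saved_per_million = saved_per_transaction * 1_000_000

print(f"{reduction:.0%} smaller")
print(f"{saved_per_million / 1_000_000:.0f} MB saved per million transfers")
```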
To summarize, introducing Lisk codec makes it possible to streamline the whole Lisk application, reduce network communications, and lower the global size of the blockchain. You can find a precise specification of Lisk codec in LIP 0027, while schemas used for transactions, blocks, and accounts can be found in LIP 0028, LIP 0029, and LIP 0030, respectively.
For further questions and comments on the topic, we will host an AMA on Lisk.chat with Maxime Gagnebin (Research Scientist) this Friday, August the 21st at 4 pm CEST. We also invite all community members to go to the Lisk Research forum where we are always happy to hear your feedback and to participate in discussions on this and other topics.
Lisk is on a mission to enable you to create decentralized, efficient, and transparent blockchain applications. Join us: