June 8, 2023

Unlocking Ethereum: Our Journey in Decoding Transactions

Written By Eugene Tian

Introduction

At Coherent, our mission has always been clear: transforming complex on-chain data into human-meaningful, actionable insights. At first, we followed a time-tested approach to manage and transform Ethereum data, which many in the data world would recognize - the Extract, Transform, Load (ETL) process. The ETL process involved extracting data from the Ethereum blockchain, transforming it using an Ethereum ABI (Application Binary Interface) library, and then loading it into our chosen data warehouse, Snowflake. While this worked, we realized that the approach was too slow and inefficient. It would not allow us to keep up with Ethereum’s steady stream of blocks and transactions, so we would not be able to provide real-time decoded data.

We needed a faster, more efficient approach, and we found it in dbt (data build tool). Using dbt, we moved from ETL to ELT - Extract, Load, Transform. Now, our data could be loaded into Snowflake more quickly and transformed right where it was stored!

But our new, efficient approach came with a caveat: we had to leave behind the Ethereum ABI library. This was because Snowflake UDFs (User-Defined Functions) supported only a limited list of libraries, and unfortunately the Ethereum ABI library wasn't one of them.

We decided to create our own custom Python EVM (Ethereum Virtual Machine) decoder, one that would work within the constraints of Snowflake UDFs and yet be capable of handling the complexity of Ethereum data. In essence, we were about to create the world's most sophisticated UDF, entirely from scratch, right here at Coherent. This was our new challenge, and this blog post is the story of how we turned this challenge into an opportunity.

Unlocking the Ethereum Puzzle: Transactions, ABIs, and Their Symbiotic Relationship

In the world of Ethereum, an open-source platform renowned for facilitating 'smart contracts', transactions and contract ABIs share a unique symbiotic relationship.

Ethereum transactions are a fundamental unit of interaction on the platform. Each transaction contains an input field, a critical piece of data that communicates the parameters being passed into a contract function at the time of the transaction. It's a set of instructions detailing the specific actions the contract needs to perform.

However, the data within the input field is encoded. It remains an unintelligible string of characters without the necessary tool to interpret it - much like a coded message waiting to be deciphered.

Here is an example of what an input field may look like:

This string actually translates to:

Makes total sense, right? Of course not! The input is not supposed to make any sense without the contract’s ABI. ABI stands for Application Binary Interface, and ABIs are essential for understanding and interacting with contracts on Ethereum. An ABI provides a detailed layout of a contract, including its functions and the data types of their parameters. It therefore serves as the 'key' for decoding the input data, translating the encoded parameters back into their original, understandable form.

Just as every lock requires a specific key, every contract requires its own ABI to decipher the input field data. Understanding this connection is crucial when interacting with Ethereum and with other EVM-compatible networks, such as Optimism, Polygon, and the new Base chain built by Coinbase.

At Coherent, we have built an in-house Ethereum contracts table and a contract service. This system manages and stores the ABIs of the contracts we interact with. It also polls for new contracts as they are added to Etherscan. With each new contract added to the blockchain, our service updates the table with the new contract's ABI. This live catalog ensures that our transaction decoding capabilities remain current and accurate and that our decoded percentage remains the best in the industry.

In a nutshell, understanding the intricate relationship between Ethereum transactions and ABIs, and extending that understanding to the wider EVM ecosystem, was crucial for our journey towards developing a custom decoder. This comprehension allowed us to innovate within our data environment while remaining aligned with the principles of Ethereum and EVM-based chains. Now let's dig into how EVM decoding works!

A Step-by-Step Guide to Decoding EVM Inputs

Fixed Types

One of the fascinating aspects of Ethereum transaction data decoding is the process of handling different data types. The ABI plays a pivotal role here, as it outlines the types present in the input. Each type comes with its unique decoding rules, setting the stage for a methodical extraction of information.

Let's start by discussing fixed types, a category that includes address, uint256, int256, bytes32, and bool. These are types that have a predetermined, constant size, which simplifies the decoding process significantly.

Each hexadecimal character represents 4 bits or half a byte, and thus a block of 64 characters equates to 32 bytes of data, the size of an Ethereum word. As such, Ethereum encodes and decodes in these 32-byte chunks. We can consider each chunk as a 'block' of data that can be individually decoded based on its type.
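As a minimal sketch of this chunking, the snippet below splits hex-encoded argument data into 64-character words. The helper name `split_words` is hypothetical, and the sketch assumes the data holds only encoded arguments with an optional 0x prefix.

```python
WORD_HEX_CHARS = 64  # 32 bytes, two hex characters per byte


def split_words(data_hex):
    """Split hex-encoded argument data into 64-character (32-byte) words."""
    body = data_hex[2:] if data_hex.startswith("0x") else data_hex
    if len(body) % WORD_HEX_CHARS != 0:
        raise ValueError("data is not a whole number of 32-byte words")
    return [body[i:i + WORD_HEX_CHARS] for i in range(0, len(body), WORD_HEX_CHARS)]


# Two words: the value 1 followed by the value 0x2710.
words = split_words("0x" + "1".rjust(64, "0") + "2710".rjust(64, "0"))
print(len(words))      # 2
print(words[1][-4:])   # 2710
```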

For instance, if an address type is detected, it means that the corresponding data block holds an Ethereum address. But Ethereum addresses are only 20 bytes long, so the first 12 bytes of the block are padded with zeroes and the remaining 20 bytes hold the address. To decode, we keep the last 40 hexadecimal characters of the block and add a '0x' prefix. Taking a fixed-width slice is safer than stripping leading zeroes, because an address can itself begin with zero bytes.
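A short sketch of the address case, assuming the block arrives as a 64-character hex string. The function name `decode_address` is illustrative only; it keeps the last 40 characters so that addresses starting with zero bytes survive intact.

```python
def decode_address(word_hex):
    """Decode one 32-byte word (64 hex chars) into a 0x-prefixed address."""
    if len(word_hex) != 64:
        raise ValueError("expected a 64-character hex word")
    # The first 24 hex characters (12 bytes) are padding; the rest is the address.
    return "0x" + word_hex[-40:]


word = "0" * 24 + "ca35b7d915458ef540ade6068dfe2f44e8fa733c"
print(decode_address(word))  # 0xca35b7d915458ef540ade6068dfe2f44e8fa733c
```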

For uint256 and int256 types, the decoding process is quite straightforward. Ethereum encodes these types as unsigned and signed integers respectively. After fetching the corresponding 32-byte block, we convert it from a hexadecimal string to an integer. For int256, we must also handle negative numbers: the value is stored in two's complement, so if the highest bit is set we subtract 2^256 to recover the negative value.
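The integer cases can be sketched as follows. The function names are hypothetical, and the input is assumed to be a single 64-character hex word.

```python
def decode_uint256(word_hex):
    """Interpret a 32-byte word as an unsigned integer."""
    return int(word_hex, 16)


def decode_int256(word_hex):
    """Interpret a 32-byte word as a two's complement signed integer."""
    value = int(word_hex, 16)
    # If the top bit is set, the value is negative: subtract 2**256.
    if value >= 1 << 255:
        value -= 1 << 256
    return value


print(decode_uint256("2710".rjust(64, "0")))  # 10000
print(decode_int256("f" * 64))                # -1
```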

bytes32 is similar to the address type in terms of decoding. However, instead of skipping the first 12 bytes, we convert the entire 32-byte block from hexadecimal to bytes, as this type is meant to represent a sequence of bytes.

Lastly, for the bool type, the decoding hinges on the last byte of the block. Ethereum encodes True as 1 and False as 0, padding the remaining bytes in the block with zeroes. So, during decoding, we simply need to check the last byte to determine the boolean value.
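The bytes32 and bool cases are equally small. This is a sketch with hypothetical function names; note that it is slightly stricter than the description above, since it rejects any word whose value is not exactly 0 or 1 rather than looking only at the last byte.

```python
def decode_bytes32(word_hex):
    """Return the full 32-byte word as raw bytes."""
    return bytes.fromhex(word_hex)


def decode_bool(word_hex):
    """Decode a 32-byte word into a boolean (assumption: reject anything but 0 or 1)."""
    value = int(word_hex, 16)
    if value not in (0, 1):
        raise ValueError("not a valid ABI-encoded bool")
    return value == 1


print(decode_bool("1".rjust(64, "0")))        # True
print(len(decode_bytes32("ab" * 32)))         # 32
```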

Once a fixed type block is decoded, we move the 'pointer' 64 characters ahead, ready to decode the next block. The neat 32-byte encoding structure of Ethereum's fixed types allows for an organized, piece-by-piece decoding process that ultimately transforms the input into an interpretable format.

Let's do a quick example:

Suppose we have a simple ABI that specifies an address and a uint256:

And we receive the following hexadecimal-encoded transaction input data:

The first 64 characters of the argument data (after the 0x prefix and, in a real transaction, the 8-character function selector) represent the Ethereum address, and the next 64 characters represent the uint256.

Decoding follows the ABI. First, we take the last 40 characters of the address section, which drops the 24 characters of zero padding. Adding a '0x' prefix gives us:

Address: 0xca35b7d915458ef540ade6068dfe2f44e8fa733c

Next, we strip the leading zeroes from the next 64 characters to get the uint256 value:

uint256: 0x2710

This is hexadecimal for the decimal number 10000, which completes the decoding. Thus, the decoded input data for the function myFunction would be:
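To reproduce this walkthrough in code, here is a minimal sketch of the pointer-based loop for fixed types. The input is rebuilt from the address and value quoted above and is assumed to contain no function selector; `decode_fixed` is a hypothetical helper, not Coherent's decoder, and it handles only the two types used in this example.

```python
ADDRESS_HEX = "ca35b7d915458ef540ade6068dfe2f44e8fa733c"

# Assumed layout: two 32-byte words, left-padded with zeroes.
data = "0x" + ADDRESS_HEX.rjust(64, "0") + "2710".rjust(64, "0")


def decode_fixed(types, data_hex):
    """Decode a list of fixed-size ABI types, one 64-character word at a time."""
    body = data_hex[2:] if data_hex.startswith("0x") else data_hex
    pointer = 0
    values = []
    for abi_type in types:
        word = body[pointer:pointer + 64]
        if abi_type == "address":
            values.append("0x" + word[-40:])
        elif abi_type == "uint256":
            values.append(int(word, 16))
        else:
            raise NotImplementedError(abi_type)
        pointer += 64  # move to the next 32-byte word
    return values


print(decode_fixed(["address", "uint256"], data))
# ['0xca35b7d915458ef540ade6068dfe2f44e8fa733c', 10000]
```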

In the next section we will explore the decoding of Ethereum's dynamic types, and the added layer of complexity they bring to the table.

Dynamic Types

With fixed types under our belt, we now venture into the more complex terrain of dynamic types. Dynamic types in Ethereum include strings, bytes, and arrays. These types are dynamic because their length can vary from one transaction to the next. For example, a string can contain one character or a whole novel, and an array can contain zero elements or hundreds.

Decoding dynamic types is a two-step process. First, we read the 64 characters at the current position to find the data offset. This offset, expressed in bytes, tells us where the actual data starts. Note that it is measured from the start of the encoded arguments, not from the current reading position.

Once we have this offset, we use it to find the actual data. For strings and bytes, the first 64 characters at this offset represent the length of the string (or bytes) in bytes. For arrays, the first 64 characters represent the length of the array (i.e., the number of elements). After this, we read the specified number of elements.
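The array case can be sketched like this for a single top-level `uint256[]` argument. The helper names and the sample data are made up for illustration; positions are tracked in bytes and doubled to index into the hex string.

```python
def word_at(body, byte_pos):
    """Read the 32-byte word starting at byte_pos of a hex string as an integer."""
    word = body[byte_pos * 2:byte_pos * 2 + 64]
    if len(word) != 64:
        raise ValueError("read past the end of the data")
    return int(word, 16)


def decode_uint256_array(body, head_pos):
    """Follow the offset stored at head_pos, then read the length and the elements."""
    data_offset = word_at(body, head_pos)   # bytes from the start of the arguments
    length = word_at(body, data_offset)     # number of elements
    first = data_offset + 32                # elements follow the length word
    return [word_at(body, first + 32 * i) for i in range(length)]


def w(n):
    """Encode an integer as one 64-character hex word."""
    return format(n, "064x")


# offset 0x20, length 3, then the elements 1, 2, 3
body = w(0x20) + w(3) + w(1) + w(2) + w(3)
print(decode_uint256_array(body, 0))  # [1, 2, 3]
```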

Let's illustrate this with an example. Suppose we have the following ABI:

And suppose the corresponding transaction input data is:

The first 64 characters (after the 0x prefix) represent the data offset, which is 32 (0x20 in hex). This means the actual data starts 32 bytes, or 64 hexadecimal characters, from the start of the input data.

Next, at the data offset, the first 64 characters represent the length of the string. Here, the length is 5 (0x05 in hex).

Finally, the next 5 bytes hold the string "alice". We read one byte for each character, and since we're dealing with hexadecimal, each byte corresponds to two hexadecimal characters. This gives us the hex string 616c696365, which is the ASCII encoding of "alice". The remainder of the 32-byte block is zero padding and is ignored.

So, the decoded data for the function registerName is:

{ "name": "alice" }

The read pointer always advances 64 characters for each dynamic type in the argument list, even though the actual data lives elsewhere. This is because those 64 characters hold only the data offset.
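Putting the string steps together, a sketch might look like the following. The data is reconstructed from the values in the walkthrough (offset 0x20, length 5, then "alice" padded to a full word), and `decode_string` is a hypothetical helper that also returns the next head position to show that the pointer advances by exactly one word.

```python
def decode_string(body, head_pos):
    """Decode a string whose offset word sits at byte head_pos of the hex body."""
    data_offset = int(body[head_pos * 2:head_pos * 2 + 64], 16)
    length_start = data_offset * 2                    # hex index of the length word
    length = int(body[length_start:length_start + 64], 16)
    text_start = length_start + 64                    # text follows the length word
    raw = bytes.fromhex(body[text_start:text_start + length * 2])
    # The head pointer moves one 32-byte word, wherever the data itself lives.
    return raw.decode("utf-8"), head_pos + 32


body = format(0x20, "064x") + format(5, "064x") + "616c696365".ljust(64, "0")
name, next_head = decode_string(body, 0)
print({"name": name})  # {'name': 'alice'}
print(next_head)       # 32
```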

Tuples and nested dynamic types

Now let's delve into the headache that is nested dynamic types, such as tuples and arrays of tuples. Tuples are similar to structs in other programming languages; they can group different types, including dynamic types. When a tuple is present, decoding becomes more intricate due to the variety of types that can be included, and if these types are dynamic, additional steps need to be performed.

When we encounter a tuple, we first split it into its constituent types. However, the challenge here is to manage the dynamic types within the tuples. If the tuple contains dynamic types, an offset is present just as with a standalone dynamic type. This offset points us to where the dynamic data actually starts.

But, unlike standalone dynamic types, when we encounter dynamic types within a tuple, we don't simply jump to the data's location. Instead, we need to consider the offset relative to the start of the tuple. In other words, we calculate a new starting point from the beginning of the tuple and add the offset value to it. This new offset will point to the data we're interested in.
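Here is a small sketch of that relative-offset rule. It assumes a made-up signature with two arguments, a `uint256 id` and a tuple `(uint256 amount, string memo)`; all names and values are hypothetical. The tuple's own offset is measured from the start of the arguments, while the string's offset is added to the tuple's starting position.

```python
def read_word(body, byte_pos):
    """Read the 32-byte word at byte_pos of a hex string as an integer."""
    return int(body[byte_pos * 2:byte_pos * 2 + 64], 16)


def decode_string_at(body, byte_pos):
    """Decode a length-prefixed string whose length word starts at byte_pos."""
    length = read_word(body, byte_pos)
    start = (byte_pos + 32) * 2
    return bytes.fromhex(body[start:start + length * 2]).decode("utf-8")


def decode_amount_memo_tuple(body, tuple_start):
    """Decode (uint256 amount, string memo); the memo offset is relative to tuple_start."""
    amount = read_word(body, tuple_start)
    memo_offset = read_word(body, tuple_start + 32)
    memo = decode_string_at(body, tuple_start + memo_offset)
    return {"amount": amount, "memo": memo}


def w(n):
    return format(n, "064x")


body = (
    w(7)                            # 0x00: id
    + w(0x40)                       # 0x20: offset to the tuple, from the start of the arguments
    + w(10000)                      # 0x40: tuple start, amount
    + w(0x40)                       # 0x60: offset to memo, from the tuple start
    + w(5)                          # 0x80: memo length
    + "616c696365".ljust(64, "0")   # 0xa0: "alice", zero padded
)

tuple_start = read_word(body, 32)
print(decode_amount_memo_tuple(body, tuple_start))
# {'amount': 10000, 'memo': 'alice'}
```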

Arrays of tuples add another level of complexity. Just like with other dynamic types, we first encounter an offset that points to the length of the array. The array data follows the length word. If the tuples are themselves dynamic, that data begins with one offset per element, each measured from the start of the array data, telling us where that tuple begins. Each element in the array is a tuple, so we handle it as explained above. Importantly, dynamic types inside a tuple use offsets relative to the start of that tuple, not the beginning of the array.

In short, dealing with tuples, especially when they contain dynamic types or are part of arrays, requires careful handling of offsets. The offsets are critical to correctly locate and decode the data, and the relative nature of these offsets in the context of tuples makes the decoding process quite intricate. Our decoder makes recursive calls to our main decode_single function, so it is capable of recursively decoding even the most complicated, nested ABIs. This level of complexity in our decoder makes it a robust and versatile tool capable of handling a wide range of input data structures in Ethereum.
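To make the recursion concrete, here is a compact sketch of a recursive decoder. It borrows the name `decode_single` from the description above but is not Coherent's implementation. It assumes a simplified type notation invented for this example (strings for elementary types, a Python tuple for an ABI tuple, a one-element list `[T]` for `T[]`) and supports only a handful of types.

```python
def is_dynamic(t):
    if isinstance(t, list):
        return True                               # T[] is always dynamic
    if isinstance(t, tuple):
        return any(is_dynamic(m) for m in t)      # dynamic if any member is
    return t in ("string", "bytes")


def head_size(t):
    """Bytes taken in the head: static tuples are inlined, everything else is one word."""
    if isinstance(t, tuple) and not is_dynamic(t):
        return sum(head_size(m) for m in t)
    return 32


def read_word(body, pos):
    return int.from_bytes(body[pos:pos + 32], "big")


def decode_single(t, body, pos):
    """Decode a value of type t whose encoding starts at byte pos."""
    if isinstance(t, tuple):
        return decode_sequence(t, body, pos)
    if isinstance(t, list):
        length = read_word(body, pos)
        # Elements are laid out like a tuple that starts right after the length word.
        return decode_sequence((t[0],) * length, body, pos + 32)
    if t in ("string", "bytes"):
        length = read_word(body, pos)
        raw = body[pos + 32:pos + 32 + length]
        return raw.decode("utf-8") if t == "string" else raw
    if t == "address":
        return "0x" + body[pos + 12:pos + 32].hex()
    if t == "bool":
        return read_word(body, pos) == 1
    if t == "uint256":
        return read_word(body, pos)
    raise NotImplementedError(t)


def decode_sequence(types, body, base):
    """Decode members starting at byte base; dynamic members store offsets relative to base."""
    values, cursor = [], base
    for t in types:
        if is_dynamic(t):
            values.append(decode_single(t, body, base + read_word(body, cursor)))
        else:
            values.append(decode_single(t, body, cursor))
        cursor += head_size(t)
    return values


def w(n):
    return n.to_bytes(32, "big")


def s(text):
    return w(len(text)) + text.encode().ljust(32, b"\0")


# One argument of type (uint256, string)[] holding (1, "alice") and (2, "bob").
body = (
    w(0x20) + w(2)                   # offset to the array, then its length
    + w(0x40) + w(0xC0)              # per-element offsets, from the start of the array data
    + w(1) + w(0x40) + s("alice")    # element 0; string offset is relative to the element
    + w(2) + w(0x40) + s("bob")      # element 1
)
print(decode_sequence(([("uint256", "string")],), body, 0))
# [[[1, 'alice'], [2, 'bob']]]
```

Every offset is resolved against the start of the enclosing sequence, which is why the same `decode_sequence` call works for top-level arguments, tuples, and array elements alike.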

We are finished!

Conclusion

Simple, right? Not really! Decoding Ethereum transactions, especially when it involves nested dynamic types, can get very tricky. It was a fun but very challenging problem. With our custom decoder, we've transformed our ability to decode Ethereum input data on the fly within Snowflake, enhancing both the speed and accuracy of our analytics. What used to be a complex process involving several external libraries is now a streamlined operation, executed in a single UDF. Because decoding now happens inside the warehouse, we can decode millions of transactions within a few seconds.

Our new decoder not only enables us to keep up with the real-time flow of data but also ensures that we do not miss any critical information. This is especially valuable for industries that rely heavily on up-to-date and accurate data, like machine learning and AI. We can now perform precise and real-time analytics that would have been practically impossible with the prior ETL method.

This newfound ability to "see" the inputs provides a wealth of information about the blockchain's state at any given moment. We can determine what triggered a transaction, track the flow of tokens, or understand what prompted a particular change in a contract's state. The possibilities are endless.

We are excited to provide the web3 community with democratized access to human-readable EVM data. We believe this step forward significantly enhances the entire ecosystem’s ability to make Ethereum data not just accessible, but also actionable, paving the way for more informed decisions across various applications and sectors.