If you explore the Etherscan page, you’ll notice the Event Logs, State Changes, and Comments tabs. I don’t cover those here, but I do revisit them in Chapter 6. Transaction data isn’t the only data you’ll encounter in a blockchain application. Smart contract developers commonly use events to log notable actions in a smart contract. Data from these events are often of interest in the data analysis process. You’ll see this type of data again.
Categorizing Common Data in a Blockchain
You’ve already seen most of the types of data you’ll use when carrying out blockchain analytics. You’ve seen block header data, basic transaction data, and details contained in some transactions. You may have investigated the Etherscan user interface to view some event data, and even the effect a transaction has on the blockchain state. In this section, you learn more about the main categories of blockchain app data: transactions, events, and state.
Serializing transaction data
The core of blockchain data is contained in the transaction. A blockchain transaction records the transfer of some value from one account to another account. Additional information may be in the transaction, such as input data that records smart contract parameters, but not every transaction includes additional data.
Each transaction does include a timestamp showing the date and time the transaction was mined, so you can create a chronological list of transactions and see how value changed ownership at specific points in time and how value moved among accounts. This movement is serial. The serial nature of data storage can yield interesting information but can also be an obstacle to analyzing the data.
Unlike traditional data storage systems such as relational databases, final tallies or balances often have to be calculated over time. A traditional database can store the current balance of an account, while you may have to trace all blockchain transactions for an account to arrive at its final balance. The data is available, but it may take more work to get to it.
Blockchain gives you the flexibility of tracing transactions by account but doesn’t always make it easy to query a single value. For example, suppose you want to know the balance of a specific account on a specific date. Finding the current account balance is easy, but finding the balance as of a specific date (and time) requires serializing the transactions for that account and calculating account increases and decreases up to the date and time in question.
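To make the idea concrete, here is a minimal Python sketch of replaying transactions to find a balance at a point in time. The transaction records, account names, and the balance_as_of() function are hypothetical and made up for illustration. The sketch also assumes every transaction succeeded and that the sender pays the fee; it ignores other sources of balance changes, such as mining rewards.

```python
from datetime import datetime, timezone

# Hypothetical, simplified transaction records (not real blockchain data).
# Values and fees are plain integers standing in for amounts in wei.
TRANSACTIONS = [
    {"timestamp": datetime(2019, 11, 1, 12, 0, tzinfo=timezone.utc),
     "from": "0xAAA", "to": "0xBBB", "value": 500, "fee": 21},
    {"timestamp": datetime(2019, 11, 15, 9, 30, tzinfo=timezone.utc),
     "from": "0xBBB", "to": "0xCCC", "value": 200, "fee": 21},
    {"timestamp": datetime(2019, 12, 1, 8, 0, tzinfo=timezone.utc),
     "from": "0xCCC", "to": "0xBBB", "value": 50, "fee": 21},
]


def balance_as_of(account, cutoff, transactions, opening_balance=0):
    """Replay transactions in time order and return the balance at `cutoff`."""
    balance = opening_balance
    for tx in sorted(transactions, key=lambda t: t["timestamp"]):
        if tx["timestamp"] > cutoff:
            break  # everything after the cutoff is irrelevant
        if tx["to"] == account:
            balance += tx["value"]  # funds received
        if tx["from"] == account:
            balance -= tx["value"] + tx["fee"]  # funds sent, plus the fee paid
    return balance


cutoff = datetime(2019, 11, 30, tzinfo=timezone.utc)
balance_on_nov_30 = balance_as_of("0xBBB", cutoff, TRANSACTIONS)
```

Notice that the function has to visit every earlier transaction for the account, which is exactly the extra work a stored balance column in a traditional database would save you.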
If you’re comfortable with databases and applications that access database data, searching transactions doesn’t sound like such a bad thing to do. However, remember that a blockchain is not a database. The data in a blockchain is not stored in a manner that makes general-purpose queries easy and fast. You can get the information you want, but you have to think differently about the effort required to get it.
The serialized transaction storage of blockchain data does provide the flexibility to trace and retrieve activity data in several ways. Here are a few types of queries you can satisfy by tracing blockchain transactions:
Find all transactions in which a specific account sent funds.
Find all transactions that resulted in a specific account receiving funds.
Find all transactions that occurred between two specific accounts.
Find all transactions that invoked a specific smart contract function.
After you fetch the data you want, you can trace the transactions, calculating the value change (that is, keeping track of the Value and Transaction Fee fields) to find the information you’re looking for, such as a balance at a specific point.
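The four query types listed previously can be sketched as simple filters in Python. This example assumes you have already fetched transactions into a list of dictionaries; the field names, addresses, and function names shown are hypothetical and do not match any particular API.

```python
# Hypothetical, simplified transaction records (not a real API response).
# "function" is None for a plain value transfer.
TRANSACTIONS = [
    {"hash": "0x01", "from": "0xAAA", "to": "0xBBB", "function": None},
    {"hash": "0x02", "from": "0xBBB", "to": "0xDEX", "function": "cancelOrder"},
    {"hash": "0x03", "from": "0xBBB", "to": "0xAAA", "function": None},
    {"hash": "0x04", "from": "0xAAA", "to": "0xDEX", "function": "trade"},
]


def sent_by(txs, account):
    """Transactions in which `account` sent funds."""
    return [tx for tx in txs if tx["from"] == account]


def received_by(txs, account):
    """Transactions in which `account` received funds."""
    return [tx for tx in txs if tx["to"] == account]


def between(txs, first, second):
    """Transactions between two accounts, in either direction."""
    return [tx for tx in txs if {tx["from"], tx["to"]} == {first, second}]


def invoking(txs, contract, function):
    """Transactions that called `function` on `contract`."""
    return [tx for tx in txs
            if tx["to"] == contract and tx["function"] == function]


cancel_calls = invoking(TRANSACTIONS, "0xDEX", "cancelOrder")
```

Each filter scans the whole list, which mirrors the point made earlier: the data is all there, but you do the searching yourself.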
Logging events on the blockchain
One of the more interesting aspects of blockchain data extends the information you can get from transactions. As mentioned, a transaction is the transfer of some value as a result of a smart contract function. Because the only way to create a transaction is to invoke a smart contract function, you can be sure that a transaction is the result of a function.
The previous statement may sound redundant, but it's extremely important. Smart contract functions can be simple or complex. As smart contracts become more complex, just knowing the function a transaction invoked, along with its input parameter values, isn’t always enough information to describe what’s going on. You need a way to record what happens inside transactions.
Ethereum, and most popular blockchains of the second generation and beyond, support sophisticated smart contract languages. Ethereum’s EVM (Ethereum virtual machine) is a Turing-complete machine, so with enough resources, an Ethereum smart contract can calculate anything. Of course, in the real world, transactions eventually run out of gas, but the point is that your smart contract functions can be as complex as you want.
Go back to Etherscan and dig a little deeper into block 8976776’s transactions. Examine the same transaction in Figure 3-4 (block 8976776 -> Transaction list -> Fourth transaction in the list’s details). Click or tap the Event Logs tab at the top of the page. The Event Logs page shows a list of events that occurred during a smart contract function. Figure 3-9 shows the last two events for the current transaction.
FIGURE 3-9: Ethereum events in Etherscan.
Note that these events have names (LogTransfer() and LogOrderCancelled()) and parameters. Smart contract programmers use events to create messages that Ethereum logs and saves. Events make it easy to notify client applications that certain actions have taken place in a smart contract and also to store important information related to a transaction.
Smart contract programmers use events to record internal details of how smart contracts operate. The programmer defines events and the parameters passed when the events are called. Then, during runtime, the smart contract invokes the event when something notable happens in the code. For example, when using the popular language Solidity for writing smart contract code, the emit command invokes an event. Any time a programmer wants to send a message to the client or record an action, the emit statement invokes an event to do just that.
Most smart contract programmers use event names that describe the action. So we would expect that the LogOrderCancelled() event is present because an order was cancelled in the transaction. Smart contract programmers can create events anywhere in their code. The most common purpose of an event is to record the occurrence of an action, such as cancelling an order. The event parameters, orderHash and by, provide identifying information for the order that was cancelled and who cancelled it. Events take some effort to analyze but can yield interesting analysis data.
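As a rough illustration of event analysis, the following Python sketch groups decoded event logs by name and counts order cancellations per account. It assumes the logs have already been fetched and decoded into dictionaries. The event names and the orderHash and by parameters come from the example above; every value, and the amount parameter on LogTransfer, is hypothetical.

```python
from collections import defaultdict

# Hypothetical decoded event logs. Event names follow the text; all values
# (and the "amount" argument on LogTransfer) are made up for illustration.
EVENT_LOGS = [
    {"tx": "0x02", "event": "LogTransfer", "args": {"amount": 100}},
    {"tx": "0x02", "event": "LogOrderCancelled",
     "args": {"orderHash": "0xabc", "by": "0xBBB"}},
    {"tx": "0x05", "event": "LogOrderCancelled",
     "args": {"orderHash": "0xdef", "by": "0xCCC"}},
    {"tx": "0x07", "event": "LogOrderCancelled",
     "args": {"orderHash": "0x123", "by": "0xBBB"}},
]


def group_by_event(logs):
    """Return a dict mapping each event name to the logs carrying that name."""
    grouped = defaultdict(list)
    for log in logs:
        grouped[log["event"]].append(log)
    return dict(grouped)


def cancellations_by_account(logs):
    """Count LogOrderCancelled events per cancelling account (the `by` value)."""
    counts = defaultdict(int)
    for log in logs:
        if log["event"] == "LogOrderCancelled":
            counts[log["args"]["by"]] += 1
    return dict(counts)


events = group_by_event(EVENT_LOGS)
cancel_counts = cancellations_by_account(EVENT_LOGS)
```

Because event names describe actions, grouping by name is often a reasonable first step before digging into the parameters.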
Storing value with smart contracts
The last main category of data associated with a blockchain is state data. State data is the data that is most like traditional database data. Each smart contract can define one or more variables or structures to store data values. These values can include things such as the highest order number (for an order entry contract) or a list of products (for a supply chain contract). State data makes it possible to store data that contracts use each time one of their functions is invoked.
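The following Python class is a toy stand-in for an order entry contract, meant only to show how state persists between function calls. It is not Solidity and does not run on a blockchain; the class name, method name, and product names are hypothetical.

```python
class OrderEntryContract:
    """Toy model of a contract whose state survives between function calls."""

    def __init__(self):
        # State variables: stored with the contract, not with any one transaction.
        self.highest_order_number = 0
        self.orders = {}

    def place_order(self, product):
        """Record a new order and return the order number assigned to it."""
        self.highest_order_number += 1
        self.orders[self.highest_order_number] = product
        return self.highest_order_number


contract = OrderEntryContract()
first = contract.place_order("widget")
second = contract.place_order("gadget")
```

After two calls, highest_order_number holds the current value directly, so you can read it without replaying any transactions.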