Chatting with Peers

December 2, 2021

Symbol was designed from the start to be a heterogeneous network of nodes. When designing and building Symbol, we expected peer nodes to be the backbone of the network with some api and dual nodes sprinkled in and operated by our biggest supporters, service providers and exchanges. Instead, what happened was quite different.

Today, the vast majority of nodes are dual. Dual nodes certainly expose more capabilities than peer nodes since they include a mongo database and support querying over REST. Unfortunately, the additional feature set also increases the attack surface. For example, a mongo vulnerability could take down the server. In contrast, peer nodes can be thought of as minimum viable nodes. While they're less capable, they still fully support the network and have the smallest possible attack surface.

ℹī¸ While you may have heard the terms peer, api and dual used to describe nodes, they don't actually represent the universe of possible node types. In fact, they are just a handful of possible node configurations. Nodes are customized by the set of extensions they load. An api node simply enables the core set of extensions as well as the mongo and zeromq extensions. Nonetheless, there's nothing preventing a node from enabling the mongo extension but not the zeromq extension or vice versa.

Since peer nodes are fully functional and secure the network just as well as other node configurations, it's important for network tools to support them. In the following sections, we'll write some Python code for querying chain statistics from REST nodes AND barebones peer nodes. For our examples, we'll be communicating with xymharvesting.net, so I hope it is strong enough! 😅

Tutorial

First, let's define what we mean by chain statistics. Let's fill in a structure that gives us information about the current height, chain score and finalized height. Together, these give a quick indication of whether or not a node is synced.

class ChainStatistics:
    def __init__(self):
        self.height = 0
        self.finalized_height = 0
        self.score_high = 0
        self.score_low = 0

For illustrative purposes, let's add a formatter for ChainStatistics too:

    def __str__(self):
        score = (self.score_high << 64) | self.score_low
        return '\n'.join([
            f'          height: {self.height}',
            f'finalized height: {self.finalized_height}',
            f'           score: {score}'
        ])
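
The node reports the score as two 64-bit halves, which is why the formatter shifts the high half left by 64 bits before combining them. Here is a minimal sketch of that arithmetic using made-up values. The helper names `combine_score` and `split_score` are hypothetical and only serve the illustration:

```python
def combine_score(score_high, score_low):
    # hypothetical helper: treat the two values as the upper and lower 64 bits of one integer
    return (score_high << 64) | score_low


def split_score(score):
    # inverse operation: recover the two 64-bit halves
    return (score >> 64, score & 0xFFFFFFFFFFFFFFFF)


combined = combine_score(1, 2)
print(combined)               # 18446744073709551618 (2**64 + 2)
print(split_score(combined))  # (1, 2)
```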

REST Query

First, we'll use the Symbol REST API to retrieve the node information. Hopefully, you're already familiar with this API, and this section is a quick refresher! To get the chain statistics, we need to query the /chain/info endpoint.

We can do this easily from the command line using curl:

curl -s "http://xymharvesting.net:3000/chain/info" | python -m json.tool

This command will give an output like the following:

{
    "height": "747467",
    "scoreHigh": "4",
    "scoreLow": "5750302566595492499",
    "latestFinalizedBlock": {
        "finalizationEpoch": 521,
        "finalizationPoint": 4,
        "height": "747440",
        "hash": "D4EA26FE43937AB9D41A580AA7EFE8865C359F0D0D871C135201A1DD20D5D865"
    }
}

It includes the four properties we want: height, scoreHigh, scoreLow and latestFinalizedBlock.height. We can easily write the equivalent in python using requests:

import requests

def get_chain_statistics_rest(host, port):
    json_response = requests.get(f'http://{host}:{port}/chain/info').json()

    statistics = ChainStatistics()
    statistics.height = int(json_response['height'])
    statistics.finalized_height = int(json_response['latestFinalizedBlock']['height'])
    statistics.score_high = int(json_response['scoreHigh'])
    statistics.score_low = int(json_response['scoreLow'])
    return statistics
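
The field mapping can also be checked without a network call by feeding the sample response from above through the same conversions. This is only a sketch: `parse_chain_info` is a hypothetical helper, not part of any SDK, and it returns a plain dict so that the snippet stands alone.

```python
import json

SAMPLE_RESPONSE = '''
{
    "height": "747467",
    "scoreHigh": "4",
    "scoreLow": "5750302566595492499",
    "latestFinalizedBlock": {
        "finalizationEpoch": 521,
        "finalizationPoint": 4,
        "height": "747440",
        "hash": "D4EA26FE43937AB9D41A580AA7EFE8865C359F0D0D871C135201A1DD20D5D865"
    }
}
'''


def parse_chain_info(json_response):
    # REST returns 64-bit values as strings, so each one is converted with int()
    return {
        'height': int(json_response['height']),
        'finalized_height': int(json_response['latestFinalizedBlock']['height']),
        'score_high': int(json_response['scoreHigh']),
        'score_low': int(json_response['scoreLow'])
    }


statistics = parse_chain_info(json.loads(SAMPLE_RESPONSE))
print(statistics['height'] - statistics['finalized_height'])  # 27 blocks ahead of finalization
```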

Running the code against xymharvesting.net should look something like this:

chain_statistics_rest = get_chain_statistics_rest('xymharvesting.net', 3000)
print(chain_statistics_rest)
          height: 747748
finalized height: 747720
           score: 79568623189124512723

Hopefully, that was a nice easy warmup. Things are about to get a bit more difficult!

Peer Query

Now, we get to the fun stuff! Let's get that same information via the peer node API that is accessible from all nodes, even peer only ones. It's possible, I promise. 😇

Environment Setup

All peer nodes communicate over TLS. In order to connect to any node, an SSL certificate needs to be prepared.

ℹī¸ Symbol-compatible certificates are composed of a two level certificate chain. A two level chain is used because it provides a safe way of associating a node with an importance score.

  1. LEVEL1: CA certificate that is (self) signed by the main account private key. This is used to establish a connection between a node and an importance score. The node importance score is used to weight nodes for certain operations, like node selection and time synchronization. Importantly, only the public key of this certificate is present on the remote server (ca.pubkey.pem). It is possible to use a random key, in which case the node will be associated with zero importance.
  2. LEVEL2: Node/transport certificate that is signed by the LEVEL1 CA. The private key associated with this certificate is present on the remote server (node.key.pem). This private key is used to establish TLS sessions for peer communication as well as to decrypt harvest delegation requests.

Symbol requires each node to use exactly one CA that is unique network-wide. Neither using multiple CAs from a single node nor using the same CA from multiple nodes is supported. Accordingly, if you're going to run this code on a node with a running instance of catapult-server, you'll need to use the same certificates used by the server. These can usually be found in a directory named cert or certificates depending on your deployment methods.

If you're running on a node without a running catapult-server, you'll need to do a bit of preparation and generate a Symbol-compatible certificate chain. Luckily, this can be accomplished with the following commands:

# generate CA private key
openssl genpkey -algorithm ed25519 -outform PEM -out ca.key.pem

# get the certtool
git clone https://github.com/symbol/symbol-node-configurator.git

# generate the certificate chain
PYTHONPATH=./symbol-node-configurator python symbol-node-configurator/certtool.py \
    --working cert \
    --name-ca "my cool CA" \
    --name-node "my cool node name" \
    --ca ca.key.pem
cat cert/node.crt.pem cert/ca.crt.pem > cert/node.full.crt.pem

If you've done everything correctly, the following command should write OK to the console:

openssl verify -CAfile cert/ca.crt.pem cert/node.full.crt.pem

If you don't see OK, then something is wrong and you have a legendary fight with OpenSSL ahead of you. 😭 Best of luck and see you on the other side.

Preparing a Connection

Since this example is a bit more complicated than REST, let's start with a class that initializes a connection. This snippet doesn't do too much aside from setting up some variables and initializing an ssl_context. For brevity, we're using a verification mode of CERT_NONE. In production uses, you might want to consider implementing a custom verification handler similar to what is done in the catapult-client and catapult-rest projects.

import socket
import ssl
from pathlib import Path
from symbolchain.core.BufferReader import BufferReader
from symbolchain.core.BufferWriter import BufferWriter


class SymbolPeerClient:
    def __init__(self, host, port, certificate_directory):
        (self.node_host, self.node_port) = (host, port)
        self.certificate_directory = Path(certificate_directory)
        self.timeout = 10

        self.ssl_context = ssl.create_default_context()
        self.ssl_context.check_hostname = False
        self.ssl_context.verify_mode = ssl.CERT_NONE
        self.ssl_context.load_cert_chain(
            self.certificate_directory / 'node.full.crt.pem',
            keyfile=self.certificate_directory / 'node.key.pem')

ℹī¸ node.full.crt.pem is the LEVEL1 and LEVEL2 certificates concatenated to form the Symbol-compatible two level certificate chain. node.key.pem contains the node/transport private key.

In Symbol, all peer communication is wrapped in packets. Each packet is composed of a small header indicating its size and type and an optional payload. Packets are used for both requests and responses. Some requests trigger responses, others do not.

The chain statistics endpoint and most of the interesting endpoints you'll want to interact with have request / response semantics - send a request packet over an SSL socket and wait for a response packet. Some request packets contain data, but many do not. The latter are called simple or header only packets.
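
To make the header layout concrete, here is a minimal sketch that builds and decodes a header only packet with the standard struct module. It assumes the header is two 4-byte unsigned integers (size, then type) in little-endian byte order. The byte order is an assumption on my part, the function names are hypothetical, and the type value 5 is just a placeholder:

```python
import struct

HEADER_FORMAT = '<II'  # assumed layout: uint32 size, uint32 type, little-endian
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 8 bytes


def build_simple_packet(packet_type):
    # a header only packet has no payload, so its total size is just the header size
    return struct.pack(HEADER_FORMAT, HEADER_SIZE, packet_type)


def parse_header(buffer):
    # returns (size, packet_type) decoded from the first 8 bytes
    return struct.unpack_from(HEADER_FORMAT, buffer)


packet = build_simple_packet(5)
print(packet.hex())          # 0800000005000000
print(parse_header(packet))  # (8, 5)
```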

Let's write a helper function to orchestrate this request / response flow. For brevity, we'll assume a simple request packet. There is nothing too interesting here. The code is just creating a socket connection and wrapping it in SSL:

    def _send_socket_request(self, packet_type, parser):
        try:
            with socket.create_connection((self.node_host, self.node_port), self.timeout) as sock:
                with self.ssl_context.wrap_socket(sock) as ssock:
                    self._send_simple_request(ssock, packet_type)
                    return parser(self._read_packet_data(ssock, packet_type))
        except socket.timeout as ex:
            raise ConnectionRefusedError from ex

Requests

Let's write a helper function for sending a simple packet. Remember that a simple packet is composed of only a size and a type. So, constructing one is fairly easy using BufferWriter from the Symbol Python SDK:

    @staticmethod
    def _send_simple_request(ssock, packet_type):
        writer = BufferWriter()
        writer.write_int(8, 4)
        writer.write_int(packet_type, 4)
        ssock.sendall(writer.buffer)

Responses

Let's write a helper function for receiving a packet and wrapping its data in a BufferReader (from the Symbol Python SDK). First, we read the packet size and then read chunks from the socket until we've received the entire packet data. Next, we wrap a BufferReader around the read bytes and inspect the packet header. Finally, we check that the response packet has the type we're expecting:

    def _read_packet_data(self, ssock, packet_type):
        read_buffer = ssock.read()

        if 0 == len(read_buffer):
            raise ConnectionRefusedError(f'socket returned empty data for {self.node_host}')

        size = BufferReader(read_buffer).read_int(4)

        while len(read_buffer) < size:
            read_buffer += ssock.read()

        reader = BufferReader(read_buffer)
        size = reader.read_int(4)
        actual_packet_type = reader.read_int(4)

        if packet_type != actual_packet_type:
            raise ConnectionRefusedError(f'socket returned packet type {actual_packet_type} but expected {packet_type}')

        return reader
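
One thing to keep in mind is that a stream socket may return fewer bytes than requested, so even the 4-byte size field can arrive in pieces. A more defensive variant reads an exact number of bytes in a loop. The sketch below is an illustration under assumptions rather than the code used above: `read_exact` and `FakeSocket` are hypothetical, little-endian byte order is assumed, and the fake stands in for an SSL socket so the snippet runs without a network.

```python
def read_exact(sock, count):
    # keep calling recv until exactly `count` bytes have been collected
    chunks = bytearray()
    while len(chunks) < count:
        chunk = sock.recv(count - len(chunks))
        if not chunk:
            raise ConnectionError('socket closed before all data was received')
        chunks += chunk
    return bytes(chunks)


class FakeSocket:
    # hypothetical stand-in that hands out data at most 3 bytes at a time
    def __init__(self, data):
        self.data = data

    def recv(self, count):
        step = min(count, 3)
        chunk, self.data = self.data[:step], self.data[step:]
        return chunk


# synthetic 12-byte packet: size=12, type=5, followed by a 4-byte payload
sock = FakeSocket(bytes.fromhex('0C000000050000002A000000'))
size = int.from_bytes(read_exact(sock, 4), 'little')
rest = read_exact(sock, size - 4)
print(size, rest.hex())  # 12 050000002a000000
```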

Payoff

If you're still reading, hopefully you aren't totally confused and remember what we set out to do. 😅 For those of you who forgot, it was to get chain statistics from a peer node!

We still need to send a chain statistics request. For querying chain statistics, we need to send a simple request packet of type Chain_Statistics. You can find the value for this and others here. If that link looks familiar, it's because I shared it already, so it's probably important. Looking for chain statistics in that file, we find:

/* Chain statistics have been requested by a peer. */ \
ENUM_VALUE(Chain_Statistics, 5) \

With that we can call our helper function above and we're almost done:

    def get_chain_statistics(self):
        packet_type = 5
        return self._send_socket_request(packet_type, self._parse_chain_statistics_response)

Now, we just need to parse the chain statistics response, but what does it look like? It must be defined somewhere in catapult-client, right? Correct, and it has the completely unsurprising name of ChainStatisticsResponse. Looking at it, it has the exact four fields we want! 🥳

Knowing that, we can parse out the fields and create a ChainStatistics from it:

    @staticmethod
    def _parse_chain_statistics_response(reader):
        chain_statistics = ChainStatistics()

        chain_statistics.height = reader.read_int(8)
        chain_statistics.finalized_height = reader.read_int(8)
        chain_statistics.score_high = reader.read_int(8)
        chain_statistics.score_low = reader.read_int(8)

        return chain_statistics
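
The parsing logic can be sanity-checked offline by decoding a hand-built payload. This sketch uses struct instead of BufferReader so that it has no dependencies. It assumes four consecutive little-endian 64-bit unsigned integers in the same order as the reads above. The helper name is hypothetical and the payload bytes are synthetic, not captured from a real node:

```python
import struct

PAYLOAD_FORMAT = '<QQQQ'  # assumed: height, finalized height, score high, score low


def parse_chain_statistics_payload(payload):
    # hypothetical helper mirroring the four 8-byte reads
    (height, finalized_height, score_high, score_low) = struct.unpack(PAYLOAD_FORMAT, payload)
    return {
        'height': height,
        'finalized_height': finalized_height,
        'score': (score_high << 64) | score_low
    }


# synthetic payload built from the sample numbers in the REST section
payload = struct.pack(PAYLOAD_FORMAT, 747467, 747440, 4, 5750302566595492499)
result = parse_chain_statistics_payload(payload)
print(len(payload))  # 32
print(result)
```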

Finally, putting it all together, we can make a peer query and hope it works: 🤞

# CERTIFICATE_DIRECTORY should point to a directory containing Symbol-compatible certificates
peer_client = SymbolPeerClient('xymharvesting.net', 7900, CERTIFICATE_DIRECTORY)
chain_statistics_peer = peer_client.get_chain_statistics()
print(chain_statistics_peer)

If you did everything right, you should see the exact same output as from the previous section:

          height: 747748
finalized height: 747720
           score: 79568623189124512723

Hopefully, you learned something new and can now go off and converse with peers.

Reader Exercises

  1. Retrieve node information via the node/info endpoint and Node_Discovery_Pull_Ping packet.
  2. Convert the examples from synchronous to asynchronous.
  3. Make other suggested improvements to the code and make a PR to miscellaneous.