Python Interop and VRF

December 11, 2022

This guide will explore calling basic catapult client C++ functions from Python using cffi. This is an advanced topic, but it opens up a wide range of possibilities. With a little work, Python applications can access functionality already implemented in the catapult client but not in the SDKs - for example, verifiable random functions (VRFs).

For the purposes of this example, we will be calling the following functions from VRF.h:

/// Generates a verifiable random function proof from \a alpha and \a keyPair.
VrfProof GenerateVrfProof(const RawBuffer& alpha, const KeyPair& keyPair);

/// Verifies verifiable random function proof (\a vrfProof) using \a alpha and \a publicKey.
Hash512 VerifyVrfProof(const VrfProof& vrfProof, const RawBuffer& alpha, const Key& publicKey);

/// Generates a verifiable random function proof hash from \a gamma.
Hash512 GenerateVrfProofHash(const ProofGamma& gamma);

You can code along or find fully working, step-by-step code here.

C API

CFFI only supports C functions. As a result, in order to use cffi, we need to create a small C wrapper around the C++ functions we want to call.

VrfProof

VrfProof is a struct containing three ByteArray-based fields. ProofGamma and ProofScalar are 32 bytes each and ProofVerificationHash is 16 bytes.

ℹī¸ ByteArray-based types are used to provide enhanced type safety in catapult client. Notice that even though ProofGamma and ProofScalar have the same size (32 bytes) and specialize ByteArray, they are NOT interchangeable because they have different tags.

/// VRF proof gamma.
struct ProofGamma_tag { static constexpr size_t Size = 32; };
using ProofGamma = utils::ByteArray<ProofGamma_tag>;

/// VRF proof verification hash.
struct ProofVerificationHash_tag { static constexpr size_t Size = 16; };
using ProofVerificationHash = utils::ByteArray<ProofVerificationHash_tag>;

/// VRF proof scalar.
struct ProofScalar_tag { static constexpr size_t Size = 32; };
using ProofScalar = utils::ByteArray<ProofScalar_tag>;

/// VRF proof for the verifiable random function.
struct VrfProof {
    /// Gamma.
    ProofGamma Gamma;

    /// Verification hash.
    ProofVerificationHash VerificationHash;

    /// Scalar.
    ProofScalar Scalar;
};

Since we are creating a C wrapper for use in python - a dynamically typed language - we will drop the enhanced type safety of the VrfProof fields and simply use fixed size arrays:

struct CVrfProof {
    unsigned char Gamma[32];
    unsigned char VerificationHash[16];
    unsigned char Scalar[32];
};
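As a quick sanity check on this layout, the struct can be mirrored in Python. The sketch below uses the standard library's ctypes module as a stand-in (it is not the cffi binding built later in this guide) to show that three unsigned char arrays pack without padding, so the whole proof occupies 80 bytes:

```python
import ctypes


class CVrfProof(ctypes.Structure):
    # field order and sizes mirror the C struct above
    _fields_ = [
        ('Gamma', ctypes.c_ubyte * 32),
        ('VerificationHash', ctypes.c_ubyte * 16),
        ('Scalar', ctypes.c_ubyte * 32),
    ]


# byte arrays have an alignment of 1, so no padding is inserted between fields
proof_size = ctypes.sizeof(CVrfProof)
gamma_offset = CVrfProof.Gamma.offset
verification_hash_offset = CVrfProof.VerificationHash.offset
scalar_offset = CVrfProof.Scalar.offset
```

Because every field is a byte array, the in-memory layout is simply the three fields concatenated, which is what makes the memcpy-based conversions used later straightforward.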

CatapultGenerateVrfProof

The C++ declaration is:

/// Generates a verifiable random function proof from \a alpha and \a keyPair.
VrfProof GenerateVrfProof(const RawBuffer& alpha, const KeyPair& keyPair);

alpha is a RawBuffer, which is a variable-sized buffer composed of two fields:

/// Data pointer.
T* pData;

/// Data size.
size_t Size;

In the C declaration, we can simply expand alpha into two parameters: const unsigned char* pointing to the data and unsigned int indicating the size of the data.

keyPair is a KeyPair instance, which is guaranteed to have a matching private key and public key. It is a bit clumsy to enforce this constraint in C. In the C declaration, we could pass in both the private and public key separately. Alternatively, we could just pass in the private key and derive the public key from it. The latter comes at an additional performance cost - due to the key derivation - but guarantees the public and private keys will always match. For conciseness, we will use a single const unsigned char* fixed size buffer pointing to the private key.

VrfProof is the return value. In C, it is conventional to return large values via out parameters, so that the caller has full control over allocation decisions. Accordingly, in the C declaration, we will return the VrfProof via an out parameter. The out parameter will be of type CVrfProof. The function will not have any return value.
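The out-parameter convention can be sketched from the caller's side in Python. This is only an illustration using the standard library's ctypes module: fill_result is a hypothetical stand-in for a C function, and RESULT_SIZE is simply the size of CVrfProof (32 + 16 + 32 bytes). The point is that the caller allocates the memory and the callee only writes into it:

```python
import ctypes

RESULT_SIZE = 80  # size of CVrfProof: 32 + 16 + 32 bytes


def fill_result(out_buffer):
    """Hypothetical stand-in for a C function that writes its result into out_buffer."""
    ctypes.memset(out_buffer, 0xAB, RESULT_SIZE)


# the caller decides where the result lives and passes that memory to the function
caller_buffer = (ctypes.c_ubyte * RESULT_SIZE)()
fill_result(caller_buffer)
result = bytes(caller_buffer)
```

The function itself returns nothing. All output travels through the buffer that the caller owns.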

Putting that all together, the C declaration looks like:

PLUGIN_API
void CatapultGenerateVrfProof(
        const unsigned char* alpha,
        unsigned int alphaSize,
        const unsigned char* privateKey,
        struct CVrfProof* vrfProof);

The C implementation is fairly straightforward.

First, we need to prepare the C arguments to be able to call the C++ function:

  1. Wrap a RawBuffer around alpha and alphaSize
  2. Create a KeyPair from privateKey, which will derive the public key

Second, we need to call the C++ function.

Finally, we need to copy the C++ result into the vrfProof out parameter.

Altogether, the implementation looks like:

using namespace catapult::crypto;

// 1. wrap KeyPair around private key
auto cppKeyPair = KeyPair::FromPrivate(PrivateKey::FromBuffer({ privateKey, PrivateKey::Size }));

// 2. call c++ function
auto cppVrfProof = GenerateVrfProof({ alpha, alphaSize }, cppKeyPair);

// 3. copy result
std::memcpy(vrfProof->Gamma, cppVrfProof.Gamma.data(), cppVrfProof.Gamma.size());
std::memcpy(vrfProof->VerificationHash, cppVrfProof.VerificationHash.data(), cppVrfProof.VerificationHash.size());
std::memcpy(vrfProof->Scalar, cppVrfProof.Scalar.data(), cppVrfProof.Scalar.size());

VerifyVrfProof

The C++ declaration is:

/// Verifies verifiable random function proof (\a vrfProof) using \a alpha and \a publicKey.
Hash512 VerifyVrfProof(const VrfProof& vrfProof, const RawBuffer& alpha, const Key& publicKey);

vrfProof is a (C++) VrfProof instance that can be easily replaced with a CVrfProof parameter.

alpha will be expanded into two parameters: const unsigned char* pointing to the data and unsigned int indicating the size of the data.

publicKey will be replaced with a const unsigned char* fixed size buffer pointing to the public key.

The return value will be replaced with an out parameter. The out parameter will be an unsigned char* fixed size buffer (non-const, because the function writes to it) pointing to the resulting 64 byte hash.

Putting that all together, the C declaration looks like:

PLUGIN_API
void CatapultVerifyVrfProof(
        const struct CVrfProof* vrfProof,
        const unsigned char* alpha,
        unsigned int alphaSize,
        const unsigned char* publicKey,
        unsigned char* hash512);

The implementation follows the same template as above and looks like:

using namespace catapult::crypto;
using PublicKey = catapult::Key;

// 1. create VrfProof from CVrfProof
VrfProof cppVrfProof;
std::memcpy(cppVrfProof.Gamma.data(), vrfProof->Gamma, ProofGamma::Size);
std::memcpy(cppVrfProof.VerificationHash.data(), vrfProof->VerificationHash, ProofVerificationHash::Size);
std::memcpy(cppVrfProof.Scalar.data(), vrfProof->Scalar, ProofScalar::Size);

// - copy publicKey to ByteArray
PublicKey cppPublicKey;
std::memcpy(cppPublicKey.data(), publicKey, PublicKey::Size);

// 2. call c++ function
auto cppHash512 = VerifyVrfProof(cppVrfProof, { alpha, alphaSize }, cppPublicKey);

// 3. copy result
std::memcpy(hash512, cppHash512.data(), cppHash512.size());

GenerateVrfProofHash

The C++ declaration is:

/// Generates a verifiable random function proof hash from \a gamma.
Hash512 GenerateVrfProofHash(const ProofGamma& gamma);

gamma will be replaced with a const unsigned char* fixed size buffer pointing to the proof gamma.

The return value will be replaced with an out parameter. The out parameter will be an unsigned char* fixed size buffer (non-const, because the function writes to it) pointing to the resulting 64 byte hash.

Putting that all together, the C declaration looks like:

PLUGIN_API
void CatapultGenerateVrfProofHash(const unsigned char* gamma, unsigned char* hash512);

The implementation follows the same template as above and looks like:

using namespace catapult::crypto;

// 1. copy gamma to ByteArray
ProofGamma cppGamma;
std::memcpy(cppGamma.data(), gamma, ProofGamma::Size);

// 2. call c++ function
auto cppHash512 = GenerateVrfProofHash(cppGamma);

// 3. copy result
std::memcpy(hash512, cppHash512.data(), cppHash512.size());

Build Notes

CFFI works with both static and dynamic C libraries. We need to link against C++ libraries, which is something the C linker can't do. In order to work around that, we need to use a dynamic library. All the C++ dependencies will be resolved during the build of the dynamic library. CFFI will only need to link against the C-function wrappers, which it is able to do.

In the catapult client build system, we can build a dynamic library using the following instructions:

catapult_shared_library_target(catapult.cvrf)
target_link_libraries(catapult.cvrf catapult.crypto)

ℹī¸ We need to link against the catapult.crypto library because it contains the VRF functions we're calling!

In addition, we need to mark all the C functions as functions we want to export from the dynamic library. We use the PLUGIN_API macro for that, which can be found here.

When building the dynamic library, we need to make sure all the exported functions use C linkage, so that their names are not mangled by the C++ compiler. In order to do that, we need to wrap them in an extern "C" block.

#ifdef __cplusplus
extern "C" {
#endif
...
#ifdef __cplusplus
}
#endif

Notice the extern "C" block is conditional and included only when building with a C++ compiler. A C compiler always uses C linkage, so the block is redundant (and unrecognized).

Building

To build from source, create a _build directory following this guide. For a better experience, in step 3, the final call to ninja (which builds the entire branch) can be omitted. Instead, use the command ninja catapult.cvrf (which will only build the VRF C interop DLL and its dependencies).

CFFI (Build)

Now that we have the C API, we need to use CFFI to produce a Python-callable wrapper.

First, we need to create an FFI builder:

import os
from pathlib import Path

from cffi import FFI

ffi_builder = FFI()

For simplicity, we'll assume an environment variable exists that points to the catapult client source code.

catapult_client_root = Path(os.environ.get('CATAPULT_CLIENT_ROOT'))
catapult_default_bin_directory = catapult_client_root / '_build' / 'bin'
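Note that Path(None) raises a TypeError, so an unset CATAPULT_CLIENT_ROOT fails with a fairly cryptic message. As a hedged sketch (resolve_catapult_client_root is a hypothetical helper, not part of the example code), the lookup could be wrapped to fail with a clearer error:

```python
import os
from pathlib import Path


def resolve_catapult_client_root(environ=os.environ):
    """Returns the catapult client root as a Path, or raises if the variable is unset."""
    root = environ.get('CATAPULT_CLIENT_ROOT')
    if not root:
        raise RuntimeError('CATAPULT_CLIENT_ROOT must point to the catapult client source')

    return Path(root)


# example using an explicit mapping instead of the real environment
example_root = resolve_catapult_client_root({'CATAPULT_CLIENT_ROOT': '/opt/catapult'})
example_bin_directory = example_root / '_build' / 'bin'
```

The '/opt/catapult' path is purely illustrative. In a real build script, the helper would be called without arguments so that it reads os.environ.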

Next, we need to call set_source to point the builder to our code and its dependencies:

  1. We configure the name of the output python module to be _vrf (this name will be used in import statements in our python code).
  2. We include our shim header containing all C functions.
  3. We set up include and library directories relative to the catapult client root directory.
  4. We specify the name of the library containing our C functions (catapult.cvrf) to link against.

ffi_builder.set_source(
    '_vrf',
    r'''
        #include "VrfShim.h"
    ''',
    include_dirs=[
        str(catapult_client_root / 'examples' / 'vrfinterop' / 'cdll'),
        str(catapult_client_root / 'src')
    ],
    library_dirs=[str(catapult_default_bin_directory)],
    libraries=['catapult.cvrf'],
    extra_link_args=extra_link_args)

ℹī¸ On certain *nix operating systems, we need to additionally set the RPATH so that dynamic libraries can be found at run time. To do so, we can use the following code:

if 'Darwin' == os.uname().sysname:
    extra_link_args += ['-rpath', str(catapult_default_bin_directory)]
    boost_lib_bin_directory = os.environ.get('BOOST_BIN_DIRECTORY', None)
    if boost_lib_bin_directory:
        extra_link_args += ['-rpath', str(boost_lib_bin_directory)]

BOOST_BIN_DIRECTORY should be set to the directory containing the boost dynamic libraries if they are not in the same directory as the catapult dynamic libraries.
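The RPATH logic can also be factored into a small function, which makes it easy to check without a macOS machine. This is a sketch only: build_rpath_link_args is a hypothetical name and the directories are illustrative.

```python
def build_rpath_link_args(system_name, library_directories):
    """Returns -rpath linker arguments for macOS and an empty list elsewhere."""
    if 'Darwin' != system_name:
        return []

    link_args = []
    for directory in library_directories:
        if directory:  # skip unset (None or empty) directories
            link_args += ['-rpath', str(directory)]

    return link_args


macos_args = build_rpath_link_args('Darwin', ['/opt/catapult/_build/bin', None])
linux_args = build_rpath_link_args('Linux', ['/opt/catapult/_build/bin'])
```

In a build script, the first argument would come from os.uname().sysname and the second from the catapult bin directory plus the optional BOOST_BIN_DIRECTORY value.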

Then, we need to specify the structs and functions we want to make callable from python. In our case, we want all of our structs and functions callable. To do this, we need to specify their declarations in cdef calls. Two calls are needed because the functions are dependent on the structures (i.e. they have CVrfProof parameters):

ffi_builder.cdef('''
    struct CVrfProof {
        unsigned char Gamma[32];
        unsigned char VerificationHash[16];
        unsigned char Scalar[32];
    };
''')

ffi_builder.cdef('''
    void CatapultGenerateVrfProof(
            const unsigned char* alpha,
            unsigned int alphaSize,
            const unsigned char* privateKey,
            struct CVrfProof* vrfProof);

    void CatapultVerifyVrfProof(
            const struct CVrfProof* vrfProof,
            const unsigned char* alpha,
            unsigned int alphaSize,
            const unsigned char* publicKey,
            unsigned char* hash512);

    void CatapultGenerateVrfProofHash(const unsigned char* gamma, unsigned char* hash512);
''')

Finally, we need to set the default script action to compile the CFFI module.

if '__main__' == __name__:
    ffi_builder.compile(verbose=True)

Running the python file should produce a handful of _vrf files that can be imported from other python scripts.

Building

In order to build the _vrf interop dynamic library, run the following commands:

cd examples/vrfinterop
pip install -r requirements.txt
CATAPULT_CLIENT_ROOT=../.. python -m _cffi.vrf_build

On success, you should see generated files starting with _vrf.

Python 🐍

To prove that everything works, we will write a short python script that checks our wrapper against the VRF test vectors used to validate our client reference implementation. This is another good time to remind you to always use official test vectors!

For completeness, the test vectors are reproduced here:

from collections import namedtuple

TestCaseInput = namedtuple('TestCaseInput', ['private_key', 'alpha'])
TestCaseOutput = namedtuple('TestCaseOutput', ['gamma', 'verification_hash', 'scalar', 'beta'])
TestCase = namedtuple('TestCase', ['input', 'output'])

test_cases = [
    TestCase(
        TestCaseInput('9D61B19DEFFD5A60BA844AF492EC2CC44449C5697B326919703BAC031CAE7F60', ''),
        TestCaseOutput(
            '9275DF67A68C8745C0FF97B48201EE6DB447F7C93B23AE24CDC2400F52FDB08A',
            '1A6AC7EC71BF9C9C76E96EE4675EBFF6',
            '0625AF28718501047BFD87B810C2D2139B73C23BD69DE66360953A642C2A330A',
            'A64C292EC45F6B252828AFF9A02A0FE88D2FCC7F5FC61BB328F03F4C6C0657A9D26EFB23B87647FF54F71CD51A6FA4C4E31661D8F72B41FF00AC4D2EEC2EA7B3'
        )
    ),
    TestCase(
        TestCaseInput('4CCD089B28FF96DA9DB6C346EC114E0F5B8A319F35ABA624DA8CF6ED4FB8A6FB', '72'),
        TestCaseOutput(
            '84A63E74ECA8FDD64E9972DCDA1C6F33D03CE3CD4D333FD6CC789DB12B5A7B9D',
            '03F1CB6B2BF7CD81A2A20BACF6E1C04E',
            '59F2FA16D9119C73A45A97194B504FB9A5C8CF37F6DA85E03368D6882E511008',
            'CDDAA399BB9C56D3BE15792E43A6742FB72B1D248A7F24FD5CC585B232C26C934711393B4D97284B2BCCA588775B72DC0B0F4B5A195BC41F8D2B80B6981C784E'
        )
    ),
    TestCase(
        TestCaseInput('C5AA8DF43F9F837BEDB7442F31DCB7B166D38535076F094B85CE3A2E0B4458F7', 'af82'),
        TestCaseOutput(
            'ACA8ADE9B7F03E2B149637629F95654C94FC9053C225EC21E5838F193AF2B727',
            'B84AD849B0039AD38B41513FE5A66CDD',
            '2367737A84B488D62486BD2FB110B4801A46BFCA770AF98E059158AC563B690F',
            'D938B2012F2551B0E13A49568612EFFCBDCA2AED5D1D3A13F47E180E01218916E049837BD246F66D5058E56D3413DBBBAD964F5E9F160A81C9A1355DCD99B453'
        )
    ),
]
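Before wiring the vectors to the C functions, it can help to confirm that each hex string decodes to the size the C API expects (32, 16, 32 and 64 bytes), since a truncated vector would otherwise show up as a confusing assertion failure later. A minimal sketch, using a hypothetical find_size_mismatches helper and the first vector's outputs from above:

```python
from binascii import unhexlify
from collections import namedtuple

VectorOutput = namedtuple('VectorOutput', ['gamma', 'verification_hash', 'scalar', 'beta'])

# expected decoded sizes in bytes, matching CVrfProof and the 64 byte hash
EXPECTED_SIZES = {'gamma': 32, 'verification_hash': 16, 'scalar': 32, 'beta': 64}


def find_size_mismatches(output):
    """Returns the names of fields whose decoded byte length is unexpected."""
    return [
        name for name, size in EXPECTED_SIZES.items()
        if len(unhexlify(getattr(output, name))) != size
    ]


first_output = VectorOutput(
    '9275DF67A68C8745C0FF97B48201EE6DB447F7C93B23AE24CDC2400F52FDB08A',
    '1A6AC7EC71BF9C9C76E96EE4675EBFF6',
    '0625AF28718501047BFD87B810C2D2139B73C23BD69DE66360953A642C2A330A',
    'A64C292EC45F6B252828AFF9A02A0FE88D2FCC7F5FC61BB328F03F4C6C0657A9'
    'D26EFB23B87647FF54F71CD51A6FA4C4E31661D8F72B41FF00AC4D2EEC2EA7B3'
)
mismatches = find_size_mismatches(first_output)
```

An empty list means every field has the expected length.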

In addition, we'll add a helper function that takes a buffer and returns a hex string:

from binascii import hexlify, unhexlify

def to_hex_string(buffer):
    return hexlify(bytes(buffer)).upper().decode('utf8')
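A short usage sketch of the helper follows. Because it calls bytes() first, it accepts both bytes objects and iterables of small integers. The assumption here is that cffi exposes unsigned char arrays as iterables of integers, which is what lets the same helper work on the struct fields later on:

```python
from binascii import hexlify, unhexlify


def to_hex_string(buffer):
    return hexlify(bytes(buffer)).upper().decode('utf8')


# round trip: lowercase hex -> bytes -> uppercase hex
alpha = unhexlify('af82')
round_tripped = to_hex_string(alpha)

# any iterable of integers in the range 0-255 works too
from_ints = to_hex_string([0x1A, 0x6A])
```

Note that the output is always uppercase, matching the casing of the expected values in the test vectors.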

Importing our CFFI produced module is really easy. ffi is used for memory management / interop while lib contains our imported functions.

from _vrf import lib, ffi

CatapultGenerateVrfProof

We need to use ffi to create a CVrfProof instance that we can pass to the function. We can extract the alpha and private_key from the TestCaseInput.

alpha = unhexlify(test_case.input.alpha)
private_key = PrivateKey(test_case.input.private_key)
vrf_proof = ffi.new('struct CVrfProof *')
lib.CatapultGenerateVrfProof(alpha, len(alpha), private_key.bytes, vrf_proof)

Notice that the CVrfProof fields are directly mapped into python and can be accessed directly. Knowing this, we can compare them to the expected outputs in TestCaseOutput:

assert test_case.output.gamma == to_hex_string(vrf_proof.Gamma)
assert test_case.output.verification_hash == to_hex_string(vrf_proof.VerificationHash)
assert test_case.output.scalar == to_hex_string(vrf_proof.Scalar)

CatapultVerifyVrfProof

We need to derive the public key from the private key.
In addition, we need to create a bytes placeholder that will hold the output hash:

public_key = KeyPair(private_key).public_key
proof_hash = bytes(64)
lib.CatapultVerifyVrfProof(vrf_proof, alpha, len(alpha), public_key.bytes, proof_hash)

We can compare the proof hash to the expected proof hash (beta):

assert test_case.output.beta == to_hex_string(proof_hash)

CatapultGenerateVrfProofHash

The vrf_proof Gamma field can be passed directly. Once again, we need a bytes placeholder that will hold the output hash.

proof_hash_2 = bytes(64)
lib.CatapultGenerateVrfProofHash(vrf_proof.Gamma, proof_hash_2)

As above, we can compare the proof hash to the expected proof hash (beta):

assert test_case.output.beta == to_hex_string(proof_hash_2)

Running

cd examples/vrfinterop
python -m example

ℹī¸ If you get an error like Library not loaded: '@rpath/libboost_date_time.dylib, you will need to rebuild the CFFI dynamic library with the BOOST_BIN_DIRECTORY environment variable set. And, you're probably running a MacOS!

Postscript

Now, you've learned how to call catapult client functions from python! 🎊

Think about what you can do with this great new power! But, remember: