COS 460/540 - Computer Networks

Project 4: BitClone (Blocks)

Specification

For the next two projects we will be creating a lightweight, stripped down, [BitCoin][1] clone called BitClone. The project is broken into two parts and focuses on the networking protocols not the cryptocurrency portions. In this first of two assignments, you will create the peer to peer network protocol. In the second part you will implement the distributed database protocol.

The Peer to Peer (P2P) protocol you need to implement for this assignment includes:

  • (3) Peer discovery
    • by command line parameter
    • by saved peers from last execution
  • (3) Connection to peers
    • version and verack messages
  • (3) Peer maintenance
    • getaddr and addr messages
    • ping and pong messages
    • periodic advertising (every 30 seconds)
  • (3) Additional/Management features
    • reject message to report client errors
    • message message to report diagnostics
  • (4) Block Inventory and Exchange
    • getblockinv message to request a block inventory
    • blockinv message to advertise or report inventory of blocks
    • getblocks message to request blocks
    • block message to advertise or send a block
    • tx message to advertise a transaction
  • (4) Block and Transaction structures

NOTE: (3) indicates these are required for Project #3, (4) indicates they are required for Project #4.

Reference Materials

There are a number of documents that will be helpful in understanding the network protocols used by both BitCoin and our project BitClone.

Protocol Overview

The network protocol you will be implementing is based on but not the same as [BitCoin][1]’s P2P protocol. BitCoin uses binary data exchange where you will implement a text-based protocol over TCP (the same as your earlier projects). You should read the BitCoin Protocol documentation and the BitCoin Developer Guide to get an idea of how the P2P aspects of the network work. Some of that documentation is paraphrased below.

Our BitClone protocol is text-based and exchanges messages terminated with the carriage return/line feed pair. Fields within each message are delimited with the vertical bar or pipe character ( ). The delimiter character is thus not allowed in any data fields.

A sample message might look like the following:

message|field-1|2468|192.168.1.2

The BitClone protocol is a Peer to Peer protocol and is NOT stop and wait. That means, you might send a message to the remote peer, but the immediate response is for a different “request.” This means you will need to keep track of any outstanding requests from your peer to others and note when they are completed. The protocol helps in that it is mostly stateless and situations where this tracking is needed are few (specifically in the initial peer connection handshake).

On startup you will connect to other known peers and exchange version and verack messages. Once both peers have sent their verack the connection is established. There are both ‘inbound’ and ‘outbound’ connections, inbound are connections where others have connected to you and outbound are where you initiated the connection. The only significant difference is who is responsible for the initial version message. Once the connection is established (verack messages exchanged) there is no difference.

After the connection is established, the peers will begin to exchange peer addresses via the getaddr and addr messages. You will send getaddr to the remote peer and they will likely be sending one to you one at about the same time. The remote will respond to your getaddr with one or more addr messages, which you will use to update your internal list of peers. At the same time you will be responding to the remote peer’s getaddr messages with addr messages containing addresses you know of.

When you receive new peer information, you should keep connecting to new peers until you have five approximately (5) active connections. If you lose connectivity to one, choose another and connect to it, to keep your connections constant as close to five as possible. You may include both inbound and outbound connections in this count.

After the initial flurry of getaddr and addr messages you must, every 30 seconds, send ping requests to your connected peers to ensure they are still active (and respond with pong messages). If a peer is inactive for 90 seconds, drop it from your list of peers until you see them with a more recent timestamp. You should be receiving ping messages every 30 seconds from your connected peers as well.

Peer Discovery

There are three methods that your code must have to do discover peers; (1) as a command line parameter (2) by reading the saved state from the last program execution and (3) getting peer addresses from known peers. The first two are addressed here and the third below in Peer Maintenance

On startup, you will need to read from the command line and your saved file an ip-address:port for each known peer. Where ip-address is either an IPv4 or IPv6 address and port is the TCP port number the remote peer is listening on.

When your program exits (or during its execution), it should save it’s known active peers. This saved list of peers should also be used at program startup to pre-populate the known peers.

The order of peer connections is undefined.

Connection to Peers

Connecting to peers is done over TCP to the ip-address:port specified for each peer. When creating an outbound connection you are responsible for sending a version message as the initial action. The remote peer will send a verack message in response. The remote peer will also respond with it’s own version message and expect you to respond with a verack message. No other messages may be exchanged on the connection until both peers have sent and received version and verack messages successfully.

Version Message

The version message must contain:

  • sender’s version number, a number
    • 3 for project 3
    • 4 for project 4
  • sender’s services, a bitfield
    • 00000001: peer to peer connectivity services (project 3)
    • 00000010: block chain / block services (project 4)
    • 00000100: block chain mining services (mines new blocks - not used)
  • current time, a number (the number of seconds since Jan 1, 1970)
  • recepient’s IP address and port, an ip-address:port (x.x.x.x,y)
  • sender’s IP address and port, an ip-address:port (x.x.x.x:y)
  • nonce, random number to prevent self-connections, a number
  • user-agent string describing the peer’s software, string (your @maine.edu username)
  • a block identifier, a number (unused in project 3)

Example:

version|3|1|1507490964|192.168.1.10:8090|130.111.135.12:2098|8192|houser|9000

In this example, in order:

  • version is the command
  • the first 3 represents version 3 of the protocol for project 3
  • the second 1 represents the lowest bit in the service field being set, meaning this peer offers peer to peer connectivity services only (e.g. implements the features required for project 3)
  • the current time in seconds since Jan 1, 1970
  • the intended recipient’s ip address and port
  • the senders ip address and port (external ip address)
  • a randomly generated nonce number
  • a unique string to represent this peer, your @maine.edu account name (e.g. houser)
  • a block identifier that represents the highest (largest) block this peer knows about. This is unused in project 3.

Verack Message

The verack message is a response to a version request and should follow it in a reasonable amount of time. The verack message must contain the following:

  • the nonce from the associated version message, a number

Example:

verack|9000

Peer Maintenance

Once your peer is connected to another peer and has exchanged version and verack messages it will need to retrieve a list of known peers from others. It will also have to provide it’s list of known peers. The two messages for this exchange are getaddr and addr. Peers also maintain periodic contact (at least every 30 seconds) with the ping and pong messages if there is no other activity.

Getaddr Message

The getaddr message requests the remote peer send a list of it’s known peers. The getaddr message has no parameters.

Example:

getaddr

Addr Message

The addr message transmits a list of known peers. This is typically (but not required to be) in response to a getaddr message. The addr message must contain the following fields:

  • The number of peers contained in the message
  • A list of peers of the format:
    • last time the peer was active, a number (the number of seconds since Jan 1, 1970)
    • the IP address and port of the peer, an ip-address:port

Example:

addr|2|1507490964|192.168.1.10:2989|1507490964|130.111.135.1:8767

Wile you could send one peer per addr message and send multiple messages to remote peers in response to a getaddr message, this will be considered a protocol violation if done repeatedly and your peer will be dropped from the network. addr messages are designed to include multiple peers and should do so.

Ping Message

The ping message is used to maintain connectivity and verify the remote peer is still active. Peers need to maintain activity to avoid being expired from the network. Using the ping message and it’s counterpart pong perform this service.

The ping message must contain the following fields:

  • a unique nonce, a number (randomly generated)

Example:

ping|23456

Pong Message

The pong message is a response to a ping request and should follow it in a reasonable amount of time. The pong message must contain the following:

  • the nonce from the associated ping message, a number

Example:

pong|23456

Additional/Management features

The protocol also contains commands/messages to accomodate request failures or invalid messages and diagnostic information these are as follows.

Reject Message

The reject message is sent by a peer when it receives and invalid, incorrect, or otherwise indeciperable message or request. A peer may be dropped if a significant number of incorrect messages are received.

The reject message must contain the following fields:

  • a response code, a number
    • 400-499 for client errors, e.g. bad request, bad message, bad formatting
    • 500-599 for server errors, e.g. server inoperable, database problems, etc.
  • a reason for the rejection, a string
  • additional data to be displayed to the user, a string

Example:

reject|400|Invalid command|The command "send me flowers" was not recognized.

Message Message

The message message is a general diagnostic communication mechanisim. It is used to convey information that is not part of the formal protocol. Examples may include debugging and informational information.

The message message must contain the following fields:

  • a response code, a number
    • 100-199 for informational messages (e.g. debugging)
  • short version of the message, a string
  • additional data to be displayed to the user, a string

Example:

message|100|peer statistics|43 requests, 10 rejections, 76% correct

Get Block Inventory Message

The getblockinv message is used to request an inventory of blocks from a peer. The inventory represents just the hash values for the blocks a peer has, not the entire blocks.

The getblockinv message must contain the following fields:

  • the md5 digest (hash) of the starting block, a string
    • the genisis block’s digest is block 0, use this to start at the beginning.
  • the md5 digest (hash) of the ending block, a string
    • use 0 to represent the last known block

Example: Request all blocks since 83832391027a1f2f4d46ef882ff3a47d:

getblockinv|83832391027a1f2f4d46ef882ff3a47d|0

The blockinv message is used to advertise or send an inventory of blocks to a peer. The inventory represents just the hash values for the blocks a peer has, not the entire blocks. This message is also used to advertise new blocks that have been mined and should be added to the blockchain.

The blockinv message must contain the following fields:

  • the number of block hashses in the message, a number
  • the md5 digests (hashes) of the blocks, a string

Example:

blockinv|3|bdc71fd90757d7c309b452fbb5bfb922|7b8e7bebb57a5b0c8281cefcc60f4885|68fe92042df93dbed116801362fc1c55

Get Block Message

The getblocks message is used to request blocks from a peer.

The getblocks message must contain the following fields:

  • the number of block hashses in the message, a number
  • the md5 digests (hashes) of the blocks, a string

Example:

getblocks|3|bdc71fd90757d7c309b452fbb5bfb922|7b8e7bebb57a5b0c8281cefcc60f4885|68fe92042df93dbed116801362fc1c55

The block message is used to send a block to a peer. A miner will collect unconfirmed transactions and bundle them into a block, thus a block consists of a bundle of transactions (see tx).

The block message must contain the following fields:

  • the block number representing the index in the blockchain, a number
  • time the block was created, a number (the number of seconds since Jan 1, 1970)
  • a unique nonce, a number (randomly generated to meet the mining requirements)
  • the md5 digest (hash) of the previous block, a string
  • the md5 digest (hash) of the this block, a string
  • the number of transactions in the block
  • the transactions base64 encoded as one element

Example:

block|0|1511980666|42|0|fd6b2d5319fa40bd41f410bed54525f2|2|OTAyMTB8YmRjNzFmZDkwNzU3ZDdjMzA5YjQ1MmZiYjViZmI5MjJ8SXQgd2FzIHRoZSBiZXN0IG9mIHRpbWVzIGl0IHdhcyB0aGUgd29yc3Qgb2YgdGltZXMsDQo3ODIwMTB8NjQ0NmMyOGY5M2JiNDI4Y2I0ZjFmMmIyYTU5MzllMTB8aXQgd2FzIHRoZSBhZ2Ugb2Ygd2lzZG9tLCBpdCB3YXMgdGhlIGFnZSBvZiBmb29saXNobmVzcywNCg==

The tx message is used to advertise a transaction to the system. A miner will collect unconfirmed transactions and bundle them into a block. When the tx advertiser sees the transaction has been incorporated into a block, it becomes a confirmed transaction.

Transactions should be forwarded to your connected peers. Don’t forward it back to the peer you received it from.

The tx message must contain the following fields:

  • a unique nonce to identify the transaction, a number (randomly generated)
  • the md5 digest (hash) of the transaction, a string
  • the transaction message, a string

Example:

tx|90210|bdc71fd90757d7c309b452fbb5bfb922|It was the best of times it was the worst of times,

tx|782010|6446c28f93bb428cb4f1f2b2a5939e10|it was the age of wisdom, it was the age of foolishness,

NOTE: We are using MD5 rather than a ‘more secure’ hash like SHA256 just because it’s shorter and implementations are widely available. You do not have to write your own MD5. Use a library that’s already available.

Block Structure

Additional Information

  • The server must accept as a configurable parameter (on the command line) the port number to listen on.
  • You must include the file named PLAYBOOK.md
  • PLAYBOOK.md has Your name
  • PLAYBOOK.md has what language you used
  • PLAYBOOK.md has a brief synopsis of your experience with the assignment (1-3 paragraphs).
  • PLAYBOOK.md has how to compile and execute your project.
  • You must not include any executable binary files. Submit only code.
  • You may include a script or batch file to compile and/or run your server. This must be documented in your PLAYBOOK.md if it is included.
  • You must submit your project through GitHub

Definitions

The IETF Best Practice Document RFC 2119 Key words for use in RFCs to Indicate Requirement Levels defines several keywords that are used in this assignment and throughout the course. Pay special attention to where they appear in the assignment.

Some keywords used in this assignment are as follows;

MUST: This word, or the terms REQUIRED or SHALL, mean that the definition is an absolute requirement of the specification.

SHOULD: This word, or the adjective RECOMMENDED, mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be under.

MAY: This word, or the adjective OPTIONAL, mean that an item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item.