Specification

For the next two projects we will be creating a lightweight, stripped down, [BitCoin][1] clone called BitClone. The project is broken into two parts and focuses on the networking protocols not the cryptocurrency portions. In this first of two assignments, you will create the peer to peer network protocol. In the second part you will implement the distributed database protocol.

You may work in teams of two on this and the next project. Please email or talk to me if you want to do this and we can setup GitHub to make it happen.

The Peer to Peer (P2P) protocol you need to implement for this assignment includes:

  • Peer discovery
    • by command line parameter
    • by saved peers from last execution
  • Connection to peers
    • version and verack messages
  • Peer maintenance
    • getaddr and addr messages
    • ping and pong messages
    • periodic advertising (every 30 seconds)
  • Additional/Management features
    • reject message to report client errors
    • message message to report diagnostics

Reference Materials

There are a number of documents that will be helpful in understanding the network protocols used by both BitCoin and our project BitClone.

Protocol Overview

The network protocol you will be implementing is based on but not the same as [BitCoin][1]’s P2P protocol. BitCoin uses binary data exchange where you will implement a text-based protocol over TCP (the same as your earlier projects). You should read the BitCoin Protocol documentation and the BitCoin Developer Guide to get an idea of how the P2P aspects of the network work. Some of that documentation is paraphrased below.

Our BitClone protocol is text-based and exchanges messages terminated with the carriage return/line feed pair. Fields within each message are delimited with the vertical bar or pipe character ( ). The delimiter character is thus not allowed in any data fields.

A sample message might look like the following:

message|field-1|2468|192.168.1.2

The BitClone protocol is a Peer to Peer protocol and is NOT stop and wait. That means, you might send a message to the remote peer, but the immediate response is for a different “request.” This means you will need to keep track of any outstanding requests from your peer to others and note when they are completed. The protocol helps in that it is mostly stateless and situations where this tracking is needed are few (specifically in the initial peer connection handshake).

On startup you will connect to other known peers and exchange version and verack messages. Once both peers have sent their verack the connection is established. There are both ‘inbound’ and ‘outbound’ connections, inbound are connections where others have connected to you and outbound are where you initiated the connection. The only significant difference is who is responsible for the initial version message. Once the connection is established (verack messages exchanged) there is no difference.

After the connection is established, the peers will begin to exchange peer addresses via the getaddr and addr messages. You will send getaddr to the remote peer and they will likely be sending one to you one at about the same time. The remote will respond to your getaddr with one or more addr messages, which you will use to update your internal list of peers. At the same time you will be responding to the remote peer’s getaddr messages with addr messages containing addresses you know of.

When you receive new peer information, you should keep connecting to new peers until you have five approximately (5) active connections. If you lose connectivity to one, choose another and connect to it, to keep your connections constant as close to five as possible. You may include both inbound and outbound connections in this count.

After the initial flurry of getaddr and addr messages you must, every 30 seconds, send ping requests to your connected peers to ensure they are still active (and respond with pong messages). If a peer is inactive for 90 seconds, drop it from your list of peers until you see them with a more recent timestamp. You should be receiving ping messages every 30 seconds from your connected peers as well.

Peer Discovery

There are three methods that your code must have to do discover peers; (1) as a command line parameter (2) by reading the saved state from the last program execution and (3) getting peer addresses from known peers. The first two are addressed here and the third below in Peer Maintenance

On startup, you will need to read from the command line and your saved file an ip-address:port for each known peer. Where ip-address is either an IPv4 or IPv6 address and port is the TCP port number the remote peer is listening on.

When your program exits (or during its execution), it should save it’s known active peers. This saved list of peers should also be used at program startup to pre-populate the known peers.

The order of peer connections is undefined.

Connection to Peers

Connecting to peers is done over TCP to the ip-address:port specified for each peer. When creating an outbound connection you are responsible for sending a version message as the initial action. The remote peer will send a verack message in response. The remote peer will also respond with it’s own version message and expect you to respond with a verack message. No other messages may be exchanged on the connection until both peers have sent and received version and verack messages successfully.

Version Message

The version message must contain:

  • sender’s version number, a number
    • 3 for project 3
    • 4 for project 4
  • sender’s services, a bitfield
    • 00000001: peer to peer connectivity services (project 3)
    • 00000010: block chain / block services (project 4)
    • 00000100: block chain mining services (mines new blocks - not used)
  • current time, a number (the number of seconds since Jan 1, 1970)
  • recepient’s IP address and port, an ip-address:port (x.x.x.x,y)
  • sender’s IP address and port, an ip-address:port (x.x.x.x:y)
  • nonce, random number to prevent self-connections, a number
  • user-agent string describing the peer’s software, string (your @maine.edu username)
  • a block identifier, a number (unused in project 3)

Example:

version|3|1|1507490964|192.168.1.10:8090|130.111.135.12:2098|8192|houser|9000

In this example, in order:

  • version is the command
  • the first 3 represents version 3 of the protocol for project 3
  • the second 1 represents the lowest bit in the service field being set, meaning this peer offers peer to peer connectivity services only (e.g. implements the features required for project 3)
  • the current time in seconds since Jan 1, 1970
  • the intended recipient’s ip address and port
  • the senders ip address and port (external ip address)
  • a randomly generated nonce number
  • a unique string to represent this peer, your @maine.edu account name (e.g. houser)
  • a block identifier that represents the highest (largest) block this peer knows about. This is unused in project 3.

Verack Message

The verack message is a response to a version request and should follow it in a reasonable amount of time. The verack message must contain the following:

  • the nonce from the associated version message, a number

Example:

verack|9000

Peer Maintenance

Once your peer is connected to another peer and has exchanged version and verack messages it will need to retrieve a list of known peers from others. It will also have to provide it’s list of known peers. The two messages for this exchange are getaddr and addr. Peers also maintain periodic contact (at least every 30 seconds) with the ping and pong messages if there is no other activity.

Getaddr Message

The getaddr message requests the remote peer send a list of it’s known peers. The getaddr message has no parameters.

Example:

getaddr

Addr Message

The addr message transmits a list of known peers. This is typically (but not required to be) in response to a getaddr message. The addr message must contain the following fields:

  • The number of peers contained in the message
  • A list of peers of the format:
    • last time the peer was active, a number (the number of seconds since Jan 1, 1970)
    • the IP address and port of the peer, an ip-address:port

Example:

addr|2|1507490964|192.168.1.10:2989|1507490964|130.111.135.1:8767

Wile you could send one peer per addr message and send multiple messages to remote peers in response to a getaddr message, this will be considered a protocol violation if done repeatedly and your peer will be dropped from the network. addr messages are designed to include multiple peers and should do so.

Ping Message

The ping message is used to maintain connectivity and verify the remote peer is still active. Peers need to maintain activity to avoid being expired from the network. Using the ping message and it’s counterpart pong perform this service.

The ping message must contain the following fields:

  • a unique nonce, a number (randomly generated)

Example:

ping|23456

Pong Message

The pong message is a response to a ping request and should follow it in a reasonable amount of time. The pong message must contain the following:

  • the nonce from the associated ping message, a number

Example:

pong|23456

Additional/Management features

The protocol also contains commands/messages to accomodate request failures or invalid messages and diagnostic information these are as follows.

Reject Message

The reject message is sent by a peer when it receives and invalid, incorrect, or otherwise indeciperable message or request. A peer may be dropped if a significant number of incorrect messages are received.

The reject message must contain the following fields:

  • a response code, a number
    • 400-499 for client errors, e.g. bad request, bad message, bad formatting
    • 500-599 for server errors, e.g. server inoperable, database problems, etc.
  • a reason for the rejection, a string
  • additional data to be displayed to the user, a string

Example:

reject|400|Invalid command|The command "send me flowers" was not recognized.

Message Message

The message message is a general diagnostic communication mechanisim. It is used to convey information that is not part of the formal protocol. Examples may include debugging and informational information.

The message message must contain the following fields:

  • a response code, a number
    • 100-199 for informational messages (e.g. debugging)
  • short version of the message, a string
  • additional data to be displayed to the user, a string

Example:

message|100|peer statistics|43 requests, 10 rejections, 76% correct

Additional Information

  • The server must accept as a configurable parameter (on the command line) the port number to listen on.
  • You must include the file named PLAYBOOK.md
  • PLAYBOOK.md has Your name
  • PLAYBOOK.md has what language you used
  • PLAYBOOK.md has a brief synopsis of your experience with the assignment (1-3 paragraphs).
  • PLAYBOOK.md has how to compile and execute your project.
  • You must not include any executable binary files. Submit only code.
  • You may include a script or batch file to compile and/or run your server. This must be documented in your PLAYBOOK.md if it is included.
  • You must submit your project through GitHub

Definitions

The IETF Best Practice Document RFC 2119 Key words for use in RFCs to Indicate Requirement Levels defines several keywords that are used in this assignment and throughout the course. Pay special attention to where they appear in the assignment.

Some keywords used in this assignment are as follows;

MUST: This word, or the terms REQUIRED or SHALL, mean that the definition is an absolute requirement of the specification.

SHOULD: This word, or the adjective RECOMMENDED, mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be under.

MAY: This word, or the adjective OPTIONAL, mean that an item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item.