The Python implementation of the libp2p networking stack 🐍 [under development]


py-libp2p

WARNING

py-libp2p is an experimental and work-in-progress repo under heavy development. We do not yet recommend using py-libp2p in production environments.

Sponsorship

This project is graciously sponsored by the Ethereum Foundation through Wave 5 of their Grants Program.

Maintainers

The py-libp2p team consists of:

@zixuanzh @alexh @stuckinaboot @robzajac

Development

py-libp2p requires Python 3.7. The best way to guarantee a clean Python 3.7 environment is with virtualenv:

virtualenv -p python3.7 venv
. venv/bin/activate
pip3 install -r requirements_dev.txt
python setup.py develop

Testing

After installing our requirements (see above), you can run the test suite:

cd tests
pytest

Note that tests/libp2p/test_libp2p.py contains an end-to-end messaging test between two libp2p hosts, which is the bulk of our proof of concept.

Quickstart Guide

This quickstart guide shows how to get py-libp2p up and running and how to take advantage of its various features. Since libp2p is, at its core, a distributed systems library, this guide explains each concept using two libp2p nodes, labelled node1 and node2.

Creating a libp2p node

A libp2p node, at a high level, is used for connecting to and communicating with peers running libp2p. Connecting to a peer means opening a connection over which communication via streams (discussed in a later section) can take place.

First, we create a libp2p instance on node1 and start listening on port 8000:

node1
-----
import multiaddr
from libp2p import new_node

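# These calls must run inside a coroutine (for example, one driven by asyncio).
# The instance is named node1 so that it matches the later node1 snippets.
node1 = await new_node()
await node1.get_network().listen(multiaddr.Multiaddr("/ip4/127.0.0.1/tcp/8000"))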

Then, we create a libp2p instance on node2 and start listening on port 8001:

node2
-----
import multiaddr
from libp2p import new_node

# The instance is named node2 so that it matches the later node2 snippets.
node2 = await new_node()
await node2.get_network().listen(multiaddr.Multiaddr("/ip4/127.0.0.1/tcp/8001"))

Now, we have two libp2p nodes. Next, let's make them connect.

Connecting two libp2p nodes

In order to connect node1 to node2, node1 must have node2 as a peer, so that node1 knows who it is connecting to. So, let's add node2 as a peer of node1. TODO: discuss p2p IDs

node1
-----
from libp2p.peer.peerinfo import info_from_p2p_addr

# Add node2 as a peer of node1
addr_of_node2 = multiaddr.Multiaddr("/ip4/127.0.0.1/tcp/8001/p2p/TODO")
info_node2 = info_from_p2p_addr(addr_of_node2)

# Connect node1 to node2
await node1.connect(info_node2)
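
To make the address format used above more concrete, here is a minimal sketch in plain Python that splits a p2p multiaddr string into its transport address and its peer ID. It uses neither the multiaddr package nor py-libp2p. The helper name `split_p2p_addr` and the peer ID `QmExamplePeerId` are hypothetical and exist only for illustration; `info_from_p2p_addr` itself may parse and validate addresses differently.

```python
def split_p2p_addr(addr: str):
    """Split '/ip4/.../tcp/.../p2p/<peer-id>' into (transport_addr, peer_id)."""
    parts = addr.strip("/").split("/")
    # The last two components are expected to be the 'p2p' tag and the peer ID.
    if len(parts) < 4 or parts[-2] != "p2p" or not parts[-1]:
        raise ValueError(f"not a p2p multiaddr: {addr!r}")
    transport_addr = "/" + "/".join(parts[:-2])
    return transport_addr, parts[-1]


# 'QmExamplePeerId' is a made-up placeholder, not a real peer ID.
transport, peer_id = split_p2p_addr("/ip4/127.0.0.1/tcp/8001/p2p/QmExamplePeerId")
print(transport)  # /ip4/127.0.0.1/tcp/8001
print(peer_id)    # QmExamplePeerId
```

The transport part tells node1 where to dial, and the peer ID tells it who it expects to find there.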

Streams between two peers

The central component of the libp2p paradigm is the stream, which represents a channel for communication over an underlying connection. There can be multiple streams over the same underlying connection. Here, we will go over how to set up two nodes to communicate over streams. When we create a new stream, we need to specify which protocols we would like to communicate over. For the opposing node to accept the new stream request, it needs to support at least one of the protocols specified in the new stream creation message.

First, node2 creates a stream handler for each protocol it supports. The stream handler is invoked when a new stream, initiated by an outside node, is successfully created on that particular protocol. For now, node2 will only support '/foo/1.0.0'.

node2
-----
async def stream_handler(stream):
    read_data = await stream.read()
    read_str = read_data.decode()

    # Print read_str
    print(read_str)

node2.set_stream_handler("/foo/1.0.0", stream_handler)
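
A handler like the one above can be tried out without any networking by passing it an in-memory stand-in for a stream. The sketch below assumes only that a stream exposes awaitable `read()` and `write()` methods, as used in this guide. `FakeStream` is a hypothetical helper, not part of py-libp2p, and the handler is a variant that records and echoes the text instead of printing it.

```python
import asyncio


class FakeStream:
    """In-memory stand-in for a stream; hypothetical, not part of py-libp2p."""

    def __init__(self, incoming: bytes):
        self._incoming = incoming
        self.written = []

    async def read(self) -> bytes:
        return self._incoming

    async def write(self, data: bytes) -> int:
        self.written.append(data)
        return len(data)


received = []


async def stream_handler(stream):
    # Same shape as the handler above, but records the text and echoes it back.
    read_str = (await stream.read()).decode()
    received.append(read_str)
    await stream.write(read_str.upper().encode())


stream = FakeStream("I <3 libp2p".encode())
asyncio.run(stream_handler(stream))
print(received)        # ['I <3 libp2p']
print(stream.written)  # [b'I <3 LIBP2P']
```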

Next, node1 creates a new stream to node2 by specifying node2's peer ID and the list of protocols node1 is willing to use with node2. Since node2 only has a stream handler for '/foo/1.0.0', node1 and node2 will agree to communicate over '/foo/1.0.0'.

node1
-----
supported_protocols = ["/foo/1.0.0", "/bar/1.0.0"]
# node2's peer ID is taken from the PeerInfo built when connecting (info_node2)
stream = await node1.new_stream(info_node2.peer_id, supported_protocols)

# Print out protocol id so we can see which protocol we will be communicating over
print(stream.protocol_id)

# Write data to stream
encoded_str = "I <3 libp2p".encode()
await stream.write(encoded_str)

Woohoo! We have successfully written data from node1 to node2. Streams can be used for much more complex communication than just prints. Also, node1 and node2 can open many streams to each other using this same code but with different supported protocols and stream handlers.
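
The outcome of the protocol agreement described above can be modelled in a few lines. This is a simplified sketch, not the multiselect implementation: it skips the messages exchanged over the wire and only shows which protocol is chosen, under the assumption that offered protocols are tried in order. The function `select_protocol` and the variable names are hypothetical.

```python
def select_protocol(offered, supported):
    """Return the first offered protocol the other side supports, else None."""
    for protocol_id in offered:
        if protocol_id in supported:
            return protocol_id
    return None


node2_handlers = {"/foo/1.0.0"}                  # protocols node2 has handlers for
offered_by_node1 = ["/foo/1.0.0", "/bar/1.0.0"]  # protocols node1 proposes

print(select_protocol(offered_by_node1, node2_handlers))  # /foo/1.0.0
print(select_protocol(["/baz/1.0.0"], node2_handlers))    # None
```

When no offered protocol is supported, there is nothing to agree on and the stream request cannot be accepted.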

Note: In order for node2 to open a stream to node1, node2 must have node1 as a peer (remember, we only added node2 as a peer of node1 earlier). -- TODO: update this statement if we change this

Floodsub between two peers

Feature Breakdown

py-libp2p aims for conformity with the standard libp2p modules. Below is a breakdown of the modules we have developed, are developing, and may develop in the future.

Legend: 🍏 Done 🍋 In Progress 🍅 Missing 🌰 Not planned

libp2p Node	Status
`libp2p`	🍏

Identify Protocol	Status
`Identify`	🍅

Transport Protocols	Status
`TCP`	🍋 tests
`UDP`	🍅
`WebSockets`	🍅
`UTP`	🍅
`WebRTC`	🍅
`SCTP`	🌰
`Tor`	🌰
`i2p`	🌰
`cjdns`	🌰
`Bluetooth LE`	🌰
`Audio TP`	🌰
`Zerotier`	🌰
`QUIC`	🌰

Stream Muxers	Status
`multiplex`	🍋 tests
`yamux`	🍅
`benchmarks`	🌰
`muxado`	🌰
`spdystream`	🌰
`spdy`	🌰
`http2`	🌰
`QUIC`	🌰

Protocol Muxers	Status
`multiselect`	🍏

Switch (Swarm)	Status
`Switch`	🍋 tests
`Dialer stack`	🌰

Peer Discovery	Status
`bootstrap list`	🍏
`Kademlia DHT`	🍅
`mDNS`	🍅
`PEX`	🌰
`DNS`	🌰

Content Routing	Status
`Kademlia DHT`	🍅
`floodsub`	🍅
`gossipsub`	🍅
`PHT`	🌰

Peer Routing	Status
`Kademlia DHT`	🍅
`floodsub`	🍅
`gossipsub`	🍅
`PHT`	🌰

NAT Traversal	Status
`nat-pmp`	🍅
`upnp`	🍅
`ext addr discovery`	🌰
`STUN-like`	🌰
`line-switch relay`	🌰
`pkt-switch relay`	🌰

Exchange	Status
`HTTP`	🌰
`Bitswap`	🌰
`Bittorrent`	🌰

Consensus	Status
`Paxos`	🌰
`Raft`	🌰
`PBFT`	🌰
`Nakamoto`	🌰

Explanation of Basic Two Node Communication

Core Concepts

(non-normative, useful for team notes, not a reference)

Several components of the libp2p stack take part when establishing a connection between two nodes:

Host: a node in the libp2p network.
Connection: the layer 3 connection between two nodes in a libp2p network.
Transport: the component that creates a Connection, e.g. TCP, UDP, QUIC, etc.
Streams: an abstraction on top of a Connection representing parallel conversations about different matters, each of which is identified by a protocol ID. Multiple streams are layered on top of a Connection via the Multiplexer.
Multiplexer: a component that is responsible for wrapping messages sent on a stream with an envelope that identifies the stream they pertain to, normally via an ID. The multiplexer on the other end unwraps the message and routes it internally based on the stream identification.
Secure channel: optionally establishes a secure, encrypted, and authenticated channel over the Connection.
Upgrader: a component that takes a raw layer 3 connection returned by the Transport, and performs the security and multiplexing negotiation to set up a secure, multiplexed channel on top of which Streams can be opened.

Communication between two hosts X and Y

(non-normative, useful for team notes, not a reference)

Initiate the connection: A host is simply a node in the libp2p network that is able to communicate with other nodes in the network. In order for X and Y to communicate with one another, one of the hosts must initiate the connection. Let's say that X is going to initiate the connection. X will first open a connection to Y. This connection is where all of the actual communication will take place.

Communication over one connection with multiple protocols: X and Y can communicate over the same connection using different protocols. The multiplexer routes messages for a given protocol to the handler function registered for that protocol, which lets each host handle different protocols with separate functions. Furthermore, multiple streams can be opened for a given protocol, so the same protocol and the same underlying connection can carry communication about separate topics between X and Y.

Why use multiple streams?: Running multiple streams over the same connection avoids the overhead of opening multiple connections between X and Y. In order for X and Y to differentiate between messages on different streams and different protocols, a multiplexer encodes each message before it is sent and decodes each message when it is received. The multiplexer encodes a message by prepending a header that contains the stream id (along with some other info). The message is then sent across the raw connection, and the receiving host uses its multiplexer to decode it, i.e. to determine which stream id the message should be routed to.
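
The header-based encoding described above can be illustrated with a toy framing scheme. This is a sketch under stated assumptions, not the wire format of the multiplex muxer: the header layout (a 4-byte stream id followed by a 2-byte payload length) and the names `encode_frame` and `decode_frames` are invented for this example.

```python
import struct

# Assumed toy header: stream id (4 bytes) + payload length (2 bytes), big-endian.
HEADER = struct.Struct(">IH")


def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Prepend the header so the receiver knows which stream the payload is for."""
    return HEADER.pack(stream_id, len(payload)) + payload


def decode_frames(data: bytes):
    """Yield (stream_id, payload) pairs; assumes data holds only whole frames."""
    offset = 0
    while offset + HEADER.size <= len(data):
        stream_id, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        yield stream_id, data[offset:offset + length]
        offset += length


# Two streams interleaved on one "connection" (a plain bytes buffer here).
wire = encode_frame(1, b"hello") + encode_frame(2, b"ping") + encode_frame(1, b" world")

per_stream = {}
for stream_id, payload in decode_frames(wire):
    per_stream[stream_id] = per_stream.get(stream_id, b"") + payload

print(per_stream)  # {1: b'hello world', 2: b'ping'}
```

Even though the frames for stream 1 and stream 2 are interleaved on the wire, the stream id in each header lets the receiver reassemble every conversation separately.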