Derecho: Group communication at the speed of light

Version 0.9.1

Derecho is a new distributed system out of Cornell University.

Running the Demo

Now that you’ve finished building the demo, let’s get into the test-2-nodes directory and try it out!

$ cd ~/sospdemo-0.9/test-2-nodes/

We’re going to launch two Derecho server processes to handle our requests, and use one gRPC client to issue requests to the servers. To make our lives easier, let’s do that in tmux. If you’re not familiar with tmux or would rather just have three different terminals, feel free to skip this part and just open more terminals!

$ tmux new-session \; split-window -h \; split-window -v \; select-layout tiled \; setw mouse on

This monster command launches tmux with three panes in a tiled layout (the steps below call them windows). It also turns on mouse support, which (for reasons passing understanding) is not enabled by default.

Let’s get this show on the road! In the top-left window, go ahead and cd n0; ./server. In the top-right window, run cd n1; ./server. In the bottom window, run cd client.
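If you would rather script the whole layout than type into each window, something like the following sketch could work. Only the directory names and ./server come from the steps above; the session name derecho-demo, the function name launch_demo, and the TMUX_CMD override are made up for illustration. It also assumes tmux's default numbering, where windows and panes start at 0.

```shell
# Sketch: build the three-pane layout and start both servers in one go.
# Assumptions (not part of the demo): session name "derecho-demo", the
# TMUX_CMD override for dry runs, and default window/pane numbering from 0.
launch_demo() {
    tmux_cmd="${TMUX_CMD:-tmux}"
    session="derecho-demo"

    "$tmux_cmd" new-session -d -s "$session"      # detached session, pane 0
    "$tmux_cmd" split-window -h -t "$session"     # pane 1, to the right
    "$tmux_cmd" split-window -v -t "$session"     # pane 2, below pane 1
    "$tmux_cmd" select-layout -t "$session" tiled
    "$tmux_cmd" set-option -t "$session" mouse on

    # C-m sends Enter, so each command runs as if it had been typed.
    "$tmux_cmd" send-keys -t "$session:0.0" 'cd n0; ./server' C-m
    "$tmux_cmd" send-keys -t "$session:0.1" 'cd n1; ./server' C-m
    "$tmux_cmd" send-keys -t "$session:0.2" 'cd client' C-m

    "$tmux_cmd" attach-session -t "$session"
}
```

Call launch_demo from the test-2-nodes directory. Running it as TMUX_CMD=echo launch_demo prints the tmux commands instead of executing them, which is a cheap way to check the plan first.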

Running the servers

Now let’s wait for a bit, until the top windows have printed out something like:

Node N using core M
executing 'taskset --cpu-list 0 ../../build/src/sospdemo server'
[21:07:05.579077] [derecho_debug] [Thread 031992] [info] Derecho library running version 0.0.0 + 0 commits
//////////////////////////////////////////
FunctionTier listening on 127.0.0.1:28000
//////////////////////////////////////////
Finished constructing derecho group.
Press ENTER to stop.

Or something like:

Node N using core M
executing 'taskset --cpu-list 0 ../../build/src/sospdemo server'
[16:23:15.385432] [derecho_debug] [Thread 031372] [info] Derecho library running version 0.0.0 + 0 commits
Finished constructing derecho group.
Press ENTER to stop.

The outputs differ because the two servers play different roles: one acts as the function tier, receiving external RPC requests, while the other acts as a machine-learning backend, in our case using models to recognize images.

Running the client

Once our servers are up and happy, let’s go ahead and use the client to install the model. Switch your focus to the bottom terminal, the one whose working directory is the client folder. First, let’s run ls to make sure the models have been unpacked into your client directory:

$ ls
derecho.cfg  flower-model  flower-model.tar.gz  pet-model  pet-model.tar.gz  sospdemo

If you don’t see this, then you probably need to get the model files and unpack them into this directory. Instructions are back at Getting the files.
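The same check can be sketched as a small shell function. The two directory names are taken from the ls output above; the function name missing_models is made up for this example.

```shell
# Sketch: list the model directories missing from the current directory.
# Directory names come from the ls output above; the function name is
# hypothetical.
missing_models() {
    for model in flower-model pet-model; do
        # Print the name only when the directory is absent.
        [ -d "$model" ] || echo "$model"
    done
}
```

Run missing_models from the client directory; no output means both models are unpacked. If a directory is missing but its .tar.gz is present, tar -xzf on that archive should restore it, assuming the archive holds a top-level directory of the same name.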

Let’s start off by checking our command usage:

$ ./sospdemo
Usage:./sospdemo <mode> <mode specific args>
The mode could be one of the following:
    client - the web client.
    server - the server node. Configuration file determines if this is a categorizer tier node or a function tier server. 
1) to start a server node:
    ./sospdemo server 
2) to perform inference: 
    ./sospdemo client <function-tier-node> inference <tags> <photo>
    tags could be a single tag or multiple tags like 1,2,3,...
3) to install a model: 
    ./sospdemo client <function-tier-node> installmodel <tag> <synset> <symbol> <params>
4) to remove a model: 
    ./sospdemo client <function-tier-node> removemodel <tag>

Looks like the first thing we’ll want to do is install a model. Let’s start with the flower-model:

$ ./sospdemo client 127.0.0.1:28000 installmodel 1 flower-model/*synset* flower-model/*symbol* flower-model/*params*

Use function tier node: 127.0.0.1:28000
return code:0
description:install model successfully.
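The install command above relies on the shell expanding each glob (*synset*, *symbol*, *params*) to exactly one file; if a pattern matches nothing or several files, the client receives the wrong arguments. A minimal sketch of a guard for that, with the helper name one_match made up for illustration:

```shell
# Hypothetical helper: print the single file matching a glob pattern,
# or fail if the pattern matches zero or several files.
one_match() {
    # Left unquoted on purpose so the shell expands the pattern.
    set -- $1
    if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
        echo "expected exactly one match, got: $*" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# Example use (run from the client directory):
#   synset=$(one_match 'flower-model/*synset*') || exit 1
```

Quote the pattern when calling the helper so that it, rather than the calling shell, performs the expansion.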

Once that’s installed, we can start identifying some flowers!

$ ./sospdemo client 127.0.0.1:28000 inference 1 flower-model/flower-1.jpg

Use function tier node: 127.0.0.1:28000
photo description:rose

Looks like it works! If at any point these commands hang for more than about a minute, try killing the server processes and starting over. We’re running Derecho under extremely reduced resources, which means that sometimes we’ll hit our physical limits (mostly internal timeouts or failed allocations). In a production setting, this would crash the replica; in our demo setting, it’ll try to soldier on despite that, which only sometimes works.
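One way to spot a hang without watching the clock is to wrap client commands in the coreutils timeout utility. The sketch below assumes GNU timeout is installed; the helper name run_with_limit is made up, and the 60-second limit in the usage comment simply mirrors the "about a minute" rule of thumb above.

```shell
# Hypothetical helper: run a command, giving up after a time limit.
# Assumes GNU coreutils `timeout`, which exits with status 124 when the
# limit is reached.
run_with_limit() {
    limit="$1"
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s; restart the servers and try again" >&2
    fi
    return "$status"
}

# Example use:
#   run_with_limit 60 ./sospdemo client 127.0.0.1:28000 inference 1 flower-model/flower-1.jpg
```

The helper only reports the hang; the servers still need to be stopped and relaunched by hand.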

What did we just do?

Congratulations! Your computer has now successfully run the demo code. That’s all we’ll be using this website for; join us at the Derecho tutorial at SOSP on October 27th at about 4:30pm to learn more!

Last updated on 22 Oct 2019
Published on 22 Oct 2019