Sunday, December 15, 2024

Distributed computing made easy




What is distributed computing
Distributed computing is two or more servers communicating for a common purpose. Typically, some tasks are divvied up between a number of computers, and they all work together to accomplish it. Note that "separate servers" may mean physically separate computers. It may also mean virtual servers such as Virtual Private Servers (VPS) or containers, that may share the same physical hardware, though they appear as separate computers on the network.

There are many reasons why you might need this kind of setup. It may be that resources needed to complete the task aren't all on a single computer. For instance, your application may rely on multiple databases, each residing on a different computer. Or, you may need to distribute requests to your application because a single computer isn't enough to handle them all at the same time. In other cases, you are using remote services (like a REST API-based for instance), and those by nature reside somewhere else.

In any case, the computers comprising your distributed system may be on a local network, or they may be worldwide, or some combination of those. The throughput (how many bytes per second can be exchanged via network) and latency (how long it takes for a packet to travel via network) will obviously vary: for a local network you'd have a higher throughput and lower latency, and for Internet servers it will be the opposite. Plan accordingly based on the quality of service you'd expect.
How servers communicate
Depending on your network(s) setup, different kinds of communication are called for. If two servers reside on a local network, then they would typically used the fastest possible means of communication. A local network typically means a secure network, because nobody else has access to it but you. So you would not need TSL/SSL or any other kind of secure protocol as that would just slow things down.

If two servers are on the Internet though, then you must use a secure protocol (like TSL/SSL or some other) because your communication may be spied on, or worse, affected by man-in-the-middle attacks.
Local network distributed computing
Most of the time, your distributed system would be on a local network. Such network may be separate and private in a physical sense, or (more commonly) in a virtual sense, where some kind of a Private Cloud Network is established for you by the Cloud provider. It's likely that separation is enforced by specialized hardware (such as routers and firewalls) and secure protocols that keep networks belonging to different customers separate. This way, a "local" network can be established even if computers on it are a world apart, though typically they reside as a part of a larger local network.

Either way, as far as your application is concerned, you are looking at a local network. Thus, the example here will be for such a case, as it's most likely what you'll have. A local network means different parts of your application residing on different servers will use some efficient protocol based on TCP/IP. One such protocol is FastCGI, a high-performance binary protocol for communication between servers, clients, and in general programs of all kinds, and that's the one used by Gliimly. So in principle, the setup will look like this (there'll be more details later):


Next, in theory you should have two servers, however in this example both servers will be on the same localhost (i.e. "127.0.0.1"). This is just for simplicity; the code is exactly the same if you have two different servers on a local network - simply use another IP (such as "192.168.0.15" for instance) for your "remote" server instead of local "127.0.0.1". The two servers do not even necessarily need to be physically two different computers. You can start a Virtual Machine (VM) on your computer and host another virtual computer there. Popular free software like VirtualBox or KVM Hypervisor can help you do that.

In any case, in this example you will start two simple application servers; they will communicate with one another. The first one will be called "local" and the other one "remote" server. The local application server will make a request to the remote one.
Local server
On a local server, create a new directory for your local application server source code:
mkdir ~/local_server
cd ~/local_server

and then create a new file "status.gliim" with the following:
 begin-handler /status public
     silent-header
     get-param server
     get-param days

     pf-out "/server/remote-status/days=%s", days to payload
     pf-out "%s:3800", server to srv_location

     new-remote srv location srv_location \
         method "GET" url-path payload \
         timeout 30

     call-remote srv
     read-remote srv data dt
     @Output is: [<<p-out dt>>]
 end-handler

The code here is very simple. new-remote will create a new connection to a remote server, running on IP address given by input parameter "server" (and obtained with get-param) on TCP port 3800. URL payload created in string variable "payload" is passed to the remote server. If it doesn't reply in 30 seconds, then the code would timeout. Then you're using call-remote to actually make a call to the remote server (which is served by application "server" and by request handler "remote-status.gliim" below), and finally read-remote to get the reply from it. For simplicity, error handling is omitted here, but you can easily detect a timeout, any network errors, any errors from the remote server, including error code and error text, etc. See the above statements for more on this.
Make and start the local server
Next, create a local application:
gg -k client

Make the application (i.e. compile the source code and build the native executable):
gg -q

Finally, start the local application server:
mgrg -w 2 client

This will start 2 server instances of a local application server.
Remote server
Okay, now you have a local server. Next, you'll setup a remote server. Login to your remote server and create a new directory for your remote application server:
mkdir ~/remote_server
cd ~/remote_server

Then create file "remote-status.gliim" with this code:
 begin-handler /remote-status public
     silent-header
     get-param days

     pf-out "Status in the past %s days is okay", days
 end-handler

This is super simple, and it just replies that the status is okay; it accepts the number of days to check for status and displays that back. In a real service, you might query a database to check for status (see run-query).
Make and start remote server
First create your application:
gg -k server

Then make your program:
gg -q

And finally start the server:
mgrg -w 2 -p 3800 server

This will start 2 daemon processes running as background servers. They will serve requests from your local server.

Note that if you're running this example on different computers, some Linux distributions come with a firewall, and you may need to use ufw or firewall-cmd to make port 3800 accessible here. Also if you're using SELinux on this server, you may either need to allow binding to port 3800, or make SELinux permissive (with "sudo setenforce 0").
Run distributed calls
There is a number of ways you can call the remote service you created. These are calls made from your local server, so change directory to it:
cd ~/local_server

Here's various way to call the remote application server:
  • Execute a command-line program on local server that calls remote application server:


    To do this, use "-r" option of gg utility to generate shell commands you can use to call your program:
    gg -r --req "/status/days=18/server=127.0.0.1" --exec

    Here, you're saying that you want to make a request "status" (which is in source file "status.gliim" on your local server). You are also saying that input parameter "days" should have a value of "18" and also that input parameter "server" should have a value of "127.0.0.1" - see get-param statements in the above file "status.gliim". If you actually have a different server with a different IP, use it instead of "127.0.0.1".
    The result will be:
    Output is: [Status in the past 18 days is okay]

    where the part in between "[..]" comes from the remote server, and the "Output is: " part comes from the command line Gliimly program you executed.
  • Call remote application server directly from a command-line program:


    Do this:
    gg -r --req "/status/days=18/server=127.0.0.1" --exec --service --remote="127.0.0.1:3800"

    The result is, as expected:
    Status in the past 12 days is okay

    In this case, the output comes straight from the remote server, so the "Output is: " part is missing. The above simply copies the output from a remote service to the standard output.
  • Use a command-line utility to contact local application server, which then calls the remote server, which replies back to local application server, which replies back to your command-line utility:


    You will use cgi-fcgi to do this:
    gg -r --req "/status/server=127.0.0.1/days=10" --exec --service

    The result is:
    Output is: [Status in the past 10 days is okay]

    which is what you'd expect. In this case we first send a request to your local application server, which sends it to a remote service, so there is "Output is: " output.
You have different options when designing your distributed systems, and this article shows how easy it is to implement them.