Modified: Sunday 23 October 2022
In “A general idea of a peer to peer social network” I presented some simple ideas that can work together to create a peer to peer social network.
In this post I want to get into more detail. Imagine a simple application and its interface with the user and with other instances of itself.
The public part of the user’s key pair is PUB, the private part is PRIV, and the PUB fingerprint is FP.
We save the user’s PUB file as <FP>.pub and his PRIV as <PRIV> without extension.
Given another user’s public key PUB1, we can calculate the fingerprint FP1 and save it in identities/<FP1>.pub.
When we get content from FP1 with signature SG1, we can validate that the content is really signed by that person using the public key value from the file. If we fail to find the full key, we can ask for it from another instance.
Now we’re done with basic identity management. The previous structure will allow:
Now if we need these processes to exchange identities with each other, we’ll need a public interface so they can ask each other for identities.
When a process gets content from FP1 but identities/<FP1>.pub doesn’t exist, it’ll ask some other processes for the FP1 public key. Simple. You get a bunch of content from people you know, but the comments, for example, are from other people you don’t know. To get their public keys you ask the same user that gave you the content. Over time you accumulate enough identities that you don’t have to ask anymore, and we don’t have to transfer the full public key with every piece of content. That will save a lot of bandwidth over time.
Now a couple of users can have identities on their machines, and when one machine doesn’t have the full public key for a fingerprint it’ll ask the others to provide it. The advantages I see in this approach are:
So far we have a process that listens on 2 ports: one available only inside the machine for administration, and another for public consumption by other processes. It uses the disk to store identities for the current user and other users, and processes can exchange them with each other.
Note that there is already an infrastructure for exchanging PGP keys that could also be used to exchange identities. I just want to keep it simple and limited to the features we need, and not inherit a whole protocol for exchanging identities.
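The identity handling described so far could look roughly like the sketch below. It is only an illustration under a few assumptions of mine: the fingerprint is taken to be the SHA-1 hex digest of the public key bytes (the post doesn’t say how it is calculated), and `IdentityStore`, `lookup` and `ask_peers` are made-up names. `ask_peers` stands in for the public interface that asks another instance for a key.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Optional


def fingerprint(public_key: bytes) -> str:
    # Assumption: the fingerprint is the SHA-1 hex digest of the key bytes.
    return hashlib.sha1(public_key).hexdigest()


class IdentityStore:
    """Keeps known public keys on disk as identities/<FP>.pub (hypothetical helper)."""

    def __init__(self, root: Path) -> None:
        self.dir = root / "identities"
        self.dir.mkdir(parents=True, exist_ok=True)

    def save(self, public_key: bytes) -> str:
        fp = fingerprint(public_key)
        (self.dir / f"{fp}.pub").write_bytes(public_key)
        return fp

    def lookup(
        self, fp: str, ask_peers: Callable[[str], Optional[bytes]]
    ) -> Optional[bytes]:
        path = self.dir / f"{fp}.pub"
        if path.exists():
            return path.read_bytes()
        # Not known locally: ask another instance for the full key.
        key = ask_peers(fp)
        # Only accept a key whose fingerprint matches what we asked for.
        if key is None or fingerprint(key) != fp:
            return None
        self.save(key)
        return key


# Usage: a dict plays the role of "some other process" that knows PUB1.
root = Path(tempfile.mkdtemp())
store = IdentityStore(root)
pub1 = b"placeholder bytes standing in for a real public key"
fp1 = fingerprint(pub1)
peer_keys = {fp1: pub1}

found = store.lookup(fp1, peer_keys.get)        # fetched from the peer, then cached
missing = store.lookup("0" * 40, peer_keys.get)  # nobody knows this fingerprint
```

After the first lookup the key sits in identities/<FP1>.pub, so later lookups for the same fingerprint never touch the network.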
Now we get to the content itself: the social feed we want to share between users.
For sure we’ll need a place to save the data, whatever it is. Let’s have a posts directory and put everything there.
Our data is split into units: each unit is a post, image, link, video, etc., the usual social feed we’re used to. So let’s save each in a separate file. The file name is the SHA1 sum of the content of the post.
For each post we need to make sure it was created by an identity, so when it moves from one machine to another it carries proof that it originated from the key pair owner. So we’ll need the post content, whatever the format is, to be signed, and the signature needs to reside in the same post file. The file content should hold the post content, the public key fingerprint and the signature. This reminds me of the JWT format, which has 3 parts separated by dots. So we can format each post file as follows:
P1 = content of the post base64
FP = creator key fingerprint
SIG = signature of P1 made with the private key PRIV paired with the public key PUB of the fingerprint FP
File = P1.FP.SIG
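A minimal sketch of packing and parsing such a file follows. The actual signing would be done by a crypto library and is left out here, so the signature is passed in as opaque bytes. Storing SIG as base64 is my assumption (the format above only says P1 is base64), and `pack_post`, `parse_post` and `post_file_name` are hypothetical names.

```python
import base64
import hashlib
from typing import NamedTuple


class PostFile(NamedTuple):
    content: bytes      # decoded post content
    fingerprint: str    # FP of the creator key
    signature: bytes    # decoded SIG


def pack_post(content: bytes, fp: str, signature: bytes) -> str:
    # Standard base64 never produces a dot, so "." is a safe separator.
    p1 = base64.b64encode(content).decode("ascii")
    sig = base64.b64encode(signature).decode("ascii")
    return f"{p1}.{fp}.{sig}"


def parse_post(text: str) -> PostFile:
    parts = text.split(".")
    if len(parts) != 3:
        raise ValueError("expected P1.FP.SIG")
    p1, fp, sig = parts
    return PostFile(base64.b64decode(p1), fp, base64.b64decode(sig))


def post_file_name(content: bytes) -> str:
    # The file name is the SHA1 sum of the post content.
    return hashlib.sha1(content).hexdigest()


# Usage with placeholder values (the signature bytes are not a real signature).
content = b'{"TEXT": "hello"}'
packed = pack_post(content, "ab" * 20, b"\x00\x01placeholder-signature")
parsed = parse_post(packed)
```

A verifier would look up the public key for `parsed.fingerprint` and check `parsed.signature` against P1 before trusting the content.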
If we set the file modification date to the post creation time, we can query the file system for files ordered by modification time and get the latest posts. So if the client wants to show the last 100 posts or so, it can do this easily without reading the content of each file.
This means we’ll need the post content to have the created at time. Let’s keep this in mind.
Now what if we need to see posts only from a specific person, like when we go to his profile? We can’t do that without reading all the files and checking whether each one belongs to this user, or the client can build an index for it. It would be nice if we could organize the files in subdirectories, one for each identity fingerprint. So let’s add that. Now we’ll have the structure posts/<FP>/<SHA1>.
The client can still create custom indexes to make listing posts faster, but they don’t have to be part of our structure. This is good enough, I guess.
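Here is a rough sketch of that layout, assuming a local file system and using made-up helper names (`save_post`, `latest_posts`, `posts_by`). It writes each post under posts/<FP>/<SHA1>, sets the modification time to the creation time, and lists posts by modification time without opening any file.

```python
import hashlib
import os
import tempfile
from pathlib import Path
from typing import List


def save_post(root: Path, fp: str, file_text: str, content: bytes, created: int) -> Path:
    folder = root / "posts" / fp
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / hashlib.sha1(content).hexdigest()
    path.write_text(file_text)
    # Set access and modification time to the post creation time.
    os.utime(path, (created, created))
    return path


def latest_posts(root: Path, limit: int = 100) -> List[Path]:
    # Newest first across all identities, based on file metadata only.
    files = [p for p in (root / "posts").glob("*/*") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


def posts_by(root: Path, fp: str) -> List[Path]:
    # Posts of one identity, newest first.
    folder = root / "posts" / fp
    if not folder.is_dir():
        return []
    return sorted(folder.iterdir(), key=lambda p: p.stat().st_mtime, reverse=True)


# Usage with placeholder fingerprints and file text.
root = Path(tempfile.mkdtemp())
a = save_post(root, "fp_ahmed", "P1.FP.SIG", b"first", 1_000)
b = save_post(root, "fp_basant", "P1.FP.SIG", b"second", 2_000)
c = save_post(root, "fp_ahmed", "P1.FP.SIG", b"third", 3_000)
newest_two = latest_posts(root, limit=2)
ahmed_posts = posts_by(root, "fp_ahmed")
```

Sorting still has to stat every file, so a client with many posts may want the custom index mentioned above.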
Now the content of the post itself. There are many formats we can use. I would have preferred XML, but there is a stigma attached to it and most modern developers are used to something like JSON, so be it. Generating and parsing JSON isn’t a problem for anyone nowadays, so we can go for it.
The attributes of this JSON should differ based on what we need to share. The simplest post will contain TEXT and CREATED_AT fields, a string and a unix timestamp respectively.
If the user wants to share an image we can have it in two different ways: either an IMAGE field with the URL of the image on the web, or an IMAGE field with the image content base64 encoded. If we name the attributes with a suffix for the type we can support both. IMAGE_URL means the attribute name is IMAGE and the value must be treated as a URL. IMAGE_JPG means the value is JPEG data encoded as base64. Then we can support more images and more formats. IMAGE in this case would be just an identifier, and it doesn’t matter if the name is 123_JPG; we already know the content is an image.
That means to support sharing a link we can just send it as ATTRIBUTE_URL, and the client can read the content of the URL and display it based on the returned file signature.
So we can generalize it even more. If we name the image IMAGE_BIN instead of IMAGE_JPG, we can decode it from base64, get the file type from the magic bytes, and render it based on the type. If it’s an image we render it as an image, audio is rendered as an audio player, etc.
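To make the suffix idea concrete, here is a small sketch of how a client might split an attribute name into identifier and type, and sniff a _BIN value by its magic bytes. The table only covers a few well-known signatures, and `split_attribute`, `sniff` and `describe` are hypothetical helpers, not part of any spec.

```python
import base64
from typing import Tuple

# A few well-known file signatures (magic bytes); a real client would know more.
MAGIC = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"ID3", "audio/mpeg"),
    (b"OggS", "audio/ogg"),
]


def split_attribute(key: str) -> Tuple[str, str]:
    # "IMAGE_URL" -> ("IMAGE", "URL"); a key without a suffix has no type.
    name, sep, kind = key.rpartition("_")
    return (name, kind) if sep else (key, "")


def sniff(data: bytes) -> str:
    for magic, mime in MAGIC:
        if data.startswith(magic):
            return mime
    return "application/octet-stream"


def describe(key: str, value: str) -> str:
    # Decide how to treat a value based on the type suffix of its key.
    name, kind = split_attribute(key)
    if kind == "URL":
        return f"{name}: link to {value}"
    if kind == "JPG":
        return f"{name}: image/jpeg"
    if kind == "BIN":
        return f"{name}: {sniff(base64.b64decode(value))}"
    return f"{name}: unknown type {kind!r}"


# Usage: a tiny fake PNG (signature plus padding) sent as IMAGE_BIN.
png_b64 = base64.b64encode(b"\x89PNG\r\n\x1a\n" + b"\x00" * 4).decode("ascii")
as_bin = describe("IMAGE_BIN", png_b64)
as_url = describe("IMAGE_URL", "https://example.com/cat.jpg")
```

Splitting on the last underscore means the identifier itself can contain underscores or digits, which matches the 123_JPG case.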
Another type can be another post, like when we share a post written by someone else to our own timeline. We can do that by supporting a *_POST type where the value is another post in the file format P1.FP.SIG. That means I can take a post from my friend, wrap it in another post with commentary, and reshare it to someone else.
One thing that we may need to enforce here is the CREATED_AT attribute, as we need to know when the post was created. To conform to the naming above we can use CREATED_TIME, where _TIME is a unix timestamp.
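Putting the attribute naming together, composing a post could be sketched like this. `new_post`, `reshare` and `created_time` are made-up helper names, and SHARED_POST is just one possible attribute name matching the *_POST pattern.

```python
import json
import time
from typing import Any, Dict, Optional


def new_post(attributes: Dict[str, Any], now: Optional[int] = None) -> str:
    # Every post gets a CREATED_TIME attribute holding a unix timestamp.
    body = dict(attributes)
    body["CREATED_TIME"] = int(time.time()) if now is None else now
    return json.dumps(body, sort_keys=True)


def reshare(original_file: str, comment: str, now: Optional[int] = None) -> str:
    # Wrap someone else's post (a full P1.FP.SIG string) with our commentary.
    return new_post({"TEXT": comment, "SHARED_POST": original_file}, now)


def created_time(post_json: str) -> int:
    # Enforce the one attribute every post must have.
    value = json.loads(post_json).get("CREATED_TIME")
    if not isinstance(value, int):
        raise ValueError("post is missing CREATED_TIME")
    return value


# Usage with a fixed timestamp and a placeholder for the wrapped post file.
simple = new_post({"TEXT": "hi"}, now=1666483200)
wrapped = reshare("P1.FP.SIG", "look at this", now=1666483300)
```

Because the wrapped value keeps its own fingerprint and signature, whoever receives the reshare can still verify the original author.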
Rendering post content in clients can be a problem if we have many attributes. I thought we could enforce the order of rendering in the post itself as a special attribute, but then that wouldn’t be fun. I mean, if we leave it open it would be fun to see how designers decide to render posts based on the available attributes, and we’ll get lots of creativity out of that. So let’s leave it be.
Now that we know how to save and retrieve posts, how are we gonna exchange them?
Say we have two users, Ahmed and Basant, and each of their processes knows the other’s IP address.
Every 10 minutes Ahmed’s process can ask Basant’s process for new posts.
Ahmed knows the most recent post from Basant from her directory on his machine, so he can always ask for posts after that timestamp.
Basant’s process can do the same to get Ahmed’s new posts.
It’s like asking a friend “How are you doing?” and he responds with everything that happened since you last talked to each other. Pretty simple.
… Basant or Ahmed.
We need to make sure that when Ahmed is asking Basant he’s really talking to Basant, not a middleman. So we’ll need to encrypt the requests from Ahmed to Basant with Basant’s public key, so only she can read the request, and do the same to the response: Basant’s response needs to be encrypted with Ahmed’s public key so only Ahmed can read it. Even if there is a middleman, he can’t decrypt them or read the content.
… Basant will include it in the response, to make sure we’re getting the response for that request and not anything else.
So now we have processes that can exchange a social feed with each other. Note that there is a format and specification for social feeds called ActivityPub that could be used. I rethought that in this post to develop the idea based on the reasoning in my head.
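The “ask for everything after the last post I have” exchange can be sketched in memory like this. There is no networking, scheduling or encryption here; `Node`, `posts_after` and `sync_from` are hypothetical names, and in a real process `sync_from` would run on a timer (every 10 minutes, say) and go over the encrypted channel described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Post:
    author: str   # creator fingerprint
    created: int  # unix timestamp
    text: str


@dataclass
class Node:
    fp: str
    own_posts: List[Post] = field(default_factory=list)
    # Copies of other people's posts, keyed by their fingerprint.
    copies: Dict[str, List[Post]] = field(default_factory=dict)

    def posts_after(self, since: int) -> List[Post]:
        # Public interface: answer a peer asking for posts newer than `since`.
        return [p for p in self.own_posts if p.created > since]

    def last_seen(self, peer_fp: str) -> int:
        # Timestamp of the most recent post we already hold from that peer.
        known = self.copies.get(peer_fp, [])
        return max((p.created for p in known), default=0)

    def sync_from(self, peer: "Node") -> int:
        fresh = peer.posts_after(self.last_seen(peer.fp))
        self.copies.setdefault(peer.fp, []).extend(fresh)
        return len(fresh)


# Usage with placeholder fingerprints.
ahmed = Node("fp_ahmed")
basant = Node("fp_basant")
basant.own_posts += [Post("fp_basant", 100, "hello"), Post("fp_basant", 200, "again")]

first = ahmed.sync_from(basant)    # both posts are new to Ahmed
second = ahmed.sync_from(basant)   # nothing new since timestamp 200
basant.own_posts.append(Post("fp_basant", 300, "one more"))
third = ahmed.sync_from(basant)    # only the post created at 300
```

Two posts created in the same second as the last one seen would be skipped by the strict `>` comparison, so a real implementation may want to compare by SHA1 as well.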
Now that we have posts exchanged between users, how about replies to posts, where one is commenting on or replying to a post?
A reply is a post from Ahmed to Basant, so it also needs to be signed by Ahmed like any post.
It has a TO_FINGERPRINT attribute that has Basant’s fingerprint in it, and REPLY_POSTSHA is the SHA1 of the post we’re replying to.
The reply is saved in the posts directory. That means we’ll propagate the comments like any post, and the client doesn’t have to show the comment unless it has the original post.
… TO_FINGERPRINT and looking up their full public key and their profiles (we haven’t talked about profiles so far).
… TO_FINGERPRINT to the user with that fingerprint, leading to less discoverability but more privacy and less bandwidth.
If Basant wants to send the comments on the post to Camal, for example, we have 2 options: either return Ahmed’s comment to Camal or not. I think this is up to Ahmed to decide, not Basant, so I would say no, Basant shouldn’t broadcast the comment. The comment was meant for her, and if Ahmed wants to broadcast it he should do that himself. So if Camal already knows Ahmed, Ahmed can choose to broadcast comments to Camal while syncing posts. Then Camal can see the comment on the post.
… Basant in this case.
We can use the TO_FINGERPRINT attribute but not REPLY_POSTSHA to send a post to a specific person that is not a reply to a post. The client can separate posts from private messages like that, and as such posts don’t propagate, the message won’t be sent to anyone else.
… TO_FINGERPRINT attribute. We can say any post with a _FINGERPRINT attribute is meant to be sent to that user. And while we’re at it, any _POSTSHA attribute is meant as a reply to the post with this SHA1 value. In this case you can write a comment/post/message and send it to multiple people. That will do it for group chat: people send posts to each other with their fingerprints in the message, so a private message will have the same capabilities as a post.
Now we get to the profiles: people’s pictures, names and other basic, persistent information they want to share with each other. How do we deal with them?
We can have one post for _PROFILENAME, another for _BIRTHDATE, another for _WORKPLACE, and so on.
Each can have a _FINGERPRINT list in it to address it to specific people.
… _FINGERPRINT to that person, giving the user fine-tuning over who sees what.
… _BIRTHDATE will allow clients to have a list of birthdays or friends’ phone numbers, which other features can be built on, like birthday reminders, a phone book, or an address book.
That covers basic usage, I guess. Next we’ll have the issue of discoverability: how to know the network routes to each other. We can probably use mDNS within the same network, and WebRTC infrastructure servers like STUN servers. The whole of ICE will be useful in this case.
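Going back to addressing for a moment, the _FINGERPRINT and _POSTSHA rules for replies, private messages, group chat and profile posts can be sketched as a couple of small checks. This is only my reading of those rules, and `values_with_suffix`, `classify` and `should_send` are hypothetical helpers.

```python
from typing import Any, Dict, List


def values_with_suffix(post: Dict[str, Any], suffix: str) -> List[str]:
    # Collect values of every attribute whose name ends with the suffix;
    # a value may be a single string or a list of strings.
    found: List[str] = []
    for key, value in post.items():
        if key.endswith(suffix):
            found.extend(value if isinstance(value, list) else [value])
    return found


def classify(post: Dict[str, Any]) -> str:
    if values_with_suffix(post, "_POSTSHA"):
        return "reply"
    if values_with_suffix(post, "_FINGERPRINT"):
        return "private message"
    return "public post"


def should_send(post: Dict[str, Any], peer_fp: str) -> bool:
    # Addressed posts go only to the listed fingerprints; others go to everyone.
    recipients = values_with_suffix(post, "_FINGERPRINT")
    return not recipients or peer_fp in recipients


# Usage with placeholder fingerprints and SHA1 values.
public = {"TEXT": "hello world", "CREATED_TIME": 1}
message = {"TEXT": "hi Basant", "TO_FINGERPRINT": "fp_basant", "CREATED_TIME": 2}
reply = {"TEXT": "nice post", "TO_FINGERPRINT": "fp_basant",
         "REPLY_POSTSHA": "sha1_of_post", "CREATED_TIME": 3}
birthday = {"ME_BIRTHDATE": "1990-01-01",
            "TO_FINGERPRINT": ["fp_basant", "fp_camal"], "CREATED_TIME": 4}

kinds = [classify(p) for p in (public, message, reply)]
```

With a list of fingerprints the same check covers group chat and profile posts that are visible to selected people only.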
Let me know if I missed something, or if something can be taken out of this concept to make it simpler. It’s easy to add more stuff, but what I would like is a small, consistent concept that is as minimal as possible.