One advantage of a highly distributed system with endless edge points is that you can outsource data collection to a universe of locations, and even include them in your workflow, thereby expanding your operations. The challenge comes when you have endpoints that contribute to your organization and systems, but that you don’t exactly trust. They can be newcomers that you don’t know enough about, or entities with a history of misusing the data that inclusion in your systems gives them access to. You want the value they create, the information they gather, to be copied from the edge up the levels of your system, but you don’t want to give up too much for that value or pay for it in the form of greater risk. Filtered replication is the art of enabling nontrusted edge points to access your system in a limited manner, replicating the information they produce in a nontrusted format.
Yesterday I posted about the Parler ban and its likely impact, both legally and in terms of the technical details. My expectation is that new actors will step in to fill the demand created by the current social network account suspensions. I have spent some time thinking about the likely effects of this, and I think that it will lead to some interesting results.
A new social network will very likely rise as a result of those actions. That network would have to be resilient to de-platforming. That means that it cannot assume that it can run on any of the cloud services, at least not as normally understood by today’s standards. That means that we are likely to see one of two options:
- Fully distributed systems – independent nodes collaborating with one another to create a network. Each node may be hosted and operated independently. Similar to how torrents work and other fully distributed P2P systems.
- Distributed infrastructure – a set of servers that are running on behalf of a single entity, but are spread over multiple vendors and locations. The idea is that the shutdown of a single or multiple vendors will have little impact, because of distribution of effort.
The first option is probably something like Mastodon, but I would really like to see a return to blogs & RSS as the preferred social network. That has the advantage of a true distributed model without a single controlling actor. It is also much lower cost in terms of technology and complexity. Discovery of new blogs can be handled via recommendations, search, etc.
The reason I prefer this option is that I like to blog. More seriously, owning your own content and distribution platform has just become quite important. A blog is about as simple a piece of software as you can imagine. Consuming blogs is an act that requires no publication of personal information, involves no single actor that can observe everything you do, etc.
I don’t know if this will be the direction, although it is my favorite one. It is possible that we’ll end up with a Mastodon empire, with many actors creating networks of servers which may or may not be interconnected. I can see a future where you’ll have a network of dog owners vs. cat owners, but the two aren’t federated and there are isolated discussions between them.
Given that you could create links from one to the other, I don’t think we have to deal with total echo chambers. Consider a post in the cats social network: The dog owners are talking about the chore of having to go for walks at “dogs://social.media/walks-are-great”, that is so high maintenance, the silly buggers.
That would create separate communities, with their own rules and moderation. Consider this something like subreddits, but without the single organization that can enforce global rules.
The other alternative is that a social network would rise with a truly distributed backend that is resilient to de-platforming. From an outside perspective, this will present as something similar to the existing social networks. That has the advantage of requiring the least from users, but it is a non-trivial technical challenge.
I prefer the first option, but I believe it is more likely we’ll end up with the second. The reason for that is monetization strategies. If you have many different actors cooperating to create a network, there is a question of how you pay for that. The typical revenue model for a social network is advertising. That doesn’t work so well where there isn’t a single actor that can sell ads (and track users).
That said, it would be much faster and easier to get started with the first option and it may be that we’ll end up there with the force of inertia.
I’m writing this post at a time when Donald Trump’s social media accounts have been closed by Twitter, Facebook and pretty much all other popular social networks. Parler, an alternative social network, has been kicked off the Apple and Google app stores and its AWS account was closed. It also appears that many vendors are dropping it and it will take significant time to get back online, if it is able to do so.
I’m not interested in talking about the reasons for this, mind. This is a hot topic political issue, in a country I’m not a citizen of, and I have no interest in getting bogged down with the political details.
I wish I could say that I had no dog in this fight, but I suspect that the current events will have a long term impact on the digital world. Note that all of those actions are taken by private companies, of their own volition. In other words, it isn’t a government or the courts demanding this behavior, but the companies’ own decision making process.
The thing that I can’t help thinking is that the current behavior by those companies is direct, blatant and very much short sighted. To start with, all of those companies are working on a global scale, and they have just proven that they are powerful enough to rein in the President of the United States. Sure, he is a lame duck currently, but that is still something that upsets the balance of power.
The problem with that is that while it would appear that the incoming US administration is favorable to this course of action, there are other countries and governments that are looking at this with concern. Poland is on track to pass a law prohibiting the removal of posts in social media that do not break local laws. Israel’s parliament is also considering a similar proposal.
In both cases, mind, these proposed laws got traction last year, before the current escalation in this behavior. I feel that more governments will now consider such laws in the near future, given the threat level that this represents to them. A politician in this day and age who doesn’t use social media to its fullest extent is going to be severely hampered. Both the Obama and the Trump campaigns were lauded for their innovative use of social media, for example.
There are also other considerations to ponder. One of the most costly portions of running a social network is the monitoring and filtering of posts. You have to take into account that people will post bile, illegal and obscene stuff. That’s expensive, and one of the reasons for vendors dropping Parler was its moderation policies. That means that there is a big and expensive barrier in place for future social networks that try to grow.
I’m not sure how this is going to play out in the short term, to be honest. But in the long term, I think that there is going to be a big push, both legally and from a technical perspective to fill those holes. From a legal perspective, I would expect that many lawyers will make a lot of money on the fallout from the current events, just with regards to the banning of Parler. I expect that there are going to be a whole lot of new precedents, both in the USA and globally.
From a technical perspective, the technology to run a distributed social network exists. Leave aside currently esoteric choices such as social network on blockchain (there appears to be a lot of them, search for that, wow!), people can fall back to good old Blog & RSS to get quite a bit of traction. It wouldn’t take much to build something that looks very similar to current social networks.
Consider RSS Bandit or Google Reader vs. Twitter or Facebook. There isn’t much that you’ll need to do to go from aggregation of RSS feeds to a proper social network. One advantage of such a platform, by the way, is that it allows (and encourages) thought processes that are longer than 140 characters. I dearly miss the web of the 2000s, by the way.
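To make the "aggregation of RSS feeds" step concrete, here is a minimal sketch in Python that merges posts from several feeds into one newest-first timeline, which is roughly what a feed reader does. The two inline feeds, the `cats.example` / `dogs.example` URLs and the function names are made up for illustration; a real reader would fetch the XML over HTTP and cope with far messier input.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two tiny hand-written RSS 2.0 documents stand in for real blogs (hypothetical data).
FEED_CATS = """<rss version="2.0"><channel><title>Cats</title>
<item><title>Naps</title><link>https://cats.example/naps</link>
<pubDate>Mon, 11 Jan 2021 09:00:00 +0000</pubDate></item>
<item><title>Boxes</title><link>https://cats.example/boxes</link>
<pubDate>Tue, 12 Jan 2021 08:00:00 +0000</pubDate></item>
</channel></rss>"""

FEED_DOGS = """<rss version="2.0"><channel><title>Dogs</title>
<item><title>Walks are great</title><link>https://dogs.example/walks</link>
<pubDate>Mon, 11 Jan 2021 18:00:00 +0000</pubDate></item>
</channel></rss>"""


def parse_feed(xml_text):
    """Return the posts of one RSS 2.0 document as a list of dicts."""
    channel = ET.fromstring(xml_text).find("channel")
    source = channel.findtext("title")
    posts = []
    for item in channel.findall("item"):
        posts.append({
            "source": source,
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # RSS dates use the RFC 822 format, which the email module can parse.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return posts


def build_timeline(feeds):
    """Merge the posts of every followed feed, newest first."""
    posts = [post for xml_text in feeds for post in parse_feed(xml_text)]
    return sorted(posts, key=lambda post: post["published"], reverse=True)


timeline = build_timeline([FEED_CATS, FEED_DOGS])
for post in timeline:
    print(post["published"].isoformat(), post["source"], post["title"])
```

Everything beyond this (following, replies, discovery) is where the real work is, but the timeline itself is just a sort over feeds that each author hosts independently.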
Longer term, however, I would expect a rise of distributed social networks that are composed of independent but cooperating nodes (yes, I’m aware of Mastodon, and I’m familiar with Gab breaking out of that). I don’t know if this will be based on existing software or if we’ll end up with new networks, but I think that the die has been cast in this regard.
That means that the next social network will have to operate under an assumed hostile environment. That means running on multiple vendors, having no single point of failure, etc.
The biggest issue with getting a social network off the ground is… well, network effects. You need enough people in the network before you start getting more bang for the buck. But right now, there is a huge incentive for such a network, given the migration of many users from the established networks.
Parler’s app has seen hundreds of thousands of downloads a day in the past week, before it was taken down from the app stores. Gab is reporting 10,000+ new users an hour and more users in the past two days than they had seen in the past two years.
There is a hole there that will be filled, I think. Who will be the winner of all those users, I don’t know, but I think that this will have a fundamental impact on the digital world.
On an otherwise uneventful morning, the life of the operations guy got… interesting.
What was supposed to be a routine morning got hectic because the database refused to operate normally. To be more exact, the database refused to load a file. RavenDB is generally polite when it runs into issues, but this time, it wasn’t playing around. Here is the error it served:
---> System.IO.IOException: Could not set the size of file D:\RavenData\Databases\Purple\Raven.voron to 820 GBytes
---> System.ComponentModel.Win32Exception (665): The requested operation could not be completed due to a file system limitation
Good old ERROR_FILE_SYSTEM_LIMITATION, I never knew you, because we have never run into an error with this in the past.
The underlying reason was simple: we had a large file (820GB) that was too fragmented. At some point, the number of fragments in the file exceeded what the file system can handle for a single file.
The root cause here was probably backing up to the same drive as the database, which forced the file system to break the database file into fragments.
Just a reminder that there are always more layers in the system and that we need to understand them all when they break.
I want to comment on the following tweet:
Reginald Braithwaite (@raganwald), January 2, 2021:
A rather banal “thought experiment:”
What if the only code we could review was tests and interface definitions?
What would that force us to specify at the interface and behaviour level, rather than just the implementation level?
When I read it, I had an immediate and visceral reaction, because this is one of those things that sounds nice but is actually a horrible dystopian trap. It confuses two very important concepts and puts them in the wrong order, resulting in utter chaos.
The two ideas are “writing tests” and “producing high quality code”. And they are usually expressed in something like this:
We write tests in order to produce high quality code.
Proper tests ensure that you can make forward progress without having regressions. They are a tool you use to ensure a certain level of quality as you move forward. If you assume that the target is the tests and that you’ll have high quality code because of that, however, you end up in weird places. For example, picture a set of stairs that isn’t going anywhere: aside from being decorative, it serves no purpose.
When you start considering tests themselves to be the goal, instead of a means to achieve it, you end up with decorative tests. They add to your budget and make it harder to change things, but don’t benefit the project.
There are a lot of things that you are looking for in code review that shouldn’t be in the tests. For example, consider the error handling strategy for the code. We have an invariant that says that exceptions may not escape our code when running in a background thread. That is because this will kill the process. How would you express something like that in a test? Otherwise, you end up with an error raised from a write to a file when the disk is full, and that error kills the server.
Another example is a critical piece of code that needs to safely handle out of memory exceptions. You can test for that, sure, but it is hard and expensive. It also tends to freeze your design and implementation, because you are now testing implementation concerns and that makes it very hard to change your code.
Reviewing code for performance pitfalls is also another major consideration. How do you police allocations using a test? And do that without killing your productivity? For fun, the following code allocates:
There are ways to monitor and track these kinds of things, for sure, but they are very badly suited for repeatable tests.
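As an illustration of what such monitoring can look like, here is a small sketch using Python’s standard `tracemalloc` module. This is not the snippet referred to above and has nothing to do with how RavenDB tracks allocations; the helper `peak_allocation` and the two sample functions are invented for the example.

```python
import tracemalloc


def peak_allocation(func):
    """Return the peak number of bytes tracemalloc saw allocated while func ran."""
    tracemalloc.start()
    try:
        func()
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def total_with_list():
    # Materializes every value in a temporary list before summing.
    return sum([i * 2 for i in range(100_000)])


def total_streaming():
    # Same result, but the values are produced and consumed one at a time.
    return sum(i * 2 for i in range(100_000))


list_peak = peak_allocation(total_with_list)
stream_peak = peak_allocation(total_streaming)
print(f"list: {list_peak} bytes, streaming: {stream_peak} bytes")
```

The two functions return the same value and look nearly identical in a diff, yet one holds a large temporary list in memory. The exact byte counts depend on the interpreter version and platform, so a test can reasonably assert "less than the other one" but not a specific number, which is part of why this kind of check fits a review better than a test suite.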
Then there are other things that you’ll cover in the review, more tangible items. For example, the quality of error messages you raise, or the logging output.
I’m not saying that you can’t write tests for those. I’m saying that you shouldn’t. That is something that you want to be able to change and modify quickly, because you may realize that you want to add more information in a certain scenario. Freezing the behavior using tests just means that you have more work to do when you need to make the changes. And reviewing just test code is an even bigger problem when you remember that you need to consider interactions between features and their impact on one another. Not in terms of correctness, you absolutely need to test that, but in terms of behavior.
The interleaving of internal tasks inside of RavenDB was carefully designed to ensure that we’ll be biased in favor of user facing operations, starving background operations if needed. At the same time, it means that we need to ensure that we’ll give the background tasks time to run. I can’t really think about how you would write a test for something like that without basically setting in stone the manner in which we make that determination. That is something that I explicitly don’t want to do, it will make changing how and what we do harder. But that is something that I absolutely consider in code reviews.
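To show what "setting in stone" would look like, here is a deliberately naive sketch of a scheduler that favors user facing work but still lets background work through after a bounded wait. It is purely hypothetical and is not how RavenDB makes this determination; the class name, the `max_user_streak` knob and the policy itself are assumptions made up for the example.

```python
from collections import deque


class BiasedScheduler:
    """Prefers user facing tasks, but runs a background task after a bounded streak.

    max_user_streak is an invented tuning knob, not a RavenDB setting.
    """

    def __init__(self, max_user_streak=3):
        self.user = deque()
        self.background = deque()
        self.max_user_streak = max_user_streak
        self._streak = 0  # consecutive user tasks handed out so far

    def submit(self, task, user_facing):
        (self.user if user_facing else self.background).append(task)

    def next_task(self):
        """Return the next task to run, or None when both queues are empty."""
        background_is_due = self._streak >= self.max_user_streak
        if self.user and not (background_is_due and self.background):
            self._streak += 1
            return self.user.popleft()
        if self.background:
            self._streak = 0
            return self.background.popleft()
        return None


scheduler = BiasedScheduler(max_user_streak=3)
for name in ["u1", "u2", "u3", "u4", "u5"]:
    scheduler.submit(name, user_facing=True)
for name in ["b1", "b2"]:
    scheduler.submit(name, user_facing=False)

order = []
while (task := scheduler.next_task()) is not None:
    order.append(task)
print(order)
```

A test that asserts the exact order this prints would pass today and break the moment the policy changes to, say, a time budget instead of a streak counter, even if the new policy is better. That is the kind of behavior that is easier to judge in a review than to pin down in a test.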
My talk to the Azure Israel about Cosmos DB is up (in Hebrew)…