Designing Distributed Systems

Software is eating the world - in a distributed way

The alchemy your data team needs

Vipul Vaibhaw
Sep 27, 2021
Understanding the landscape of distributed systems is a daunting task. It is one of the most under-hyped fields of computer science today. I have also worked in Deep Learning, so I know that the catalyst your startup or your research lab needs might be a distributed system!

I know that my readers here are busy building large-scale applications and services. I still want to offer the following quote as a caution for the write-up ahead -

“You can have a second computer once you’ve shown you know how to use the first one.” –Paul Barham


Marc Andreessen published his famous essay “Why Software Is Eating the World” in 2011, and the same undercurrent runs through the industry today.

Artificial Intelligence might take over the world, but look around: distributed systems have already engulfed it. Distributed systems are going to be the playground that a universe-scale AI needs.

Modern computers are super-fast and efficient. Once your team has built an efficient algorithm or deep learning network that squeezes everything out of the silicon in a single chip, you are ready to look at using multiple computers to scale up.

In my honest opinion, when building software, and especially learning engines, using distributed computing ideas is not an option but a necessity -

[Figure. Source: Wikipedia]

We can clearly see that in recent years the computing power of a single chip has not made the major jumps it did in the old days. There are various ways to handle this issue -

  1. Performance Engineering - the lost art (I plan to write more about this; if you want me to, then please reach out to me on Twitter or LinkedIn and say hi!)

  2. Distributed Systems



Let me get straight to the point now and highlight the benefits a distributed system might offer your team or your software -

Better Availability -

Unless you are building a database, you need your systems to be highly available! High uptime is one of the key differentiators of your software. Modern hardware is quite awesome, and combined with the top-notch service provided by AWS (or similar providers), the annual server failure rate is less than 1%.

Distributed systems give you the ability to manage your software in a leaner, more agile way. You can do A/B testing with ease, re-route traffic when a particular region of the world has issues, or simply let a user’s request fail instead of returning an extremely slow response. They also allow your system to be placed closer to your users.
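To make the "re-route or fail fast" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the region names, the handler functions and the `route` helper are hypothetical, and a real service would rely on a load balancer or service mesh rather than a hand-rolled loop.

```python
import concurrent.futures as cf
import time

# Shared worker pool for the sketch; a real service would size this deliberately.
_pool = cf.ThreadPoolExecutor(max_workers=4)

def route(request, regions, deadline_s):
    """Try each region in order; skip any that is down or misses the deadline."""
    for name, handler in regions:
        future = _pool.submit(handler, request)
        try:
            return name, future.result(timeout=deadline_s)
        except cf.TimeoutError:
            continue  # too slow: stop waiting and move on to the next region
        except ConnectionError:
            continue  # region is down: re-route to the next one
    # Nothing answered in time, so fail fast instead of hanging the user.
    raise RuntimeError("all regions failed or were too slow")

# Hypothetical regions used only to exercise the sketch.
def slow_region(request):
    time.sleep(0.5)
    return "late reply"

def down_region(request):
    raise ConnectionError("region unavailable")

def healthy_region(request):
    return f"ok: {request}"

regions = [
    ("eu-west", slow_region),
    ("us-east", down_region),
    ("ap-south", healthy_region),
]
print(route("GET /home", regions, deadline_s=0.1))
```

Note that the slow call keeps running in the background after its deadline passes; the sketch only stops waiting for it. Cancelling abandoned work is a separate problem.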

Train your Deep Learning models quickly -

You want to ship things quickly and get stuff done, whether you are a research lab, a startup or a Fortune 500 company.

What is the point if a deep learning model with 90% accuracy takes a week to train on a single machine? Yes, you can attach 2 GPUs to a motherboard, but unless you have a crypto-mining rig, your system usually has only 2 PCIe slots.

Hence, you want your data science team to run experiments quickly and fail fast rather than sit idle for weeks in hopes of success.

The easiest way to achieve this is to train your models in a distributed way. The following tools might help you out; please reach out if you need any assistance here -

  1. ray.io

  2. Distributed PyTorch

  3. Horovod.ai
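
One common approach that tools like these support is synchronous data parallelism: every worker computes a gradient on its own shard of the data, the gradients are averaged, and all workers apply the same update. The sketch below simulates that idea in plain Python for a toy one-parameter model. It does not use the API of Ray, PyTorch or Horovod; the function names and the toy dataset are assumptions made up for illustration.

```python
def gradient(w, batch):
    """Gradient of mean squared error for the 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def shard(data, num_workers):
    """Split data round-robin into num_workers shards (equal-sized here)."""
    return [data[i::num_workers] for i in range(num_workers)]

def data_parallel_step(w, data, num_workers, lr):
    # Each "worker" computes a gradient on its own shard. On a real cluster
    # these computations would run concurrently on separate machines or GPUs.
    local_grads = [gradient(w, s) for s in shard(data, num_workers)]
    # "All-reduce": average the local gradients so every worker applies
    # the same update and the model replicas stay in sync.
    avg_grad = sum(local_grads) / num_workers
    return w - lr * avg_grad

# Toy dataset: 8 points on the line y = 3x, split across 4 simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, data, num_workers=4, lr=0.01)
print(f"learned w = {w:.4f} (true value is 3)")
```

With equal-sized shards, the averaged gradient equals the gradient over the full batch, so the distributed run follows the same trajectory as single-machine training while each worker touches only a quarter of the data per step.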

“The Internet was done so well that most people think of it as a natural resource like the Pacific Ocean, rather than something that was man-made. When was the last time a technology with a scale like that was so error-free? The Web, in comparison, is a joke. The Web was done by amateurs.” - Alan Kay.

Better Scalability -

Since you have flexibility in your system, you can scale up individual components instead of scaling up the whole monolith. Stateful systems are harder to scale than stateless systems.
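
Here is one way to see why state makes scaling harder, as a rough sketch. It assumes a naive scheme in which each key lives on the node given by hash(key) mod N; the key names, replica names and helper functions are hypothetical. Real systems often use consistent hashing or similar schemes to reduce the data movement shown here.

```python
import hashlib
from itertools import cycle

def owner(key, num_nodes):
    """Stateful routing: the node that holds this key's data (hash mod N)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes

def moved_keys(keys, old_nodes, new_nodes):
    """Keys whose owning node changes when the cluster is resized."""
    return [k for k in keys if owner(k, old_nodes) != owner(k, new_nodes)]

# Stateful case: adding a fourth node forces data to migrate.
keys = [f"user-{i}" for i in range(1000)]
moved = moved_keys(keys, old_nodes=3, new_nodes=4)
print(f"stateful: {len(moved)} of {len(keys)} keys must migrate (3 -> 4 nodes)")

# Stateless case: any replica can serve any request, so adding a replica
# only means adding one more name to the rotation.
replicas = cycle(["replica-0", "replica-1", "replica-2", "replica-3"])
assignments = [next(replicas) for _ in range(8)]
print("stateless:", assignments)
```

Under this naive scheme, a key stays put only when its hash gives the same remainder mod 3 and mod 4, which holds for about a quarter of uniformly hashed keys. The rest have to be copied to a new node before the extra capacity is usable, while the stateless service needs no migration at all.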

Better Durability -

Distributed storage systems usually make multiple copies of the data, allowing a great deal of flexibility around cost, time-to-recovery, durability, and other factors. They can also be built to be extremely tolerant to correlated failures, and avoid correlation outright.

A distributed system doesn’t just protect you from disk failures; it also hedges you against datacenter (or government regime) failures.
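
A back-of-the-envelope model shows why both the number of copies and the correlation between them matter. The sketch below is a simplification with made-up numbers: it assumes each copy is lost independently with probability `p_copy_fails` over some period, that a single shared event (say, every copy sitting in the same datacenter) destroys all copies with probability `p_shared`, and that nothing is repaired in between. Both function names are hypothetical.

```python
def loss_probability(p_copy_fails, copies):
    """Chance of losing every copy, assuming copies fail independently."""
    return p_copy_fails ** copies

def loss_with_shared_fate(p_copy_fails, copies, p_shared):
    """Same model, plus one shared event that destroys all copies at once."""
    independent = loss_probability(p_copy_fails, copies)
    return p_shared + (1 - p_shared) * independent

# Illustrative numbers only: 1% per-copy loss, 0.1% shared-event risk.
for n in (1, 2, 3):
    alone = loss_probability(0.01, n)
    shared = loss_with_shared_fate(0.01, n, p_shared=0.001)
    print(f"{n} copies: independent={alone:.2e}, with shared fate={shared:.2e}")
```

In this model, adding copies quickly drives down the independent term, but the shared-event term puts a floor under the total. That is why spreading copies across disks, datacenters and jurisdictions matters as much as the raw copy count.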



No Free Lunch

Our friends from the ML world might know about this: no optimisation algorithm is the best one for every problem!

Building and operating a distributed system that is more efficient than your monolith is a hard task. The field has seen tremendous improvements, but I often find it hard to argue back when someone makes an Occam’s Razor argument for the monolith.

However, it is also hard to overlook the massive advantages offered by distributed systems - more availability, more durability, more efficiency, more scale.

My prediction is that in this decade, the faster we progress in AI and the more it comes out of the labs, the more the world will move towards distributed systems.


Please feel free to reach out to me if you are interested in building data systems at your organisation; I would be more than happy to help.

I hope that you enjoyed this read. If so, please share this article with your friends; let’s build a solid community. I will be back next week with another well-researched article delivered straight to your inbox. See you!

You can connect with me on Twitter or LinkedIn.
