Scalability

¡Supera tus tareas y exámenes ahora con Quizwiz!

Cache

...

Clones

...

Denormalization

...

File based caching

...

No joins

...

Overtime

...

Persistent

...

SQL tuning

...

Sharding

...

Web service

...

Cache

A simple key-value store and it should reside as a buffering layer between your application and your data storage.

External persistent cache

An external persistent cache will have better performance than an external database

Only if it's not in the cache should it then try to get the data from the main data source. Why should you do that?

Because a cache is lightning-fast. It holds every dataset in RAM and requests are handled as fast as technically possible.

Path 2 problem

But even if you successfully switch to the latest and greatest NoSQL database and let your app do the dataset-joins, soon your database requests will again be slower and slower. You will need to introduce a cache.

External

By external I mean that the data store does not reside on the application servers. Instead, it is somewhere in or near the data center of your application servers.

There are 2 patterns of caching your data. An old one and a new one:

Cached Database Queries Cached Objects

Redis

External persistent cache

Path 1

Hire a database administrator (DBA,) tell him to do master-slave replication (read from slaves, write to master) and upgrade your master server by adding RAM, RAM and more RAM.

Path 1 problem

In some months, your DBA will come up with words like "sharding", "denormalization" and "SQL tuning" and will look worried about the necessary overtime during the next weeks. At that point every new action to keep your database running will be more expensive and time consuming than the previous one.

What are the common in memory caches

Memcached or Redis

try to get the data from the main data source.

Only if data is not in the cache should

Now the required changes are more radical than just adding more cloned servers and may even require some boldness. In the end, you can choose from 2 paths:

Path #1 is to stick with MySQL and keep the "beast" running. Path #2 means to denormalize right from the beginning and include no more Joins in any database query.

Cache not to do

Please never do file-based caching, it makes cloning and auto-scaling of your servers just a pain.

Load balancer

Public servers of a scalable web service are hidden behind a load balancer. This load balancer evenly distributes load (requests from your users) onto your group/cluster of application servers.

Redis

Redis can do several hundreds of thousands of read operations per second when being hosted on a standard server. Also writes, especially increments, are very, very fast. Try that with a database!

Sessions

Sessions need to be stored in a centralized data store which is accessible to all your application servers. It can be an external database or an external persistent cache, like Redis

Cached objects

That's my strong recommendation and I always prefer this pattern. In general, see your data as an object like you already do in your code (classes, instances, etc.). Let your class assemble a dataset from your database and then store the complete instance of the class or the assembed dataset in the cache.

Cached Database Queries

That's still the most commonly used caching pattern.

But what about deployment? How can you make sure that a code change is sent to all your servers without one server still serving old code?

This tricky problem is fortunately already solved by the great tool Capistrano. It requires some learning, especially if you are not into Ruby on Rails, but it is definitely both the effort.

try to retrieve the data from your cache.

Whenever your application has to read data it should at first

Cache

With "cache" I always mean in-memory caches like Memcached or Redis.

Path 2

You can stay with MySQL, and use it like a NoSQL database, or you can switch to a better and easier to scale NoSQL database like MongoDB or CouchDB. Joins will now need to be done in your application code. The sooner you do this step the less code you will have to change in the future.

Path 1 solution

You might have been better off if you had chosen Path #2 while your dataset was still small and easy to migrate.

After "outsourcing" your sessions and serving the same codebase from all your servers, you can now create

an image file from one of these servers (AWS calls this AMI - Amazon Machine Image.). Use this AMI as a "super-clone" that all your new instances are based upon. Whenever you start a new instance/clone, just do an initial deployment of your latest code and you are ready!

First rule for scalability

every server contains exactly the same codebase and does not store any user-related data, like sessions or profile pictures, on local disc or memory.

The next time you run the query, you check at first the cache if there is already a result. This pattern has several issues. The main issue is

expiration

A hashed version of your query

is the cache key.

After following Part 2 of this series, you now have a scalable database solution. You have no fear of storing terabytes anymore and the world is looking fine. But just for you. Your users still have to suffer slow page requests when a lot of data is fetched from the database. The solution

is the implementation of a cache.

It is hard to delete a cached result when you cache

when you cache a complex query (who has not?).

The next time you run the query,

you first check if it is already in the cache.

When one piece of data changes (for example a table cell)

you need to delete all cached queries who may include that table cell.

Whenever you do a query to your database,

you store the result dataset in cache.

your servers can now horizontally scale and you can already serve thousands of concurrent requests. But somewhere down the road your application gets slower and slower and finally breaks down. The reason:

your database. It's MySQL, isn't it?


Conjuntos de estudio relacionados

Initiation Exam - Baylor University

View Set

ASVAB: Arithmetic Reasoning Questions, *THIS ONE* ASVAB 2020 Arithmetic Reasoning

View Set

introduction to film and television aesthetics

View Set

Ch. 12 -Caring for the Special Needs Child

View Set

Spanish 1 Semester 1 Exam Review guide

View Set