Reading notes of "Technical Architecture of Large Websites: Core Principles and Case Studies"


Database evolution

Read and write separation

This uses the master-slave hot backup feature that most mainstream databases provide: the master database synchronizes its data to one or more slave databases. Writes go to the master, and reads go to the slaves. A dedicated data access module should handle this routing so that read-write separation stays transparent to the application.
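
A minimal sketch of such a data access module, assuming two javax.sql.DataSource handles are already configured; MasterSlaveRouter and its statement-based rule are illustrative, not any particular framework's API:

    import javax.sql.DataSource;

    // Routes writes to the master and reads to the slave so that callers
    // never have to know which database they are talking to.
    public class MasterSlaveRouter {
        private final DataSource master;
        private final DataSource slave;

        public MasterSlaveRouter(DataSource master, DataSource slave) {
            this.master = master;
            this.slave = slave;
        }

        // Route by statement type; a real module would also keep all
        // statements inside one transaction pinned to the master.
        public DataSource route(String sql) {
            String head = sql.trim().toLowerCase();
            return head.startsWith("select") ? slave : master;
        }
    }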

Business sub-library

Deploy different business databases on different physical servers

Distributed database

Splitting a single database across multiple servers (a distributed database cluster) is the last resort, generally used only when the data in a single table grows very large

Application system shared business extraction

Extract business logic shared by multiple applications into a single system, which reduces the number of database connections, and expose it to the other applications as distributed services

Reverse proxy and nginx

Nginx supports both load balancing and reverse proxying:

  • For example, to cluster Tomcat, use server entries inside an upstream block so that nginx load-balances requests across the Tomcat instances:

    upstream tomcatserver {
        server 192.168.72.49:8082;
        server 192.168.72.49:8081;
    }
    server {
        listen       80;
        server_name  localhost;
        location / {
            proxy_pass   http://tomcatserver;
            index  index.html index.htm;
        }
    }
  • A reverse proxy is also used to accelerate website response (a CDN can serve the same purpose): when a user requests a resource that nginx has already cached, nginx returns it directly. To enable nginx's caching function, add two lines:

    # Define the cache storage directory (create it manually). levels=1:2 means
    # first-level cache subdirectories use 1 character of the hashed key and
    # second-level subdirectories use 2. keys_zone allocates a shared memory
    # zone (here 20 MB) holding the cache keys and metadata: a lookup checks
    # this zone first, then reads the cached file from the matching directory.
    # max_size caps the cache at 2048 MB; inactive=60m deletes cached data
    # that has not been accessed within 60 minutes.
    proxy_cache_path /var/www/cache levels=1:2 keys_zone=mycache:20m max_size=2048m inactive=60m;
    # Writing a cache entry may produce temporary files; they are stored here
    # (this directory is created automatically).
    proxy_temp_path /var/www/cache/tmp;
    # Note: a location block must also reference the zone, e.g. "proxy_cache mycache;",
    # for responses to actually be cached.

Distributed message queue

Definition

On a single server, asynchrony can be achieved with a multi-threaded shared memory queue: the thread earlier in the business flow writes data to the queue, and a later thread reads from the queue and processes it. In a distributed system, server clusters achieve asynchrony through a distributed message queue, which can be regarded as a memory queue deployed in distributed form.
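
A minimal single-server sketch using java.util.concurrent.BlockingQueue as the shared memory queue; the class and task names are illustrative:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class AsyncQueueDemo {
        public static void main(String[] args) {
            BlockingQueue<String> queue = new ArrayBlockingQueue<>(1000);

            // Producer: the thread earlier in the business flow writes to the queue.
            Thread producer = new Thread(() -> {
                for (int i = 0; i < 5; i++) {
                    try {
                        queue.put("order-" + i); // blocks if the queue is full
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });

            // Consumer: a later thread reads from the queue and processes the
            // data; it runs until the process is interrupted.
            Thread consumer = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        System.out.println("processing " + queue.take());
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });

            producer.start();
            consumer.start();
        }
    }

A distributed message queue product plays the same role across servers, with network calls taking the place of the shared in-process queue.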

Effects

  • Improve system availability: when the consumer server fails, data simply accumulates in the message queue server while the producer server continues to handle business requests, so the system as a whole stays operational. Once the consumer server recovers, it resumes processing the data in the message queue.

  • Speed up website response: after handling a business request, the producer server at the front of the business flow writes the data into the message queue and returns immediately, without waiting for the consumer server to finish processing, so response latency drops.

  • Eliminate peaks in concurrent access: user traffic is random, with peaks and troughs. Even when a site is planned and deployed for typical peak load, sudden events such as promotions on shopping sites or trending topics on Weibo can spike concurrency, overload the whole site, slow responses, and in severe cases bring services down. Putting the surge of request data into a message queue and letting the consumer server process it in turn keeps the extra load from overwhelming the site.

Website performance optimization

Browser side

  • Use browser cache
  • Use page compression
  • Reasonable layout of pages
  • Reduce cookie transmission

Dynamic and static separation

  • Use a CDN to distribute static content to the network provider's data center closest to the user
  • Deploy a reverse proxy server in the central data center to cache hot files

Application server side

  • Use distributed cache
  • Use local cache

Other

  • Asynchronous: use a message queue to support asynchronous operations, i.e., send user requests to the message queue for later processing and respond to the user immediately
  • Cluster: for highly concurrent requests, form multiple application servers into a cluster that serves traffic together. Note that in a cluster you should avoid keeping important session data on the local server; store it in a distributed cache or similar, so that a server crash does not lose session data

Code level

  • Use multithreading
  • Improve memory management

Database server side

  • Database: index, cache, SQL optimization, etc.
  • NoSQL: optimize the data model and storage structure, and take advantage of its scaling features

Availability

That is, the site should be usable around the clock (24/7). Two nines (99%) counts as basically available; three nines (99.9%) as relatively high availability; four nines (99.99%) as high availability; five nines (99.999%) as extremely high availability. Mainstream large websites generally achieve four nines or better, i.e., at most about 53 minutes of downtime per year. The main means of achieving high availability is redundancy: server clusters, hot backup of data, and master-slave failover.

Scalability

Large-scale websites face huge numbers of concurrently accessing users and must store massive amounts of data; no single server can handle all the requests or store all the data, hence clusters. Scalability means relieving the ever-growing pressure of concurrent access and data storage by continuously adding servers to the cluster.

The main criteria for measuring scalability are whether a cluster can be built from multiple servers, whether new servers can be added to the cluster easily, whether a newly added server provides exactly the same service as the original servers, and whether there is a limit on how many servers the cluster can hold.

Achieve scalability:

  • For application server clusters: as long as servers store no data locally (beyond a local cache), all servers are equivalent peers, and the cluster scales well.
  • For cache server clusters: adding a new server can invalidate the cache routing and make most of the cached data in the cluster unreachable. An improved routing algorithm keeps the cached data accessible: consistent hashing, further refined by splitting each physical node into many virtual nodes, so that a new node takes load evenly from all the original servers (see the sketch after this list).
  • For database clusters: although relational databases support data replication and master-slave hot backup, they are hard to scale out, so scalability has to be built outside the database, e.g., by routing and partitioning data across multiple database servers that together form a cluster.
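
A minimal sketch of a consistent-hash ring with virtual nodes; the class and method names are illustrative:

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.SortedMap;
    import java.util.TreeMap;

    // Each physical cache server is hashed onto the ring many times (virtual
    // nodes), so adding or removing a server only remaps a small, evenly
    // spread slice of the keys.
    public class ConsistentHash {
        private final TreeMap<Long, String> ring = new TreeMap<>();
        private final int virtualNodes;

        public ConsistentHash(int virtualNodes) { this.virtualNodes = virtualNodes; }

        public void addServer(String server) {
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }

        public void removeServer(String server) {
            for (int i = 0; i < virtualNodes; i++) {
                ring.remove(hash(server + "#" + i));
            }
        }

        // Route a cache key to the first virtual node clockwise on the ring
        // (assumes at least one server has been added).
        public String route(String key) {
            SortedMap<Long, String> tail = ring.tailMap(hash(key));
            return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
        }

        private long hash(String s) {
            try {
                byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
                long h = 0;
                for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xFF); // first 8 digest bytes
                return h;
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException(e);
            }
        }
    }

With enough virtual nodes per server, a newly added server takes an approximately equal small share of keys from every existing server instead of invalidating most of the cache.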

Extensibility

The main criterion for measuring extensibility is whether new business products can be added transparently and without impact on existing products. Coupling between products should be low, so that changing one product does not affect another.

Achieve extensibility:

  • Event-driven architecture: websites usually realize it with message queues. User requests and other business events are packaged as messages and published to the message queue; message handlers, acting as consumers, take messages from the queue and process them. Message production and message processing are thereby decoupled, so new producer tasks or new consumer tasks can be added transparently.
  • Distributed services: separate reusable services from business logic and call them through a distributed service framework. New products implement their own logic by calling the reusable services, with no impact on existing products. When a reusable service is upgraded, publishing multiple service versions lets applications migrate transparently instead of being forced to change in step (a minimal illustration follows this list).
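
A minimal illustration of the multi-version idea, with an in-process registry standing in for a real distributed service framework; all names here are illustrative:

    import java.util.Map;

    public class ServiceVersioningDemo {
        // Callers depend only on the stable interface.
        public interface TradeService {
            String placeOrder(String productId);
        }

        static class TradeServiceV1 implements TradeService {
            public String placeOrder(String productId) { return "v1 order for " + productId; }
        }

        static class TradeServiceV2 implements TradeService {
            public String placeOrder(String productId) { return "v2 order for " + productId; }
        }

        public static void main(String[] args) {
            // A real framework would resolve service versions over the
            // network; a map stands in for the registry here.
            Map<String, TradeService> registry = Map.of(
                    "trade:1.0", new TradeServiceV1(),
                    "trade:2.0", new TradeServiceV2());

            // Existing products keep calling 1.0 while new products adopt
            // 2.0, so neither forces the other to change.
            System.out.println(registry.get("trade:1.0").placeOrder("p42"));
            System.out.println(registry.get("trade:2.0").placeOrder("p42"));
        }
    }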

Website optimization methods

Front end

  • Optimize page HTML
  • Take advantage of the browser's concurrency and asynchrony
  • Reduce HTTP requests
  • Enable GZip compression for HTML, CSS, and JavaScript files
  • Put CSS at the top of the page and JavaScript at the bottom
  • Reduce cookie transmission
  • Use a browser caching strategy
  • Use a CDN and a reverse proxy

Back end

Cache

  1. Frequently modified data: caching pays off only when the read-write ratio is at least 2:1, i.e., data written once is read at least twice before being written again; frequently modified data should not be cached.
  2. Access without hotspots: if data access does not follow the 80/20 rule, i.e., there is no hot subset of data, caching is of little use, since most data is evicted before it is read again.
  3. Data inconsistency and dirty reads: two strategies are common, one is to set a reasonable expiration time, the other is to update the cache immediately when the data changes.
  4. Cache availability: a cache outage puts sudden pressure on the database and needs attention. The options are cache hot standby, where a downed cache server is replaced by a standby (not recommended), and a distributed cache cluster.
  5. Cache warm-up: a newly started cache holds no hot data, so hot data should be loaded at startup; this preloading is cache warm-up.
  6. Cache penetration: if high-concurrency requests keep asking for data that does not exist, cache those keys too (with a null value) so the requests stop hitting the database (see the cache-aside sketch after this list).
  7. Common architectures: one is the update-synchronizing cache represented by JBoss Cache, where each application server runs its own cache instance and the instances synchronize updates with one another; the other is the non-communicating cache represented by memcached, where the memcached instances in the cluster together form one large cache.
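
A minimal cache-aside sketch with null caching against penetration; the in-process map stands in for a distributed cache, and all names are illustrative:

    import java.util.Map;
    import java.util.Optional;
    import java.util.concurrent.ConcurrentHashMap;

    public class UserCache {
        // Optional.empty() records "the database has no such row", so
        // repeated requests for a missing id stop reaching the database.
        private final Map<Long, Optional<String>> cache = new ConcurrentHashMap<>();
        private final UserDao dao;

        public UserCache(UserDao dao) { this.dao = dao; }

        public Optional<String> findUser(long id) {
            Optional<String> hit = cache.get(id);
            if (hit != null) {
                return hit; // cache hit, possibly a cached "not found"
            }
            Optional<String> fromDb = dao.load(id); // miss: fall through to the database
            cache.put(id, fromDb);                  // cache the result, even when absent
            return fromDb;
        }

        public interface UserDao {
            Optional<String> load(long id);
        }
    }

A production version would also add the expiration discussed in point 3, since this map otherwise grows without bound.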

Cluster

Use load balancing for clustering

Asynchronous operation

With a message queue, a front-end request arrives at the application server, which hands it to the message queue server to be processed one by one. Instead of front-end users queuing up (i.e., waiting), the queuing is all handed to the back end, and the front end returns a response to the user immediately.

Code optimization

  1. Use multithreading
    1. Reason: while waiting on IO, the CPU can do other work; and with multiple CPUs, fully using them requires starting multiple threads.
    2. Threads to start = [task execution time / (task execution time - IO wait time)] × number of CPU cores, so more threads pay off when tasks spend most of their time on IO and when there are many cores. For example, a task taking 100 ms, of which 90 ms is IO wait, on 8 cores gives [100 / (100 - 90)] × 8 = 80 threads.
    3. Thread safety: design objects to be stateless, use local objects (objects created inside a method), and use locks when accessing shared resources concurrently.
  2. Resource reuse

    Reduce the creation and destruction of expensive resources such as database connections, network connections, threads, and complex objects. Two common patterns are the singleton and the object pool, e.g., singleton Service and Dao objects, connection pools, and thread pools (see the pool sketch after this list).

  3. Data structures
  4. Garbage collection

    Young Generation (Eden + From + To) + Old Generation
    Objects are created in the Eden area; when it fills up, a Young GC is triggered and the objects still in use in Eden are copied to the From area. When Eden fills again, the live objects in Eden and From are copied to the To area; the next time, the live objects in Eden and To are copied back to From, and so on. An object that survives beyond a certain threshold without being released is promoted to the Old Generation. When the Old Generation fills up, a Full GC (a full collection, which hurts performance) runs, so Full GCs should be minimized.
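
A minimal object-pool sketch for the resource-reuse point above; SimplePool is illustrative, and real systems would reach for a mature connection or thread pool:

    import java.util.Collection;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Expensive objects (connections, threads, buffers) are created once
    // and recycled instead of being constructed per request.
    public class SimplePool<T> {
        private final BlockingQueue<T> idle;

        public SimplePool(Collection<T> objects) {
            this.idle = new ArrayBlockingQueue<>(objects.size(), false, objects);
        }

        public T borrow() throws InterruptedException {
            return idle.take(); // blocks until an object is free
        }

        public void giveBack(T object) {
            idle.offer(object); // hand the object back for reuse
        }
    }

JDBC connection pools and thread pools follow the same borrow/return discipline.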

Storage performance optimization

  • Mechanical hard disk vs. solid-state disk: for random access, a mechanical disk has to move its head arm repeatedly, so its performance is weak. A solid-state disk (SSD, or flash disk) stores data persistently in silicon and supports fast random access much like memory, so SSD performance is better.

  • B+ tree vs. LSM tree: to keep data ordered through updates, inserts, and deletes, traditional relational databases use B+ trees. Many NoSQL products now use LSM trees, which perform better, and some relational databases have begun to use LSM trees as well.

Session management

  1. Session replication: the application servers enable the web container's session replication, and every server synchronizes session data. At scale this consumes large amounts of server and network resources, so it only suits small clusters.

  2. Session binding: the load balancer always sends requests from the same IP to the same server, e.g., nginx's ip_hash strategy. If the user's IP changes, the session can be lost, so a cookie identifier can replace the IP (the load balancer must then work at the HTTP protocol layer). But if a server goes down, the sessions stored on it still disappear, so this is not recommended where high availability is required.


  3. Cookie-stored session: the cookie carries the session with every request. Drawbacks: cookie size may be limited, carrying the session hurts performance, and users may disable cookies. So this is not recommended.

  4. Session server (recommended): there are two flavors, depending on requirements. One wraps a distributed cache or database to provide session storage and access; the other is a dedicated session service platform developed for needs such as single sign-on (a sketch follows this list).
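
A minimal sketch of the first flavor; SessionStore is an illustrative interface, and the in-memory map only keeps the sketch self-contained where a distributed cache or database would really sit:

    import java.util.Map;
    import java.util.UUID;
    import java.util.concurrent.ConcurrentHashMap;

    public class SessionServerDemo {
        // Sessions live in a shared store rather than on any one application
        // server, so every server in the cluster can serve every request.
        public interface SessionStore {
            void put(String sessionId, Map<String, Object> data);
            Map<String, Object> get(String sessionId);
        }

        // In-memory stand-in used only to keep the sketch runnable.
        static class InMemoryStore implements SessionStore {
            private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();
            public void put(String id, Map<String, Object> data) { sessions.put(id, data); }
            public Map<String, Object> get(String id) { return sessions.get(id); }
        }

        public static void main(String[] args) {
            SessionStore store = new InMemoryStore();
            String sessionId = UUID.randomUUID().toString(); // returned to the client as a cookie
            store.put(sessionId, Map.of("userId", 42L));
            // Any other application server can now look the session up by id.
            System.out.println(store.get(sessionId));
        }
    }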

Load balancing implementation method

  1. HTTP redirection: the user's request reaches the HTTP redirect server (the load balancer), which computes the real target server and sends it back to the user in a 302 redirect. Drawbacks: the client needs two requests to complete one operation, so performance is poor, and search engines may judge the 302 redirects as SEO cheating and lower the ranking (a toy sketch follows this list).

  2. DNS resolution: in the DNS configuration (e.g., the Aliyun console), multiple A records, each pointing to a different target server IP, are set for the same domain name; each resolution returns a different real target IP according to the load balancing algorithm. Drawbacks: changing an A record (say, when a server goes offline) takes a long time to propagate (minutes) because of caching, and control of DNS rests with the domain name provider, which limits how much it can be improved and managed.

  3. Reverse proxy: reverse proxies are mostly used for dynamic/static separation, serving some static resources directly instead of fetching them from the web servers, which speeds up responses. A reverse proxy can also load-balance the dynamic traffic: servers such as Apache and nginx manage a group of web servers and forward each request to one of them according to the load balancing algorithm. Drawback: every request and response passes through the reverse proxy, which can become the performance bottleneck (this is application-layer load balancing).

  4. IP-layer load balancing: the core is manipulating packets by rewriting the destination IP address, and sometimes the source address. When a request reaches the load balancing server, the packet's destination IP is rewritten to the real target server chosen by the algorithm. The source address can also be rewritten from the client IP to the load balancer's IP so that the response flows back through the load balancer and then to the client (alternatively, the load balancer is made the gateway of the real server cluster, so all response traffic passes through it anyway).

  5. Data link layer load balancing: similar to IP-layer balancing, except that during distribution the MAC address is rewritten rather than the IP address. All machines in the real server cluster must be configured with a virtual IP identical to the load balancer's IP. The response can then travel straight from the real server to the client without address translation at the load balancer, forming a triangular transmission pattern. Because responses bypass the load balancer, its network bandwidth does not become a bottleneck. This is currently the most widely used load balancing method on large websites; the best-known open source product is LVS (Linux Virtual Server).
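
A toy sketch of method 1, HTTP-redirect balancing, using the JDK's built-in com.sun.net.httpserver; the server addresses and port are illustrative:

    import com.sun.net.httpserver.HttpServer;
    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.util.concurrent.atomic.AtomicInteger;

    // Answers every request with a 302 pointing at one of the real servers,
    // chosen round-robin.
    public class RedirectBalancer {
        private static final String[] SERVERS = {
                "http://192.168.72.49:8081", "http://192.168.72.49:8082"
        };
        private static final AtomicInteger next = new AtomicInteger();

        public static void main(String[] args) throws IOException {
            HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
            server.createContext("/", exchange -> {
                int i = Math.floorMod(next.getAndIncrement(), SERVERS.length);
                exchange.getResponseHeaders().add("Location",
                        SERVERS[i] + exchange.getRequestURI());
                exchange.sendResponseHeaders(302, -1); // -1: no response body
                exchange.close();
            });
            server.start();
        }
    }

The sketch also makes the drawback visible: the client must make a second request to the chosen server before any real work happens.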
