Design A Chat System


Small group chat flow

- First, the message from User A is copied to each group member's message sync queue: one for User B and one for User C.
- On the recipient side, a recipient can receive messages from multiple users. Each recipient has an inbox (message sync queue) that contains messages from different senders.
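The fan-out described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the in-memory `inboxes` dictionary and the `send_to_group` helper are hypothetical stand-ins for real per-user message sync queues.

```python
from collections import defaultdict, deque

# Hypothetical in-memory stand-in: one message sync queue (inbox) per user.
inboxes = defaultdict(deque)


def send_to_group(sender, members, message_id, text):
    """Copy one message into the inbox of every group member except the sender."""
    for member in members:
        if member != sender:
            inboxes[member].append(
                {"message_id": message_id, "from": sender, "text": text}
            )


group = ["A", "B", "C"]
send_to_group("A", group, 1, "hello")  # lands in the inboxes of B and C
send_to_group("C", group, 2, "hi")     # lands in the inboxes of A and B
```

After these two calls, User B's inbox holds messages from two different senders, which is the recipient-side view the card describes. Copying per member is cheap for small groups but grows with group size.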

Long polling drawbacks

- Sender and receiver may not connect to the same chat server. HTTP-based servers are usually stateless. If you use round robin for load balancing, the server that receives the message might not have a long-polling connection with the client who receives the message.
- A server has no good way to tell if a client is disconnected.
- It is inefficient. If a user does not chat much, long polling still makes periodic connections after timeouts.

Relational databases or NoSQL databases for chat?

- The amount of data is enormous for chat systems.
- Only recent chats are accessed frequently. Users do not usually look up old chats.
- Although very recent chat history is viewed in most cases, users might use features that require random access to data, such as search, viewing mentions, and jumping to specific messages. These cases should be supported by the data access layer.
- The read-to-write ratio is about 1:1 for 1 on 1 chat apps.
Given these access patterns, a NoSQL database is the better choice for this case.

How to handle user disconnections? We all wish our internet connection were consistent and reliable. However, that is not always the case, so the design must address this issue.

- When a user disconnects from the internet, the persistent connection between the client and server is lost. A naive way to handle user disconnection is to mark the user as offline and change the status to online when the connection is re-established.
- A better approach is a heartbeat mechanism.

What protocols are used in chat systems?

- When the sender sends a message to the receiver via the chat service, it uses the HTTP protocol.
- The receiver side is a bit more complicated. Since HTTP is client-initiated, it is not trivial to send messages from the server. Over the years, several techniques have been used to simulate a server-initiated connection: polling, long polling, and WebSocket.

1 on 1 chat flow

1. User A sends a chat message to Chat server 1.
2. Chat server 1 obtains a message ID from the ID generator.
3. Chat server 1 sends the message to the message sync queue.
4. The message is stored in a key-value store.
5.a. If User B is online, the message is forwarded to Chat server 2 where User B is connected.
5.b. If User B is offline, a push notification is sent from push notification (PN) servers.
6. Chat server 2 forwards the message to User B. There is a persistent WebSocket connection between User B and Chat server 2.
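The steps above can be sketched in a single process. This is only an illustration under loud assumptions: every name here (`id_generator`, `kv_store`, `online_connections`, `push_notifications`, `handle_message`) is hypothetical, the message sync queue of step 3 is collapsed into a direct function call, and a plain list stands in for User B's WebSocket connection.

```python
import itertools

id_generator = itertools.count(1)  # stand-in for the ID generator (step 2)
kv_store = {}                      # stand-in for the key-value store (step 4)
online_connections = {}            # user -> list standing in for a WebSocket connection
push_notifications = []            # stand-in for the push notification servers


def handle_message(sender, recipient, text):
    """Process one message the way Chat server 1 would, in simplified form."""
    message_id = next(id_generator)                       # step 2
    message = {"id": message_id, "from": sender,
               "to": recipient, "text": text}
    kv_store[message_id] = message                        # step 4
    connection = online_connections.get(recipient)
    if connection is not None:
        connection.append(message)                        # steps 5.a and 6
        return "delivered"
    push_notifications.append((recipient, message_id))    # step 5.b
    return "push_notification"


online_connections["B"] = []                 # User B is online
r1 = handle_message("A", "B", "hi")          # delivered over B's connection
r2 = handle_message("A", "C", "hello")       # C is offline: push notification
```

Both messages end up in the store regardless of whether the recipient is online, so an offline user can still fetch them later.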

What's a websocket?

A WebSocket is a persistent connection between a client and server. WebSockets provide a bidirectional, full-duplex communication channel that is opened with an HTTP handshake and then runs over a single TCP connection. At its core, the WebSocket protocol facilitates message passing between a client and server.

User Login Online presence

After a WebSocket connection is established between the client and the real-time service, User A's online status and last_active_at timestamp are saved in the KV store. The presence indicator shows the user as online after she logs in.

Online presence

An online presence indicator is an essential feature of many chat applications. Usually, you can see a green dot next to a user's profile picture or username.

How do clients and servers communicate with each other in chat systems?

Clients do not communicate directly with each other. Instead, each client connects to a chat service, which must support the following functions:
- Receive messages from other clients.
- Find the right recipients for each message and relay the message to the recipients.
- If a recipient is not online, hold the messages for that recipient on the server until she is online.

Chat System communications via WebSockets

HTTP is a fine protocol to use, but since WebSocket is bidirectional, there is no strong technical reason not to use it for sending as well. Using WebSocket for both sending and receiving simplifies the design and makes implementation on both client and server more straightforward. Since WebSocket connections are persistent, efficient connection management is critical on the server side.

Step 1 - Understand the problem and establish design scope

It is vital to agree on the type of chat app to design. In the marketplace, there are one-on-one chat apps like Facebook Messenger, WeChat, and WhatsApp, office chat apps that focus on group chat like Slack, or game chat apps, like Discord, that focus on large group interaction and low voice chat latency. The first set of clarification questions should nail down what the interviewer has in mind exactly when she asks you to design a chat system. At the very least, figure out if you should focus on a one-on-one chat or group chat app. Some questions you might ask are as follows:
Candidate: What kind of chat app shall we design? 1 on 1 or group based?
Interviewer: It should support both 1 on 1 and group chat.
Candidate: Is this a mobile app? Or a web app? Or both?
Interviewer: Both.
Candidate: What is the scale of this app? A startup app or massive scale?
Interviewer: It should support 50 million daily active users (DAU).
Candidate: For group chat, what is the group member limit?
Interviewer: A maximum of 100 people.
Candidate: What features are important for the chat app? Can it support attachment?
Interviewer: 1 on 1 chat, group chat, online indicator. The system only supports text messages.
Candidate: Is there a message size limit?
Interviewer: Yes, text length should be less than 100,000 characters.
Candidate: Is end-to-end encryption required?
Interviewer: Not required for now but we will discuss that if time allows.
Candidate: How long shall we store the chat history?
Interviewer: Forever.

Heartbeat mechanism to handle user disconnections

Periodically, an online client sends a heartbeat event to presence servers. If the presence servers receive a heartbeat event from the client within a certain time, say x seconds, the user is considered online. Otherwise, the user is considered offline.
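A minimal sketch of that check on the presence server side, assuming an in-memory dictionary. The names `HEARTBEAT_TIMEOUT_S`, `last_heartbeat`, `record_heartbeat`, and `is_online` are hypothetical, and 30 seconds is an arbitrary placeholder for the "x seconds" in the card.

```python
import time

# The "x seconds" window from the text; 30 is an arbitrary assumption.
HEARTBEAT_TIMEOUT_S = 30.0

# user_id -> time of the most recent heartbeat (hypothetical in-memory store)
last_heartbeat = {}


def record_heartbeat(user_id, now=None):
    """Called whenever a heartbeat event arrives from a client."""
    last_heartbeat[user_id] = time.monotonic() if now is None else now


def is_online(user_id, now=None):
    """A user is online only if a heartbeat arrived within the timeout window."""
    now = time.monotonic() if now is None else now
    seen = last_heartbeat.get(user_id)
    return seen is not None and now - seen <= HEARTBEAT_TIMEOUT_S


record_heartbeat("user_a", now=100.0)
online_soon_after = is_online("user_a", now=120.0)  # 20 s since last heartbeat
online_much_later = is_online("user_a", now=131.0)  # 31 s since last heartbeat
```

The explicit `now` parameter exists only to make the example deterministic. A short network blip that ends before the window expires never flips the user to offline, which is the advantage over marking users offline on every disconnect.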

Stateless Services

Stateless services are traditional public-facing request/response services, used to manage the login, signup, user profile, etc. Stateless services sit behind a load balancer whose job is to route requests to the correct services based on the request paths. These services can be monolithic or individual microservices.

HTTP 101 Switching Protocols

The HTTP 101 Switching Protocols response indicates that the server is switching to the protocol that the client requested in its Upgrade request header.

Message table for group chat

The composite primary key is (channel_id, message_id). Channel and group mean the same thing here. channel_id is the partition key because all queries in a group chat operate within a channel.
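To illustrate why channel_id works as the partition key, here is a rough in-memory sketch. It is not how any particular database implements partitioning: the `partitions` dictionary and both helper functions are hypothetical, and message IDs are assumed to be integers.

```python
import bisect
from collections import defaultdict

# partition key (channel_id) -> rows of (message_id, text), kept sorted by message_id
partitions = defaultdict(list)


def insert_message(channel_id, message_id, text):
    """Insert a row into its channel's partition, preserving message_id order."""
    bisect.insort(partitions[channel_id], (message_id, text))


def messages_after(channel_id, last_seen_id, limit=50):
    """Return up to `limit` rows in one channel with message_id > last_seen_id."""
    rows = partitions[channel_id]
    # (n,) sorts before (n, text), so this finds the first row with id > last_seen_id.
    start = bisect.bisect_left(rows, (last_seen_id + 1,))
    return rows[start:start + limit]


insert_message(1, 3, "c")
insert_message(1, 1, "a")
insert_message(1, 2, "b")
insert_message(2, 1, "x")
recent = messages_after(1, 1)
```

Every query touches exactly one partition, and within it the rows are already ordered by message_id, so fetching the messages after a given ID is a range read.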

Why WebSocket?

The idea of WebSockets was born out of the limitations of HTTP-based technology. With HTTP, a client requests a resource, and the server responds with the requested data. HTTP is a strictly unidirectional protocol — any data sent from the server to the client must be first requested by the client.

Stateful Service

The only stateful service is the chat service. The service is stateful because each client maintains a persistent network connection to a chat server. In this service, a client normally does not switch to another chat server as long as the server is still available. The service discovery coordinates closely with the chat service to avoid server overloading.

Service discovery

The primary role of service discovery is to recommend the best chat server for a client based on criteria such as geographical location and server capacity. Apache Zookeeper is a popular open-source solution for service discovery. It registers all the available chat servers and picks the best chat server for a client based on predefined criteria.
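The selection step might look like the sketch below. It does not use the Zookeeper API; the `ChatServer` fields, the `pick_server` function, and the specific ranking rule (same region first, then lowest load) are assumptions chosen only to illustrate "predefined criteria".

```python
from dataclasses import dataclass


@dataclass
class ChatServer:
    name: str
    region: str
    connections: int  # current number of client connections
    capacity: int     # maximum connections the server accepts


def pick_server(servers, client_region):
    """Pick a server with spare capacity, preferring the client's region, then low load."""
    candidates = [s for s in servers if s.connections < s.capacity]
    if not candidates:
        return None
    # False sorts before True, so servers in the client's region rank first.
    return min(
        candidates,
        key=lambda s: (s.region != client_region, s.connections / s.capacity),
    )


registry = [
    ChatServer("chat-1", "eu", 90, 100),
    ChatServer("chat-2", "eu", 10, 100),
    ChatServer("chat-3", "us", 0, 100),
    ChatServer("chat-4", "eu", 100, 100),  # full, never selected
]
best_for_eu = pick_server(registry, "eu")
```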

What kind of storage should we use for a chat system?

Two types of data exist in a typical chat system.
- The first is generic data, such as user profiles, settings, and user friends lists. These data are stored in robust and reliable relational databases.
- The second is unique to chat systems: chat history data. This is stored in a key-value-style NoSQL store such as Cassandra.

Message synchronization across multiple devices

User A has two devices: a phone and a laptop. When User A logs in to the chat app with her phone, it establishes a WebSocket connection with Chat server 1. Similarly, there is a connection between the laptop and Chat server 1. Each device maintains a variable called cur_max_message_id, which keeps track of the latest message ID on the device. Messages that satisfy the following two conditions are considered new messages:
- The recipient ID is equal to the currently logged-in user ID.
- The message ID in the key-value store is larger than cur_max_message_id.
With a distinct cur_max_message_id on each device, message synchronization is easy because each device can fetch its new messages from the KV store independently.
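The two conditions translate directly into a filter. In this sketch, `cur_max_message_id` comes from the text, while the `Device` class, the `sync` method, and the list-based `message_store` are hypothetical stand-ins for a real client and KV store.

```python
class Device:
    """One logged-in device; tracks the latest message ID it has seen."""

    def __init__(self, user_id):
        self.user_id = user_id
        self.cur_max_message_id = 0

    def sync(self, store):
        """Fetch messages addressed to this user that are newer than cur_max_message_id."""
        new_messages = sorted(
            (
                m for m in store
                if m["recipient_id"] == self.user_id
                and m["message_id"] > self.cur_max_message_id
            ),
            key=lambda m: m["message_id"],
        )
        if new_messages:
            self.cur_max_message_id = new_messages[-1]["message_id"]
        return new_messages


message_store = [
    {"message_id": 1, "recipient_id": "A", "text": "hi"},
    {"message_id": 2, "recipient_id": "B", "text": "not for A"},
    {"message_id": 3, "recipient_id": "A", "text": "are you there?"},
]
phone, laptop = Device("A"), Device("A")

first_phone_sync = [m["message_id"] for m in phone.sync(message_store)]
message_store.append({"message_id": 4, "recipient_id": "A", "text": "ping"})
second_phone_sync = [m["message_id"] for m in phone.sync(message_store)]
laptop_sync = [m["message_id"] for m in laptop.sync(message_store)]
```

The phone receives only message 4 on its second sync, while the laptop, which has never synced, receives everything addressed to User A.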

WebSocket establishing connection

WebSocket connections are established by upgrading an HTTP request/response pair. A client that supports WebSockets and wants to establish a connection sends an HTTP request that includes a few required headers:
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: q4xkcO32u266gldTuKaSOw==
Once a client sends the initial request to open a WebSocket connection, it waits for the server's reply. The reply must have an HTTP 101 Switching Protocols response code. After the client receives the server response, the WebSocket connection is open and data transmission can start.
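As a rough sketch, the opening request could be assembled like this. The function name `build_handshake_request`, the host, and the `/chat` path are hypothetical. The key is generated as a base64-encoded 16-byte random value, which is what the WebSocket specification (RFC 6455) requires.

```python
import base64
import os


def build_handshake_request(host, path="/chat"):
    """Build the text of an HTTP request asking to upgrade to WebSocket."""
    # Sec-WebSocket-Key is a fresh 16-byte random nonce, base64-encoded.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade",
        "Upgrade: websocket",
        "Sec-WebSocket-Version: 13",
        f"Sec-WebSocket-Key: {key}",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines)


request_text = build_handshake_request("chat.example.com")
```

In practice a WebSocket client library performs this step; the sketch only shows what goes on the wire.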

WebSocket Protocol

WebSockets begin life as a standard HTTP request and response. Within that request/response chain, the client asks to open a WebSocket connection, and the server responds (if it is able to). If this initial handshake is successful, the client and server have agreed to use the existing TCP/IP connection that was established for the HTTP request as a WebSocket connection. Data can now flow over this connection using a basic framed message protocol. Once both parties acknowledge that the WebSocket connection should be closed, the TCP connection is torn down.

User Logout Online presence

When a user logs out, the online status is changed to offline in the KV store. The presence indicator shows a user is offline.

Chat System High-level design

The chat system is broken down into three major categories: stateless services, stateful services, and third-party integration.

What is the Sec-WebSocket-Key header used for?

It is used to protect against spurious requests to establish a WebSocket connection.
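The protection works because the server must answer with a Sec-WebSocket-Accept header derived from the client's key, which a server that does not speak WebSocket would not produce. Per RFC 6455, the value is the base64-encoded SHA-1 hash of the key concatenated with a fixed GUID. A short sketch (the function name `accept_value` is hypothetical):

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1(
        (sec_websocket_key + WEBSOCKET_GUID).encode("ascii")
    ).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3.
sample_accept = accept_value("dGhlIHNhbXBsZSBub25jZQ==")
# sample_accept == "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

The client repeats the same computation and rejects the connection if the header does not match.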

Long polling

Open a connection and wait until the server returns new information. After receiving data or hitting a timeout, reopen the connection. With long polling, a client makes an HTTP request with a long timeout period, and the server uses that long timeout to push data to the client.
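The server side of a single long-poll request can be sketched with a blocking queue read. This leaves out HTTP entirely: `long_poll` is a hypothetical helper, and a `queue.Queue` stands in for the client's pending messages.

```python
import queue


def long_poll(inbox, timeout_s):
    """Hold one request open until a message arrives or the timeout expires."""
    try:
        return inbox.get(timeout=timeout_s)
    except queue.Empty:
        # Timed out with nothing to send; the client is expected to reconnect.
        return None


pending = queue.Queue()
empty_result = long_poll(pending, timeout_s=0.01)    # nothing arrives in time
pending.put("hello")
message_result = long_poll(pending, timeout_s=0.01)  # returns immediately
```

A real client would call this in a loop, reopening the request after each response or timeout.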

Polling

The most wasteful approach: ask the server at short intervals whether new messages are available. This produces a large number of "empty" requests: the frontend spends the computer's resources on opening and closing connections, and the server spends time generating a response (possibly an empty one).

WebSockets

This protocol establishes a long-lived TCP connection and provides two-way communication between the server and the client: it lays a kind of "channel" into which both sides can send messages and from which they can read them.

