Author: Leitao Guo (Database and middleware manager at iQIYI)
Transcreator: Caitin Chen; Editor: Tom Dewan
Finding the right database solution for your application is not easy. At iQIYI, one of the largest online video sites in the world, we're experienced in database selection across several fields: Online Transaction Processing (OLTP), Online Analytical Processing (OLAP), Hybrid Transactional/Analytical Processing (HTAP), SQL, and NoSQL.
Today, I'll share with you:
I hope this post can help you easily find the right database for your applications.
When choosing a database, different people use different criteria:
Database procurement staff pay more attention to purchase costs, including storage and network requirements.
Database administrators (DBAs) care about:
Operation and maintenance costs:
Service stability:
Performance:
Scalability: Whether it's easy to scale horizontally and vertically
Security: Whether it meets audit requirements and prevents SQL injections and information leakage
Application developers care about:
At iQIYI, we mainly use these databases:
Because there are so many types of databases at iQIYI, application developers might not know which database is suitable for their application scenario. Therefore, we categorized these databases by application scenario and database interface, and we built a matrix:
This matrix has these characteristics:
On the left
In the upper left corner
These databases support OLTP workloads and the SQL language. For example, MySQL supports different transaction isolation levels, high QPS, and low latency. We mainly use it to store transaction information and critical data, such as orders and VIP information.
In the lower left corner
We use NoSQL databases to optimize special scenarios. Generally, these databases have simple schemas or are schemaless, and they offer high throughput and low latency. We mainly use them as caches or key-value (KV) databases.
On the right
All are OLAP big data analytical systems, such as ClickHouse and Impala. Generally, they support the SQL language but don't support transactions. They scale well: we can add machines to enlarge data storage capacity. The tradeoff is longer response latency.
Around the two axes’ meeting point
These databases sit between the OLTP and OLAP sides, and we call them HTAP databases. TiDB is one example. When the amount of data is small, they perform well. When the data size is large or the queries are complex, their performance is still acceptable. Generally, to meet different application needs, we use different storage engines and query engines.
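To make the matrix easier to picture, here is a minimal Python sketch that encodes it as a lookup table. It is only an illustration, not a tool we run at iQIYI. The `Region` class, the `MATRIX` table, and the `candidates` function are hypothetical names. Placing Couchbase and HiKV in the lower left corner and labeling TiDB's interface as SQL are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One region of the matrix: a workload type plus a database interface."""
    workload: str    # "OLTP", "OLAP", or "HTAP"
    interface: str   # "SQL" or "NoSQL"
    traits: str      # short summary of the characteristics described above
    examples: tuple  # databases named in this post


# Hypothetical encoding of the matrix. The placement of Couchbase and HiKV
# and the "SQL" label for TiDB are assumptions for illustration only.
MATRIX = (
    Region("OLTP", "SQL", "transactions, high QPS, low latency", ("MySQL",)),
    Region("OLTP", "NoSQL", "simple or no schema, high throughput, low latency",
           ("Couchbase", "HiKV")),
    Region("OLAP", "SQL", "no transactions, scales out, longer latency",
           ("ClickHouse", "Impala")),
    Region("HTAP", "SQL", "good on small data, acceptable on large or complex queries",
           ("TiDB",)),
)


def candidates(workload: str, interface: str) -> list:
    """Return the example databases whose region matches both arguments."""
    return [
        name
        for region in MATRIX
        if region.workload == workload and region.interface == interface
        for name in region.examples
    ]


if __name__ == "__main__":
    print(candidates("OLAP", "SQL"))
    print(candidates("HTAP", "SQL"))
```

A lookup like this only narrows the field to a region of the matrix. The final choice still depends on the selection criteria and decision trees discussed in the rest of this post.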
I'd like to recommend our database selection trees. We developed these trees based on our DBAs’ and application developers’ experience.
When you select a relational database, you can:
Consider your data volume and database scalability.
Make a decision based on:
When we choose a NoSQL database, we must consider many factors to decide whether to use the primary-secondary framework, client sharding, distributed cluster, Couchbase, or HiKV.
I'd like to share with you some tips for selecting a database: