When to use NoSQL over SQL
Deciding between SQL and NoSQL is an important decision. SQL databases like PostgreSQL can store unstructured data as JSON (usually via the binary JSONB type), but working with it gets troublesome once the data is deeply nested.
On the other hand, most data models involve relationships, which can make NoSQL a poor fit in many cases. So this article covers when to use SQL, when to use NoSQL, and the few cases where a polyglot approach using both can be optimal. The examples use Go code snippets.
For simplicity's sake I will use the same mutations and table content in both the SQL and NoSQL examples for comparison purposes.
When to use SQL
Structured Data: SQL databases are designed for structured data with well-defined schemas. If your data can be organized into tables with fixed columns and data types, SQL is a good choice.
ACID Compliance: If your application requires ACID (Atomicity, Consistency, Isolation, Durability) transactions, SQL databases are a better fit.
Scalability: SQL databases can handle vertical scaling (increasing server resources) effectively. If your application can run on a single server with sufficient resources, SQL may be suitable.
package main

import (
	"fmt"
	"sync"
)

type User struct {
	ID      int
	Balance float64
}

// UserStore represents a simple in-memory user store.
type UserStore struct {
	mu    sync.Mutex
	users map[int]*User
}

func NewUserStore() *UserStore {
	return &UserStore{
		users: make(map[int]*User),
	}
}

func (us *UserStore) GetUser(userID int) (*User, error) {
	us.mu.Lock()
	defer us.mu.Unlock()
	user, exists := us.users[userID]
	if !exists {
		return nil, fmt.Errorf("user not found")
	}
	return user, nil
}

func (us *UserStore) AddBalance(userID int, amount float64) error {
	us.mu.Lock()
	defer us.mu.Unlock()
	user, exists := us.users[userID]
	if !exists {
		return fmt.Errorf("user not found")
	}
	user.Balance += amount
	return nil
}

func (us *UserStore) SubtractBalance(userID int, amount float64) error {
	us.mu.Lock()
	defer us.mu.Unlock()
	user, exists := us.users[userID]
	if !exists {
		return fmt.Errorf("user not found")
	}
	if user.Balance < amount {
		return fmt.Errorf("insufficient balance")
	}
	user.Balance -= amount
	return nil
}

func main() {
	userStore := NewUserStore()

	// Create a user and add some initial balance.
	user := &User{ID: 1, Balance: 100.0}
	userStore.users[user.ID] = user

	// Add funds to the user's balance.
	err := userStore.AddBalance(user.ID, 50.0)
	if err != nil {
		fmt.Println("Error:", err)
	}

	// Subtract funds from the user's balance.
	err = userStore.SubtractBalance(user.ID, 30.0)
	if err != nil {
		fmt.Println("Error:", err)
	}

	// Get the user's balance.
	user, err = userStore.GetUser(user.ID)
	if err != nil {
		fmt.Println("Error:", err)
	} else {
		fmt.Printf("User %d Balance: %.2f\n", user.ID, user.Balance)
	}
}
There is a lot going on here (especially if you are not familiar with Go; if so, go learn it!).
To cover it briefly: a User struct and a UserStore are defined to manage users and their balances (in a relational database these would be two separate, related tables). The store provides methods to add to and subtract from a user's balance while handling concurrency with a mutex.
You can scale this code vertically by optimizing it for better performance or by deploying it on a more powerful server to handle a larger number of users and balances.
While horizontal scaling is possible with SQL, it can get problematic.
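Before moving on, it is worth making the ACID point concrete. The snippet above only mimics in memory what a real database gives you. Below is a minimal sketch of a balance transfer wrapped in an actual SQL transaction with database/sql and lib/pq. The users table, its columns, and the connection string are assumptions for illustration, not part of the earlier example; the point is simply that either both updates apply or neither does.

package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq"
)

// transfer moves amount from one user's balance to another inside a single
// transaction, so either both updates are applied or neither is (atomicity).
// A production version would also check RowsAffected to detect an
// insufficient balance; this sketch only shows the transactional shape.
func transfer(db *sql.DB, fromID, toID int, amount float64) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	// Rollback becomes a no-op once Commit succeeds.
	defer tx.Rollback()

	if _, err := tx.Exec(
		"UPDATE users SET balance = balance - $1 WHERE id = $2 AND balance >= $1",
		amount, fromID); err != nil {
		return err
	}
	if _, err := tx.Exec(
		"UPDATE users SET balance = balance + $1 WHERE id = $2",
		amount, toID); err != nil {
		return err
	}
	return tx.Commit()
}

func main() {
	// Assumed connection string, same style as the later examples.
	db, err := sql.Open("postgres", "user=username dbname=mydb sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := transfer(db, 1, 2, 30.0); err != nil {
		log.Fatal(err)
	}
}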
Let’s now model semi-structured or unstructured data with SQL and show how it can get overcomplicated, using an e-commerce system where you store product information in a PostgreSQL table. Each product has multiple attributes, each attribute has options, each option has sub-options, and so on, five levels deep.
{ "product_name": "Example Product", "attributes": [ { "attribute_name": "Size", "options": [ { "option_name": "Small", "sub_options": [ { "sub_option_name": "Red", "sub_sub_options": [ { "sub_sub_option_name": "Stripe", "sub_sub_sub_options": [ { "sub_sub_sub_option_name": "Wool", "price": 19.99 }, { "sub_sub_sub_option_name": "Cotton", "price": 14.99 } ] } ] } ] } ] } ] }
In a real-world scenario, you would typically design your database schema to accommodate structured data like this with multiple related tables rather than deeply nesting data.
CREATE TABLE products (
    id serial PRIMARY KEY,
    name varchar(255) NOT NULL
);

CREATE TABLE attributes (
    id serial PRIMARY KEY,
    product_id int REFERENCES products(id),
    name varchar(255) NOT NULL
);

CREATE TABLE options (
    id serial PRIMARY KEY,
    attribute_id int REFERENCES attributes(id),
    name varchar(255) NOT NULL
);

CREATE TABLE sub_options (
    id serial PRIMARY KEY,
    option_id int REFERENCES options(id),
    name varchar(255) NOT NULL,
    price numeric(10, 2)
);
The Go code for the tables above would look something like this.
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

func main() {
	connStr := "user=username dbname=mydb sslmode=disable"
	db, err := sql.Open("postgres", connStr)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Insert product data.
	_, err = db.Exec("INSERT INTO products (name) VALUES ($1)", "Example Product")
	if err != nil {
		log.Fatal(err)
	}

	// Insert attribute, option, and sub-option data.
	_, err = db.Exec(`
		INSERT INTO attributes (product_id, name)
		VALUES ($1, $2)`, 1, "Size")
	if err != nil {
		log.Fatal(err)
	}

	_, err = db.Exec(`
		INSERT INTO options (attribute_id, name)
		VALUES ($1, $2)`, 1, "Small")
	if err != nil {
		log.Fatal(err)
	}

	_, err = db.Exec(`
		INSERT INTO sub_options (option_id, name, price)
		VALUES ($1, $2, $3)`, 1, "Red", 19.99)
	if err != nil {
		log.Fatal(err)
	}

	// Query product data with attributes, options, and sub-options.
	rows, err := db.Query(`
		SELECT p.name AS product_name,
		       a.name AS attribute_name,
		       o.name AS option_name,
		       so.name AS sub_option_name,
		       so.price
		FROM products p
		JOIN attributes a ON p.id = a.product_id
		JOIN options o ON a.id = o.attribute_id
		JOIN sub_options so ON o.id = so.option_id
		WHERE p.id = $1`, 1)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var productName, attributeName, optionName, subOptionName string
		var price float64
		if err := rows.Scan(&productName, &attributeName, &optionName, &subOptionName, &price); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("Product: %s, Attribute: %s, Option: %s, Sub-Option: %s, Price: %.2f\n",
			productName, attributeName, optionName, subOptionName, price)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
With JSONB it might be more readable, but the query and the code needed to unpack the result still get quite large.
package main

import (
	"database/sql"
	"encoding/json"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

func main() {
	connStr := "user=username dbname=mydb sslmode=disable"
	db, err := sql.Open("postgres", connStr)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Insert product data with JSONB.
	_, err = db.Exec(`
		INSERT INTO products (name, data)
		VALUES ($1, $2)`, "Example Product", `
		{
			"attributes": [
				{
					"name": "Size",
					"options": [
						{
							"name": "Small",
							"sub_options": [
								{
									"name": "Red",
									"price": 19.99
								}
							]
						}
					]
				}
			]
		}
	`)
	if err != nil {
		log.Fatal(err)
	}

	// Query product data with JSONB.
	rows, err := db.Query(`
		SELECT name, data
		FROM products
		WHERE id = $1`, 1)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var productName string
		var jsonData string // Assuming your JSONB data is scanned as text from the database.
		if err := rows.Scan(&productName, &jsonData); err != nil {
			log.Fatal(err)
		}

		// Parse the JSONB data into a struct or map for easier manipulation.
		var productData map[string]interface{}
		if err := json.Unmarshal([]byte(jsonData), &productData); err != nil {
			log.Fatal(err)
		}

		// Access and print product details.
		fmt.Printf("Product: %s\n", productName)
		attributes := productData["attributes"].([]interface{})
		for _, attr := range attributes {
			attrMap := attr.(map[string]interface{})
			fmt.Printf("  Attribute: %s\n", attrMap["name"].(string))
			options := attrMap["options"].([]interface{})
			for _, opt := range options {
				optMap := opt.(map[string]interface{})
				fmt.Printf("    Option: %s\n", optMap["name"].(string))
				subOptions := optMap["sub_options"].([]interface{})
				for _, subOpt := range subOptions {
					subOptMap := subOpt.(map[string]interface{})
					fmt.Printf("      Sub-Option: %s, Price: %.2f\n",
						subOptMap["name"].(string), subOptMap["price"].(float64))
				}
			}
		}
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
Another issue: you can model the data as related tables instead of a JSON column, but adding a new input field then becomes problematic, because you have to alter the table every time, unlike in NoSQL.
ALTER TABLE table_name ADD column_name datatype;
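As a rough sketch of what that means in application code (the description column is hypothetical, invented purely for illustration), every new field turns into a migration that has to run before the application can write it:

package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq"
)

func main() {
	db, err := sql.Open("postgres", "user=username dbname=mydb sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// The schema change must be applied first; only then can the new
	// column be written. The column is hypothetical and only illustrative.
	if _, err := db.Exec(`ALTER TABLE products ADD COLUMN IF NOT EXISTS description text`); err != nil {
		log.Fatal(err)
	}
	if _, err := db.Exec(`UPDATE products SET description = $1 WHERE id = $2`,
		"New field that every existing row now carries", 1); err != nil {
		log.Fatal(err)
	}
}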
In most cases SQL solves the problem well, but in the e-commerce example the query and code get quite complex, and performance can degrade when there is a lot of unstructured data, i.e. data without relationships.
When to use NoSQL
While NoSQL isn’t optimal when there are many relationships or when vertical scaling is sufficient, it has its own set of benefits and use cases.
Unstructured or Semi-Structured Data: NoSQL databases are more flexible when it comes to handling unstructured or semi-structured data, such as JSON or XML documents.
Horizontal Scalability: NoSQL databases are designed for horizontal scaling, making them a good choice for applications that need to distribute data across multiple servers or nodes.
Speed and Low Latency: NoSQL databases are often optimized for low-latency read and write operations, making them suitable for applications that require rapid data retrieval.
Other factors include high write throughput and flexible queries.
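As a small sketch of the low-latency point before the full example below (the index itself is my addition, not something the later snippet requires), the mongo-driver lets you create an index on the fields you query by, so lookups do not have to scan the whole collection:

package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	clientOptions := options.Client().ApplyURI("mongodb://localhost:27017")
	client, err := mongo.Connect(context.Background(), clientOptions)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(context.Background())

	transactions := client.Database("mydb").Collection("transactions")

	// Index transactions by user_id so balance lookups stay fast as the
	// collection grows.
	indexName, err := transactions.Indexes().CreateOne(context.Background(), mongo.IndexModel{
		Keys: bson.D{{Key: "user_id", Value: 1}},
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("created index:", indexName)
}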
package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

type User struct {
	ID      int     `bson:"_id"`
	Balance float64 `bson:"balance"`
}

type Transaction struct {
	ID     int     `bson:"_id"`
	UserID int     `bson:"user_id"`
	Amount float64 `bson:"amount"`
}

func main() {
	// MongoDB connection settings
	clientOptions := options.Client().ApplyURI("mongodb://localhost:27017")
	client, err := mongo.Connect(context.Background(), clientOptions)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(context.Background())

	// MongoDB database and collections
	db := client.Database("mydb")
	usersCollection := db.Collection("users")
	transactionsCollection := db.Collection("transactions")

	// Create a user and add some initial balance
	user := User{ID: 1, Balance: 100.0}
	_, err = usersCollection.InsertOne(context.Background(), user)
	if err != nil {
		log.Fatal(err)
	}

	// Add a transaction for the user
	transaction := Transaction{ID: 1, UserID: user.ID, Amount: 50.0}
	_, err = transactionsCollection.InsertOne(context.Background(), transaction)
	if err != nil {
		log.Fatal(err)
	}

	// Subtract funds from the user's balance
	transaction = Transaction{ID: 2, UserID: user.ID, Amount: -30.0}
	_, err = transactionsCollection.InsertOne(context.Background(), transaction)
	if err != nil {
		log.Fatal(err)
	}

	// Calculate the user's balance by summing transactions
	cursor, err := transactionsCollection.Find(context.Background(), bson.M{"user_id": user.ID})
	if err != nil {
		log.Fatal(err)
	}
	defer cursor.Close(context.Background())

	var totalBalance float64
	for cursor.Next(context.Background()) {
		var t Transaction
		if err := cursor.Decode(&t); err != nil {
			log.Fatal(err)
		}
		totalBalance += t.Amount
	}
	if cursor.Err() != nil {
		log.Fatal(cursor.Err())
	}

	fmt.Printf("User %d Balance: %.2f\n", user.ID, totalBalance)
}
The NoSQL snippet handles data with few relationships well, but as relationships increase, scalability and maintainability can become troublesome. Let’s see an example of accessing deeply nested values, using the same e-commerce data as in the SQL example.
package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

type Product struct {
	ID         int         `bson:"_id"`
	Name       string      `bson:"name"`
	Attributes []Attribute `bson:"attributes"`
}

type Attribute struct {
	Name    string   `bson:"name"`
	Options []Option `bson:"options"`
}

type Option struct {
	Name       string      `bson:"name"`
	SubOptions []SubOption `bson:"sub_options"`
}

type SubOption struct {
	Name  string  `bson:"name"`
	Price float64 `bson:"price"`
}

func main() {
	// MongoDB connection settings
	clientOptions := options.Client().ApplyURI("mongodb://localhost:27017")
	client, err := mongo.Connect(context.Background(), clientOptions)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(context.Background())

	// MongoDB database and collection
	db := client.Database("mydb")
	productsCollection := db.Collection("products")

	// Query products with a specific sub-option
	filter := bson.M{
		"attributes.options.sub_options": bson.M{
			"$elemMatch": bson.M{
				"name":  "Red",
				"price": 19.99,
			},
		},
	}

	cursor, err := productsCollection.Find(context.Background(), filter)
	if err != nil {
		log.Fatal(err)
	}
	defer cursor.Close(context.Background())

	for cursor.Next(context.Background()) {
		var product Product
		if err := cursor.Decode(&product); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("Product: %s\n", product.Name)
		for _, attr := range product.Attributes {
			for _, opt := range attr.Options {
				for _, subOpt := range opt.SubOptions {
					if subOpt.Name == "Red" && subOpt.Price == 19.99 {
						fmt.Printf("  Attribute: %s\n", attr.Name)
						fmt.Printf("  Option: %s\n", opt.Name)
						fmt.Printf("  Sub-Option: %s, Price: %.2f\n", subOpt.Name, subOpt.Price)
					}
				}
			}
		}
	}
	if err := cursor.Err(); err != nil {
		log.Fatal(err)
	}
}
In this example:
- We use the $elemMatch operator in the filter to find products where an attribute option has a sub-option matching the specified criteria.
- We iterate through the cursor to retrieve and print the products that match the criteria.
This code lets you query deeply nested data in a manner similar to the earlier SQL query, but with MongoDB’s query syntax.
Remember that NoSQL’s flexible schema allows you to store and query data in a way that fits your application’s needs, including deeply nested structures.
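To illustrate that flexibility, here is a minimal sketch: two documents with different shapes going into the same products collection, with no ALTER TABLE step. The extra description and tags fields are made up for the example.

package main

import (
	"context"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	client, err := mongo.Connect(context.Background(),
		options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(context.Background())

	products := client.Database("mydb").Collection("products")

	// Two documents with different shapes live side by side in the same
	// collection; the extra "description" and "tags" fields need no migration.
	docs := []interface{}{
		bson.M{"name": "Example Product", "attributes": bson.A{}},
		bson.M{
			"name":        "Another Product",
			"description": "Field that only this document has",
			"tags":        bson.A{"new", "seasonal"},
		},
	}
	if _, err := products.InsertMany(context.Background(), docs); err != nil {
		log.Fatal(err)
	}
}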
Polyglot approach
A polyglot approach in the context of databases refers to using both SQL (relational) and NoSQL (non-relational) databases within a single application or system.
This approach is employed to take advantage of the strengths of each database type to address different data management requirements.
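As a rough sketch of what that can look like (not a production design), the example below keeps users and balances in PostgreSQL for ACID guarantees while the flexible product catalog lives in MongoDB. Connection strings, the users table, and the collection names are assumptions carried over from the earlier examples.

package main

import (
	"context"
	"database/sql"
	"log"

	_ "github.com/lib/pq"
	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	// Relational side: users and balances, where ACID transactions matter.
	pg, err := sql.Open("postgres", "user=username dbname=mydb sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer pg.Close()

	// Document side: the flexible, deeply nested product catalog.
	mongoClient, err := mongo.Connect(context.Background(),
		options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer mongoClient.Disconnect(context.Background())
	catalog := mongoClient.Database("mydb").Collection("products")

	// Write the transactional record to PostgreSQL...
	if _, err := pg.Exec(
		"INSERT INTO users (id, balance) VALUES ($1, $2)", 1, 100.0); err != nil {
		log.Fatal(err)
	}

	// ...and the schema-flexible catalog entry to MongoDB.
	if _, err := catalog.InsertOne(context.Background(), bson.M{
		"name": "Example Product",
		"attributes": bson.A{
			bson.M{"name": "Size", "options": bson.A{bson.M{"name": "Small"}}},
		},
	}); err != nil {
		log.Fatal(err)
	}
}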
I will not dwell on this approach here, but I found a great YouTube video on implementing both approaches:
https://www.youtube.com/watch?v=4Jqf3DLHZ1o
Ultimately, the choice between SQL and NoSQL, or a combination of both, should be based on your application’s specific needs and requirements. Proper data modeling and planning are essential for effective database management and optimal application performance.