tl;dr: Heave's Catalog is now thread-safe. A Mutex guards its in-memory cache, so you can wrap a Catalog in an Arc and share it across threads without managing your own locking.
Heave is now Thread-Safe! A Major Milestone
I'm incredibly excited to announce that the Heave library has just hit a major milestone: the Catalog is now thread-safe! This has been on the roadmap for a while, and it marks a significant step forward in making Heave a more robust and versatile tool for your projects.
What Does "Thread-Safe" Mean for Heave?
At its core, this update introduces a Mutex that protects the Catalog's internal in-memory cache. This means you can now wrap a Catalog instance in an Arc (Atomically Reference Counted pointer) and share it among multiple threads without worrying about data races or inconsistent state.
Before this change, you would have had to manage your own locking mechanisms if you wanted to access a single Catalog from multiple threads. Now, it just works.
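For context, the old workaround looked something like this (a rough sketch; the Arc<Mutex<...>> wrapper was something you had to build yourself, not anything Heave provided):

use heave::Catalog;
use std::sync::{Arc, Mutex};

// Pre-thread-safety: bring your own lock around the whole Catalog
let catalog = Arc::new(Mutex::new(Catalog::new("my_database.db")));

// ...and every call site had to acquire it explicitly:
// catalog.lock().unwrap().upsert(product).unwrap();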
Here’s a quick example of what you can do now:
use heave::Catalog;
use std::sync::Arc;
use std::thread;

// Create a single, shareable Catalog instance
let catalog = Arc::new(Catalog::new("my_database.db"));
catalog.init().unwrap();

let mut handles = vec![];

// Spawn multiple threads to write data concurrently
for i in 0..10 {
    let catalog_clone = Arc::clone(&catalog);
    let handle = thread::spawn(move || {
        // Each thread can safely call upsert
        let product = Product { id: format!("product-{}", i), ... };
        catalog_clone.upsert(product).unwrap();
    });
    handles.push(handle);
}

// Wait for all threads to finish
for handle in handles {
    handle.join().unwrap();
}

// Persist all the changes from all threads in one go
catalog.persist().unwrap();
As you can see, each thread gets a reference to the same Catalog and can safely perform operations on it. The internal Mutex ensures that access to the in-memory item map is serialized, preventing chaos.
Thread Safety vs. Concurrency: A Deliberate Choice
It's important to clarify the distinction between the thread safety I've just added and full database concurrency.
This milestone is about making the in-memory Catalog safe to use from multiple threads. It serializes access to the internal HashMap, so only one thread can modify it at a time.
What it doesn't do (yet) is enable true parallel database operations. When you call persist(), the database write operations are still sequential within that single transaction. This was a deliberate design choice.
By focusing on thread safety first, I can provide a solid, safe foundation without introducing the complexity of managing a concurrent database pool. This approach allows me to defer and delegate concurrency to you, the user. You can build your own threading model (like the example above) and manage your workloads, while Heave ensures that the underlying data structure remains consistent. It keeps the library lean and flexible.
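To make that distinction concrete, here's a minimal sketch of what a Mutex-guarded cache of this shape looks like. The struct layout, field names, and persist loop are illustrative assumptions, not Heave's actual internals:

use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-ins for the real types
struct Product {
    id: String,
}

struct Catalog {
    // The Mutex serializes all access to the in-memory item map
    items: Mutex<HashMap<String, Product>>,
}

impl Catalog {
    fn upsert(&self, product: Product) {
        // Only one thread can touch the map at a time, which is why
        // upsert can take &self and still be safe to call from many threads
        let mut items = self.items.lock().unwrap();
        items.insert(product.id.clone(), product);
    }

    fn persist(&self) {
        // Writes still happen one after another inside a single transaction;
        // nothing here runs database operations in parallel
        let items = self.items.lock().unwrap();
        for (_id, _product) in items.iter() {
            // INSERT OR REPLACE ... executed sequentially
        }
    }
}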
What’s Next?
Of course, I'm always thinking about the next steps. The most logical evolution from here is to improve database access performance.
I'm currently evaluating the integration of a database connection pool (like r2d2 or bb8 for rusqlite). This would allow for true concurrent database reads and could significantly speed up read-heavy workloads. However, I want to be thoughtful about this. Adding a connection pool introduces another layer of complexity, and I want to ensure it’s done in a way that feels right for Heave and doesn't compromise its simplicity.
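For the curious, the plumbing for that would likely follow the standard r2d2 + rusqlite pattern, roughly like the sketch below. This is generic r2d2_sqlite usage with an assumed products table, not anything that ships in Heave today:

use r2d2_sqlite::SqliteConnectionManager;
use std::thread;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // A pool of SQLite connections to the same database file
    let manager = SqliteConnectionManager::file("my_database.db");
    let pool = r2d2::Pool::builder().max_size(4).build(manager)?;

    let mut handles = vec![];
    for _ in 0..4 {
        let pool = pool.clone();
        handles.push(thread::spawn(move || {
            // Each thread checks out its own connection, so reads can overlap
            let conn = pool.get().unwrap();
            let count: i64 = conn
                .query_row("SELECT count(*) FROM products", [], |row| row.get(0))
                .unwrap();
            println!("products: {}", count);
        }));
    }
    for handle in handles {
        handle.join().unwrap();
    }
    Ok(())
}

Because each thread holds its own connection, reads no longer queue behind a single handle, which is where the speedup for read-heavy workloads would come from.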
For now, I'm happy with the balance we've struck.
Thank you for following the development of Heave! Your feedback is always welcome.
Happy coding!
tags: #rust, #database, #project:heave