RavenDB has been around for a little over a decade and is currently used by hundreds of customers, including Fortune 500 companies like Toyota and Verizon, but it hasn’t always been smooth sailing. Like any other software, RavenDB has had its growing pains, particularly in the early days when the company was made up of just a handful of developers.
User feedback has always played an important role in our development. Many of RavenDB’s features were added due to popular demand, and our customers have helped us find a lot of obscure bugs by using our software in ways we never could have imagined.
Even negative feedback has helped us grow. Thanks in part to our critics, many of RavenDB’s weaknesses have turned into strengths.
“My attitude is that if you push me towards something that you think is a weakness, then I will turn that perceived weakness into a strength.” – Michael Jordan
Some of RavenDB’s most infamous reviews help tell the story.
Data Backup and Restore Functionality
When Octopus switched to RavenDB, the feedback (in 2012) was very positive. The only complaint they had was with OS compatibility in the backup system. The system was fairly bare-bones at the time and there was a lot of room for improvement.
Backups are now performed automatically at user-defined intervals and can be restored regardless of OS. There are also further disaster recovery options to protect your data, which we’ll get into later.
However, that article was just a prelude to this brutal follow-up in 2014.
Improving Database Usability, Documentation, and Learning Resources
The Octopus team found that RavenDB was highly opinionated and worked in mysterious ways. It was easy to make mistakes, there was little in the way of guidance or explanation, and documentation was sadly lacking. To make matters worse, most companies didn’t have RavenDB experts on staff to deal with problems when they arose.
Lack of experience or expertise is an issue for any new technology, and NoSQL databases are still (relatively) new compared to their long-dominant relational counterparts. The solution is to make the software intuitive and easy to learn and use.
At the time of these posts, we weren’t doing a good job of that, and Paul’s criticisms weren’t unique. In response to our users’ feedback, we went all out to turn things around.
We created the RavenDB Studio: a GUI that lets you monitor all aspects of the database and perform most functions simply by pointing and clicking.
We added a wealth of documentation, and currently have several employees working full time adding and updating content. Our CEO Oren Eini (also known as Ayende) wrote a comprehensive (free) book called Inside RavenDB 4.0, which provides a complete explanation of RavenDB and its underlying design philosophy. We also created a bootcamp program to help new users get started, so there’s no shortage of quality learning material.
Updating the “Safe By Default” Approach
Much of the Octopus team’s confusion and frustration was caused by RavenDB’s “safe by default” policy. It was a design philosophy aimed at preventing developers from “hurting themselves” with bad code.
Queries per session were limited and result sets were bounded, and though the limits were reasonable, they weren’t made apparent. Exceeding the limits didn’t cause an error or notify the user: the request would simply fail quietly. This led to occasions where code would work differently in development than it would in production, with no clear reason why.
There was negative feedback from several sources regarding this, and it led to a review of our “safe by default” approach.
Result sets were initially limited to 128 results, but are now unrestricted. It’s now up to the user to code responsibly and use paging to keep result sets small and avoid performance loss.
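Paging keeps each result set small by fetching one fixed-size slice at a time. The sketch below is a minimal, library-agnostic illustration in Python: `iter_paged`, `fetch_orders`, and the in-memory `ORDERS` list are hypothetical stand-ins for a real query, not RavenDB’s client API.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_paged(fetch_page: Callable[[int, int], List[T]],
               page_size: int = 128) -> Iterator[T]:
    """Yield every result by requesting one bounded page at a time."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    start = 0
    while True:
        page = fetch_page(start, page_size)
        yield from page
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            return
        start += len(page)


# Stand-in data source (an assumption made purely for illustration).
ORDERS = [f"orders/{i}" for i in range(300)]


def fetch_orders(start: int, page_size: int) -> List[str]:
    """Hypothetical query: return at most page_size items from an offset."""
    return ORDERS[start:start + page_size]


all_orders = list(iter_paged(fetch_orders, page_size=128))
```

Because `iter_paged` is a generator, a caller that stops early never requests the remaining pages, which is the point of paging in the first place.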
Queries were limited to 30 per session because with more than that you’re effectively performing a DoS attack on yourself. Lazy requests allow you to combine multiple calls into one, so getting to 30 would take some very unusual (or just plain bad) code. You can now adjust this limit (from the default of 30) for those unusual cases.
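To see why lazy requests make the limit hard to hit, here is a toy model of a session with a request budget. Everything in it (`ToySession`, `load_lazily`, `execute_pending`, `RequestLimitExceeded`) is a hypothetical name invented for this sketch, and unlike the old behaviour described above, the toy raises an explicit error when the budget is exhausted.

```python
class RequestLimitExceeded(Exception):
    """Raised by the toy session when the request budget is used up."""


class ToySession:
    def __init__(self, store: dict, max_requests: int = 30):
        self._store = store
        self.max_requests = max_requests  # adjustable, defaults to 30
        self.request_count = 0
        self._pending = []

    def _round_trip(self) -> None:
        # Every simulated server call passes through here and is counted.
        if self.request_count >= self.max_requests:
            raise RequestLimitExceeded(
                f"more than {self.max_requests} requests in one session")
        self.request_count += 1

    def load(self, doc_id: str):
        """Eager load: one round trip per call."""
        self._round_trip()
        return self._store.get(doc_id)

    def load_lazily(self, doc_id: str) -> None:
        """Queue a load without contacting the (simulated) server."""
        self._pending.append(doc_id)

    def execute_pending(self) -> dict:
        """Send all queued loads in a single round trip."""
        if not self._pending:
            return {}
        self._round_trip()
        ids, self._pending = self._pending, []
        return {doc_id: self._store.get(doc_id) for doc_id in ids}


store = {f"users/{i}": {"name": f"user {i}"} for i in range(100)}

lazy = ToySession(store)
for i in range(100):
    lazy.load_lazily(f"users/{i}")
users = lazy.execute_pending()  # 100 documents, one counted request
```

Loading the same 100 documents eagerly would exhaust the default budget on the 31st call, while the lazy version spends a single request.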
MapReduce results were limited to 15 per item mapped due to the difficulty of reserving memory for fanout indexes. We have since removed that limit, but users will now receive a performance notification if more than 1024 results are produced from a single document.
In short, thanks to user feedback, RavenDB is a lot less opinionated than it was, and functions more in line with user expectations.
Before getting to the rest of Paul’s criticisms, we need to explain the biggest fundamental change RavenDB has gone through since the time of his post.
RavenDB 4.0: A Giant Leap Forward
Many problems customers had with RavenDB early on were simply too fundamental to fix with small patches. We needed a huge update to completely overhaul RavenDB’s core systems, and 4.0 was our opportunity.
Our company went through a lot in preparation for 4.0: the number of developers on staff doubled within a year and 80% of the codebase was rewritten from the ground up.
And then, RavenDB took flight.
(You can see the complete Development Roadmap here.)
Finally, we could deliver on the improvements our users had long been waiting for.
Now with that context, let’s get back to Paul and Octopus. The last item on the list above relates to Paul’s comments about ESENT-related indexing errors and poor performance.
Database Indexes and the Voron Storage Engine
We weren’t happy with ESENT either, but we were stuck with it. It was flaky, fragile, only worked on Windows, and unfortunately, the basis for all of RavenDB’s storage and indexing.
4.0 is where we made our break. We switched from .NET Framework to .NET Core to go beyond Windows, and we introduced our own custom-made data storage engine: Voron.
Voron is much more stable and reliable than ESENT and is highly optimized for our database model. Its introduction resulted in speeds increasing by orders of magnitude. With indexing rebuilt on this ideal foundation, it became one of the highlights of RavenDB.
Clarifying Eventual Consistency
On the subject of indexing: in his article “Would I use RavenDB again?”, Jeremy Miller talks about the problems of eventual consistency. Any time data is changed, there is a wait before the relevant (and now stale) indexes are updated to be consistent with the raw data. If you query an index that hasn’t been updated, you may receive out-of-date information.
In RavenDB, queries must be performed against indexes. You cannot query raw data. This rule is for performance’s sake, since querying an index is much faster, but what about stale indexes?
We don’t actually let indexes stay stale for long (they’re updated after 5ms), and in most cases they present no problems. However, in situations where consistency is critical, RavenDB has the answers: you can check for stale reads or make queries wait for indexes to be updated.
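The two options, checking for staleness and waiting for the index, can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyDatabase`, `run_indexing`, and the `wait_for_non_stale` flag are hypothetical names, indexing is triggered by hand instead of running in the background, and none of this is RavenDB’s actual API or implementation.

```python
class ToyDatabase:
    """Documents are written immediately; the index catches up separately."""

    def __init__(self):
        self._docs = {}
        self._last_write = 0      # bumped on every write
        self._indexed_up_to = 0   # last write the index has processed
        self._index = {}          # city -> list of document ids

    def put(self, doc_id: str, doc: dict) -> None:
        self._docs[doc_id] = doc
        self._last_write += 1     # the index is now stale

    def run_indexing(self) -> None:
        """Stand-in for background indexing (rebuilt in full for brevity)."""
        index = {}
        for doc_id, doc in self._docs.items():
            index.setdefault(doc["city"], []).append(doc_id)
        self._index = index
        self._indexed_up_to = self._last_write

    def query_by_city(self, city: str, wait_for_non_stale: bool = False):
        """Return (matching ids, is_stale); queries only touch the index."""
        if wait_for_non_stale and self._indexed_up_to < self._last_write:
            # A real server would block until indexing catches up.
            self.run_indexing()
        is_stale = self._indexed_up_to < self._last_write
        return list(self._index.get(city, [])), is_stale


db = ToyDatabase()
db.put("users/1", {"city": "Oslo"})
db.run_indexing()
db.put("users/2", {"city": "Oslo"})   # written, but not yet indexed

stale_results, was_stale = db.query_by_city("Oslo")
fresh_results, still_stale = db.query_by_city("Oslo", wait_for_non_stale=True)
```

The first query returns quickly but flags its results as stale; the second trades latency for results that reflect every write made before the query.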
While we haven’t changed our approach to eventual consistency, we’ve made it much clearer in our introductory and learning materials.
MapReduce Customization
In his post “Migrating from RavenDB to Cassandra,” Aaron Stannard takes issue with the way RavenDB implements MapReduce, with these two complaints:
- RavenDB’s MapReduce system required re-aggregation “which works great at low volumes but scales at an inverted ratio to data growth.”
This is a little confusing. RavenDB performs incremental aggregation and has never required re-aggregation. As a result, performance is excellent regardless of data growth.
- The MapReduce pipeline was too simplistic, which prevented them from performing more in-depth custom queries.
This criticism had merit, and we responded by opening the door for user customization. Now you can easily add your own code to RavenDB’s map/reduce pipeline and make it do whatever you want.
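The difference between incremental aggregation and re-aggregation, raised in the first complaint, is easiest to see in a small example. The sketch below is a deliberately simplified illustration, not how RavenDB implements MapReduce internally; `IncrementalTotals` and `reaggregate` are hypothetical names. The incremental version touches only the changed document, while the re-aggregating version rescans everything.

```python
class IncrementalTotals:
    """Keeps per-customer order totals up to date one document at a time."""

    def __init__(self):
        self._mapped = {}   # doc_id -> (customer, amount) last mapped
        self._counts = {}   # customer -> number of contributing documents
        self.totals = {}    # customer -> running total

    def upsert(self, doc_id: str, customer: str, amount: int) -> None:
        # Undo the document's previous contribution, then apply the new one.
        self.remove(doc_id)
        self._mapped[doc_id] = (customer, amount)
        self.totals[customer] = self.totals.get(customer, 0) + amount
        self._counts[customer] = self._counts.get(customer, 0) + 1

    def remove(self, doc_id: str) -> None:
        old = self._mapped.pop(doc_id, None)
        if old is None:
            return
        customer, amount = old
        self.totals[customer] -= amount
        self._counts[customer] -= 1
        if self._counts[customer] == 0:
            # No documents left for this key, so drop it entirely.
            del self.totals[customer]
            del self._counts[customer]


def reaggregate(docs: dict) -> dict:
    """Full re-aggregation: work grows with the size of the whole data set."""
    totals = {}
    for customer, amount in docs.values():
        totals[customer] = totals.get(customer, 0) + amount
    return totals


agg = IncrementalTotals()
agg.upsert("orders/1", "alice", 10)
agg.upsert("orders/2", "bob", 5)
agg.upsert("orders/3", "alice", 7)
agg.upsert("orders/1", "alice", 12)   # an update touches one document only
```

Both approaches produce the same totals; the incremental one does work proportional to the change, which is why its cost does not grow with the data set.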
We’re actually really proud of our MapReduce system as a whole, so we created an entire webinar devoted to it.
What’s Happening With Sharding?
Aaron also had some very valid criticism of RavenDB’s sharding system:
“Raven’s sharding system is really more of a hack at the client level which marries your network topology to your data, which is a really bad design choice…”
Aaron was right on the money: it wasn’t good. We decided to abandon our client-side simulation of sharding and are now working to introduce true sharding in our next major update: Version 6.0.
Introducing In-Memory Testing
Another issue raised by Jeremy Miller was RavenDB’s lack of in-memory testing. At the time this was a gap in our feature set, one we have since filled with a more than adequate solution.
Like other systems, RavenDB now provides a fully in-memory mode for database testing. It’s blazingly fast, you can set it up with one line of code, and you can use it for automated testing in a CI/CD pipeline.
Unlike other systems*, the in-memory version will behave identically to how your database does in production, ensuring your results are always valid.
*In Microsoft’s documentation for Entity Framework:
Database High-Availability and Disaster Recovery
In his post, “Initial thoughts on Octopus Deploy 3.0 – from RavenDB back to SQL Server,” Ian Paullin talks about his team’s lack of HA/DR options with RavenDB.
These were serious issues and we dealt with them accordingly.
As the notes say, clustering provides great fault tolerance. Something they don’t mention is that RavenDB makes clusters extremely easy to set up and manage.
RavenDB is best run on clusters of at least 3 nodes; that way, if one or two nodes fail, your database will still stay up and running. Automatic replication and task failover protect your data, while features like load balancing and concurrent reads and writes provide significant performance benefits.
RavenDB also uses hashing to keep an eye out for data corruption. If corruption is detected, you can replace the affected hard disk and replicate your database back to that node.
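The general idea behind hash-based corruption detection is to store a checksum alongside each block of data and verify it on every read. The following is a minimal sketch of that technique using Python’s standard `hashlib`; the choice of SHA-256, the page layout, and the names `write_page` and `read_page` are assumptions for illustration, since the article does not say which hash or format RavenDB uses.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Hash of a page's contents (SHA-256 is an arbitrary choice here)."""
    return hashlib.sha256(data).hexdigest()


def write_page(storage: dict, page_id: int, data: bytes) -> None:
    """Store the data together with the checksum computed at write time."""
    storage[page_id] = (data, checksum(data))


def read_page(storage: dict, page_id: int) -> bytes:
    """Return the page only if it still matches its stored checksum."""
    data, stored = storage[page_id]
    if checksum(data) != stored:
        raise IOError(f"page {page_id} failed checksum verification")
    return data


storage = {}
write_page(storage, 1, b"document contents")
intact = read_page(storage, 1)

# Simulate disk corruption: the bytes change but the stored checksum does not.
_, original_sum = storage[1]
storage[1] = (b"document c0ntents", original_sum)
```

After the simulated corruption, `read_page(storage, 1)` raises instead of silently returning bad data, which is the signal to restore that copy from a healthy replica.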
If you don’t have the data replicated to another node or backed up, the Voron recovery tool can still help you get it back.
More Programming Language Support
In his post “Thoughts on MongoDB vs Traditional SQL and RavenDB,” John Culviner noted that RavenDB “suffered from bugs and a sub-par CLI/querying API (unless you were using C#).”
This was a valid criticism at the time. Early in development, we focused mostly on C#, and support for other languages lagged behind.
We’ve always received a lot of feedback on this issue, and we’re always trying to accommodate it. At the time of writing we have clients for C#, Java, Node.js, Python, Ruby, and Go.
The Post-4.0 Era
Negative articles and reviews are a lot harder to find after the release of 4.0. The most significant critique comes from long-term user Alex Klause in his post titled “RavenDB: Two Years of Pain and Joy.” Alex had a lot of good to say, but there were some disappointing negatives as well.
His first criticism was about a lack of documentation and community. It’s true that RavenDB doesn’t have the biggest community, though it has grown significantly since the time of the article. We offset this by providing extensive support on our community forums, through support packages available to our customers, and by providing comprehensive documentation and learning materials.
The next issue was the inexcusable number of (quickly fixed) bugs.
Unfortunately for Alex, he started using RavenDB shortly before our biggest set of changes ever. In the period he used RavenDB (v3.5 – v4.2), a set of old bugs was waiting to be squashed, only to be replaced in 4.0 by a whole new set just waiting to be discovered. It was a wild time, and things have settled down a lot since then. These days, bugs are far less common and far less significant.
Profiling RavenDB
As Alex noted, and as is still the case, RavenDB doesn’t have a single dedicated profiler. Instead, it now has several profiling tools, all of which you can find and use inside the Studio.
With these tools, you can monitor and debug almost every aspect of your database’s operations in our intuitive and aesthetically pleasing GUI.
Raven Query Language (RQL)
Oren also addressed the comments about RQL, but the confusion is understandable. A good understanding of RQL requires a good explanation, and as mentioned, our learning resources were lacking at the time. A great explanation can now be found in Oren’s book.