Posted on 1. February 2010

Multi-core scaling of the Revit Database

On a fairly regular basis I’m asked questions along these lines:

  • “We’ve been asking for multi-threaded Revit for years. How hard can it be? <Insert usual ADSK expletive>…”(1)
  • “Why doesn’t Revit utilise all my PC’s cores like <insert application>…”
  • “Revit isn’t a real database, and it’s slow according to our DB expert…(2)”
  • “If ArchiCAD can use multiple cores then there should be no reason Revit can’t do it.(3)”

These questions, and Intel’s press release on their 48 core processor, have prompted this post, in which I attempt to explain why I believe making Revit performance scale with multiple cores is a difficult software engineering project.(4)

Do I think we’ll get a measure of scaling eventually? Yes. But it will probably happen progressively and with caution.

What is a database?

I’ll leave it to Wikipedia to describe in detail. I’m going to deliberately try to explain the problem without using computer science. However ;-), to set the scene…

Simplistically, a database comprises two major areas of functionality.

Transactional storage. I don’t intend to discuss the ACID properties of the distributed Revit database or other aspects of database storage.

Querying. This is the aspect of the Revit database that affects the user experience the most when it comes to performance. An application uses querying both to read a database and to write to or update it.

It is the querying of the Revit database I’m going to discuss in this post.

Setting the scene

Consider the features Revit users take for granted, such as:

  • A window moving with its host wall when the wall is moved.
  • A wall locked to a grid moving when the location of the grid changes.
  • A view title on a sheet updating when the view is renamed.

They are all based on a single principle: a defined relationship. Take for example a window hosted by a wall. This can be represented as such:

RevitRel

Keep this in mind as I build up a picture of the problem Autodesk have with querying the Revit database with a multi-core architecture.
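A minimal sketch of a defined relationship, for readers who prefer code to diagrams. The `Element` class, its `host` and `move` methods, and the one-dimensional positions are all hypothetical names invented for illustration. This is not Revit's object model or API, only the general idea that moving a host also moves whatever depends on it.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A hypothetical model element positioned along a single axis."""
    name: str
    position: float = 0.0
    dependents: list["Element"] = field(default_factory=list)

    def host(self, other: "Element") -> None:
        """Record that `other` depends on this element."""
        self.dependents.append(other)

    def move(self, delta: float) -> None:
        """Move this element, then everything that depends on it."""
        self.position += delta
        for dependent in self.dependents:
            dependent.move(delta)


grid = Element("grid", 0.0)
wall = Element("wall", 0.0)
window = Element("window", 1.5)

grid.host(wall)    # the wall is locked to the grid
wall.host(window)  # the window is hosted by the wall

grid.move(2.0)
print(wall.position, window.position)  # 2.0 3.5
```

Moving the grid touches three elements, and each update has to wait for the one before it. That ordering is the part that matters for the rest of this post.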

Doing some work

Say you have some work to do. There are 5 discrete tasks that can be represented as follows:

 FiveTasks

At the moment you are on your own, so you start with one task and, after finishing that, you move on to the next one. But it’s too slow. The boss wants to speed things up, so he brings in more people to each do a discrete task (5):

 FiveTasksFiveWorker

The boss is happy, but he notes four of the workers are sitting there twiddling their thumbs, waiting for the slowest task to finish so they can go home. So he splits that slow task up into discrete subtasks and gives the other workers one subtask each to do:

 FiveTasksSubFiveWorker

There’s some extra work required to combine the discrete subtasks back together in a coordinated way, which costs some time, but overall the tasks are all completed in considerably less time.(6) Everyone goes home happy…
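The three scenarios above can be put into rough numbers. This is a sketch under stated assumptions: the task durations, the number of workers and the `merge_cost` are made-up figures, and `makespan` is a hypothetical helper that uses a simple greedy schedule (longest chunk first, handed to the least-loaded worker).

```python
import heapq


def makespan(chunks, workers):
    """Greedy schedule: give each chunk, longest first, to the least-loaded worker."""
    loads = [0.0] * workers
    heapq.heapify(loads)
    for chunk in sorted(chunks, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + chunk)
    return max(loads)


durations = [2, 3, 2, 10, 3]  # hypothetical hours for the five tasks
workers = 5
merge_cost = 0.5              # hypothetical cost of recombining the subtasks

alone = makespan(durations, 1)           # one person does everything
one_each = makespan(durations, workers)  # one worker per task

# Split the slowest task into equal subtasks, one per worker.
slowest = max(durations)
others = list(durations)
others.remove(slowest)
pieces = [slowest / workers] * workers
split = makespan(others + pieces, workers) + merge_cost

print(alone, one_each, split)  # 20.0 10.0 5.5
```

With one worker per task, the slowest task sets the finishing time. Splitting it helps further, but the gain is reduced by the cost of putting the pieces back together.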

The Revit taskforce

The boss gives the workers some new tasks. However, these tasks are no longer discrete: each task depends on some aspect of another task:

 FievRelTasks

In other words each task has a relationship to another task.

So no task can be completed independently by a worker without coordinating and communicating with another worker. This coordination takes planning and time, and a task may not be completed if the task it depends on fails. More critically, throwing more workers at the tasks may not have the desired effect, because they’ll spend more time communicating than doing any work on the task.

This in a nutshell is the problem for Autodesk with Revit. Except it looks more like this:

 RevitTF

Each task has multiple relationships to other tasks. So where do Autodesk start when throwing more workers at the tasks? Critically, you don’t want to be doing a task more than once. And some tasks will be a priority in some contexts and not others. Revit has to manage this and work out the best route to finish updating a project as quickly as possible. Put simply, this is a massive mathematical and computer science project on many levels.
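To see why relationships limit what extra workers can do, here is a minimal sketch using `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The task names, durations and dependency maps are hypothetical, and `critical_path` is an illustrative helper, not anything Revit is known to do. It computes the earliest possible finishing time given unlimited workers and free communication, which is the longest chain of dependent tasks.

```python
from graphlib import TopologicalSorter


def critical_path(durations, depends_on):
    """Earliest finish time with unlimited workers: the longest dependency chain."""
    graph = {task: depends_on.get(task, ()) for task in durations}
    finish = {}
    # static_order() yields every task after all the tasks it depends on.
    for task in TopologicalSorter(graph).static_order():
        start = max((finish[dep] for dep in graph[task]), default=0)
        finish[task] = start + durations[task]
    return max(finish.values())


durations = {"A": 2, "B": 2, "C": 2, "D": 2, "E": 2}  # hypothetical hours

independent = {}                                             # no relationships
chained = {"B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"]}   # each waits on one other
tangled = {"C": ["A", "B"], "D": ["C"], "E": ["C"]}          # multiple relationships

print(critical_path(durations, independent))  # 2
print(critical_path(durations, chained))      # 10
print(critical_path(durations, tangled))      # 6
```

One worker would need 10 hours in every case. With no relationships, five workers finish in 2 hours. With the chain, no number of workers beats 10 hours. Each task is also visited exactly once, which is the "don't do a task more than once" requirement. The sketch ignores communication costs and changing priorities, so real figures would be worse than these.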

Do Autodesk have the resources to solve this problem? Yes.

Will they succeed? Sure.

Will it happen overnight? I doubt it.

Will they achieve a linear scaling of performance for n-cores? No, for reasons that I described above.
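One standard way to put a number on that last answer is Amdahl's law, which this post does not otherwise rely on: if only a fraction `p` of the work can be spread across `n` cores, the best possible speedup is 1 / ((1 - p) + p / n). The sketch below uses an assumed, purely illustrative value of `p = 0.5`. I have no measurement of what the real fraction is for Revit.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical: half of an update can be parallelised, the rest is dependent work.
for cores in (2, 4, 8, 48):
    print(cores, round(amdahl_speedup(0.5, cores), 2))
```

Under that assumption even 48 cores cannot double the speed, because the dependent half of the work still runs one step at a time.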

So the next time someone asks, I hope this post goes some way to helping you explain why it’s so difficult for Autodesk to scale Revit for multi-core CPU architecture.

Notes:

(1) Revit is multi-threaded, but it doesn’t scale linearly across multiple cores in all areas of functionality. Rendering and some other functionality will use multiple cores, but outside rendering the improvement will be barely noticeable for now.

(2) Revit is not built on a relational database object model. It is more like an object database model, with some similarities to source code systems like CVS for replicating changes.

(3) The details would take another post, but I don’t consider ArchiCAD a true BIM application. It took Revit for people to realise buildings were designed by multi-disciplined teams, and therefore a coordinated building model for ALL disciplines would be the game changer that we now think of as BIM. One of the core differences between Revit and ArchiCAD can be explained by the principles discussed in this post.

(4) I have no access to the source code (obviously), and I have had no discussions with the factory on this post. But I think I have an understanding of the issues they’re facing.

If you’re part of the DB team in the factory, please don’t laugh too loudly ;-)

(5) This is equivalent to raytrace rendering, video encoding, etc., where each thread can handle a discrete task and therefore you get close to n-core scaling.

(6) Simplistically this is what we get now with some aspects of Revit functionality.
