Michael Meeks: The spreadsheet is dead. Long live the spreadsheet!

by Guest Author on 7 May 2014, 09:15

Tags: AMD (NYSE:AMD)



This is a guest blog by Michael Meeks, General Manager of Collabora Productivity and leader of the LibreOffice team. The views expressed in this blog are his alone.

Changing how Calc uses silicon

LibreOffice is an Open Source office suite used all over the world, on Windows, MacOS and Linux. Between January and October 2011, it was downloaded approximately 7.5 million times, about 15 million times in 2012 and around 25 million times in 2013. Counting our user-base is hard, but each week, another million unique IP addresses check in to see if there is a new version to download.

One of the key productivity applications in LibreOffice is Calc, a spreadsheet application. We’ve been doing some very heavy retooling of Calc behind the scenes of late, working with the LibreOffice community and AMD to make Calc one of the most capable and up-to-date productivity applications out there.

This rework has involved changing how Calc uses the silicon in the devices it runs on. In short, spreadsheet apps have traditionally relied on the CPU. In fact, they should be smart about using the right processor for the job, and for Calc and other spreadsheet programs, that's often the GPU. And the results are stunning – a near seven-times speed boost in benchmarking. Here's the how and why.

Spreadsheets are bigger than ever

A significant portion of spreadsheet documents are created to help make business decisions, and they’re used for things the original creators of the format never imagined. Their performance has, for many years, been directly correlated to what you were doing with them, and how much CPU horsepower you had to hand. People’s understanding of what spreadsheets could do was limited to specific areas of the business, and access to large amounts of data to process was often limited, too. All of that is changing, and changing fast.

People are using spreadsheets differently now. For one thing, large chunks of data are easier to store – it may have been said in jest, but the idea that Big Data is data that has become simpler and cheaper to store than to throw away rings true. For another, people are more aware of how they can take large data sets and smash them together to get a third set of usable information. To do this, they need a more powerful spreadsheet application that can do what modern working environments demand of it.

Looking back at an early spreadsheet from 3,000 BC – an obelisk – the data it contains ("don't fight the big guy") is tiny – only four columns for a start – compared to modern spreadsheets. Excel 2010, for example, can handle 16,384 columns and 1,048,576 rows. When you add in formulae, pivot tables and the like, that's an awful lot of information to handle and change.

Parallel processing on a GPU

When you look at applications like Microsoft Excel and Calc, the spreadsheet application in LibreOffice, elements of both can trace their lineage all the way back to VisiCalc. CPU benchmarking has long involved testing the CPU with large spreadsheet transformations. But actually, the best processor for handling, say, formulae scattered over a sheet of 16,000 columns by a million rows is not the CPU. It's the GPU. Offloading this work boosts performance and, somewhat counter-intuitively, saves power at the same time. Graphics processors are extremely good at handling parallel processing tasks – certainly compared to CPUs. And because GPUs can do those jobs in parallel at an optimal clock frequency, there's a power saving involved as well.
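To see why a column of formulae parallelises so naturally, here is a minimal sketch in Python (purely illustrative – Calc's real engine is C++ and OpenCL, and the function names here are invented). A per-row formula depends only on its own row, so the rows split cleanly into independent chunks; a GPU takes the same split to a much finer grain, running one work-item per cell.

```python
from concurrent.futures import ThreadPoolExecutor

def column_formula(a, b):
    # A per-row formula such as =A2*B2+1: each result depends only on
    # values in its own row, so every row can be computed independently.
    return a * b + 1

def evaluate_column(col_a, col_b, workers=4):
    # Split the rows into independent chunks, one per worker. On a GPU
    # the same split happens per cell, across thousands of work-items.
    n = len(col_a)
    chunk = (n + workers - 1) // workers
    spans = [(i, min(i + chunk, n)) for i in range(0, n, chunk)]

    def run(span):
        lo, hi = span
        return [column_formula(col_a[r], col_b[r]) for r in range(lo, hi)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(run, spans)   # map preserves chunk order
    return [v for part in parts for v in part]

col_a = list(range(1, 9))   # column A, rows 2-9
col_b = [10] * 8            # column B, rows 2-9
print(evaluate_column(col_a, col_b))  # [11, 21, 31, 41, 51, 61, 71, 81]
```

Python threads won't actually speed this up (the GIL serialises them), but the structure is the point: no chunk needs any other chunk's result, which is exactly the shape of work a GPU devours.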

When you get to a point where you have more than a hundred rows of data in a column, the GPU can start to help. These sorts of documents are actually quite common. Finance is always the example trotted out, but everyday office applications can lead to documents that suffer from terrible performance. Take keeping track of sales records: a spreadsheet is created, then more and more data is added over time, with formulae extended to crunch it. The value of doing this might initially be to help the sales team understand whether people buy more red cars on a Tuesday, but the complexity quickly grows. Are people more likely to buy red cars with the performance pack and alloys on the Friday after pay day, for example? Doing this creates documents with thousands of rows and quite a bit of complexity.

The right technologies for the job

Reworking Calc has created other benefits. The team wrote a converter to turn your standard formulae into OpenCL – the Open Computing Language. By doing this, we were able to make the move from CPU to GPU, but it also allows us to do something else: to take advantage of the next generation of processor designs built around Heterogeneous System Architecture (HSA), which allows a much more efficient OpenCL implementation in spreadsheets.
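As a flavour of what such a conversion involves, here is a toy translator – the grammar and names are invented for illustration, and Calc's real compiler handles the full formula language – that turns a per-row formula like "A*B+C" into OpenCL C kernel source, where each column becomes an input buffer and each work-item computes one row:

```python
import re

def formula_to_opencl(formula, kernel_name="calc_column"):
    # Toy translator: single-letter column references, arithmetic only.
    # Each column letter becomes a __global input buffer, and the row
    # index comes from the work-item ID, so the GPU evaluates every
    # row of the formula in parallel.
    cols = sorted(set(re.findall(r"[A-Z]", formula)))
    args = ", ".join(f"__global const double *{c}" for c in cols)
    expr = re.sub(r"[A-Z]", lambda m: f"{m.group(0)}[i]", formula)
    return (
        f"__kernel void {kernel_name}({args}, __global double *out) {{\n"
        f"    int i = get_global_id(0);   // one work-item per row\n"
        f"    out[i] = {expr};\n"
        f"}}\n"
    )

print(formula_to_opencl("A*B+C"))
```

Running this prints a kernel whose body is `out[i] = A[i]*B[i]+C[i];` – the spreadsheet formula restated as a data-parallel program, ready to be compiled once by the OpenCL driver and applied to every row at once.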

Anyway, thanks to OpenCL, these spreadsheet optimisations are now far more portable than before: there's no need to write custom assembler for each CPU you are porting to. This benefit – and the other benefits I've described above – can be applied to many applications that do a lot of number crunching. The right processor for the job can be used, for starters, and porting that hard optimisation work to other platforms – tablets or smartphones, for example – is easier. Using OpenCL also improves the underlying code structure and boosts performance – whether a GPU is present or not.

What does this mean in the real world? Well, nearly seven times the performance on AMD versus Intel in a benchmark test1 based on real-time analysis and visualisation of streaming stock quotes. Accomplishing a task in a seventh of the time is the sort of performance benefit heavy spreadsheet users can appreciate.

All of this is why the contributors to Calc – including engineers from AMD and Collabora (and MulticoreWare) – helped rework LibreOffice Calc to make the best use of the most suitable part of the processor in your PC for handling spreadsheets. We later worked out it was the biggest core re-factoring of Calc code in over a decade, and as a result Calc is faster and more powerful for everyone. There is also the lovely, cool feeling of the right compute unit in your computer doing the work not only more quickly, but without glowing quite so hot either.



About Michael Meeks

Michael is an enthusiastic believer in Free Software. He is the General Manager of Collabora Productivity, leading its LibreOffice team, supporting customers and consulting on development alongside an extremely talented team. He serves as a member of the board of The Document Foundation and the LibreOffice Engineering Steering Committee; in the past he served on ECMA/TC45, improving Microsoft's description of their OOXML format. Prior to this he was a Novell/SUSE Distinguished Engineer working on various pieces of Free Software infrastructure across the Linux desktop stack. Prior to that he worked on both hardware and software for real-time video editing at Quantel.



1 AMD A10-7850K APU gets up to 7X better OpenCL performance with LibreOffice Calc. AMD tests are performed on optimized AMD reference systems. PC manufacturers may vary their configuration, yielding different results. Test project used LibreOffice Calc V4.2.0 to perform 21,000 calculations and plot 1,000 points for 21 different stocks. A desktop PC with AMD A10-7850K APU with AMD Radeon™ R7 Series graphics, 2x4GB DDR3-2133 RAM, video driver 13.300.0.0 - 09-Dec-2013, took 120 milliseconds with OpenCL on. A desktop PC configured with an Intel Core i5-4670K with Intel HD 4600 graphics, 2x4GB DDR3-1600, video driver 10.18.10.3345 - 30-Oct-2013, took 950 milliseconds with OpenCL™ on. Both systems used the same SSD and Windows 8.1 build 9600. KVD-8



HEXUS Forums :: 5 Comments

OK, maybe this is coming at it just from my little corner of the world, investment finance, mostly derivs, but this isn't going to help at all.

The bottleneck on some of these spreadsheets (and boy, I've seen them – the worst was in 2006, running on Office XP, a 285meg behemoth; I didn't know it would let them get so big) wasn't due to formula references. It was due to bespoke libraries that were called via Excel formulae. These wouldn't run on a GPU.

Then we get to the fact that you want predictive branching for a lot of these formula, which GPUs suck at.

Can't help but think this is a bandwagon hop-on.
Interesting posting/article. :thumbsup:
TheAnimus
OK, maybe this is coming at it just from my little corner of the world, investment finance, mostly derivs, but this isn't going to help at all. … It was due to bespoke libraries that were called via Excel formula. These wouldn't run on a GPU. … Can't help but think this is a bandwagon hop-on.
If you're doing stuff outside of Calc then sure, it's pretty obvious that them optimising the Calc engine isn't going to make a blind bit of difference. Then again, if you're dependent on external libraries would you be looking at Calc as an Excel replacement? Probably not.

You're right that it's not going to noticeably help everyone - but on the other hand if they've done work on getting a better sheet engine then that's surely worth an “attaboy” on principle? Plus if they've got the ability to seamlessly select the “best” engine for the job - CPU or GPU - then that's also worthy of some praise surely?

I suspect that it's the engineering and science users that'll be able to make best use of the new features. And yes, I realise that AMD aren't being altruistic - they want to sell their APUs etc. On the other hand, the accusation of “bandwagoning” is maybe a bit harsh. Like I said, “we” get an improved engine out of it, so I'm not that bothered if AMD are driving it; I assume that given it's OpenCL based, it'll work fine on NVidia and Intel gear anyway. Not something you could say if NVidia were the partner, because it's an easy assumption that they'd want CUDA used.

Problem is that Joe Public isn't that bothered about calc'ing multi-mega datasets, they just want their accounts to look pretty. And I'm afraid that Excel is still the best tool for the job for that kind of thing.
While it may not provide real-world performance increases for most, I applaud the use of GPGPU to start accelerating all floating point operations. The future will be much better for it once it becomes standard.
I am actually surprised that the Intel chip came off so badly.

There are some OpenCL benchmarks that Intel actually do rather well at. If you are crunching really, really big data, then yes, the Intel chip will lose as it just doesn't have the raw GPU horsepower, but the different cache structure on the Intel chip sometimes gives it an advantage.

I suspect once AMD have done all the hard work, Intel will pop in with a bit of a tweak to get performance parity :D
Take Human Resources, keeping track of staff attendance. A spreadsheet is created, then more and more data is added over time, with formulae extended to crunch it. The value of doing this might initially be to help the sales team understand whether people buy more red cars on a Tuesday, but the complexity quickly grows. Are people more likely to buy red cars with the performance pack and alloys on the Friday after pay day, for example?

Why is Human Resources concerning itself with the cars that the company's staff buy? And what's so important about the day of the week and the colour and options? ;o)