[slicer-users] Fwd: Re: [slicer-devel] Dicom Performance in Slicer3

Ron Kikinis kikinis at bwh.harvard.edu
Mon Mar 15 10:18:38 EDT 2010

Please chime in by adding comments to the wiki page mentioned below.

-------- Original Message --------
Subject: Re: [slicer-devel] Dicom Performance in Slicer3
Date: Mon, 15 Mar 2010 07:31:18 -0400
From: Steve Pieper <pieper at bwh.harvard.edu>
To: Andras Lasso <lasso at cs.queensu.ca>
CC: slicer-devel at bwh.harvard.edu

Hi Andras -

Great point - all of the implementation decisions should be driven by
what our users actually need and want.  There's a short section on
'Goals' at this page, maybe you could flesh it out with the user
requirements.

So far, two approaches are being investigated to give the user a
feeling of instant gratification when loading the data: 1) Bill is
working with the FileWatcher library to monitor a directory for dicom
data that can be added to the database in the background, and 2) if the
dicom data comes via a network operation, updating the database can
be done automatically.
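
A minimal polling sketch of the directory-monitoring idea (Python for
illustration only; the real work uses the FileWatcher C++ library, and the
callback here stands in for the background database-import step):

```python
import os
import time

def watch_directory(path, on_new_file, poll_interval=1.0, max_polls=None):
    """Poll `path` and invoke `on_new_file` for each file not seen before.

    A stand-in for a native file watcher: each new DICOM file found here
    would be parsed and inserted into the database in the background,
    off the user-facing code path."""
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for entry in os.scandir(path):
            if entry.is_file() and entry.path not in seen:
                seen.add(entry.path)
                on_new_file(entry.path)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(poll_interval)
    return seen
```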


On 3/14/10 9:44 PM, Andras Lasso wrote:
> Hi all,
> It is great to see the discussion about different implementation options and
> their potential performance. To make any informed decision, though, we
> need to know more about the clinical/user requirements. Even during the
> technology/toolkit feasibility study phase it would help to have some basic
> requirements.
> For example, an important requirement is that the patient browser shall be
> displayed immediately (let's say within 1 second, with all the already
> imported data). It doesn't seem feasible to have a data import
> (reading+parsing DICOM files) implementation that can provide this speed,
> therefore having a database seems to be inevitable. Even if parsing takes
> zero time, reading a couple of hundred bytes from tens of thousands of files
> (potentially right after a reboot) takes much more than a second.
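
The "browser opens from the database alone" requirement can be sketched
like this (Python/sqlite3 purely for illustration; the table layout is a
made-up example, not the actual Slicer/CTK schema):

```python
import sqlite3

def open_catalog(db_path):
    """Open (or create) a minimal catalog of already-imported DICOM data.

    Hypothetical schema: one row per patient; the series- and
    instance-level tables an actual catalog would need are omitted."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS patients ("
        " patient_id TEXT PRIMARY KEY, name TEXT)")
    return conn

def list_patients(conn):
    """Populate the patient browser from the database alone -- no file
    reads or DICOM parsing on the startup path, so it stays fast even
    with tens of thousands of imported files on a cold disk."""
    return conn.execute(
        "SELECT patient_id, name FROM patients ORDER BY name").fetchall()
```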
> Steve,
> as you have the best insight into all of these efforts - could you please
> suggest a place where we could collect the related requirements (or where
> they are listed already)? I would be happy to contribute/review.
> thanks
> Andras
> -----Original Message-----
> From: slicer-devel-bounces at bwh.harvard.edu
> [mailto:slicer-devel-bounces at bwh.harvard.edu] On Behalf Of Mathieu Malaterre
> Sent: March-14-10 5:40 PM
> To: Steve Pieper
> Cc: slicer-devel at bwh.harvard.edu
> Subject: Re: [slicer-devel] Dicom Performance in Slicer3
> On Sun, Mar 14, 2010 at 5:03 PM, Steve Pieper<pieper at bwh.harvard.edu>
> wrote:
>> Hi Mathieu -
>> I didn't know about gdcmscanner - looks very useful!  In terms of what we
>> are trying to do, there are several threads of activity going on at once
>> and I can try to summarize here.
>> On one hand, Bill has been trying to figure out why loading dicom files is
>> slow (or slowish) in the vtkITK reader used in slicer (that effort is where
>> this thread got started).  It appears that one core issue is that the same
>> files are parsed and sorted multiple times, and hence there is a goal to
>> parse once and store the results for later using sqlite.
> Ok. Please note that most systems I know of (Linux and Windows) do
> file-caching. So reading the same file twice in a row is a no-op the
> second time (see below how to flush the file cache (*)).
>> At about the same time, several of us were in Heidelberg doing a hands-on
>> programming event (a.k.a. hackfest [1]) and found out that the MITK group
>> [2] had already been using the same sqlite approach in one of their tools.
>> So the goal of the wiki on DICOM:Database [3] was to collect and exchange
>> information on this topic.
> Ok. I do not know sqlite3, but I'd be interested if anyone could tell
> me why my toy test is so slow (**).
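
For reference, one common cause of slow bulk INSERTs in SQLite is
autocommit mode, where every row is its own transaction and forces a
journal sync; batching the whole import into one transaction is usually
far faster. A sketch using Python's built-in sqlite3 (the `tags` table is
a hypothetical example, not the schema of the GDCM test above):

```python
import sqlite3

def make_db(path=":memory:"):
    """Create a toy tag store: one row per (instance, tag, value)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tags ("
        " sop_uid TEXT, tag TEXT, value TEXT)")
    return conn

def bulk_insert(conn, rows):
    """Insert many rows inside a single transaction.

    In autocommit mode each INSERT commits separately (one sync per
    row); `with conn:` opens one transaction and commits once at the
    end, which is the usual fix for slow initial imports."""
    with conn:
        conn.executemany(
            "INSERT INTO tags (sop_uid, tag, value) VALUES (?, ?, ?)", rows)
```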
>> The bigger picture is that the CTK group [4] aspires to provide a range of
>> software for medical imaging to fill in gaps between existing toolkits.  In
>> this specific case, a Qt-based dicom interface that emulates what Osirix
>> provides for Query-Retrieve and local database display.  Hence the
>> discussion on the wiki [5].
> Ok. This is really cool, since InVesalius, DeVide and VR-Render each
> have their own implementation.
>> Because of the query-retrieve goals, we have focused on dcmtk rather than
>> gdcm - however we have tried to keep the GUI and the dicom parsing
>> independent (they only communicate via the sqlite database) so the database
>> could be populated by any tool.
> I gave sqlite3 a try:
> (**)
> http://gdcm.svn.sf.net/viewvc/gdcm/trunk/Examples/Cxx/DumpToSQLITE3.cxx?view=markup
> But building the sqlite db seems pretty slow, at least on my laptop: 13s
> for the sqlite3 part, vs ~9s for grabbing the data from the DICOM
> fileset (actually 1s when the data is in memory):
> (*)
> sync && echo 3 > /proc/sys/vm/drop_caches
> ./bin/DumpToSQLITE3 ~/CTK/dicoms
> Finished loading data from : 2324 files
> Time to scan DICOM files: 9
> Time to build SQLITE3: 13
> If I re-execute the same test (so I take advantage of
> file-caching), it leads to:
> ./bin/DumpToSQLITE3 ~/CTK/dicoms
> Finished loading data from : 2324 files
> Time to scan DICOM files: 1
> Time to build SQLITE3: 13
> What this means is -IMHO- that GDCM 2.x DICOM parsing seems good
> (under 1 second for 2400 files). The tricky part is how to speed up the
> whole process. My understanding is that GDCM 2.x is still trying to
> read too much data and is doing too many disk accesses.
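
The cheap check grep is emulating can be stated directly: a DICOM Part 10
file begins with a 128-byte preamble followed by the 4-byte magic "DICM",
so recognizing a file needs only its first 132 bytes. A Python sketch for
illustration:

```python
def is_dicom(path):
    """Cheap DICOM sniff: a Part 10 file has a 128-byte preamble
    followed by the 4-byte magic 'DICM', so only 132 bytes need to be
    read per file to decide whether it is worth parsing at all."""
    with open(path, "rb") as f:
        head = f.read(132)
    return len(head) == 132 and head[128:132] == b"DICM"
```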
> As a first approach I used grep to check how badly we are doing in GDCM
> 2.x:
> $ time grep -m 1 DICM -r ~/CTK/dicoms > log
> grep -m 1 DICM -r /home/mathieu/CTK/dicoms > l  0.17s user 0.68s
> system 8% cpu 10.605 total
> So for each DICOM file, grep should return after reading only
> 128 bytes, since DICM is there. I even tried removing the ZIP and DLL
> files from this test:
> $ time find ~/Perso/Consult/dicoms -type f -not \( -name \*.ZIP -o
> -name \*.DLL -o -path \*SYNGO_FV\* \) -exec grep -m 1 DICM {} + > l
> find ~/Perso/Consult/dicoms -type f -not \( -name \*.ZIP -o -name
> \*.DLL -o    0.11s user 0.67s system 8% cpu 9.232 total
> There is certainly an overhead introduced when spawning a new process
> in the find expression, but we are very close to that in GDCM. I'd be
> interested if anyone had a tool to measure how much data has been read
> during the execution of a process. (thanks!)
> So I really do not know how to speed up this time of 8 seconds for
> scanning the DICOM DataSet. The GDCM parsing only represents ~1s of
> user time.
> Regards,
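
Regarding the "how much data has been read" question above: on Linux,
per-process I/O counters are exposed in /proc/<pid>/io, and the
`read_bytes` field counts bytes actually fetched from the storage layer
(page-cache hits excluded). A small sketch (Python; the live counter is
Linux-specific):

```python
def parse_proc_io(text):
    """Parse the key/value lines of /proc/<pid>/io (Linux) into a dict.

    Fields include rchar (bytes passed to read(2)-like calls) and
    read_bytes (bytes actually read from storage, excluding data served
    from the page cache)."""
    stats = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value:
            stats[key.strip()] = int(value)
    return stats

def bytes_read_by_self():
    """Storage bytes read so far by the current process (Linux only)."""
    with open("/proc/self/io") as f:
        return parse_proc_io(f.read()).get("read_bytes", 0)
```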
slicer-devel mailing list
slicer-devel at bwh.harvard.edu
To unsubscribe: send email to
slicer-devel-request at massmail.spl.harvard.edu with unsubscribe as the
subject

More information about the slicer-users mailing list