[slicer-users] Fwd: Re: [slicer-devel] Dicom Performance in Slicer3

Ron Kikinis kikinis at bwh.harvard.edu
Mon Mar 15 10:18:38 EDT 2010


Please chime in by adding comments to the wiki page mentioned below.
Ron

-------- Original Message --------
Subject: Re: [slicer-devel] Dicom Performance in Slicer3
Date: Mon, 15 Mar 2010 07:31:18 -0400
From: Steve Pieper <pieper at bwh.harvard.edu>
To: Andras Lasso <lasso at cs.queensu.ca>
CC: slicer-devel at bwh.harvard.edu

Hi Andras -

Great point - all of the implementation decisions should be driven by
what our users actually need and want.  There's a short section on
'Goals' at this page, maybe you could flesh it out with the user
perspective?

http://www.slicer.org/slicerWiki/index.php/DICOM:Interface

So far, two approaches are being investigated to give the user a
feeling of instant gratification when loading the data: 1) Bill is
working with the FileWatcher library to monitor a directory for dicom
data that can be added to the database in the background, and 2) if the
dicom data comes via a network operation, updating the database can
be done automatically.

-Steve

On 3/14/10 9:44 PM, Andras Lasso wrote:
> Hi all,
>
> It is great to see the discussion about different implementation options and
> their potential performance, but to be able to make any informed decision we
> need to know more about the clinical/user requirements. Even during the
> technology/toolkit feasibility study phase it would help to have some basic
> requirements.
>
> For example, an important requirement is that the patient browser shall be
> displayed immediately (let's say within 1 second, with all the already
> imported data). It doesn't seem feasible to have a data import
> (reading+parsing DICOM files) implementation that can provide this speed,
> therefore having a database seems to be inevitable. Even if parsing takes
> zero time, reading a couple of hundred bytes from tens of thousands of files
> (potentially right after a reboot) takes much more than a second.
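The sub-second browser requirement is exactly what a metadata database buys you. A minimal sketch of the idea, with a hypothetical schema and column names (not the actual Slicer/CTK schema): once tag values have been imported into SQLite, populating the patient browser is a single indexed query whose cost does not depend on how many DICOM files sit on disk.

```python
import sqlite3

# Hypothetical schema: one row per series, filled by a background
# import step that parses each DICOM file exactly once.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE series (
        series_uid   TEXT PRIMARY KEY,
        patient_name TEXT,
        study_date   TEXT,
        modality     TEXT
    )
""")
conn.execute("CREATE INDEX idx_patient ON series (patient_name)")
conn.execute(
    "INSERT INTO series VALUES (?, ?, ?, ?)",
    ("1.2.3.4", "DOE^JOHN", "20100314", "CT"),
)

# The browser touches only the database, never the original files,
# so it can be shown immediately even right after a reboot.
rows = conn.execute(
    "SELECT patient_name, modality FROM series ORDER BY patient_name"
).fetchall()
print(rows)
```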
>
> Steve,
> as you have the best insight into all of these efforts - could you please
> suggest a place where we could collect the related requirements (or where
> they are listed already)? I would be happy to contribute/review.
>
> thanks
> Andras
>
>
> -----Original Message-----
> From: slicer-devel-bounces at bwh.harvard.edu
> [mailto:slicer-devel-bounces at bwh.harvard.edu] On Behalf Of Mathieu Malaterre
> Sent: March-14-10 5:40 PM
> To: Steve Pieper
> Cc: slicer-devel at bwh.harvard.edu
> Subject: Re: [slicer-devel] Dicom Performance in Slicer3
>
> On Sun, Mar 14, 2010 at 5:03 PM, Steve Pieper<pieper at bwh.harvard.edu>
> wrote:
>> Hi Mathieu -
>>
>> I didn't know about gdcmscanner - looks very useful!  In terms of what we
>> are trying to do, there are several threads of activity going on at once and
>> I can try to summarize here.
>>
>> On one hand, Bill has been trying to figure out why loading dicom files is
>> slow (or slowish) in the vtkITK reader used in slicer (that effort is where
>> this thread got started).  It appears that one core issue is that the same
>> files are parsed and sorted multiple times, and hence there is a goal to
>> parse once and store the results for later using sqlite.
>
> Ok. Please note that most systems I know of (Linux and Windows) do
> file-caching, so reading the same file twice in a row is essentially a
> no-op the second time (see below how to flush the file cache (*)).
>
>> At about the same time, several of us were in Heidelberg doing a hands-on
>> programming event (a.k.a. hackfest [1]) and found out that the MITK group
>> [2] already had been using the same sqlite approach in one of their tools.
>> So the goal of the wiki on DICOM:Database [3] was to collect and exchange
>> information on this topic.
>
> Ok. I do not know sqlite3, but I'd be interested if anyone could tell
> me why my toy test is so slow (**)
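One likely culprit, an assumption since I have not profiled the toy test itself: sqlite3 runs in autocommit mode by default, so every individual INSERT gets its own transaction and its own journal sync to disk. Wrapping the whole batch in a single transaction usually turns thousands of slow commits into one:

```python
import sqlite3
import time

# ":memory:" hides the per-commit fsync cost; point this at a real
# file to see the difference the transaction makes on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (path TEXT, tag TEXT, value TEXT)")

rows = [("file%05d.dcm" % i, "0010,0010", "DOE^JOHN") for i in range(2324)]

# Slow pattern: one implicit transaction (and journal sync) per INSERT.
# for row in rows:
#     conn.execute("INSERT INTO tags VALUES (?, ?, ?)", row)
#     conn.commit()

# Fast pattern: one transaction around the whole batch.
t0 = time.time()
with conn:  # BEGIN ... COMMIT around everything inside the block
    conn.executemany("INSERT INTO tags VALUES (?, ?, ?)", rows)
count = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
print("inserted %d rows in %.3fs" % (count, time.time() - t0))
```

The equivalent in the C API is an explicit `BEGIN`/`COMMIT` pair around the insert loop, plus reusing one prepared statement instead of re-parsing the SQL per row.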
>
>> The bigger picture is that the CTK group [4] aspires to provide a range of
>> software for medical imaging to fill in gaps between existing toolkits.  In
>> this specific case, a Qt-based dicom interface that emulates what Osirix
>> provides for Query-Retrieve and local database display.  Hence the
>> discussion on the wiki [5].
>
> Ok. This is really cool, since InVesalius, DeVide and VR-Render each
> have their own implementation.
>
>> Because of the query-retrieve goals, we have focused on dcmtk rather than
>> gdcm - however we have tried to keep the GUI and the dicom parsing
>> independent (they only communicate via the sqlite database) so the database
>> could be populated by any tool.
>
> I gave sqlite3 a try:
> (**)
> http://gdcm.svn.sf.net/viewvc/gdcm/trunk/Examples/Cxx/DumpToSQLITE3.cxx?view=markup
>
> But building the sqlite db seems pretty slow, at least on my laptop: 13s
> for the sqlite3 part, vs ~9s for grabbing the data from the DICOM
> fileset (actually 1s when the data is in memory):
>
> (*)
> sync && echo 3 > /proc/sys/vm/drop_caches
> ./bin/DumpToSQLITE3 ~/CTK/dicoms
> Finished loading data from : 2324 files
> Time to scan DICOM files: 9
> Time to build SQLITE3: 13
>
> If I reexecute the same test again (so I take advantage of
> file-caching), it leads to:
>
> ./bin/DumpToSQLITE3 ~/CTK/dicoms
> Finished loading data from : 2324 files
> Time to scan DICOM files: 1
> Time to build SQLITE3: 13
>
> What this means is -IMHO- that GDCM 2.x DICOM parsing seems good
> (under 1 second for 2400 files). The tricky part is how to speed up the
> whole process. My understanding is that GDCM 2.x is still trying to
> read too much data and is doing too many disk accesses.
>
> As a first approach I used grep to check how badly we are doing in GDCM
> 2.x:
>
> $ time grep -m 1 DICM -r ~/CTK/dicoms > log
> grep -m 1 DICM -r /home/mathieu/CTK/dicoms>  l  0.17s user 0.68s
> system 8% cpu 10.605 total
>
> So for each DICOM file, grep should return after reading only
> 128 bytes, since DICM is there. I even tried removing the ZIP and DLL
> files from this test:
>
> $ time find ~/Perso/Consult/dicoms -type f -not \( -name \*.ZIP -o
> -name \*.DLL -o -path \*SYNGO_FV\* \) -exec grep -m 1 DICM {} + > l
> find ~/Perso/Consult/dicoms -type f -not \( -name \*.ZIP -o -name
> \*.DLL -o    0.11s user 0.67s system 8% cpu 9.232 total
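The grep check can also be reproduced without spawning a process per file. A conforming DICOM Part 10 file starts with a 128-byte preamble followed by the 4-byte magic "DICM", so 132 bytes per file is the minimum a scanner has to touch. A sketch of that minimal scan (not GDCM's actual code path):

```python
import os

def is_part10_dicom(path):
    """Return True if the file has a 128-byte preamble followed by 'DICM'."""
    with open(path, "rb") as f:
        header = f.read(132)
    return len(header) == 132 and header[128:132] == b"DICM"

def scan(root):
    # Walk the tree and count files carrying the DICM magic,
    # reading only 132 bytes from each.
    hits = 0
    for dirpath, _, names in os.walk(root):
        for name in names:
            if is_part10_dicom(os.path.join(dirpath, name)):
                hits += 1
    return hits
```

Even this lower bound still pays one seek per file on a cold cache, which is consistent with grep's ~10s wall time dominated by I/O wait (8% CPU).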
>
> There is certainly an overhead introduced when spawning a new process
> in the find expression, but we are very close to that in GDCM. I'd be
> interested if anyone has a tool to measure how much data has been read
> during the execution of a process. (thanks!)
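On Linux the kernel already tracks this per process: /proc/<pid>/io reports rchar (bytes requested through read() and friends) and read_bytes (bytes actually fetched from the storage layer). A small self-measuring sketch (Linux-only, not a GDCM facility):

```python
def io_counters():
    """Parse /proc/self/io (Linux-only) into a dict of counters."""
    counters = {}
    with open("/proc/self/io") as f:
        for line in f:
            key, value = line.split(":")
            counters[key] = int(value)
    return counters

before = io_counters()
# ... run the scan being measured here ...
after = io_counters()
print("bytes requested via read():", after["rchar"] - before["rchar"])
print("bytes hitting storage:     ", after["read_bytes"] - before["read_bytes"])
```

For an external process, reading /proc/<pid>/io just before it exits, or running it under `strace -e trace=read` and summing the return values, gives the same numbers without modifying the program.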
>
> So I really do not know how to speed up this time of 8 seconds for
> scanning the DICOM DataSet. The GDCM parsing only represents ~1s of
> user time.
>
> Regards,
_______________________________________________
slicer-devel mailing list
slicer-devel at bwh.harvard.edu
http://massmail.spl.harvard.edu/mailman/listinfo/slicer-devel
To unsubscribe: send email to 
slicer-devel-request at massmail.spl.harvard.edu with unsubscribe as the 
subject
