very very very slow process

very very very slow process

Philippe Mailly
Hi,

We acquired two very powerful computers to run Imaris software: Dell
T7910 with 1 TB of DDR4 RAM, 2 Xeon E5-2699 v4 (2.2 GHz, 3.6 GHz Turbo,
22 cores, 55 MB cache, 145 W), two 1 TB SSD drives, NVIDIA P6000.

It seems that with Java applications these computers are 10 times slower
than a Dell T7910 with 64 GB of RAM and 2 Xeon E5-2630 (2.4 GHz). A 2D
median filter, size=2, on a 2048*2048*6 image takes 55 s on the "very
fast computer" compared to 3 s on the "slower computer"! I tried
reducing the memory in the Fiji options, without any change. I get the
same results with Icy, which is also based on Java.

With NovaBench I got:

                 E5-2699    E5-2630
Global score     3723       2982
CPU score        2498       1939

What's wrong ????

Philippe

--
*Philippe Mailly*
/Phd, Research Engineer/
Imaging Core Facility
CIRB, CNRS, UMR 7241/ INSERM 1050, Collège de France
11 place Marcelin Berthelot, 75005, PARIS, FRANCE

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html

Re: very very very slow process

Gabriel Landini
Are you sure you are comparing the same setup and the same image type?
Same version of Java?
Same version of IJ?
Same video drivers?
What OS?
How much memory and parallel threads have you set in the options?
Options>Memory & Threads...

What results do you get with this:
https://imagej.nih.gov/ij/source/ij/plugin/filter/Benchmark.java

I get:
ImageJ: 1.51u
OS : Linux 4.4.104-39-default
Java: 1.8.0_121, vm: 25.121-b13 Oracle Corporation
Benchmark best: 0.227
Benchmark worst: 0.26

Cheers

Gabriel


Re: very very very slow process

Michael Schmid-3
In reply to this post by Philippe Mailly
Hi Philippe,

there is definitely something wrong with a median radius=2 taking 55
seconds on 2048*2048*6 pixels. It takes about 3-4 seconds on my
2011-vintage Core i5 notebook (32-bit image, 4 GB for ImageJ).
Please check in Edit>Options>Memory & Threads whether the number of
threads for parallelization is reasonable.

Is this a purely Fiji problem or does it also occur with plain ImageJ?
(maybe try with the Java that comes bundled with ImageJ)
Which Java version and operating system do you use?

Michael

Re: very very very slow process

Philippe Mailly
In reply to this post by Gabriel Landini
With the same 2048*2048*6 image; for all tests Memory = 785748 MB, CPU = 44.

ImageJ 1.51j8, Java x64 1.8.0_112
Benchmark: 1.844 s, 1.7 M pixel/s
Median: 1.47 s

Fiji 1.51s, Java x64 1.8.0_162
Benchmark: 2.4 s
Median: 45.58 s

Fiji 1.51s, bundled Java x64 1.8.0_66
Benchmark: 2.6 s
Median: 42.44 s

It seems to be a Fiji problem?

Philippe



Re: very very very slow process

Philippe Mailly
Hi,
Some new tests.

If I run Fiji in 32 bits, the median filter speed is similar to that on
other computers.
In 64 bits, all 2D filters (median, mean, variance, ...) on an image
stack are very slow (40 s).
The same types of filters in 3D (median 3D, mean 3D, variance 3D, ...)
take only 10 s.

Philippe


Re: very very very slow process (what is different in Fiji?)

Michael Schmid-3
Hi everyone,

after some off-list discussion and tests run by Philippe:

The problem of unexpectedly poor performance on a Xeon 44-core machine
only appears in Fiji, not in plain ImageJ.
The problem is not limited to the "RankFilters" (Process>Filters>Mean,
Median, Minimum, etc., everything that uses the "Circular Masks" for the
neighborhood) but one can see parallelization problems also in other
functions, though not as severe.

From all the evidence, it seems that sometimes one core or a few cores
are not available for processing for a rather long time (at least
milliseconds; probably they do something else during that period). This
has especially bad consequences for the current RankFilters because they
expect all threads to work continuously, and eventually the other threads
have to wait if one thread (core) is inactive.
(Due to optimized memory access, the RankFilters parallelization
strategy has been very fast on machines like the Core i5 that were
popular at the time when I developed it, but it performs poorly under
such circumstances).
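
A toy model of the effect described above, as a minimal Python sketch. It is not the RankFilters code: the function names, the per-line work time and the stall pattern are all assumptions made for illustration. It compares a lockstep scheme, where all threads wait for the slowest one after every round, with a scheme where each thread works through its own share without waiting.

```python
def lockstep_time(n_threads, n_rounds, work_ms, stalls):
    """Total time when all threads synchronize after every round.

    stalls maps (round, thread) -> extra delay in ms (hypothetical pauses,
    e.g. a core that is busy with something else).
    """
    total = 0.0
    for r in range(n_rounds):
        # every round lasts as long as its slowest thread
        total += max(work_ms + stalls.get((r, t), 0.0) for t in range(n_threads))
    return total


def independent_time(n_threads, n_rounds, work_ms, stalls):
    """Total time when each thread processes its rounds without waiting."""
    per_thread = [
        sum(work_ms + stalls.get((r, t), 0.0) for r in range(n_rounds))
        for t in range(n_threads)
    ]
    # the job is done when the slowest thread finishes
    return max(per_thread)


# Assumed scenario: 4 threads, 10 rounds of 1 ms each, and in every round
# a different thread is paused for 5 ms.
stalls = {(r, r % 4): 5.0 for r in range(10)}
print(lockstep_time(4, 10, 1.0, stalls))     # every round pays for one stall
print(independent_time(4, 10, 1.0, stalls))  # only the unluckiest thread counts
```

In this model the lockstep variant pays for every pause, whichever thread it hits, so the more threads take part, the more often some thread holds up all the others.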

So the question: What is different on Fiji vs. plain ImageJ, concerning
threads?

The problems could be explained e.g. by slower switching between threads
in case one core has to handle several threads, or by some background
activity that does not happen in plain ImageJ.
I suspect different Java options in the Launcher of Fiji vs. plain
ImageJ, but I know nothing about Fiji.


Michael

Re: very very very slow process (what is different in Fiji?)

Krieger, Donald N.
A thought from the peanut gallery ...

This issue is very interesting and I think will be worth at least a technical note depending on what you find is the true cause.

I think it would be very cool to see a publication come directly out of the email list.

Best - Don


Re: very very very slow process (what is different in Fiji?)

Saalfeld, Stephan
In reply to this post by Michael Schmid-3
Prime candidates:

* parallel garbage collection
* other JVM

Have you tried to limit the number of threads to 1 or 2 less than the
maximum number of processors if the number of available processors is
very large?
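
The suggestion above could be sketched as follows in Python. The helper `capped_threads`, its `reserve` parameter and the threshold of 16 processors for "very large" are assumptions for illustration, not anything ImageJ or Fiji provides; the resulting number would still have to be entered by hand in Edit>Options>Memory & Threads.

```python
import os


def capped_threads(available, reserve=2, large=16):
    """Leave `reserve` processors free on machines with many processors.

    `large` is an arbitrary, assumed threshold for "very large".
    """
    if available >= large:
        return max(1, available - reserve)
    return available


n_cpus = os.cpu_count() or 1  # logical processors reported by the OS
print(capped_threads(n_cpus))
```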

Thanks,
Stephan




Re: very very very slow process (what is different in Fiji?)

Philippe Mailly
Yes, as you can see in the plots.

Philippe



Attachments:
2DMedian timesFiji1.51u_44CPU.tif (293K)
2DMedian timesImageJ1.51u_44CPU.tif (293K)
3DMedian timesFiji1.51u_44CPU.tif (293K)
3DMedian timesImageJ1.51u_44CPU.tif (293K)

Re: very very very slow process (what is different in Fiji?)

Saalfeld, Stephan
Thanks!  Have you tried other garbage collectors?

./fiji -Xmx8g -Xincgc --
./fiji -Xmx8g -XX:+UseParallelGC --
./fiji -Xmx8g -XX:+UseConcMarkSweepGC --

The last one is my favorite for working with BDV and lazy caches, but it
uses parallel threads. I am not sure what Fiji's defaults are at this
time.
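
For running the same test once per collector, the three command lines above could be generated rather than typed. This is only a sketch: the helper name `fiji_command` is hypothetical, the launcher path, heap size and flags are copied from the commands above, and whether a given Java version accepts each flag has not been checked here.

```python
# Garbage-collector flags exactly as listed in the message above
GC_FLAGS = ["-Xincgc", "-XX:+UseParallelGC", "-XX:+UseConcMarkSweepGC"]


def fiji_command(gc_flag, heap="8g", launcher="./fiji"):
    """Return the argument list for one launcher invocation.

    Java options come first and are terminated by "--", as in the
    commands above.
    """
    return [launcher, "-Xmx" + heap, gc_flag, "--"]


for flag in GC_FLAGS:
    print(" ".join(fiji_command(flag)))
```

The list form can be passed directly to `subprocess.run` if the runs are to be started from a script.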

Thanks,
Stephan



Re: very very very slow process (what is different in Fiji?)

Philippe Mailly
In reply to this post by Philippe Mailly
Hi Greg,

Sorry, I forgot the legend. Here is the macro from Michael:

// Test macro: time the 2D median filter as a function of the number of threads
setBatchMode(true);
nTries = 40;
maxThreads = 30;
threads = newArray(nTries);
times = newArray(nTries);
for (i = 0; i < nTries; i++) {
    // quadratic spacing: more attempts with lower n
    nThreads = 1 + round((i/(nTries-1)) * (i/(nTries-1)) * (maxThreads-1));
    run("Memory & Threads...", "parallel=" + nThreads + " keep run");
    newImage("Untitled", "32-bit random", 4096, 4096, 6);
    t1 = getTime();
    run("Median...", "radius=2 stack");
    t2 = getTime();
    threads[i] = nThreads;
    times[i] = t2 - t1;   // elapsed time in ms
    close();
}
setBatchMode(false);
Plot.create("Median times", "Number of threads", "time(ms)", threads, times);
showMessage("Please reset Edit>Options>Memory&Threads to your favorite settings");
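
To see which thread counts the macro actually tries, its nThreads formula can be evaluated on its own. The following Python sketch is an illustration only: the function name `thread_schedule` is made up, and rounding halves up is an assumption about how the macro's round() behaves.

```python
import math


def thread_schedule(n_tries=40, max_threads=30):
    """Thread counts tried by the macro, in order."""
    schedule = []
    for i in range(n_tries):
        x = i / (n_tries - 1)
        # floor(v + 0.5) rounds halves up (assumed to match the macro's round())
        schedule.append(1 + math.floor(x * x * (max_threads - 1) + 0.5))
    return schedule


counts = thread_schedule()
print(counts)
```

Because the spacing is quadratic in i, more than half of the 40 attempts use 10 threads or fewer, and only the last one reaches maxThreads.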

2DMedian times Fiji1.51u_44CPU tested with Fiji (ij.jar 1.51u)
2DMedian times ImageJ1.51u_44CPU tested with ImageJ (ij.jar 1.51u)
3DMedian times Fiji1.51u_44CPU tested with Fiji (ij.jar 1.51u)
3DMedian times ImageJ1.51u_44CPU tested with ImageJ (ij.jar 1.51u)


Philippe
On 15/02/2018 at 16:58, Gregory Jefferis wrote:

> @Philippe
>
> What do the different plots correspond to? I see that Plot #3 takes
> off at 44 threads (the number of cores on your machine IIUC; physical
> or logical?).
>
> With Fiji have you tried starting from command line with:
>
> --default-gc
>
> Stephan also seemed to think the GC was suspicious.
>
> Best,
>
> Greg.
>
>
> --
> Gregory Jefferis, PhD
> Division of Neurobiology
> MRC Laboratory of Molecular Biology
> Francis Crick Avenue
> Cambridge Biomedical Campus
> Cambridge, CB2 OQH, UK
>
> http://www2.mrc-lmb.cam.ac.uk/group-leaders/h-to-m/g-jefferis
> http://jefferislab.org
> http://www.zoo.cam.ac.uk/departments/connectomics
>
>
>


Re: very very very slow process (what is different in Fiji?)

Philippe Mailly
In reply to this post by Saalfeld, Stephan
Hi Stephan,

The computer on which I run the test has 1 TB of RAM. Should I test
with -Xmx8g or with -Xmx785g, as configured in Fiji's Memory & Threads
options?

Philippe




Re: very very very slow process (what is different in Fiji?)

Saalfeld, Stephan
-Xmx785g


On Thu, 2018-02-15 at 17:12 +0100, Philippe Mailly wrote:

> Hi Stephan,
>
> The computer on which I run the test have 1To of RAM memory should I 
> test with  -Xmx8g or with -Xmx785g as it's configured in the Fiji
> memory 
> and Threads options ?
>
> Philippe
>
>
>
> Le 15/02/2018 à 17:02, Saalfeld, Stephan a écrit :
> >
> > Thanks!  Have you tried other garbage collectors?
> >
> > ./fiji -Xmx8g -Xincgc --
> > ./fiji -Xmx8g -XX:+UseParallelGC --
> > ./fiji -Xmx8g -XX:+UseConcMarkSweepGC --
> >
> > The last one is my favorite for working with BDV and lazy caches,
> > but
> > utilizes parallel threads, not sure what Fiji's defaults are at
> > this
> > time.
> >
> > Thanks,
> > Stephan
> >
> >
> > On Thu, 2018-02-15 at 16:50 +0100, Philippe Mailly wrote:
> > >
> > > Yes as you can see on the plots.
> > >
> > > Philippe
> > >
> > >
> > > Le 15/02/2018 à 15:43, Saalfeld, Stephan a écrit :
> > > >
> > > > Prime candidates:
> > > >
> > > > * parallel garbage collection
> > > > * other JVM
> > > >
> > > > Have you tried limiting the number of threads to 1 or 2 fewer than
> > > > the maximum number of processors, if the number of available
> > > > processors is very large?
> > > >
> > > > Thanks,
> > > > Stephan
> > > >
> > > > On Thu, 2018-02-15 at 15:25 +0100, Michael Schmid wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > after some off-list discussion and tests run by Philippe:
> > > > >
> > > > > The problem of unexpectedly poor performance on a Xeon 44-core
> > > > > machine only appears in Fiji, not in plain ImageJ.
> > > > > The problem is not limited to the "RankFilters"
> > > > > (Process>Filters>Mean, Median, Minimum, etc., everything that uses
> > > > > the "Circular Masks" for the neighborhood); one can also see
> > > > > parallelization problems in other functions, though not as severe.
> > > > >
> > > > > From all the evidence, it seems that sometimes one core or a few
> > > > > cores are not available for processing for a rather long time (at
> > > > > least milliseconds; probably they do something else during that
> > > > > period). This has especially bad consequences for the current
> > > > > RankFilters because they expect all threads to work continuously,
> > > > > and eventually the other threads have to wait if one thread (core)
> > > > > is inactive.
> > > > > (Due to optimized memory access, the RankFilters parallelization
> > > > > strategy has been very fast on machines like the Core i5 that were
> > > > > popular at the time when I developed it, but it performs poorly
> > > > > under such circumstances.)
> > > > >
> > > > > So the question: What is different in Fiji vs. plain ImageJ
> > > > > concerning threads?
> > > > >
> > > > > The problems could be explained e.g. by slower switching between
> > > > > threads in case one core has to handle several threads, or by some
> > > > > background activity that does not happen in plain ImageJ.
> > > > > I suspect different Java options in the launcher of Fiji vs. plain
> > > > > ImageJ, but I know nothing about Fiji.
> > > > >
> > > > > Michael
> > > > > ________________________________________________________________
> > > > >
> > > > > On 13/02/2018 12:01, Philippe Mailly wrote:
> > > > > >
> > > > > > Hi,
> > > > > > Some new tests.
> > > > > >
> > > > > > If I run Fiji in 32 bits, the median filter speed is similar to
> > > > > > that on other computers.
> > > > > > In 64 bits, all 2D filters (median, mean, variance ...) on an
> > > > > > image stack are very slow (40 s).
> > > > > > The same types of filters in 3D (median3D, mean3D, variance 3D
> > > > > > ...) take only 10 s.
> > > > > >
> > > > > > Philippe
--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html


Re: very very very slow process (what is different in Fiji?)

Philippe Mailly
Here are the plots. However, the option -XX:+UseParallelGC -- doesn't work: it fails with "can't create a Java virtual machine".

Philippe
 

----- Original message -----
From: "Saalfeld, Stephan" <[hidden email]>
To: [hidden email]
Sent: Thursday, 15 February 2018 17:29:03
Subject: Re: very very very slow process (what is different in Fiji?)

-Xmx785g


On Thu, 2018-02-15 at 17:12 +0100, Philippe Mailly wrote:

> Hi Stephan,
>
> The computer on which I run the test has 1 TB of RAM. Should I test
> with -Xmx8g, or with -Xmx785g as configured in Fiji's Memory & Threads
> options?
>
> Philippe
>
>
>
> On 15/02/2018 at 17:02, Saalfeld, Stephan wrote:
> >
> > Thanks!  Have you tried other garbage collectors?
> >
> > ./fiji -Xmx8g -Xincgc --
> > ./fiji -Xmx8g -XX:+UseParallelGC --
> > ./fiji -Xmx8g -XX:+UseConcMarkSweepGC --
> >
> > The last one is my favorite for working with BDV and lazy caches, but
> > it uses parallel threads. I'm not sure what Fiji's defaults are at
> > this time.
> >
> > Thanks,
> > Stephan
> >
> >
> > On Thu, 2018-02-15 at 16:50 +0100, Philippe Mailly wrote:
> > >
> > > Yes as you can see on the plots.
> > >
> > > Philippe
> > >
> > >
--
Philippe Mailly
PhD, CNRS Research Engineer
CIRB Imaging Facility, CNRS UMR 7241 / INSERM U 1050
Collège de France
11 Place Marcelin Berthelot,
75005, Paris, France

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html

Attachments:
2DMedian timesFiji1.51u_44CPU-Xmx780g-Xincgc.tif (293K)
2DMedian timesFiji1.51u_44CPU-Xmx780g-XXUseConcMarkSweepGC.tif (293K)

Re: very very very slow process (what is different in Fiji?)

Michael Schmid-3
In reply to this post by Saalfeld, Stephan
Hi everyone,

after a bit of further analysis of the Fiji vs. ImageJ performance by
Philippe (supplemented by some ideas of mine):

Parallelization performance is sometimes very different between plain
ImageJ and Fiji (this differs a lot between different filters/functions,
and strongly depends on the size of the data set). Many times, plain
ImageJ is better, sometimes Fiji.
Performance often differs between ImageJ and Fiji by factors of five or
more!

The garbage collector seems to make little or no difference. It usually
seems to do nothing, which is no wonder when processing half-GB image
stacks on a machine with a TB of memory.
The startup options in Fiji's jvm.cfg only specify 'server' mode, but
plain ImageJ also runs a 'server' JVM, so jvm.cfg isn't responsible for
the differences either.
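
One way to check whether the garbage collector is really idle during a filter run is to read the JVM's own collector counters before and after the operation. The following is only a minimal sketch using the standard java.lang.management API; the class name GcActivity and its helper methods are made up for illustration and are not part of ImageJ or Fiji.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not part of ImageJ/Fiji): reports how much the
// garbage collectors of the running JVM have worked so far.
class GcActivity {

    /** Number of collections so far, summed over all collectors. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) total += count;
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds, summed over all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) total += millis;
        }
        return total;
    }

    public static void main(String[] args) {
        // List the collectors that are actually active in this JVM.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Calling totalCollections() and totalCollectionMillis() immediately before and after the filter, and comparing the differences in plain ImageJ and in Fiji, would show whether collections happen at all during the slow runs.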

While the Java version is the same for both, and both are Oracle Java,
there is a difference between plain ImageJ and Fiji:
The java.home property is
for ImageJ: C:\Program Files\Java\jre1.8.0_162
for Fiji:   C:\Program Files\Java\jdk1.8.0_162\jre

Also, the library paths etc. are different (pointing to somewhere in the
respective java.home folder):
   java.ext.dirs
   sun.boot.library.path
   java.endorsed.dirs


Java experts out there, do you have any idea whether this could affect
how Java handles multithreading (e.g. how the different threads are
scheduled)?
Are there Java options to tweak it?


Michael

--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html
Reply | Threaded
Open this post in threaded view
|

Re: very very very slow process (what is different in Fiji?)

Michael Schmid-3
Hi everyone (especially Fiji experts!),

the attachment shows an example of Fiji vs. plain ImageJ parallelization
performance. For this, I have already changed the RankFilters (which do
the median) to be less sensitive to cases where one thread gets stuck;
otherwise the Fiji performance with a large number of processors would
be even worse.
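
As an illustration of what "less sensitive to a stuck thread" can mean in general: instead of assigning each thread a fixed block of image lines, every thread can fetch the next unprocessed line from a shared counter, so a thread that stalls only delays the single line it currently holds. The sketch below shows this general technique only. It is not the actual RankFilters code, and the class name DynamicLines and the placeholder "work" are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only (not the RankFilters implementation):
// worker threads pull line numbers from a shared counter instead of
// each owning a fixed range of lines.
class DynamicLines {

    /**
     * Processes lines 0..nLines-1 with nThreads workers and returns,
     * for each line, how many times it was processed.
     */
    static int[] processAll(final int nLines, int nThreads) {
        final int[] done = new int[nLines];
        final AtomicInteger next = new AtomicInteger(0);
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            workers[t] = new Thread(() -> {
                int line;
                // Each line number is handed out exactly once.
                while ((line = next.getAndIncrement()) < nLines) {
                    done[line]++; // placeholder for the real per-line filter work
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // wait until every worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return done;
    }

    public static void main(String[] args) {
        int[] done = processAll(2048, Runtime.getRuntime().availableProcessors());
        System.out.println("lines processed: " + done.length);
    }
}
```

The trade-off is that neighbouring lines are no longer guaranteed to be handled by the same thread, which can cost some of the memory-access locality that a fixed assignment provides.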

In both cases, it runs at the default priority=4.

Maybe the following information provides a few more clues:
- The problem is found for a Windows 10 machine (two Xeon E5-2699 v4,
2*22 cores + hyperthreading, 1 TB)
- No such problem on a slightly slower Windows 7 machine with Xeon
E5-2630v3 processors (max. 32 threads), 64 GB RAM.

It can't be a problem of the computer or of Windows 10, however, because
plain ImageJ works well; the poor parallelization performance occurs
only under Fiji.

So again, what could be different between plain ImageJ and Fiji?
(At least in theory, thread scheduling, time slices, etc. should be done
by the operating system, so the Win 10 vs. Win 7 difference should
affect plain ImageJ and Fiji the same way).


Michael
--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html

Attachment: 2DMedian_times_FIji_vs_ImageJ.png (27K)

Re: very very very slow process (what is different in Fiji?)

ctrueden
In reply to this post by Michael Schmid-3
Hi Michael,

I'm sorry, but I don't have much insight into the problem, nor any time to
investigate it.

I agree with Stephan that the parameters used to launch the JVM are
paramount. I did not see any details posted about the exact java
invocations that were used. If I were investigating this, I would
definitely want to do the benchmarks after invoking java directly, rather
than via the ImageJ launcher. We want to compare apples to apples.
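
As a starting point for finding the exact invocation, a running JVM can report the arguments it was started with. The following is a minimal sketch using the standard RuntimeMXBean; the class name LaunchInfo is hypothetical, and the code would have to be run inside each application (for example from a small plugin) to be meaningful.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.util.List;

// Hypothetical helper (not part of ImageJ/Fiji): shows how the
// currently running JVM was launched.
class LaunchInfo {

    /** JVM options passed at startup (e.g. -Xmx..., -XX:...), without main-class arguments. */
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        System.out.println("VM: " + runtime.getVmName() + " " + runtime.getVmVersion());
        System.out.println("JVM arguments:");
        for (String arg : jvmArguments()) {
            System.out.println("  " + arg);
        }
        System.out.println("Class path: " + runtime.getClassPath());
    }
}
```

Once the argument lists from plain ImageJ and from Fiji are known, the same options can be passed to java directly for both, so that the benchmark compares like with like.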

It would also likely be valuable to profile the code to see where the
bottlenecks are, comparing plain IJ1 with IJ2. See
https://imagej.net/Profiling for some approaches.

Regards,
Curtis

--
Curtis Rueden
LOCI software architect - https://loci.wisc.edu/software
ImageJ2 lead, Fiji maintainer - https://imagej.net/User:Rueden
Did you know ImageJ has a forum? http://forum.imagej.net/



--
ImageJ mailing list: http://imagej.nih.gov/ij/list.html