docs say return true on succesas, false on failure. adding a rect we
already added is not a failur. it's an optimization to a NOP. so fix.
this was brought up by and fixes T6669 ... but in the opposite way.
since this code's creation it seems that the internal int size was set to use
short in order to micro-optimize memory usage, while the api function parameters
used Eina_Rectangle which had a larger int size. when initializing the internal
rect struct, this would lead to overflows which resulted in broken tilers which
returned iterators with no valid rects after having valid rects added
test case: run weston-subsurfaces
@fix
so i have been doing some profiling on my rpi3 ... and it seems
memcmp() is like the number one top used function - especially running
e in wayland compositor mode. it uses accoring to perf top about 9-15%
of samples (samples are not adding up to 100%). no - i cant seem to
get a call graph because all that happens is the whole kernel locks up
solid if i try, so i can only get the leaf node call stats. what
function was currently active at the sample time. memcmp is the
biggest by far. 2-3 times anything else.
13.47% libarmmem.so [.] memcmp
6.43% libevas.so.1.18.99 [.] _evas_render_phase1_object_pro
4.74% libevas.so.1.18.99 [.] evas_render_updates_internal.c
2.84% libeo.so.1.18.99 [.] _eo_obj_pointer_get
2.49% libevas.so.1.18.99 [.] evas_render_updates_internal_l
2.03% libpthread-2.24.so [.] pthread_getspecific
1.61% libeo.so.1.18.99 [.] efl_data_scope_get
1.60% libevas.so.1.18.99 [.] _evas_event_object_list_raw_in
1.54% libevas.so.1.18.99 [.] evas_object_smart_changed_get
1.32% libgcc_s.so.1 [.] __udivsi3
1.21% libevas.so.1.18.99 [.] evas_object_is_active
1.14% libc-2.24.so [.] malloc
0.96% libevas.so.1.18.99 [.] evas_render_mapped
0.85% libeo.so.1.18.99 [.] efl_isa
yeah. it's perf. it's sampling so not 100% accurate, but close to
"good enough" for the bigger stuff. so interestingly memcmp() is
actually in a special library/module (libarmmem.so) and is a REAL
function call. so doing memcmp's for small bits of memory ESPECIALLY
when we know their size in advance is not great. i am not sure our own
use of memcmp() is the actual culprit because even with this patch
memcmp still is right up there. we use it for stringshare which is
harder to remove as stringshare has variable sized memory blobs to
compare.
but the point remains - memcmp() is an ACTUAL function call. even on
x86 (i checked the assembly). and replacing it with a static inline
custom comparer is better. in fact i did that and benchmarked it as a
sample case for eina_tiler which has 4 ints (16 bytes) to compare
every time. i also compiled to assembly on x86 to inspect and make sure
things made sense.
the text color compare was just comparing 4 bytes as a color (an int
worth) which was silly to use memcmp on as it could just cast to an
int and do a == b. the map was a little more evil as it was 2 ptrs
plus 2 bitfields, but the way bitfields work means i can assume the
last byte is both bitfields combined. i can be a little more evil for
the rect tests as 4 ints compared is the same as comparing 2 long
longs (64bit types). yes. don't get pedantic. all platforms efl works
on work this way and this is a base assumption in efl and it's true
everywhere worth talking about.
yes - i tried __int128 too. it was not faster on x86 anyway and can't
compile on armv7. in my speed tests on x86-64, comparing 2 rects by
casting to a struct of 2 long long's and comparing just those is 70%
faster than comapring 4 ints. and the 2 long longs is 360% faster than
a memcmp. on arm (my rpi3) the long long is 12% faster than the 4 ints,
and it is 226% faster than a memcmp().
it'd be best if we didnt even have to compare at all, but with these
algorithms we do, so doing it faster is better.
we probably should nuke all the memcmp's we have that are not of large
bits of memory or variable sized bits of memory.
i set breakpoints for memcmp and found at least a chunk in efl. but
also it seems the vc4 driver was also doing it too. i have no idea how
much memory it was doing this to and it may ultimately be the biggest
culprit here, BUT we may as well reduce our overhead since i've found
this anyway. less "false positives" when hunting problems.
why am i doing this? i'm setting framerate hiccups. eg like we drop 3,
5 or 10 frames, then drop another bunch, then go back to smooth, then
this hiccup again. finding out WHAT is causing that hiccup is hard. i
can only SEE the hiccups on my rpi3 - not on x86. i am not so sure
it's cpufreq bouncing about as i've locked cpu to 600mhz and it still
happens. it's something else. maybe something we are polling? maybe
it's something in our drm/kms backend? maybe its in the vc4 drivers or
kernel parts? i have no idea. trying to hunt this is hard, but this is
important as this is something that possibly is affecting everyone but
other hw is fast enough to hide it...
in the meantime find and optimize what i find along the way.
@optimize
applying this optimization to prevent the same rectangle from being added
or removed repeatedly in succession would result in the rejecting of successive
operations of the same type when the other operation occurred in between.
as an example:
add(0, 0, 100, 100)
del(0, 0, 100, 100)
add(0, 0, 100, 100)
should yield (0, 0, 100, 100), not zero rects and a failure to add the
second rect
this fixes a serious issue in enlightenment where stacking three windows
on top of each other with the first and third windows having the same geometry
would result in the top window receiving no input geometry (oops)
@fix
Summary:
change eina_tiler_intersection to return a NULL if intersection
of two tilers doesn't exist. and add test case to check it.
This doesn't break ABI/API as this call could already return a NULL value and it
should have been handled by the caller anyway. This just make an expected behavior
more correct.
Test Plan: run eina_suite after building eina test suite
Reviewers: cedric, raster, torori, devilhorns
Subscribers: cedric
Differential Revision: https://phab.enlightenment.org/D1205
Signed-off-by: Cedric BAIL <c.bail@partner.samsung.com>
Summary:
Fix invalid read on eina tiler reported by valgrind.
This revision will prevent access to data that was gained from eina iterator, after free of eina_iterator.
Test Plan:
1. Build enlightenment on devs/devilhorns/e_comp_wl branch with efl applyied this patch.
2. Run enlightenment with valgrind options.
3. build enlightenment with this patch
4. Run any wayland app on enlightenment
5. There will be no more invalid read message by valgrind.
Reviewers: cedric, devilhorns, raster, gwanglim, zmike
CC: cedric
Differential Revision: https://phab.enlightenment.org/D1080
Summary:
If one of the given tilers is empty, then crash could be occurred in internal while loop of eina_tiler_intersection.
And fix some memory leaks in eina_tiler_intersection and eina_tiler_equal.
Reviewers: devilhorns, raster, cedric, torori
CC: cedric
Differential Revision: https://phab.enlightenment.org/D996
Summary:
Support union, subtract, intersection, equal(comparison) between tilers.
@feature
Test Plan: Test with added test case(src/tests/eina/eina_test_tiler.c) and the example(src/examples/eina/eina_tiler_02.c)
Reviewers: gwanglim, devilhorns, raster, zmike, cedric
CC: cedric
Differential Revision: https://phab.enlightenment.org/D880