elm_suite would crash when CK_FORK=no is set, because evas was
badly initializing or shutting down. Note that elm_suite still
crashes with CK_FORK=no but valgrind doesn't complain.
This fixes invalid mouse cursor used when windows are
created.
Due to the changes in the border theme and the fact that
a border is now always created, the event region
"elm.resize.bl" contains the point (0,0) when the window
size itself is 1x1. As a consequence every EFL application
would permanently have a cursor like the resize bottom/left
handle.
This fixes that by properly checking whether the pointer is
inside an object based on the ins list, and not just the
object geometry.
@feature
See also: b735386a45
this refactors _evas_render_phase1_object_process() into a bunch of
sub functions with leaner code, some LIKELY/UNLIKELY hints etc. etc.
in the hope that we have better l1 instruction cache use when
executing. this actually measureably helps and drops the overhead of
this func ANd all its sub functions from (in my tests in enlightenment
compositing while a video plays) from about 13.2% of all cpu usage by
e to 10%. that's about a 25% drop in cost for passing through phase 1
of evas render... and thats a good thing.
and it also makes the code nicer and more broken up.
@optimize
we are passing the same things into every phase 1 process func - the
same ptrs to the same arrays of objects... why eat up valuable
registers with this? collect into context struct and just pass a ptr
to that. this also makes the code easier to read and maintain too so
bonus all over. also a tiny win in performance but i'd say its "within
error margins" (go from 11.48% to 11.42% overhead).
this tests rendering of images with border scaling if they are small
(smaller than 256x256) to reduce geometry. part of testing a cpu
reduction effort in gl engine by pre-rendering primitive objects to
buffers.
Summary:
In case of thread creation failure, shutdown logic will be stuck.
To prevent stuck, set exit variables to make thread_shutdown working
even if init fails.
Also modify init logics to return init result to a caller.
Reviewers: jypark, woohyun, cedric, jpeg
Subscribers: cedric
Differential Revision: https://phab.enlightenment.org/D4411
Note (@jpeg):
I have modified the patch just a little bit.
Signed-off-by: Jean-Philippe Andre <jp.andre@samsung.com>
The proper way to use the pixel_get callback and dirty flag
is to also specify which exact region has been updated
with data_update_add.
Unfortunately many apps and even GLView are relying on
invalid behaviour that forced full redraw of the image
even though data_update_add was never called.
This amends c1a080f5e4
There is no dirty flag equivalent in EO as there is no
pixel_get callback defined (yet). One problem is that the GL
API is not defined, and may prove hard to define for bindings...
this fixes T4904
dif e pops up menus it uses zoomap to zoom an existing object in and
out by forcing it to be mapped and controlling the map geom based on
smart obj geom. this will do a minimum update region for the case
where a map just got turned on or off for an obj since last frame and
cut this out.
@optimize
this is how you would possibly use prepare stages for objects like
image objects by pre-rendering them to a buffer. this is not complete
and it's actually disabled right now, but it's to show how it might be
done. some more exploring is needed, but this is to share how it
might/should work.
preparing an object is a good idea. especially with gl. you want to do
texture uploads BEFORE using textures all in one batch. otherwise this
may mean the gl implementation has to make a copy of your data in a
tmp location then copy it in later when texture becomes "unused" as it
may be in use at the moment, or it may have to stall and wait.
i have seen somewhere around 7-10% speedups on nvidia and intel
drivers with this on given a very special test case i brewed up (1000
32x32 images where i change 1 pixel every frame). this should have
impact really when we are modifying textures a lot. this is all i've
implemented for now, but this should/would/could do much more like
re-order map, proxy renders to render FIRST in a pre-render list
instead of inline and to pre-render fbo/buffer content for complex
objects like text or textblock etc.
After a few patches trying to fix clipping of frame or
non-frame objects the icon finally ended up invisible. Even
if the elm_icon was marked as is_frame, its internal evas
object image would not have the flag set, thus it would be
clipped out.
Solution: Propagate the is_frame flag to all smart children,
not only when setting it but also when adding new members.
A hack with the API indicates that the frame edje is a very
special object that does not propagate the flag.
See also:
7ce79be1a10f6c33eff19c9c8809a7ac5ca9281c
Test cases (in WL or X with client-side decorations on):
1. elementary_test -to Animation
Resize the window to a small size (eg, 100x100) and observe the
balls overflowing outside the window content part. This tests
unclipped normal objects.
2. elementary_test "Window Plug" (requires also Window Socket)
Drag the handles outside the window, observe overflow in the
framespace area. This tests mapped images ('can_map').
3. elementary_test -to "Gesture Layer"
Drag a photo around. This tests non-image mapped objects.
NOTE: This test is badly broken!
This patch fixes both of those issues. I'm not sure what I'm
breaking, though.
Summary:
Before focusing an object, the intercept focus callback
is called. This callback may ask Evas to focus another object
instead, so it's necessary to check if the seat in question still
have a focused object event after a efl_canvas_object_seat_focus_del() call.
Reviewers: cedric, bdilly, barbieri, ProhtMeyhet, netstar
Subscribers: cedric, jpeg
Maniphest Tasks: T4864, T4886
Differential Revision: https://phab.enlightenment.org/D4396
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
When an object inside a genlist is masked, scrolling would
cause render issues as the mask is not redrawn on move (only
the clip geometry is marked as dirty and recalculated, the
mask pixels are assumed to be well prepared already). As a
result, masked objects in a genlist would not show up
properly once you start scrolling.
This fixes that by hacking into evas a safety test to avoid
unnecessary clipping, and by using parent masks even if they
are not the direct clipper.
Note that no_render is still quite broken (eg. a no_render
mask may cause major issues, even crashes).
This reverts 5917b49f59
Using the multi-seat support, Evas is able to handle multiple focused objects.
This implementation allows one focused object per seat.
This patch introduces new APIs and events to handle this new scenario,
while keeping compatible with the old focus APIs.
See T4749, 11b7cf6b72 introduced an issue and
e1e28ce320 fixed it but caused a massive
performance impact.
This should fix that. Thanks @zmike for the first patch.
Fixes T4840
Summary:
the canvas image is the only one presenting the load api, in all other
implementations you would only see error messages.
Reviewers: jpeg
Subscribers: cedric, raster
Differential Revision: https://phab.enlightenment.org/D4380
Signed-off-by: Cedric BAIL <cedric@osg.samsung.com>
so i have been doing some profiling on my rpi3 ... and it seems
memcmp() is like the number one top used function - especially running
e in wayland compositor mode. it uses accoring to perf top about 9-15%
of samples (samples are not adding up to 100%). no - i cant seem to
get a call graph because all that happens is the whole kernel locks up
solid if i try, so i can only get the leaf node call stats. what
function was currently active at the sample time. memcmp is the
biggest by far. 2-3 times anything else.
13.47% libarmmem.so [.] memcmp
6.43% libevas.so.1.18.99 [.] _evas_render_phase1_object_pro
4.74% libevas.so.1.18.99 [.] evas_render_updates_internal.c
2.84% libeo.so.1.18.99 [.] _eo_obj_pointer_get
2.49% libevas.so.1.18.99 [.] evas_render_updates_internal_l
2.03% libpthread-2.24.so [.] pthread_getspecific
1.61% libeo.so.1.18.99 [.] efl_data_scope_get
1.60% libevas.so.1.18.99 [.] _evas_event_object_list_raw_in
1.54% libevas.so.1.18.99 [.] evas_object_smart_changed_get
1.32% libgcc_s.so.1 [.] __udivsi3
1.21% libevas.so.1.18.99 [.] evas_object_is_active
1.14% libc-2.24.so [.] malloc
0.96% libevas.so.1.18.99 [.] evas_render_mapped
0.85% libeo.so.1.18.99 [.] efl_isa
yeah. it's perf. it's sampling so not 100% accurate, but close to
"good enough" for the bigger stuff. so interestingly memcmp() is
actually in a special library/module (libarmmem.so) and is a REAL
function call. so doing memcmp's for small bits of memory ESPECIALLY
when we know their size in advance is not great. i am not sure our own
use of memcmp() is the actual culprit because even with this patch
memcmp still is right up there. we use it for stringshare which is
harder to remove as stringshare has variable sized memory blobs to
compare.
but the point remains - memcmp() is an ACTUAL function call. even on
x86 (i checked the assembly). and replacing it with a static inline
custom comparer is better. in fact i did that and benchmarked it as a
sample case for eina_tiler which has 4 ints (16 bytes) to compare
every time. i also compiled to assembly on x86 to inspect and make sure
things made sense.
the text color compare was just comparing 4 bytes as a color (an int
worth) which was silly to use memcmp on as it could just cast to an
int and do a == b. the map was a little more evil as it was 2 ptrs
plus 2 bitfields, but the way bitfields work means i can assume the
last byte is both bitfields combined. i can be a little more evil for
the rect tests as 4 ints compared is the same as comparing 2 long
longs (64bit types). yes. don't get pedantic. all platforms efl works
on work this way and this is a base assumption in efl and it's true
everywhere worth talking about.
yes - i tried __int128 too. it was not faster on x86 anyway and can't
compile on armv7. in my speed tests on x86-64, comparing 2 rects by
casting to a struct of 2 long long's and comparing just those is 70%
faster than comapring 4 ints. and the 2 long longs is 360% faster than
a memcmp. on arm (my rpi3) the long long is 12% faster than the 4 ints,
and it is 226% faster than a memcmp().
it'd be best if we didnt even have to compare at all, but with these
algorithms we do, so doing it faster is better.
we probably should nuke all the memcmp's we have that are not of large
bits of memory or variable sized bits of memory.
i set breakpoints for memcmp and found at least a chunk in efl. but
also it seems the vc4 driver was also doing it too. i have no idea how
much memory it was doing this to and it may ultimately be the biggest
culprit here, BUT we may as well reduce our overhead since i've found
this anyway. less "false positives" when hunting problems.
why am i doing this? i'm setting framerate hiccups. eg like we drop 3,
5 or 10 frames, then drop another bunch, then go back to smooth, then
this hiccup again. finding out WHAT is causing that hiccup is hard. i
can only SEE the hiccups on my rpi3 - not on x86. i am not so sure
it's cpufreq bouncing about as i've locked cpu to 600mhz and it still
happens. it's something else. maybe something we are polling? maybe
it's something in our drm/kms backend? maybe its in the vc4 drivers or
kernel parts? i have no idea. trying to hunt this is hard, but this is
important as this is something that possibly is affecting everyone but
other hw is fast enough to hide it...
in the meantime find and optimize what i find along the way.
@optimize
Evil implementation of pipe() function uses sockets. Windows functions
"write", "read" and "close" doesn't works with sockets. In this commit
added macros, that replace "read" with "recv", "write" with "send" and
"close" with "closesocket".
@fix