|Age||Commit message (Collapse)||Author|
This should save a bit of memory for all image & text
objects. This exploits the previous patch for the post-render
job queue added to evas, and simplifies this bit of code.
This will be used by the filters
i found evas_common_draw_context_apply_cutouts() was procsessing 300+
cutouts and as it's O(n^2)/2 to try and merge adjacent rects for
cutouts this really performs like complete junk. we apply cutout rects
a LOT. this is not the best solution, but it's quick and much faster
than doing the clipouts which drop framerate to like 1-2fps or so in the
nasty case i say (tyls -m of photos in a dir with a 2160 high
this figures out the target area to limit the count of rects
significantly so O(n^2) is far far better when n is now < 10 most of
the time. and for the few operations where it's a high value this now
uses qsort to speed up merges etc. etc.
This fixes evas_object_image_save after changing the orientation
of an image in the GL engine. In SW engine the pixel data is rotated
in memory, so things worked fine from the beginning. In GL we may
have to go through loops and hoops in order to rotate and fetch the
data from the GL texture.
This should fix ce45d44.
This reverts commit 4e110a34bffe09abe4dd793f9ecf1cf3884ccf22.
Urgh. I'm stupid. Conceptually I still like the idea of the
region proxy, that only renders part of the source object
in a proxy surface. The problem right now is that the proxy
subrender mechanism renders to a surface that belongs to
the **source** and not the proxy object. As a consequence,
the region would become a property of the source, rather than
the proxy, which is not at all the intention of the original
patch. In other words, everything would fall apart if an object
has more than one proxy, for whatever reason.
I realized that when trying to actually test the region proxy.
It didn't work at all. Revert for now.
Only the size of events_whitelist isn't enough, because in some
cases the user may be disabling the usage of a specific seat.
Considering the following scenario, the issue will easy to understand:
- an application with two entries (one to be used by seat 1 and other
by seat 2)
- the first seat is announced - it is enabled for entry 1 and
disabled for entry 2
- the second seat is announced
Before second seat is announced, the first seat would be able
to input the entry 1, because the events_whitelist of such
object will continue empty.
So a flag will be used to identify an object with active
Reviewed By: iscaro
Differential Revision: https://phab.enlightenment.org/D4498
clean_layer is a bool
Always assume use_mapped_ctx as true, the caller of evas_render_mapped
must ensure that the context is suitable (so either clean or contains
the appropriate clip info).
Also pass do_async to mask_subrender. For now it will always be reset
to false as the SW engine requires sync render to convert from RGBA
to Alpha (not great). The upcoming GL async engine should be able to
render masks asynchronously without any problem.
It was never used, except in dubious situations (most likely a typo).
A clean context is now used in the top-most call to
This is for canvas_alpha_get. context is never used.
It's not used
It's not necessary.
This will allow partially rendering a proxy in a smaller image,
limited to the specified region. At the moment, this will allow
apps to create proxies of very large objects and let them deal
with the geometry & clipping.
This is not directly solving the issues with adding a filter
to textblock or the infinite page scrollers.
The Evas_Pointer_Data struct already contains a Efl.Input.Device pointer.
The hash implementation demonstrated that too much memory was being used
to store the Evas_Object_Pointer_Data. In order to reduce the memory usage
this patches now changes the Evas_Object_Pointer_Data storage to an Eina_Inlist and
now Massif profiles shows that the memory usage was drastically reduced.
With this new API one can block or unblock keyboard, mouse and
focus events that was originated from a seat. This is useful to
create applications that wants to establish some kind of seat segregation.
This prevent dead lock.
is still useful.
After @cedric's commit 6427c77707fb6116a98b we end up with E
not working in Xephyr, because evas_common_shutdown() is called
too many times (init_count == -1). So I'm being paranoid and
tracking whether Evas has initialized or not evas_common. That
way we end up with exactly the right number of inits. We even
reach 0 after E shuts down :)
This patch introduces possibility to enable key locks and modifers by seat.
It's very useful when the user has two keyboards attached to different seats.
This patch introduces the possibility to set the pointer mode and
query other properties like current position per pointer device.
The old API will still works, however it will only act on the default seat.
In some situations smart object resize could fall in an
infinite loop even though the size was stable.
Thanks @vtorri for the report!
evas_object_clip_recalc is big. it's fat. it shouldnt be inline. so
make it a real function. being inline just hurts performance by making
our code bigger, hurting l1 instruction prefetch and cache
performance. this function isn't small. it's huge and should not be
inline basically because of that reason.
also throw in some likely/unlikely hints etc.
part of rendering is figuring if obj is inside current geometry.
before we had to actuall poke around inside the object. this moves the
geometry into the active object array so the data is fecthed fast and
already there for filtering as this is the most likely thing to filter
out an object.
unfortunately this seems to have some bugsd and i'm baffled why, so
leave it there and ifdefed out for now for suture hunting.
evas render in phase1 in order to generate update rects, active,
render etc. object arrays has to walk every object in our tree. this
is a waste of time if we already have walked objects in a previous
frame if they havent changed, so cache this data in render cache in
smart objects to avoid re-walking and now just dumbly "memcpy" these
cached arrays into the master array. i have seen cpu usage by e drop
like about 15% in the sencarios i'm looking at "enlightenment
compositor with some window updating animation all the time, but most
other stuff being static).
this refactors _evas_render_phase1_object_process() into a bunch of
sub functions with leaner code, some LIKELY/UNLIKELY hints etc. etc.
in the hope that we have better l1 instruction cache use when
executing. this actually measureably helps and drops the overhead of
this func ANd all its sub functions from (in my tests in enlightenment
compositing while a video plays) from about 13.2% of all cpu usage by
e to 10%. that's about a 25% drop in cost for passing through phase 1
of evas render... and thats a good thing.
and it also makes the code nicer and more broken up.
this is how you would possibly use prepare stages for objects like
image objects by pre-rendering them to a buffer. this is not complete
and it's actually disabled right now, but it's to show how it might be
done. some more exploring is needed, but this is to share how it
for gl noscale buffers are texture atlases that are fbo's. the point
is never to scale or transofmr them but to render them pixel for pixel
and just store pre-rendered data where its cheaper to do this than
rebuild every time. this is the enigne infra for sw and gl with the gl
code... it SHOULD work... in theory...
this is to allow gl to specifically use an fbo as an atlas for these
kinds of buffers
preparing an object is a good idea. especially with gl. you want to do
texture uploads BEFORE using textures all in one batch. otherwise this
may mean the gl implementation has to make a copy of your data in a
tmp location then copy it in later when texture becomes "unused" as it
may be in use at the moment, or it may have to stall and wait.
i have seen somewhere around 7-10% speedups on nvidia and intel
drivers with this on given a very special test case i brewed up (1000
32x32 images where i change 1 pixel every frame). this should have
impact really when we are modifying textures a lot. this is all i've
implemented for now, but this should/would/could do much more like
re-order map, proxy renders to render FIRST in a pre-render list
instead of inline and to pre-render fbo/buffer content for complex
objects like text or textblock etc.
After a few patches trying to fix clipping of frame or
non-frame objects the icon finally ended up invisible. Even
if the elm_icon was marked as is_frame, its internal evas
object image would not have the flag set, thus it would be
Solution: Propagate the is_frame flag to all smart children,
not only when setting it but also when adding new members.
A hack with the API indicates that the frame edje is a very
special object that does not propagate the flag.
Using the multi-seat support, Evas is able to handle multiple focused objects.
This implementation allows one focused object per seat.
This patch introduces new APIs and events to handle this new scenario,
while keeping compatible with the old focus APIs.
This will be useful when Ecore_Evas is setting up the
This fixes the test "Window Socket" map resize.
I broke that in 40fec5f608da4c5188a2afc7cc293d4f
Makes me wonder if the previous position should be exposed
in EO as well.
so i have been doing some profiling on my rpi3 ... and it seems
memcmp() is like the number one top used function - especially running
e in wayland compositor mode. it uses accoring to perf top about 9-15%
of samples (samples are not adding up to 100%). no - i cant seem to
get a call graph because all that happens is the whole kernel locks up
solid if i try, so i can only get the leaf node call stats. what
function was currently active at the sample time. memcmp is the
biggest by far. 2-3 times anything else.
13.47% libarmmem.so [.] memcmp
6.43% libevas.so.1.18.99 [.] _evas_render_phase1_object_pro
4.74% libevas.so.1.18.99 [.] evas_render_updates_internal.c
2.84% libeo.so.1.18.99 [.] _eo_obj_pointer_get
2.49% libevas.so.1.18.99 [.] evas_render_updates_internal_l
2.03% libpthread-2.24.so [.] pthread_getspecific
1.61% libeo.so.1.18.99 [.] efl_data_scope_get
1.60% libevas.so.1.18.99 [.] _evas_event_object_list_raw_in
1.54% libevas.so.1.18.99 [.] evas_object_smart_changed_get
1.32% libgcc_s.so.1 [.] __udivsi3
1.21% libevas.so.1.18.99 [.] evas_object_is_active
1.14% libc-2.24.so [.] malloc
0.96% libevas.so.1.18.99 [.] evas_render_mapped
0.85% libeo.so.1.18.99 [.] efl_isa
yeah. it's perf. it's sampling so not 100% accurate, but close to
"good enough" for the bigger stuff. so interestingly memcmp() is
actually in a special library/module (libarmmem.so) and is a REAL
function call. so doing memcmp's for small bits of memory ESPECIALLY
when we know their size in advance is not great. i am not sure our own
use of memcmp() is the actual culprit because even with this patch
memcmp still is right up there. we use it for stringshare which is
harder to remove as stringshare has variable sized memory blobs to
but the point remains - memcmp() is an ACTUAL function call. even on
x86 (i checked the assembly). and replacing it with a static inline
custom comparer is better. in fact i did that and benchmarked it as a
sample case for eina_tiler which has 4 ints (16 bytes) to compare
every time. i also compiled to assembly on x86 to inspect and make sure
things made sense.
the text color compare was just comparing 4 bytes as a color (an int
worth) which was silly to use memcmp on as it could just cast to an
int and do a == b. the map was a little more evil as it was 2 ptrs
plus 2 bitfields, but the way bitfields work means i can assume the
last byte is both bitfields combined. i can be a little more evil for
the rect tests as 4 ints compared is the same as comparing 2 long
longs (64bit types). yes. don't get pedantic. all platforms efl works
on work this way and this is a base assumption in efl and it's true
everywhere worth talking about.
yes - i tried __int128 too. it was not faster on x86 anyway and can't
compile on armv7. in my speed tests on x86-64, comparing 2 rects by
casting to a struct of 2 long long's and comparing just those is 70%
faster than comapring 4 ints. and the 2 long longs is 360% faster than
a memcmp. on arm (my rpi3) the long long is 12% faster than the 4 ints,
and it is 226% faster than a memcmp().
it'd be best if we didnt even have to compare at all, but with these
algorithms we do, so doing it faster is better.
we probably should nuke all the memcmp's we have that are not of large
bits of memory or variable sized bits of memory.
i set breakpoints for memcmp and found at least a chunk in efl. but
also it seems the vc4 driver was also doing it too. i have no idea how
much memory it was doing this to and it may ultimately be the biggest
culprit here, BUT we may as well reduce our overhead since i've found
this anyway. less "false positives" when hunting problems.
why am i doing this? i'm setting framerate hiccups. eg like we drop 3,
5 or 10 frames, then drop another bunch, then go back to smooth, then
this hiccup again. finding out WHAT is causing that hiccup is hard. i
can only SEE the hiccups on my rpi3 - not on x86. i am not so sure
it's cpufreq bouncing about as i've locked cpu to 600mhz and it still
happens. it's something else. maybe something we are polling? maybe
it's something in our drm/kms backend? maybe its in the vc4 drivers or
kernel parts? i have no idea. trying to hunt this is hard, but this is
important as this is something that possibly is affecting everyone but
other hw is fast enough to hide it...
in the meantime find and optimize what i find along the way.
so since this uses new pos - cur pos to move BY x pixels... there is
an issue that if the move of the obj ends up re-moving the current obj
TO the same pos. it moved BY the same delta again thus racing ahead.
not great. because x/y is not stored until the call stack returns to
after the smart move func and the pos set storce the new position in
the object struct. the easiest way atm until we have relative
positioning etc. is to store this in the smart obj and use the delta
at that time of the call then store it immediately so a recursion ends
up with a delta of 0 if its the same pos, so go back to where we were.
this fixes a nasty issue i spotted in fileselector that made filesel
icons race along when scrolling 2x as fast as everything else. oddly i
couldnt see this in any other widget that scrolled when i looked...
which is odd, but... either way a nasty issue... subtle... and now
fixed. never saw this before so this seems new.
In case of a mapped object (eg. when applying a map to a window
in wayland compositor), the canvas and output coordinates are
not meant to be the same.
In EO land, applications should instead use the common interface
Efl.Input.Interface.pointer_xy.get (on the canvas).
Reviewers: cedric, jpeg
Subscribers: cedric, jpeg
Differential Revision: https://phab.enlightenment.org/D4350
Reviewers: cedric, jpeg
Subscribers: cedric, jpeg
Differential Revision: https://phab.enlightenment.org/D4346
This is an override of efl_gfx_size_set. Same as before, the
order of operations matter so it is possible that a corner
case will break. In particular, legacy code was:
- smart resize (do stuff), super, super, super
- evas object resize
The new code is more like:
- super, super, super, evas object resize
- do stuff
But unfortunately this broke elm_widget (read: all widgets) as
the internal resize was done before the object resize. So,
inside the resize event cb, the resize_obj size would not match
the smart object size. >_<
Similarly to group_color_set, group_clip_[un]set should not
exist and should be a result of efl_super and inheritance.
This patch also removes clip_unset from the EO API and keeps
only clip_set(NULL). The reason is that it will avoid bad overrides
of clip_unset() vs. clip_unset(NULL). This also simplifies the code
a bit. Ideally we should be able to reintroduce clip_unset in EO
if we can have a "@final" tag (like java's final keyword), to
This is a poor man's solution to get rid of group functions such
as clip_set, clip_unset, color_set, etc... See the following
This API needs to be EAPI for elementary but shouldn't be used
outside EFL. This is required purely for legacy compatibility.
Here's the call flow, inside show(obj):
1. if (intercept_show(obj)) return;
3. do other stuff
This should when done enable the possibility for multi screen in wayland
along with remote display, wireless display and screencasting.
evas_render should not be going and doing evas_object_clip_set () at all.
The framespace clip should be enforced at RENDER time.
This brings support for the eo api for external buffers (like
the old data_set / data_get). The new API now works with slices
The internal code still relies on the old cs.data array for
YUV color conversion. This makes the code a little bit too
complex to my taste.
Tested with expedite for RGBA and YUV 422 601 planar, both
SW and GL engines (x11).