If anything in the canvas needs redraw and a snapshot object
happens to intersect with the update region then it was redrawn,
even if all objects below it hadn't changed. This has an insane
performance impact when you apply a blur filter on the snapshot
object. Walking the object list will always be cheaper than
rendering the snapshot!
Note: Added a FIXME comment and forced clean_them to be true
because some odd behaviour happens when breaking with GDB and
the array snapshot_objects keeps growing at each frame (I guess
only if we miss a frame or something like that).
This reuses the existing mask infrastructure, but adds a color
flag to use the whole RGBA range, rather than just the Alpha
channel.
Filters are still very slow (glReadPixels and non-optimized use of
GL buffers...), but this is progress :)
The mouse cursor in a text entry tends to not disappear even when
the mouse moves out of the entry. This seems to happen more when
the cursor was visible for a single frame only (although I'm not
100% sure about this condition).
One important difference with previous versions of EFL is that
the cursor is now part of the theme, so it is an image object
and not set by the compositor (it looks vastly different).
Anyway, when processing the list of pending_objects, we look at
the flags render_pre and rect_del which were (re)set during the
previous frame. Those flags are then (re)set during phase 1 which
happens after processing the pending objects list... only if
needed. So, phase 1 sets the condition to invalidate the current
lists of objects but that condition is checked for before phase 1.
This patch adds a check on delete_me which should hopefully make
it a rare enough case, for performance, but still force correct
rendering.
This is all spaghetti code, sorry if this explanation also reads
like pasta.
Note that exactness tests may still be broken because earlier
versions of EFL simply did not have the cursor inside the canvas
itself.
Fixes T5231
If object's parent has map and object also has map, the evas
clip would be applied twice.
The context already applied clip area when drawing on map_surface.
So don't need more clipping when drawing map_image.
Also, make sure to apply the framespace clip when drawing the map
surface onto the final canvas. Thanks @jiin.moon for the initial
patch (see D4694).
@fix
Signed-off-by: Jean-Philippe Andre <jp.andre@samsung.com>
several calls, specifically evas_object_change_reset,
evas_object_cur_prev, and evas_object_clip_changes_clean that are
called directly or indirectly as part of evas render on at least every
active object if not more, were doing full eo obj lookups when their
calling functions already all had the eo protected data looked up.
tha's silly and just adds overhead we don't need. my test dropped
_eo_obj_pointer_get overhead in perf profiles from 4.48% to 2.65%. see:
4.48% libeo.so.1.18.99 [.] _eo_obj_pointer_get
4.23% libevas.so.1.18.99 [.] evas_render_updates_internal
2.61% libevas.so.1.18.99 [.] evas_render_updates_internal_loop
1.68% libeo.so.1.18.99 [.] efl_data_scope_get
1.57% libc-2.24.so [.] _int_malloc
1.42% libevas.so.1.18.99 [.] evas_object_smart_changed_get
1.09% libevas.so.1.18.99 [.] evas_object_clip_recalc.part.37
1.08% libpthread-2.24.so [.] pthread_getspecific
1.05% libevas.so.1.18.99 [.] efl_canvas_object_class_get
1.01% libevas.so.1.18.99 [.] evas_object_cur_prev
0.99% libeo.so.1.18.99 [.] _efl_object_event_callback_legacy_call
0.87% libevas.so.1.18.99 [.] _evas_render_phase1_object_ctx_render_cache_append
0.82% libpthread-2.24.so [.] pthread_mutex_lock
0.81% libevas.so.1.18.99 [.] _evas_render_phase1_object_process
0.79% libc-2.24.so [.] _int_free
vs now the improved:
4.82% libevas.so.1.18.99 [.] evas_render_updates_internal
3.44% libevas.so.1.18.99 [.] evas_render_updates_internal_loop
2.65% libeo.so.1.18.99 [.] _eo_obj_pointer_get
2.22% libc-2.24.so [.] _int_malloc
1.46% libevas.so.1.18.99 [.] evas_object_smart_changed_get
1.04% libeo.so.1.18.99 [.] _efl_object_event_callback_legacy_call
1.03% libevas.so.1.18.99 [.] _evas_render_phase1_object_ctx_render_cache_append
0.97% libeina.so.1.18.99 [.] eina_chained_mempool_malloc
0.93% libevas.so.1.18.99 [.] evas_object_clip_recalc.part.37
0.92% libpthread-2.24.so [.] pthread_mutex_lock
0.91% libevas.so.1.18.99 [.] _evas_render_phase1_object_process
0.84% libc-2.24.so [.] _int_free
0.84% libevas.so.1.18.99 [.] evas_object_cur_prev
0.83% libeina.so.1.18.99 [.] eina_chained_mempool_free
0.80% libeo.so.1.18.99 [.] efl_data_scope_get
of course other things "increase their percentage" as oe overhead now
dropped, and things seem to move around a bit, but it does make sense
to do this with no downsides i can see as we already are accessing the
protected data ptr in the parent func.
so proxies just rendered nothing when used in terminology. they used
to work for the tab switcher (ctl+shift+home). this now works again.
there is a good chance this may break something else though... what i
can't seem to find...
this fixes T5131
i found evas_common_draw_context_apply_cutouts() was procsessing 300+
cutouts and as it's O(n^2)/2 to try and merge adjacent rects for
cutouts this really performs like complete junk. we apply cutout rects
a LOT. this is not the best solution, but it's quick and much faster
than doing the clipouts which drop framerate to like 1-2fps or so in the
nasty case i say (tyls -m of photos in a dir with a 2160 high
terminal).
this figures out the target area to limit the count of rects
significantly so O(n^2) is far far better when n is now < 10 most of
the time. and for the few operations where it's a high value this now
uses qsort to speed up merges etc. etc.
@optimize
This reverts commit 4e110a34bf.
Urgh. I'm stupid. Conceptually I still like the idea of the
region proxy, that only renders part of the source object
in a proxy surface. The problem right now is that the proxy
subrender mechanism renders to a surface that belongs to
the **source** and not the proxy object. As a consequence,
the region would become a property of the source, rather than
the proxy, which is not at all the intention of the original
patch. In other words, everything would fall apart if an object
has more than one proxy, for whatever reason.
I realized that when trying to actually test the region proxy.
It didn't work at all. Revert for now.
After a previous patch, mask_subrender worked differently, and
didn't reset the object's cache clip color. This made evas_suite
fail. But also it seems some other issues creeped in and it was
necessary to fix the test case by adding data_updates (mistake!)
and removing an invalid draw call.
This fixes a rare crash in the SW engine when a masked mask is
to be rerendered. The clip adds more safety as the lower render
draw functions assume it is properly set.
Here's the situation:
1. A container (genlist) has a mask, M0.
2. An item I0 inside this container uses a proxy P0 as render object
rather than the item directly (eg. for zooming in/out).
3. An element E0 inside this item has another mask, M1.
Theory:
1. The proxy surface for P0 is rendered, and M1 is applied to
the element E0.
2. The proxy P0 is rendered on the canvas, with M0 applied.
Practice:
1. The element E0 is prepared for rendering, this triggers
a mask subrender for M1.
2. M1 is rendered with M0 as a prev mask, then kept in cache and
not redrawn (no geometry change, etc...)
3. When P0's surface is rendered, M1's surface is the result of M1+M0.
4. When P0 is drawn on screen, we can see the effect of M1+M0 as
P0's geometry might be different from the item's I0.
Solution:
Discard prev masks and force a mask redraw when we're inside a
proxy. Ideally we should detect if the prev mask belongs to the
insides of the proxy or not.
Problems:
_mask_apply_inside_proxy() is definitely not correct, but it's
not easy to test it. Anyway I believe that in order to properly
implement all of this, we need to rethink evas_render and
the draw context. Non-primary render surfaces (maps, proxies,
masks, filters, ...) should be rendered with a clean context
and clipping, masking, etc should be computed appropriately.
Always assume use_mapped_ctx as true, the caller of evas_render_mapped
must ensure that the context is suitable (so either clean or contains
the appropriate clip info).
Also pass do_async to mask_subrender. For now it will always be reset
to false as the SW engine requires sync render to convert from RGBA
to Alpha (not great). The upcoming GL async engine should be able to
render masks asynchronously without any problem.
This can avoid some invalid render from evas_render_mapped in
a rare case. I'm not sure about the conditions but I know for
sure that at the moment mask_subrender should be only rendering
plain old images into a dedicated surface. So no need for
evas_render_mapped here.
It was never used, except in dubious situations (most likely a typo).
A clean context is now used in the top-most call to
evas_render_updates_internal_loop.
This makes code shorter and easier to read (imo).
Also introduce ENCTX for the engine context. It's used in a couple
places and I believe it's just wrong - but works because the engine
context and the current context are the same.
This will allow partially rendering a proxy in a smaller image,
limited to the specified region. At the moment, this will allow
apps to create proxies of very large objects and let them deal
with the geometry & clipping.
This is not directly solving the issues with adding a filter
to textblock or the infinite page scrollers.
@feature
There were some obj->map->surface validation check
but final map drawing was in the out of the surface valid scope.
Actually, this change does nothing but logically this change makes sense.
This reverts commit bba368cf79.
if this is causing test suite fails ( i saw no actual visual problems
tho in apps or e etc.)... then revert. sadness. :(
remove some extra looping and if checkign that is taken care of
already and just is pointless extra checks in the code creating
overhead. tiny amounts, but the amount of meaty speedups lef it
running low, so profiling, reading, working and repeating.
@optimize
evas_object_clip_recalc was already called ... multiple times in
pending and phase1 on all objects, so there is no value in calling it
again and again in later evbas render phases when it's already been
done.
this and moving this to a real func sees evas_object_clip_recalc usage
in perf drop from 1.8% to 1.4% or so of total perf sample time. tiny
win, but we're at the point where i can't find big meaty wins, so i'm
looking for a string of small wins to add up.
@optimize
evas_object_clip_recalc is big. it's fat. it shouldnt be inline. so
make it a real function. being inline just hurts performance by making
our code bigger, hurting l1 instruction prefetch and cache
performance. this function isn't small. it's huge and should not be
inline basically because of that reason.
also throw in some likely/unlikely hints etc.
@optimize
part of rendering is figuring if obj is inside current geometry.
before we had to actuall poke around inside the object. this moves the
geometry into the active object array so the data is fecthed fast and
already there for filtering as this is the most likely thing to filter
out an object.
unfortunately this seems to have some bugsd and i'm baffled why, so
leave it there and ifdefed out for now for suture hunting.
in much of phase1 we already know the evas object protected data ptr,
so dont scope data get it or even pass the eo obj id around as we can
get it from obj->object
_evas_render_is_relevant() needs the obj protected data, so it gets
scrop data, but the only place it is called already has this pointer,
so avoid an extra lookup.
@optimize
evas render in phase1 in order to generate update rects, active,
render etc. object arrays has to walk every object in our tree. this
is a waste of time if we already have walked objects in a previous
frame if they havent changed, so cache this data in render cache in
smart objects to avoid re-walking and now just dumbly "memcpy" these
cached arrays into the master array. i have seen cpu usage by e drop
like about 15% in the sencarios i'm looking at "enlightenment
compositor with some window updating animation all the time, but most
other stuff being static).
@optimize
these objects don't actually produce - or should produce update
regions etc. etc. as the objects that are clipped should produce those.
they are not active objects. so skip them very early after just
ensuring they are in delete objects if needed.
this refactors _evas_render_phase1_object_process() into a bunch of
sub functions with leaner code, some LIKELY/UNLIKELY hints etc. etc.
in the hope that we have better l1 instruction cache use when
executing. this actually measureably helps and drops the overhead of
this func ANd all its sub functions from (in my tests in enlightenment
compositing while a video plays) from about 13.2% of all cpu usage by
e to 10%. that's about a 25% drop in cost for passing through phase 1
of evas render... and thats a good thing.
and it also makes the code nicer and more broken up.
@optimize
we are passing the same things into every phase 1 process func - the
same ptrs to the same arrays of objects... why eat up valuable
registers with this? collect into context struct and just pass a ptr
to that. this also makes the code easier to read and maintain too so
bonus all over. also a tiny win in performance but i'd say its "within
error margins" (go from 11.48% to 11.42% overhead).
Summary:
In case of thread creation failure, shutdown logic will be stuck.
To prevent stuck, set exit variables to make thread_shutdown working
even if init fails.
Also modify init logics to return init result to a caller.
Reviewers: jypark, woohyun, cedric, jpeg
Subscribers: cedric
Differential Revision: https://phab.enlightenment.org/D4411
Note (@jpeg):
I have modified the patch just a little bit.
Signed-off-by: Jean-Philippe Andre <jp.andre@samsung.com>
this fixes T4904
dif e pops up menus it uses zoomap to zoom an existing object in and
out by forcing it to be mapped and controlling the map geom based on
smart obj geom. this will do a minimum update region for the case
where a map just got turned on or off for an obj since last frame and
cut this out.
@optimize
this is how you would possibly use prepare stages for objects like
image objects by pre-rendering them to a buffer. this is not complete
and it's actually disabled right now, but it's to show how it might be
done. some more exploring is needed, but this is to share how it
might/should work.
preparing an object is a good idea. especially with gl. you want to do
texture uploads BEFORE using textures all in one batch. otherwise this
may mean the gl implementation has to make a copy of your data in a
tmp location then copy it in later when texture becomes "unused" as it
may be in use at the moment, or it may have to stall and wait.
i have seen somewhere around 7-10% speedups on nvidia and intel
drivers with this on given a very special test case i brewed up (1000
32x32 images where i change 1 pixel every frame). this should have
impact really when we are modifying textures a lot. this is all i've
implemented for now, but this should/would/could do much more like
re-order map, proxy renders to render FIRST in a pre-render list
instead of inline and to pre-render fbo/buffer content for complex
objects like text or textblock etc.
After a few patches trying to fix clipping of frame or
non-frame objects the icon finally ended up invisible. Even
if the elm_icon was marked as is_frame, its internal evas
object image would not have the flag set, thus it would be
clipped out.
Solution: Propagate the is_frame flag to all smart children,
not only when setting it but also when adding new members.
A hack with the API indicates that the frame edje is a very
special object that does not propagate the flag.
See also:
7ce79be1a10f6c33eff19c9c8809a7ac5ca9281c
Test cases (in WL or X with client-side decorations on):
1. elementary_test -to Animation
Resize the window to a small size (eg, 100x100) and observe the
balls overflowing outside the window content part. This tests
unclipped normal objects.
2. elementary_test "Window Plug" (requires also Window Socket)
Drag the handles outside the window, observe overflow in the
framespace area. This tests mapped images ('can_map').
3. elementary_test -to "Gesture Layer"
Drag a photo around. This tests non-image mapped objects.
NOTE: This test is badly broken!
This patch fixes both of those issues. I'm not sure what I'm
breaking, though.
When an object inside a genlist is masked, scrolling would
cause render issues as the mask is not redrawn on move (only
the clip geometry is marked as dirty and recalculated, the
mask pixels are assumed to be well prepared already). As a
result, masked objects in a genlist would not show up
properly once you start scrolling.
This fixes that by hacking into evas a safety test to avoid
unnecessary clipping, and by using parent masks even if they
are not the direct clipper.
Note that no_render is still quite broken (eg. a no_render
mask may cause major issues, even crashes).
This reverts 5917b49f59
This patch fixes an issue where border icons were missing when running
EFL Wayland client applications. This also fixes the issue where
softcursor mouse pointers would not draw over bottom window border.
Signed-off-by: Chris Michael <cp.michael@samsung.com>
Due to commit 7ce79be1a1, EFL Wayland
Client applications stopped rendering their window icons. This was due
to the code which tried to detect if an object is in framespace.
Previous version would just check the obj->is_frame flag ... which is
insufficient to determine if an object is in Framespace. This commit
fixes that issue by checking an objects geometry also.
@fix
Signed-off-by: Chris Michael <cp.michael@samsung.com>
i don't know for sure if this fixes T4103 but in theory i think it
might given a reading of the backtrace and a guess at what might
happen, so try this fix. it doesn't hurt and can only help.
@fix
Scenario:
smart {
text
proxy -> text, src_invisible
}
proxy -> smart
What we should see:
smart {
(blank)
proxy -> text
}
proxy -> {
(blank)
proxy -> text
}
What we saw:
smart {
(blank)
proxy -> text
}
proxy -> {
text
proxy -> text
}
Solution:
Check in evas render, when we're inside a proxy render, and the
proxy src_invisible flag is on (evas_object_source_visible_set(0),
that we're rendering the object itself to its proxy surface. If not,
it means we're rendering another proxy surface, ie. a parent smart
object's proxy surface.
Still loving evas render.
Fixes T4006.
@fix
So... I had issues with evas-fb engine which was massively leaking,
one image per frame.
After investigating a bit with @cedric on IRC, the reference count
of the cache entries was always 2 before the engine dropped.
So, for each frame with an animation, we could never drop a cache
entry, leading to a trumendous amount of memory leaking.
Now for non-async rendering, we copy the behaviour of
evas_render_pipe_wakeup() which is called in async-mode,
and actually drops a reference in the cache entry.
Fixes T3763
ssometimes the evas render updates are a list of Render_Updates
structs ... sometimes Eina_Rectangles. this is horrible and i think a
bug turns up (but its not reproducable on linux - just bsd) with an
invalid free ... likely because we free() a ptr from the mem pool
eina_rectangle gets rects from. thats most likely the cause of
https://phab.enlightenment.org/T3226 - but as i can't know for sure,
this is a guess, but readiong the code i see posible vectors of
problemss here ... maybe.
so this redoes the update rects to ALWAYS be Render_Updates struct
and appropriately returns correct structures etc. etc. in api which
demand a list of Eina_Rectangles there.
pending testing on foreign sysstems to confirm this by @netstar
@fix
Setting the no-render flag on an elm widget had no effect,
as it was not properly propagated to its children. This should
fix that, but I'm not a fan of the solution.
Fixes T3371
I just ran my script (email to follow) to migrate all of the EFL
automatically. This commit is *only* the automatic conversion, so it can
be easily reverted and re-run.
Previously, _evas_render_can_use_overlay would segfault here when
trying to make use of an Evas_Video_Surface. This is because eo_tmp
variable was never reassigned to be the smart parent before we tried
to get eo data from it.
@fix
NB: Thanks Frenchie !! ;)
Signed-off-by: Chris Michael <cpmichael@osg.samsung.com>
this is not ideal since it triggers a client-side rerender of every object
which was clipped to the master clip (double render) and then this ends up
forcing the server to rerender the same area twice as well
not only that, it causes all surface damages to to be the size of the entire
window - framespace for every frame
@fix
while not occurring immediately before flush as in sync rendering, this
is functionally close enough that it will serve the purpose for which the
callback was intended, namely receiving a callback that occurs after render
update calculations have occurred but before flush happens
@fix
ref cbb447c878
This reverts commit cbb447c878.
1. this is wrong because evas_render_pipe_wakeup() is being called IN
THE RENDER THREAD. it... SENDS a wakeup back to the mainloop with
evas_async_events_put(data, 0, NULL, evas_render_async_wakeup);
and you can see that evas_render_async_wakeup() calls
evas_render_wakeup() and in evas_render_wakeup() flush pre/post are
called, but since the trhead does the flush we cant realyl call
before/after, but it retains order... IF there are updates (haveup).
so calling these callbacks FROM a thread is now leading to apps
mysteriously exiting. this is mucho bad. just at random i now have my
terminals exiting.
This time it's only about performance. We seem to be setting the
changed flag too often, which might trigger unnecessary redraws.
- map flag is set if there is currently a map AND it's not an image
object (because images can map themselves)
- hmap flag is set if there was a map before
So, map != hmap does not imply a transition between a mapped and
non-mapped state. Add an extra check before marking the clip
as dirty and changed.
While cached surfaces is a topic we're discussing recently,
this code is dead right now, and we will have to redesign the
buffer caching better to handle proxies, maps, smart objects, etc...
When the no_render flag was set on a proxy source, the object would
not be visible, but it would also not render inside the proxy surface,
which completely beats the purpose of this flag. This patch makes
the objects render inside a proxy surface.
evas_object_clipees_has is far cheaper than evas_object_clipees_get in case of checking if
clipees exist or not. This should improve the performance in case of large set of clipees.
@fix
As spotted by @FurryMyad I inverted the logic for source_clip.
This should restore the proper behaviour while keeping my previous
fixes working. See D2940.
This is an ugly hack to fix an issue reported in D3114. I don't
understand how the proposed patch could even fix anything given
the current situation.
Test case:
- Create edje object with textblock inside
- Clip out edje object (--> all children become not visible)
- Take textblock from edje and set it as source of a proxy
- Mark proxy as source_clip
Result: Nothing visible.
Expected: Proxy should contain the textblock object, since
source_clip means we ignore the edje object's clipper, and
only care about the textblock's clipper (entire canvas).
Here's what was happening:
- During a first pass, textblock is not visible, cur->cache.clip
is calculated, marked as clip_use=1 with geom 0,0 0x0
- In a second pass, the proxy is rendered, which needs to draw
the textblock in a surface. But cache.clip was used and it was
wrong.
Solution:
- Ignore cache.clip when rendering inside a proxy. I'm pretty
sure there are other instances where cache.clip will still
be a problem.
Problem: textblock never called relayout since it was not
visible.
Conclusion: cache.clip needs to die. It's a legacy optimization
that now causes more issues than it fixes.